On Small Sample Corrections

The fixest R package provides various options for small sample corrections. While it has an excellent vignette on the topic, reproducing its behavior in pyfixest took more time than expected. So that future developers (and my future self) can stay sane, I’ve compiled all of my hard-earned understanding of how small sample adjustments work in fixest and how they are implemented in pyfixest in this document.

In both fixest and pyfixest, small sample corrections are controlled by the ssc function. In pyfixest, ssc accepts four arguments: k_adj, G_adj, k_fixef and G_df.

Based on these inputs, the adjusted variance-covariance matrix is computed as:

vcov_k_adj = adj_val(N, df_k) if k_adj else 1
          * G_adj_val(G, G_df) if G_adj else 1
          * vcov

Where:

Outside of this formula, we have df_t, which is the degrees of freedom used for p-values and confidence intervals:


Small Sample Adjustments

k_adj = True

If k_adj = True, the adjustment factor is:

adj_val = (N - 1) / (N - df_k)

If k_adj = False, no adjustment is applied.

Note that for k_adj = True and heteroskedastic errors, the applied correction is N / (N-k).


k_fixef

The k_fixef argument controls how fixed effects contribute to df_k, and thus to adj_val. It supports three options:

  • "none"
  • "full"
  • "nonnested"

k_fixef = "none"

Fixed effects are ignored when counting parameters:

  • Example:
    • Y ~ X1 | f1k = 1
    • Y ~ X1 + X2 | f1k = 2

k_fixef = "full"

Fixed effects are fully counted. For n_fe total fixed effects and each fixed effect f_i, we set df_k = k + k_fe,

  • If there is more than one fixed effect, we drop one level from each fixed effects except the first (to avoid multicollinearity) k_fe = sum_{i=1}^{n_fe} levels(f_i) - (n_fe - 1)

  • If there is only one fixed effect: k_fe = sum_{i=1}^{n_fe} levels(f_i) = levels(f_1)

k_fixef = "nonnested"

Fixed effects may be nested within cluster variables (e.g., district FEs nested in state clusters). If k_fixef = "nonnested", nested fixed effects do not count toward k_fe:

k_fe = sum_{i=1}^{n_fe} levels(f_i) - k_fe_nested - (n_fe - 1)

where k_fe_nested is the count of nested fixed effects. For cluster fixed effects, k_fe_nested = G, the number of clusters.

⚠️ Note: If you already subtracted a level from a nested FE, you may need to add it back.


G_adj

This argument is only relevant for clustered errors.

If G_adj = True, we apply a second correction:

G_df_val = G / (G - 1)

Where:

  • G is the number of clusters for clustered errors, or N for heteroskedastic errors.
  • This follows the approach in R’s sandwich package, interpreting heteroskedastic errors as “singleton clusters.”

Tip: If G_adj = True for IID errors, G_df_val defaults to 1. For heteroskedastic erros, despite its name, G_adj=True will apply an adjustment of (N-1) / N, as there are \(G = N\) singleton clusters.


G_df

Relevant only for multi-way clustering. Two-way clustering, for example, can be written as:

vcov = ssc_A * vcov_A + ssc_B * vcov_B - ssc_AB * vcov_AB

where A and B are clustering dimensions, with G_AB > G_A > G_B.

  • If G_df = "min", then G is set to the minimum value of G_A, G_B, and G_AB.
  • If G_df = "conventional", each clustering dimension uses its own cluster count (G_A, G_B, etc.) for its respective adjustment.

More on Inference

For computing critical values:

  • OLS and IV: use t-statistics with df_t = N - df_k (non-clustered) or df_t = G - 1 (clustered).
  • GLMs: use z-statistics (normal approximation).

For multi-way clustering:

  • Two-way: df_t = min(G_1 - 1, G_2 - 1)
  • Three-way: df_t = min(G_1 - 1, G_2 - 1, G_3 - 1) (not currently supported)

See this implementation for details.

In Code

All of the above logic is implemented here.