On Small Sample Corrections

The fixest R package provides various options for small sample corrections. While it has an excellent vignette on the topic, reproducing its behavior in pyfixest took more time than expected. So that future developers (and my future self) can stay sane, I’ve compiled all of my hard-earned understanding of how small sample adjustments work in fixest and how they are implemented in pyfixest in this document.

In both fixest and pyfixest, small sample corrections are controlled by the ssc function. In pyfixest, ssc accepts four arguments: adj, cluster_adj, fixef_k and cluster_df.

Based on these inputs, the adjusted variance-covariance matrix is computed as:

vcov_adj = adj_val(N, dof_k) if adj else 1
          * cluster_adj_val(G, cluster_df) if cluster_adj else 1
          * vcov

Where:

Outside of this formula, we have df_t, which is the degrees of freedom used for p-values and confidence intervals:


Small Sample Adjustments

adj = True

If adj = True, the adjustment factor is:

adj_val = (N - 1) / (N - dof_k)

If adj = False, no adjustment is applied.


fixef_k

The fixef_k argument controls how fixed effects contribute to dof_k, and thus to adj_val. It supports three options:

  • "none"
  • "full"
  • "nested"

fixef_k = "none"

Fixed effects are ignored when counting parameters:

  • Example:
    • Y ~ X1 | f1k = 1
    • Y ~ X1 + X2 | f1k = 2

fixef_k = "full"

Fixed effects are fully counted. For n_fe total fixed effects and each fixed effect f_i, we set dof_k = k + k_fe,

  • If there is more than one fixed effect, we drop one level from each fixed effects except the first (to avoid multicollinearity) k_fe = sum_{i=1}^{n_fe} levels(f_i) - (n_fe - 1)

  • If there is only one fixed effect: k_fe = sum_{i=1}^{n_fe} levels(f_i) = levels(f_1)

fixef_k = "nested"

Fixed effects may be nested within cluster variables (e.g., district FEs nested in state clusters). If fixef_k = "nested", nested fixed effects do not count toward k_fe:

k_fe = sum_{i=1}^{n_fe} levels(f_i) - k_fe_nested - (n_fe - 1)

where k_fe_nested is the count of nested fixed effects. For cluster fixed effects, k_fe_nested = G, the number of clusters.

⚠️ Note: If you already subtracted a level from a nested FE, you may need to add it back.


cluster_adj

If cluster_adj = True, we apply a second correction:

cluster_df_val = G / (G - 1)

Where:

  • G is the number of clusters for clustered errors, or N for heteroskedastic errors.
  • This follows the approach in R’s sandwich package, interpreting heteroskedastic errors as “singleton clusters.”

Tip: If cluster_adj = True for IID errors, cluster_df_val defaults to 1. For heteroskedastic erros, despite its name, cluster_adj=True will apply an adjustment of (N-1) / N, as there are \(G = N\) singleton clusters.


cluster_df

Relevant only for multi-way clustering. Two-way clustering, for example, can be written as:

vcov = ssc_A * vcov_A + ssc_B * vcov_B - ssc_AB * vcov_AB

where A and B are clustering dimensions, with G_AB > G_A > G_B.

  • If cluster_df = "min", then G is set to the minimum value of G_A, G_B, and G_AB.
  • If cluster_df = "conventional", each clustering dimension uses its own cluster count (G_A, G_B, etc.) for its respective adjustment.

More on Inference

For computing critical values:

  • OLS and IV: use t-statistics with df_t = N - dof_k (non-clustered) or df_t = G - 1 (clustered).
  • GLMs: use z-statistics (normal approximation).

For multi-way clustering:

  • Two-way: df_t = min(G_1 - 1, G_2 - 1)
  • Three-way: df_t = min(G_1 - 1, G_2 - 1, G_3 - 1) (not currently supported)

See this implementation for details.

In Code

All of the above logic is implemented here.