On Small Sample Corrections
The fixest R package provides various options for small sample corrections. While it has an excellent vignette on the topic, reproducing its behavior in pyfixest took more time than expected. So that future developers (and my future self) can stay sane, I’ve compiled all of my hard-earned understanding of how small sample adjustments work in fixest and how they are implemented in pyfixest in this document.
In both fixest and pyfixest, small sample corrections are controlled by the ssc function. In pyfixest, ssc accepts four arguments: adj, cluster_adj, fixef_k and cluster_df.
Based on these inputs, the adjusted variance-covariance matrix is computed as:
vcov_adj = (adj_val(N, dof_k) if adj else 1)
           * (cluster_adj_val(G, cluster_df) if cluster_adj else 1)
           * vcov
Where:
- adj: Enables or disables the first scalar adjustment.
- cluster_adj: Enables or disables the second scalar adjustment.
- vcov: The unadjusted variance-covariance matrix.
- dof_k: The number of estimated parameters considered in the first adjustment. Impacts adj_val.
- fixef_k: Determines how dof_k is computed (how fixed effects are counted).
- cluster_df: Determines how cluster_adj_val is computed (only relevant for multi-way clustering).
- G: The number of unique clusters (G = N for heteroskedastic errors).
Outside of this formula, we have df_t, which is the degrees of freedom used for p-values and confidence intervals:
- df_t = N - dof_k for IID or heteroskedastic errors.
- df_t = G - 1 for clustered errors.
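To make the composition concrete, here is a minimal sketch in plain Python. The helper name scale_vcov and the numeric factor values are made up for illustration and are not part of pyfixest. It shows only that both corrections are scalars multiplying every entry of vcov, and that each one can be switched off independently.

```python
def scale_vcov(vcov, adj_val, cluster_adj_val, adj=True, cluster_adj=True):
    """Multiply every entry of vcov by the enabled scalar factors.

    Hypothetical helper: adj_val and cluster_adj_val are passed in as
    already-computed numbers, and vcov is a plain list of lists.
    """
    factor = (adj_val if adj else 1.0) * (cluster_adj_val if cluster_adj else 1.0)
    return [[factor * v for v in row] for row in vcov]


# Illustrative numbers only.
vcov = [[0.04, 0.01], [0.01, 0.09]]
both = scale_vcov(vcov, adj_val=1.1, cluster_adj_val=1.25)
only_adj = scale_vcov(vcov, adj_val=1.1, cluster_adj_val=1.25, cluster_adj=False)
unadjusted = scale_vcov(vcov, adj_val=1.1, cluster_adj_val=1.25,
                        adj=False, cluster_adj=False)
```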
Small Sample Adjustments
adj = True
If adj = True, the adjustment factor is:
adj_val = (N - 1) / (N - dof_k)
If adj = False, no adjustment is applied.
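As a quick sketch of this factor (the function below is illustrative and only mirrors the formula above; the guard against N <= dof_k is my own assumption, not documented behavior):

```python
def adj_val(N, dof_k, adj=True):
    """First small sample factor: (N - 1) / (N - dof_k), or 1 if adj is False."""
    if not adj:
        return 1.0
    if N <= dof_k:
        # Assumption for this sketch: refuse to divide by zero or go negative.
        raise ValueError("N must exceed dof_k")
    return (N - 1) / (N - dof_k)


with_adj = adj_val(N=100, dof_k=5)                 # 99 / 95
without_adj = adj_val(N=100, dof_k=5, adj=False)   # no adjustment
```

The factor grows with dof_k, which is why the choice of fixef_k below matters for models with many fixed effect levels.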
fixef_k
The fixef_k argument controls how fixed effects contribute to dof_k, and thus to adj_val. It supports three options:
- "none"
- "full"
- "nested"
fixef_k = "none"
Fixed effects are ignored when counting parameters, so dof_k = k, the number of estimated non-fixed-effect coefficients.
Examples:
- Y ~ X1 | f1 → k = 1
- Y ~ X1 + X2 | f1 → k = 2
fixef_k = "full"
Fixed effects are fully counted. With n_fe fixed effects f_1, ..., f_{n_fe}, we set dof_k = k + k_fe.
If there is more than one fixed effect, we drop one level from each fixed effect except the first (to avoid multicollinearity):
k_fe = sum_{i=1}^{n_fe} levels(f_i) - (n_fe - 1)
If there is only one fixed effect:
k_fe = sum_{i=1}^{n_fe} levels(f_i) = levels(f_1)
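The "none" and "full" rules can be sketched as follows. The function names and the levels list (one level count per fixed effect) are hypothetical and chosen for illustration; this is not pyfixest's internal code. The single-fixed-effect case needs no special branch, because n_fe - 1 = 0 there.

```python
def k_fe_full(levels):
    """Count fixed effect parameters under fixef_k = "full".

    levels: list with the number of levels of each fixed effect.
    One level is dropped from every fixed effect except the first.
    """
    n_fe = len(levels)
    return sum(levels) - max(n_fe - 1, 0)


def dof_k(k, levels, fixef_k="none"):
    """Parameter count entering adj_val for the "none" and "full" options."""
    if fixef_k == "none":
        return k
    if fixef_k == "full":
        return k + k_fe_full(levels)
    raise ValueError(f"option not covered in this sketch: {fixef_k!r}")


# Y ~ X1 + X2 | f1 + f2, with 10 levels in f1 and 5 levels in f2.
dof_none = dof_k(k=2, levels=[10, 5], fixef_k="none")   # 2
dof_full = dof_k(k=2, levels=[10, 5], fixef_k="full")   # 2 + 15 - 1 = 16
dof_one_fe = dof_k(k=2, levels=[10], fixef_k="full")    # 2 + 10 = 12
```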
fixef_k = "nested"
Fixed effects may be nested within cluster variables (e.g., district FEs nested in state clusters). If fixef_k = "nested", nested fixed effects do not count toward k_fe:
k_fe = sum_{i=1}^{n_fe} levels(f_i) - k_fe_nested - (n_fe - 1)
where k_fe_nested is the total number of levels of the nested fixed effects. When the fixed effect is the cluster variable itself, k_fe_nested = G, the number of clusters.
⚠️ Note: If you already subtracted a level from a nested FE, you may need to add it back.
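A minimal sketch of the "nested" count, following the formula above literally. The function name and the dictionary layout are hypothetical, and which fixed effects are nested is supplied by the caller rather than detected from data. The sketch does not implement the add-back mentioned in the note.

```python
def k_fe_with_nesting(levels, nested):
    """Count fixed effect parameters under fixef_k = "nested".

    levels: dict mapping fixed effect name -> number of levels.
    nested: names of fixed effects nested in the cluster variable
            (assumed known here; detecting nesting is out of scope).
    """
    n_fe = len(levels)
    total = sum(levels.values())
    k_fe_nested = sum(levels[name] for name in nested)
    return total - k_fe_nested - max(n_fe - 1, 0)


# District FEs (50 levels) nested in state clusters, plus year FEs (10 levels).
k_fe = k_fe_with_nesting({"district": 50, "year": 10}, nested={"district"})
# 60 - 50 - 1 = 9

# A single fixed effect that is itself the cluster variable (G = 30).
k_fe_cluster = k_fe_with_nesting({"state": 30}, nested={"state"})
# 30 - 30 - 0 = 0
```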
cluster_adj
If cluster_adj = True, we apply a second correction (the factor called cluster_adj_val in the formula above):
cluster_df_val = G / (G - 1)
Where:
- G is the number of clusters for clustered errors, or N for heteroskedastic errors.
- This follows the approach in R’s sandwich package, interpreting heteroskedastic errors as “singleton clusters.”
Tip: If cluster_adj = True for IID errors, cluster_df_val defaults to 1. For heteroskedastic errors, despite its name, cluster_adj = True will apply an adjustment of N / (N - 1), as there are G = N singleton clusters.
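The three cases can be sketched in one small function. The vcov_type labels "iid", "hetero" and "cluster" are placeholders chosen for this example and are not necessarily the strings pyfixest uses. The heteroskedastic case simply applies G / (G - 1) with G = N.

```python
def cluster_adj_val(vcov_type, N, G=None, cluster_adj=True):
    """Second small sample factor, G / (G - 1), by error type.

    vcov_type: "iid", "hetero" or "cluster" (hypothetical labels).
    """
    if not cluster_adj or vcov_type == "iid":
        return 1.0
    if vcov_type == "hetero":
        G = N  # every observation is its own singleton cluster
    elif vcov_type == "cluster":
        if G is None:
            raise ValueError("G is required for clustered errors")
    else:
        raise ValueError(f"unknown vcov_type: {vcov_type!r}")
    return G / (G - 1)


iid_factor = cluster_adj_val("iid", N=1000)                 # 1
hetero_factor = cluster_adj_val("hetero", N=1000)           # 1000 / 999
cluster_factor = cluster_adj_val("cluster", N=1000, G=50)   # 50 / 49
```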
cluster_df
Relevant only for multi-way clustering. Two-way clustering, for example, can be written as:
vcov = ssc_A * vcov_A + ssc_B * vcov_B - ssc_AB * vcov_AB
where A and B are the clustering dimensions and AB is their intersection, with G_AB > G_A > G_B.
- If cluster_df = "min", then G is set to the minimum value of G_A, G_B, and G_AB.
- If cluster_df = "conventional", each clustering dimension uses its own cluster count (G_A, G_B, etc.) for its respective adjustment.
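The difference between the two options is easiest to see with numbers. The sketch below computes only the G / (G - 1) part of ssc_A, ssc_B and ssc_AB. The function name is hypothetical, and any adj_val factor that may also enter these terms is left out.

```python
def two_way_cluster_factors(G_A, G_B, G_AB, cluster_df="min"):
    """Return the G / (G - 1) factors for the A, B and AB components."""
    if cluster_df == "min":
        G_min = min(G_A, G_B, G_AB)
        counts = (G_min, G_min, G_min)   # same G for all three terms
    elif cluster_df == "conventional":
        counts = (G_A, G_B, G_AB)        # each term uses its own G
    else:
        raise ValueError(f"unknown cluster_df: {cluster_df!r}")
    return tuple(G / (G - 1) for G in counts)


# Illustrative cluster counts with G_AB > G_A > G_B.
f_min = two_way_cluster_factors(50, 10, 400, cluster_df="min")
# (10/9, 10/9, 10/9)
f_conv = two_way_cluster_factors(50, 10, 400, cluster_df="conventional")
# (50/49, 10/9, 400/399)
```

With "min", all three components are inflated by the factor of the smallest dimension, which is the more conservative choice.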
More on Inference
For computing critical values:
- OLS and IV: use t-statistics with df_t = N - dof_k (non-clustered) or df_t = G - 1 (clustered).
- GLMs: use z-statistics (normal approximation).
For multi-way clustering:
- Two-way: df_t = min(G_1 - 1, G_2 - 1)
- Three-way: df_t = min(G_1 - 1, G_2 - 1, G_3 - 1) (not currently supported)
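The degrees-of-freedom rules for OLS and IV can be collected into one helper. This is a sketch with a hypothetical name and signature. GLMs are not covered because they use the normal approximation and need no df_t.

```python
def df_t(N, dof_k, cluster_counts=None):
    """Degrees of freedom for t-based p-values and confidence intervals.

    cluster_counts: None or empty for IID / heteroskedastic errors,
    otherwise the number of clusters in each clustering dimension.
    """
    if not cluster_counts:
        return N - dof_k
    # One-way clustering gives G - 1; multi-way takes the smallest G_i - 1.
    return min(G - 1 for G in cluster_counts)


df_plain = df_t(N=1000, dof_k=5)                               # 995
df_one_way = df_t(N=1000, dof_k=5, cluster_counts=[50])        # 49
df_two_way = df_t(N=1000, dof_k=5, cluster_counts=[50, 10])    # 9
```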
See this implementation for details.
In Code
All of the above logic is implemented here.