On Small Sample Corrections
The fixest R package provides various options for small sample corrections. While it has an excellent vignette on the topic, reproducing its behavior in pyfixest took more time than expected. So that future developers (and my future self) can stay sane, I’ve compiled all of my hard-earned understanding of how small sample adjustments work in fixest and how they are implemented in pyfixest in this document.
In both fixest and pyfixest, small sample corrections are controlled by the ssc function. In pyfixest, ssc accepts four arguments: adj, cluster_adj, fixef_k and cluster_df.
Based on these inputs, the adjusted variance-covariance matrix is computed as:
vcov_adj = (adj_val(N, dof_k) if adj else 1)
           * (cluster_adj_val(G, cluster_df) if cluster_adj else 1)
           * vcov
Where:
- adj: Enables or disables the first scalar adjustment.
- cluster_adj: Enables or disables the second scalar adjustment.
- vcov: The unadjusted variance-covariance matrix.
- dof_k: The number of estimated parameters considered in the first adjustment. Impacts adj_val.
- fixef_k: Determines how dof_k is computed (how fixed effects are counted).
- cluster_df: Determines how cluster_adj_val is computed (only relevant for multi-way clustering).
- G: The number of unique clusters (G = N for heteroskedastic errors).
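To make the formula concrete, here is a minimal sketch of the scaling step for one-way clustering. The function name adjust_vcov and its signature are hypothetical and not part of pyfixest; the two factors use the definitions given later in this document, and the matrix is a plain nested list to keep the example dependency-free.

```python
def adjust_vcov(vcov, N, dof_k, G, adj=True, cluster_adj=True):
    """Scale an unadjusted vcov matrix by the two scalar corrections.

    Hypothetical helper for illustration only (one-way clustering, so
    cluster_df plays no role here).
    """
    # First scalar: (N - 1) / (N - dof_k), or 1 if adj is switched off.
    adj_val = (N - 1) / (N - dof_k) if adj else 1.0
    # Second scalar: G / (G - 1), or 1 if cluster_adj is switched off.
    cluster_adj_val = G / (G - 1) if cluster_adj else 1.0
    scale = adj_val * cluster_adj_val
    return [[scale * v for v in row] for row in vcov]


vcov = [[1.0, 0.0], [0.0, 2.0]]
vcov_adj = adjust_vcov(vcov, N=100, dof_k=5, G=10)
```

Both switches act independently, so turning off adj and cluster_adj returns the matrix unchanged.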
Outside of this formula, we have df_t, which is the degrees of freedom used for p-values and confidence intervals:
- df_t = N - dof_k for IID or heteroskedastic errors.
- df_t = G - 1 for clustered errors.
Small Sample Adjustments
adj = True
If adj = True, the adjustment factor is:
adj_val = (N - 1) / (N - dof_k)
If adj = False, no adjustment is applied.
fixef_k
The fixef_k argument controls how fixed effects contribute to dof_k, and thus to adj_val. It supports three options:
- "none"
- "full"
- "nested"
fixef_k = "none"
Fixed effects are ignored when counting parameters, so dof_k = k, the number of non-fixed-effect coefficients:
- Example: Y ~ X1 | f1 → k = 1
- Example: Y ~ X1 + X2 | f1 → k = 2
fixef_k = "full"
Fixed effects are fully counted. For n_fe fixed effects f_i (i = 1 to n_fe), we set dof_k = k + k_fe.
If there is more than one fixed effect, we drop one level from each fixed effect except the first (to avoid multicollinearity):
k_fe = sum_{i=1}^{n_fe} levels(f_i) - (n_fe - 1)
If there is only one fixed effect, the correction term (n_fe - 1) is zero:
k_fe = sum_{i=1}^{n_fe} levels(f_i) = levels(f_1)
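The counting rule for fixef_k = "full" can be sketched as follows. The helper dof_k_full is hypothetical (it is not a pyfixest function) and takes the number of levels of each fixed effect as a plain list.

```python
def dof_k_full(k, fe_levels):
    """Return dof_k = k + k_fe when fixed effects are fully counted.

    k: number of non-fixed-effect coefficients.
    fe_levels: list with the number of levels of each fixed effect.
    Hypothetical helper for illustration only.
    """
    n_fe = len(fe_levels)
    if n_fe == 0:
        # No fixed effects: nothing to add.
        return k
    # Drop one level from every fixed effect except the first.
    k_fe = sum(fe_levels) - (n_fe - 1)
    return k + k_fe


one_fe = dof_k_full(k=2, fe_levels=[10])      # 2 + 10
two_fe = dof_k_full(k=2, fe_levels=[10, 5])   # 2 + (15 - 1)
```

With a single fixed effect the subtraction term vanishes, so both cases in the text are covered by the same line.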
fixef_k = "nested"
Fixed effects may be nested within cluster variables (e.g., district FEs nested in state clusters). If fixef_k = "nested", nested fixed effects do not count toward k_fe:
k_fe = sum_{i=1}^{n_fe} levels(f_i) - k_fe_nested - (n_fe - 1)
where k_fe_nested is the number of levels belonging to nested fixed effects. For cluster fixed effects, k_fe_nested = G, the number of clusters.
⚠️ Note: If you already subtracted a level from a nested FE, you may need to add it back.
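As a rough illustration, the snippet below transcribes the nested formula literally. The helper k_fe_nested_option and its dict-based inputs are hypothetical, and it deliberately does not handle the add-back caveat from the note above, so treat it as a sketch of the arithmetic rather than of what pyfixest does internally.

```python
def k_fe_nested_option(fe_levels, nested):
    """Count fixed-effect parameters under fixef_k = "nested".

    fe_levels: dict mapping fixed-effect name -> number of levels.
    nested: names of fixed effects assumed to be nested in the cluster
        variable (detecting nestedness is out of scope here).
    Hypothetical helper; literal transcription of the formula above.
    """
    n_fe = len(fe_levels)
    if n_fe == 0:
        return 0
    total_levels = sum(fe_levels.values())
    # Levels of nested fixed effects are not counted.
    k_fe_nested = sum(fe_levels[name] for name in nested)
    return total_levels - k_fe_nested - (n_fe - 1)


fe_levels = {"district": 50, "year": 10}
k_fe = k_fe_nested_option(fe_levels, nested={"district"})  # 60 - 50 - 1
```

When the only fixed effect is the cluster variable itself, the function returns 0, which matches k_fe_nested = G in the text.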
cluster_adj
If cluster_adj = True, we apply a second correction:
cluster_adj_val = G / (G - 1)
Where:
- G is the number of clusters for clustered errors, or N for heteroskedastic errors.
- This follows the approach in R’s sandwich package, interpreting heteroskedastic errors as “singleton clusters.”
Tip: If cluster_adj = True for IID errors, cluster_adj_val defaults to 1. For heteroskedastic errors, despite its name, cluster_adj = True will apply an adjustment of N / (N - 1), as there are G = N singleton clusters.
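The three cases (IID, heteroskedastic, clustered) can be summarized in a few lines. The function cluster_factor and the vcov_type labels "iid", "hetero" and "cluster" are hypothetical names chosen for this sketch; the factor itself is G / (G - 1) as defined above.

```python
def cluster_factor(vcov_type, N, G=None, cluster_adj=True):
    """Return the second scalar correction for a given error type.

    vcov_type: "iid", "hetero" or "cluster" (labels are assumptions
        made for this example).
    Hypothetical helper for illustration only.
    """
    if not cluster_adj or vcov_type == "iid":
        # No second correction for IID errors or when switched off.
        return 1.0
    if vcov_type == "hetero":
        # Heteroskedastic errors are treated as N singleton clusters.
        G = N
    return G / (G - 1)


iid_factor = cluster_factor("iid", N=100)
hetero_factor = cluster_factor("hetero", N=100)          # 100 / 99
clustered_factor = cluster_factor("cluster", N=100, G=10)  # 10 / 9
```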
cluster_df
Relevant only for multi-way clustering. Two-way clustering, for example, can be written as:
vcov = ssc_A * vcov_A + ssc_B * vcov_B - ssc_AB * vcov_AB
where A and B are clustering dimensions, with G_AB > G_A > G_B.
- If cluster_df = "min", then G is set to the minimum value of G_A, G_B, and G_AB.
- If cluster_df = "conventional", each clustering dimension uses its own cluster count (G_A, G_B, etc.) for its respective adjustment.
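The difference between the two options is easiest to see by computing the G / (G - 1) factor for each term of the two-way formula. This is a minimal sketch with a hypothetical helper, cluster_factors; it covers only the cluster part of each ssc term, not the adj_val part.

```python
def cluster_factors(G_A, G_B, G_AB, cluster_df="min"):
    """Return the G / (G - 1) factor for the A, B and AB terms.

    Hypothetical helper for illustration only.
    """
    counts = {"A": G_A, "B": G_B, "AB": G_AB}
    if cluster_df == "min":
        # Every term uses the smallest cluster count.
        G_min = min(counts.values())
        counts = {name: G_min for name in counts}
    elif cluster_df != "conventional":
        raise ValueError("cluster_df must be 'min' or 'conventional'")
    return {name: G / (G - 1) for name, G in counts.items()}


f_min = cluster_factors(G_A=20, G_B=10, G_AB=150, cluster_df="min")
f_conv = cluster_factors(G_A=20, G_B=10, G_AB=150, cluster_df="conventional")
```

With "min", all three terms share the factor 10 / 9 in this example; with "conventional", they use 20 / 19, 10 / 9 and 150 / 149 respectively.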
More on Inference
For computing critical values:
- OLS and IV: use t-statistics with df_t = N - dof_k (non-clustered) or df_t = G - 1 (clustered).
- GLMs: use z-statistics (normal approximation).
For multi-way clustering:
- Two-way: df_t = min(G_1 - 1, G_2 - 1)
- Three-way: df_t = min(G_1 - 1, G_2 - 1, G_3 - 1) (not currently supported)
See this implementation for details.
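The degrees-of-freedom rules for t-statistics can be collected in one small function. The name df_t_for and its arguments are hypothetical; the sketch only restates the rules listed above for OLS and IV.

```python
def df_t_for(N, dof_k, cluster_counts=None):
    """Return the degrees of freedom used for t-based inference.

    cluster_counts: list of cluster counts, one per clustering
        dimension; None or empty for non-clustered errors.
    Hypothetical helper for illustration only.
    """
    if not cluster_counts:
        # IID or heteroskedastic errors.
        return N - dof_k
    # One-way clustering gives G - 1; multi-way takes the minimum.
    return min(G - 1 for G in cluster_counts)


df_iid = df_t_for(N=100, dof_k=5)
df_one_way = df_t_for(N=100, dof_k=5, cluster_counts=[10])
df_two_way = df_t_for(N=100, dof_k=5, cluster_counts=[20, 10])
```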
In Code
All of the above logic is implemented here.