# PyFixest

> Fast high-dimensional fixed effects regression in Python, closely mirroring the syntax of the R package fixest.

## Docs

- [Getting Started](https://pyfixest.org/quickstart.html): [markdown](https://pyfixest.org/quickstart.html.md)
- [API Reference](https://pyfixest.org/reference/): [markdown](https://pyfixest.org/reference/index.html.md)
- [feols API](https://pyfixest.org/reference/estimation.api.feols.feols.html): [markdown](https://pyfixest.org/reference/estimation.api.feols.feols.html.md)
- [Regression Tables (etable)](https://pyfixest.org/table-layout.html): [markdown](https://pyfixest.org/table-layout.html.md)
- [Changelog](https://pyfixest.org/changelog.html): [markdown](https://pyfixest.org/changelog.html.md)

All documentation pages are available as clean markdown by appending `.md` to the HTML URL (e.g., `quickstart.html` -> `quickstart.html.md`).

## Core API (4 functions)

- `pyfixest.feols(fml, data, vcov, weights, ssc, fixef_rm, ...)`: OLS/WLS/IV with fixed effects.
- `pyfixest.fepois(fml, data, vcov, ...)`: Poisson regression with fixed effects.
- `pyfixest.feglm(fml, data, family, vcov, ...)`: GLM regression (family: "logit", "probit", "gaussian") with fixed effects.
- `pyfixest.quantreg(fml, data, quantile, ...)`: Quantile regression via interior point solver.

## Formula Syntax

Formulas follow fixest syntax and are split into 1-3 parts by `|`:

- One-part: `"Y ~ X1 + X2"` (no fixed effects, no IV)
- Two-part: `"Y ~ X1 + X2 | FE1 + FE2"` (fixed effects)
- Two-part IV: `"Y ~ X1 + X2 | X_endog ~ Z1 + Z2"` (IV without fixed effects)
- Three-part IV: `"Y ~ X1 + X2 | FE1 + FE2 | X_endog ~ Z1 + Z2"` (IV with fixed effects)

IV behavior:

- The IV part must be `endogenous ~ instruments`.
- Exogenous variables from the second-stage RHS are automatically added to the first stage.
- Endogenous variables are automatically added to the second stage.
- Multiple endogenous variables are not supported.


Other syntax:

- Multiple depvars are expanded to multiple estimations: `"Y1 + Y2 ~ X1"` behaves like `"sw(Y1, Y2) ~ X1"`.
- `i()` creates indicator expansions and interactions:
  - `i(cat)` expands to dummies for each level of `cat` (one omitted).
  - `i(cat, ref="Base")` sets the omitted reference level explicitly.
  - `i(cat, x)` interacts `cat` with `x`. If `x` is numeric, this yields category-specific slopes. If `x` is categorical, this yields cat-by-x indicators.
  - `i(cat1, cat2, ref2="Base")` interacts two categorical variables; `ref2` sets the omitted level of `cat2`.
  - Example (cat x numeric): `Y ~ i(industry, exposure)` creates industry-specific slopes on `exposure`.
  - Example (cat x cat): `Y ~ i(state, year, ref2=2000)` creates state-by-year indicators with 2000 as the base year.
- Standard interactions work as well:
  - `X1 * X2` expands to `X1 + X2 + X1:X2`.
  - `X1:X2` is the interaction term only (no main effects).
- Interacted FEs: `"Y ~ X1 | FE1 ^ FE2"` (creates a combined FE).

### Multiple Estimation Operators

Operators can appear anywhere in the formula (RHS, fixed effects, IV parts). They can be combined; expansion is recursive and produces all combinations. Multiple estimation can be significantly faster than independent model calls due to internal caching of demeaned covariates.

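
To make the "all combinations" rule concrete, here is a plain-Python sketch of how the operators in this section expand into individual formulas. It is an illustration only, not PyFixest's implementation: the helper functions (`sw`, `sw0`, `csw`, `csw0`, `mvsw`, `expand`) are hypothetical and only reproduce the expansion rules documented in this section.

```python
from itertools import combinations, product


def sw(*terms):
    """Sequential stepwise: one term per model."""
    return [[t] for t in terms]


def sw0(*terms):
    """Sequential stepwise, preceded by the empty (zero) step."""
    return [[]] + sw(*terms)


def csw(*terms):
    """Cumulative stepwise: terms accumulate from left to right."""
    return [list(terms[: i + 1]) for i in range(len(terms))]


def csw0(*terms):
    """Cumulative stepwise, preceded by the empty (zero) step."""
    return [[]] + csw(*terms)


def mvsw(*terms):
    """Multiverse stepwise: every subset of the terms, including the empty one."""
    return [list(c) for r in range(len(terms) + 1) for c in combinations(terms, r)]


def expand(depvar, fixed, *steps):
    """Combine fixed RHS terms with every combination of operator steps."""
    formulas = []
    for combo in product(*steps):
        rhs = list(fixed) + [term for step in combo for term in step]
        formulas.append(f"{depvar} ~ " + (" + ".join(rhs) if rhs else "1"))
    return formulas


print(expand("y", ["x1"], csw0("x2", "x3")))
print(expand("y", [], csw("x1", "x2"), sw("z1", "z2")))
```

Combining two operators is a Cartesian product of their steps, which is why `csw(x1, x2) + sw(z1, z2)` yields four formulas.
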

`sw` (sequential stepwise):

- `y ~ x1 + sw(x2, x3)` -> `y ~ x1 + x2` and `y ~ x1 + x3`

`sw0` (sequential stepwise with zero step):

- `y ~ x1 + sw0(x2, x3)` -> `y ~ x1`, `y ~ x1 + x2`, `y ~ x1 + x3`

`csw` (cumulative stepwise):

- `y ~ x1 + csw(x2, x3)` -> `y ~ x1 + x2`, `y ~ x1 + x2 + x3`

`csw0` (cumulative stepwise with zero step):

- `y ~ x1 + csw0(x2, x3)` -> `y ~ x1`, `y ~ x1 + x2`, `y ~ x1 + x2 + x3`

`mvsw` (multiverse stepwise):

- `y ~ mvsw(x1, x2, x3)` -> all non-empty combinations plus the zero step: `y ~ 1`, `y ~ x1`, `y ~ x2`, `y ~ x3`, `y ~ x1 + x2`, `y ~ x1 + x3`, `y ~ x2 + x3`, `y ~ x1 + x2 + x3`

Combining operators example:

- `y ~ csw(x1, x2) + sw(z1, z2)` expands to: `y ~ x1 + z1`, `y ~ x1 + z2`, `y ~ x1 + x2 + z1`, `y ~ x1 + x2 + z2`

You can run regressions for subsamples by using the `split` and `fsplit` arguments, where both split by the provided variable, but `fsplit` also provides a fit for the full sample.

## Inference (vcov)

Pass to `vcov`:

- `"iid"` -- IID errors
- `"hetero"` -- HC1 heteroskedasticity-robust (alias: `"HC1"`)
- `"HC2"` -- HC2 robust (not supported with fixed effects or IV)
- `"HC3"` -- HC3 robust (not supported with fixed effects or IV)
- `{"CRV1": "cluster_var"}` -- Cluster-robust variance
- `{"CRV3": "cluster_var"}` -- Leave-one-cluster-out jackknife
- `"NW"` -- Newey-West HAC (requires `vcov_kwargs` with `time_id` and optionally `panel_id`, `lag`)
- `"DK"` -- Driscoll-Kraay HAC (requires `vcov_kwargs` with `time_id` and optionally `panel_id`, `lag`)

Two-way clustering: `{"CRV1": "var1 + var2"}`.

Inference can be adjusted post-estimation: `fit.vcov("hetero").summary()`.


## Post Processing

Model objects support:

- `.summary()` -- Print regression summary
- `.tidy()` -- Tidy DataFrame of coefficients, SEs, t-stats, p-values, CIs
- `.coef()` -- Coefficient values
- `.se()` -- Standard errors
- `.pvalue()` -- P-values
- `.confint()` -- Confidence intervals
- `.predict(newdata)` -- Predictions
- `.resid()` -- Residuals
- `.vcov()` -- Variance-covariance matrix
- `.tstat()` -- t-statistics
- `.fixef()` -- Extract fixed effect estimates
- `.wildboottest(param, reps, seed)` -- Wild cluster bootstrap inference
- `.ccv(treatment, pk, qk, ...)` -- Causal cluster variance estimator
- `.ritest(resampvar, reps, ...)` -- Randomization inference
- `.decompose(param, x1_vars, type, ...)` -- Gelbach (2016) decomposition
- `.wald_test(R, q)` -- Linear hypothesis testing
- `.first_stage()` -- First-stage results (IV only)
- `.IV_Diag()` -- IV diagnostic tests (IV only)

For IV models, show first and second stage together: `pf.etable([fit._model_1st_stage, fit])`.

## DiD / Causal Inference

- `pyfixest.did2s(data, yname, first_stage, second_stage, treatment, cluster)` -- Two-stage DID (Gardner 2022).
- `pyfixest.event_study(data, yname, idname, tname, gname, estimator="twfe")` -- Event study with multiple estimators.
- `pyfixest.lpdid(data, yname, idname, tname, gname)` -- Local projections DID.
- `pyfixest.SaturatedEventStudy(data, yname, idname, tname, gname)` -- Saturated event study with cohort-specific effects.
- `pyfixest.panelview(data, unit, time, treat)` -- Panel treatment visualization.

## Visualization

- `pyfixest.coefplot(models)` -- Plot coefficients with confidence intervals.
- `pyfixest.iplot(models)` -- Plot coefficients from `i()` interactions (event-study style).
- `pyfixest.qplot(models)` -- Plot quantile regression coefficients.

## Multiple Testing

- `pyfixest.bonferroni(models, param)` -- Bonferroni-adjusted p-values.
- `pyfixest.rwolf(models, param, reps, seed)` -- Romano-Wolf adjusted p-values.
- `pyfixest.wyoung(models, param, reps, seed)` -- Westfall-Young adjusted p-values.

## Utilities

- `pyfixest.get_data(N, seed)` -- Generate example dataset for testing.
- `pyfixest.ssc(k_adj, k_fixef, G_adj, G_df)` -- Configure small sample corrections.

## etable Basics

For regression tables, use `pf.etable()`.

- Build tables: `pf.etable([fit1, fit2, ...])` or `pf.etable(pf.feols("Y~csw(X1,X2)", data))`.
- Output formats: `type="gt"` (default), `"md"`, `"tex"`, `"df"`.
- Keep/drop variables: `keep="X1"` or `drop=["X2"]`.
- Labels: `labels={"X1": "Age"}`, `felabels={"f1": "Industry FE"}`.
- Coefficient format: `coef_fmt="b (se)\n[p]"` shows coefficient, SE in parentheses, p-value in brackets.
- Title: `caption="Regression Results"`.
- Column headers: `model_heads=[...]` and `head_order="hd"` or `"dh"` to control header order.