did.estimation.did2s

did.estimation.did2s(
    data,
    yname,
    first_stage,
    second_stage,
    treatment,
    cluster,
    weights=None,
)

Estimate a Difference-in-Differences model using Gardner’s two-step DID2S estimator.
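The two steps can be illustrated on a small synthetic panel. What follows is a minimal pure-Python sketch of the idea, not pyfixest's implementation. The panel, the helper name `two_step_att`, and the backfitting loop are all assumptions made for illustration, and the sketch ignores the standard-error adjustment that a two-stage procedure requires. Step one fits unit and year fixed effects on untreated observations only. Step two regresses the adjusted outcome on the treatment dummy.

```python
from collections import defaultdict


def two_step_att(rows, n_iter=200):
    """Hypothetical helper: rows is a list of dicts with keys
    'unit', 'year', 'y' and 'treat' (bool)."""
    untreated = [r for r in rows if not r["treat"]]
    unit_fe = defaultdict(float)
    year_fe = defaultdict(float)

    # Step 1: fit y = unit_fe + year_fe on untreated rows only,
    # alternating between the two sets of effects (backfitting).
    for _ in range(n_iter):
        sums, counts = defaultdict(float), defaultdict(int)
        for r in untreated:
            sums[r["unit"]] += r["y"] - year_fe[r["year"]]
            counts[r["unit"]] += 1
        for u in sums:
            unit_fe[u] = sums[u] / counts[u]

        sums, counts = defaultdict(float), defaultdict(int)
        for r in untreated:
            sums[r["year"]] += r["y"] - unit_fe[r["unit"]]
            counts[r["year"]] += 1
        for t in sums:
            year_fe[t] = sums[t] / counts[t]

    # Step 2: remove the fitted effects from every row, then regress the
    # adjusted outcome on the treatment dummy without an intercept.
    adjusted = [r["y"] - unit_fe[r["unit"]] - year_fe[r["year"]] for r in rows]
    dummy = [1.0 if r["treat"] else 0.0 for r in rows]
    numerator = sum(d * a for d, a in zip(dummy, adjusted))
    denominator = sum(d * d for d in dummy)
    return numerator / denominator


# Synthetic, noise-free panel: units 3-5 are treated from year 3 onward,
# with a constant treatment effect of 2.0.
rows = []
for unit in range(6):
    first_treated = 3 if unit >= 3 else None
    for year in range(6):
        treat = first_treated is not None and year >= first_treated
        y = 1.0 * unit + 0.5 * year + (2.0 if treat else 0.0)
        rows.append({"unit": unit, "year": year, "y": y, "treat": treat})

att = two_step_att(rows)
print(att)
```

Because the synthetic data contain no noise, the sketch should recover the true effect of 2.0 up to numerical error. On real data, use `did2s` itself, which also handles inference.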

Parameters

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| data | pd.DataFrame | The DataFrame containing all variables. | required |
| yname | str | The name of the dependent variable. | required |
| first_stage | str | The formula for the first stage, starting with ‘~’. | required |
| second_stage | str | The formula for the second stage, starting with ‘~’. | required |
| treatment | str | The name of the treatment variable. | required |
| cluster | str | The name of the cluster variable. | required |

Returns

| Name | Type | Description |
| --- | --- | --- |
|  | object | A fitted model object of class [Feols](/reference/Feols.qmd). |

Examples

import pandas as pd
import numpy as np
import pyfixest as pf

url = "https://raw.githubusercontent.com/py-econometrics/pyfixest/master/pyfixest/did/data/df_het.csv"
df_het = pd.read_csv(url)
df_het.head()
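|   | unit | state | group | unit_fe | g | year | year_fe | treat | rel_year | rel_year_binned | error | te | te_dynamic | dep_var |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 1 | 33 | Group 2 | 7.043016 | 2010 | 1990 | 0.066159 | False | -20.0 | -6 | -0.086466 | 0 | 0.0 | 7.022709 |
| 1 | 1 | 33 | Group 2 | 7.043016 | 2010 | 1991 | -0.030980 | False | -19.0 | -6 | 0.766593 | 0 | 0.0 | 7.778628 |
| 2 | 1 | 33 | Group 2 | 7.043016 | 2010 | 1992 | -0.119607 | False | -18.0 | -6 | 1.512968 | 0 | 0.0 | 8.436377 |
| 3 | 1 | 33 | Group 2 | 7.043016 | 2010 | 1993 | 0.126321 | False | -17.0 | -6 | 0.021870 | 0 | 0.0 | 7.191207 |
| 4 | 1 | 33 | Group 2 | 7.043016 | 2010 | 1994 | -0.106921 | False | -16.0 | -6 | -0.017603 | 0 | 0.0 | 6.918492 |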

In a first step, we estimate a classical event study model:

# estimate the model
fit = pf.did2s(
    df_het,
    yname="dep_var",
    first_stage="~ 0 | unit + year",
    second_stage="~i(rel_year, ref=-1.0)",
    treatment="treat",
    cluster="state",
)

fit.tidy().head()
| Coefficient | Estimate | Std. Error | t value | Pr(>\|t\|) | 2.5% | 97.5% |
| --- | --- | --- | --- | --- | --- | --- |
| C(rel_year, contr.treatment(base=-1.0))[T.-inf] | -3.551930e-08 | 5.844125e-09 | -6.077778 | 4.038252e-07 | -4.734015e-08 | -2.369844e-08 |
| C(rel_year, contr.treatment(base=-1.0))[T.-20.0] | -5.822583e-02 | 3.580900e-02 | -1.626011 | 1.120020e-01 | -1.306564e-01 | 1.420471e-02 |
| C(rel_year, contr.treatment(base=-1.0))[T.-19.0] | -6.032212e-03 | 3.034072e-02 | -0.198816 | 8.434398e-01 | -6.740212e-02 | 5.533769e-02 |
| C(rel_year, contr.treatment(base=-1.0))[T.-18.0] | -6.152375e-03 | 3.509400e-02 | -0.175311 | 8.617421e-01 | -7.713669e-02 | 6.483194e-02 |
| C(rel_year, contr.treatment(base=-1.0))[T.-17.0] | -1.253327e-02 | 2.483369e-02 | -0.504688 | 6.166178e-01 | -6.276415e-02 | 3.769761e-02 |
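The coefficient labels above embed the relative period as a suffix of the form `[T.<value>]`. When the estimates are needed as a numeric event-time series, for example for custom plotting, the period can be parsed from each label. The helper below is a hypothetical sketch that assumes the label format shown in the output above, including the `-inf` level. It is not part of pyfixest.

```python
import re


def parse_rel_year(name):
    """Return the relative period encoded in a coefficient label, or None."""
    match = re.search(r"\[T\.(-?(?:inf|\d+(?:\.\d+)?))\]$", name)
    return float(match.group(1)) if match else None


# Labels copied from the tidy() output above.
names = [
    "C(rel_year, contr.treatment(base=-1.0))[T.-inf]",
    "C(rel_year, contr.treatment(base=-1.0))[T.-20.0]",
    "C(rel_year, contr.treatment(base=-1.0))[T.-19.0]",
]
periods = [parse_rel_year(n) for n in names]
print(periods)
```

Labels that do not match the pattern return `None`, so they can be filtered out before plotting.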

We can also inspect the model visually:

fit.iplot(figsize=[1200, 400], coord_flip=False).show()

To estimate a pooled effect, we need to slightly update the second stage formula:

fit = pf.did2s(
    df_het,
    yname="dep_var",
    first_stage="~ 0 | unit + year",
    second_stage="~i(treat)",
    treatment="treat",
    cluster="state"
)
fit.tidy().head()
| Coefficient | Estimate | Std. Error | t value | Pr(>\|t\|) | 2.5% | 97.5% |
| --- | --- | --- | --- | --- | --- | --- |
| C(treat)[T.True] | 2.230482 | 0.024709 | 90.271444 | 0.0 | 2.180504 | 2.280459 |
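The columns of this output are related by simple arithmetic, which serves as a sanity check when reading the table. The sketch below uses only the rounded numbers printed above. It recomputes the t value as the estimate divided by its standard error, and backs out the multiplier implied by the confidence interval. The variable names are illustrative.

```python
# Values copied from the pooled-effect output above (already rounded).
estimate, std_error = 2.230482, 0.024709
ci_low, ci_high = 2.180504, 2.280459

# t value = estimate / standard error
t_value = estimate / std_error

# The interval is estimate +/- critical_value * std_error, so the
# critical value is the half-width divided by the standard error.
critical_value = (ci_high - ci_low) / (2 * std_error)

print(round(t_value, 2), round(critical_value, 2))
```

The implied multiplier is slightly above the normal-approximation value of 1.96. That would be consistent with an interval based on a t distribution with limited degrees of freedom, but this is an inference from the printed numbers, not something the output states.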