did.estimation.did2s

did.estimation.did2s(data, yname, first_stage, second_stage, treatment, cluster)

Estimate a Difference-in-Differences model using Gardner’s two-step DID2S estimator.
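
As a rough illustration of the two-step idea, the sketch below runs both stages by hand on a tiny, noise-free synthetic panel using only the standard library. The first stage fits unit and year fixed effects on untreated observations only; the second stage regresses the adjusted outcome on the treatment indicator, which for a single binary regressor without an intercept reduces to a mean over treated observations. All names here (`first_stage`, `second_stage`, `TRUE_EFFECT`, the panel layout) are hypothetical. This is not how `did2s` is implemented, and it omits the standard-error correction for the estimated first stage.

```python
# Toy two-stage DiD on a noise-free synthetic panel (illustration only;
# not pyfixest's implementation, and no standard errors are computed).
TRUE_EFFECT = 2.0
units, years = range(6), range(6)
unit_fe = {i: 0.5 * i for i in units}
year_fe = {t: 0.1 * t for t in years}

rows = []
for i in units:
    for t in years:
        treated = i < 3 and t >= 3  # units 0-2 are treated from year 3 onward
        y = unit_fe[i] + year_fe[t] + TRUE_EFFECT * treated
        rows.append((i, t, treated, y))


def first_stage(rows, n_iter=100):
    """Fit unit and year effects on untreated rows by alternating means."""
    untreated = [(i, t, y) for i, t, d, y in rows if not d]
    alpha = {i: 0.0 for i, _, _ in untreated}
    gamma = {t: 0.0 for _, t, _ in untreated}
    for _ in range(n_iter):
        for i in alpha:
            vals = [y - gamma[t] for u, t, y in untreated if u == i]
            alpha[i] = sum(vals) / len(vals)
        for t in gamma:
            vals = [y - alpha[u] for u, s, y in untreated if s == t]
            gamma[t] = sum(vals) / len(vals)
    return alpha, gamma


def second_stage(rows, alpha, gamma):
    """Average the fixed-effect-adjusted outcome over treated rows."""
    adjusted = [y - alpha[i] - gamma[t] for i, t, d, y in rows if d]
    return sum(adjusted) / len(adjusted)


alpha_hat, gamma_hat = first_stage(rows)
effect_hat = second_stage(rows, alpha_hat, gamma_hat)
print(round(effect_hat, 6))
```

Because the synthetic outcome contains no noise, the recovered effect should match `TRUE_EFFECT` up to numerical tolerance. With real data, use `did2s` itself so that inference accounts for the first-stage estimation.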

Parameters

| Name | Type | Description | Default |
|------|------|-------------|---------|
| data | pd.DataFrame | The DataFrame containing all variables. | required |
| yname | str | The name of the dependent variable. | required |
| first_stage | str | The formula for the first stage, starting with ‘~’. | required |
| second_stage | str | The formula for the second stage, starting with ‘~’. | required |
| treatment | str | The name of the treatment variable. | required |
| cluster | str | The name of the cluster variable. | required |

Returns

Type Description
object A fitted model object of class [Feols](/reference/Feols.qmd).

Examples

import pandas as pd
import numpy as np
from pyfixest.did.estimation import did2s

url = "https://raw.githubusercontent.com/py-econometrics/pyfixest/master/pyfixest/did/data/df_het.csv"
df_het = pd.read_csv(url)
df_het.head()
unit state group unit_fe g year year_fe treat rel_year rel_year_binned error te te_dynamic dep_var
0 1 33 Group 2 7.043016 2010 1990 0.066159 False -20.0 -6 -0.086466 0 0.0 7.022709
1 1 33 Group 2 7.043016 2010 1991 -0.030980 False -19.0 -6 0.766593 0 0.0 7.778628
2 1 33 Group 2 7.043016 2010 1992 -0.119607 False -18.0 -6 1.512968 0 0.0 8.436377
3 1 33 Group 2 7.043016 2010 1993 0.126321 False -17.0 -6 0.021870 0 0.0 7.191207
4 1 33 Group 2 7.043016 2010 1994 -0.106921 False -16.0 -6 -0.017603 0 0.0 6.918492

First, we estimate a classical event-study model:

# estimate the model
fit = did2s(
    df_het,
    yname="dep_var",
    first_stage="~ 0 | unit + year",
    second_stage="~i(rel_year, ref=-1.0)",
    treatment="treat",
    cluster="state",
)

fit.tidy().head()
Estimate Std. Error t value Pr(>|t|) 2.5% 97.5%
Coefficient
C(rel_year, contr.treatment(base=-1.0))[T.-20.0] -0.058226 0.035809 -1.626011 0.103947 -0.128410 0.011959
C(rel_year, contr.treatment(base=-1.0))[T.-19.0] -0.006032 0.030341 -0.198816 0.842407 -0.065499 0.053435
C(rel_year, contr.treatment(base=-1.0))[T.-18.0] -0.006152 0.035094 -0.175310 0.860836 -0.074935 0.062631
C(rel_year, contr.treatment(base=-1.0))[T.-17.0] -0.012533 0.024834 -0.504689 0.613778 -0.061206 0.036140
C(rel_year, contr.treatment(base=-1.0))[T.-16.0] -0.034698 0.029806 -1.164128 0.244372 -0.093116 0.023720
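
The coefficient labels above embed the relative year in their `[T.…]` suffix. If you want the relative years as numbers (for example, to sort or filter the event-study estimates), a small parser along the following lines may help. The helper name `relative_year` is hypothetical, and the pattern assumes the label format shown in the output above.

```python
import re

# Matches the trailing "[T.<number>]" part of a coefficient label,
# ignoring the "base=-1.0" that appears earlier in the same string.
_LABEL_SUFFIX = re.compile(r"\[T\.(-?\d+(?:\.\d+)?)\]$")


def relative_year(label):
    """Return the relative year encoded in an event-study coefficient label."""
    match = _LABEL_SUFFIX.search(label)
    if match is None:
        raise ValueError(f"no relative year found in label: {label!r}")
    return float(match.group(1))


labels = [
    "C(rel_year, contr.treatment(base=-1.0))[T.-20.0]",
    "C(rel_year, contr.treatment(base=-1.0))[T.-19.0]",
    "C(rel_year, contr.treatment(base=-1.0))[T.-18.0]",
]
parsed_years = [relative_year(label) for label in labels]
print(parsed_years)
```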

We can also inspect the model visually:

fit.iplot(figsize=[1200, 400], coord_flip=False).show()

To estimate a single pooled treatment effect, replace the event-study term in the second-stage formula with the treatment indicator:

fit = did2s(
    df_het,
    yname="dep_var",
    first_stage="~ 0 | unit + year",
    second_stage="~i(treat)",
    treatment="treat",
    cluster="state"
)
fit.tidy().head()
Estimate Std. Error t value Pr(>|t|) 2.5% 97.5%
Coefficient
C(treat)[T.True] 2.230482 0.024709 90.271437 0.0 2.182054 2.27891
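
As a sanity check on how the columns relate, the sketch below recomputes the t value, confidence bounds, and p-value of the pooled estimate from the printed estimate and standard error. It assumes a standard normal reference distribution; that assumption is inferred from the printed numbers, which it reproduces up to rounding, and not from the library's source.

```python
from statistics import NormalDist

# Values copied from the pooled-effect output above.
estimate, std_error = 2.230482, 0.024709

t_value = estimate / std_error
z = NormalDist().inv_cdf(0.975)  # two-sided 95% critical value, about 1.96
ci_low = estimate - z * std_error
ci_high = estimate + z * std_error
p_value = 2 * (1 - NormalDist().cdf(abs(t_value)))

print(f"t = {t_value:.2f}, CI = [{ci_low:.4f}, {ci_high:.4f}], p = {p_value}")
```

Small differences in the last digits relative to the table are expected, since the inputs here are already rounded to six decimals.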