did.estimation.did2s(data, yname, first_stage, second_stage, treatment, cluster)
Estimate a Difference-in-Differences model using Gardner’s two-step DID2S estimator.
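As a rough sketch of what the two steps do, assuming unit and time fixed effects in the first stage and a single treatment indicator in the second (the notation below is ours and is not taken from the function's source):

$$
\begin{aligned}
\text{Step 1 (untreated observations only, } D_{it} = 0\text{):}\quad & y_{it} = \mu_i + \eta_t + \varepsilon_{it} \\
\text{Step 2 (all observations):}\quad & y_{it} - \hat{\mu}_i - \hat{\eta}_t = \tau D_{it} + u_{it}
\end{aligned}
$$

Here $y_{it}$ is the outcome named by `yname`, $D_{it}$ is the `treatment` indicator, the right-hand side of step 1 is what `first_stage` specifies, and the right-hand side of step 2 is what `second_stage` specifies. Because $\hat{\mu}_i$ and $\hat{\eta}_t$ are themselves estimates, the second-stage standard errors have to account for first-stage uncertainty; `cluster` names the variable used for clustering.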
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| data | pd.DataFrame | The DataFrame containing all variables. | required |
| yname | str | The name of the dependent variable. | required |
| first_stage | str | The formula for the first stage, starting with ‘~’. | required |
| second_stage | str | The formula for the second stage, starting with ‘~’. | required |
| treatment | str | The name of the treatment variable. | required |
| cluster | str | The name of the cluster variable. | required |
Returns
| Type | Description |
|---|---|
| object | A fitted model object of class [Feols](/reference/Feols.qmd). |
Examples
import pandas as pd
import numpy as np
from pyfixest.did.estimation import did2s
url = "https://raw.githubusercontent.com/py-econometrics/pyfixest/master/pyfixest/did/data/df_het.csv"
df_het = pd.read_csv(url)
df_het.head()
| | unit | state | group | unit_fe | g | year | year_fe | treat | rel_year | rel_year_binned | error | te | te_dynamic | dep_var |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 33 | Group 2 | 7.043016 | 2010 | 1990 | 0.066159 | False | -20.0 | -6 | -0.086466 | 0 | 0.0 | 7.022709 |
| 1 | 1 | 33 | Group 2 | 7.043016 | 2010 | 1991 | -0.030980 | False | -19.0 | -6 | 0.766593 | 0 | 0.0 | 7.778628 |
| 2 | 1 | 33 | Group 2 | 7.043016 | 2010 | 1992 | -0.119607 | False | -18.0 | -6 | 1.512968 | 0 | 0.0 | 8.436377 |
| 3 | 1 | 33 | Group 2 | 7.043016 | 2010 | 1993 | 0.126321 | False | -17.0 | -6 | 0.021870 | 0 | 0.0 | 7.191207 |
| 4 | 1 | 33 | Group 2 | 7.043016 | 2010 | 1994 | -0.106921 | False | -16.0 | -6 | -0.017603 | 0 | 0.0 | 6.918492 |
In a first step, we estimate a classical event study model:
# estimate the model
fit = did2s(
    df_het,
    yname="dep_var",
    first_stage="~ 0 | unit + year",
    second_stage="~i(rel_year, ref=-1.0)",
    treatment="treat",
    cluster="state",
)
fit.tidy().head()
| Coefficient | Estimate | Std. Error | t value | Pr(>\|t\|) | 2.5% | 97.5% |
|---|---|---|---|---|---|---|
| C(rel_year, contr.treatment(base=-1.0))[T.-20.0] | -0.058226 | 0.035809 | -1.626011 | 0.103947 | -0.128410 | 0.011959 |
| C(rel_year, contr.treatment(base=-1.0))[T.-19.0] | -0.006032 | 0.030341 | -0.198816 | 0.842407 | -0.065499 | 0.053435 |
| C(rel_year, contr.treatment(base=-1.0))[T.-18.0] | -0.006152 | 0.035094 | -0.175310 | 0.860836 | -0.074935 | 0.062631 |
| C(rel_year, contr.treatment(base=-1.0))[T.-17.0] | -0.012533 | 0.024834 | -0.504689 | 0.613778 | -0.061206 | 0.036140 |
| C(rel_year, contr.treatment(base=-1.0))[T.-16.0] | -0.034698 | 0.029806 | -1.164128 | 0.244372 | -0.093116 | 0.023720 |
We can also inspect the model visually:
fit.iplot(figsize=[1200, 400], coord_flip=False).show()
To estimate a single pooled treatment effect instead of one coefficient per relative year, change the second-stage formula to use the treatment indicator:
fit = did2s(
    df_het,
    yname="dep_var",
    first_stage="~ 0 | unit + year",
    second_stage="~i(treat)",
    treatment="treat",
    cluster="state",
)
fit.tidy().head()
| Coefficient | Estimate | Std. Error | t value | Pr(>\|t\|) | 2.5% | 97.5% |
|---|---|---|---|---|---|---|
| C(treat)[T.True] | 2.230482 | 0.024709 | 90.271437 | 0.0 | 2.182054 | 2.27891 |
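To build intuition for the pooled estimate, the two steps can be traced by hand on a toy panel. The following is a minimal pure-Python sketch and not pyfixest's implementation: the panel, `fit_two_way_fe`, and `did2s_sketch` are hypothetical names made up for illustration, the outcome is noise-free, the second stage has no intercept, and no standard errors are computed.

```python
# Toy panel (hypothetical data): 3 units, 4 years, no noise, true effect 2.0.
# Unit "c" is never treated, so every year has at least one untreated row.
TRUE_EFFECT = 2.0
unit_fe = {"a": 1.0, "b": 3.0, "c": -2.0}
year_fe = {2000: 0.0, 2001: 0.5, 2002: 1.5, 2003: 1.0}
treat_start = {"a": 2002, "b": 2003, "c": None}

rows = []
for u, alpha in unit_fe.items():
    for t, gamma in year_fe.items():
        start = treat_start[u]
        d = int(start is not None and t >= start)
        rows.append({"unit": u, "year": t, "treat": d,
                     "y": alpha + gamma + TRUE_EFFECT * d})


def fit_two_way_fe(obs, n_iter=500):
    """Fit y = unit effect + year effect by alternating group means."""
    a = {r["unit"]: 0.0 for r in obs}
    g = {r["year"]: 0.0 for r in obs}
    for _ in range(n_iter):
        for u in a:
            vals = [r["y"] - g[r["year"]] for r in obs if r["unit"] == u]
            a[u] = sum(vals) / len(vals)
        for t in g:
            vals = [r["y"] - a[r["unit"]] for r in obs if r["year"] == t]
            g[t] = sum(vals) / len(vals)
    return a, g


def did2s_sketch(rows):
    # Step 1: estimate the fixed effects from untreated observations only.
    untreated = [r for r in rows if r["treat"] == 0]
    a, g = fit_two_way_fe(untreated)
    # Step 2: subtract the fitted effects from every observation, then
    # regress the adjusted outcome on the treatment dummy (no intercept).
    resid = [r["y"] - a[r["unit"]] - g[r["year"]] for r in rows]
    d = [r["treat"] for r in rows]
    return sum(e * x for e, x in zip(resid, d)) / sum(x * x for x in d)


effect = did2s_sketch(rows)
print(round(effect, 6))
```

Because the toy outcome is exactly additive in unit and year effects, the first step fits the untreated rows without error and the second step should recover `TRUE_EFFECT` up to numerical tolerance. With real data, `did2s` additionally reports standard errors clustered by the `cluster` variable, which this sketch does not attempt.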