We immediately see that we have staggered adoption of treatment in the second case, which implies that a naive application of two-way fixed effects (TWFE) might yield biased estimates under substantial effect heterogeneity.
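To make the bias concrete, here is a minimal simulation sketch that does not use pyfixest. Everything in it is an assumption chosen for illustration: a balanced panel from 1990 to 2019, two equally sized cohorts adopting in 2000 and 2010, no never-treated group, and a hypothetical treatment effect that grows by one unit per year of exposure. The TWFE coefficient is computed by double demeaning, which matches a regression with unit and year fixed effects only in a balanced panel.

```python
import random
from collections import defaultdict

rng = random.Random(0)
years = range(1990, 2020)                      # assumed sample window
adoption_year = {"early": 2000, "late": 2010}  # two cohorts, no never-treated group
units_per_cohort = 50

# Each row: (unit, year, treat, outcome, true_effect)
rows = []
unit = 0
for start in adoption_year.values():
    for _ in range(units_per_cohort):
        unit_fe = rng.gauss(0, 1)
        for year in years:
            treat = int(year >= start)
            # Hypothetical dynamic effect: grows by one unit per year of exposure.
            effect = float(year - start + 1) if treat else 0.0
            year_fe = 0.05 * (year - 1990)
            outcome = unit_fe + year_fe + effect + rng.gauss(0, 0.1)
            rows.append((unit, year, treat, outcome, effect))
        unit += 1


def double_demean(rows, col):
    """Subtract unit and year means, add back the grand mean (balanced panels only)."""
    by_unit, by_year = defaultdict(list), defaultdict(list)
    for row in rows:
        by_unit[row[0]].append(row[col])
        by_year[row[1]].append(row[col])
    unit_mean = {u: sum(v) / len(v) for u, v in by_unit.items()}
    year_mean = {t: sum(v) / len(v) for t, v in by_year.items()}
    grand_mean = sum(row[col] for row in rows) / len(rows)
    return [row[col] - unit_mean[row[0]] - year_mean[row[1]] + grand_mean for row in rows]


treat_dm = double_demean(rows, 2)
outcome_dm = double_demean(rows, 3)

# Frisch-Waugh-Lovell: the TWFE slope is the OLS slope on the demeaned data.
twfe_estimate = sum(d * y for d, y in zip(treat_dm, outcome_dm)) / sum(d * d for d in treat_dm)

# Benchmark: average true effect over all treated unit-years.
treated_effects = [row[4] for row in rows if row[2] == 1]
true_att = sum(treated_effects) / len(treated_effects)

print(f"TWFE estimate: {twfe_estimate:.2f}")
print(f"Average true effect on the treated: {true_att:.2f}")
```

In this design the TWFE coefficient lands far below the average effect on the treated, because already-treated units from the early cohort serve as controls for the late cohort while their own effects are still growing.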
We can also plot treatment assignment in a disaggregated fashion, which gives us a sense of cohort sizes.
The plot shows that the first cohort is switched into treatment in 2000, while the second cohort is switched into treatment in 2010. Before each cohort is switched into treatment, the trends are parallel.
We can additionally inspect individual units by dropping the collapse_to_cohort argument. Because we have a large sample, we might want to inspect only a subset of units.
pf.panelview(df_multi_cohort, outcome="dep_var", unit="unit", time="year", treat="treat", subsamp=100, title="Outcome Plot")
One-shot adoption: Static and Dynamic Specifications
After taking a first look at the data, let’s turn to estimation. We return to the df_one_cohort data set (without staggered treatment rollout).
Since this is a single-cohort dataset, this estimate is consistent for the ATT under parallel trends. We can estimate heterogeneous effects by time by interacting time with the treated group:
Event study plots like this are very informative, as they allow us to visually inspect the parallel trends assumption and also the dynamic effects of the treatment.
Based on a cursory glance, one would conclude that parallel trends does not hold, because one of the pre-treatment coefficients has a confidence interval that does not include zero. However, we know that parallel trends holds because the treatment is randomly assigned in the underlying DGP.
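The static and dynamic estimates discussed in this section can also be sketched by hand as differences in group means, which may help clarify what the regression coefficients measure. The block below is an illustration under stated assumptions, not the tutorial's own code: it simulates a hypothetical balanced panel (1990 to 2009, adoption in 2000, constant effect of 2.0, half of 400 units treated) rather than using df_one_cohort. In a balanced single-cohort panel like this one, the static difference-in-means coincides with the TWFE coefficient, and the per-year contrasts against the reference year correspond to the interaction coefficients of an event study.

```python
import random
from statistics import mean

rng = random.Random(42)
years = range(1990, 2010)  # assumed sample window
treat_year = 2000          # single adoption year
ref_year = treat_year - 1  # reference period for the event study
true_effect = 2.0          # hypothetical constant treatment effect

# Each row: (year, ever_treated, outcome); balanced panel, half the units treated.
rows = []
for unit in range(400):
    ever_treated = unit % 2 == 0
    unit_fe = rng.gauss(0, 1)
    for year in years:
        effect = true_effect if ever_treated and year >= treat_year else 0.0
        outcome = unit_fe + 0.1 * (year - 1990) + effect + rng.gauss(0, 0.5)
        rows.append((year, ever_treated, outcome))


def group_mean(treated, in_window):
    """Mean outcome of one group over the years selected by `in_window`."""
    return mean(y for year, group, y in rows if group == treated and in_window(year))


def did(in_target, in_baseline):
    """Change for the treated group minus change for the control group."""
    treated_change = group_mean(True, in_target) - group_mean(True, in_baseline)
    control_change = group_mean(False, in_target) - group_mean(False, in_baseline)
    return treated_change - control_change


# Static specification: all post-treatment years against all pre-treatment years.
static_att = did(lambda year: year >= treat_year, lambda year: year < treat_year)

# Dynamic specification: each year against the reference year.
event_study = {
    year - treat_year: did(lambda y, year=year: y == year, lambda y: y == ref_year)
    for year in years
    if year != ref_year
}

print(f"Static ATT estimate: {static_att:.2f}")
for rel_year, coef in sorted(event_study.items()):
    print(f"relative year {rel_year:+d}: {coef:.2f}")
```

Because treatment is unrelated to the trends in this simulation, the pre-treatment contrasts scatter around zero and the post-treatment contrasts scatter around the assumed effect.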
Pointwise vs Simultaneous Inference in Event Studies
The significant pre-treatment coefficient in the event study above is an example of a false positive in testing for pre-trends produced by pointwise inference (where each element of the coefficient vector is tested separately).
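As an alternative, we can use simultaneous confidence bands of the form \([a, b] = ([a_k, b_k])_{k=1}^K\) that cover all \(K\) coefficients \(\beta_k\) jointly with asymptotic probability \(1 - \alpha\):
\[P(\beta_k \in [a_k, b_k] \text{ for all } k = 1, \dots, K) \to 1 - \alpha.\]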
These bands can be constructed with a carefully chosen critical value \(c\), obtained via the multiplier bootstrap, that accounts for the covariance between the coefficient estimates. In pointwise inference, the critical value is \(c = z_{1 - \alpha/2} = 1.96\) for \(\alpha = 0.05\); the corresponding critical value for simultaneous inference is typically larger. These bands are also known as sup-t bands in the literature (see lecture 3 of the NBER SI methods lectures linked above).
Sup-t bands are available through the confint(joint=True) method of models fitted with feols. If we pass the joint='both' argument to iplot, we get the simultaneous confidence bands (for all event study coefficients) in addition to the pointwise confidence intervals. Note that simultaneous inference for all event study coefficients may be overly conservative, especially when the number of coefficients is large; one may instead perform joint inference separately for the pre-treatment coefficients and for the post-treatment coefficients.
The joint confidence bands are wider than the pointwise confidence intervals, and they include zero for all pre-treatment coefficients. This is consistent with the parallel trends assumption.
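To see why the simultaneous critical value exceeds 1.96, the following sketch approximates a sup-t critical value by simulation. It is a simplified stand-in, not the multiplier bootstrap used by pyfixest: it assumes the t-statistics are jointly normal with a known correlation matrix, and the matrix itself (five coefficients with pairwise correlation 0.5) is hypothetical. The critical value is the \(1 - \alpha\) quantile of the largest absolute t-statistic across coefficients.

```python
import math
import random
from statistics import NormalDist


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a positive definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def sup_t_critical_value(corr, alpha=0.05, n_draws=20_000, seed=0):
    """Simulated (1 - alpha) quantile of max_k |t_k| for jointly normal t-statistics."""
    rng = random.Random(seed)
    lower = cholesky(corr)
    k = len(corr)
    max_abs_t = []
    for _ in range(n_draws):
        z = [rng.gauss(0, 1) for _ in range(k)]
        # Correlated standard normal draws: t = L z.
        t = [sum(lower[i][j] * z[j] for j in range(i + 1)) for i in range(k)]
        max_abs_t.append(max(abs(value) for value in t))
    max_abs_t.sort()
    return max_abs_t[math.ceil((1 - alpha) * n_draws) - 1]


# Hypothetical correlation matrix: 5 coefficients, pairwise correlation 0.5.
n_coefs, rho = 5, 0.5
corr = [[1.0 if i == j else rho for j in range(n_coefs)] for i in range(n_coefs)]

c_pointwise = NormalDist().inv_cdf(1 - 0.05 / 2)
c_sup_t = sup_t_critical_value(corr)

print(f"pointwise critical value: {c_pointwise:.2f}")
print(f"simulated sup-t critical value: {c_sup_t:.2f}")
```

Multiplying each standard error by the sup-t value instead of the pointwise value widens every interval by the same factor, which is why the joint bands enclose the pointwise intervals.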
Event Study under Staggered Adoption via feols(), did2s() and lpdid()
We now return to the data set with staggered treatment rollout, df_multi_cohort.
Two-Way Fixed Effects
As a baseline model, we can estimate a simple two-way fixed effects DiD regression via feols():