
DMLDiD

Masa
March 22, 2022


  1. Double/debiased machine learning for DiD with Python
     (repeated outcomes)

     twitter: @asas_mimi

  2. Table of Contents
     1. DMLDiD
     2. Reproducing the paper
     3. New simulation
     4. Future work

  3. Original paper:
     Chang, Neng-Chieh (2020). "Double/debiased machine learning for
     difference-in-differences models." The Econometrics Journal, 23(2), 177–191.
     https://academic.oup.com/ectj/article/23/2/177/5722119#247745047

  4. Data structure: repeated outcomes

     Chang, N. C. (2020). Double/debiased machine learning for difference-in-differences models. The Econometrics Journal, 23(2), 177–191.

     The following data can be observed for each unit:
     - pre-intervention outcome
     - post-intervention outcome
     - treatment-group indicator
     - covariates
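     As a concrete picture, here is a minimal sketch of what such a dataset
     could look like as a pandas DataFrame. The column names are illustrative
     (not from the deck), chosen to match the dmldid_rc signature shown on
     slides 13–14:

     import pandas as pd

     # Illustrative repeated-outcomes layout: one row per unit, with
     # pre/post outcomes, a treatment flag, and covariates.
     df = pd.DataFrame({
         "y0": [10.2, 9.8, 11.1, 10.5],   # pre-intervention outcome
         "y1": [12.0, 10.1, 13.4, 10.9],  # post-intervention outcome
         "D":  [1, 0, 1, 0],              # 1 = treatment group, 0 = control
         "x0": [0.3, -1.2, 0.8, 0.1],     # covariates x0, x1, ...
         "x1": [1.5, 0.4, -0.7, 2.2],
     })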

  5. Assumptions for repeated outcomes

     Chang, N. C. (2020). Double/debiased machine learning for difference-in-differences models. The Econometrics Journal, 23(2), 177–191.

     - Potential outcomes: the counterfactual outcomes if no intervention is
       received.
     - Conditional parallel trend: the raw trends of the treatment and control
       groups may diverge (a violation of the unconditional parallel trend),
       but conditioning on X (plotting the trend within each stratum X = 〇〇)
       makes the two groups comparable.
     - Overlap: the support of the propensity score of the treated group is a
       subset of the support for the untreated. If the propensity-score
       distributions do not overlap, the groups are not comparable. This is
       the same constraint placed on ATT estimation in other propensity-score
       methods.

     [Figure: trend plots with and without conditioning on X, and
     propensity-score histograms illustrating common support.]
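     In symbols (a reconstruction using standard notation, as in Abadie (2005)
     and Chang (2020); Y_t(0) is the untreated potential outcome at time t and
     g(X) = P(D = 1 | X) is the propensity score):

     % Conditional parallel trend: absent treatment, both groups share the
     % same expected trend once we condition on X.
     E[Y_1(0) - Y_0(0) \mid X, D = 1] = E[Y_1(0) - Y_0(0) \mid X, D = 0]

     % Overlap for the ATT: the propensity score of the treated stays away
     % from 1, so every treated unit has comparable untreated units.
     P(D = 1) > 0, \qquad g(X) = P(D = 1 \mid X) < 1 \quad \text{a.s.}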

  6. Previous work: Abadie (2005)

     Abadie, A. (2005). Semiparametric difference-in-differences estimators. Review of Economic Studies, 72, 1–19.

     Abadie weights the simple pre-vs.-post difference ΔY by the propensity
     score. In the example, P(D=1) = 0.5 and units come in four types:
     D=1 or D=0 crossed with ps = 0.9 or ps = 0.1.

     Since we want the ATT, the treatment group is not weighted by its
     propensity score: treated units only carry the factor 1/P(D=1) (= 2 here).
     The untreated are weighted negatively, and more heavily the more
     homogeneous they are with the treated:
     - D=0 & ps = 0.9: factor -9 (homogeneous with the treated)
     - D=0 & ps = 0.1: factor ≈ -0.111 (heterogeneous with the treated)
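     The estimand behind this weighting scheme is Abadie's (2005)
     semiparametric DiD estimator for the ATT, with ΔY = Y_1 - Y_0:

     \tau_{ATT} = E\left[ \frac{\Delta Y}{P(D=1)} \cdot \frac{D - g(X)}{1 - g(X)} \right]

     Plugging in the example: a treated unit gets (1 - g)/(1 - g) = 1 times
     1/P(D=1) = 2, while an untreated unit gets -g/(1 - g) times 1/P(D=1),
     i.e. -0.9/0.1 = -9 when g = 0.9 and -0.1/0.9 ≈ -0.111 when g = 0.1.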

  7. DMLDiD: Chang (2020)

     Chang, N. C. (2020). Double/debiased machine learning for difference-in-differences models. The Econometrics Journal, 23(2), 177–191.

     On top of the nuisance components already present in Abadie (2005), the
     propensity score g(X) and the constant p_0 = P(D=1), Chang (2020) adds
     another ML model: an outcome model for the untreated trend. [Formula
     shown as an image in the original slide; it is reconstructed on slide 10.]

  8. DMLDiD: Chang (2020)

     Chang, N. C. (2020). Double/debiased machine learning for difference-in-differences models. The Econometrics Journal, 23(2), 177–191.

     - Predictive model (supervised learning): the label is the pre/post
       difference (Diff), learned on the control group only.
     - Cross-fitting: separates the samples used for "fitting" and
       "prediction", as in Chernozhukov et al. (2018). A generic sketch of the
       pattern follows.
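     A minimal, generic sketch of the cross-fitting pattern (illustrative
     only; the deck's actual DMLDiD implementation follows on slides 13–14).
     X and y are assumed to be numpy arrays:

     import numpy as np
     from sklearn.model_selection import KFold

     def cross_fit_predict(model, X, y, n_splits=2, random_state=0):
         """Fit `model` on one fold and predict on the other, so no unit's
         nuisance prediction comes from a model trained on that unit."""
         preds = np.empty(len(X), dtype=float)
         kf = KFold(n_splits=n_splits, shuffle=True, random_state=random_state)
         for train_idx, test_idx in kf.split(X):
             model.fit(X[train_idx], y[train_idx])
             preds[test_idx] = model.predict(X[test_idx])
         return preds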

  9. DMLDiD: Chang (2020)

     Chang, N. C. (2020). Double/debiased machine learning for difference-in-differences models. The Econometrics Journal, 23(2), 177–191.

     As in Abadie (2005), the propensity-score weighting is applied to the
     untreated. The propensity scores and P(D) are also calculated by
     cross-fitting.

     The estimator compares the observable increase/decrease (Diff) with the
     counterfactual increase/decrease (Diff): if there were no intervention
     (the counterfactual), the Diff would look like the outcome model's
     prediction.

  10. Score function

      Chang, N. C. (2020). Double/debiased machine learning for difference-in-differences models. The Econometrics Journal, 23(2), 177–191.

      DMLDiD's score function involves the unknown constant p0 = P(D = 1) and
      two infinite-dimensional nuisance parameters: the propensity score and
      the outcome model. The new ingredient relative to Abadie (2005) is the
      outcome model.
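      The formula itself appeared as an image in the original slide; a
      reconstruction in LaTeX, consistent with the estimator implemented in
      the code on slides 13–14, is:

      \psi(W; \theta, \eta) = \frac{D - g(X)}{p_0 \,(1 - g(X))}\,\bigl(Y_1 - Y_0 - \ell(X)\bigr) - \theta

      where the nuisance parameter is \eta = (g, \ell): g(X) = P(D = 1 | X) is
      the propensity score and \ell(X) = E[Y_1 - Y_0 | X, D = 0] is the
      outcome model. Setting E[\psi] = 0 and solving for \theta yields the
      ATT estimator.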

  11. Orthogonality & asymptotic properties

      Chang, N. C. (2020). Double/debiased machine learning for difference-in-differences models. The Econometrics Journal, 23(2), 177–191.

      DMLDiD's score function obeys Neyman orthogonality: the score function
      is invariant to small perturbations of the nuisance parameters g
      (propensity score) and ℓ (outcome model). Together with a consistent
      estimator for the asymptotic variance, DMLDiD can achieve root-N
      consistency.
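      Formally, Neyman orthogonality says the Gateaux derivative of the
      expected score with respect to the nuisance parameters vanishes at the
      truth (a standard statement, written here in the notation of
      Chernozhukov et al. (2018), with \eta = (g, \ell)):

      \left.\frac{\partial}{\partial r}\, E\bigl[\psi\bigl(W; \theta_0,\ \eta_0 + r\,(\eta - \eta_0)\bigr)\bigr]\right|_{r=0} = 0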

  12. Reproducing Chang (2020)

      My notebooks are here:
      https://github.com/MasaAsami/ReproducingDMLDiD

      These implementations were based on the following R package:
      https://github.com/NengChiehChang/Diff-in-Diff

  13. DMLDiD for repeated outcomes (1/2)

      import numpy as np
      from sklearn.linear_model import LassoCV, LogisticRegressionCV
      from sklearn.model_selection import train_test_split


      def dmldid_rc(
          df, y1_col, y0_col, d_col, X_cols,
          ps_model=LogisticRegressionCV(cv=5, random_state=333, penalty="l1", solver="saga"),
          l1k_model=LassoCV(cv=5, random_state=333),
      ) -> float:  # np.float is deprecated; use plain float
          K = 2
          # Cross-fitting: split the sample into two folds; fold c fits the
          # nuisance models and fold k evaluates the score, then roles swap.
          df_set = train_test_split(df, random_state=0, test_size=0.5)
          thetabar = []
          for i in range(K):
              k = 0 if i == 0 else 1
              c = 1 if i == 0 else 0
              # Propensity score g(X), fitted on the complementary fold and
              # clipped away from 0 and 1 for numerical stability.
              ps_model.fit(df_set[c][X_cols], df_set[c][d_col])
              eps = 0.03
              ghat = np.clip(
                  ps_model.predict_proba(df_set[k][X_cols])[:, 1],
                  eps,
                  1 - eps,
              )

  14. DMLDiD for repeated outcomes (2/2)

      def dmldid_rc(....):
          .....
              # Outcome model l(X): fitted on the control group of the
              # complementary fold, with the pre/post difference as the label.
              control_y0 = df_set[c].query(f"{d_col} < 1")[y0_col]
              control_y1 = df_set[c].query(f"{d_col} < 1")[y1_col]
              _y = control_y1 - control_y0
              control_x = df_set[c].query(f"{d_col} < 1")[X_cols]
              l1k_model.fit(control_x, _y)
              l1hat = l1k_model.predict(df_set[k][X_cols])
              p_hat = df_set[c][d_col].mean()
              # Empirical score: Abadie-style weighting of the observed Diff,
              # minus the same weighting applied to the counterfactual Diff
              # predicted by the outcome model.
              _e = (
                  (df_set[k][y1_col] - df_set[k][y0_col])
                  / p_hat
                  * (df_set[k][d_col] - ghat)
                  / (1 - ghat)
                  - (df_set[k][d_col] - ghat) / p_hat / (1 - ghat) * l1hat
              ).mean()
              thetabar.append(_e)
          return np.mean(thetabar)
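      A hypothetical call, assuming the toy column layout sketched after
      slide 4 (the names "y1", "y0", "D", "x0", "x1" are illustrative, not
      from the deck):

      att_hat = dmldid_rc(
          df,
          y1_col="y1",   # post-intervention outcome
          y0_col="y0",   # pre-intervention outcome
          d_col="D",     # treatment indicator
          X_cols=["x0", "x1"],
      )
      print(f"estimated ATT: {att_hat:.3f}")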

  15. Simulation data of Chang (2020)

      Chang, N. C. (2020). Double/debiased machine learning for difference-in-differences models. The Econometrics Journal, 23(2), 177–191.

      The data-generating process in the original paper seems inappropriate
      for testing the accuracy of this model: the conditional parallel trend
      assumption is not well represented, so ordinary DiD would already be
      sufficient.

  16. Reproduction result

      Chang, N. C. (2020). Double/debiased machine learning for difference-in-differences models. The Econometrics Journal, 23(2), 177–191.

      Although we were able to show DMLDiD's superiority over previous
      studies, the data-generating process still does not represent the bias
      well, so simple DiD remains sufficient.

  17. New simulation data

      I prepared the following data and experimented again
      (※ ΔY := Y(1) − Y(0)):

      - Chang (2020): the observed covariates X drive D, ΔY, and Y(0).
      - New data: in addition to X, an unobservable variable drives D and ΔY,
        so the DAG can no longer be handled through X alone.

      [Figure: DAGs for the Chang (2020) DGP and for the new DGP with an
      unobservable variable.]

  18. New simulation data

      `latent_group` is the simulation variable that generates every other
      variable. In the simulation, we assume that it is not possible to
      directly observe which latent group each unit belongs to.

      - Y(0) = Y_2022 (pre-intervention outcome)
      - Y(1) = Y_2023 (post-intervention outcome)
      - X: x0–x99
      - latent_group

  19. No parallel trend

      The parallel trend assumption is clearly not met in the aggregate data,
      so the ATT cannot be recovered by ordinary DiD.

  20. Conditional parallel trend

      The conditional parallel trend assumption is satisfied. However, the
      latent groups are not observable.

  21. Treatment group allocation

      Each latent group is assigned 20 units. The unobservable `latent_group`
      directly affects D:

      # (Code excerpt)
      df["latent_group"] = sg  # 0 ~ 9
      df["latent_ps"] = np.clip(1 - sg / 10, 0.0001, 1 - 0.1)
      df["D"] = df["latent_ps"].apply(lambda x: np.random.binomial(1, x))
      # (Code excerpt)
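      For context, a minimal end-to-end sketch of how such a latent-group DGP
      could be wired up. This is a hypothetical reconstruction, not the deck's
      actual code (see the repository notebooks for that); the covariate,
      trend, and noise choices are assumptions, with the true ATT set to 3 to
      match slide 23:

      import numpy as np
      import pandas as pd

      rng = np.random.default_rng(0)
      sg = np.repeat(np.arange(10), 20)  # 10 latent groups x 20 units each

      df = pd.DataFrame({"latent_group": sg})
      df["latent_ps"] = np.clip(1 - sg / 10, 0.0001, 1 - 0.1)  # as in the excerpt
      df["D"] = rng.binomial(1, df["latent_ps"].to_numpy())

      # Hypothetical: covariates x0..x99 are noisy signals of the latent group.
      for j in range(100):
          df[f"x{j}"] = sg + rng.normal(0, 1, len(df))

      # Hypothetical: a group-specific trend breaks the unconditional parallel
      # trend, while the true effect on the treated is 3 (cf. slide 23).
      df["Y_2022"] = sg + rng.normal(10, 1, len(df))
      df["Y_2023"] = df["Y_2022"] + 2 * sg + 3 * df["D"] + rng.normal(0, 1, len(df))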

  22. Failure to meet the backdoor criterion with PS

      The `latent_group` variable is provided to make estimation difficult for
      PS-based models such as Abadie (2005): `latent_group` directly affects D
      but is unobservable, so conditioning on the propensity score fails to
      close all backdoor paths; a backdoor path through `latent_group` may
      still remain. I tested whether DMLDiD can estimate the ATT without bias
      under such a DAG.

      [Figure: DAGs contrasting the PS-based approach, which blocks the
      backdoor through X, with the remaining backdoor path through
      `latent_group`.]

  23. Simulation result

      True ATT = 3

      [Figure: simulation results.]

  24. Future work:

      Chang (2020) also devised DMLDiD for repeated cross-section data. I will
      try to reproduce it next time. (There seem to be a few errors in the
      original demonstration.)

      Also, DMLDiD seems to be very versatile. I am currently developing a
      Python package.

  25. Thank you

  26. References:

      [1] Chang, Neng-Chieh (2020). "Double/debiased machine learning for
      difference-in-differences models." The Econometrics Journal, 23(2),
      177–191.

      [2] Abadie, A. (2005). "Semiparametric difference-in-differences
      estimators." Review of Economic Studies, 72, 1–19.

      [3] Chernozhukov, V., D. Chetverikov, M. Demirer, E. Duflo, C. Hansen,
      W. Newey, and J. Robins (2018). "Double/debiased machine learning for
      treatment and structural parameters." Econometrics Journal, 21, C1–C68.

      [4] (slides) 加藤真大 (2021). 「DMLによる差分の差推定」
      ["Difference-in-differences estimation via DML"].
      https://speakerdeck.com/masakat0/dmlniyoruchai-fen-falsechai-tui-ding