
DMLDiD

Masa
March 22, 2022

Transcript

  1. Original paper: Chang, Neng-Chieh (2020). "Double/debiased machine learning for difference-in-differences models." The Econometrics Journal, 23(2), 177–191. https://academic.oup.com/ectj/article/23/2/177/5722119#247745047
  2. Data structure: repeated outcomes
     Chang, N. C. (2020). Double/debiased machine learning for difference-in-differences models. The Econometrics Journal, 23(2), 177–191.
     The following data can be observed: pre-intervention outcomes, post-intervention outcomes, a treatment indicator (treated or not), and covariates.
  3. Assumptions for repeated outcomes
     Chang, N. C. (2020). Double/debiased machine learning for difference-in-differences models. The Econometrics Journal, 23(2), 177–191.
     Two assumptions are required. First, overlap: the support of the propensity score of the treated group must be a subset of the support for the untreated, so that every treated unit has comparable untreated units; this is the same constraint placed on ATT estimation in other propensity-score methods. Second, conditional parallel trend: conditional on X, the potential outcomes without intervention (the counterfactual outcomes if no intervention is received) trend in parallel for the treatment and control groups. [Figures: trend plots showing a parallel-trend violation in the aggregate that disappears once we condition on X, and propensity-score distributions in which the common support of the treated is a subset of the untreated.]
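     In my notation (a sketch; see the paper for the exact statements), the two assumptions read:

     P(D = 1 \mid X) \le 1 - \varepsilon \ \text{for some}\ \varepsilon > 0, \quad p_0 = P(D = 1) > 0    (overlap / common support)
     E[Y_1(0) - Y_0(0) \mid X, D = 1] = E[Y_1(0) - Y_0(0) \mid X, D = 0]    (conditional parallel trend)

     where Y_t(0) denotes the potential outcome at time t if no intervention is received.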
  4. Previous work: Abadie (2005)
     Abadie, A. (2005). Semiparametric difference-in-differences estimators. Review of Economic Studies, 72, 1–19.
     Abadie's estimator reweights the simple pre-vs.-post difference ΔY by the propensity score. Since we want the ATT, the treatment group is not weighted by the propensity score: in the example below with P(D = 1) = 0.5, treated units only receive the factor 2, the inverse of P(D = 1). The untreated units are weighted negatively, and more heavily the more similar they are to the treated: an untreated unit with ps = 0.9 (homogeneous with the treated) gets weight −9, while one with ps = 0.1 (heterogeneous with the treated) gets only −0.111.
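     For reference, Abadie (2005)'s estimand can be written as follows (my transcription), with g(X) = P(D = 1 | X):

     \theta_{ATT} = E\left[ \frac{Y_1 - Y_0}{P(D = 1)} \cdot \frac{D - g(X)}{1 - g(X)} \right]

     For a treated unit, (D − g)/(1 − g) = 1, leaving only the factor 1/P(D = 1) = 2; for an untreated unit, it equals −g/(1 − g), i.e. −0.9/0.1 = −9 at ps = 0.9 and −0.1/0.9 ≈ −0.111 at ps = 0.1, matching the numbers above.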
  5. DMLDiD: Chang (2020)
     Chang, N. C. (2020). Double/debiased machine learning for difference-in-differences models. The Econometrics Journal, 23(2), 177–191.
     In addition to the propensity score g(X) and the constant p = P(D = 1), another ML model is added: an outcome model, also estimated by machine learning.

  6. DMLDiD: Chang (2020)
     Chang, N. C. (2020). Double/debiased machine learning for difference-in-differences models. The Econometrics Journal, 23(2), 177–191.
     The added model is a predictive model (supervised learning) whose label is the diff ΔY, learned with the control group only. Cross-fitting separates the samples used for "fitting" and for "prediction", as in Chernozhukov et al. (2018).
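     A minimal sketch of this step on synthetic data (the data and variable names are mine): fit the outcome model on the control group of one fold, then predict the counterfactual diff on the other fold.

     import numpy as np
     import pandas as pd
     from sklearn.linear_model import LassoCV
     from sklearn.model_selection import train_test_split

     rng = np.random.default_rng(0)
     n = 200
     df = pd.DataFrame({"x0": rng.normal(size=n), "D": rng.integers(0, 2, size=n)})
     df["Y0"] = df["x0"] + rng.normal(size=n)
     df["Y1"] = df["Y0"] + 1.0 + rng.normal(size=n)

     # Cross-fitting: fit the nuisance on one fold, predict on the other
     fold_a, fold_b = train_test_split(df, test_size=0.5, random_state=0)
     ctrl = fold_a[fold_a["D"] == 0]                   # learn with the control group only
     outcome_model = LassoCV(cv=5).fit(ctrl[["x0"]], ctrl["Y1"] - ctrl["Y0"])  # label = diff
     l_hat = outcome_model.predict(fold_b[["x0"]])     # predictions on the held-out fold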

  7. DMLDiD: Chang (2020)
     Chang, N. C. (2020). Double/debiased machine learning for difference-in-differences models. The Econometrics Journal, 23(2), 177–191.
     As in Abadie (2005), the propensity-score weights are applied to the untreated. The propensity scores and P(D = 1) are also calculated by cross-fitting. The estimator contrasts the observable increase/decrease (diff) with the counterfactual increase/decrease (diff): if there had been no intervention, the diff would look like the outcome model's prediction.
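     In other words (my notation), with ΔY := Y_1 − Y_0 and ℓ(X) := E[ΔY | X, D = 0] the counterfactual diff, the target is

     \theta_{ATT} = E[\,\Delta Y - \ell(X) \mid D = 1\,]

     i.e. the observable diff of the treated minus the diff they would have had without the intervention.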
  8. Score function
     Chang, N. C. (2020). Double/debiased machine learning for difference-in-differences models. The Econometrics Journal, 23(2), 177–191.
     What is new in DMLDiD's score function is that it involves the unknown constant p_0 = P(D = 1) in addition to the infinite-dimensional nuisance parameters: the propensity score and the outcome model.
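     Combining the terms as they appear in the implementation later in this deck (a transcription consistent with the code, not necessarily the paper's exact display), the score is

     \psi(W; \theta, g, \ell, p) = \frac{(\Delta Y - \ell(X))\,(D - g(X))}{p\,(1 - g(X))} - \theta

     with nuisance parameters g (propensity score) and ℓ (outcome model), and the constant p_0 = P(D = 1).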

  9. Orthogonality & asymptotic properties
     Chang, N. C. (2020). Double/debiased machine learning for difference-in-differences models. The Econometrics Journal, 23(2), 177–191.
     DMLDiD's score function obeys Neyman orthogonality: the score is invariant to small perturbations of the nuisance parameters g (propensity score) and ℓ (outcome model). Together with a consistent estimator of the asymptotic variance, this lets DMLDiD achieve root-N consistency.
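     Formally (a sketch in my notation), Neyman orthogonality is the vanishing Gateaux derivative of the score with respect to the nuisances η = (g, ℓ):

     \partial_r \, E[\psi(W; \theta_0, \eta_0 + r(\eta - \eta_0))] \big|_{r=0} = 0

     and under the regularity conditions in Chang (2020), \sqrt{N}(\hat\theta - \theta_0) \to N(0, \sigma^2) with a consistent estimator of \sigma^2 available.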
  10. Reproducing Chang (2020)
      My notebooks are here: https://github.com/MasaAsami/ReproducingDMLDiD
      These implementations were based on the following R package: https://github.com/NengChiehChang/Diff-in-Diff
  11. DMLDiD for repeated outcomes: cross-fitting the propensity score

      import numpy as np
      from sklearn.linear_model import LassoCV, LogisticRegressionCV
      from sklearn.model_selection import train_test_split

      def dmldid_rc(
          df, y1_col, y0_col, d_col, X_cols,
          ps_model=LogisticRegressionCV(cv=5, random_state=333, penalty="l1", solver="saga"),
          l1k_model=LassoCV(cv=5, random_state=333),
      ) -> float:
          K = 2
          # Cross-fitting: split the sample into two folds
          df_set = train_test_split(df, random_state=0, test_size=0.5)
          thetabar = []
          for i in range(K):
              k = 0 if i == 0 else 1  # fold used for evaluation
              c = 1 if i == 0 else 0  # fold used for fitting the nuisances
              # Propensity score: fit on fold c, predict on fold k,
              # clipped away from 0 and 1 for numerical stability
              ps_model.fit(df_set[c][X_cols], df_set[c][d_col])
              eps = 0.03
              ghat = np.clip(
                  ps_model.predict_proba(df_set[k][X_cols])[:, 1],
                  eps,
                  1 - eps,
              )
  12. DMLDiD for repeated outcomes: the outcome model

      def dmldid_rc(...):  # continued from the previous slide
          ...
              # Outcome model: label = diff (Y1 - Y0), learned with the control group only
              control_y0 = df_set[c].query(f"{d_col} < 1")[y0_col]
              control_y1 = df_set[c].query(f"{d_col} < 1")[y1_col]
              _y = control_y1 - control_y0
              control_x = df_set[c].query(f"{d_col} < 1")[X_cols]
              l1k_model.fit(control_x, _y)
              l1hat = l1k_model.predict(df_set[k][X_cols])
              # P(D = 1), also estimated on the other fold
              p_hat = df_set[c][d_col].mean()
              # Orthogonal score evaluated on fold k
              _e = (
                  (df_set[k][y1_col] - df_set[k][y0_col]) / p_hat
                  * (df_set[k][d_col] - ghat) / (1 - ghat)
                  - (df_set[k][d_col] - ghat) / p_hat / (1 - ghat) * l1hat
              ).mean()
              thetabar.append(_e)
          return np.mean(thetabar)  # average over the two folds
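      For illustration, a hypothetical call on synthetic data (the DGP and all names below are my own; with a constant treatment effect of 2.0, the estimate should land near 2):

      import numpy as np
      import pandas as pd

      rng = np.random.default_rng(42)
      n = 2000
      X = pd.DataFrame(rng.normal(size=(n, 5)), columns=[f"x{i}" for i in range(5)])
      ps = 1 / (1 + np.exp(-X["x0"]))                       # true propensity score
      df = X.assign(D=rng.binomial(1, ps), Y0=X["x0"] + rng.normal(size=n))
      df["Y1"] = df["Y0"] + 0.5 * X["x1"] + 2.0 * df["D"] + rng.normal(size=n)

      att = dmldid_rc(df, "Y1", "Y0", "D", [f"x{i}" for i in range(5)])
      print(att)  # expected to be close to 2.0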
  13. Simulation data of Chang (2020)
      Chang, N. C. (2020). Double/debiased machine learning for difference-in-differences models. The Econometrics Journal, 23(2), 177–191.
      The data-generating process in the original paper seems inappropriate for testing the accuracy of this model: the conditional parallel-trend assumption is not well represented, so ordinary DiD would already be sufficient.
  14. Reproduction result
      Chang, N. C. (2020). Double/debiased machine learning for difference-in-differences models. The Econometrics Journal, 23(2), 177–191.
      Although we were able to show DMLDiD's superiority over previous estimators, the data-generating process still does not represent the bias well, so simple DiD remains sufficient on these data.
  15. New simulation data
      I prepared the following data and experimented again. ※ ΔY := Y(1) − Y(0). [DAG comparison: in Chang (2020)'s DGP, X drives D and ΔY; in the new data, an additional unobservable variable also drives X, D, and ΔY.]
  16. New simulation data
      "Latent group" is the simulation variable that generates each of the other variables. In the simulation, we assume that it is not possible to directly observe which latent group each unit belongs to. Here Y(0) = Y_2022, Y(1) = Y_2023, the covariates X are x0–x99, and latent_group is the hidden driver.
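      A hypothetical sketch of this setup (the actual generating code lives in the notebook linked above):

      import numpy as np
      import pandas as pd

      rng = np.random.default_rng(0)
      n_groups, n_per_group = 10, 20
      sg = np.repeat(np.arange(n_groups), n_per_group)  # latent group id, 0..9
      df = pd.DataFrame({"latent_group": sg})
      # latent_group drives the covariates x0..x99, the treatment assignment D,
      # and the Y_2022 -> Y_2023 trend, but is treated as unobservable
      # when the estimators are run.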

  17. No parallel trend
      The parallel-trend assumption is clearly not met in the aggregate data, so the ATT cannot be estimated by ordinary DiD.
  18. Treatment group allocation
      Each latent group is designed to contain 20 units, and latent_group directly affects D while remaining unobservable:

      # (Code excerpt)
      df["latent_group"] = sg  # group id, 0 ~ 9
      df["latent_ps"] = np.clip(1 - sg / 10, 0.0001, 1 - 0.1)  # treatment probability falls with group id
      df["D"] = df["latent_ps"].apply(lambda x: np.random.binomial(1, x))

  19. Failure to meet the backdoor criterion with PS
      The latent_group variable is provided to make estimation difficult for PS-based models such as Abadie (2005); I tested whether DMLDiD can estimate the ATT without bias in such a DAG. Because latent_group directly affects D but is unobservable, conditioning on the propensity score fails to close all backdoor paths: a backdoor path through latent_group may still remain. [DAGs: the PS-based approach blocks the backdoor through X, but not the path through the unobservable latent group.]
  20. Next works
      Chang (2020) also devised a DMLDiD estimator for repeated cross-section data; I will try to reproduce it in a follow-up. (There seem to be a few errors in the original demonstration.) DMLDiD also seems to be very versatile, and I am currently developing a Python package.
  21. References
      [1] Chang, N.-C. (2020). Double/debiased machine learning for difference-in-differences models. The Econometrics Journal, 23(2), 177–191.
      [2] Abadie, A. (2005). Semiparametric difference-in-differences estimators. Review of Economic Studies, 72, 1–19.
      [3] Chernozhukov, V., D. Chetverikov, M. Demirer, E. Duflo, C. Hansen, W. Newey, and J. Robins (2018). Double/debiased machine learning for treatment and structural parameters. Econometrics Journal, 21, C1–C68.
      [4] 加藤真大 (2021). "DMLによる差分の差推定" [Difference-in-Differences Estimation with DML] (slides). https://speakerdeck.com/masakat0/dmlniyoruchai-fen-falsechai-tui-ding