MasaKat0
August 02, 2023

Synthetic Control Methods through Predictive Synthesis

Presentation slides at EcoSta 2023.

Transcript

  1. Synthetic Control Methods through Predictive Synthesis

    Masahiro Kato (The University of Tokyo)
    Coauthors: Akira Fukuda, Kosaku Takanashi, Kenichiro McAlinn, Akari Ohda, Masaaki Imaizumi
    Paper 1: Synthetic Control Methods by Density Matching under Implicit Endogeneity (https://arxiv.org/abs/2307.11127)
    Paper 2: Bayesian Predictive Synthetic Control Methods (https://drive.google.com/file/d/1veWTQTuWTx2gAMyh7VSZnenxsVqs1nla/view)
    Speaker Deck: https://speakerdeck.com/masakat0/synthetic-control-methods-through-predictive-synthesis?slide=25
  2. Synthetic Control Methods

    • Synthetic control methods (SCMs; Abadie and Gardeazabal, 2003).
    • Core idea:
      • There are several units. One unit among them receives a policy intervention (the treated unit).
      • The policy intervention affects the outcomes of the treated unit.
      • We cannot observe the outcomes the treated unit would have had without the policy intervention.
      • Estimate the counterfactual outcomes of the treated unit by a weighted sum of the observed outcomes of the untreated units.
      • Then, using the estimated counterfactual outcomes, estimate the causal effect on the treated unit.
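
The core idea above can be sketched numerically. This is a minimal illustration rather than the authors' code: the function name `synthetic_control_effect`, the array layout, and the toy numbers are assumptions, and the weights are taken as given instead of being estimated.

```python
import numpy as np

def synthetic_control_effect(y_treated, Y_untreated, weights, n_pre):
    """Return the synthetic control path and post-intervention effect estimates.

    y_treated:   (T,) observed outcomes of the treated unit
    Y_untreated: (T, J) observed outcomes of the untreated units
    weights:     (J,) weights on the untreated units (assumed given here)
    n_pre:       number of pre-intervention periods
    """
    y_treated = np.asarray(y_treated, dtype=float)
    Y_untreated = np.asarray(Y_untreated, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Weighted sum of untreated outcomes = estimated counterfactual path.
    counterfactual = Y_untreated @ weights
    # Effect estimate = observed minus counterfactual, post-intervention only.
    effect = y_treated[n_pre:] - counterfactual[n_pre:]
    return counterfactual, effect

# Toy data: 3 periods, 2 untreated units, intervention after period 2.
Y_untreated = [[1.0, 3.0], [2.0, 4.0], [3.0, 5.0]]
y_treated = [2.0, 3.0, 6.0]
counterfactual, effect = synthetic_control_effect(
    y_treated, Y_untreated, [0.5, 0.5], n_pre=2
)
```

In this toy example the synthetic control reproduces the treated unit exactly before the intervention, so the whole post-intervention gap is attributed to the intervention.
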
  3. Problem Setting

    • $J + 1$ units, $j \in \mathcal{J} := \{0, 1, 2, \dots, J\}$.
      • $j = 0$: treated unit (the unit affected by the policy intervention).
      • $j \in \mathcal{J}^U := \mathcal{J} \setminus \{0\}$: untreated units.
    • $T$ periods, $t \in \mathcal{T} := \{1, 2, \dots, T\}$.
      • The intervention occurs at $t = T_0 < T$.
      • $t \in \mathcal{T}_0 := \{1, 2, \dots, T_0\}$: before the intervention.
      • $t \in \mathcal{T}_1 := \mathcal{T} \setminus \mathcal{T}_0$: after the intervention ($T_1 := |\mathcal{T}_1| = T - T_0$).
  4. Problem Setting

    • Potential outcomes (Neyman, 1923; Rubin, 1974):
      • For each unit $j \in \mathcal{J}$ and period $t \in \mathcal{T}$, define potential outcomes $(Y^I_{j,t}, Y^N_{j,t}) \in \mathbb{R}^2$.
      • $Y^I_{j,t}$ and $Y^N_{j,t}$ are the potential outcomes with and without the intervention.
      • $\mathbb{E}_{j,t}$: expectation over $Y^I_{j,t}$ and $Y^N_{j,t}$.
    • Observations:
      • We observe one of the outcomes, $Y_{j,t} \in \mathbb{R}$, corresponding to the actual intervention; that is,
        $$Y_{0,t} = \begin{cases} Y^I_{0,t} & \text{if } t \in \mathcal{T}_1, \\ Y^N_{0,t} & \text{if } t \in \mathcal{T}_0, \end{cases} \qquad Y_{j,t} = Y^N_{j,t} \ \text{ for } j \in \mathcal{J}^U.$$
  5. Problem Setting

    • Causal effects: $\tau_{0,t} := \mathbb{E}_{0,t}\big[Y^I_{0,t} - Y^N_{0,t}\big]$ for $t \in \mathcal{T}_1$.
      • We estimate the causal effect by predicting $Y^N_{0,t}$ for $t \in \mathcal{T}_1$.
    • Core idea:
      • Predict $Y^N_{0,t}$ by a weighted sum of $Y^N_{1,t}, \dots, Y^N_{J,t}$:
        $$\hat{Y}^N_{0,t} = \sum_{j \in \mathcal{J}^U} w_j Y^N_{j,t}.$$
      • $\hat{Y}^N_{0,t}$ is a counterfactual trend of the treated unit.
      • $\hat{Y}^N_{0,t}$ is called a synthetic control unit.
  6. Contents

    • The research questions mainly concern the estimation of the weights $w_1, \dots, w_J$.
    • Paper 1: Synthetic Control Methods by Density Matching under Implicit Endogeneity.
      • Estimators in existing SCMs are not consistent (Ferman and Pinto, 2021).
      • We discuss the inconsistency problem from the viewpoint of endogeneity.
      • We propose frequentist SCMs based on the generalized method of moments (GMM).
    • Paper 2: Bayesian Predictive Synthetic Control Methods.
      • Apply Bayesian predictive synthesis to SCMs.
      • Flexible modeling with time-varying parameters, finite-sample analysis, and minimax optimality.
  7. Least-Squares Estimator

    • In standard SCMs, we usually estimate the weights by constrained least squares.
      • That is, we estimate $w_j$ as
        $$\big(\hat{w}^{\mathrm{LS}}_j\big)_{j \in \mathcal{J}^U} = \arg\min_{(w_j)_{j \in \mathcal{J}^U}} \frac{1}{T_0} \sum_{t \in \mathcal{T}_0} \Big( Y^N_{0,t} - \sum_{j \in \mathcal{J}^U} w_j Y^N_{j,t} \Big)^2 \quad \text{such that} \quad \sum_{j \in \mathcal{J}^U} w_j = 1, \ \ w_j \ge 0 \ \ \forall j \in \mathcal{J}^U.$$
    • To justify the least squares (LS) estimator, we assume linearity in the expected outcomes:
      $$\mathbb{E}\big[Y^N_{0,t}\big] = \sum_{j \in \mathcal{J}^U} w^*_j \, \mathbb{E}\big[Y^N_{j,t}\big].$$
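
The constrained least-squares problem above can be solved in several ways. The sketch below uses projected gradient descent onto the probability simplex, which is one possible solver and not necessarily the one used in the papers. The names `project_to_simplex` and `ls_weights`, the iteration count, and the noise-free toy data are assumptions made for illustration.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]                      # sort in descending order
    css = np.cumsum(u)
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u * k > css - 1.0)[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1)
    return np.maximum(v - theta, 0.0)

def ls_weights(y0_pre, Y_pre, n_iter=2000):
    """Minimize (1/T0) * ||y0_pre - Y_pre @ w||^2 over the simplex."""
    T0, J = Y_pre.shape
    w = np.full(J, 1.0 / J)                   # start from uniform weights
    # Step size 1/L, where L = (2/T0) * sigma_max(Y_pre)^2 bounds the curvature.
    step = T0 / (2.0 * np.linalg.norm(Y_pre, 2) ** 2)
    for _ in range(n_iter):
        grad = -(2.0 / T0) * Y_pre.T @ (y0_pre - Y_pre @ w)
        w = project_to_simplex(w - step * grad)
    return w

# Toy check: noise-free data generated from known weights.
rng = np.random.default_rng(0)
Y_pre = rng.normal(size=(40, 3))              # 40 pre-intervention periods, 3 untreated units
w_true = np.array([0.2, 0.5, 0.3])
y0_pre = Y_pre @ w_true
w_ls = ls_weights(y0_pre, Y_pre)
```

With noise-free data the solver recovers `w_true`. The inconsistency discussed in the slides concerns the noisy case, where the untreated outcomes themselves contain error.
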
  8. Inconsistency of the LS Estimator

    • Ferman and Pinto (2021) show that the LS estimator is inconsistent; that is, $\hat{w}^{\mathrm{LS}}_j \xrightarrow{p} \tilde{w}_j \ne w^*_j$.
      • They propose another LS-based estimator that reduces the bias.
      • However, that estimator is still biased.
    • Their results imply that the LS estimator is incompatible with SCMs under the linearity assumption, $\mathbb{E}\big[Y^N_{0,t}\big] = \sum_{j \in \mathcal{J}^U} w^*_j \, \mathbb{E}\big[Y^N_{j,t}\big]$.
  9. Implicit Endogeneity

    • We investigate this problem from the viewpoint of endogeneity.
      • Let $Y^N_{j,t} = \mathbb{E}_{j,t}\big[Y^N_{j,t}\big] + \varepsilon_{j,t}$.
      • Under $\mathbb{E}\big[Y^N_{0,t}\big] = \sum_{j \in \mathcal{J}^U} w^*_j \, \mathbb{E}\big[Y^N_{j,t}\big]$, it holds that
        $$Y^N_{0,t} = \sum_{j \in \mathcal{J}^U} w^*_j Y^N_{j,t} - \sum_{j \in \mathcal{J}^U} w^*_j \varepsilon_{j,t} + \varepsilon_{0,t} = \sum_{j \in \mathcal{J}^U} w^*_j Y^N_{j,t} + v_t.$$
    • Implicit endogeneity (measurement error bias): correlation between $Y^N_{j,t}$ and $v_t$.
      • There is an (implicit) endogeneity between the explanatory variables and the error term.
      • This is one reason why the LS estimator $\hat{w}^{\mathrm{LS}}_j$ is biased; that is, $\hat{w}^{\mathrm{LS}}_j \xrightarrow{p} \tilde{w}_j \ne w^*_j$.
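
To make the correlation explicit, here is a short worked computation under two simplifying assumptions that the slide does not state: the errors $\varepsilon_{j,t}$ have mean zero and are mutually uncorrelated across units, and each has variance $\sigma_j^2$ (a symbol introduced only for this illustration). The expected outcomes are treated as nonrandom.

```latex
\begin{align*}
v_t &= \varepsilon_{0,t} - \sum_{k \in \mathcal{J}^U} w^*_k \varepsilon_{k,t}, \\
\operatorname{Cov}\big(Y^N_{j,t}, v_t\big)
  &= \operatorname{Cov}\Big(\varepsilon_{j,t},\ \varepsilon_{0,t} - \sum_{k \in \mathcal{J}^U} w^*_k \varepsilon_{k,t}\Big)
   = - w^*_j \sigma_j^2
  \qquad \text{for } j \in \mathcal{J}^U .
\end{align*}
```

Under these assumptions the covariance is nonzero whenever $w^*_j > 0$ and $\sigma_j^2 > 0$, so the regressor $Y^N_{j,t}$ is correlated with the error term $v_t$, which is the errors-in-variables pattern that biases least squares.
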
  10. Mixture Models

    • The implicit endogeneity implies that the LS estimator is incompatible with SCMs.
    • Consider another estimation strategy:
      • Assume mixture models and estimate the weights by the GMM.
      • $p_{j,t}(y)$: density of $Y^N_{j,t}$.
      • Mixture model between $p_{0,t}(y)$ and $\{p_{j,t}(y)\}_{j \in \mathcal{J}^U}$:
        $$p_{0,t}(y) = \sum_{j \in \mathcal{J}^U} w^*_j \, p_{j,t}(y).$$
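
As a brief worked step, the mixture model implies the linearity in expected outcomes used to motivate the LS estimator. The computation below assumes only that the first moments exist, so that the sum and the integral can be exchanged.

```latex
\begin{align*}
\mathbb{E}\big[Y^N_{0,t}\big]
  = \int y \, p_{0,t}(y) \, \mathrm{d}y
  = \int y \sum_{j \in \mathcal{J}^U} w^*_j \, p_{j,t}(y) \, \mathrm{d}y
  = \sum_{j \in \mathcal{J}^U} w^*_j \int y \, p_{j,t}(y) \, \mathrm{d}y
  = \sum_{j \in \mathcal{J}^U} w^*_j \, \mathbb{E}\big[Y^N_{j,t}\big].
\end{align*}
```

The converse does not hold in general: two densities can share a mean without one being a mixture of the others, so the mixture model is the stronger of the two assumptions.
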
  11. Fine-Grained Models

    • Assuming mixture models is stronger than assuming $\mathbb{E}\big[Y^N_{0,t}\big] = \sum_{j \in \mathcal{J}^U} w^*_j \, \mathbb{E}\big[Y^N_{j,t}\big]$.
    • Mixture models can be justified from the viewpoint of fine-grained models (Shi et al., 2021).
      • Linear factor models are usually assumed in SCMs:
        $$Y^N_{j,t} = c_j + \delta_t + \lambda_t \mu_j + \varepsilon_{j,t}, \qquad Y^I_{j,t} = \tau_{0,t} + Y^N_{j,t}.$$
      • Shi et al. (2021) find that mixture models imply factor models under some assumptions.
  12. Fine-Grained Models

    • Fine-grained models (Shi et al., 2021):
      • Assume that $Y^N_{j,t}$ represents a group-level outcome.
      • In each unit $j$, there are unobserved small units $Y^N_{j,t,1}, Y^N_{j,t,2}, \dots$; that is, each unit contains unobserved units that constitute $Y^N_{j,t}$.
    • Under some assumptions,
      • each $p_{j,t}(y)$ can be linked to the linear factor model, and
      • $p_{0,t}(y) = \sum_{j \in \mathcal{J}^U} w^*_j \, p_{j,t}(y)$ holds.

    [Figure: group-level outcomes $Y^N_{0,t}$, $Y^N_{1,t}$, $Y^N_{2,t}$, each shown with its unobserved fine-grained units.]
  13. Moment Conditions

    • Under the mixture models, the following moment conditions hold:
      $$\mathbb{E}_{0,t}\big[(Y^N_{0,t})^{\gamma}\big] = \sum_{j \in \mathcal{J}^U} w^*_j \, \mathbb{E}_{j,t}\big[(Y^N_{j,t})^{\gamma}\big] \qquad \forall \gamma \in \mathbb{R}^+.$$
    • Empirical approximation of $\mathbb{E}_{0,t}\big[(Y^N_{0,t})^{\gamma}\big] - \sum_{j \in \mathcal{J}^U} w^*_j \, \mathbb{E}_{j,t}\big[(Y^N_{j,t})^{\gamma}\big]$:
      $$\hat{m}_{\gamma}(w) := \frac{1}{T_0} \sum_{t \in \mathcal{T}_0} \Big( (Y^N_{0,t})^{\gamma} - \sum_{j \in \mathcal{J}^U} w_j (Y^N_{j,t})^{\gamma} \Big).$$
      • We estimate $w$ so that $\hat{m}_{\gamma}(w) \approx 0$.
  14. GMM

    • A set of positive values $\Gamma := \{1, 2, 3, \dots, G\}$, e.g., $\Gamma = \{1, 2, 3, 4, 5\}$.
    • Estimate $w^*_j$ as
      $$\big(\hat{w}^{\mathrm{GMM}}_j\big)_{j \in \mathcal{J}^U} := \arg\min_{(w_j):\ \sum_{j \in \mathcal{J}^U} w_j = 1} \ \sum_{\gamma \in \Gamma} \big(\hat{m}_{\gamma}(w)\big)^2.$$
      • We can weight each empirical moment condition; that is, by using some weight $v_{\gamma} \in \mathbb{R}^+$,
        $$\big(\hat{w}^{\mathrm{GMM}}_j\big)_{j \in \mathcal{J}^U} := \arg\min_{(w_j):\ \sum_{j \in \mathcal{J}^U} w_j = 1} \ \sum_{\gamma \in \Gamma} v_{\gamma} \big(\hat{m}_{\gamma}(w)\big)^2.$$
    • We can show that the GMM estimator is asymptotically unbiased; that is, $\hat{w}^{\mathrm{GMM}}_j \xrightarrow{p} w^*_j$.
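
Because each empirical moment is linear in the weights, the GMM objective above is a linear least-squares problem with a sum-to-one constraint. The sketch below solves it by parameterizing the constraint away. It is an illustration under assumptions, not the authors' implementation: the names `empirical_moments` and `gmm_weights` are hypothetical, only the sum-to-one constraint shown on the slide is imposed, and integer powers are used so that negative outcomes cause no problems.

```python
import numpy as np

def empirical_moments(y0_pre, Y_pre, gammas):
    """Return a (G,) and B (G, J) such that m_hat(w) = a - B @ w."""
    a = np.array([np.mean(y0_pre ** g) for g in gammas])
    B = np.array([np.mean(Y_pre ** g, axis=0) for g in gammas])
    return a, B

def gmm_weights(y0_pre, Y_pre, gammas, v=None):
    """Minimize sum_g v_g * m_hat_g(w)^2 subject to sum(w) = 1 (requires J >= 2)."""
    a, B = empirical_moments(y0_pre, Y_pre, gammas)
    G, J = B.shape
    v = np.ones(G) if v is None else np.asarray(v, dtype=float)
    s = np.sqrt(v)
    w_start = np.full(J, 1.0 / J)             # one point satisfying sum(w) = 1
    # Columns e_i - e_{i+1} span the directions that keep the sum unchanged.
    N = np.eye(J)[:, :-1] - np.eye(J)[:, 1:]
    # Ordinary least squares in the free coordinates z, where w = w_start + N @ z.
    z, *_ = np.linalg.lstsq(s[:, None] * (B @ N), s * (a - B @ w_start), rcond=None)
    return w_start + N @ z

# Toy check: the treated series equals untreated unit 2, so w* = (0, 1, 0).
rng = np.random.default_rng(0)
Y_pre = rng.normal(loc=[1.0, 2.0, 3.0], scale=0.5, size=(200, 3))
y0_pre = Y_pre[:, 1].copy()
w_gmm = gmm_weights(y0_pre, Y_pre, gammas=[1, 2, 3, 4, 5])
```

Raw powers grow quickly with $\gamma$, so for a large $G$ some rescaling of the outcomes or of the moment weights $v_{\gamma}$ would likely be needed for numerical stability.
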
  15. Inference

    • Hypothesis testing of the sharp null $H_0: \tau_{0,t} = 0$ for $t \in \mathcal{T}_1$.
      • Note that $\tau_{0,t} = Y^I_{0,t} - Y^N_{0,t}$ under the linear factor model.
    • We usually employ conformal inference to test the hypothesis.
      • Nonparametrically test the sharp null.
      • Computational costs.
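
The slide does not specify which conformal procedure is used. As one hedged illustration, the sketch below computes a permutation p-value from cyclic (moving-block) shifts of residuals, a common construction for testing a sharp null with time-series data. The function name `moving_block_pvalue`, the mean-absolute-residual statistic, and the assumption that the residuals were already computed with the null imposed are all choices made for this example.

```python
import numpy as np

def moving_block_pvalue(residuals, n_post):
    """Permutation p-value for the sharp null using cyclic shifts.

    residuals: (T,) residuals Y_{0,t} - Yhat_{0,t} computed under the null
    n_post:    number of post-intervention periods (the last n_post entries)
    """
    u = np.asarray(residuals, dtype=float)
    T = len(u)

    def statistic(r):
        # Average absolute residual over the post-intervention block.
        return np.mean(np.abs(r[T - n_post:]))

    observed = statistic(u)
    # Every cyclic shift keeps the time ordering within the series intact.
    shifted = np.array([statistic(np.roll(u, k)) for k in range(T)])
    return float(np.mean(shifted >= observed))

# Toy example: nine quiet periods followed by one large post-intervention residual.
p_value = moving_block_pvalue([0.0] * 9 + [10.0], n_post=1)
```

The number of shifts equals $T$, so the smallest attainable p-value is $1/T$. Refitting the weights for each candidate null value is what makes this kind of procedure computationally costly in practice.
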
  16. Simulation Studies

    • $G$ is chosen from $\{2, 3, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100\}$. $J$ is chosen from $\{10, 30, 60\}$.
      • Recall that
        $$\big(\hat{w}^{\mathrm{GMM}}_j\big)_{j \in \mathcal{J}^U} := \arg\min_{(w_j):\ \sum_{j \in \mathcal{J}^U} w_j = 1} \ \sum_{\gamma \in \{1, 2, \dots, G\}} \bigg( \frac{1}{T_0} \sum_{t \in \mathcal{T}_0} \Big( (Y^N_{0,t})^{\gamma} - \sum_{j \in \mathcal{J}^U} w_j (Y^N_{j,t})^{\gamma} \Big) \bigg)^2.$$
    • Generate $Y^N_{j,t}$ from Gaussian distributions.
    • The y-axis denotes the estimation error, and the x-axis denotes $G$.
  17. Empirical Studies

    • Empirical analysis using case studies from existing studies.
      • Tobacco control in California (Abadie, Diamond and Hainmueller, 2010).
      • Basque conflict in the Basque Country (Abadie and Gardeazabal, 2003).
      • Reunification of Germany (Abadie, Diamond and Hainmueller, 2015).
    • Pretreatment fit: predictive ability for outcomes for $t \in \mathcal{T}_0$.
  18. Bayesian SCMs

    • So far, we have introduced a frequentist method for SCMs.
    • Frequentist SCMs require
      • large samples to show the convergence of the weight estimators;
      • special inference methods, such as conformal inference;
      • distance minimization to employ covariates, which is hard to justify.
    • Consider a Bayesian approach to SCMs.
      • Works with finite samples.
      • Inference with the posterior distribution.
  19. Bayesian Predictive Synthesis

    • Our Bayesian SCMs are based on the formulation of Bayesian predictive synthesis (BPS).
    • BPS: a method for synthesizing predictive models (McAlinn and West, 2019).
      • Synthesizes predictive models while reflecting model uncertainty.
      • A generalization of Bayesian model averaging.
      • Incorporates various predictive models, weighting them with time-varying parameters.
    • We regard the untreated outcomes, together with predictive models for the outcomes that use covariates, as predictors of $Y^N_{0,t}$.
      • We first predict outcomes using covariates.
      • Then, we incorporate the predictors using the BPS.
  20. BPSCM

    • We propose SCMs with the BPS, referred to as BPSCMs.
    • BPSCM:
      • $\Phi_t$: a set of time-varying parameters at $t$. $\Phi_t$ depends on $\{Y^N_{0,t+1}\}_{t \in [1:t]}$.
    • The conditional density function of $Y^N_{0,t+1}$ is referred to as the synthesis function, denoted by $\alpha\big(y \mid \{Y^N_{j,t}\}_{j \in \mathcal{J}^U}, \Phi_t\big)$.
    • The Bayesian decision maker predicts $Y^N_{0,t+1}$ using the posterior distribution defined as
      $$p^N\big(y \mid \{Y^N_{0,t+1}\}_{t \in [1:t]}, \Phi_t\big) := \int \alpha\big(y \mid \{y^N_{j,t}\}_{j \in \mathcal{J}^U}, \Phi_t\big) \prod_{j \in \mathcal{J}^U} p_{j,t}\big(y^N_{j,t}\big) \, \mathrm{d}y^N_{j,t}.$$
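
The integral that defines the predictive density can be approximated by Monte Carlo: draw the untreated outcomes from their densities, evaluate the synthesis function at each draw, and average. The sketch below assumes Gaussian densities for the untreated outcomes and a normal synthesis function with fixed parameters. These choices, and the names `normal_pdf` and `bps_predictive_density`, are illustrative assumptions rather than the specification in the paper.

```python
import numpy as np

def normal_pdf(y, mean, var):
    """Density of N(mean, var) evaluated at y."""
    return np.exp(-0.5 * (y - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def bps_predictive_density(y, w0, w, nu, unit_means, unit_sds,
                           n_draws=200_000, seed=0):
    """Monte Carlo estimate of  integral alpha(y | x) * prod_j p_j(x_j) dx.

    alpha(y | x) = N(y; w0 + x @ w, nu)         (assumed synthesis function)
    p_j          = N(unit_means[j], unit_sds[j]^2)  (assumed outcome densities)
    """
    rng = np.random.default_rng(seed)
    w = np.asarray(w, dtype=float)
    # Draw the untreated outcomes from their own densities.
    x = rng.normal(unit_means, unit_sds, size=(n_draws, len(w)))
    # Average the synthesis function over the draws.
    return float(np.mean(normal_pdf(y, w0 + x @ w, nu)))

w0, w, nu = 0.1, [0.6, 0.4], 0.25
unit_means, unit_sds = [1.0, 2.0], [0.5, 0.3]
density_mc = bps_predictive_density(1.5, w0, w, nu, unit_means, unit_sds)

# In this all-Gaussian special case the integral has a closed form.
exact_mean = w0 + float(np.dot(w, unit_means))
exact_var = nu + float(np.sum((np.array(w) * np.array(unit_sds)) ** 2))
density_exact = float(normal_pdf(1.5, exact_mean, exact_var))
```

The closed form exists only because everything here is Gaussian and linear. The Monte Carlo route is the one that carries over to other densities for the untreated outcomes.
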
  21. Dynamic Latent Factor Linear Regression Models

    • There are several specifications for the synthesis function.
    • Example: latent factor dynamic linear model.
      • Set the synthesis function as
        $$\alpha\big(y^N_{0,t} \mid \{Y^N_{j,t}\}_{j \in \mathcal{J}^U}, \Phi_t\big) = \phi\Big(y^N_{0,t};\ w_{0,t} + \sum_{j=1}^{J} w_{j,t} Y^N_{j,t},\ \nu_t\Big).$$
      • $\phi(\cdot\,; a, b^2)$: a univariate normal density with mean $a$ and variance $b^2$.
      • $v_t$ are unobserved error terms.
      • Specify the process of $Y^N_{0,t}$ and $w_{j,t}$ as
        $$Y^N_{0,t} = w_{0,t} + \sum_{j \in \mathcal{J}^U} w_{j,t} Y^N_{j,t} + \epsilon_t, \qquad \epsilon_t \sim N(0, \nu_t),$$
        $$w_{j,t} = w_{j,t-1} + \eta_{j,t}, \qquad \eta_{j,t} \sim N(0, \nu_t \boldsymbol{W}_t).$$
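
For intuition, the random-walk weight process above can be filtered with the standard Kalman recursions for a dynamic linear model. The sketch below is a simplification: it treats the observation variance and the state innovation variance as known constants (`obs_var`, `state_var`), whereas the slide lets $\nu_t$ and $\boldsymbol{W}_t$ vary over time. The function name and the toy data are assumptions.

```python
import numpy as np

def filter_time_varying_weights(y0, Y, obs_var, state_var, prior_var=10.0):
    """Kalman filter for y0[t] = w0_t + sum_j w_{j,t} Y[t, j] + eps_t,
    with random-walk weights w_t = w_{t-1} + eta_t.

    Returns a (T, J + 1) array of filtered means (intercept first).
    """
    T, J = Y.shape
    F = np.column_stack([np.ones(T), Y])      # regressors incl. intercept
    m = np.zeros(J + 1)                       # prior mean of the weights
    C = prior_var * np.eye(J + 1)             # prior covariance
    W = state_var * np.eye(J + 1)             # random-walk innovation covariance
    path = np.empty((T, J + 1))
    for t in range(T):
        R = C + W                             # prior covariance at time t
        f = F[t] @ m                          # one-step forecast mean
        q = F[t] @ R @ F[t] + obs_var         # one-step forecast variance
        A = R @ F[t] / q                      # adaptive (Kalman) gain
        m = m + A * (y0[t] - f)               # posterior mean
        C = R - np.outer(A, A) * q            # posterior covariance
        path[t] = m
    return path

# Toy check with constant true weights and small observation noise.
rng = np.random.default_rng(1)
Y = rng.normal(size=(300, 2))
w_true = np.array([0.5, 0.3, 0.7])            # intercept, unit 1, unit 2
y0 = w_true[0] + Y @ w_true[1:] + 0.01 * rng.normal(size=300)
path = filter_time_varying_weights(y0, Y, obs_var=1e-4, state_var=1e-8)
```

A larger `state_var` lets the filtered weights adapt faster to change, at the cost of noisier estimates.
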
  22. Auxiliary Covariates

    • The BPSCM can use covariates by predicting outcomes with various predictive models.
      • $X_{j,t}$: covariates for unit $j$.
    • Define $L$ predictors for $Y^N_{j,t}$ by $\big\{\hat{f}_l(X_{j,t})\big\}_{l=1}^{L}$.
      • These predictors can be constructed with machine learning methods.
      • We can use covariates in the predictive models.
    • Together with the original untreated outcomes $Y^N_{j,t}$, there are $K = (1 + L)J$ predictors, $\Big\{Y^N_{j,t}, \big(\hat{f}_l(x_{j,t})\big)_{l=1}^{L}\Big\}_{j=1}^{J}$.
    • We incorporate them by using the BPS.
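
The construction of the predictor set can be sketched as follows. This is an illustrative assumption-laden example: `build_predictor_set`, `ols_predictor`, and `mean_predictor` are hypothetical names, the predictors are fitted in-sample for brevity, and the constant-mean predictor is only a stand-in for a second model such as a random forest.

```python
import numpy as np

def ols_predictor(X_j, y_j):
    """In-sample least-squares prediction of y_j from covariates X_j."""
    design = np.column_stack([np.ones(len(y_j)), X_j])
    beta, *_ = np.linalg.lstsq(design, y_j, rcond=None)
    return design @ beta

def mean_predictor(X_j, y_j):
    """Placeholder second model: predicts the sample mean in every period."""
    return np.full(len(y_j), y_j.mean())

def build_predictor_set(Y, X, predictors):
    """Stack untreated outcomes and covariate-based predictions.

    Y: (T, J) untreated outcomes, X: (T, J, d) covariates,
    predictors: list of L callables (X_j, y_j) -> (T,) predictions.
    Returns a (T, J + J * L) array: outcomes first, then the L predictions
    for unit 1, then for unit 2, and so on.
    """
    T, J = Y.shape
    columns = [Y[:, j] for j in range(J)]
    for j in range(J):
        for predict in predictors:
            columns.append(predict(X[:, j, :], Y[:, j]))
    return np.column_stack(columns)

rng = np.random.default_rng(0)
T, J, d = 30, 4, 2
X = rng.normal(size=(T, J, d))
Y = X.sum(axis=2) + 0.1 * rng.normal(size=(T, J))
Z = build_predictor_set(Y, X, [ols_predictor, mean_predictor])
```

Each column of `Z` can then be treated as one more untreated unit when forming the synthesis.
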
  23. Auxiliary Covariates

    • The set of predictors is denoted by
      $$\boldsymbol{z}_t = \Big\{ Y^N_{1,t}, \dots, Y^N_{J,t},\ \hat{f}_1(X_{1,t}), \dots, \hat{f}_L(X_{1,t}),\ \hat{f}_1(X_{2,t}), \dots, \hat{f}_L(X_{J-1,t}),\ \hat{f}_1(X_{J,t}), \dots, \hat{f}_L(X_{J,t}) \Big\}.$$
    • Conduct the BPSCM as if there were $J + JL$ untreated units that can be used for SCMs:
      $$p^N\big(y \mid \Phi_t, \{Y^N_{0,t}\}_{t \in [1:t]}\big) = \int \alpha\big(y \mid \boldsymbol{z}_t, \Phi_t\big) \prod_{k \in \{1, 2, \dots, (1+L)J\}} p_{k,t}(z_{k,t}) \, \mathrm{d}z_{k,t}.$$
    • Example: synthesize predictive models such as linear regression and random forests.
  24. Advantages of the BPSCM

    ✓ Time-varying parameters.
    ✓ Incorporates the uncertainty of each untreated unit's outcome.
    ✓ Minimax optimality.
      • Even under model misspecification, the predictor of the BPSCM is minimax optimal in terms of KL divergence (Takanashi and McAlinn, 2021).
      • Avoids the implicit endogeneity problem?
    ✓ Works with finite samples.
    ✓ Inference (posterior distribution).
  25. Empirical Analysis

    • Empirical studies using the same case studies as in the earlier Empirical Studies slide.
    • Compare the following five prediction models. Table columns: time-varying coefficients; using covariates; synthesized predictive models. Entries for each row are listed in the order they appear in the table:
      • Abadie: ✓, -
      • BPSCM: ✓, -
      • BPSCM (Linear): ✓, ✓, least squares
      • BPSCM (RF): ✓, ✓, random forests
      • BPSCM (Linear + RF): ✓, ✓, least squares + random forests
  26. • SCMs suffer from the issue of inconsistency.
        • The LS estimator is incompatible with the assumption $\mathbb{E}\big[Y^N_{0,t}\big] = \sum_{j \in \mathcal{J}^U} w^*_j \, \mathbb{E}\big[Y^N_{j,t}\big]$. → Implicit endogeneity (measurement error bias).
        • $Y^N_{0,t} = \sum_{j \in \mathcal{J}^U} w^*_j Y^N_{j,t}$ is not realistic...?

      • Frequentist density matching (mixture model + GMM).
        • Mixture model $p_{0,t}(y) = \sum_{j \in \mathcal{J}^U} w^*_j \, p_{j,t}(y)$, a stronger assumption than $\mathbb{E}\big[Y^N_{0,t}\big] = \sum_{j \in \mathcal{J}^U} w^*_j \, \mathbb{E}\big[Y^N_{j,t}\big]$.
        • By using the GMM under this assumption, we can estimate the weights consistently.
      • BPSCM.
        • By using the Bayesian method, we can obtain the minimax optimal predictor without assuming mixture models.
        • Advantages such as flexible modeling and finite-sample inference.
  27. • Abadie, A. and Gardeazabal, J. “The economic costs of conflict: A case study of the Basque Country.” American Economic Review, 2003.

      • Abadie, A., Diamond, A., and Hainmueller, J. “Synthetic control methods for comparative case studies: Estimating the effect of California’s tobacco control program.” Journal of the American Statistical Association, 2010.
      • Abadie, A., Diamond, A., and Hainmueller, J. “Comparative politics and the synthetic control method.” American Journal of Political Science, 2015.
      • Ferman, B. and Pinto, C. “Synthetic controls with imperfect pretreatment fit.” Quantitative Economics, 12(4):1197–1221, 2021.
      • McAlinn, K. and West, M. “Dynamic Bayesian predictive synthesis in time series forecasting.” Journal of Econometrics, 2019.
      • McAlinn, K., Aastveit, K. A., Nakajima, J., and West, M. “Multivariate Bayesian predictive synthesis in macroeconomic forecasting.” Journal of the American Statistical Association, 2020.
      • Shi, C., Sridhar, D., Misra, V., and Blei, D. “On the assumptions of synthetic control methods.” In AISTATS, pp. 7163–7175, 2022.
      • Takanashi, K. and McAlinn, K. “Predictions with dynamic Bayesian predictive synthesis are exact minimax.” 2021.
      • West, M. and Harrison, P. J. “Bayesian Forecasting & Dynamic Models.” Springer Verlag, 2nd edition, 1997.