
Linear non-Gaussian models with latent variables for causal discovery (Pacific Causal Inference Conference)

Talk slides at The 2020 Pacific Causal Inference Conference

Shohei SHIMIZU

September 27, 2020


Transcript

  1. Causal discovery • A challenge of causal inference (Pearl, 2019)

    • Exploratory analysis for finding causal hypotheses
      – Infers a causal graph(s) in data-driven ways
      – Computes intervention effects based on the inferred graph
      – Combined with domain knowledge, leads to better hypotheses and is useful for designing future surveys and experiments
    [Figures: example causal graphs, e.g., sleep problems and depression in epidemiology (Rosenström et al., 2012), firm-growth variables (Empl.gr, Sales.gr, R&D.gr, OpInc.gr) over time in economics (Moneta et al., 2012), and a chemistry application (Campomanes et al., 2014)]
  2. A linear non-Gaussian acyclic model: LiNGAM (Shimizu et al., 2006;

    Shimizu, 2014)
    • Classic methods use conditional independence of variables (Pearl, 2001; Spirtes et al., 1993)
      – The limit is finding the Markov equivalent models
    • Need more assumptions to go beyond the limit
      – Restrictions on the functional forms and/or the distributions of variables
    • LiNGAM is an example
      – Non-Gaussianity assumption makes it possible to examine independence
      – Unique identification or smaller numbers of equivalent models
  3. How do independence and non-Gaussianity work? (Shimizu et al., 2011)

    • Underlying model (e_x and e_y are non-Gaussian and independent):
      x = e_x,   y = b_yx x + e_y   (b_yx ≠ 0)
    • Regress effect on cause:
      residual r_y^(x) = y − [cov(x, y) / var(x)] x = e_y
      → x and r_y^(x) are independent
    • Regress cause on effect:
      residual r_x^(y) = x − [cov(x, y) / var(y)] y = (1 − b_yx cov(x, y) / var(y)) e_x − [cov(x, y) / var(y)] e_y
      → y and r_x^(y) are dependent, although they are uncorrelated
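
To make the slide's point concrete, here is a small illustration that is not part of the original deck: NumPy code with uniform noise and the coefficient 2.0 as purely illustrative values, where a correlation between squared values is used as a crude stand-in for a proper independence test such as HSIC.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 100_000

    # Underlying model: x causes y, with non-Gaussian (uniform) external influences
    e_x = rng.uniform(-1, 1, n)
    e_y = rng.uniform(-1, 1, n)
    x = e_x
    y = 2.0 * x + e_y

    def residual(target, predictor):
        # Residual of the least-squares regression of `target` on `predictor`
        return target - (np.cov(target, predictor)[0, 1] / np.var(predictor)) * predictor

    r_y_given_x = residual(y, x)  # regress effect on cause
    r_x_given_y = residual(x, y)  # regress cause on effect

    # Both residuals are (nearly) uncorrelated with their regressor by construction ...
    print(np.corrcoef(x, r_y_given_x)[0, 1])  # ~0
    print(np.corrcoef(y, r_x_given_y)[0, 1])  # ~0

    # ... but only in the causal direction do they look independent.
    # Crude dependence proxy: correlation between squared values.
    print(np.corrcoef(x**2, r_y_given_x**2)[0, 1])  # ~0, consistent with independence
    print(np.corrcoef(y**2, r_x_given_y**2)[0, 1])  # clearly non-zero: dependence detected

In the causal direction the residual equals e_y and is independent of x; in the anti-causal direction the residual is uncorrelated with y but still depends on it, and that dependence is what non-Gaussianity makes detectable.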
  4. Python toolbox https://github.com/cdt15/lingam

    • ICA-based LiNGAM algorithm
    • DirectLiNGAM
    • VAR-LiNGAM and VARMA-LiNGAM
    • LiNGAM for multiple datasets
    • (Bottom-up) ParceLiNGAM
    • Planning to implement more
    [Slide shows a Python code example that fits a model with lingam, draws the estimated causal graph with graphviz (edge labels are the estimated coefficients), and reports bootstrap probabilities for the edges]
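
The slide's own code could not be recovered cleanly from the transcript; as a substitute, here is a minimal usage sketch of the package's DirectLiNGAM estimator on toy data, following the package's documented basic usage. The coefficients 3.0 and 6.0 are illustrative only, and the graphviz drawing and bootstrap probabilities shown on the slide are omitted.

    import numpy as np
    import lingam

    rng = np.random.default_rng(0)
    n = 1000

    # Toy data following a linear non-Gaussian acyclic model: x0 -> x1 -> x2
    x0 = rng.uniform(-1, 1, n)
    x1 = 3.0 * x0 + rng.uniform(-1, 1, n)
    x2 = 6.0 * x1 + rng.uniform(-1, 1, n)
    X = np.column_stack([x0, x1, x2])

    model = lingam.DirectLiNGAM()
    model.fit(X)

    print(model.causal_order_)      # estimated causal ordering of the columns
    print(model.adjacency_matrix_)  # estimated connection-strength matrix B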
  5. LiNGAM with hidden common causes (Hoyer, Shimizu, Kerminen, & Palviainen,

    2008)
    • Example causal graph (shown on the slide) and its model:
      x = λ_x f + e_x,   y = b_yx x + λ_y f + e_y
    • Model:
      x_i = Σ_{j≠i} b_ij x_j + Σ_l λ_il f_l + e_i
    • Its matrix form:
      x = B x + Λ f + e
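
A small NumPy sketch, not from the slides and with arbitrary illustrative coefficients, that generates data from this model with one hidden common cause and checks that the equation-by-equation form and the matrix form x = (I − B)^{-1}(Λf + e) give the same data.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 10_000

    # x = B x + Lambda f + e, with one hidden common cause f
    b_yx, lam_x, lam_y = 0.8, 1.0, 1.0          # illustrative coefficients (not from the slides)
    f   = rng.uniform(-1, 1, n)                  # hidden common cause (non-Gaussian)
    e_x = rng.uniform(-1, 1, n)
    e_y = rng.uniform(-1, 1, n)

    x = lam_x * f + e_x
    y = b_yx * x + lam_y * f + e_y

    # Matrix form: x = (I - B)^{-1} (Lambda f + e)
    B      = np.array([[0.0, 0.0], [b_yx, 0.0]])
    Lambda = np.array([[lam_x], [lam_y]])
    E      = np.column_stack([e_x, e_y])
    X_mat  = (np.linalg.inv(np.eye(2) - B) @ (Lambda @ f[None, :] + E.T)).T

    assert np.allclose(X_mat, np.column_stack([x, y]))  # the two forms coincide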
  6. Two lines of research 1. Estimate causal structures of variables

    that share hidden common causes
    2. Estimate causal structures of variables that do not share hidden common causes
    [Figure: the two settings illustrated with and without hidden common causes of the observed variables]
  7. 1. Estimate causal structures of variables that share hidden common

    causes
    • ICA: independent component analysis (Comon, 1994; Eriksson & Koivunen, 2004; Hyvärinen et al., 2001)
      – Factor analysis with no factor rotation indeterminacy
      – Factors are independent and non-Gaussian
    • LiNGAM with hidden common causes is ICA:
      x = B x + Λ f + e   ⟺   x = (I − B)^{-1} Λ f + (I − B)^{-1} e,
      i.e., ICA with mixing matrix (I − B)^{-1} [Λ  I] applied to the independent sources (f, e)
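
As a quick numerical check of this rewriting (illustrative coefficient values only), the mixing matrix that ICA would have to recover is (I − B)^{-1}[Λ I]:

    import numpy as np

    # x = B x + Lambda f + e   <=>   x = (I - B)^{-1} [Lambda  I] [f; e]
    b_yx, lam_x, lam_y = 0.8, 1.0, 1.0   # illustrative values (not from the slides)
    B      = np.array([[0.0, 0.0], [b_yx, 0.0]])
    Lambda = np.array([[lam_x], [lam_y]])

    A = np.linalg.inv(np.eye(2) - B) @ np.hstack([Lambda, np.eye(2)])
    print(A)   # ICA mixing matrix over the independent sources (f, e_x, e_y)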
  8. Basic idea (Hoyer, Shimizu, Kerminen & Palviainen, 2008) • All

    the three models are identifiable
    – The zero/non-zero patterns of the mixing matrices are different under faithfulness
    – Apply ICA and see the zero/non-zero pattern
      x → y:            [x; y] = [[1, 0, λ_x], [b_yx, 1, λ_y]] [e_x; e_y; f]
      y → x:            [x; y] = [[1, b_xy, λ_x], [0, 1, λ_y]] [e_x; e_y; f]
      no direct effect: [x; y] = [[1, 0, λ_x], [0, 1, λ_y]] [e_x; e_y; f]
      (λ_x, λ_y denote the total effects of the hidden common cause f on x and y, respectively)
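
The zero/non-zero-pattern argument can be checked numerically. The sketch below, with illustrative coefficients and sources ordered as (e_x, e_y, f), prints the reduced-form mixing patterns of the three models; note that actually estimating a 2x3 mixing matrix from two observed variables requires overcomplete ICA, which is not shown here.

    import numpy as np

    def mixing(b_yx=0.0, b_xy=0.0, lam_x=1.0, lam_y=1.0):
        # Reduced-form mixing matrix A in [x, y]^T = A [e_x, e_y, f]^T
        B = np.array([[0.0, b_xy], [b_yx, 0.0]])
        Lam = np.array([[lam_x], [lam_y]])
        return np.linalg.inv(np.eye(2) - B) @ np.hstack([np.eye(2), Lam])

    for name, A in [("x -> y", mixing(b_yx=0.8)),
                    ("y -> x", mixing(b_xy=0.8)),
                    ("no direct effect", mixing())]:
        print(name)
        print((np.abs(A) > 1e-12).astype(int))  # the zero/non-zero pattern differs per model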
  9. Identifiability (Salehkaleybar et al., 2020) • If no overlap in

    descendants of observed variables and hidden common causes, the causal orders and intervention effects are identifiable
    • If there is some overlap, only the causal orders are identifiable; their intervention effects are not
    [Slide shows example mixing matrices for the overlap and no-overlap cases]
  10. 2. Estimate causal structures of variables that do not share

    hidden common causes
    • A simple case: only exogenous variables share hidden common causes
    [Figure: an underlying model of this kind and the corresponding output graph]
  11. Bottom-up approach for estimating causal orders (Tashiro, Shimizu, Hyvärinen, &

    Washio, 2014)
    • Do the following for all the variables x_j (j = 1, …, p):
      – Regress x_j on the other variables
      – If and only if the explanatory variables and the residual are independent, the variable is an unconfounded sink
    • Exclude the sink
    • Repeat …
    [Figure: the procedure illustrated on an example graph, ending when the algorithm stops]
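
A minimal NumPy sketch of one round of this procedure, written as my own illustration rather than the authors' implementation: each variable is regressed on all the others, and a crude dependence proxy (correlation between squared values) stands in for the kernel-based independence test used in the paper. The helper names and the threshold are hypothetical.

    import numpy as np

    def dependence_proxy(resid, regressors):
        # Crude stand-in for an independence test (e.g., HSIC):
        # maximum |corr| between the squared residual and each squared regressor.
        return max(abs(np.corrcoef(resid**2, regressors[:, j]**2)[0, 1])
                   for j in range(regressors.shape[1]))

    def find_unconfounded_sink(X, threshold=0.05):
        # Regress each variable on all the others and return the index whose
        # residual looks independent of the regressors (data is centered first).
        X = X - X.mean(axis=0)
        scores = []
        for j in range(X.shape[1]):
            others = np.delete(X, j, axis=1)
            coef, *_ = np.linalg.lstsq(others, X[:, j], rcond=None)
            scores.append(dependence_proxy(X[:, j] - others @ coef, others))
        best = int(np.argmin(scores))
        return best if scores[best] < threshold else None  # None: no unconfounded sink

Calling this repeatedly on the remaining columns, excluding the returned sink each time, yields a causal order from the bottom up and stops when no unconfounded sink is found.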
  12. A generalization for finding unconfounded parents of non-sink variables (Maeda

    & Shimizu, 2020) • 1. Find unconfounded ancestors of each variable • 2. Find unconfounded parents among the unconfounded ancestors found 13 Find a set of variables that gives independent residuals when # is regressed on every its subset (Lemma 3) Regress # on the unconfounded ancestors of # except ! Regress ! on the unconfounded common ancestors of ! and # If the two residuals are correlated, ! is a (unconfounded) parent of ! Otherwise not (Lemma 4) Wang and Drton (2020, arXiv preprint) considered criteria that can be applied to more general cases !! !" "" !# !$ "! !! !" "" !# !$ "! !! !!
  13. Final summary • Causal structure learning in the presence of

    hidden common causes
      – A challenge of causal discovery
      – Independence matters rather than uncorrelatedness
    • Future lines of research
      – Mixed data with continuous and discrete variables
      – Multiple datasets
      – More collaborations with domain experts
    • Other latent variable models
      – Latent factors (Shimizu et al., 2009)
      – Latent classes (Shimizu et al., 2008), etc.
    Y. Zeng, S. Shimizu, R. Cai, F. Xie, M. Yamamoto, Z. Hao (2020, arXiv preprint)
  14. References • J. Pearl. The seven tools of causal inference

    with reflections on machine learning. Communications of the ACM, 62(3): 54-60, 2019
    • T. Rosenström, M. Jokela, S. Puttonen, M. Hintsanen, L. Pulkki-Råback, J. S. Viikari, O. T. Raitakari and L. Keltikangas-Järvinen. Pairwise measures of causal direction in the epidemiology of sleep problems and depression. PLoS ONE, 7(11): e50841, 2012
    • A. Moneta, D. Entner, P. O. Hoyer and A. Coad. Causal inference by independent component analysis: Theory and applications. Oxford Bulletin of Economics and Statistics, 75(5): 705-730, 2013
    • P. Campomanes, M. Neri, B. A. C. Horta, U. F. Roehrig, S. Vanni, I. Tavernelli and U. Rothlisberger. Origin of the spectral shifts among the early intermediates of the rhodopsin photocycle. Journal of the American Chemical Society, 136(10): 3842-3851, 2014
    • S. Shimizu, P. O. Hoyer, A. Hyvärinen and A. Kerminen. A linear non-Gaussian acyclic model for causal discovery. Journal of Machine Learning Research, 7: 2003-2030, 2006
    • S. Shimizu. LiNGAM: Non-Gaussian methods for estimating causal structures. Behaviormetrika, 41(1): 65-98, 2014
    • S. Shimizu, T. Inazumi, Y. Sogawa, A. Hyvärinen, Y. Kawahara, T. Washio, P. O. Hoyer and K. Bollen. DirectLiNGAM: A direct method for learning a linear non-Gaussian structural equation model. Journal of Machine Learning Research, 12: 1225-1248, 2011
    • J. Pearl. Causality. Cambridge University Press, 2001
    • P. Spirtes, C. Glymour and R. Scheines. Causation, Prediction, and Search. Springer, 1993
  15. References • P. O. Hoyer, S. Shimizu, A. Kerminen and

    M. Palviainen. Estimation of causal effects using linear non-Gaussian causal models with hidden variables. International Journal of Approximate Reasoning, 49(2): 362-378, 2008
    • P. Comon. Independent component analysis, a new concept? Signal Processing, 1994
    • J. Eriksson and V. Koivunen. Identifiability, separability, and uniqueness of linear ICA models. IEEE Signal Processing Letters, 2004
    • A. Hyvärinen, J. Karhunen and E. Oja. Independent Component Analysis. Wiley, 2001
    • S. Salehkaleybar, A. Ghassami, N. Kiyavash and K. Zhang. Learning linear non-Gaussian causal models in the presence of latent variables. Journal of Machine Learning Research, 21: 1-24, 2020
    • T. Tashiro, S. Shimizu, A. Hyvärinen and T. Washio. ParceLiNGAM: A causal ordering method robust against latent confounders. Neural Computation, 26(1): 57-83, 2014
    • T. N. Maeda and S. Shimizu. RCD: Repetitive causal discovery of linear non-Gaussian acyclic models with latent confounders. In Proc. 23rd International Conference on Artificial Intelligence and Statistics (AISTATS2020), 2020
    • Y. S. Wang and M. Drton. Causal discovery with unobserved confounding and non-Gaussian data. arXiv preprint arXiv:2007.11131, 2020
    • S. Shimizu, P. O. Hoyer and A. Hyvärinen. Estimation of linear non-Gaussian acyclic models for latent factors. Neurocomputing, 72: 2024-2027, 2009
    • S. Shimizu and A. Hyvärinen. Discovery of linear non-Gaussian acyclic models in the presence of latent classes. In Proc. 14th Int. Conf. on Neural Information Processing (ICONIP2007), pp. 752-761, Kitakyushu, Japan, 2008
    • Y. Zeng, S. Shimizu, R. Cai, F. Xie, M. Yamamoto and Z. Hao. Causal discovery with multi-domain LiNGAM for latent factors. arXiv preprint arXiv:2009.09176, 2020