LiNGAM Python package

Shohei SHIMIZU
November 05, 2021

Explains what the LiNGAM Python package can do, presented at a seminar for causal discovery users

Transcript

  1. LiNGAM model is identifiable (Shimizu, Hyvarinen, Hoyer & Kerminen, 2006)

    • Linear Non-Gaussian Acyclic Model: $x_i = \sum_{k(j) < k(i)} b_{ij} x_j + e_i$, or in matrix form $\boldsymbol{x} = B\boldsymbol{x} + \boldsymbol{e}$
      – $k(i)$ ($i = 1, \dots, p$): causal (topological) order of $x_i$
      – Error variables $e_i$ are independent and non-Gaussian
    • Coefficients and causal orders identifiable
    • Causal graph identifiable
    [Figure: causal graph over $x_1$, $x_2$, $x_3$ with error variables $e_1$, $e_2$, $e_3$ and edge coefficients $b_{21}$, $b_{23}$, $b_{13}$]
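As a minimal sketch of the model on this slide, the snippet below draws samples from x = Bx + e by visiting the variables in causal order. The function name `sample_lingam`, the uniform error distribution, and the coefficient values are assumptions chosen for illustration; they are not part of the LiNGAM package.

```python
import random

def sample_lingam(B, causal_order, n_samples, seed=0):
    """Draw samples from x = Bx + e with independent, non-Gaussian errors.

    B[i][j] is the direct effect of x_j on x_i. causal_order lists the
    variable indices from the first (exogenous) variable to the last.
    Uniform errors are an illustrative choice of non-Gaussian distribution.
    """
    rng = random.Random(seed)
    p = len(B)
    samples, errors = [], []
    for _ in range(n_samples):
        e = [rng.uniform(-1.0, 1.0) for _ in range(p)]
        x = [0.0] * p
        # Parents always precede children in the causal order, so every
        # value needed on the right-hand side is already filled in.
        for i in causal_order:
            x[i] = sum(B[i][j] * x[j] for j in range(p)) + e[i]
        samples.append(x)
        errors.append(e)
    return samples, errors

# Illustrative three-variable chain (0-based indices): x2 -> x0 -> x1.
B = [[0.0, 0.0, 1.5],
     [-1.3, 0.0, 0.0],
     [0.0, 0.0, 0.0]]
samples, errors = sample_lingam(B, causal_order=[2, 0, 1], n_samples=1000)
```

Because B is acyclic, it can be permuted to a strictly lower triangular matrix according to the causal order, which is why a single pass in that order is enough.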
  2. Statistical reliability assessment

    • Bootstrap probability (bp) of directed paths and edges
    • Interpret causal effects having bp larger than a threshold, say 5%
    [Figure: example graphs over x0, x1, x2, x3 annotated with bootstrap probabilities (99%, 96%, 10%) and a total effect of 20.9]
    LiNGAM Python package: https://github.com/cdt15/lingam
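The bootstrap probability of an edge or a directed path is the fraction of bootstrap resamples in which it appears in the estimated graph. The sketch below assumes the adjacency matrices estimated on each resample are already available as nested lists; the helper names and the toy matrices are hypothetical and do not reproduce the package's own bootstrap API.

```python
def has_directed_path(B, src, dst):
    """True if the graph encoded by B has a directed path src -> dst.

    Convention assumed here: B[i][j] != 0 means an edge x_j -> x_i.
    """
    stack, seen = [src], {src}
    while stack:
        node = stack.pop()
        for child in range(len(B)):
            if B[child][node] != 0 and child not in seen:
                if child == dst:
                    return True
                seen.add(child)
                stack.append(child)
    return False

def bootstrap_probability(adjacency_matrices, src, dst, kind="path"):
    """Fraction of bootstrap estimates containing the edge or path src -> dst."""
    hits = 0
    for B in adjacency_matrices:
        if kind == "edge":
            hits += int(B[dst][src] != 0)
        else:
            hits += int(has_directed_path(B, src, dst))
    return hits / len(adjacency_matrices)

# Toy bootstrap estimates (made-up values for illustration only).
chain = [[0, 0, 1.5], [-1.3, 0, 0], [0, 0, 0]]   # x2 -> x0 -> x1
direct = [[0, 0, 0], [0, 0, 0.8], [0, 0, 0]]     # x2 -> x1
empty = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]        # no edges
estimates = [chain, chain, direct, empty]

bp_path = bootstrap_probability(estimates, src=2, dst=1, kind="path")
bp_edge = bootstrap_probability(estimates, src=2, dst=1, kind="edge")
threshold = 0.05
keep_path = bp_path > threshold
```

In this toy input the path x2 -> x1 appears in three of four estimates and the direct edge in one of four, so both exceed the 5% threshold mentioned on the slide.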
  3. Before estimating causal graphs

    • Assessing assumptions by
      – Gaussianity test
      – Histograms
        • continuous?
      – Too high correlation?
        • multicollinearity?
      – Background knowledge
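Two of the checks above can be approximated with simple descriptive statistics. The sketch below uses sample excess kurtosis as a crude indicator of non-Gaussianity (it is zero for a Gaussian) and the Pearson correlation to flag near-collinear pairs. It is not a formal Gaussianity test, and the function names and simulated data are assumptions for illustration.

```python
import random

def excess_kurtosis(values):
    """Sample excess kurtosis; close to 0 for Gaussian data."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n
    m4 = sum((v - m) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3.0

def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5

rng = random.Random(0)
uniform_data = [rng.uniform(-1.0, 1.0) for _ in range(5000)]
gaussian_data = [rng.gauss(0.0, 1.0) for _ in range(5000)]
# A nearly exact linear copy, to mimic a multicollinear pair of variables.
near_copy = [2.0 * v + 0.01 * rng.gauss(0.0, 1.0) for v in uniform_data]

kurt_uniform = excess_kurtosis(uniform_data)    # clearly negative
kurt_gaussian = excess_kurtosis(gaussian_data)  # near zero
corr_copy = pearson(uniform_data, near_copy)    # near one
```

A kurtosis near zero does not prove Gaussianity, so in practice this would be combined with histograms and a proper test.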
  4. After estimating causal graphs

    • Assessing assumptions by
      – Testing independence of error variables, for example, by HSIC (Gretton et al., 2005)
      – Prediction accuracy using Markov boundary (Biza et al., 2020)
      – Compare with the results of other datasets in which causal graphs are expected to be similar
      – Check against background knowledge
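To illustrate the first check, here is a minimal sketch of the biased empirical HSIC statistic with Gaussian kernels, computed in plain Python. It returns only the statistic; a real test would also need a null distribution (for example by permutation). The fixed bandwidth of 0.5 and the simulated series are assumptions made for this example.

```python
import math
import random

def centered_gram(values, bandwidth):
    """Gaussian-kernel Gram matrix with rows and columns centered."""
    n = len(values)
    K = [[math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2)) for b in values]
         for a in values]
    row_means = [sum(row) / n for row in K]
    total_mean = sum(row_means) / n
    # K is symmetric, so row means equal column means.
    return [[K[i][j] - row_means[i] - row_means[j] + total_mean
             for j in range(n)] for i in range(n)]

def hsic(x, y, bandwidth=0.5):
    """Biased empirical HSIC: trace(K H L H) / n^2."""
    n = len(x)
    Kc = centered_gram(x, bandwidth)
    Lc = centered_gram(y, bandwidth)
    return sum(Kc[i][j] * Lc[i][j] for i in range(n) for j in range(n)) / n ** 2

rng = random.Random(0)
e1 = [rng.uniform(-1.0, 1.0) for _ in range(200)]
e2_independent = [rng.uniform(0.0, 1.0) for _ in range(200)]
# Uncorrelated with e1 but fully dependent on it.
e2_dependent = [v ** 2 for v in e1]

hsic_independent = hsic(e1, e2_independent)
hsic_dependent = hsic(e1, e2_dependent)
```

The dependent pair is linearly uncorrelated, so a correlation check would miss it, while the HSIC statistic is clearly larger than for the independent pair.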
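  5. DirectLiNGAM algorithm (Shimizu et al., 2011)

    • Repeat linear regression and independence evaluation
      – https://lingam.readthedocs.io/en/latest/tutorial/lingam.html
    • p>n cases (Wang & Drton, 2020)
      – https://github.com/ysamwang/highDNG
    • Example model:
      $$\begin{bmatrix} x_3 \\ x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 \\ 1.5 & 0 & 0 \\ 0 & -1.3 & 0 \end{bmatrix} \begin{bmatrix} x_3 \\ x_1 \\ x_2 \end{bmatrix} + \begin{bmatrix} e_3 \\ e_1 \\ e_2 \end{bmatrix}$$
    • Model of the residuals $r_1^{(3)}$, $r_2^{(3)}$ after regressing on $x_3$:
      $$\begin{bmatrix} r_1^{(3)} \\ r_2^{(3)} \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ -1.3 & 0 \end{bmatrix} \begin{bmatrix} r_1^{(3)} \\ r_2^{(3)} \end{bmatrix} + \begin{bmatrix} e_1 \\ e_2 \end{bmatrix}$$
    [Figure: causal graph over x3, x1, x2 and the graph over the residuals $r_1^{(3)}$, $r_2^{(3)}$]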
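The "regress, then evaluate independence" step can be sketched as follows: each variable is tried as the candidate exogenous variable, the others are regressed on it, and the candidate whose residuals look most independent of it is selected. The dependence score below (absolute correlations with cubed values) is a crude stand-in chosen to keep the example short; it is not the independence measure used by DirectLiNGAM or the package. All names and the simulated chain x3 -> x1 -> x2 are illustrative assumptions.

```python
import random

def mean(values):
    return sum(values) / len(values)

def correlation(a, b):
    ma, mb = mean(a), mean(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5

def residual(target, regressor):
    """Residual of the least-squares regression of target on regressor."""
    mt, mr = mean(target), mean(regressor)
    slope = (sum((t - mt) * (r - mr) for t, r in zip(target, regressor))
             / sum((r - mr) ** 2 for r in regressor))
    return [(t - mt) - slope * (r - mr) for t, r in zip(target, regressor)]

def dependence(x, r):
    """Crude nonlinear-correlation proxy for dependence (an assumption)."""
    return (abs(correlation([v ** 3 for v in x], r))
            + abs(correlation(x, [v ** 3 for v in r])))

def find_exogenous(columns):
    """Index of the variable whose regression residuals look most independent."""
    scores = []
    for j, candidate in enumerate(columns):
        scores.append(sum(dependence(candidate, residual(other, candidate))
                          for i, other in enumerate(columns) if i != j))
    return min(range(len(columns)), key=scores.__getitem__), scores

rng = random.Random(0)
x3 = [rng.uniform(-1.0, 1.0) for _ in range(20000)]
x1 = [1.5 * a + rng.uniform(-1.0, 1.0) for a in x3]
x2 = [-1.3 * a + rng.uniform(-1.0, 1.0) for a in x1]

exogenous_index, scores = find_exogenous([x1, x2, x3])  # columns: x1, x2, x3
```

Repeating this step on the residuals of the remaining variables yields a full causal order. The non-Gaussian errors are what make the wrong regression directions detectable, since those residuals are uncorrelated with the regressor but still dependent on it.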
  6. Prior knowledge

    https://lingam.readthedocs.io/en/latest/tutorial/pk_direct.html
    • Prior knowledge about topological orders: k(3) < k(1) < k(2)
    • Use prior knowledge in estimating topological causal orders and in pruning redundant edges
    [Figure: graph over x3, x1, x2 with residuals $r_1^{(3)}$, $r_2^{(3)}$]
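One way to make an ordering constraint such as k(3) < k(1) < k(2) machine-readable is a matrix that marks which directed paths are ruled out. The sketch below uses a hypothetical encoding (0 = "x_j has no directed path to x_i", -1 = unknown) and a hypothetical helper name; it is not necessarily the encoding the package expects, so check the linked tutorial before passing such a matrix to the library.

```python
UNKNOWN, NO_PATH = -1, 0

def prior_knowledge_from_order(order, n_variables):
    """Encode a known (partial) causal order as a constraint matrix.

    order lists 0-based variable indices, earliest first. Entry [i][j] is
    NO_PATH when x_j comes after x_i in the order (so x_j cannot cause x_i)
    and UNKNOWN otherwise. Variables missing from order stay unconstrained.
    """
    position = {var: rank for rank, var in enumerate(order)}
    pk = [[UNKNOWN] * n_variables for _ in range(n_variables)]
    for i in order:
        for j in order:
            if position[j] > position[i]:
                pk[i][j] = NO_PATH
    return pk

def violations(B, pk):
    """Edges x_j -> x_i in an estimate B that contradict the constraints."""
    n = len(B)
    return [(j, i) for i in range(n) for j in range(n)
            if pk[i][j] == NO_PATH and B[i][j] != 0]

# k(3) < k(1) < k(2) with 0-based indices x1=0, x2=1, x3=2.
pk = prior_knowledge_from_order(order=[2, 0, 1], n_variables=3)

consistent = [[0, 0, 1.5], [-1.3, 0, 0], [0, 0, 0]]    # x3 -> x1 -> x2
inconsistent = [[0, 0.7, 0], [0, 0, 0], [0, 0, 0]]     # x2 -> x1
```

Such a matrix can serve both purposes named on the slide: restricting the search over causal orders and discarding edges that the order rules out.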
  7. Multiple datasets

    • Simultaneously analyze different datasets to use similarity (Ramsey et al., 2011; Shimizu, 2012)
      – Similarity: Causal orders same, distributions and coefficients may differ
      – https://lingam.readthedocs.io/en/latest/tutorial/multiple_dataset.html
    [Figure: two causal graphs over x1, x2, x3 with error variables e1, e2, e3; Dataset 1 has edge coefficients 4, -3, 2 and Dataset 2 has -0.5, 5]
  8. Multiple datasets: Longitudinal data

    • Longitudinal data consist of multiple samples collected over a period of time (Kadowaki et al., 2013)
    • https://lingam.readthedocs.io/en/latest/tutorial/longitudinal.html
  9. Analysis of predictive mechanisms

    • Combine the causal model and predictive model to model the prediction mechanism
      – Causal model over $X_1, X_2, X_3, X_4, Y$, e.g. $x_4 = f_4(y, e_4)$
      – Predictive model: $\hat{y} = f(x_1, x_2, x_3, x_4)$
      – Prediction mechanism model: $E(\hat{y} \mid do(x_i = c))$
    [Figure: causal model, predictive model, and the combined prediction mechanism model over $X_1, \dots, X_4$, $Y$ and $\hat{Y}$]
    https://lingam.readthedocs.io/en/latest/tutorial/causal_effect.html#identification-of-feature-with-greatest-causal-influence-on-prediction
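For intuition, the quantity E(ŷ | do(x_i = c)) has a simple form when both models are linear. The sketch below assumes a linear acyclic causal model over the features only, zero-mean errors, and a linear predictor; under these assumptions the expected feature values under the intervention can be propagated in causal order. Function names, coefficients, and weights are hypothetical, and this is a simplification of the setting on the slide.

```python
def expected_prediction_under_do(B, causal_order, weights, intercept,
                                 target, value):
    """E[y_hat | do(x_target = value)] for linear models with zero-mean errors.

    B[i][j] is the direct effect of x_j on x_i; the predictor is
    y_hat = intercept + sum_i weights[i] * x_i.
    """
    p = len(B)
    means = [0.0] * p
    for i in causal_order:
        if i == target:
            means[i] = value  # the intervention overrides x_target's equation
        else:
            means[i] = sum(B[i][j] * means[j] for j in range(p))
    return intercept + sum(w * m for w, m in zip(weights, means))

# Illustrative chain x2 -> x0 -> x1 and an illustrative linear predictor.
B = [[0.0, 0.0, 1.5],
     [-1.3, 0.0, 0.0],
     [0.0, 0.0, 0.0]]
order = [2, 0, 1]
weights, intercept = [1.0, 2.0, 0.5], 0.0

# Change in the expected prediction when each feature is moved from 0 to 1.
effects = [
    expected_prediction_under_do(B, order, weights, intercept, i, 1.0)
    - expected_prediction_under_do(B, order, weights, intercept, i, 0.0)
    for i in range(3)
]
most_influential = max(range(3), key=lambda i: abs(effects[i]))
```

Note that intervening on an upstream feature changes the prediction both directly through its own weight and indirectly through its descendants, which is why the causal model is needed in addition to the predictor.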
  10. Illustrative example

    • Auto-MPG (miles per gallon) dataset
    • Linear regression
    • Which variable has the greatest intervention effect on MPG prediction?
    • Which variable should be intervened on to obtain a certain MPG prediction? (Control)
    [Figure: causal graph over Cylinders, Displacement, Weight, Horsepower, Acceleration, MPG and the predicted MPG]

    Desired MPG prediction | Suggested intervention on cylinders
    15                     | 8
    21                     | 6
    30                     | 4
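The control question can be sketched for the linear case, where the expected prediction under an intervention is a straight line in the intervention value: E(ŷ | do(x = c)) = intercept + slope · c. The helper below picks, among the admissible values, the one whose expected prediction is closest to the desired one. The intercept and slope are placeholder numbers invented for this example, not estimates from the Auto-MPG data, and the function name is hypothetical.

```python
def suggest_intervention(intercept, slope, desired_prediction, allowed_values):
    """Admissible intervention value whose expected prediction is closest.

    Assumes E[y_hat | do(x = c)] = intercept + slope * c (linear case).
    """
    def expected_prediction(c):
        return intercept + slope * c
    return min(allowed_values,
               key=lambda c: abs(expected_prediction(c) - desired_prediction))

# Placeholder coefficients (made up for illustration only).
intercept, slope = 39.0, -3.0
cylinder_options = [4, 6, 8]

suggestions = {
    desired: suggest_intervention(intercept, slope, desired, cylinder_options)
    for desired in (15, 21, 30)
}
```

Restricting the search to admissible values matters for variables such as the number of cylinders, where solving the linear equation exactly could return a value that cannot be realized.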
  11. Time series model

    • Subsampling data:
      – SVAR: Structural Vector Autoregressive model (Swanson & Granger, 1997): $x(t) = \sum_{\tau=0}^{k} B_\tau x(t-\tau) + e(t)$
      – Identifiability using non-Gaussianity (Hyvarinen et al., 2010)
        • https://lingam.readthedocs.io/en/latest/tutorial/var.html
      – VARMA instead of VAR (Kawahara et al., 2011)
        • https://lingam.readthedocs.io/en/latest/tutorial/varma.html
    • Nonstationarity
      – Assumption: Differences are stationary (Moneta et al., 2013)
    [Figure: time-series graph over x1(t-1), x2(t-1), x1(t), x2(t) with error variables e1(t-1), e2(t-1), e1(t), e2(t)]
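A minimal simulation of the SVAR model with one lag, x(t) = B_0 x(t) + B_1 x(t-1) + e(t), can make the roles of the instantaneous and lagged matrices concrete. The function name, coefficient values, and uniform errors below are assumptions for illustration; B_0 must be acyclic so the instantaneous effects can be resolved in causal order.

```python
import random

def simulate_svar1(B0, B1, instantaneous_order, n_steps, seed=0):
    """Simulate x(t) = B0 x(t) + B1 x(t-1) + e(t) with uniform errors.

    B0 holds instantaneous effects (acyclic) and B1 the lag-1 effects.
    instantaneous_order is a causal order consistent with B0.
    """
    rng = random.Random(seed)
    p = len(B0)
    previous = [0.0] * p
    series, errors = [], []
    for _ in range(n_steps):
        e = [rng.uniform(-1.0, 1.0) for _ in range(p)]
        current = [0.0] * p
        for i in instantaneous_order:
            instantaneous = sum(B0[i][j] * current[j] for j in range(p))
            lagged = sum(B1[i][j] * previous[j] for j in range(p))
            current[i] = instantaneous + lagged + e[i]
        series.append(current)
        errors.append(e)
        previous = current
    return series, errors

# Illustrative two-variable system: x1(t) -> x2(t) plus lag-1 effects.
B0 = [[0.0, 0.0],
      [0.5, 0.0]]
B1 = [[0.3, 0.0],
      [0.2, 0.4]]
series, errors = simulate_svar1(B0, B1, instantaneous_order=[0, 1], n_steps=500)
```

With Gaussian errors B_0 would not be identifiable from such a series, which is the reason the slide pairs SVAR with non-Gaussianity.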
  12. Hidden common cause (1)

    • Assumption: hidden common causes are allowed only for exogenous variables
    [Figure: graphs over x1, x2, x3 without and with a hidden common cause f1]
    https://lingam.readthedocs.io/en/latest/tutorial/bottom_up_parce.html
  13. Hidden common cause (2): RCD

    • For unconfounded pairs with no hidden common causes, estimate the causal directions
    • For confounded pairs with hidden common causes, let them remain unknown
    [Figure: underlying model over $x_1$, $x_2$, $x_3$, $x_4$ with hidden common causes $f_1$, $f_2$, and the corresponding output graph]
    https://lingam.readthedocs.io/en/latest/tutorial/rcd.html
  14. Time series model with hidden common causes

    • SVAR with hidden common causes
      – Malinsky and Spirtes (2018)
      – Gerhardus and Runge (2020)
      – Nonparametric
      – Conditional independence
      – Python: https://github.com/jakobrunge/tigramite
  15. Methods based on conditional independencies

    • GUI: Tetrad
      – https://github.com/cmu-phil/tetrad
    • Python: causal-learn (including LiNGAM variants)
      – https://github.com/cmu-phil/causal-learn
    • R: pcalg
      – https://cran.r-project.org/web/packages/pcalg/index.html
  16. Future plan

    • A nonlinear version of RCD: CAM-UV
    • Latent factors
    • Mixed data with continuous and discrete variables
    • Overcomplete ICA based method for hidden common cause cases under development
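  17. LiNGAM for latent factors (Shimizu et al., 2009)

    • Model: $\boldsymbol{f} = B\boldsymbol{f} + \boldsymbol{\epsilon}$, $\boldsymbol{x} = G\boldsymbol{f} + \boldsymbol{e}$
      – Two pure measurement variables per latent factor needed to identify the measurement model (Silva et al., 2006; Xie et al., 2020)
    • Estimate the latent factors and then their causal graph
    [Figure: measurement variables $x_1$ to $x_4$, estimated latent factors $\hat{f}_1$, $\hat{f}_2$, and an unknown (?) causal relation between the factors]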
  18. Find common and unique factors across multiple datasets (Zeng et al., 2021)

    • Model: $\boldsymbol{f}^{(m)} = B^{(m)} \boldsymbol{f}^{(m)} + \boldsymbol{\epsilon}^{(m)}$, $\boldsymbol{x}^{(m)} = G^{(m)} \boldsymbol{f}^{(m)} + \boldsymbol{e}^{(m)}$, $m = 1, \dots, M$
    • Score function: likelihood + DAGness (Zheng et al., 2018)
    • Feature extraction across multiple datasets + causal discovery of latent factors
    [Figure: latent factors estimated from each dataset, with question marks on which factors are shared across datasets]