LiNGAM Python package

LiNGAM Python package
Shohei SHIMIZU, Shiga University & RIKEN
13 Nov 2021

LiNGAM Python package
• https://github.com/cdt15/lingam
• Please give it a star!
• Takashi Ikeuchi, SCREEN AS

Documentation
• https://lingam.readthedocs.io/en/latest/#

LiNGAM model is identifiable (Shimizu, Hyvarinen, Hoyer & Kerminen, 2006)
• Linear Non-Gaussian Acyclic Model:
  $x_i = \sum_{k(j) < k(i)} b_{ij} x_j + e_i$, or in matrix form $\boldsymbol{x} = B\boldsymbol{x} + \boldsymbol{e}$
  – $k(i)$ ($i = 1, \dots, p$): causal (topological) order of $x_i$
  – Error variables $e_i$ are independent and non-Gaussian
• Coefficients and causal orders identifiable
• Causal graph identifiable
[Figure: causal graph over $x_1, x_2, x_3$ with error variables $e_1, e_2, e_3$ and edge coefficients $b_{21}, b_{23}, b_{13}$]
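To make the model equation concrete, the following is a minimal sketch of drawing samples from $\boldsymbol{x} = B\boldsymbol{x} + \boldsymbol{e}$ by visiting variables in causal order. It is independent of the lingam package; the function name `simulate_lingam`, the coefficient values, and the choice of uniform errors are illustrative assumptions.

```python
import random


def simulate_lingam(B, causal_order, n_samples, seed=0):
    """Sample from x = Bx + e with independent uniform (non-Gaussian) errors.

    B[i][j] is the direct effect of x_j on x_i. causal_order lists variable
    indices from the first (exogenous) variable to the last, so every parent
    of x_i is already computed when x_i is generated.
    """
    rng = random.Random(seed)
    p = len(B)
    samples = []
    for _ in range(n_samples):
        x = [0.0] * p
        for i in causal_order:
            e_i = rng.uniform(-1.0, 1.0)  # non-Gaussian error term
            x[i] = sum(B[i][j] * x[j] for j in range(p)) + e_i
        samples.append(x)
    return samples


# Hypothetical 3-variable chain x0 -> x1 -> x2 (coefficients are made up).
B = [
    [0.0, 0.0, 0.0],   # x0 = e0 (exogenous)
    [0.8, 0.0, 0.0],   # x1 = 0.8 * x0 + e1
    [0.0, -0.5, 0.0],  # x2 = -0.5 * x1 + e2
]
samples = simulate_lingam(B, causal_order=[0, 1, 2], n_samples=1000)
```

Because B is strictly lower triangular once variables are arranged in causal order, a single pass over the order is enough and no matrix inversion is needed.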

Statistical reliability assessment
• Bootstrap probability (bp) of directed paths and edges
• Interpret causal effects having bp larger than a threshold, say 5%
[Figure: example directed paths among x0, x1, x2, x3 annotated with bootstrap probabilities 99%, 96%, and 10%, and a total effect of 20.9]
LiNGAM Python package: https://github.com/cdt15/lingam
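The bootstrap probability of an edge can be read as the fraction of bootstrap resamples in which that edge is estimated. The sketch below assumes the adjacency matrices from each resample are already available (here they are made-up values); the function name and the `min_abs_effect` parameter are hypothetical and not the package's interface.

```python
def edge_bootstrap_probabilities(adjacency_matrices, min_abs_effect=0.0):
    """Return probs[i][j]: share of resamples with an edge x_j -> x_i.

    An edge counts as present when |B[i][j]| exceeds min_abs_effect.
    """
    n_boot = len(adjacency_matrices)
    p = len(adjacency_matrices[0])
    counts = [[0] * p for _ in range(p)]
    for B in adjacency_matrices:
        for i in range(p):
            for j in range(p):
                if i != j and abs(B[i][j]) > min_abs_effect:
                    counts[i][j] += 1
    return [[c / n_boot for c in row] for row in counts]


# Four hypothetical bootstrap estimates for three variables.
estimates = [
    [[0, 0, 0], [1.4, 0, 0], [0, -1.2, 0]],
    [[0, 0, 0], [1.6, 0, 0], [0, -1.3, 0]],
    [[0, 0, 0], [1.5, 0, 0], [0, 0, 0]],
    [[0, 0, 0], [1.5, 0, 0], [0, -1.4, 0]],
]
probs = edge_bootstrap_probabilities(estimates)

# Keep only edges whose bootstrap probability exceeds a threshold.
threshold = 0.05
reliable_edges = [
    (j, i) for i in range(3) for j in range(3) if probs[i][j] > threshold
]
```

Each tuple in `reliable_edges` is (cause index, effect index).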

Before estimating causal graphs
• Assessing assumptions by
  – Gaussianity test
  – Histograms
    • continuous?
  – Too high correlation?
    • multicollinearity?
  – Background knowledge
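As a rough illustration of these pre-checks, the sketch below computes excess kurtosis (near 0 for Gaussian data, so values far from 0 suggest non-Gaussianity) and a Pearson correlation for spotting near-collinear pairs. It is a descriptive check under simulated data, not a formal Gaussianity test, and the helper names are assumptions.

```python
import math
import random


def excess_kurtosis(values):
    """Sample excess kurtosis: about 0 for Gaussian data, -1.2 for uniform."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 ** 2) - 3.0


def pearson_correlation(xs, ys):
    """Pearson correlation; values near +/-1 hint at multicollinearity."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(0)
uniform_sample = [rng.uniform(-1.0, 1.0) for _ in range(5000)]
gaussian_sample = [rng.gauss(0.0, 1.0) for _ in range(5000)]
collinear_sample = [2.0 * v + 1.0 for v in uniform_sample]

kurt_uniform = excess_kurtosis(uniform_sample)
kurt_gaussian = excess_kurtosis(gaussian_sample)
corr = pearson_correlation(uniform_sample, collinear_sample)
```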

After estimating causal graphs
• Assessing assumptions by
  – Testing independence of error variables, for example, by HSIC (Gretton et al., 2005)
  – Prediction accuracy using Markov boundary (Biza et al., 2020)
  – Compare with the results of other datasets in which causal graphs are expected to be similar
  – Check against background knowledge
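To show what an HSIC-based check measures, here is a minimal sketch of the biased empirical HSIC statistic with Gaussian kernels. It returns only the statistic (no p-value), the bandwidth of 0.5 is an arbitrary assumption, and the data are simulated; a real analysis would apply it to estimated residuals and use a proper test.

```python
import math
import random


def _centered_gram(values, bandwidth):
    """Gaussian-kernel Gram matrix with rows and columns centered (H K H)."""
    n = len(values)
    gram = [
        [math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2)) for b in values]
        for a in values
    ]
    row_means = [sum(row) / n for row in gram]  # equals column means (symmetric)
    total_mean = sum(row_means) / n
    return [
        [gram[i][j] - row_means[i] - row_means[j] + total_mean for j in range(n)]
        for i in range(n)
    ]


def hsic(xs, ys, bandwidth=0.5):
    """Biased empirical HSIC: trace(K H L H) / n^2."""
    n = len(xs)
    k_centered = _centered_gram(xs, bandwidth)
    l_centered = _centered_gram(ys, bandwidth)
    total = sum(
        k_centered[i][j] * l_centered[i][j] for i in range(n) for j in range(n)
    )
    return total / (n * n)


rng = random.Random(0)
x = [rng.uniform(-1.0, 1.0) for _ in range(150)]
dependent = [v * v for v in x]  # uncorrelated with x, but not independent
independent = [rng.uniform(-1.0, 1.0) for _ in range(150)]

hsic_dependent = hsic(x, dependent)
hsic_independent = hsic(x, independent)
```

The dependent pair is chosen so that linear correlation is close to zero while the variables are clearly not independent, which is the situation a correlation check would miss.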

DirectLiNGAM algorithm (Shimizu et al., 2011)
• Repeat linear regression and independence evaluation
  – https://lingam.readthedocs.io/en/latest/tutorial/lingam.html
• p>n cases (Wang & Drton, 2020)
  – https://github.com/ysamwang/highDNG
$$\begin{bmatrix} x_3 \\ x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 \\ 1.5 & 0 & 0 \\ 0 & -1.3 & 0 \end{bmatrix} \begin{bmatrix} x_3 \\ x_1 \\ x_2 \end{bmatrix} + \begin{bmatrix} e_3 \\ e_1 \\ e_2 \end{bmatrix}$$
$$\begin{bmatrix} r_1^{(3)} \\ r_2^{(3)} \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ -1.3 & 0 \end{bmatrix} \begin{bmatrix} r_1^{(3)} \\ r_2^{(3)} \end{bmatrix} + \begin{bmatrix} e_1 \\ e_2 \end{bmatrix}$$
[Figure: causal graph over x3, x1, x2, and the residuals $r_1^{(3)}$, $r_2^{(3)}$ after regressing on x3]
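The regression step can be illustrated on the slide's three-variable example, read here as x3 = e3, x1 = 1.5 x3 + e1, x2 = -1.3 x1 + e2 (that reading of the matrices, the uniform errors, and the helper names are assumptions). Regressing x1 and x2 on the exogenous x3 leaves residuals that again follow a smaller model with coefficient -1.3. This sketch covers only the regression step, not the independence evaluation that selects x3.

```python
import random


def ols_slope(regressor, target):
    """Least-squares slope of target on regressor (with intercept)."""
    n = len(regressor)
    mean_r, mean_t = sum(regressor) / n, sum(target) / n
    cov = sum((r - mean_r) * (t - mean_t) for r, t in zip(regressor, target))
    var = sum((r - mean_r) ** 2 for r in regressor)
    return cov / var


def residuals(regressor, target):
    """Residuals of target after removing its linear dependence on regressor."""
    n = len(regressor)
    mean_r, mean_t = sum(regressor) / n, sum(target) / n
    slope = ols_slope(regressor, target)
    return [(t - mean_t) - slope * (r - mean_r) for r, t in zip(regressor, target)]


rng = random.Random(0)
x3 = [rng.uniform(-1.0, 1.0) for _ in range(5000)]
x1 = [1.5 * v + rng.uniform(-1.0, 1.0) for v in x3]
x2 = [-1.3 * v + rng.uniform(-1.0, 1.0) for v in x1]

r1 = residuals(x3, x1)  # r_1^(3): x1 with x3 regressed out
r2 = residuals(x3, x2)  # r_2^(3): x2 with x3 regressed out
slope_r2_on_r1 = ols_slope(r1, r2)  # should be close to -1.3
```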

Prior knowledge
https://lingam.readthedocs.io/en/latest/tutorial/pk_direct.html
• Prior knowledge about topological orders: k(3) < k(1) < k(2)
• Use prior knowledge in estimating topological causal orders and in pruning redundant edges
[Figure: x3, x1, x2 with residuals $r_1^{(3)}$, $r_2^{(3)}$]
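One way to picture how an order such as k(3) < k(1) < k(2) constrains the search is to encode it as a matrix of allowed and forbidden paths. The helper below is hypothetical, and its convention (0 = no directed path from x_j to x_i, -1 = unknown) is an assumption chosen for this sketch; the tutorial linked above describes the format the package itself expects.

```python
def prior_knowledge_from_order(causal_order):
    """Encode a known topological order as a path-constraint matrix.

    Convention used in this sketch:
      pk[i][j] ==  0  -> x_j cannot have a directed path to x_i
      pk[i][j] == -1  -> unknown (x_j precedes x_i, so a path is possible)
    """
    p = len(causal_order)
    position = {var: pos for pos, var in enumerate(causal_order)}
    pk = [[0] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            if i != j and position[j] < position[i]:
                pk[i][j] = -1
    return pk


# Zero-based indices: 0 -> x1, 1 -> x2, 2 -> x3, so k(3) < k(1) < k(2)
# becomes the order [2, 0, 1].
pk = prior_knowledge_from_order([2, 0, 1])
```

Every entry left at 0 removes a candidate edge before estimation, which is where the pruning benefit comes from.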

Multiple datasets
• Simultaneously analyze different datasets to use similarity (Ramsey et al., 2011; Shimizu, 2012)
  – Similarity: Causal orders same, distributions and coefficients may differ
  – https://lingam.readthedocs.io/en/latest/tutorial/multiple_dataset.html
[Figure: Dataset 1 and Dataset 2, each a causal graph over x1, x2, x3 with error variables e1, e2, e3; edge coefficients 4, -3, 2 in Dataset 1 and -0.5, 5 in Dataset 2]
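The similarity assumption (same causal order, different coefficients) can be checked for a set of coefficient matrices with a small topological sort over the union of their edges. This is only a consistency check, not the joint estimation method; the function name is hypothetical, and the placement of the coefficient values on particular edges is an assumption, since the figure's layout is not recoverable here.

```python
def common_causal_order(coefficient_matrices, tol=1e-12):
    """Return one causal order valid for all matrices, or None if none exists.

    B[i][j] != 0 means x_j -> x_i in that dataset.
    """
    p = len(coefficient_matrices[0])
    parents = [set() for _ in range(p)]
    for B in coefficient_matrices:
        for i in range(p):
            for j in range(p):
                if abs(B[i][j]) > tol:
                    parents[i].add(j)
    order = []
    remaining = set(range(p))
    while remaining:
        # Variables with no remaining parents can come next in the order.
        roots = sorted(i for i in remaining if not (parents[i] & remaining))
        if not roots:
            return None  # the union of edges contains a cycle
        order.append(roots[0])
        remaining.remove(roots[0])
    return order


# Zero-based indices: 0 -> x1, 1 -> x2, 2 -> x3 (edge placement is assumed).
B_dataset1 = [[0, 0, 4], [-3, 0, 2], [0, 0, 0]]
B_dataset2 = [[0, 0, -0.5], [5, 0, 0], [0, 0, 0]]
B_conflict = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]  # x2 -> x1, reversing x1 -> x2

shared = common_causal_order([B_dataset1, B_dataset2])
no_shared = common_causal_order([B_dataset1, B_conflict])
```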

Multiple datasets: Longitudinal data
• Longitudinal data consist of multiple samples collected over a period of time (Kadowaki et al., 2013)
• https://lingam.readthedocs.io/en/latest/tutorial/longitudinal.html

Analysis of predictive mechanisms
• Combine the causal model and predictive model to model the prediction mechanism
[Figure: three graphs over $X_1, X_2, X_3, X_4$ and $Y$ or $\hat{Y}$, labeled Causal model, Predictive model, and Prediction mechanism model]
Equations shown on the slide:
  $x_4 = f_4(y, e_4)$
  $\hat{y} = f(x_1, x_2, x_3, x_4)$
  $E(\hat{y} \mid do(x_i = c))$
https://lingam.readthedocs.io/en/latest/tutorial/causal_effect.html#identification-of-feature-with-greatest-causal-influence-on-prediction
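For a linear causal model with zero-mean errors and a linear predictor, the quantity E(ŷ | do(x_i = c)) can be computed by propagating means in causal order while holding x_i fixed. The sketch below works under exactly those assumptions; the function name and all numbers are hypothetical.

```python
def expected_prediction_under_do(B, causal_order, weights, intercept, target, value):
    """E[y_hat | do(x_target = value)] for a linear SEM and linear predictor.

    B[i][j] is the direct effect of x_j on x_i; errors are assumed zero-mean,
    so each non-intervened variable's mean is the weighted mean of its parents.
    """
    p = len(B)
    mean_x = [0.0] * p
    for i in causal_order:
        if i == target:
            mean_x[i] = value  # intervention cuts x_target off from its parents
        else:
            mean_x[i] = sum(B[i][j] * mean_x[j] for j in range(p))
    return intercept + sum(w * m for w, m in zip(weights, mean_x))


# Hypothetical chain x0 -> x1 -> x2 and predictor y_hat = 1 + x0 + 3 * x2.
B = [
    [0.0, 0.0, 0.0],
    [2.0, 0.0, 0.0],
    [0.0, 0.5, 0.0],
]
weights, intercept = [1.0, 0.0, 3.0], 1.0

effect_x0 = expected_prediction_under_do(B, [0, 1, 2], weights, intercept, 0, 1.0)
effect_x1 = expected_prediction_under_do(B, [0, 1, 2], weights, intercept, 1, 1.0)
effect_x2 = expected_prediction_under_do(B, [0, 1, 2], weights, intercept, 2, 1.0)
```

Note that x1 has zero weight in the predictor yet still changes the prediction through x2, which is the point of combining the causal and predictive models.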

Illustrative example
• Auto-MPG (miles per gallon) dataset
• Linear regression
• Which variable has the greatest intervention effect on MPG prediction?
• Which variable should be intervened on to obtain a certain MPG prediction? (Control)
[Figure: causal graph over Cylinders, Displacement, Weight, Horsepower, Acceleration, and MPG, together with the MPG prediction]

Desired MPG prediction    Suggested intervention on cylinders
15                        8
21                        6
30                        4
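The control question can be sketched as solving for the intervention value c that yields a desired expected prediction. When the expected prediction is affine in c (as in a linear causal model with a linear predictor), two evaluations determine the answer. The function below and the toy prediction function are hypothetical and do not reproduce the Auto-MPG numbers.

```python
def solve_intervention_value(expected_prediction, desired):
    """Find c with expected_prediction(c) == desired, assuming an affine map."""
    base = expected_prediction(0.0)
    slope = expected_prediction(1.0) - base
    if slope == 0.0:
        raise ValueError("the intervention has no effect on the prediction")
    return (desired - base) / slope


def toy_expected_prediction(c):
    """Hypothetical E[y_hat | do(x_i = c)]; any affine function would do."""
    return 2.0 + 3.0 * c


suggested_value = solve_intervention_value(toy_expected_prediction, desired=11.0)
```

In practice the result may need rounding to a feasible value, for example an integer number of cylinders.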

Time series model
• Subsampling data:
  – SVAR: Structural Vector Autoregressive model (Swanson & Granger, 1997)
    $\boldsymbol{x}(t) = \sum_{\tau=0}^{k} B_\tau \boldsymbol{x}(t-\tau) + \boldsymbol{e}(t)$
  – Identifiability using non-Gaussianity (Hyvarinen et al., 2010)
    • https://lingam.readthedocs.io/en/latest/tutorial/var.html
  – VARMA instead of VAR (Kawahara et al., 2011)
    • https://lingam.readthedocs.io/en/latest/tutorial/varma.html
• Nonstationarity
  – Assumption: Differences are stationary (Moneta et al., 2013)
[Figure: graph over x1(t-1), x2(t-1), x1(t), x2(t) with error variables e1(t-1), e2(t-1), e1(t), e2(t)]
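A minimal simulation of the SVAR equation with one lag (k = 1) may help make the instantaneous matrix B_0 and the lagged matrix B_1 concrete. The coefficients, the uniform errors, and the function names are assumptions; `first_differences` illustrates the differencing step used when only the differences are assumed stationary.

```python
import random


def simulate_svar1(B0, B1, causal_order, n_steps, seed=0):
    """Simulate x(t) = B0 x(t) + B1 x(t-1) + e(t) with an acyclic B0.

    causal_order is the instantaneous causal order implied by B0.
    Returns the series and the error terms used to generate it.
    """
    rng = random.Random(seed)
    p = len(B0)
    previous = [0.0] * p
    series, errors = [], []
    for _ in range(n_steps):
        e = [rng.uniform(-1.0, 1.0) for _ in range(p)]
        current = [0.0] * p
        for i in causal_order:
            lagged = sum(B1[i][j] * previous[j] for j in range(p))
            instantaneous = sum(B0[i][j] * current[j] for j in range(p))
            current[i] = instantaneous + lagged + e[i]
        series.append(current)
        errors.append(e)
        previous = current
    return series, errors


def first_differences(series):
    """x(t) - x(t-1) for each consecutive pair of time points."""
    return [[b - a for a, b in zip(prev, cur)] for prev, cur in zip(series, series[1:])]


# Hypothetical two-variable system: x1(t) -> x2(t) instantaneously.
B0 = [[0.0, 0.0], [0.7, 0.0]]
B1 = [[0.5, 0.0], [0.2, 0.3]]
series, errors = simulate_svar1(B0, B1, causal_order=[0, 1], n_steps=200)
differences = first_differences(series)
```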

Hidden common cause (1)
• Assumption: only exogenous variables may have hidden common causes
[Figure: causal graph over x1, x2, x3, shown without and with a hidden common cause f1]
https://lingam.readthedocs.io/en/latest/tutorial/bottom_up_parce.html

Hidden common cause (2) RCD
• For unconfounded pairs with no hidden common causes, estimate the causal directions
• For confounded pairs with hidden common causes, let them remain unknown
[Figure: Underlying model with hidden common causes $f_1$, $f_2$ over observed variables $x_1, \dots, x_4$, and the corresponding Output graph]
https://lingam.readthedocs.io/en/latest/tutorial/rcd.html

Time series model with hidden common causes
• SVAR with hidden common causes
  – Malinsky and Spirtes (2018)
  – Gerhardus and Runge (2020)
  – Nonparametric
  – Conditional independence
  – Python: https://github.com/jakobrunge/tigramite

Nonlinear model
• Additive noise model: $x_i = f_i(\mathrm{par}(x_i)) + e_i$
• R code: http://web.math.ku.dk/~peters/code.html

Methods based on conditional independencies
• GUI: Tetrad
  – https://github.com/cmu-phil/tetrad
• Python: causal-learn (including LiNGAM variants)
  – https://github.com/cmu-phil/causal-learn
• R: pcalg
  – https://cran.r-project.org/web/packages/pcalg/index.html

Future plan
• A nonlinear version of RCD: CAM-UV
• Latent factors
• Mixed data with continuous and discrete variables
• Overcomplete ICA based method for hidden common cause cases under development

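LiNGAM for latent factors (Shimizu et al., 2009)
• Model: $\boldsymbol{f} = B\boldsymbol{f} + \boldsymbol{\epsilon}$, $\boldsymbol{x} = G\boldsymbol{f} + \boldsymbol{e}$
  – Two pure measurement variables per latent factor needed to identify the measurement model (Silva et al., 2006; Xie et al., 2020)
• Estimate the latent factors and then their causal graph
[Figure: observed variables $x_1, x_2, x_3, x_4$ measuring latent factors $f_1$ and $f_2$; the causal relation between $f_1$ and $f_2$ is marked "?"]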
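A short worked step shows how the two model equations combine. Assuming the latent factors are acyclic, $B$ can be permuted to be strictly lower triangular, so $I - B$ is invertible and the factors can be eliminated:

$$\boldsymbol{f} = (I - B)^{-1}\boldsymbol{\epsilon}, \qquad \boldsymbol{x} = G(I - B)^{-1}\boldsymbol{\epsilon} + \boldsymbol{e}$$

The observed variables are therefore a linear mixture of the latent disturbances $\boldsymbol{\epsilon}$ plus measurement errors $\boldsymbol{e}$, which is why the measurement model $G$ has to be identified before the causal graph among the factors can be estimated.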

Find common and unique factors across multiple datasets (Zeng et al., 2021)
• Model: $\boldsymbol{f}^{(m)} = B^{(m)}\boldsymbol{f}^{(m)} + \boldsymbol{\epsilon}^{(m)}$, $\boldsymbol{x}^{(m)} = G^{(m)}\boldsymbol{f}^{(m)} + \boldsymbol{e}^{(m)}$, $m = 1, \dots, M$
• Score function: likelihood + DAGness (Zheng et al., 2018)
• Feature extraction across multiple datasets + causal discovery of latent factors
[Figure: latent factors and their measurement variables in several datasets, with "?" marking unknown causal relations and whether factors in different datasets coincide]
