
# LiNGAM Python package

Explains what the LiNGAM Python package can do. Presented at a seminar with causal discovery users.

## Shohei SHIMIZU

November 05, 2021

## Transcript

1. LiNGAM Python package
Shohei SHIMIZU
Shiga University & RIKEN
13 Nov 2021

2. LiNGAM Python package
• https://github.com/cdt15/lingam
Please give it a star!
Takashi Ikeuchi
SCREEN AS

3. Documentation

4. LiNGAM model is identifiable
(Shimizu, Hyvarinen, Hoyer & Kerminen, 2006)
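• Linear Non-Gaussian Acyclic Model:
$$x_i = \sum_{k(j) < k(i)} b_{ij} x_j + e_i \quad \text{or} \quad \boldsymbol{x} = B\boldsymbol{x} + \boldsymbol{e}$$
– $k(i)$ ($i = 1, \dots, p$): causal (topological) order of $x_i$
– Error variables $e_i$ are independent and non-Gaussian
• Coefficients and causal orders identifiable
• Causal graph identifiable
[Figure: causal graph over $x_1$, $x_2$, $x_3$ with error variables $e_1$, $e_2$, $e_3$ and edge coefficients $b_{21}$, $b_{23}$, $b_{13}$]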
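To make the model concrete, here is a minimal sketch that draws samples from a three-variable model of this form. The coefficient values, the uniform error distribution, and the names `B`, `CAUSAL_ORDER`, and `sample_lingam` are assumptions chosen for illustration; they are not taken from the slides or from the package.

```python
import random

# Hypothetical coefficients for a 3-variable model with causal order x3, x1, x2.
# B[i][j] is the direct effect of x_{j+1} on x_{i+1}; the values are illustrative.
B = [
    [0.0, 0.0, 0.8],   # x1 = b13 * x3 + e1
    [0.5, 0.0, -0.7],  # x2 = b21 * x1 + b23 * x3 + e2
    [0.0, 0.0, 0.0],   # x3 = e3 (exogenous)
]
CAUSAL_ORDER = [2, 0, 1]  # 0-based indices of x3, x1, x2


def sample_lingam(n_samples, seed=0):
    """Draw samples from x = Bx + e with independent uniform (non-Gaussian) errors."""
    rng = random.Random(seed)
    samples, errors = [], []
    for _ in range(n_samples):
        e = [rng.uniform(-1.0, 1.0) for _ in range(3)]
        x = [0.0, 0.0, 0.0]
        # Fill in variables in causal order so every parent is already available.
        for i in CAUSAL_ORDER:
            x[i] = sum(B[i][j] * x[j] for j in range(3)) + e[i]
        samples.append(x)
        errors.append(e)
    return samples, errors


X, E = sample_lingam(1000)
```

Because the graph is acyclic, each variable can be generated once its parents are known, which is why the loop follows the causal order.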

5. Statistical reliability assessment
• Bootstrap probability (bp) of directed paths and edges
• Interpret causal effects having bp larger than a threshold, say 5%
[Figure: causal graphs estimated on bootstrap samples over x0, x1, x2, x3, annotated with 99%, 96%, and 10%, and "Total effect: 20.9"]
LiNGAM Python package: https://github.com/cdt15/lingam
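The bootstrap probability of an edge is the fraction of bootstrap resamples in which that edge is estimated. A minimal sketch of that bookkeeping follows; the edge sets, the function names, and the 0.5 threshold used in the example call are hypothetical, and the sketch does not use the package's own bootstrap interface.

```python
from collections import Counter

# Hypothetical edge sets estimated on five bootstrap resamples;
# each pair (cause, effect) is one directed edge.
bootstrap_graphs = [
    {("x3", "x1"), ("x1", "x2")},
    {("x3", "x1"), ("x1", "x2")},
    {("x3", "x1"), ("x0", "x1")},
    {("x3", "x1"), ("x1", "x2")},
    {("x3", "x1")},
]


def bootstrap_probabilities(graphs):
    """Return the fraction of resamples in which each directed edge appears."""
    counts = Counter(edge for graph in graphs for edge in graph)
    return {edge: count / len(graphs) for edge, count in counts.items()}


def select_edges(probabilities, threshold=0.05):
    """Keep edges whose bootstrap probability exceeds the threshold."""
    return {edge: bp for edge, bp in probabilities.items() if bp > threshold}


bp = bootstrap_probabilities(bootstrap_graphs)
reliable = select_edges(bp, threshold=0.5)
```

The default threshold mirrors the 5% mentioned on the slide; a stricter value is used in the example call only to show the filtering.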

6. Before estimating causal graphs
• Assessing assumptions by
– Gaussianity test
– Histograms
• continuous?
– Too high correlation?
• multicollinearity?
– Background knowledge
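Two of these checks can be sketched with simple sample statistics. The code below computes the excess kurtosis as a rough non-Gaussianity diagnostic (not a formal Gaussianity test) and the Pearson correlation to flag near-collinear pairs. The data and function names are hypothetical.

```python
import math


def excess_kurtosis(values):
    """Sample excess kurtosis; close to 0 for Gaussian data."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3.0


def pearson_correlation(a, b):
    """Pearson correlation; values near +1 or -1 hint at multicollinearity."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical data: an evenly spaced (uniform-like) variable and a linear copy of it.
x = [i / 100 for i in range(-100, 101)]
y = [2.0 * v + 1.0 for v in x]

kurt_x = excess_kurtosis(x)           # clearly negative: flatter than a Gaussian
corr_xy = pearson_correlation(x, y)   # essentially 1: the pair is collinear
```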

7. After estimating causal graphs
• Assessing assumptions by
– Testing independence of error variables, for example, by HSIC
(Gretton et al., 2005)
– Prediction accuracy using Markov boundary (Biza et al., 2020)
– Compare with the results of other datasets in which causal graphs
are expected to be similar
– Check against background knowledge
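As an illustration of the HSIC-based check, the following sketch computes a biased empirical HSIC statistic with Gaussian kernels, normalised here by n squared. The bandwidth of 1.0, the function names, and the simulated errors are assumptions; a real analysis would also need a significance threshold, for example from a permutation test.

```python
import math
import random


def centered_gaussian_gram(values, bandwidth=1.0):
    """Gaussian-kernel Gram matrix with rows and columns centered (H K H)."""
    n = len(values)
    gram = [[math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2)) for b in values]
            for a in values]
    row_means = [sum(row) / n for row in gram]
    total_mean = sum(row_means) / n
    # The Gram matrix is symmetric, so column means equal row means.
    return [[gram[i][j] - row_means[i] - row_means[j] + total_mean
             for j in range(n)] for i in range(n)]


def hsic(x, y, bandwidth=1.0):
    """Biased empirical HSIC: trace(K H L H) / n^2."""
    n = len(x)
    k_x = centered_gaussian_gram(x, bandwidth)
    k_y = centered_gaussian_gram(y, bandwidth)
    return sum(k_x[i][j] * k_y[i][j] for i in range(n) for j in range(n)) / n ** 2


rng = random.Random(0)
e1 = [rng.uniform(-1.0, 1.0) for _ in range(100)]
e2 = [rng.uniform(-1.0, 1.0) for _ in range(100)]

hsic_independent = hsic(e1, e2)  # two independently drawn error samples
hsic_dependent = hsic(e1, e1)    # a variable against itself, fully dependent
```

Small values are consistent with independent error variables; larger values suggest that a model assumption may be violated.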

8. DirectLiNGAM algorithm
(Shimizu et al., 2011)
• Repeat linear regression and independence evaluation
• p>n cases (Wang & Drton, 2020)
– https://github.com/ysamwang/highDNG
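$$\begin{bmatrix} x_3 \\ x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 \\ 1.5 & 0 & 0 \\ 0 & -1.3 & 0 \end{bmatrix} \begin{bmatrix} x_3 \\ x_1 \\ x_2 \end{bmatrix} + \begin{bmatrix} e_3 \\ e_1 \\ e_2 \end{bmatrix}$$

$$\begin{bmatrix} r_1^{(3)} \\ r_2^{(3)} \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ -1.3 & 0 \end{bmatrix} \begin{bmatrix} r_1^{(3)} \\ r_2^{(3)} \end{bmatrix} + \begin{bmatrix} e_1 \\ e_2 \end{bmatrix}$$

[Figure: causal graph over x3, x1, x2 and the graph over the residuals $r_1^{(3)}$, $r_2^{(3)}$]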
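The regression step can be sketched as follows, assuming a chain x3 → x1 → x2 with coefficients 1.5 and -1.3 and uniform errors (all assumed here for illustration). The sketch only removes the effect of a variable already known to be exogenous; the independence evaluation that DirectLiNGAM uses to find that variable is omitted, and the helper names are hypothetical.

```python
import random


def mean(values):
    return sum(values) / len(values)


def covariance(a, b):
    mean_a, mean_b = mean(a), mean(b)
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / len(a)


def residual(target, regressor):
    """Residual of the least-squares regression of target on regressor."""
    slope = covariance(target, regressor) / covariance(regressor, regressor)
    return [t - slope * r for t, r in zip(target, regressor)]


# Simulate x3 -> x1 -> x2 with uniform (non-Gaussian) errors.
# The coefficients 1.5 and -1.3 are assumed for illustration.
rng = random.Random(0)
n = 2000
x3 = [rng.uniform(-1.0, 1.0) for _ in range(n)]
x1 = [1.5 * a + rng.uniform(-1.0, 1.0) for a in x3]
x2 = [-1.3 * a + rng.uniform(-1.0, 1.0) for a in x1]

# Remove the effect of the exogenous variable x3 from the remaining variables.
r1 = residual(x1, x3)
r2 = residual(x2, x3)

# The residuals follow a smaller model of the same form: r2 = b * r1 + e2.
b_hat = covariance(r2, r1) / covariance(r1, r1)
```

Repeating the same step on the residuals yields the next variable in the causal order, which is why the procedure alternates regression and independence evaluation.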

9. Prior knowledge
• Prior knowledge about topological orders: k(3) < k(1) < k(2)
• Use prior knowledge in estimating topological causal orders
and in pruning redundant edges
[Figure: causal graph over x3, x1, x2 and the graph over the residuals $r_1^{(3)}$, $r_2^{(3)}$]
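One simple way to encode a known topological order is a matrix of forbidden edges, which can then be used to zero out estimated coefficients that contradict the order. This is a sketch under that assumed encoding; the function names and the unpruned estimates are hypothetical, and the package's own prior-knowledge format and pruning method may differ.

```python
def forbidden_edges(causal_order, n_variables):
    """forbidden[i][j] is True when the order rules out an edge x_j -> x_i."""
    position = {var: rank for rank, var in enumerate(causal_order)}
    return [[position[j] >= position[i] for j in range(n_variables)]
            for i in range(n_variables)]


def prune(coefficients, forbidden):
    """Set the coefficients of forbidden edges to zero."""
    size = len(coefficients)
    return [[0.0 if forbidden[i][j] else coefficients[i][j] for j in range(size)]
            for i in range(size)]


# k(3) < k(1) < k(2), written with 0-based indices for x1, x2, x3.
order = [2, 0, 1]
forbidden = forbidden_edges(order, n_variables=3)

# Hypothetical unpruned estimates; B_hat[i][j] is the effect of x_j on x_i.
B_hat = [
    [0.0, 0.1, 1.5],
    [-1.3, 0.0, 0.2],
    [0.05, 0.0, 0.0],
]
B_pruned = prune(B_hat, forbidden)
```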

10. Multiple datasets
• Simultaneously analyze different datasets to use similarity
(Ramsey et al. 2011; Shimizu, 2012)
– Similarity: Causal orders same, distributions and coefficients may differ
[Figure: causal graphs over x1, x2, x3 with error variables e1, e2, e3 for Dataset 1 (edge coefficients 4, -3, 2) and Dataset 2 (edge coefficients -0.5, 5)]

11. Multiple datasets: Longitudinal data
• Longitudinal data consist of multiple samples collected over a
period of time (Kadowaki et al., 2013)

12. Analysis of predictive mechanisms
• Combine the causal model and predictive model
to model the prediction mechanism
[Figure: causal model over $X_1, \dots, X_4$ and $Y$; predictive model from $X_1, \dots, X_4$ to $\hat{Y}$; prediction mechanism model combining the two]
$$x_4 = f_4(y, e_4)$$
$$\hat{y} = f(x_1, x_2, x_3, x_4)$$
$$E(\hat{y} \mid do(x_i = c))$$
feature-with-greatest-causal-influence-on-prediction

13. Illustrative example
• Auto-MPG (miles per gallon) dataset
• Linear regression
• Which variable has the greatest intervention effect
on MPG prediction?
• Which variable should be intervened on to obtain a
certain MPG prediction? (Control)
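[Figure: causal graph over Cylinders, Displacement, Weight, Horsepower, Acceleration, and MPG, together with the predicted $\widehat{MPG}$]

| Desired MPG prediction | Suggested intervention on cylinders |
|---|---|
| 15 | 8 |
| 21 | 6 |
| 30 | 4 |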
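For a linear causal model combined with a linear predictive model, both questions reduce to propagating expected values through the causal order. The sketch below computes the expected prediction under an intervention do(x_i = c) and inverts it to find the intervention value for a desired prediction. The two-feature model, its coefficients, and the function names are hypothetical and unrelated to the Auto-MPG results.

```python
def expected_prediction_under_do(B, order, error_means, weights, intercept,
                                 target, value):
    """E[y_hat | do(x_target = value)] for a linear causal and predictive model."""
    n = len(B)
    expected = [0.0] * n
    for i in order:
        if i == target:
            expected[i] = value  # the intervention cuts x_target off from its parents
        else:
            expected[i] = sum(B[i][j] * expected[j] for j in range(n)) + error_means[i]
    return intercept + sum(w * x for w, x in zip(weights, expected))


def intervention_for_prediction(desired, **model):
    """Value c such that E[y_hat | do(x_target = c)] equals the desired prediction."""
    at_zero = expected_prediction_under_do(value=0.0, **model)
    at_one = expected_prediction_under_do(value=1.0, **model)
    # The expected prediction is linear in c; a zero slope means no causal influence.
    return (desired - at_zero) / (at_one - at_zero)


# Hypothetical two-feature example: x1 -> x2 and y_hat = 1 + 0.5 * x1 + 3 * x2.
model = dict(
    B=[[0.0, 0.0], [2.0, 0.0]],
    order=[0, 1],
    error_means=[0.0, 0.0],
    weights=[0.5, 3.0],
    intercept=1.0,
    target=0,
)
effect = expected_prediction_under_do(value=2.0, **model)
c = intervention_for_prediction(14.0, **model)
```

Comparing the slope of this expected prediction across features is one way to rank which variable has the greatest intervention effect on the prediction.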

14. Time series model
• Subsampling data:
– SVAR: Structural Vector Autoregressive model (Swanson & Granger, 1997)
– Identifiability using non-Gaussianity (Hyvarinen et al., 2010)
– VARMA instead of VAR (Kawahara et al., 2011)
• Nonstationarity
– Assumption: Differences are stationary (Moneta et al., 2013)
$$\boldsymbol{x}(t) = \sum_{\tau=0}^{k} B_\tau \boldsymbol{x}(t-\tau) + \boldsymbol{e}(t)$$
[Figure: causal graph over x1(t-1), x2(t-1), x1(t), x2(t) with error variables e1(t-1), e2(t-1), e1(t), e2(t)]
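A structural VAR model with instantaneous and lagged effects can be simulated in a few lines. The sketch below assumes two variables, one lag, uniform errors, and hypothetical coefficient matrices `B0` (instantaneous) and `B1` (lag 1); none of these values come from the slides.

```python
import random

# Hypothetical coefficients for two variables and one lag (k = 1).
B0 = [[0.0, 0.0], [0.6, 0.0]]  # instantaneous effects: x1(t) -> x2(t)
B1 = [[0.5, 0.0], [0.2, 0.3]]  # lagged effects from x(t-1)
INSTANT_ORDER = [0, 1]         # causal order of the instantaneous part


def simulate_svar(n_steps, seed=0):
    """Simulate x(t) = B0 x(t) + B1 x(t-1) + e(t) with uniform errors."""
    rng = random.Random(seed)
    series, errors = [[0.0, 0.0]], [[0.0, 0.0]]  # index 0 holds the initial state
    for _ in range(n_steps):
        prev = series[-1]
        e = [rng.uniform(-1.0, 1.0) for _ in range(2)]
        x = [0.0, 0.0]
        # The instantaneous part is acyclic, so follow its causal order.
        for i in INSTANT_ORDER:
            instant = sum(B0[i][j] * x[j] for j in range(2))
            lagged = sum(B1[i][j] * prev[j] for j in range(2))
            x[i] = instant + lagged + e[i]
        series.append(x)
        errors.append(e)
    return series, errors


series, errors = simulate_svar(500)
```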

15. Hidden common cause (1)
• Assumption: only exogenous variables may be affected by hidden common causes
[Figure: causal graphs over x1, x2, x3, one of them including a hidden common cause f1]

16. Hidden common cause (2) RCD
• For unconfounded pairs with no hidden common causes, estimate the
causal directions
• For confounded pairs with hidden common causes, let them remain
unknown
[Figure: underlying model over $x_1$, $x_2$, $x_3$, $x_4$ with hidden common causes $f_1$ and $f_2$, and the corresponding output graph]

17. Time series model with hidden common causes
• SVAR with hidden common causes
– Malinsky and Spirtes (2018)
– Gerhardus and Runge (2020)
– Nonparametric
– Conditional independence
– Python: https://github.com/jakobrunge/tigramite

18. Nonlinear model
• R code: http://web.math.ku.dk/~peters/code.html
$$x_i = f_i(\mathrm{par}(x_i)) + e_i$$

19. Methods based on conditional independencies
• Python: causal-learn (including LiNGAM variants)
– https://github.com/cmu-phil/causal-learn
• R: pcalg
– https://cran.r-project.org/web/packages/pcalg/index.html

20. Future plan
• A nonlinear version of RCD: CAM-UV
• Latent factors
• Mixed data with continuous and discrete variables
• Overcomplete ICA based method for hidden common cause
cases under development

21. LiNGAM for latent factors (Shimizu et al., 2009)
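• Model:
$$\boldsymbol{f} = B\boldsymbol{f} + \boldsymbol{\epsilon}$$
$$\boldsymbol{x} = G\boldsymbol{f} + \boldsymbol{e}$$
– Two pure measurement variables per latent factor needed to identify the measurement model (Silva et al., 2006; Xie et al., 2020)
• Estimate the latent factors and then their causal graph
[Figure: measurement variables $x_1$, $x_2$, $x_3$, $x_4$ and latent factors $f_1$, $f_2$, with the causal relation between the factors unknown (?)]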

22. Find common and unique factors across multiple datasets (Zeng et al., 2021)
• Model:
$$\boldsymbol{f}^{(m)} = B^{(m)} \boldsymbol{f}^{(m)} + \boldsymbol{\epsilon}^{(m)}$$
$$\boldsymbol{x}^{(m)} = G^{(m)} \boldsymbol{f}^{(m)} + \boldsymbol{e}^{(m)}, \quad m = 1, \dots, M$$
• Score function: likelihood + DAGness (Zheng et al., 2018)
• Feature extraction across multiple datasets + causal discovery of latent factors
[Figure: latent factor graphs for each dataset, with unknown (?) causal relations among the factors and unknown (?) correspondence of factors across datasets]