Estimate intervention effects
– Need a causal graph to select the variables to be adjusted for, e.g., using the backdoor criterion (Pearl, 1995); see the sketch below
• Also useful for machine learning
– E.g., domain adaptation (Zhang et al., 2020), fairness (Kusner et al., 2017), and interpretability (Blöbaum & Shimizu, 2017)
[Figure: Messerli (2012): chocolate consumption plotted against the number of Nobel laureates per country, with GDP as a possible common cause.]
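A minimal sketch of backdoor adjustment, assuming a hypothetical data-generating process in which GDP is the only common cause of chocolate consumption and Nobel laureates (so {GDP} satisfies the backdoor criterion); the simulated direct effect of chocolate is zero:

```python
# A minimal sketch of backdoor adjustment (hypothetical numbers): GDP is the
# only common cause of chocolate and nobel, so {GDP} satisfies the backdoor
# criterion; the true direct effect of chocolate on nobel is zero.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
gdp = rng.normal(size=n)
chocolate = 0.8 * gdp + rng.normal(size=n)
nobel = 0.7 * gdp + rng.normal(size=n)          # chocolate has no effect

# Naive regression of nobel on chocolate is confounded by GDP.
naive = np.linalg.lstsq(np.c_[chocolate, np.ones(n)], nobel, rcond=None)[0][0]
# Adjusting for GDP (the backdoor set) recovers the true effect (~0).
adjusted = np.linalg.lstsq(np.c_[chocolate, gdp, np.ones(n)], nobel, rcond=None)[0][0]
print(f"naive: {naive:.2f}, adjusted: {adjusted:.2f}")   # e.g., 0.34 vs 0.00
```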
Use background knowledge
• Often need to use both background knowledge AND data
• Causal discovery: infer the causal graph from data
[Figure: candidate causal graphs over chocolate consumption, Nobel laureates, and GDP.]
The non-parametric approach uses conditional independence (Pearl, 2001; Spirtes et al., 1993)
– Makes no assumptions about functional forms or distributions
– Its limit is finding the Markov equivalence class of models
• Additional assumptions are needed to go beyond this limit
– Restrictions on functional forms and distributions
– Uniquely identifiable models, or smaller equivalence classes
• LiNGAM is one example (Shimizu et al., 2006; Shimizu, 2014); see the sketch below
– Non-Gaussianity assumption exploited via independence
– Growing literature on its variants (Peters et al., 2018; Shimizu & Blöbaum, 2020)
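A minimal sketch of LiNGAM estimation, assuming the Python `lingam` package (`pip install lingam`) is available; the uniform errors supply the non-Gaussianity the model needs for unique identifiability:

```python
# A minimal sketch using the Python `lingam` package (pip install lingam).
# Uniform (non-Gaussian) errors make the linear model uniquely identifiable.
import numpy as np
import lingam

rng = np.random.default_rng(0)
n = 1000
x0 = rng.uniform(-1, 1, size=n)
x1 = 1.5 * x0 + rng.uniform(-1, 1, size=n)      # x0 -> x1
x2 = 0.8 * x1 + rng.uniform(-1, 1, size=n)      # x1 -> x2
X = np.column_stack([x0, x1, x2])

model = lingam.DirectLiNGAM()
model.fit(X)
print(model.causal_order_)      # estimated causal ordering, e.g., [0, 1, 2]
print(model.adjacency_matrix_)  # estimated connection strengths
```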
and find the causal graph(s) consistent with the data
– Typical example 1:
• Directed acyclic graph (DAG)
• No hidden common causes (all variables observed)
– Typical example 2:
• DAG
• Hidden common causes may exist
[Figure: example DAG over x1, x2, x3 with error variables e1, e2, e3.]
Structural model: xᵢ = fᵢ(parents of xᵢ, eᵢ), where eᵢ is an error variable
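A minimal sketch of this structural model for typical example 1, with an illustrative graph and coefficients: each variable is generated from its parents and an independent error, following a topological order of the DAG:

```python
# A minimal sketch of the structural model with an illustrative graph
# x1 -> x3 -> x2: each variable is generated from its parents and an
# independent error variable, following a topological order of the DAG.
import numpy as np

rng = np.random.default_rng(0)
n = 500
e1, e2, e3 = rng.normal(size=(3, n))
x1 = e1                  # root: no parents
x3 = 0.5 * x1 + e3       # x3 = f3(x1, e3), here linear
x2 = -0.8 * x3 + e2      # x2 = f2(x3, e2)
```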
graph
– Directed acyclic graph
– No hidden common causes (all variables have been observed)
2. Find the graph that best matches the data among the causal graphs satisfying the assumptions (a two-variable sketch follows below).
If x and y are independent in the data, select (c).
If x and y are dependent in the data, (a) and (b) both remain; they are indistinguishable (not uniquely identifiable): a Markov equivalence class.
[Figure: three candidate graphs over x and y: (a) x → y, (b) y → x, (c) no edge.]
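A minimal sketch of step 2 for the two-variable case, with a Pearson correlation test as a stand-in for a general independence test (it detects only linear dependence):

```python
# A minimal sketch of step 2 for two variables: an independence test decides
# between graph (c) and the equivalence class {(a), (b)}.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = 0.5 * x + rng.normal(size=1000)   # data actually from graph (a): x -> y

r, p = stats.pearsonr(x, y)
if p > 0.05:
    print("x and y look independent: select (c)")
else:
    print("x and y are dependent: (a) and (b) remain (Markov equivalent)")
```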
et al., 1995)
• Those for time series cases (Malinsky & Spirtes, 2018)
• Equivalence classes including cyclic graphs (Richardson, 1996)
• Lower bounds on intervention effects (Maathuis et al., 2009; Malinsky & Spirtes, 2017)
[Figure: example graphs with hidden variables f, f1, f2 over x, y, w, z; credit: F. Eberhardt, CRM Workshop 2016.]
information available than conditional independence
• E.g., linearity + a non-Gaussian continuous distribution (sketched below)
[Figure: graphs (a) and (b) result in different distributions of x1 and x2, with no difference in terms of their conditional independence.]
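A minimal sketch of how that extra information is used: with linear relations and non-Gaussian errors, only the causal direction yields a regression residual independent of the regressor. The `dep` score below (correlation of squares) is a crude stand-in for a proper independence measure such as HSIC:

```python
# A minimal sketch: with linear relations and non-Gaussian (uniform) errors,
# only the causal direction gives a residual independent of the regressor.
import numpy as np

rng = np.random.default_rng(0)
n = 50_000
x = rng.uniform(-1, 1, n)                    # non-Gaussian cause
y = 0.8 * x + rng.uniform(-1, 1, n)          # true model: x -> y

def residual(a, b):
    slope, intercept = np.polyfit(a, b, 1)   # regress b on a
    return b - (slope * a + intercept)

def dep(a, b):
    # crude proxy for an independence test: ~0 when a and b are independent
    return abs(np.corrcoef(a ** 2, b ** 2)[0, 1])

print(dep(x, residual(x, y)))   # ~0: causal direction accepted
print(dep(y, residual(y, x)))   # clearly > 0: reverse direction rejected
```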
independence of error variables, e.g., by HSIC (Gretton et al., 2005; sketched below)
– Prediction accuracy using the Markov boundary (Biza et al., 2020)
– Compare with results on other datasets in which the causal graphs are expected to be similar
– Check against background knowledge
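A minimal sketch of the HSIC-based check, using the biased HSIC statistic with Gaussian kernels of fixed width; a real application would calibrate the statistic, e.g., with a permutation test:

```python
# A minimal sketch of an HSIC-style residual check (biased V-statistic,
# Gaussian kernels, fixed width; calibrate by permutation in practice).
import numpy as np

def rbf_gram(v, width=1.0):
    d2 = (v[:, None] - v[None, :]) ** 2
    return np.exp(-d2 / (2 * width ** 2))

def hsic(a, b):
    n = len(a)
    K, L = rbf_gram(a), rbf_gram(b)
    H = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    return np.trace(K @ H @ L @ H) / n ** 2

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 500)
e = rng.uniform(-1, 1, 500)           # residual under a correct model
print(hsic(x, e))                     # small: independent
print(hsic(x, 0.5 * x + e))           # larger: dependent
```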
Peters+14JMLR)
• xᵢ = fᵢ(par(xᵢ)) + eᵢ (sketched below)
• xᵢ = gᵢ⁻¹(fᵢ(par(xᵢ)) + eᵢ)
• Discrete variables
– Poisson DAG model and its extensions (Park+18JMLR)
• Mixed types of variables: LiNGAM + logistic-type model
– Identifiability condition for two variables (Wenjuan+18IJCAI)
– Probably also OK for multivariate cases, using the idea of Thm. 28 of Peters et al. (2014)
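A minimal sketch of the additive-noise idea for two variables: fit a nonlinear regression in each direction and keep the direction whose residual is independent of the input. A cubic polynomial regression and the correlation-of-squares proxy stand in for general regression and HSIC:

```python
# A minimal sketch of the additive-noise-model direction test for two
# variables, with cubic polynomial regression as the nonlinear fit.
import numpy as np

rng = np.random.default_rng(0)
n = 5000
x = rng.uniform(-2, 2, n)
y = x ** 3 + 0.3 * rng.uniform(-1, 1, n)   # true model: x -> y, nonlinear

def residual(a, b):
    # residual of a nonlinear (cubic polynomial) regression of b on a
    return b - np.polyval(np.polyfit(a, b, deg=3), a)

def dep(a, b):
    # crude independence proxy: ~0 for independent a and b
    return abs(np.corrcoef(a ** 2, b ** 2)[0, 1])

print(dep(x, residual(x, y)))   # ~0: forward additive model fits
print(dep(y, residual(y, x)))   # clearly > 0: backward model rejected
```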
common causes
• For unconfounded pairs with no hidden common causes, estimate the causal directions
• For confounded pairs with hidden common causes, leave them unknown
[Figure: underlying model with hidden common cause f1 over x0 to x4, vs. the output graph in which the confounded pair is left undetermined.]
causes leads to dependence between an explanatory variable and its residual (Tashiro et al., 2014)
• Key result (Maeda & Shimizu, 2020); see the sketch below
– Find a set of variables that gives independent residuals when a variable is regressed on every subset of it
– If this succeeds, the variables in such a set (x1 and x2) are the unconfounded ancestors of the variable (x4)
• For nonlinear additive models, the existence of hidden intermediate variables also leads to dependence (Maeda & Shimizu, 2021)
[Figure: example graphs with hidden common causes f1, f2 over x1 to x4.]
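A hedged sketch of this procedure, assuming the RCD implementation that ships with the Python `lingam` package and its convention (assumed here) of reporting confounded pairs as NaN entries in the estimated adjacency matrix:

```python
# A hedged sketch using lingam.RCD (assumed available in the `lingam`
# package): x1, x2 are unconfounded ancestors of x4, while x3 and x4 share
# the hidden common cause f; the confounded pair should be left undetermined.
import numpy as np
import lingam

rng = np.random.default_rng(0)
n = 2000
f = rng.uniform(-1, 1, n)                       # hidden common cause
x1 = rng.uniform(-1, 1, n)
x2 = rng.uniform(-1, 1, n)
x3 = 3.0 * f + rng.uniform(-1, 1, n)            # confounded with x4 via f
x4 = 1.0 * x1 - 1.0 * x2 + 3.0 * f + rng.uniform(-1, 1, n)

model = lingam.RCD()
model.fit(np.column_stack([x1, x2, x3, x4]))
print(model.adjacency_matrix_)  # x1 -> x4 and x2 -> x4 oriented; the
                                # (x3, x4) entries expected to be NaN
```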
– Two pure measurement variables per latent factor are needed to identify the measurement model (Silva et al., 2006; Xie et al., 2020)
• Estimate the latent factors and then their causal graph
[Figure: latent factors f1, f2 measured through x1 to x4, with the causal direction between f1 and f2 unknown.]
Model: 𝒇 = B𝒇 + 𝝐, 𝒙 = G𝒇 + 𝒆
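A minimal sketch of this setup with illustrative coefficients: a causal link between two latent factors, each measured by two pure indicator variables, matching the identification condition above:

```python
# A minimal sketch of the latent-factor model f = Bf + eps, x = Gf + e,
# with two pure measurements per latent factor (illustrative coefficients).
import numpy as np

rng = np.random.default_rng(0)
n = 2000
f1 = rng.uniform(-1, 1, n)                 # latent factor 1
f2 = 0.8 * f1 + rng.uniform(-1, 1, n)      # latent causal link f1 -> f2
F = np.vstack([f1, f2])

G = np.array([[1.0, 0.0],     # x1, x2 load only on f1 (pure measurements)
              [0.9, 0.0],
              [0.0, 1.0],     # x3, x4 load only on f2
              [0.0, 0.8]])
X = (G @ F + 0.3 * rng.uniform(-1, 1, (4, n))).T   # n x 4 observed data
# Methods such as Silva et al. (2006) / Xie et al. (2020) first recover the
# measurement structure G, then estimate the causal graph among f1 and f2.
```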
for science
– Many well-developed methods are available in cases where a causal graph can be drawn with background knowledge
– Helping to draw causal graphs from data is the key: causal discovery
• LiNGAM-related papers: https://sites.google.com/view/sshimizu06/lingam/lingampapers
• Next default assumptions
– Hidden common causes / latent factors
– Mixed data: continuous and discrete
– (Cyclicity (Lacerda et al., 2008))
discovery of linear non-Gaussian acyclic models with latent confounders. In Proc. 23rd International Conference on Artificial Intelligence and Statistics (AISTATS2020), 2020.
• F. H. Messerli. Chocolate consumption, cognitive function, and Nobel laureates. New England Journal of Medicine, 2012.
• T. Rosenström, M. Jokela, S. Puttonen, M. Hintsanen, L. Pulkki-Råback, J. S. Viikari, O. T. Raitakari and L. Keltikangas-Järvinen. Pairwise measures of causal direction in the epidemiology of sleep problems and depression. PLoS ONE, 7(11): e50841, 2012.
• A. Moneta, D. Entner, P. O. Hoyer and A. Coad. Causal inference by independent component analysis: Theory and applications. Oxford Bulletin of Economics and Statistics, 75(5): 705-730, 2013.
• O. Boukrina and W. W. Graves. Neural networks underlying contributions from semantics in reading aloud. Frontiers in Human Neuroscience, 7: 518, 2013.
• P. Campomanes, M. Neri, B. A. C. Horta, U. F. Roehrig, S. Vanni, I. Tavernelli and U. Rothlisberger. Origin of the spectral shifts among the early intermediates of the rhodopsin photocycle. Journal of the American Chemical Society, 136(10): 3842-3851, 2014.
• J. Peters, D. Janzing and B. Schölkopf. Elements of Causal Inference: Foundations and Learning Algorithms. MIT Press, 2018.
• S. Shimizu and P. Blöbaum. Recent advances in semi-parametric methods for causal discovery. In Direction Dependence in Statistical Models: Methods of Analysis (W. Wiedermann, D. Kim, E. Sungur, and A. von Eye, eds.). Wiley, 2020.
P. Spirtes, C. Glymour and R. Scheines. Causation, Prediction, and Search. Springer, 1993.
• S. Shimizu, P. O. Hoyer, A. Hyvärinen and A. Kerminen. A linear non-Gaussian acyclic model for causal discovery. Journal of Machine Learning Research, 7: 2003-2030, 2006.
• S. Shimizu. LiNGAM: Non-Gaussian methods for estimating causal structures. Behaviormetrika, 41(1): 65-98, 2014.
• P. Spirtes, C. Meek and T. S. Richardson. Causal inference in the presence of latent variables and selection bias. In Proc. 11th Conf. on Uncertainty in Artificial Intelligence (UAI1995), 1995.
• D. Malinsky and P. Spirtes. Causal structure learning from multivariate time series in settings with unmeasured confounding. In Proc. 2018 ACM SIGKDD Workshop on Causal Discovery (KDD-CD), 2018.
• T. S. Richardson. A discovery algorithm for directed cyclic graphs. In Proc. 12th Conf. on Uncertainty in Artificial Intelligence (UAI1996), 1996.
causal effects in high-dimensional and possibly confounded systems. International Journal of Approximate Reasoning, 2017.
• S. Shimizu, T. Inazumi, Y. Sogawa, A. Hyvärinen, Y. Kawahara, T. Washio, P. O. Hoyer and K. Bollen. DirectLiNGAM: A direct method for learning a linear non-Gaussian structural equation model. Journal of Machine Learning Research, 12(Apr): 1225-1248, 2011.
• A. Hyvärinen and S. M. Smith. Pairwise likelihood ratios for estimation of non-Gaussian structural equation models. Journal of Machine Learning Research, 14(Jan): 111-152, 2013.
• A. Hyvärinen. New approximations of differential entropy for independent component analysis and projection pursuit. In Advances in Neural Information Processing Systems 12 (NIPS1999), 1999.
• P. O. Hoyer, D. Janzing, J. Mooij, J. Peters and B. Schölkopf. Nonlinear causal discovery with additive noise models. In Advances in Neural Information Processing Systems 21 (NIPS2008), pp. 689-696, 2009.
• K. Zhang and A. Hyvärinen. Distinguishing causes from effects using nonlinear acyclic causal models. In JMLR Workshop and Conference Proceedings, Causality: Objectives and Assessment (Proc. NIPS2008 workshop on causality), 6: 157-164, 2010.
• J. Peters, J. Mooij, D. Janzing and B. Schölkopf. Causal discovery with continuous additive noise models. Journal of Machine Learning Research, 15: 2009-2053, 2014.
O. Hoyer. Discovering cyclic causal models by independent components analysis. In Proc. 24th Conf. on Uncertainty in Artificial Intelligence (UAI2008), pp. 366-374, Helsinki, Finland, 2008.
• P. O. Hoyer, S. Shimizu, A. Kerminen and M. Palviainen. Estimation of causal effects using linear non-Gaussian causal models with hidden variables. International Journal of Approximate Reasoning, 49(2): 362-378, 2008.
• S. Salehkaleybar, A. Ghassami, N. Kiyavash and K. Zhang. Learning linear non-Gaussian causal models in the presence of latent variables. Journal of Machine Learning Research, 21: 1-24, 2020.
• S. Shimizu, P. O. Hoyer and A. Hyvärinen. Estimation of linear non-Gaussian acyclic models for latent factors. Neurocomputing, 72: 2024-2027, 2009.
• Y. Zeng, S. Shimizu, R. Cai, F. Xie, M. Yamamoto and Z. Hao. Causal discovery with multi-domain LiNGAM for latent factors. In Proc. IJCAI2021, 2021.
• X. Zheng, B. Aragam, P. K. Ravikumar and E. P. Xing. DAGs with NO TEARS: Continuous optimization for structure learning. In Advances in Neural Information Processing Systems 31 (NeurIPS 2018), 2018.
• J. D. Ramsey, S. J. Hanson and C. Glymour. Multi-subject search correctly identifies causal connections and most causal directions in the DCM models of the Smith et al. simulation study. NeuroImage, 58(3): 838-848, 2011.
• S. Shimizu. Joint estimation of linear non-Gaussian acyclic models. Neurocomputing, 81: 104-107, 2012.
Causal structure discovery with application to prescriptive pricing. In Proc. 27th International Joint Conference on Artificial Intelligence (IJCAI2018), pp. xx--xx, Stockholm, Sweden, 2018.
• Y. Komatsu, S. Shimizu and H. Shimodaira. Assessing statistical reliability of LiNGAM via multiscale bootstrap. In Proc. International Conference on Artificial Neural Networks (ICANN2010), pp. 309-314, Thessaloniki, Greece, 2010.
• K. Biza, I. Tsamardinos and S. Triantafillou. Tuning causal discovery algorithms. In Proc. Probabilistic Graphical Models (PGM2020), 2020.
• R. Silva, R. Scheines, C. Glymour and P. Spirtes. Learning the structure of linear latent variable models. Journal of Machine Learning Research, 7: 191-246, 2006.
• F. Xie, R. Cai, B. Huang, C. Glymour, Z. Hao and K. Zhang. Generalized independent noise condition for estimating latent variable causal graphs. In Advances in Neural Information Processing Systems 33 (NeurIPS 2020), 2020.
• K. Zhang, M. Gong, P. Stojanov, B. Huang, Q. Liu and C. Glymour. Domain adaptation as a problem of inference on graphical models. In Advances in Neural Information Processing Systems 33 (NeurIPS 2020), 2020.
• M. J. Kusner, J. Loftus, C. Russell and R. Silva. Counterfactual fairness. In Advances in Neural Information Processing Systems 30 (NIPS 2017), 2017.
• P. Blöbaum and S. Shimizu. Estimation of interventional effects of features on prediction. In Proc. 2017 IEEE International Workshop on Machine Learning for Signal Processing (MLSP2017), pp. xx--xx, Tokyo, Japan, 2017.