Non-Gaussian, nonlinear causal discovery with hidden variables and application

Talk slides at Symposium: Approaching the World through the Lens of Causality
https://www.tfc.tohoku.ac.jp/event/4314.html

Shohei SHIMIZU

February 19, 2026
Transcript

  1. Non-Gaussian, nonlinear causal discovery with hidden variables and application

    SHIMIZU Shohei (1,2,3)
    (1) SANKEN, The University of Osaka
    (2) Faculty of Data Science, Shiga University
    (3) Center for Advanced Intelligence Project (AIP), RIKEN
    Approaching the World through the Lens of Causality
    SHIMIZU Shohei (Univ. Osaka) 20th Feb 2026 1 / 18
  2. Abstract

    ▶ What is causal discovery?
      ▶ Inferring causal structures from data with prior knowledge
    ▶ Why does it matter?
      ▶ Increasing importance in the era of ML and generative AI
    ▶ What will I show today?
      ▶ Non-Gaussian and nonlinear methods with hidden variables
      ▶ Real-world application example
  3. The Rise of AI for Science

    ▶ AI for Science aims to speed up research:
      ▶ AI-assisted hypothesis, design, simulation, analysis, interpretation, and discussion
    ▶ The idea of causal inference is needed at every step
      ▶ From correlation-driven intuition to causally grounded research
    [Figure: research cycle with the stages AI-assisted hypothesis, Experimental and survey design, Simulations and Experiments, Interpretation and discussion, and Publishing papers. Adapted from MEXT (2025) https://www.mext.go.jp/content/20250826-ope_dev02-000044427_8.pdf]
  4. Estimating Causal Effects Requires Structural Information

    ▶ Draw a causal graph based on prior knowledge
    ▶ Identify variables required for adjustment
    ▶ Adjust for those variables (if any) and estimate the intervention effect
      $E[N \mid do(C)] = E_{pa(C)}\left[ E(N \mid C, pa(C)) \right]$
    [Figure: example causal graphs with the node labels Sleep disorder, Depression mood, Chocolate, Nobel laureates, and GDP as a third variable (common cause); Messerli [2012]]
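
The adjustment formula on this slide can be illustrated with a small simulation. This is a minimal sketch on synthetic data, not the analysis behind the slide: the binary common cause `z`, the treatment `c`, the outcome `y`, and all coefficients are assumptions chosen for illustration. It compares a naive difference in means with the estimate obtained by averaging stratum-wise differences over the distribution of the common cause.

```python
import random

def simulate(n=20000, seed=0):
    """Synthetic data: z is a common cause of treatment c and outcome y."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0                    # hypothetical common cause
        c = 1 if rng.random() < (0.8 if z else 0.2) else 0    # treatment depends on z
        y = 1.0 * c + 2.0 * z + rng.gauss(0.0, 1.0)           # true effect of c on y is 1.0
        rows.append((z, c, y))
    return rows

def mean(values):
    return sum(values) / len(values)

def naive_effect(rows):
    """Difference in outcome means, ignoring the common cause."""
    treated = [y for _, c, y in rows if c == 1]
    control = [y for _, c, y in rows if c == 0]
    return mean(treated) - mean(control)

def adjusted_effect(rows):
    """Average the within-stratum differences over the distribution of z."""
    total = 0.0
    for z_value in (0, 1):
        stratum = [(c, y) for z, c, y in rows if z == z_value]
        weight = len(stratum) / len(rows)                     # estimate of P(z = z_value)
        treated = [y for c, y in stratum if c == 1]
        control = [y for c, y in stratum if c == 0]
        total += weight * (mean(treated) - mean(control))
    return total

rows = simulate()
print("naive:", round(naive_effect(rows), 2))
print("adjusted:", round(adjusted_effect(rows), 2))
```

In this setup the naive estimate is inflated by the common cause, while the adjusted estimate should land close to the simulated effect of 1.0.
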
  5. Causal Discovery [Spirtes et al., 2001, Shimizu, 2022]

    ▶ Infer causal graphs from data + assumptions
    ▶ Need assumptions:
      ▶ distribution, functional form, and graph structure
    ▶ Structural Causal Model (SCM) [Pearl, 2000]: $x_i = f_i(pa(x_i), e_i)$
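
To make the SCM notation concrete, here is a minimal sketch of sampling from a toy three-variable SCM and from its intervened version. The variables `z`, `x`, `y`, the linear functions, and the uniform noise are all assumptions made for illustration; they do not come from the talk.

```python
import random

def sample_scm(n, do_x=None, seed=0):
    """Draw n samples from a toy SCM; if do_x is given, x is fixed by intervention."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        # Independent, non-Gaussian (uniform) noise terms e_z, e_x, e_y
        e_z = rng.uniform(-1.0, 1.0)
        e_x = rng.uniform(-1.0, 1.0)
        e_y = rng.uniform(-1.0, 1.0)
        z = e_z                                        # z has no parents
        x = 2.0 * z + e_x if do_x is None else do_x    # do(x) removes the edge z -> x
        y = 3.0 * x + z + e_y                          # pa(y) = {x, z}
        samples.append((z, x, y))
    return samples

def mean_y(samples):
    return sum(y for _, _, y in samples) / len(samples)

observational = sample_scm(20000)
interventional = sample_scm(20000, do_x=1.0)
print(round(mean_y(observational), 2), round(mean_y(interventional), 2))
```

Under these assumptions the observational mean of `y` is near 0, while under do(x = 1) it is near 3, because the intervention replaces the structural equation of `x` and leaves the others unchanged.
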
  6. Basic Idea of Nonparametric Approach [Spirtes et al., 2001]

    1. Assume a class of causal graphs:
      ▶ Directed acyclic graphs (DAGs)
      ▶ No hidden common causes (causal sufficiency)
    2. Select the graph(s) that best match the observed conditional independencies in the data
      ▶ If x and y are independent, select (c)
      ▶ If x and y are dependent, select (a) and (b)
      ▶ (a) and (b) are indistinguishable: a Markov equivalence class
    [Figure: candidate causal graphs (a), (b), and (c) over x and y]
  7. Leveraging Functional Form and Distributional Information [Shimizu, 2014, 2022]

    ▶ Under linearity and non-Gaussian continuous noise (LiNGAM), the causal directions x → y and y → x become identifiable [Shimizu et al., 2006]
    ▶ In the correct direction, the regressor and residual are independent
    ▶ Estimate the model by maximizing independence between the regressor and the residual
    [Figure: linear models (a) and (b) for the two directions between x and y. They do not differ in terms of conditional independence, but linearity plus non-Gaussian continuous distributions result in different distributions of x and y (Shimizu, Hoyer, Hyvärinen & Kerminen, 2006; Shimizu, 2022)]
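
The regressor-residual independence idea can be sketched on simulated data. This is an illustration under stated assumptions, not the estimator used in LiNGAM implementations: the data are generated as x → y with uniform noise, and dependence is measured by a crude proxy (the absolute correlation between squared, centred values), which is an assumption of this sketch.

```python
import random

def mean(values):
    return sum(values) / len(values)

def pearson(a, b):
    ma, mb = mean(a), mean(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5

def ols_residual(target, regressor):
    """Residual of a simple least-squares regression of target on regressor."""
    mt, mr = mean(target), mean(regressor)
    slope = (sum((r - mr) * (t - mt) for r, t in zip(regressor, target))
             / sum((r - mr) ** 2 for r in regressor))
    return [(t - mt) - slope * (r - mr) for r, t in zip(regressor, target)]

def dependence(regressor, resid):
    """Crude dependence proxy: |corr| between squared centred regressor and squared residual."""
    mr = mean(regressor)
    return abs(pearson([(r - mr) ** 2 for r in regressor], [e * e for e in resid]))

rng = random.Random(0)
n = 5000
x = [rng.uniform(-1.0, 1.0) for _ in range(n)]       # non-Gaussian cause
y = [xi + rng.uniform(-1.0, 1.0) for xi in x]        # y = x + non-Gaussian noise

score_x_to_y = dependence(x, ols_residual(y, x))     # regress y on x
score_y_to_x = dependence(y, ols_residual(x, y))     # regress x on y
print("x -> y" if score_x_to_y < score_y_to_x else "y -> x")
```

In both directions the residual is uncorrelated with the regressor by construction, so the comparison has to rely on a higher-order dependence measure. With Gaussian noise this proxy would be near zero in both directions, which is why non-Gaussianity matters.
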
  8. Independence between cause and noise ensures identifiability

    ▶ Nonlinear extensions
      ▶ Additive Noise Model (ANM) [Hoyer et al., 2009]: $x_i = f_i(pa(i)) + e_i$
      ▶ Causal Additive Models (CAM) [Bühlmann et al., 2014]: $x_i = \sum_{j \in pa(i)} f_{ij}(x_j) + e_i$
      ▶ Post-Nonlinear Model (PNL) [Zhang and Chan, 2006, Zhang and Hyvärinen, 2009]: $x_i = g_i(f_i(pa(i)) + e_i)$
      ▶ Location–Scale Noise Model (LSNM) [Immer et al., 2023]: $x_i = f_i(pa(i)) + \sigma_i(pa(i)) e_i$
    ▶ Discrete and mixed-variable models [Park and Raskutti, 2017, Wei et al., 2018, Zeng et al., 2022, Maeda et al., 2025]
  9. Our Recent Methodological Advances and Applications

    Hidden variables and preventive medicine example
  10. Extension: Hidden common cause models

    ▶ LiNGAM with hidden common causes [Tashiro et al., 2014, Maeda and Shimizu, 2020]
      $x_i = \sum_{x_j \in par(x_i)} b_{ij} x_j + \sum_{u_k \in par(x_i)} \lambda_{ik} u_k + e_i$
    ▶ Sociology data example, from "RCD: Repetitive causal discovery of linear non-Gaussian acyclic models with latent confounders" (Maeda and Shimizu [2020]):

    Table 1: The results of the application to sociological data.

    | Method | Bidirected arrows (latent confounders): # of estimation | # of successes | Precision | Directed arrows (causality): # of estimation | # of successes | Precision |
    |--------|-----|-----|-----|-----|-----|-----|
    | RCD    | 4   | 4   | 1.0 | 5   | 4   | 0.8 |
    | FCI    | 3   | 3   | 1.0 | 3   | 1   | 0.3 |
    | RFCI   | 3   | 3   | 1.0 | 3   | 1   | 0.3 |
    | GFCI   | 0   | 0   | 0.0 | 0   | 0   | 0.0 |
    | PC     | -   | -   | -   | 2   | 1   | 0.5 |
    | GES    | -   | -   | -   | 2   | 1   | 0.5 |
    | RESIT  | -   | -   | -   | 12  | 4   | 0.3 |
    | LiNGAM | -   | -   | -   | 5   | 4   | 0.8 |

    [Figure 4: Variables and causal relations in the General Social Survey data set used for the evaluation.]
    [Figure 5: Causal graph produced by RCD. The dashed arrow between x3 and x5 is an incorrect inference, but the other arrows are reasonable based on Figure 4.]
  11. Continuous Optimization for LiNGAM with hidden common causes [Morinishi and Shimizu, 2025]

    ▶ Objective: Likelihood + Graph Constraint
      $\min_{\theta \in \Theta} \; -2 \ln L(X; \theta) + \lambda \sum_i \tanh(c|\theta_i|) + \frac{\rho}{2} \|h(\theta)\|^2 + \alpha h(\theta)$
    ▶ Graph constraint for bow-free graphs with hidden common causes [Bhattacharya et al., 2021]
      $h(\theta) = \mathrm{tr}(e^{D}) - \#\text{variables} + \sum_{i,j} (D \circ B)_{ij}$
    ▶ D and B are the adjacency matrices of the directed edges among observed variables and of the bidirected edges induced by hidden common causes
    ▶ Equals zero iff the estimated graph is bow-free
    [Figure, from Transactions on Machine Learning Research (8/2025): (a) DAG without unmeasured variables. (b) Ancestral ADMGs. (c) Arid ADMGs. (d) Bow-free ADMGs.]
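
The bow-free constraint can be checked numerically on small graphs. The sketch below is an illustration rather than the authors' implementation: it assumes `d` is a nonnegative adjacency matrix of directed edges, `b` a symmetric nonnegative adjacency matrix of bidirected edges, and it approximates the trace of the matrix exponential by a truncated power series. The helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace_expm(d, terms=30):
    """tr(exp(D)) via the truncated series sum_k tr(D^k) / k!."""
    n = len(d)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # D^0
    total = float(n)                                                         # tr(I)
    factorial = 1.0
    for k in range(1, terms):
        power = matmul(power, d)
        factorial *= k
        total += sum(power[i][i] for i in range(n)) / factorial
    return total

def bow_free_penalty(d, b):
    """h = tr(exp(D)) - #variables + sum of the elementwise product of D and B."""
    n = len(d)
    overlap = sum(d[i][j] * b[i][j] for i in range(n) for j in range(n))
    return trace_expm(d) - n + overlap

chain = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]            # x1 -> x2 -> x3
no_bow = [[0, 0, 1], [0, 0, 0], [1, 0, 0]]           # x1 <-> x3 only
with_bow = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]         # x1 <-> x2 on top of x1 -> x2
print(bow_free_penalty(chain, no_bow), bow_free_penalty(chain, with_bow))
```

The first term vanishes only when the directed part is acyclic, and the overlap term vanishes only when no pair of variables is joined by both a directed and a bidirected edge, so the penalty is zero exactly for the bow-free case in this example.
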
  12. Nonlinear Models with hidden Variables

    ▶ Causal Additive Models (CAM) with hidden Variables (CAM-UV) [Maeda and Shimizu, 2021]
      $x_i = \sum_{j \in pa_X(i)} f_{ij}(x_j) + \sum_{k \in pa_U(i)} g_{ik}(u_k) + e_i$
    ▶ Sufficient identifiability conditions leveraging both types of independence [Pham et al., 2026]
      ▶ Independence between cause and noise (via regression residuals)
      ▶ Conditional independence among observed variables
    ▶ Although (x3, x2) is a bow case, we can still identify that x3 → x2.
    [Figure: excerpt from Thong Pham, Takashi Nicholas Maeda, Shohei Shimizu, including Lemma 2 (visible non-edge)]
  13. Intuition in determining the direction between (x2, x3)

    A clearer intuitive explanation is under development
  14. Causal Discovery Among Groups of Variables

    ▶ Known grouping
      ▶ Variables that share hidden common causes
      ▶ Multicollinearity
      ▶ Group version of DirectLiNGAM [Entner and Hoyer, 2012]
      ▶ Group version of the Location–Scale Noise Model (LSNM) [Kikuchi and Shimizu, 2023], applicable to time-series settings
        $x_j^t = f_j(PA_j^t, \ldots, PA_j^{t-L}) + \sigma_j(PA_j^t, \ldots, PA_j^{t-L}) \, e_j^t$
    ▶ Unknown grouping [Kawahara et al., 2010]
      ▶ May be useful for defining variables
    [Figure: excerpt from Kikuchi and Shimizu [2023], including Figure 1: Example of (a) a variable DAG and (b) a group DAG]
  15. Preventive Medicine Example with Fukuma’s Group (Kyoto Univ.)

    ▶ Effect of the health guidance intervention on subsequent outcomes
    ▶ Use causal discovery to identify appropriate covariates for adjustment
    ▶ Operational data collection rules that impose structural constraints:
    ▶ Biological knowledge (Common sense)
    ▶ Estimate the causal structure of health outcomes within the same year
    [Figure: variable groups by fiscal year, FY2020 to FY2023 (time points 0 to 3), based on a nationwide insurer database]
      Background Factors: Age, Sex, Check num.
      Health Outcomes: BMI, HbA1c, SBP, DBP, LDL
      Lifestyle Habits: Smoke, Exercise, Alcohol
      Medications: Drug-HT, Drug-DM, Drug-LDL
      Intervention: Health-guidance (2020, 2021, 2022)
  16. Results

    ▶ Covariate selection for adjustment based on the estimated graph:
    [Figure: estimated causal graph over Health-guidance (2020, 2021, 2022), Check_num., and, for each year from 2020 to 2023, BMI, SBP, DBP, HbA1c, LDL, Drug-HT, Drug-DM, Drug-LDL, Smoke, Exercise, Alcohol, Age, and Sex]
    ▶ Estimated effects of guidance (2020) with bootstrap confidence intervals
      Effect on BMI (2021): -0.129, 95% CI = (-0.165, -0.094)
      Effect on BMI (2022): -0.067, 95% CI = (-0.109, -0.029)
      Effect on BMI (2023): -0.031, 95% CI = (-0.076, 0.014)
    ▶ Consistent with previous regression discontinuity analysis [Fukuma et al., 2020]
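
For readers unfamiliar with bootstrap confidence intervals, the following is a minimal percentile-bootstrap sketch. It is not the analysis behind the reported numbers: the data are synthetic, the effect is a plain difference in means without covariate adjustment, and the names `treated`, `control`, and `bootstrap_ci` are hypothetical.

```python
import random

def mean(values):
    return sum(values) / len(values)

def effect(treated, control):
    """Plain difference in means (no covariate adjustment in this sketch)."""
    return mean(treated) - mean(control)

def bootstrap_ci(treated, control, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the difference in means."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        # Resample each group with replacement and re-estimate the effect
        t = rng.choices(treated, k=len(treated))
        c = rng.choices(control, k=len(control))
        estimates.append(effect(t, c))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

data_rng = random.Random(1)
control = [data_rng.gauss(0.0, 1.0) for _ in range(500)]     # synthetic outcomes
treated = [data_rng.gauss(-0.5, 1.0) for _ in range(500)]    # simulated effect of -0.5

point = effect(treated, control)
lower, upper = bootstrap_ci(treated, control)
print(round(point, 3), (round(lower, 3), round(upper, 3)))
```

An interval that excludes zero, as for the 2021 and 2022 effects on the slide, indicates an effect distinguishable from zero at the chosen level, whereas the 2023 interval includes zero. In a full analysis the resampling would wrap the whole estimation pipeline, including the covariate adjustment.
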
  17. Final summary

    ▶ Causal inference is key for AI for Science
    ▶ When the graph is known, many tools exist
    ▶ When the graph is unknown, causal discovery helps
    ▶ Recent advances in causal discovery with hidden variables
    ▶ Open-source: LiNGAM-related methods [Ikeuchi et al., 2023], non-parametric methods [Zheng et al., 2024, Kalisch et al., 2012, Runge et al., 2023]
    ▶ No-code tools: Causalas, Node AI, NTech Predict, Causal analysis, etc.
  18. CFP: Recent Advances in Causal Inference, Causal Discovery, and Applications

    ▶ Special issue of Japanese Journal of Statistics and Data Science
    ▶ Coordinating Editors: Shohei Shimizu, Manabu Kuroki, Aapo Hyvärinen
    ▶ Submission deadline: February 1, 2027
    We welcome submissions from theory to real-world applications!
  19. Reference I

    Rohit Bhattacharya, Tushar Nagarajan, Daniel Malinsky, and Ilya Shpitser. Differentiable causal discovery under unmeasured confounding. In Arindam Banerjee and Kenji Fukumizu, editors, Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, volume 130 of Proceedings of Machine Learning Research, pages 2314–2322. PMLR, 13–15 Apr 2021. URL https://proceedings.mlr.press/v130/bhattacharya21a.html.
    Peter Bühlmann, Jonas Peters, and Jan Ernest. CAM: Causal additive models, high-dimensional order search and penalized regression. Annals of Statistics, 42(6):2526–2556, 2014.
    Doris Entner and Patrik O. Hoyer. Estimating a causal order among groups of variables in linear models. In Proceedings of the 22nd International Conference on Artificial Neural Networks (ICANN), pages 83–90, Lausanne, Switzerland, 2012.
    Shingo Fukuma, Toshiaki Iizuka, Tatsuyoshi Ikenoue, and Yusuke Tsugawa. Association of the national health guidance intervention for obesity and cardiovascular risks with health outcomes among Japanese men. JAMA Internal Medicine, 180(12):1630–1637, 2020. doi: 10.1001/jamainternmed.2020.4334. Published online October 5, 2020.
    Patrik O. Hoyer, Dominik Janzing, Joris Mooij, Jonas Peters, and Bernhard Schölkopf. Nonlinear causal discovery with additive noise models. In Advances in Neural Information Processing Systems 21, pages 689–696. Curran Associates Inc., 2009.
    Takashi Ikeuchi, Mayumi Ide, Yan Zeng, Takashi Nicholas Maeda, and Shohei Shimizu. Python package for causal discovery based on LiNGAM. Journal of Machine Learning Research, 24(14):1–8, 2023. URL http://jmlr.org/papers/v24/21-0321.html.
  20. Reference II

    Alexander Immer, Christoph Schultheiss, Julia E. Vogt, Bernhard Schölkopf, Peter Bühlmann, and Alexander Marx. On the identifiability and estimation of causal location-scale noise models. In Proceedings of the 40th International Conference on Machine Learning (ICML), volume 202 of Proceedings of Machine Learning Research, pages 14316–14332. PMLR, 2023.
    Markus Kalisch, Martin Mächler, Diego Colombo, Marloes H Maathuis, and Peter Bühlmann. Causal inference using graphical models with the R package pcalg. Journal of Statistical Software, 47(11):1–26, 2012.
    Y. Kawahara, K. Bollen, S. Shimizu, and T. Washio. GroupLiNGAM: Linear non-Gaussian acyclic models for sets of variables. arXiv:1006.5041, 2010.
    Genta Kikuchi and Shohei Shimizu. Structure learning for groups of variables in nonlinear time-series data with location-scale noise. In Erich Kummerfeld, Sisi Ma, Eric Rawls, and Bryan Andrews, editors, Proceedings of the 2023 Causal Analysis Workshop Series, volume 223 of Proceedings of Machine Learning Research, pages 20–39. PMLR, 14 Aug 2023. URL https://proceedings.mlr.press/v223/kikuchi23a.html.
    Takashi Nicholas Maeda and Shohei Shimizu. RCD: Repetitive causal discovery of linear non-Gaussian acyclic models with latent confounders. In Proc. 23rd International Conference on Artificial Intelligence and Statistics (AISTATS2020), volume 108 of Proceedings of Machine Learning Research, pages 735–745. PMLR, 26–28 Aug 2020.
    Takashi Nicholas Maeda and Shohei Shimizu. Causal additive models with unobserved variables. In Proc. 37th Conference on Uncertainty in Artificial Intelligence (UAI2021), pages 97–106. PMLR, 2021.
  21. Reference III

    Takashi Nicholas Maeda, Shohei Shimizu, and Hidetoshi Matsui. Density ratio-based causal discovery from bivariate continuous-discrete data, 2025.
    F. H. Messerli. Chocolate consumption, cognitive function, and Nobel laureates. New England Journal of Medicine, 367:1562–1564, 2012.
    Yoshimitsu Morinishi and Shohei Shimizu. Differentiable causal discovery of linear non-Gaussian acyclic models under unmeasured confounding. Transactions on Machine Learning Research, August 2025. URL https://openreview.net/forum?id=HR7MFlW73I.
    Gunwoong Park and Garvesh Raskutti. Learning quadratic variance function (QVF) DAG models via overdispersion scoring (ODS). Journal of Machine Learning Research, 18:224–1, 2017.
    Judea Pearl. Causality: Models, Reasoning, and Inference. Cambridge University Press, 2000.
    Thong Pham, Takashi Nicholas Maeda, and Shohei Shimizu. Causal additive models with unobserved causal paths and backdoor paths. In Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), volume XX of Proceedings of Machine Learning Research, pages XX–XX. PMLR, 2026.
    Jakob Runge, Andreas Gerhardus, Gherardo Varando, Veronika Eyring, and Gustau Camps-Valls. Causal inference for time series. Nature Reviews Earth & Environment, 4:487–505, June 2023. doi: 10.1038/s43017-023-00431-y. URL https://doi.org/10.1038/s43017-023-00431-y.
    Shohei Shimizu. LiNGAM: Non-Gaussian methods for estimating causal structures. Behaviormetrika, 41(1):65–98, 2014.
  22. Reference IV

    Shohei Shimizu. Statistical Causal Discovery: LiNGAM Approach. Springer, Tokyo, 2022.
    Shohei Shimizu, Patrik O. Hoyer, Aapo Hyvärinen, and Antti Kerminen. A linear non-Gaussian acyclic model for causal discovery. Journal of Machine Learning Research, 7:2003–2030, 2006.
    Peter Spirtes, Clark Glymour, and Richard Scheines. Causation, Prediction, and Search. MIT Press, 2001. 2nd ed.
    Tatsuya Tashiro, Shohei Shimizu, Aapo Hyvärinen, and Takashi Washio. ParceLiNGAM: A causal ordering method robust against latent confounders. Neural Computation, 26(1):57–83, 2014.
    Wenjuan Wei, Lu Feng, and Chunchen Liu. Mixed causal structure discovery with application to prescriptive pricing. In Proc. 27th International Joint Conference on Artificial Intelligence (IJCAI2018), pages 5126–5134, 2018.
    Yan Zeng, Shohei Shimizu, Hidetoshi Matsui, and Fuchun Sun. Causal discovery for linear mixed data. In Proceedings of the First Conference on Causal Learning and Reasoning (CLeaR2022), volume 177 of Proceedings of Machine Learning Research, pages 994–1009. PMLR, 11–13 Apr 2022.
    K. Zhang and L.-W. Chan. ICA with sparse connections. In Proc. 7th International Conference on Intelligent Data Engineering and Automated Learning (IDEAL 2006), pages 530–537, 2006.
    K. Zhang and A. Hyvärinen. On the identifiability of the post-nonlinear causal model. In Proc. 25th Conference on Uncertainty in Artificial Intelligence (UAI2009), pages 647–655, 2009.
  23. Reference V

    Yujia Zheng, Biwei Huang, Wei Chen, Joseph Ramsey, Mingming Gong, Ruichu Cai, Shohei Shimizu, Peter Spirtes, and Kun Zhang. Causal-learn: Causal discovery in Python. Journal of Machine Learning Research, 25(60):1–8, 2024. URL http://jmlr.org/papers/v25/23-0970.html.