Non-Gaussian methods for causal discovery

Shohei SHIMIZU
December 17, 2023

16th International Conference of the ERCIM WG on Computational and Methodological Statistics (CMStatistics 2023), Berlin
Organized Invited Session: Statistical Learning of Non-Gaussian Data


Transcript

  1. Non-Gaussian methods for causal discovery
    Shohei Shimizu
    Shiga University and RIKEN
    CMStatistics 2023, Berlin
    Organized Session: Statistical Learning of Non-Gaussian Data

  2. What is causal discovery?
    • Methodology for inferring causal graphs using data
    • Helps select covariates in causal effect estimation
    Maeda and Shimizu (2020)
    [Figure: Data + Assumptions → Causal graph]
    Assumptions
    • Functional form?
    • Distribution?
    • Hidden common cause present?
    • Acyclic? etc.

  3. Applications
    https://www.shimizulab.org/lingam/lingampapers/applications-and-tailor-made-methods
    • Epidemiology (Rosenstrom et al., 2012)
    [Figure: which direction, sleep problems → depression mood or depression mood → sleep problems?]
    • Economics (Moneta et al., 2012)
    [Figure: graph over Empl.gr, Sales.gr, R&D.gr and OpInc.gr at times t, t+1 and t+2]
    • Neuroscience (Ogawa et al., 2022)
    • Chemistry (Campomanes et al., 2014)
    • Preventive medicine (Kotoku et al., 2020)
    • Finance (Jiang & Shimizu, 2023)

  4. Methods of causal discovery

  5. Non-parametric approach: Example
    (Spirtes et al., 1993; 2001)
    1. Make assumptions on the underlying causal graph
    – Directed acyclic graph
    – No hidden common causes (all have been observed)
    2. Among the causal graphs that satisfy the assumptions, find the one
    that best matches the data.
    [Figure: three candidate graphs over x and y: (a) and (b) have an edge between x and y in opposite directions, (c) has no edge]
    • If x and y are independent in the data, select (c).
    • If x and y are dependent in the data, select (a) and (b).
    • (a) and (b) are indistinguishable: they form a Markov equivalence class.
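
The selection rule on this slide can be sketched in a few lines. This is only an illustration, assuming that a zero-correlation test (Fisher's z-transform) is an adequate independence test, which holds for jointly Gaussian data but not in general. The function names, the threshold `alpha` and the simulated data are hypothetical and do not come from the talk.

```python
import math
import numpy as np

def fisher_z_pvalue(x, y):
    """Two-sided p-value for H0: corr(x, y) = 0, via Fisher's z-transform."""
    n = len(x)
    r = np.corrcoef(x, y)[0, 1]
    z = math.atanh(r) * math.sqrt(n - 3)
    # Two-sided tail probability of a standard normal.
    return math.erfc(abs(z) / math.sqrt(2))

def select_candidates(x, y, alpha=0.05):
    """Return the candidate graphs compatible with the test result."""
    if fisher_z_pvalue(x, y) >= alpha:
        return ["(c)"]          # independence not rejected: no edge
    return ["(a)", "(b)"]       # dependent: direction stays undetermined

rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
y_dep = 0.8 * x + rng.normal(size=n)   # y depends on x
y_ind = rng.normal(size=n)             # y generated independently of x

print(select_candidates(x, y_dep))
print(select_candidates(x, y_ind))
```

Whatever the data, the rule never returns (a) or (b) alone, which is the Markov equivalence limitation stated on the slide.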

  6. Additional information on functional forms and/or distributions helpful
    [Figure: candidate graphs (a) and (b) over x and y: no difference in terms of their conditional independence]
    • Semiparametric approach
    • E.g., linearity + non-Gaussian continuous distribution gives (a) and (b) different distributions of x and y
    (Shimizu, Hoyer, Hyvarinen & Kerminen, 2006; Shimizu, 2022)

  7. Semiparametric approach: Example identifiable models
    • Linear Non-Gaussian Acyclic Model: LiNGAM (Shimizu et al., 2006)
    $x_i = \sum_{j:\, x_j \in \mathrm{par}(x_i)} b_{ij} x_j + e_i$
    • Nonlinearity + “additive” noise
    (Hoyer et al. 2009, Zhang & Hyvarinen, 2009, Peters et al. 2014)
    $x_i = f_i(\mathrm{par}(x_i)) + e_i$
    $x_i = g_i^{-1}\big(f_i(\mathrm{par}(x_i)) + e_i\big)$
    • Discrete variable model or mixed cases
    (Park et al., 2018; Wei et al., 2018; Zeng et al., 2022)
    [Figure: causal graph over $x_1$, $x_2$, $x_3$ with error variables $e_1$, $e_2$, $e_3$; the causal graph is identifiable]
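
As a rough illustration of the LiNGAM equation, the sketch below draws data from a three-variable linear acyclic model with uniform (non-Gaussian) errors. The coefficient matrix `B`, the sample size and the error distribution are hypothetical choices made for the example, not values from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Hypothetical coefficients: B[i, j] = b_ij, the direct effect of x_j on x_i.
# B is strictly lower triangular in the order x1, x2, x3, hence acyclic.
B = np.array([
    [0.0,  0.0, 0.0],
    [0.8,  0.0, 0.0],   # x2 = 0.8*x1 + e2
    [0.5, -1.2, 0.0],   # x3 = 0.5*x1 - 1.2*x2 + e3
])

# Independent non-Gaussian errors (uniform is just one possible choice).
e = rng.uniform(-1.0, 1.0, size=(n, 3))

# x = Bx + e  is equivalent to  x = (I - B)^{-1} e; solve for all samples at once.
x = np.linalg.solve(np.eye(3) - B, e.T).T

# Every sample satisfies the structural equations.
assert np.allclose(x, x @ B.T + e)
print(np.round(np.corrcoef(x, rowvar=False), 2))
```

Writing the model as x = (I - B)^{-1} e makes explicit that each observed variable is a linear mixture of independent errors, which is the property non-Gaussian methods exploit.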

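  8. How do independence and non-Gaussianity work?
    (Shimizu et al., 2011)
    Underlying model
    $x_1 = e_1$
    $x_2 = b_{21} x_1 + e_2 \quad (b_{21} \neq 0)$
    $e_1, e_2$ are non-Gaussian
    [Figure: $x_1 \to x_2$ with error variables $e_1$ and $e_2$]
    Regress effect on cause
    Residual
    $r_2^{(1)} = x_2 - \frac{\mathrm{cov}(x_2, x_1)}{\mathrm{var}(x_1)} x_1 = x_2 - b_{21} x_1 = e_2$
    $x_1 = e_1$ and $r_2^{(1)}$ are independent
    Regress cause on effect
    Residual
    $r_1^{(2)} = x_1 - \frac{\mathrm{cov}(x_1, x_2)}{\mathrm{var}(x_2)} x_2 = \left(1 - \frac{b_{21}\,\mathrm{cov}(x_1, x_2)}{\mathrm{var}(x_2)}\right) e_1 - \frac{b_{21}\,\mathrm{var}(x_1)}{\mathrm{var}(x_2)} e_2$
    $x_2 = b_{21} e_1 + e_2$ and $r_1^{(2)}$ are dependent, although they are uncorrelated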
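
The asymmetry between the two regressions can be checked numerically. The sketch below simulates the two-variable model with uniform errors and compares a crude dependence score, the absolute correlation between squared values, for both regression directions. The score is an assumption made for illustration and is not the independence measure used in the cited paper. The coefficient `b21 = 1.0` and the sample size are also hypothetical.

```python
import numpy as np

def residual(target, regressor):
    """Residual of the least-squares regression of target on regressor."""
    t = target - target.mean()
    r = regressor - regressor.mean()
    slope = (t @ r) / (r @ r)   # cov(target, regressor) / var(regressor)
    return t - slope * r

def dependence_score(a, b):
    """|corr(a^2, b^2)| after centering: near 0 when a and b are independent.
    A crude stand-in for a proper independence measure."""
    a = a - a.mean()
    b = b - b.mean()
    return abs(np.corrcoef(a**2, b**2)[0, 1])

rng = np.random.default_rng(0)
n = 5000
b21 = 1.0  # hypothetical coefficient

# Unit-variance uniform errors: non-Gaussian, as the model requires.
e1 = rng.uniform(-np.sqrt(3), np.sqrt(3), size=n)
e2 = rng.uniform(-np.sqrt(3), np.sqrt(3), size=n)
x1 = e1
x2 = b21 * x1 + e2

# Regress effect on cause: the residual estimates e2, independent of x1.
score_effect_on_cause = dependence_score(x1, residual(x2, x1))
# Regress cause on effect: the residual is uncorrelated with x2 but dependent on it.
score_cause_on_effect = dependence_score(x2, residual(x1, x2))

print("effect on cause:", round(score_effect_on_cause, 3))
print("cause on effect:", round(score_cause_on_effect, 3))
```

With Gaussian errors both scores would be close to zero, because uncorrelated jointly Gaussian variables are independent; the direction shows up only because the errors are non-Gaussian.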

  9. Hidden common causes
    Additional information on functional forms and/or distributions helpful

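  10. Semiparametric approach: Linear non-Gaussian case
    • Dependence between explanatory variables and the regression residuals implies existence of hidden variables and/or wrong causal direction (Tashiro et al., 2014)
    – Regress $x_1$ on $x_2$ (in the presence of $U$)
    – The residual and $x_2$ are not independent because of the hidden $U$
    [Figure: hidden common cause $U$ with edges to $x_1$ (coefficient $\lambda_1$) and $x_2$ (coefficient $\lambda_2$), edge $x_1 \to x_2$ (coefficient $b_{21}$), error variables $e_1$ and $e_2$]
    $x_2 = (b_{21}\lambda_1 + \lambda_2) u + b_{21} e_1 + e_2$
    $r_1^{(2)} = x_1 - \frac{\mathrm{cov}(x_1, x_2)}{\mathrm{var}(x_2)} x_2$
    $= \left\{\lambda_1 - \frac{\mathrm{cov}(x_1, x_2)}{\mathrm{var}(x_2)} (b_{21}\lambda_1 + \lambda_2)\right\} u + \left\{1 - \frac{\mathrm{cov}(x_1, x_2)}{\mathrm{var}(x_2)} b_{21}\right\} e_1 - \frac{\mathrm{cov}(x_1, x_2)}{\mathrm{var}(x_2)} e_2$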
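
A small simulation can illustrate the point of this slide. It assumes the model read off the figure, $x_1 = \lambda_1 u + e_1$ and $x_2 = b_{21} x_1 + \lambda_2 u + e_2$, with unit-variance uniform $u$, $e_1$, $e_2$. The dependence score (absolute correlation of squared values) and all parameter values are hypothetical choices for the sketch, not the procedure of Tashiro et al. (2014).

```python
import numpy as np

def residual(target, regressor):
    """Residual of the least-squares regression of target on regressor."""
    t = target - target.mean()
    r = regressor - regressor.mean()
    return t - ((t @ r) / (r @ r)) * r

def dependence_score(a, b):
    """|corr(a^2, b^2)| after centering; a crude independence check."""
    a = a - a.mean()
    b = b - b.mean()
    return abs(np.corrcoef(a**2, b**2)[0, 1])

def scores(lam1, lam2, b21=1.0, n=20000, seed=0):
    """Dependence between regressor and residual in both directions."""
    rng = np.random.default_rng(seed)
    h = np.sqrt(3)  # uniform(-h, h) has unit variance
    u, e1, e2 = rng.uniform(-h, h, size=(3, n))
    x1 = lam1 * u + e1
    x2 = b21 * x1 + lam2 * u + e2
    return (dependence_score(x1, residual(x2, x1)),   # regress x2 on x1
            dependence_score(x2, residual(x1, x2)))   # regress x1 on x2

no_hidden = scores(lam1=0.0, lam2=0.0)
with_hidden = scores(lam1=1.0, lam2=1.0)
print("no hidden cause  :", np.round(no_hidden, 3))
print("with hidden cause:", np.round(with_hidden, 3))
```

Without the hidden variable, one direction leaves an independent residual. With it, both directions show dependence, so neither regression can be accepted as the causal direction without modelling $U$.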

  11. Semiparametric approach: Causal additive models with unobserved variables
    (Maeda & Shimizu, 2021)
    • Acyclicity (and a kind of faithfulness)
    • Extends LiNGAM in two ways
    – Hidden common causes
    – (Additive) nonlinearity
    • Can be applied to time series cases like structural VAR
    (Maeda & Shimizu, in prep.)
    Model
    $x_i = \sum_{\text{observed } \mathrm{par}(x_i)} f_j^{(i)}(x_j) + \sum_{\text{unobserved } \mathrm{par}(x_i)} g_k^{(i)}(u_k) + e_i$
    [Figure: an underlying structure with observed and unobserved variables, and the output graph over the observed variables]

  12. Codes and Software

  13. Python packages and other no-code tools
    • Semiparametric: LiNGAM (Ikeuchi et al., 2023) and causal-learn (Zheng et al., 2023)
    • Nonparametric: pcalg (Kalisch et al., 2012), causal-learn, Tigramite
    • Commercial software (no-code tools)
    – Causalas by SCREEN AS, Node AI by NTT Communications, Ntech Predict by neutral, Causal analysis by NEC
    [Screenshot: notebook tLiNGAM.IPYNB in Colaboratory, 2019/08/20; numeric literals are illegible and shown as …]
        import numpy as np
        import pandas as pd
        import lingam
        from graphviz import Digraph

        np.set_printoptions(precision=…, suppress=True)
        seed = …
        eps = …e…

        # Create data
        def make_graph(dag):
            d = Digraph(engine='dot')
            if 'coef' in dag:
                for from_, to, coef in zip(dag['from'], dag['to'], dag['coef']):
                    d.edge(f'x{from_}', f'x{to}', label=f'{coef:.2f}')
            else:
                for from_, to in zip(dag['from'], dag['to']):
                    d.edge(f'x{from_}', f'x{to}', label='')
            return d

        dag = {
            'from': […],
            'to': […],
            'coef': […]
        }
        make_graph(dag)
    [Figure: causal graph over x0 to x5 with edge coefficients 3.00, 6.00, 4.00, 8.00, 3.00, 1.00, 2.00]
    [Figure panels]
    • Causal graph
    • Total effects and bootstrap probabilities
    • Model evaluation
    – Independence of error variables
    – Classical SEM model fit indices like RMSEA (semopy)
    – Pearson correlation 0.03; F-correlation (Bach & Jordan) 0.86

  14. Statistical causal inference is a fundamental tool for science
    • Many well-developed methods are available when causal graphs are known from background knowledge
    • Helping draw causal graphs with data is the key: causal discovery
    – LiNGAM-related papers: https://www.shimizulab.org/lingam/lingampapers
    • Next default assumptions:
    – Hidden common causes (Spirtes et al., 1995; Hoyer et al., 2008; Wang & Drton 2023)
    – Mixed data: continuous and discrete variables
    (Sedgewick et al., 2019; Wei et al. 2018; Zeng et al., 2022)
    – (Cyclicity (Lacerda et al., 2008) & non-stationarity (Huang et al., 2019))