
Statistical Rethinking 2022 Lecture 14


Richard McElreath

February 14, 2022

Transcript

  1. Statistical Rethinking
    14: Correlated Varying Effects
    2022

  2. [Title image, no transcribed text]

  3. Varying effects as confounds
    Varying effect strategy: Unmeasured features
    of clusters leave an imprint on the data that can
    be measured by (1) repeat observations of each
    cluster and (2) partial pooling among clusters
    Predictive perspective: Important source of
    cluster-level variation, regularize
    Causal perspective: Competing causes or
    unobserved confounds


  4. [DAG with nodes A (age at marriage), M (marriage), D (divorce)]

  5. [Same DAG with nodes A (age at marriage), M (marriage), D (divorce), plus R (region)]

  6. [DAG with nodes G (grandparents), P (parents), C (children), U (neighborhood)]

  7. [DAG with nodes R, X, S, Y, E, G, P, U; labels visible: treatment (X), participation (P), individual (U)]


  9. [DAG with nodes G1, G2 (governments), X1, X2, W (war)]

  10. [Same DAG with nodes G1, G2 (governments), X1, X2, W (war), plus N1, N2 (nations)]

  11. Varying effects as confounds
    Causal perspective: Competing causes
    or actual confounds
    Advantage over “fixed effect”
    approach: Can include other cluster-
    level (time invariant) causes
    Fixed effects: Varying effects with
    variance fixed at infinity, no pooling
    Don’t panic: Make a generative model
    and draw the owl
    [DAG with nodes G1, G2, N1, N2 (nations), X1, X2, W (war), plus geography nodes]

  12. [Two panels: Response (1–7) by Story (1–12), and Response (1–7) by Participant (0–50)]
    Add clusters: More index variables, more population priors (previous lecture)
    Add features: More parameters, more dimensions in each population prior (this lecture)
    Clusters: tanks, stories, individuals, departments
    Features: survival, treatment effect, average response, admission rate and bias

  13. Adding Features
    One prior distribution for each cluster
    One feature: one-dimensional distribution: α_j ∼ Normal(ᾱ, σ)
    Two features: two-dimensional distribution: [α_j, β_j] ∼ MVNormal([ᾱ, β̄], Σ)
    N features: N-dimensional distribution: α_{j,1..N} ∼ MVNormal(ᾱ, Σ)
    Hard part: learning the associations
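The move from one feature to N features can be checked numerically. A minimal sketch (my own, not lecture code; the numbers are made up) that draws cluster effects [α_j, β_j] from a 2-D population prior and confirms the draws carry the correlation built into Σ:

```python
import numpy as np

rng = np.random.default_rng(1)

# Population prior for two correlated features per cluster:
# [alpha_j, beta_j] ~ MVNormal([abar, bbar], Sigma)
abar, bbar = 0.0, 1.0
sigma_a, sigma_b, rho = 1.0, 0.5, 0.7   # assumed values for illustration
Sigma = np.array([
    [sigma_a**2,              rho * sigma_a * sigma_b],
    [rho * sigma_a * sigma_b, sigma_b**2             ],
])

# Draw varying effects for many clusters; the sample correlation of the
# (alpha, beta) pairs should approach rho
ab = rng.multivariate_normal([abar, bbar], Sigma, size=100_000)
print(np.corrcoef(ab[:, 0], ab[:, 1])[0, 1])  # approximately rho
```

In a real model the associations run the other way: the clusters' repeat observations inform Σ, which is the hard part the slide names.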

  14.–26. [Image-only slides, no transcribed text]

  27. Prosocial chimpanzees
    data(chimpanzees)
    504 trials, 7 actors, 6 blocks
    4 treatments:
    (1) right, no partner

    (2) left, no partner

    (3) right, partner

    (4) left, partner

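The four treatments are combinations of two binary variables: which side the prosocial option is on, and whether a partner is present. The lecture's R code later builds the treatment index as T = 1 + prosoc_left + 2*condition. A sketch of that mapping (Python rather than the lecture's R; the function name is mine):

```python
# Treatment index: T = 1 + prosoc_left + 2*condition
# prosoc_left: 1 if the prosocial option is on the left lever, else 0
# condition:   1 if a partner is present, else 0
def treatment(prosoc_left: int, condition: int) -> int:
    return 1 + prosoc_left + 2 * condition

labels = {1: "right, no partner", 2: "left, no partner",
          3: "right, partner", 4: "left, partner"}

for condition in (0, 1):
    for prosoc_left in (0, 1):
        t = treatment(prosoc_left, condition)
        print(t, labels[t])
```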

  28. [Posterior summary plots: treatment effects on the log-odds scale (R/N, L/N, R/P, L/P); density of sigma_A (actor variation); density of sigma_B (treatment/block variation); probability of pulling left by actor (1–7)]

  29. P_i ∼ Bernoulli(p_i)
    logit(p_i) = ᾱ_{A[i]} + α_{A[i],T[i]} + β̄_{B[i]} + β_{B[i],T[i]}
    (mean + actor-treatment + block-treatment)
    Variables: P = pulled left, T = treatment (side, condition), A = actor, B = block

  30. P_i ∼ Bernoulli(p_i)
    logit(p_i) = ᾱ_{A[i]} + α_{A[i],T[i]} + β̄_{B[i]} + β_{B[i],T[i]}
    ᾱ_{A[i]}: mean for each actor
    α_{A[i],T[i]}: actor offset for each treatment
    β̄_{B[i]}: mean for each block
    β_{B[i],T[i]}: block offset for each treatment

  31. P_i ∼ Bernoulli(p_i)
    logit(p_i) = ᾱ_{A[i]} + α_{A[i],T[i]} + β̄_{B[i]} + β_{B[i],T[i]}
    (mean + actor-treatment + block-treatment)
    α_j ∼ MVNormal([0,0,0,0], R_A, S_A)   for j ∈ 1..7
    (prior for covarying actor-treatment effects: a vector of 4 parameters for each actor j)

  32. P_i ∼ Bernoulli(p_i)
    logit(p_i) = ᾱ_{A[i]} + α_{A[i],T[i]} + β̄_{B[i]} + β_{B[i],T[i]}
    (mean + actor-treatment + block-treatment)
    α_j ∼ MVNormal([0,0,0,0], R_A, S_A)   for j ∈ 1..7
    (prior for covarying actor-treatment effects: a vector of 4 parameters for each actor j)
    β_k ∼ MVNormal([0,0,0,0], R_B, S_B)   for k ∈ 1..6
    (prior for covarying block-treatment effects: a vector of 4 parameters for each block k)

  33. P_i ∼ Bernoulli(p_i)
    logit(p_i) = ᾱ_{A[i]} + α_{A[i],T[i]} + β̄_{B[i]} + β_{B[i],T[i]}
    (mean + actor-treatment + block-treatment)
    α_j ∼ MVNormal([0,0,0,0], R_A, S_A)   for j ∈ 1..7
    (prior for covarying actor-treatment effects: a vector of 4 parameters for each actor j)
    β_k ∼ MVNormal([0,0,0,0], R_B, S_B)   for k ∈ 1..6
    (prior for covarying block-treatment effects: a vector of 4 parameters for each block k)
    ᾱ_j ∼ Normal(0, τ_A), β̄_k ∼ Normal(0, τ_B)
    (priors for the actor and block means)
    S_{A,j}, S_{B,j}, τ_A, τ_B ∼ Exponential(1)   for j ∈ 1..4
    (one standard deviation for each treatment; each gets the same prior)
    R_A, R_B ∼ LKJcorr(4)
    (correlation matrix prior)

  34. [Same model as slide 33, without annotations]

  35. data(chimpanzees)
    d <- chimpanzees
    dat <- list(
        P = d$pulled_left,
        A = as.integer(d$actor),
        B = as.integer(d$block),
        T = 1L + d$prosoc_left + 2L*d$condition )

    m14.2 <- ulam(
        alist(
            P ~ bernoulli(p),
            logit(p) <- abar[A] + a[A,T] + bbar[B] + b[B,T],
            # adaptive priors
            vector[4]:a[A] ~ multi_normal(0,Rho_A,sigma_A),
            vector[4]:b[B] ~ multi_normal(0,Rho_B,sigma_B),
            abar[A] ~ normal(0,tau_A),
            bbar[B] ~ normal(0,tau_B),
            # fixed priors
            c(tau_A,tau_B) ~ exponential(1),
            sigma_A ~ exponential(1),
            Rho_A ~ dlkjcorr(4),
            sigma_B ~ exponential(1),
            Rho_B ~ dlkjcorr(4)
        ) , data=dat , chains=4 , cores=4 )

    P_i ∼ Bernoulli(p_i)
    logit(p_i) = ᾱ_{A[i]} + α_{A[i],T[i]} + β̄_{B[i]} + β_{B[i],T[i]}
    α_j ∼ MVNormal(0, R_A, S_A)
    β_k ∼ MVNormal(0, R_B, S_B)
    ᾱ_j ∼ Normal(0, τ_A), β̄_k ∼ Normal(0, τ_B)
    S_{A,j}, S_{B,j}, τ_A, τ_B ∼ Exponential(1)
    R_A, R_B ∼ LKJcorr(4)

  36.–38. [Repeats of slide 35, highlighting different parts of the code]

  39. [Diagnostic plots: effective sample size (500–2000) vs Rhat (1.00–1.08); HMC energy density; n_eff = 113 for the log-probability]
    37 divergent transitions
    Check yourself before you wreck yourself

  40. PAUSE


  41. “Centered”:
      v ∼ Normal(0, 3)
      x ∼ Normal(0, exp(v))
    “Non-centered”:
      v ∼ Normal(0, 3)
      z ∼ Normal(0, 1)
      x = z exp(v)
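Both columns define the same joint distribution for (v, x); only the sampling geometry changes. A quick numerical check (my own sketch, not lecture code), comparing the spread of x given v near zero under the two schemes:

```python
import numpy as np

rng = np.random.default_rng(2)
N = 200_000

# Centered: x is drawn directly with scale exp(v)
v_c = rng.normal(0.0, 3.0, size=N)
x_c = rng.normal(0.0, np.exp(v_c))

# Non-centered: draw a standard-normal z, then scale deterministically
v_n = rng.normal(0.0, 3.0, size=N)
z = rng.normal(0.0, 1.0, size=N)
x_n = z * np.exp(v_n)

# For v near 0, exp(v) is near 1, so x should have spread near 1 either way
print(x_c[np.abs(v_c) < 0.1].std())  # ~1
print(x_n[np.abs(v_n) < 0.1].std())  # ~1
```

The non-centered column matters for HMC because the sampler only ever sees well-behaved standard-normal z variables; the funnel-shaped dependence between v and x is moved into a deterministic transform.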

  42. Centered:
    P_i ∼ Bernoulli(p_i)
    logit(p_i) = β_{T[i],B[i]} + α_{A[i]}
    β_{j,k} ∼ Normal(0, σ_B)
    α_j ∼ Normal(ᾱ, σ_A)
    ᾱ ∼ Normal(0, 1.5)
    σ_A, σ_B ∼ Exponential(1)

  43. Centered:
    P_i ∼ Bernoulli(p_i)
    logit(p_i) = β_{T[i],B[i]} + α_{A[i]}
    β_{j,k} ∼ Normal(0, σ_B)
    α_j ∼ Normal(ᾱ, σ_A)
    ᾱ ∼ Normal(0, 1.5)
    σ_A, σ_B ∼ Exponential(1)
    Non-centered:
    P_i ∼ Bernoulli(p_i)
    logit(p_i) = ᾱ + (z_{α,A[i]})σ_A + (z_{β,T[i],B[i]})σ_B
    z_{α,j} ∼ Normal(0, 1)
    z_{β,j} ∼ Normal(0, 1)
    ᾱ ∼ Normal(0, 1.5)
    σ_A, σ_B ∼ Exponential(1)

  44. [Repeat of slide 43]

  45. Centered
    How can we factor R and S out of the priors?
    P_i ∼ Bernoulli(p_i)
    logit(p_i) = ᾱ_{A[i]} + α_{A[i],T[i]} + β̄_{B[i]} + β_{B[i],T[i]}
    α_j ∼ MVNormal([0,0,0,0], R_A, S_A)   for j ∈ 1..7
    β_k ∼ MVNormal([0,0,0,0], R_B, S_B)   for k ∈ 1..6
    ᾱ_j ∼ Normal(0, τ_A), β̄_k ∼ Normal(0, τ_B)
    S_{A,j}, S_{B,j}, τ_A, τ_B ∼ Exponential(1)   for j ∈ 1..4
    R_A, R_B ∼ LKJcorr(4)

  46. [Image-only slide, no transcribed text]

  47. P_i ∼ Bernoulli(p_i)
    logit(p_i) = ᾱ_{A[i]} + α_{A[i],T[i]} + β̄_{B[i]} + β_{B[i],T[i]}
    α = ( diag(S_{A,1}, S_{A,2}, S_{A,3}, S_{A,4}) L_A Z_{T,A} )ᵀ
    diag(...): diagonal matrix of standard deviations
    L_A: Cholesky factor of the correlation matrix across treatments
    Z_{T,A}: matrix of treatment-actor z-scores
    ᵀ: transpose! flips rows and columns, so the result is a 7-by-4 matrix

  48. P_i ∼ Bernoulli(p_i)
    logit(p_i) = ᾱ_{A[i]} + α_{A[i],T[i]} + β̄_{B[i]} + β_{B[i],T[i]}
    α = ( diag(S_A) L_A Z_{T,A} )ᵀ
    L_A: Cholesky factor of the correlation matrix across treatments
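This factorization is easy to verify by simulation. A sketch (mine; the R and S values are made up) that composes α = (diag(S) L Z)ᵀ with numpy and checks that the columns recover the intended correlations and standard deviations:

```python
import numpy as np

rng = np.random.default_rng(3)

# Made-up target: 4 per-treatment standard deviations S and correlation matrix R
S = np.array([1.0, 0.5, 2.0, 0.8])
R = np.array([
    [1.0, 0.4, 0.2, 0.1],
    [0.4, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.4],
    [0.1, 0.2, 0.4, 1.0],
])
L = np.linalg.cholesky(R)           # lower-triangular factor: L @ L.T == R

n_actors = 50_000                   # many "actors" so the check is sharp
Z = rng.normal(size=(4, n_actors))  # independent z-scores, one column per actor
alpha = (np.diag(S) @ L @ Z).T      # rows are actors, columns are treatments

print(np.corrcoef(alpha, rowvar=False))  # approximately R
print(alpha.std(axis=0))                 # approximately S
```

Because alpha is a deterministic function of standard-normal Z, a sampler that works with Z instead of alpha never has to explore the correlated, funnel-prone geometry directly.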

  49. [Scanned note: Commandant Benoit, “Note sur une méthode de résolution des équations normales provenant de l'application de la méthode des moindres carrés à un système d'équations linéaires en nombre inférieur à celui des inconnues. Application de la méthode à la résolution d'un système défini d'équations linéaires. (Procédé du Commandant Cholesky.)”]
    [Title in English: “Note on a method for solving the normal equations that arise from applying the method of least squares to a system of linear equations fewer in number than the unknowns. Application of the method to solving a definite system of linear equations. (Commandant Cholesky's procedure.)”]
    André-Louis Cholesky (1875–1918)

  50. [Repeat of slide 49 with the title enlarged]

  51. André-Louis Cholesky (1875–1918)
    “The artillery commander Cholesky, of the Geographical Service of the army, killed during the Great War, devised, during research on the adjustment of geodesic networks, a very ingenious process for solving the equations known as normal, obtained by applying the method of least squares to linear equations fewer in number than the unknowns.”

  52. # define 2D Gaussian with correlation 0.6
    N <- 1e4
    sigma1 <- 2
    sigma2 <- 0.5
    rho <- 0.6

    # independent z-scores
    z1 <- rnorm( N )
    z2 <- rnorm( N )

    # use Cholesky to blend in correlation
    a1 <- z1 * sigma1
    a2 <- ( rho*z1 + sqrt( 1-rho^2 )*z2 )*sigma2

  54. # checks on the 2D Cholesky blend above: the z-scores stay uncorrelated,
    # while a1 and a2 recover the target correlation and standard deviations
    > cor(z1,z2)
    [1] -0.0005542644
    > cor(a1,a2)
    [1] 0.5999334
    > sd(a1)
    [1] 1.997036
    > sd(a2)
    [1] 0.4989456

    α = ( diag(S_A) L_A Z_{T,A} )ᵀ

  55. P_i ∼ Bernoulli(p_i)
    logit(p_i) = ᾱ_{A[i]} + α_{A[i],T[i]} + β̄_{B[i]} + β_{B[i],T[i]}
    (mean + actor-treatment + block-treatment)
    α = ( diag(S_A) L_A Z_{T,A} )ᵀ   (compute alpha from non-centered pieces)
    β = ( diag(S_B) L_B Z_{T,B} )ᵀ   (compute beta from non-centered pieces)

  56. P_i ∼ Bernoulli(p_i)
    logit(p_i) = ᾱ_{A[i]} + α_{A[i],T[i]} + β̄_{B[i]} + β_{B[i],T[i]}
    (mean + actor-treatment + block-treatment)
    α = ( diag(S_A) L_A Z_{T,A} )ᵀ   (compute alpha from non-centered pieces)
    β = ( diag(S_B) L_B Z_{T,B} )ᵀ   (compute beta from non-centered pieces)
    Z_{T,A} ∼ Normal(0, 1)   (matrix of treatment-actor z-scores)
    Z_{T,B} ∼ Normal(0, 1)   (matrix of treatment-block z-scores)

  57. P_i ∼ Bernoulli(p_i)
    logit(p_i) = ᾱ_{A[i]} + α_{A[i],T[i]} + β̄_{B[i]} + β_{B[i],T[i]}
    (mean + actor-treatment + block-treatment)
    α = ( diag(S_A) L_A Z_{T,A} )ᵀ   (compute alpha from non-centered pieces)
    β = ( diag(S_B) L_B Z_{T,B} )ᵀ   (compute beta from non-centered pieces)
    ᾱ = z̄_A τ_A,  β̄ = z̄_B τ_B   (compute mean effects)
    Z_{T,A} ∼ Normal(0, 1)   (matrix of treatment-actor z-scores)
    Z_{T,B} ∼ Normal(0, 1)   (matrix of treatment-block z-scores)
    z̄_{A,j}, z̄_{B,k} ∼ Normal(0, 1)   (mean actor and block z-scores)

  58. P_i ∼ Bernoulli(p_i)
    logit(p_i) = ᾱ_{A[i]} + α_{A[i],T[i]} + β̄_{B[i]} + β_{B[i],T[i]}
    (mean + actor-treatment + block-treatment)
    α = ( diag(S_A) L_A Z_{T,A} )ᵀ,  β = ( diag(S_B) L_B Z_{T,B} )ᵀ   (compute alpha and beta from non-centered pieces)
    ᾱ = z̄_A τ_A,  β̄ = z̄_B τ_B   (compute mean effects)
    Z_{T,A}, Z_{T,B} ∼ Normal(0, 1)   (matrices of treatment-actor and treatment-block z-scores)
    z̄_{A,j}, z̄_{B,k} ∼ Normal(0, 1)   (mean actor and block z-scores)
    S_{A,j}, S_{B,j}, τ_A, τ_B ∼ Exponential(1)   (each standard deviation gets the same prior)
    R_A, R_B ∼ LKJcorr(4)   (correlation matrix prior)

  59. m14.3 <- ulam(
        alist(
            P ~ bernoulli(p),
            logit(p) <- abar[A] + a[A,T] + bbar[B] + b[B,T],
            # adaptive priors - non-centered
            transpars> matrix[A,4]:a <-
                compose_noncentered( sigma_A , L_Rho_A , zA ),
            transpars> matrix[B,4]:b <-
                compose_noncentered( sigma_B , L_Rho_B , zB ),
            matrix[4,A]:zA ~ normal( 0 , 1 ),
            matrix[4,B]:zB ~ normal( 0 , 1 ),
            zAbar[A] ~ normal(0,1),
            zBbar[B] ~ normal(0,1),
            transpars> vector[A]:abar <<- zAbar*tau_A,
            transpars> vector[B]:bbar <<- zBbar*tau_B,
            # fixed priors
            c(tau_A,tau_B) ~ exponential(1),
            vector[4]:sigma_A ~ exponential(1),
            cholesky_factor_corr[4]:L_Rho_A ~ lkj_corr_cholesky(4),
            vector[4]:sigma_B ~ exponential(1),
            cholesky_factor_corr[4]:L_Rho_B ~ lkj_corr_cholesky(4),
            # compute ordinary correlation matrixes
            gq> matrix[4,4]:Rho_A <<- Chol_to_Corr(L_Rho_A),
            gq> matrix[4,4]:Rho_B <<- Chol_to_Corr(L_Rho_B)
        ) , data=dat , chains=4 , cores=4 , log_lik=TRUE )

  60.–62. [Repeats of slide 59, highlighting different parts of the code]

  63. m14.3 <- ulam(
    alist(
    P ~ bernoulli(p),
    logit(p) <- abar[A] + a[A,T] + bbar[B] + b[B,T],
    # adaptive priors - non-centered
    transpars> matrix[A,4]:a <-
    compose_noncentered( sigma_A , L_Rho_A , zA ),
    transpars> matrix[B,4]:b <-
    compose_noncentered( sigma_B , L_Rho_B , zB ),
    matrix[4,A]:zA ~ normal( 0 , 1 ),
    matrix[4,B]:zB ~ normal( 0 , 1 ),
    zAbar[A] ~ normal(0,1),
    zBbar[B] ~ normal(0,1),
    transpars> vector[A]:abar <<- zAbar*tau_A,
    transpars> vector[B]:bbar <<- zBbar*tau_B,
    # fixed priors
    c(tau_A,tau_B) ~ exponential(1),
    vector[4]:sigma_A ~ exponential(1),
    cholesky_factor_corr[4]:L_Rho_A ~ lkj_corr_cholesky(4),
    vector[4]:sigma_B ~ exponential(1),
    cholesky_factor_corr[4]:L_Rho_B ~ lkj_corr_cholesky(4),
    # compute ordinary correlation matrixes
    gq> matrix[4,4]:Rho_A <<- Chol_to_Corr(L_Rho_A),
    gq> matrix[4,4]:Rho_B <<- Chol_to_Corr(L_Rho_B)
    ) , data=dat , chains=4 , cores=4 , log_lik=TRUE )
    P
    i
    ∼ Bernoulli(p
    i
    )
    z ¯
    A.j
    , z¯
    B,k
    ∼ Normal(0,1)
    R
    A
    , R
    B
    ∼ LKJcorr(4)
    Z
    T,A
    ∼ Normal(0,1)
    Z
    T,B
    ∼ Normal(0,1)
    α = (diag(S
    A
    )L
    A
    Z
    T,A)

    β = (diag(S
    B
    )L
    B
    Z
    T,B)

    logit(p
    i
    ) = ¯
    α
    A[i]
    + α
    A[i],T[i]
    + ¯
    β
    B[i]
    + β
    B[i],T[
    S
    A,j
    , S
    B,j
    , τ
    A
    , τ
    B
    ∼ Exponential(1)
    ¯
    α = z ¯
    A
    τ
    A
    , ¯
    β = z¯
    B
    τ
    B

    View Slide

  64. m14.3 <- ulam(
    alist(
    P ~ bernoulli(p),
    logit(p) <- abar[A] + a[A,T] + bbar[B] + b[B,T],
    # adaptive priors - non-centered
    transpars> matrix[A,4]:a <-
    compose_noncentered( sigma_A , L_Rho_A , zA ),
    transpars> matrix[B,4]:b <-
    compose_noncentered( sigma_B , L_Rho_B , zB ),
    matrix[4,A]:zA ~ normal( 0 , 1 ),
    matrix[4,B]:zB ~ normal( 0 , 1 ),
    zAbar[A] ~ normal(0,1),
    zBbar[B] ~ normal(0,1),
    transpars> vector[A]:abar <<- zAbar*tau_A,
    transpars> vector[B]:bbar <<- zBbar*tau_B,
    # fixed priors
    c(tau_A,tau_B) ~ exponential(1),
    vector[4]:sigma_A ~ exponential(1),
    cholesky_factor_corr[4]:L_Rho_A ~ lkj_corr_cholesky(4),
    vector[4]:sigma_B ~ exponential(1),
    cholesky_factor_corr[4]:L_Rho_B ~ lkj_corr_cholesky(4),
    # compute ordinary correlation matrixes
    gq> matrix[4,4]:Rho_A <<- Chol_to_Corr(L_Rho_A),
    gq> matrix[4,4]:Rho_B <<- Chol_to_Corr(L_Rho_B)
    ) , data=dat , chains=4 , cores=4 , log_lik=TRUE )
    P
    i
    ∼ Bernoulli(p
    i
    )
    z ¯
    A.j
    , z¯
    B,k
    ∼ Normal(0,1)
    R
    A
    , R
    B
    ∼ LKJcorr(4)
    Z
    T,A
    ∼ Normal(0,1)
    Z
    T,B
    ∼ Normal(0,1)
    α = (diag(S
    A
    )L
    A
    Z
    T,A)

    β = (diag(S
    B
    )L
    B
    Z
    T,B)

    logit(p
    i
    ) = ¯
    α
    A[i]
    + α
    A[i],T[i]
    + ¯
    β
    B[i]
    + β
    B[i],T[
    S
    A,j
    , S
    B,j
    , τ
    A
    , τ
    B
    ∼ Exponential(1)
    ¯
    α = z ¯
    A
    τ
    A
    , ¯
    β = z¯
    B
    τ
    B

    View Slide

  65. [Figure: effective number of samples per parameter, centered model (horizontal axis, 0–2500) vs non-centered model (vertical axis, 0–2000). Most points lie above the diagonal: the non-centered model yields more effective samples. Entries of the Cholesky factors that are fixed by construction, e.g. L_Rho_A[1,1], report n_eff = NaN.]
    View Slide

  66. [Figure: posterior probability of pulling left, 0–1, by treatment: actor-treatment panels (actors 1–7) and block-treatment panels (blocks 1–6).]
    View Slide

  67. [Figure: the same actor-treatment (actors 1–7) and block-treatment (blocks 1–6) posterior panels, plus posterior densities of the standard deviation among actors and among blocks ("actor means by treatment", "block means by treatment"); variation among blocks is concentrated near zero.]
    View Slide

  68. Correlated Varying Effects
    Priors that learn correlation structure:
    (1) partial pooling across features
    (2) exploit correlations for prediction & causal inference
    Varying effects can be correlated even if the prior doesn’t learn the correlations!
    Ethical obligation to do our best
    View Slide

  69. Course Schedule
    Week 1 Bayesian inference Chapters 1, 2, 3
    Week 2 Linear models & Causal Inference Chapter 4
    Week 3 Causes, Confounds & Colliders Chapters 5 & 6
    Week 4 Overfitting / MCMC Chapters 7, 8, 9
    Week 5 Generalized Linear Models Chapters 10, 11
    Week 6 Ordered categories & Multilevel models Chapters 12 & 13
    Week 7 More Multilevel models Chapters 13 & 14
    Week 8 Multilevel Tactics & Gaussian Processes Chapter 14
    Week 9 Measurement & Missingness Chapter 15
    Week 10 Generalized Linear Madness Chapter 16
    https://github.com/rmcelreath/stat_rethinking_2022

    View Slide

  70. View Slide