Statistical Rethinking - Lecture 18

Lecture 18 - Multilevel models (3) - Varying slopes - Statistical Rethinking: A Bayesian Course with R Examples

Richard McElreath

March 05, 2015
Transcript

  1. Multilevel overdispersion
     • Overdispersion: count data with residual variation greater than expected
     • Implies unmodeled heterogeneity across cases
     • Can estimate that heterogeneity with varying intercepts on each case
     • Estimate a varying intercept for each observation in the data
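The idea of overdispersion as unmodeled heterogeneity can be checked with a small simulation. The sketch below is not from the lecture: it assumes a log link, and the values of `alpha` and `sigma_case` are hypothetical. It compares the variance-to-mean ratio of pure Poisson counts with counts whose rate gets a separate Gaussian intercept for each case.

```r
set.seed(1)
n <- 1e4              # number of simulated cases (hypothetical)
alpha <- 3            # log of the baseline rate (hypothetical)
sigma_case <- 0.5     # std dev of the per-case intercepts (hypothetical)

# Pure Poisson: every case shares the same rate
y_pois <- rpois(n, lambda = exp(alpha))

# Overdispersed: each case gets its own intercept offset
a_case <- rnorm(n, mean = 0, sd = sigma_case)
y_over <- rpois(n, lambda = exp(alpha + a_case))

# Variance-to-mean ratio: near 1 for pure Poisson,
# well above 1 once the rates are heterogeneous
ratio_pois <- var(y_pois) / mean(y_pois)
ratio_over <- var(y_over) / mean(y_over)
print(c(pure = ratio_pois, overdispersed = ratio_over))
```

A multilevel model runs this logic in reverse: it estimates `sigma_case` from the excess variation in the observed counts.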
  2. Multilevel islands
     • Recall Oceanic tools model

     \[
     \begin{aligned}
     T_i &\sim \mathrm{Poisson}(\mu_i) \\
     \log(\mu_i) &= \alpha + \alpha_{\text{island}[i]} + \beta_P \log P_i \\
     \alpha &\sim \mathrm{Normal}(\cdot, \cdot) \\
     \beta_P &\sim \mathrm{Normal}(\cdot, \cdot) \\
     \alpha_{\text{island}} &\sim \mathrm{Normal}(0, \sigma_{\text{island}}) \\
     \sigma_{\text{island}} &\sim \mathrm{HalfCauchy}(\cdot, \cdot)
     \end{aligned}
     \]

     T is total_tools, P is population, and i indexes each island. This is a varying intercept model, but with a varying intercept for every observation. The varying intercepts α_island are residuals for each island, and σ_island ends up being an estimate of the overdispersion among islands.
     • Distribution of island intercepts informs amount of excess variation

         culture     population  contact  total_tools  mean_TU
      1  Malekula          1100  low               13      3.2
      2  Tikopia           1500  low               22      4.7
      3  Santa Cruz        3600  low               24      4.0
      4  Yap               4791  high              43      5.0
      5  Lau Fiji          7400  high              33      5.0
      6  Trobriand         8000  high              19      4.0
      7  Chuuk             9200  high              40      3.8
      8  Manus            13000  low               28      6.6
      9  Tonga            17500  high              55      5.4
     10  Hawaii          275000  low               71      6.6
  3. [Figure 12.6: Posterior predictions for the overdispersed Poisson model, m12.6, plotted as total tools against log population. The shaded regions are, inside to out, the 50%, 80%, and 95% intervals of the expected mean. Marginalizing over the varying intercepts results in a much wider prediction region than we'd expect under a pure Poisson process. Panels: no varying intercepts (no overdispersion); varying intercepts (overdispersion).]

             WAIC  pWAIC  dWAIC  weight    SE   dSE
     m12.6   70.0    4.9    0.0       1  2.65    NA
     m10.12  84.4    3.8   14.5       0  8.94  7.29

     [Figure: deviance and WAIC for m10.12 and m12.6.]
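The weight column in the WAIC table can be reproduced by hand. The sketch below assumes the weights are Akaike-style weights computed from WAIC differences, which is an assumption about how the comparison table was built, and it uses only the rounded WAIC values shown on the slide.

```r
# WAIC values as reported in the comparison table (rounded)
waic <- c(m12.6 = 70.0, m10.12 = 84.4)

# Difference from the best (smallest) WAIC
dwaic <- waic - min(waic)

# Akaike-style weights: relative support, normalized to sum to 1
w <- exp(-0.5 * dwaic) / sum(exp(-0.5 * dwaic))
print(round(w, 2))
```

Because the inputs are rounded, the computed difference (14.4) differs slightly from the 14.5 in the table, but the rounded weights still come out as 1 and 0.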
  4. Kinds of varying effects
     • Varying intercepts: means differ by cluster
     • Varying slopes: effects of predictors vary by cluster
     • Any parameter can be made into a varying effect
       • (1) split into vector of parameters by cluster
       • (2) define population distribution
  5. Varying slopes
     • Why varying slopes?
       • drugs affect people differently
       • after-school programs don't work for everyone
       • not every unit has the same relationship to the predictor
       • variation is important, whether for intervention or inference
     • Average effect misleading?
     • Pooling, shrinkage, mnesia
  6. Café Robot
     • Robot programmed to visit cafés, order coffee, record wait time
     • Visits in morning and afternoon
     • Intercepts: average morning wait
     • Slopes: average difference between afternoon and morning
     • Are intercepts and slopes related?
     • Yes => pooling across parameter types!
     [Figure: wait time (minutes) over alternating morning (M) and afternoon (A) visits. Panels: Café A, Café B.]
  7. Population of Cafés
     [Figure: marginal densities of the intercepts and of the slopes, and the joint population plotted as intercept against slope.]
  8. Population of Cafés
     • 2-dimensional Gaussian distribution
     • vector of means
     • variance-covariance matrix
     [Figure: population of cafés, intercept against slope.]

     The first value in the vector of means is the mean intercept, the wait in the morning. The second value is the mean slope, the difference in wait between afternoon and morning. The matrix of variances and covariances is arranged like this:

     \[
     \begin{pmatrix}
     \sigma_\alpha^2 & \sigma_\alpha \sigma_\beta \rho \\
     \sigma_\alpha \sigma_\beta \rho & \sigma_\beta^2
     \end{pmatrix}
     \]

     The variance in intercepts is σ_α² and the variance in slopes is σ_β². These are found along the diagonal. The other two elements of the matrix are the same, σ_α σ_β ρ. This is the covariance between intercepts and slopes: the product of the two standard deviations and the correlation ρ.
  9. Simulated Cafés
     • 20 cafés
     • 5 days
     • morning & afternoon
     • 200 observations
     [Figure 13.2: 20 cafés sampled from the statistical population. The horizontal axis is the intercept (average morning wait) for each café, a_cafe. The vertical axis is the slope (average difference between afternoon and morning wait) for each café, b_cafe. Also drawn: the multivariate Gaussian population of intercepts and slopes.]
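The simulation described on this slide can be sketched in base R. The population values below (`a`, `b`, `sigma_a`, `sigma_b`, `rho`, `sigma`) are hypothetical choices for illustration, not values read from the slides. The correlated draws use a Cholesky factor rather than a packaged multivariate normal sampler, so that the sketch needs no extra packages.

```r
set.seed(5)
# Hypothetical population parameters (not taken from the slides)
a <- 3.5          # average morning wait (minutes)
b <- -1           # average afternoon-minus-morning difference
sigma_a <- 1      # std dev among intercepts
sigma_b <- 0.5    # std dev among slopes
rho <- -0.7       # correlation between intercepts and slopes

# Covariance matrix of intercepts and slopes
cov_ab <- sigma_a * sigma_b * rho
Sigma <- matrix(c(sigma_a^2, cov_ab,
                  cov_ab,    sigma_b^2), nrow = 2)

# Draw 20 cafés from the bivariate Gaussian via a Cholesky factor
n_cafes <- 20
z <- matrix(rnorm(n_cafes * 2), ncol = 2)
effects <- sweep(z %*% chol(Sigma), 2, c(a, b), "+")
a_cafe <- effects[, 1]
b_cafe <- effects[, 2]

# 5 days x 2 visits (morning, afternoon) = 10 visits per café
n_visits <- 10
afternoon <- rep(0:1, times = n_visits * n_cafes / 2)
cafe <- rep(1:n_cafes, each = n_visits)
sigma <- 0.5      # std dev of waits within a café (hypothetical)
mu <- a_cafe[cafe] + b_cafe[cafe] * afternoon
wait <- rnorm(n_visits * n_cafes, mean = mu, sd = sigma)

d <- data.frame(cafe = cafe, afternoon = afternoon, wait = wait)
str(d)
```

The resulting data frame has one row per visit, with the column names (`cafe`, `afternoon`, `wait`) that the model formulas later in the deck refer to.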
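  10. Varying slopes model
      This is the varying slopes model, with explanation to follow:

      \[
      \begin{aligned}
      W_i &\sim \mathrm{Normal}(\mu_i, \sigma) \\
      \mu_i &= \alpha_{\text{café}[i]} + \beta_{\text{café}[i]} M_i && \text{[linear model]} \\
      \begin{bmatrix} \alpha_{\text{café}} \\ \beta_{\text{café}} \end{bmatrix} &\sim \mathrm{MVNormal}\left( \begin{bmatrix} \alpha \\ \beta \end{bmatrix}, \mathbf{S} \right) && \text{[population of varying effects]} \\
      \mathbf{S} &= \begin{pmatrix} \sigma_\alpha & 0 \\ 0 & \sigma_\beta \end{pmatrix} \mathbf{R} \begin{pmatrix} \sigma_\alpha & 0 \\ 0 & \sigma_\beta \end{pmatrix} && \text{[construct covariance matrix]} \\
      \alpha &\sim \mathrm{Normal}(\cdot, \cdot) && \text{[prior for average intercept]} \\
      \beta &\sim \mathrm{Normal}(\cdot, \cdot) && \text{[prior for average slope]} \\
      \sigma &\sim \mathrm{HalfCauchy}(\cdot, \cdot) && \text{[prior stddev within cafés]} \\
      \sigma_\alpha &\sim \mathrm{HalfCauchy}(\cdot, \cdot) && \text{[prior stddev among intercepts]} \\
      \sigma_\beta &\sim \mathrm{HalfCauchy}(\cdot, \cdot) && \text{[prior stddev among slopes]} \\
      \mathbf{R} &\sim \mathrm{LKJcorr}(\cdot) && \text{[prior for correlation matrix]}
      \end{aligned}
      \]

      The likelihood and linear model need no explanation at this point in the book. But the line which defines the population of varying intercepts and slopes deserves attention.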
  11. Varying slopes model (repeated from slide 10). Highlighted: varying intercepts, varying slopes.
  12. Varying slopes model (repeated from slide 10). Highlighted: multivariate prior.
  13. Varying slopes model (repeated from slide 10). Highlighted: population average intercept, population average slope, covariance matrix.
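  14. Covariance matrix shuffle
      • m-by-m covariance matrix requires estimating
        • m standard deviations (or variances)
        • (m² – m)/2 correlations (or covariances)
        • total of m(m + 1)/2 parameters
      • Several ways to specify priors
      • Conjugate: inverse-Wishart (inv_wishart)
      • inverse-Wishart cannot pull apart stddev and correlations
      • Better to decompose:

      \[
      \begin{aligned}
      A_{ij} &\sim \mathrm{Binomial}(n_i, p_{ij}) \\
      \operatorname{logit} p_{ij} &= \alpha + \alpha_j + (\beta_m + \beta_{mj}) m_{ij} \\
      \alpha &\sim \mathrm{Normal}(\cdot, \cdot) \\
      \beta_m &\sim \mathrm{Normal}(\cdot, \cdot) \\
      \begin{bmatrix} \alpha_j \\ \beta_{mj} \end{bmatrix} &\sim \mathrm{MVNormal}\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \Sigma \right), \quad j = \ldots \\
      \Sigma &= \begin{pmatrix} \sigma_\alpha^2 & \rho \sigma_\alpha \sigma_\beta \\ \rho \sigma_\alpha \sigma_\beta & \sigma_\beta^2 \end{pmatrix}
             = \begin{pmatrix} \sigma_\alpha & 0 \\ 0 & \sigma_\beta \end{pmatrix}
               \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}
               \begin{pmatrix} \sigma_\alpha & 0 \\ 0 & \sigma_\beta \end{pmatrix}
             = \mathbf{S} \mathbf{R} \mathbf{S}
      \end{aligned}
      \]

      Here S is the diagonal matrix of standard deviations and R is the correlation matrix.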
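The decomposition and the parameter count can be verified numerically in base R. The standard deviations and correlation matrix below are hypothetical values for m = 3 varying effects; the point is only that composing S R S and then decomposing again recovers the two pieces separately.

```r
# Hypothetical standard deviations and correlation matrix for m = 3 effects
sigmas <- c(1.0, 0.5, 2.0)
R <- matrix(c( 1.0, -0.7, 0.2,
              -0.7,  1.0, 0.1,
               0.2,  0.1, 1.0), nrow = 3, byrow = TRUE)

# Compose: Sigma = S R S, with S the diagonal matrix of standard deviations
S <- diag(sigmas)
Sigma <- S %*% R %*% S

# Decompose again: standard deviations from the diagonal,
# correlations via stats::cov2cor()
sigmas_back <- sqrt(diag(Sigma))
R_back <- cov2cor(Sigma)

# Parameter count for an m-by-m covariance matrix
m <- length(sigmas)
n_corr <- (m^2 - m) / 2
n_total <- m + n_corr          # equals m * (m + 1) / 2
print(c(stddevs = m, correlations = n_corr, total = n_total))
```

Working with `sigmas` and `R` separately is what allows each to get its own prior, which a single inverse-Wishart prior on `Sigma` does not offer.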
  15. Varying slopes model (repeated from slide 10). Highlighted: build covariance matrix.
  16. Varying slopes model (repeated from slide 10). Highlighted: fixed (non-adaptive) priors.
  17. Varying slopes model (repeated from slide 10). Highlighted: correlation matrix prior.
  18. LKJ Correlation prior
      • After Lewandowski, Kurowicka, and Joe (LKJ) 2009
      • One parameter, eta, specifies concentration or dispersion from identity matrix (no correlations)
      • eta = 1: uniform over correlation matrices
      • eta > 1: stomps on extreme correlations
      • eta < 1: elevates extreme correlations
      [Figure: density of the correlation under eta = 1, eta = 2, and eta = 0.5.]
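For a 2-by-2 correlation matrix the effect of eta can be computed directly. The LKJ density is proportional to det(R)^(eta - 1), and with a single correlation rho the determinant is 1 - rho^2, so (rho + 1)/2 follows a Beta(eta, eta) distribution. The helper `lkj2_density` below is a hypothetical name introduced for this sketch, not a function from any package, and it covers the 2-by-2 case only.

```r
# Density of the single correlation rho in a 2x2 LKJ(eta) prior.
# Kernel: (1 - rho^2)^(eta - 1), i.e. (rho + 1) / 2 ~ Beta(eta, eta);
# the division by 2 is the Jacobian of the change of variable.
lkj2_density <- function(rho, eta) {
  dbeta((rho + 1) / 2, eta, eta) / 2
}

rho <- c(0, 0.5, 0.9)
for (eta in c(0.5, 1, 2)) {
  cat("eta =", eta, ":", round(lkj2_density(rho, eta), 3), "\n")
}
```

With eta = 1 the density is flat at 0.5 across the whole range; eta = 2 puts the most mass near zero, and eta = 0.5 puts the most mass near the extremes.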
  19. Varying slopes estimation

      m13.1 <- map2stan(
          alist(
              # likelihood and linear model
              wait ~ dnorm( mu , sigma ),
              mu <- a_cafe[cafe] + b_cafe[cafe]*afternoon,
              # joint population of intercepts and slopes
              c(a_cafe,b_cafe)[cafe] ~ dmvnorm2(c(a,b),sigma_cafe,Rho),
              # fixed priors
              a ~ dnorm(0,10),
              b ~ dnorm(0,10),
              sigma_cafe ~ dcauchy(0,2),
              sigma ~ dcauchy(0,2),
              Rho ~ dlkjcorr(2)
          ) ,
          data=d ,
          iter=5000 , warmup=2000 , chains=2 )
  20. Varying slopes estimation (code repeated from slide 19)
  21. Varying slopes estimation (code repeated from slide 19)
  22. Posterior correlation
      dmvnorm2 takes the means, c(a,b), a vector of standard deviations, sigma_cafe, and a correlation matrix, Rho. It constructs the covariance matrix internally. If you are interested in the details, you can peek at the raw Stan code with stancode(m13.1).

      Now, instead of looking at the marginal estimates in the precis output, let's go straight to inspecting the posterior distribution of varying effects. First, let's examine the posterior correlation between intercepts and slopes.

      post <- extract.samples(m13.1)
      dens( post$Rho[,1,2] )

      [Figure 13.3: Posterior distribution of the correlation between intercepts and slopes. Blue: posterior distribution, reliably below zero. Also shown: the LKJcorr prior.]
  23. Posterior shrinkage
      [Figure: prior and posterior density of the correlation.]
      [Figure 13.4: Shrinkage in two dimensions. Left: raw unpooled intercepts and slopes (filled blue), compared to partially pooled posterior means (open circles). The gray contours show the inferred population of varying effects. Panels: intercept against slope; morning wait (mins) against afternoon wait (mins).]
  24. [Figure: intercept against slope, showing unpooled and pooled estimates with population contours labeled 10, 30, 50, 80, and 99.]
  25. [Figure 13.4, repeated: shrinkage in two dimensions, shown on the parameter scale (intercept against slope) and on the outcome scale (morning wait against afternoon wait, in minutes).]
  26. Multi-dimensional shrinkage
      • Joint distribution of varying effects pools information across intercepts & slopes
      • Correlation between effects => shrinkage in one dimension induces shrinkage in others
      • Improved accuracy, just like varying intercepts
      [Figure 13.4: Shrinkage in two dimensions, intercept against slope. Right: the same estimates on the outcome scale.]
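One way to see why correlation couples the two kinds of shrinkage is the conditional mean of a bivariate Gaussian. The worked example below uses the notation of the varying slopes model; the numerical values are hypothetical and chosen only for illustration.

```latex
% Conditional mean of a café's slope given its intercept,
% under the bivariate Gaussian population of varying effects
\[
\mathrm{E}\left[\beta_{\text{café}} \mid \alpha_{\text{café}}\right]
  = \beta + \rho \, \frac{\sigma_\beta}{\sigma_\alpha} \left(\alpha_{\text{café}} - \alpha\right)
\]

% Hypothetical values: \alpha = 3.5, \beta = -1, \sigma_\alpha = 1,
% \sigma_\beta = 0.5, \rho = -0.7, and a café with \alpha_{\text{café}} = 4.5
\[
\mathrm{E}\left[\beta_{\text{café}} \mid \alpha_{\text{café}} = 4.5\right]
  = -1 + (-0.7) \cdot \frac{0.5}{1} \cdot (4.5 - 3.5) = -1.35
\]
```

With a negative correlation, learning that a café has a longer-than-average morning wait moves the expected slope below the population average, so information about one parameter changes the estimate of the other. When ρ = 0 the second term vanishes and the two dimensions shrink independently.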
  27. Cross-classified varying slopes
      • More slopes: higher-dimension covariance matrix
      • More clusters: more than one multivariate prior
      • Reconsider chimpanzees data

      Each additional varying slope means one more standard deviation parameter and one more dimension in the correlation matrix. For example, suppose the UCB admissions data also recorded the test score of each applicant. Then we could also include test score as a predictor. But we don't have such a predictor for these data, so instead we'll turn to another data set to go deeper into varying slopes.

      Example: cross-classified chimpanzees with varying slopes. The full model includes both cluster types, varying intercepts, varying slopes, and varying slopes on the interaction between prosoc_left and condition:

      \[
      \begin{aligned}
      L_i &\sim \mathrm{Binomial}(1, p_i) \\
      \operatorname{logit}(p_i) &= A_i + (B_{P,i} + B_{PC,i} C_i) P_i \\
      A_i &= \alpha + \alpha_{\text{actor}[i]} + \alpha_{\text{block}[i]} \\
      B_{P,i} &= \beta_P + \beta_{P,\text{actor}[i]} + \beta_{P,\text{block}[i]} \\
      B_{PC,i} &= \beta_{PC} + \beta_{PC,\text{actor}[i]} + \beta_{PC,\text{block}[i]}
      \end{aligned}
      \]

      Since there are two cluster types, actors and blocks, there are two multivariate Gaussian priors. Both are 3-dimensional in this example.
  28. Cross-classified model (repeated from slide 27), with the prior for actors added. In general, you can choose to have different varying effects in different cluster types.

      \[
      \begin{bmatrix} \alpha_{\text{actor}} \\ \beta_{P,\text{actor}} \\ \beta_{PC,\text{actor}} \end{bmatrix}
      \sim \mathrm{MVNormal}\left( \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}, \mathbf{S}_{\text{actor}} \right)
      \]
  29. Cross-classified model (repeated from slide 28). Highlighted: average effects, actor offsets, block offsets.
  30. Cross-classified varying slopes
      • Need two multivariate priors: actors and blocks
      • Each 3-dimensional with own covariance matrix

      Here are the two priors in this case:

      \[
      \begin{aligned}
      \begin{bmatrix} \alpha_{\text{actor}} \\ \beta_{P,\text{actor}} \\ \beta_{PC,\text{actor}} \end{bmatrix}
      &\sim \mathrm{MVNormal}\left( \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}, \mathbf{S}_{\text{actor}} \right) \\
      \begin{bmatrix} \alpha_{\text{block}} \\ \beta_{P,\text{block}} \\ \beta_{PC,\text{block}} \end{bmatrix}
      &\sim \mathrm{MVNormal}\left( \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}, \mathbf{S}_{\text{block}} \right)
      \end{aligned}
      \]

      What these priors state is that actors and blocks come from two different statistical populations. Within each, the three features of each actor or block are related through a covariance matrix specific to that population. There are no means in these priors, because we already placed the average effects (α, β_P, and β_PC) in the linear models. And the map2stan code for this model looks as you'd expect, given previous examples.
  31. Cross-classified varying slopes

      m13.6_prep <- map2stan(
          alist(
              # likelihood and linear models
              pulled_left ~ dbinom(1,p),
              logit(p) <- A + (BP + BPC*condition)*prosoc_left,
              A <- a + a_actor[actor] + a_block[block_id],
              BP <- bp + bp_actor[actor] + bp_block[block_id],
              BPC <- bpc + bpc_actor[actor] + bpc_block[block_id],
              # average effects
              c(a,bp,bpc) ~ dnorm(0,1),
              # one multivariate prior per cluster type
              c(a_actor,bp_actor,bpc_actor)[actor] ~ dmvnorm2(0,sigma_actor,Rho_actor),
              c(a_block,bp_block,bpc_block)[block_id] ~ dmvnorm2(0,sigma_block,Rho_block),
              sigma_actor ~ dcauchy(0,2),
              sigma_block ~ dcauchy(0,2),
              Rho_actor ~ dlkjcorr(4),
              Rho_block ~ dlkjcorr(4)
          ) ,
          data=d , iter=2 )
      # short run above just builds the model; resample to draw the real chains
      m13.6 <- resample( m13.6_prep , warmup=5000 , iter=2e4 , chains=3 , cores=3 )
  32. Cross-classified varying slopes (code repeated from slide 31)
  33. Cross-classified varying slopes
      • 54 parameters
        • 3 average effects
        • 3x7 = 21 varying effects on actor
        • 3x6 = 18 varying effects on block
        • 6 standard deviations
        • 6 free correlation parameters
      • WAIC says pWAIC ≈ 18

      This model has 54 parameters: 3 average effects, 3 × 7 varying effects on actor, 3 × 6 varying effects on block, 6 standard deviations, and 6 free correlation parameters. You can check them all for yourself with precis(m13.6, depth=2). But effectively the model has only about 18 parameters; check WAIC(m13.6). The two varying effects populations, one for actors and one for blocks, regularize the varying effects themselves. So, as usual, each varying intercept or slope counts less than one effective parameter. We can inspect the standard deviation parameters to get a sense of how aggressively the varying effects are being regularized:

      precis( m13.6 , depth=2 , pars=c("sigma_actor","sigma_block") )

                      Mean  StdDev  lower 0.95  upper 0.95  n_eff  Rhat
      sigma_actor[1]  2.35    0.87        1.03        3.98   1416     1
      sigma_actor[2]  0.47    0.36        0.02        1.16    755     1
      sigma_actor[3]  0.52    0.46        0.03        1.40   1578     1
      sigma_block[1]  0.23    0.20        0.01        0.60   1070     1
      sigma_block[2]  0.57    0.41        0.02        1.32   1068     1
      sigma_block[3]  0.53    0.42        0.02        1.32   1596     1
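The parameter count on this slide is plain arithmetic, sketched below in R. The cluster sizes are read off the slide's own "3x7" and "3x6" bullets; the variable names are introduced here for illustration only.

```r
n_effects <- 3     # intercept, slope, and interaction slope
n_actors <- 7      # from "3x7 = 21 varying effects on actor"
n_blocks <- 6      # from "3x6 = 18 varying effects on block"

n_avg <- n_effects                     # average effects
n_vary_actor <- n_effects * n_actors   # varying effects on actor
n_vary_block <- n_effects * n_blocks   # varying effects on block
n_sd <- 2 * n_effects                  # one sigma vector per cluster type

# Free correlations in one 3x3 correlation matrix, times two cluster types
n_corr <- 2 * (n_effects^2 - n_effects) / 2

total <- n_avg + n_vary_actor + n_vary_block + n_sd + n_corr
print(total)
```

Comparing `total` with the effective number of parameters reported by WAIC (about 18) gives a rough sense of how strongly the two population priors regularize the varying effects.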
  34. Multilevel heuristics
      • Begin with "empty" model with no predictors, but with varying intercepts on clusters of interest
      • Assess where in the model the variation is
      • Add in predictors and vary their slopes
      • Can drop varying effects with tiny sigmas
      • Consider two sorts of posterior prediction
        • Same units: What happened in these data?
        • New units: What might we expect for new units?
      • Your knowledge of the domain trumps all of the above