Statistical Rethinking - Lecture 05

Lecture 05 - Multivariate Models - Statistical Rethinking: A Bayesian Course with R Examples

Richard McElreath

January 21, 2015

Transcript

  1. Does Waffle House cause divorce?

    [Figure: Divorce rate against Waffle Houses per million, with States AL, AR, GA, ME, NJ, OK and SC labeled.]
  2. Does Waffle House cause divorce?

    [Same figure as the previous slide: Divorce rate against Waffle Houses per million.]
  3. The Effect of Country Music on Suicide

    Steven Stack, Wayne State University; Jim Gundlach, Auburn University. Social Forces, September 1992, 71(1):211-218.

    Abstract: This article assesses the link between country music and metropolitan suicide rates. Country music is hypothesized to nurture a suicidal mood through its concerns with problems common in the suicidal population, such as marital discord, alcohol abuse, and alienation from work. The results of a multiple regression analysis of 49 metropolitan areas show that the greater the airtime devoted to country music, the greater the white suicide rate. The effect is independent of divorce, southernness, poverty, and gun availability. The existence of a country music subculture is thought to reinforce the link between country music and suicide. Our model explains 51% of the variance in urban white suicide rates.

    Sociological work on the relationship between art and society has been largely restricted to speculative, sociohistorical theories that are often mutually opposed. Some theorists see art as creating social structure (Adorno 1973), while Sorokin (1937) suggests that society and art are manifested in cyclical autonomous spheres; and still others contend that art is a reflection of social structure (Albrecht 1954). Little empirical work has been done on the impact of music on social problems. While some research has linked music to criminal behavior (Singer, Levine & Jou 1990), the relationship between music and suicide remains largely unexplored. Music is not mentioned in reviews of the literature on suicide (Lester 1983; Stack 1982, 1990b); instead, the impact of art on suicide has been largely restricted to analyses of television movies and soap operas (for a review, see Stack 1990b).
  4. Male Organ and Economic Growth: Does Size Matter?

    Tatu Westling, Department of Political and Economic Studies, University of Helsinki, P.O. Box 17 (Arkadiankatu 7), FI-00014 University of Helsinki, FINLAND.

    Abstract: This paper explores the link between economic development and penile length between 1960 and 1985. It estimates an augmented Solow model utilizing the Mankiw-Romer-Weil 121 country dataset. The size of male organ is found to have an inverse U-shaped relationship with the level of GDP in 1985. It can alone explain over 15% of the variation in GDP. The GDP maximizing size is around 13.5 centimetres, and a collapse in economic development is identified as the size of male organ exceeds 16 centimetres. Economic growth between 1960 and 1985 is negatively associated with the size of male organ, and it alone explains 20% of the variation in GDP growth. With due reservations it is also found to be more important determinant of GDP growth than country's political regime type. Controlling for male organ slows convergence and mitigates the negative effect of population growth on economic development slightly. Although all evidence is suggestive at this stage, the `male organ hypothesis' put forward here is robust to exhaustive set of controls and rests on surprisingly strong correlations.

    JEL Classification: O10, O47. Keywords: economic growth, development, male organ, penile length, Solow model.

    [Figures: GDP 1985 ($) against Male organ (cm); GDP 1985/1960 against Male organ (cm). Visible caption: "Figure 2: GDP ratio between 1985 and 1960 and the size of male organ ... ORGAN in linear form, adjusted R2 = 0.20".]
  5. Regression march

    • Last week: Linear regression
    • This week: Multivariate linear regression
    • Next week: Model comparison
    • Week 5: Interactions
    • Week 6: MCMC
    • Week 7: Generalized linear models (GLMs)
  6. Goals this week

    • Multivariate Gaussian models
    • The good:
        • Reveal spurious correlation
        • Uncover masked association
    • The bad:
        • Correlated predictors
        • Overfitting looms large

    [Figure: Divorce rate against Waffle Houses per million, with States AL, AR, GA, ME, NJ, OK and SC labeled.]
  7. Spurious association

    • Does marriage cause divorce?

    [Figure 5.2, left panel only: Divorce against Marriage.s. The caption is partly cropped; the full caption appears on the next slide.]
  8. Spurious association

    [Figure 5.2: Divorce rate is associated with both marriage rate (left) and median age at marriage (right). Both predictor variables are standardized in this example. The average marriage rate across States is … and the average median age at marriage is …. Left panel: Divorce against Marriage.s. Right panel: Divorce against MedianAgeMarriage.s.]
  9. Multivariate divorce

    • Want to know: what is the value of a predictor, once we know the other predictors?
    • What is the value of knowing marriage rate, once we already know median age at marriage?
    • What is the value of knowing median age at marriage, once we know marriage rate?

    Multivariate notation. Multivariate regression formulas look a lot like the models at the end of the previous chapter: they add more parameters and variables to the definition of µ_i. The strategy is straightforward:

    1. Nominate the predictor variables you want in the linear model of the mean.
    2. For each predictor, make a parameter that will measure its association with the outcome.
    3. Multiply the parameter by the variable and add that term to the linear model.

    Here is the model that predicts divorce rate, using both marriage rate and age at marriage:

    $$
    \begin{aligned}
    D_i &\sim \mathrm{Normal}(\mu_i, \sigma) && \text{[likelihood]} \\
    \mu_i &= \alpha + \beta_m m_i + \beta_a a_i && \text{[linear model]}
    \end{aligned}
    $$
  10. [Annotated version of the model that predicts divorce rate from marriage rate and age at marriage]

    $$
    \begin{aligned}
    D_i &\sim \mathrm{Normal}(\mu_i, \sigma) && \text{[likelihood]} \\
    \mu_i &= \alpha + \beta_m m_i + \beta_a a_i && \text{[linear model]}
    \end{aligned}
    $$

    Annotations: D_i is the divorce rate, m_i is the marriage rate, a_i is the median age at marriage, β_m is the “slope” for marriage rate, and β_a is the “slope” for median age at marriage.
  11. [Figure 5.3: Quadratic approximate posterior for model m5.3. Points show each posterior mode and solid black line segments show percentile intervals. Rows: sigma, ba, bm; horizontal axis: Estimate.]

    The full model, with the prior values as they appear in the R code on this slide:

    $$
    \begin{aligned}
    D_i &\sim \mathrm{Normal}(\mu_i, \sigma) && \text{[likelihood]} \\
    \mu_i &= \alpha + \beta_m m_i + \beta_a a_i && \text{[linear model]} \\
    \alpha &\sim \mathrm{Normal}(10, 10) && \text{[prior for } \alpha \text{]} \\
    \beta_m &\sim \mathrm{Normal}(0, 1) && \text{[prior for } \beta_m \text{]} \\
    \beta_a &\sim \mathrm{Normal}(0, 1) && \text{[prior for } \beta_a \text{]} \\
    \sigma &\sim \mathrm{Uniform}(0, 10) && \text{[prior for } \sigma \text{]}
    \end{aligned}
    $$

    Use whatever symbols you like for the parameters and variables. Here m stands for marriage rate and a for age at marriage, and the same symbols are reused as subscripts.

    R code:

        m5.3 <- map(
            alist(
                Divorce ~ dnorm( mu , sigma ) ,
                mu <- a + bm*Marriage.s + ba*MedianAgeMarriage.s ,
                a ~ dnorm( 10 , 10 ) ,
                bm ~ dnorm( 0 , 1 ) ,
                ba ~ dnorm( 0 , 1 ) ,
                sigma ~ dunif( 0 , 10 )
            ) ,
            data = d )
        precis( m5.3 )
  12. R code and estimates for model m5.3

        m5.3 <- map(
            alist(
                Divorce ~ dnorm( mu , sigma ) ,
                mu <- a + bm*Marriage.s + ba*MedianAgeMarriage.s ,
                a ~ dnorm( 10 , 10 ) ,
                bm ~ dnorm( 0 , 1 ) ,
                ba ~ dnorm( 0 , 1 ) ,
                sigma ~ dunif( 0 , 10 )
            ) ,
            data = d )
        precis( m5.3 )

                Mean StdDev  2.5% 97.5%
        a       9.69   0.20  9.29 10.09
        bm     -0.13   0.28 -0.68  0.42
        ba     -1.13   0.28 -1.68 -0.59
        sigma   1.44   0.14  1.16  1.72

    The posterior mean for marriage rate, bm, is now close to zero, with plenty of probability on both sides of zero. The posterior mean for age at marriage, ba, has actually gotten slightly farther from zero, but is essentially unchanged. It may also help to visualize these posterior distribution estimates:

        plot( precis( m5.3 ) )

    The result, in Figure 5.3, shows the same information as the table above, but now the MAPs are shown by the points and the percentile intervals by the solid horizontal lines.
  13. Interpreting the estimates from precis( m5.3 )

    You can interpret these estimates as saying: Once we know median age at marriage for a State, there is little or no additional predictive power in also knowing the rate of marriage in that State.

    Note that this does not mean that there is no value in knowing marriage rate. If you didn't have access to age-at-marriage data, then you'd definitely find value in knowing the marriage rate. But if you are interested in a causal interpretation, and most research is at some point, then this result is at least consistent with the notion that the effect of …

    [Figure 5.3: Quadratic approximate posterior for model m5.3. Rows: sigma, ba, bm, a; horizontal axis: Estimate.]
  14. Multivariate divorce

    • Once we know median age at marriage, there is little additional value in knowing marriage rate.
    • Once we know marriage rate, there is still value in knowing median age at marriage.
    • If we don't know median age at marriage, it is still useful to know marriage rate.

    [Figure 5.3, partly cropped: Quadratic approximate posterior, with rows sigma, ba, bm, a; horizontal axis: Estimate.]
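
The three statements above can be reproduced on made-up data. The sketch below is an illustration only: it simulates data rather than using the divorce data, it uses least squares (`lm`) in place of the `map` fits shown on the slides, and every variable name and effect size in it is an assumption chosen for the demonstration. Age at marriage drives both marriage rate and divorce, while marriage rate has no direct effect on divorce.

```r
# Simulated illustration (not the divorce data from the lecture).
set.seed(5)
n <- 1000
age     <- rnorm(n)                                     # standardized median age at marriage
rate    <- -0.7 * age + rnorm(n, sd = sqrt(1 - 0.7^2))  # marriage rate, correlated with age
divorce <- 10 - 1.0 * age + rnorm(n, sd = 1.5)          # divorce depends on age only

# Without age in the model, marriage rate looks like a useful predictor
fit_rate <- lm(divorce ~ rate)
# With both predictors, the marriage-rate slope shrinks toward zero
fit_both <- lm(divorce ~ rate + age)

b_rate_alone     <- unname(coef(fit_rate)["rate"])
b_rate_given_age <- unname(coef(fit_both)["rate"])
b_age_given_rate <- unname(coef(fit_both)["age"])

print(round(c(rate_alone     = b_rate_alone,
              rate_given_age = b_rate_given_age,
              age_given_rate = b_age_given_rate), 2))
```

In this simulation the slope for `rate` is clearly positive when `rate` is the only predictor and close to zero once `age` is included, while the slope for `age` stays clearly negative. That is the same pattern the slide describes for marriage rate and median age at marriage.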
  15. Plotting multivariate models

    • Lots of plotting options now
        1. Predictor residual plots
        2. Counterfactual plots
        3. Posterior prediction plots
        4. Invent your own

    [Figure 5.2: Divorce rate is associated with both marriage rate (left) and median age at marriage (right). Left panel: Divorce against Marriage.s. Right panel: Divorce against MedianAgeMarriage.s.]
  16. Predictor residual plots

    • Goal: Show association of each predictor with outcome, “controlling” for other predictors
    • Recipe:
        1. Regress predictor on other predictors
        2. Compute predictor residuals
        3. Regress outcome on residuals
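
The recipe can be sketched end to end in base R. This is a minimal sketch under stated assumptions: the data are simulated, the names `age`, `rate` and `divorce` are placeholders, and `lm` (least squares) stands in for the `map` fits used in the lecture. With least squares, the slope from step 3 equals the slope for the same predictor in the full multiple regression exactly; with the weakly informative priors on the slides the match would be expected to be close rather than exact.

```r
# Simulated stand-in data; names and effect sizes are illustrative assumptions.
set.seed(16)
n <- 50
age     <- rnorm(n)
rate    <- -0.7 * age + rnorm(n, sd = 0.7)
divorce <- 10 - 1.0 * age + rnorm(n, sd = 1.5)

# 1. Regress the predictor on the other predictor
fit_pred <- lm(rate ~ age)

# 2. Compute predictor residuals: observed value minus expected value
rate_resid <- rate - fitted(fit_pred)

# 3. Regress the outcome on the residuals
fit_resid <- lm(divorce ~ rate_resid)

# For comparison: the full multiple regression
fit_full <- lm(divorce ~ rate + age)

slope_resid <- unname(coef(fit_resid)["rate_resid"])
slope_full  <- unname(coef(fit_full)["rate"])
print(c(from_residuals = slope_resid, from_full_model = slope_full))
```

The residuals from step 2 are, by construction, uncorrelated with `age`, which is why the slope in step 3 can be read as the association of marriage rate with divorce after "controlling" for age.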
  17. 1. Predictor on predictor

    • Regress marriage rate on median age at marriage

    A predictor residual just leaves in the variation that is not expected by the model of the mean, µ, as a function of the other predictors. In our multivariate model of divorce rate, we have two predictors: (1) marriage rate (Marriage.s) and (2) median age at marriage (MedianAgeMarriage.s). To compute predictor residuals for either, we just use the other predictor to model it. So for marriage rate, this is the model we need:

    $$
    \begin{aligned}
    m_i &\sim \mathrm{Normal}(\mu_i, \sigma) \\
    \mu_i &= \alpha + \beta a_i \\
    \alpha &\sim \mathrm{Normal}(0, 10) \\
    \beta &\sim \mathrm{Normal}(0, 1) \\
    \sigma &\sim \mathrm{Uniform}(0, 10)
    \end{aligned}
    $$

    As before, m is marriage rate and a is median age at marriage. Note that since we standardized both variables, we already expect the mean α to be around zero. So I've centered α's prior there, but it's still so flat that it hardly matters. This code will fit the model:

        m5.4 <- map(
            alist(
                Marriage.s ~ dnorm( mu , sigma ) ,
                mu <- a + b*MedianAgeMarriage.s ,
                a ~ dnorm( 0 , 10 ) ,
                b ~ dnorm( 0 , 1 ) ,
                sigma ~ dunif( 0 , 10 )
            ) ,
            data = d )

    And then we compute the residuals by subtracting the predicted marriage rate, based on age at marriage, from the observed marriage rate in each State.
  18. [Figure: Marriage.c against MedianAgeMarriage.c.]
  19. 2. Compute residuals

    • Residual: distance of each outcome from expectation

    R code:

        m5.4 <- map(
            alist(
                Marriage.s ~ dnorm( mu , sigma ) ,
                mu <- a + b*MedianAgeMarriage.s ,
                a ~ dnorm( 0 , 10 ) ,
                b ~ dnorm( 0 , 1 ) ,
                sigma ~ dunif( 0 , 10 )
            ) ,
            data = d )

    And then we compute the residuals by subtracting the predicted rate, based on age at marriage, from the observed marriage rate in each State:

        # compute expected value at MAP, for each State
        mu <- coef(m5.4)['a'] + coef(m5.4)['b']*d$MedianAgeMarriage.s
        # compute residual for each State
        m.resid <- d$Marriage.s - mu

    When a residual is positive, that means that the observed rate was in excess of what we'd expect, given the median age at marriage in that State. When a residual is negative, that means the observed rate was below what we'd expect. In simpler terms, States with positive residuals marry fast for their age of marriage, while States with negative residuals marry slow for their age of marriage. It'll help to plot the relationship between these two variables, and show the residuals as well. Here's some code to do just that, drawing a gray line segment for each residual, for each State (the code is cut off on the slide):

        plot( Marriage.s ~ MedianAgeMarriage.s , d , col=rangi2 )
        abline( m5.4 )
  20. [Figure 5.4, caption partly cropped: Marriage.s against MedianAgeMarriage.s, with one line segment labeled "residual".]
  21. [Figure 5.4: Residual marriage rate in each State, after accounting for the linear association with median age at marriage. Each gray line segment is a residual, the distance of each observed marriage rate from the expected value, attempting to predict marriage rate with median age at marriage alone. So States that lie above the black regression line have higher rates of marriage than expected, according to age at marriage. Those below the line have lower rates than expected. Axes: Marriage.s against MedianAgeMarriage.s.]

    Annotations:
    • marriage rate < expectation: “low rate for age of marriage”
    • marriage rate > expectation: “high rate for age of marriage”
  22. slower / faster

    [Figure 5.4: Residual marriage rate in each State, after accounting for the linear association with median age at marriage. Axes: Marriage.s against MedianAgeMarriage.s.]

    [Figure 5.5, partly cropped: Predictor residual plots for the divorce data. States with fast marriage rates for their median age of marriage have about the same divorce rates as do States with slow marriage rates … Visible panel: Divorce rate against marriage rate residuals (slower to faster).]
  23. 3. Outcome on residuals

    • How is divorce associated with residual marriage rate?

    States with fast/slow rates of marriage (for age of marriage) do not (on average) have fast/slow divorce rates.

    [Figures, partly cropped: Marriage.s against MedianAgeMarriage.s; Figure 5.5: Divorce rate against marriage rate residuals (slower to faster).]
  24. Figure 5.5

    [Figure 5.5: Predictor residual plots for the divorce data. Left: Divorce rate against marriage rate residuals (slower to faster). Right: Divorce rate against age of marriage residuals (younger to older). Additional panels: Marriage.c against MedianAgeMarriage.c, and MedianAgeMarriage.c against Marriage.c.]
  25. Statistical “control”

    • Multiple linear regression answers the question: How is each predictor associated with the outcome, once we know all the other predictors?
    • Uses the model to build expected outcomes: not magic!
    • Don't get cocky: Marriage rate may still be associated with divorce, for some subset of States
    • Can't make strong causal inferences from averages; need data on individuals

    [Figure 5.4: Residual marriage rate in each State, after accounting for the linear association with median age at marriage.]

    [Figure 5.5: Predictor residual plots for the divorce data. Left: Divorce rate against marriage rate residuals (slower to faster). Right: Divorce rate against age of marriage residuals (younger to older).]
  26. Counterfactual plots

    • Goal: Explore model implications for outcomes
    • Fix other predictor(s)
    • Compute predictions across values of predictor
    • Compute for unobserved (impossible?) cases, hence “counterfactual”

    [Figure 5.6: Counterfactual plots for the multivariate divorce model, m5.3. Left: Divorce against Marriage.s, with MedianAgeMarriage.s = 0. Right: Divorce against MedianAgeMarriage.s, with Marriage.s = 0.]
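
The steps above can be sketched in base R. This is an illustration under assumptions, not the lecture's code: the data are simulated, `lm` and `predict` replace the `map`-based workflow, and the names `rate_seq` and `pred_data` are placeholders. One predictor is varied over a grid while the other is held at its average, and two intervals are computed at each grid value: a narrow one for the predicted mean and a wider one for individual outcomes.

```r
# Simulated stand-in data; names and effect sizes are illustrative assumptions.
set.seed(26)
n <- 50
age     <- rnorm(n)
rate    <- -0.7 * age + rnorm(n, sd = 0.7)
divorce <- 10 - 1.0 * age + rnorm(n, sd = 1.5)
fit <- lm(divorce ~ rate + age)

# Counterfactual cases: vary marriage rate, hold age fixed at its average.
# The observed ages are not used, only their mean.
rate_seq  <- seq(from = -3, to = 3, length.out = 30)
pred_data <- data.frame(rate = rate_seq, age = mean(age))

# Interval for the predicted mean (narrow)
mu_ci  <- predict(fit, newdata = pred_data, interval = "confidence")
# Interval for individual outcomes (wide)
obs_pi <- predict(fit, newdata = pred_data, interval = "prediction")

ci_width <- mu_ci[, "upr"] - mu_ci[, "lwr"]
pi_width <- obs_pi[, "upr"] - obs_pi[, "lwr"]
print(round(head(cbind(rate = rate_seq, mu_ci), 3), 2))
```

Because the other predictor is held fixed, the predicted mean changes along the grid by exactly the fitted slope for `rate`. Some of the grid cases combine values that never occur together in the data, which is what makes the plot counterfactual.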
  27. Figure 5.6

    • Change marriage rate, without changing median age at marriage?
    • Change median age at marriage, without changing marriage rate?

    [Figure 5.6: Counterfactual plots for the multivariate divorce model, m5.3. Each plot shows the change in predicted mean across values of a single predictor, holding the other predictor constant at its mean value (zero in both cases). Shaded regions show percentile intervals of the mean (dark, narrow) and prediction intervals (light, wide). Left: Divorce against Marriage.s, with MedianAgeMarriage.s = 0. Right: Divorce against MedianAgeMarriage.s, with Marriage.s = 0.]

    The visible tail of the R code draws both shaded regions with shade(). The strategy is to build a new list of data that describe the counterfactual cases we wish to simulate predictions for. The list named pred.data holds these cases. Note that the observed values for MedianAgeMarriage.s are not used. Instead we compute the average value and then use this average inside the linear model. So Marriage.s changes across …
  28. Posterior predictions

    • Goal: Compute implied predictions for observed cases
    • Check model fit: golems do make mistakes
    • Find model failures, stimulate new ideas
    • Always average over the posterior distribution
    • Using only MAP leads to overconfidence
    • Embrace the uncertainty
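
The overconfidence point can be illustrated with a least-squares analogue. This is only a sketch under stated assumptions: the data are simulated, `lm` stands in for the Bayesian fit, and the variable names are placeholders. A "plug-in" interval treats the point estimates as exact and uses only the residual noise, which is comparable in spirit to predicting from the MAP alone. The interval from `predict` also carries uncertainty in the coefficients, which plays the role of averaging over the posterior.

```r
# Simulated stand-in data; names and effect sizes are illustrative assumptions.
set.seed(28)
n <- 50
age     <- rnorm(n)
rate    <- -0.7 * age + rnorm(n, sd = 0.7)
divorce <- 10 - 1.0 * age + rnorm(n, sd = 1.5)
fit <- lm(divorce ~ rate + age)

# Plug-in interval width: point estimates treated as exact
sigma_hat    <- summary(fit)$sigma
q            <- qt(0.975, df = fit$df.residual)
plugin_width <- 2 * q * sigma_hat

# Interval width that also includes uncertainty in the coefficients
observed   <- data.frame(rate = rate, age = age)
full       <- predict(fit, newdata = observed, interval = "prediction")
full_width <- full[, "upr"] - full[, "lwr"]

# Prediction error for each observed case: observed minus predicted
divorce_resid <- divorce - full[, "fit"]

print(round(c(plugin = plugin_width, full_min = min(full_width),
              full_max = max(full_width)), 2))
```

For every observed case the full interval is wider than the plug-in interval, and the gap is largest for cases far from the center of the predictors. The vector `divorce_resid` is the quantity a posterior check would plot per case: negative values mean less divorce than the model expects, positive values mean more.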
  29. Posterior validation check

    postcheck(m5.3)

    [Figure, two panels: Divorce by case, for cases 1 to 25 and cases 26 to 50.]
  30. Figure 5.7: Predicted compared to observed

    [Figure 5.7, panel (a): Predicted divorce against Observed divorce, with individual States labeled. Panels (b) and (c) are only partly visible.]
  31. Figure 5.7: Distribution of residuals for each State

    • Negative residual: less divorce than expected
    • Positive residual: more divorce than expected

    [Figure 5.7: Predicted divorce against Observed divorce with States labeled; residuals for each State; panel (c): Divorce error against Waffles per capita, with AL, AR, GA, ID, ME, MS and SC labeled.]
  32. Figure 5.7

    [Figure 5.7: Posterior predictive plots for the multivariate divorce model, m5.3. (a) Predicted divorce rate against observed, with confidence intervals of the average prediction. The dashed line shows perfect prediction. Panel (c): Divorce error against Waffles per capita, with AL, AR, GA, ID, ME, MS and SC labeled.]