
Statistical Rethinking Fall 2017 Lecture 18


Week 10, Lecture 18, Statistical Rethinking: A Bayesian Course with Examples in R and Stan. This lecture covers Chapter 14 of the book.


Richard McElreath

January 24, 2018

Transcript

  1. Week 10: Missing Data & Other Opportunities Richard McElreath Statistical

    Rethinking
  2. 1 2 3

  3. 1 2 3 You are served: Probability other side is

    burnt?
  4. Avoid being clever • Intuition terrible guide to probability •

    No need to be clever; just ruthlessly apply conditional probability • Pr(want to know|already know)
  5-8. Pr(want to know|already know)

    In this case, we know the up side is burnt. We want to know whether or not the down side is burnt. The definition of conditional probability tells us:

    $$\Pr(\text{burnt down} \mid \text{burnt up}) = \frac{\Pr(\text{burnt up}, \text{burnt down})}{\Pr(\text{burnt up})}$$

    This is just the definition of conditional probability, labeled with our pancake problem. We want to know if the down side is burnt, and the information we have is that the up side is burnt. We condition on the information, so we update our state of information in light of it. The definition tells us that the probability we want is just the probability of the burnt/burnt pancake divided by the probability of seeing a burnt side up. The probability of the burnt/burnt pancake is 1/3, because a pancake was selected at random. The probability the up side is burnt must average over each way we can get dealt a burnt top side of the pancake. This is:

    $$\Pr(\text{burnt up}) = \Pr(BB)(1) + \Pr(BU)(0.5) + \Pr(UU)(0) = (1/3) + (1/3)(1/2) = 0.5$$

    All that remains is the probability of getting the pancake that is burnt on both sides, and this is 1/3 from the problem definition. So all together:

    $$\Pr(\text{burnt down} \mid \text{burnt up}) = \frac{1/3}{1/2} = \frac{2}{3}$$

    If you don't quite believe this answer, you can do a quick simulation to confirm it.
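A simulation of the pancake problem might look like the following minimal base R sketch. The function name `sim_pancake`, the seed, and the number of replicates are illustrative choices, not taken from the slides.

```r
# Simulate one pancake and return its sides as (up, down).
# 1 = burnt side, 0 = unburnt side.
sim_pancake <- function() {
  pancakes <- matrix(c(1, 1,   # burnt/burnt
                       1, 0,   # burnt/unburnt
                       0, 0),  # unburnt/unburnt
                     nrow = 3, byrow = TRUE)
  sides <- pancakes[sample(1:3, 1), ]  # pick one pancake at random
  sample(sides)                        # random side lands up
}

set.seed(1)
sims <- replicate(1e4, sim_pancake())  # 2 x 10000 matrix
up <- sims[1, ]
down <- sims[2, ]

# Pr(burnt down | burnt up): keep only the cases where the up side is burnt
p_sim <- sum(up == 1 & down == 1) / sum(up == 1)
p_sim
```

The estimate should land close to 2/3, not the intuitive 1/2, because the burnt/burnt pancake has two ways to show a burnt side up.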
  9. Getting Ruthless • Express information as constraints and distributions =>

    let logic discover implications • No need to be clever • Examples: • Measurement error • Missing data
  10. Decolonizing Bayes • Bayes taught using non-Bayesian vocabulary • “data”:

    An observed variable • “parameter”: An unobserved variable • “likelihood”: Probability assignment for observed var • “prior”: Probability assignment for unobserved var • Even term “Bayesian” not Bayesian! • Distinction btw data and parameter relevant after observation • Can exploit this fact to address common modeling issues Sir Ronald Fisher (1890–1962) named it “Bayesian”
  11. Measurement error • Measurement always entails error • Typical linear

    regression: interpret sigma as “error” on outcome • What if error isn’t constant? • What if error is on predictors?
  12. Error on outcome • data(WaffleDivorce) • Consider error on outcome,

    divorce rate • Heterogeneity in error • Small State => large error

    [Figure 14.1. Left: Divorce rate by median age of marriage, States of the United States. Vertical bars show plus and minus one standard deviation of the Gaussian uncertainty in measured divorce rate. Right: Divorce rate, again with standard deviations, against log population of each State. Smaller States produce more uncertain estimates.]
  13. Error on outcome • Approach: • Treat true divorce rate

    as unknown parameter • Observed rate is sample from Gaussian distribution:

    $$D_{\text{OBS},i} \sim \text{Normal}(D_{\text{EST},i}, D_{\text{SE},i})$$

    Here $D_{\text{OBS},i}$ is the observed rate (data), $D_{\text{EST},i}$ is the true rate (parameter), and $D_{\text{SE},i}$ is the std error (data).

    As always, the Gaussian choice is not equivalent to assuming that the measurement error is actually Gaussian. It's just the most conservative assumption [...]. For each observed value $D_{\text{OBS},i}$, there will be one parameter $D_{\text{EST},i}$, defined by the distribution above. All this does is define the observed value $D_{\text{OBS},i}$ as having the specified Gaussian distribution centered on $D_{\text{EST},i}$.
  14-16. Error on outcome: model • likelihood for each observation: $D_{\text{OBS},i} \sim \text{Normal}(D_{\text{EST},i}, D_{\text{SE},i})$ • likelihood for each estimate: $D_{\text{EST},i} \sim \text{Normal}(\mu_i, \sigma)$ • estimate (divorce rate estimates): $D_{\text{EST},i}$ • standard error of observation: $D_{\text{SE},i}$

    So this means the above defines a probability for each State's divorce rate, given a known measurement error, and tries to estimate the plausible true values consistent with the observation. And then we also use these $D_{\text{EST}}$ values as data in the regression equation. This will not only allow us to estimate coefficients for predictions that take into account the uncertainty in the outcome, but it will also update the prior for divorce rate in each State. Here's what the model looks like:

    $$\begin{aligned}
    D_{\text{EST},i} &\sim \text{Normal}(\mu_i, \sigma) \\
    \mu_i &= \alpha + \beta_A A_i + \beta_R R_i \\
    D_{\text{OBS},i} &\sim \text{Normal}(D_{\text{EST},i}, D_{\text{SE},i}) \\
    \alpha &\sim \text{Normal}(0, 10) \\
    \beta_A &\sim \text{Normal}(0, 10) \\
    \beta_R &\sim \text{Normal}(0, 10) \\
    \sigma &\sim \text{Cauchy}(0, 2.5)
    \end{aligned}$$
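Reading the model on this slide generatively can help: true rates come from the regression, and observed rates are the true rates plus measurement error of known size. The base R sketch below simulates data that way. All parameter values, the seed, and the choice of 1000 simulated units (rather than 50 States) are hypothetical, picked only so the pattern is easy to see.

```r
set.seed(16)
n <- 1000                          # simulated units (hypothetical, not the 50 States)
A <- rnorm(n)                      # standardized median age at marriage (simulated)
R <- rnorm(n)                      # standardized marriage rate (simulated)
a <- 9.7; bA <- -1; bR <- 0        # hypothetical regression parameters
sigma <- 1                         # hypothetical residual sd

mu <- a + bA * A + bR * R              # linear model
div_est <- rnorm(n, mu, sigma)         # true rates: D_EST ~ Normal(mu, sigma)
div_sd <- runif(n, 0.3, 2.5)           # known standard errors, varying by unit
div_obs <- rnorm(n, div_est, div_sd)   # observed: D_OBS ~ Normal(D_EST, D_SE)

# Units with larger standard errors sit farther from their true rate
err <- abs(div_obs - div_est)
cor(div_sd, err)
```

Fitting the model reverses this process: given `div_obs` and `div_sd`, it infers `div_est` together with the regression parameters.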
  17. Error on outcome: fitting

    A cool implication that will arise here is that information flows in both directions: the uncertainty in measurement influences the regression parameters in the linear model, and the regression parameters in the linear model also influence the uncertainty in the measurements. Here is the map2stan version of the model:

        # load the data named on the earlier slide
        library(rethinking)
        data(WaffleDivorce)
        d <- WaffleDivorce

        # observed values, their standard errors, and the predictors
        dlist <- list(
            div_obs=d$Divorce ,
            div_sd=d$Divorce.SE ,
            R=d$Marriage ,
            A=d$MedianAgeMarriage )

        m14.1 <- map2stan(
            alist(
                div_est ~ dnorm(mu,sigma),          # true rates
                mu <- a + bA*A + bR*R,
                div_obs ~ dnorm(div_est,div_sd),    # observed rates
                a ~ dnorm(0,10),
                bA ~ dnorm(0,10),
                bR ~ dnorm(0,10),
                sigma ~ dcauchy(0,2.5)
            ) ,
            data=dlist ,
            start=list(div_est=dlist$div_obs) ,     # start estimates at observed values
            WAIC=FALSE ,
            iter=5000 , chains=2 )

    There are two things to note in this code. First, I've turned off WAIC calculation, because the default code in WAIC will not compute the likelihood correctly, by integrating over the ...
  18. [Figure 14.1. Left: Divorce rate by median age of marriage, States of the United States. Vertical bars show plus and minus one standard deviation of the Gaussian uncertainty in measured divorce rate. Right: Divorce rate, again with standard deviations, against log population of each State. Smaller States produce more uncertain estimates.]

    [Figure 14.2. Left: Shrinkage resulting from modeling the measurement error (divorce estimated minus divorce observed, against divorce observed standard error). The less error in the original measurement, the less shrinkage in the posterior estimate. Right: Comparison of regression that ignores measurement error (dashed line and gray shading) with regression that incorporates measurement error (blue line and shading), plotted as divorce rate (posterior) against median age at marriage. The points and line segments ...]
  19. Error on outcome: results • Divorce rate estimates move away from

    observed values. • Why?

    [Figure 14.2. Left: divorce estimated minus divorce observed, against divorce observed standard error. Right: divorce rate (posterior) against median age at marriage.]
  20. Error on outcome: results • Q: Why do divorce rate

    estimates move? • A: Pooling! • Small States have highly uncertain rates => low influence on regression • Large States have more certain rates => high influence on regression • Divorce estimates should be consistent with regression => update estimates of each State’s divorce rate • Noisier estimates shrink more

    [Figure 14.2, left panel: divorce estimated minus divorce observed, against divorce observed standard error.]
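The "noisier estimates shrink more" point can be sketched in base R. This assumes a simplification: the regression expectation `mu` and residual sd `sigma` are held fixed at hypothetical values, whereas the full model estimates them jointly with the true rates. Under that assumption the posterior mean of a true rate is a precision-weighted average of the observation and the regression expectation.

```r
# Conditional posterior mean of a true value, given
#   obs  ~ Normal(true, se)   and   true ~ Normal(mu, sigma)
# with mu and sigma treated as known (a simplification).
shrink <- function(obs, se, mu, sigma) {
  w <- (1 / se^2) / (1 / se^2 + 1 / sigma^2)  # weight on the observation
  w * obs + (1 - w) * mu
}

mu <- 9      # hypothetical regression expectation
sigma <- 1   # hypothetical residual sd
obs <- 12    # same observed rate for both hypothetical States

est_precise <- shrink(obs, se = 0.5, mu, sigma)  # small standard error
est_noisy   <- shrink(obs, se = 2.0, mu, sigma)  # large standard error
c(precise = est_precise, noisy = est_noisy)
```

With these values the precisely measured State moves from 12 to 11.4, while the noisily measured one moves to 9.6, much closer to the regression expectation.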
  21. Error on predictor • What about error on predictor? •

    Many procedures invented • errors-in-variables • reduced major axis • total least squares • Our approach will be logical • State information • Deduce implications • Garbage in? You know what comes out.

    [Figure: marriage rate against log population.]
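One way to see why error on a predictor matters is to simulate it and ignore it. The base R sketch below uses made-up values (true slope 1, measurement error sd 1, an arbitrary seed) and shows the well-known attenuation of the slope when the noisy predictor is used as if it were exact. It illustrates the problem only; it is not the modeling approach the lecture goes on to use.

```r
set.seed(14)
n <- 1e4
x_true <- rnorm(n)                     # true predictor, sd 1
y <- 1 * x_true + rnorm(n, sd = 0.5)   # true slope is 1
x_obs <- x_true + rnorm(n, sd = 1)     # predictor measured with error, sd 1

# Ordinary regressions on the true and on the noisy predictor
slope_true <- unname(coef(lm(y ~ x_true))[2])
slope_obs  <- unname(coef(lm(y ~ x_obs))[2])
c(true_x = slope_true, noisy_x = slope_obs)
```

With equal variance in the predictor and in its error, the slope on the noisy predictor comes out near half the true slope.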
  22-23. Error on predictor: model • likelihood for each observed rate: $R_{\text{OBS},i} \sim \text{Normal}(R_{\text{EST},i}, R_{\text{SE},i})$ • estimated marriage rate: $R_{\text{EST},i}$ • standard error of marriage rate: $R_{\text{SE},i}$ • use estimates in regression: $\mu_i = \alpha + \beta_A A_i + \beta_R R_{\text{EST},i}$

    Error on both outcome and predictor. What happens when there is measurement error on predictor variables as well? The approach is the same: we define parameters, one for each observed value, and then make those parameters the means of Gaussian distributions with known standard deviations. In the divorce data, the marriage rate predictor values also come with standard errors. So let's incorporate that information as well. Here's the new model:

    $$\begin{aligned}
    D_{\text{EST},i} &\sim \text{Normal}(\mu_i, \sigma) \\
    \mu_i &= \alpha + \beta_A A_i + \beta_R R_{\text{EST},i} \\
    D_{\text{OBS},i} &\sim \text{Normal}(D_{\text{EST},i}, D_{\text{SE},i}) \\
    R_{\text{OBS},i} &\sim \text{Normal}(R_{\text{EST},i}, R_{\text{SE},i}) \\
    \alpha &\sim \text{Normal}(0, 10) \\
    \beta_A &\sim \text{Normal}(0, 10) \\
    \beta_R &\sim \text{Normal}(0, 10) \\
    \sigma &\sim \text{Cauchy}(0, 2.5)
    \end{aligned}$$

    The $R_{\text{EST}}$ parameters will hold the posterior distributions of the true marriage rates. Fitting the model is much like before: dlist <- list( div_obs=d$Divorce , ...
  24. Filled circles: observed • Open circles: estimated • Lines connect points for

    same State

    [Figure 14.3, right panel: divorce rate (posterior) against marriage rate (posterior).]
  25. Error on predictor • Both divorce rate and marriage rate

    shrink • Divorce shrinks much more. Why? • Marriage rate not strongly associated with outcome => not much pooling through regression => not much shrinkage

    [Figure 14.3. Left: Shrinkage for the predictor variable, marriage rate (marriage rate estimated minus observed, against marriage rate standard error). Notice that shrinkage is not balanced, but rather that the model believes the observed values tended to be overestimates. Right: Shrinkage of both the outcome, divorce rate, and marriage rate. Solid points are the observed values. Open points are posterior means. Lines connect pairs of points for the same State.]

    Also note that since there isn't much association between divorce and marriage rate, there is less movement of the marriage rate estimates. That is to say that there isn't much information in divorce rate to help us improve estimates of marriage rate. In contrast, since the relationship between divorce and median age at marriage is strong, there's a lot of information in age at marriage to help us improve estimates of divorce rate. That's why divorce ...
  26. Measurement error • Common malady: “data” come from uncertain procedure,

    but uncertainty discarded at analysis • Examples: • Predicting with averages; use posterior of average • DNA sequence data: respect error rate • Parentage analysis: probability distribution over possible parents • Phylogenetics: distribution of trees • Archaeology/paleontology/forensics: identification, sexing, aging, dating • Propagate uncertainty
  27. Missing data • Missing values commonplace • Usual approach: complete-case

    analysis • drop all cases with any missing values • Discards a lot of information • Alternatives • replace missing with mean of column: NEVER DO THIS • Multiple imputation • Bayesian imputation • others
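One reason behind the "NEVER DO THIS" warning can be checked in a few lines of base R: filling missing values with the column mean treats them as known exactly, so the column's spread is understated. The data below are simulated, and the sample size, missing count, and seed are arbitrary.

```r
set.seed(27)
x <- rnorm(200, mean = 10, sd = 2)
x_miss <- x
x_miss[sample(200, 80)] <- NA   # remove 80 of the 200 values at random

# Mean imputation: replace every missing value with the observed mean
x_imp <- x_miss
x_imp[is.na(x_imp)] <- mean(x_miss, na.rm = TRUE)

sd_observed <- sd(x_miss, na.rm = TRUE)  # spread of the 120 observed values
sd_imputed  <- sd(x_imp)                 # spread after mean imputation
c(observed = sd_observed, imputed = sd_imputed)
```

The imputed column always has the smaller standard deviation here, because the 80 filled-in values add nothing to the sum of squares but do inflate the sample size. Any downstream analysis then looks more certain than the data justify.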