L20 Statistical Rethinking Winter 2019

Lecture 20 of the Dec 2018 through March 2019 edition of Statistical Rethinking. Covers Chapter 15, measurement error and missing data.

Richard McElreath

March 01, 2019

Transcript

  1. Missing Data & Other Opportunities
    Statistical Rethinking
    Winter 2019
    Lecture 20 / Week 10

  2. [Figure: three pancakes, labeled 1, 2, 3]

  3. [Figure: three pancakes, labeled 1, 2, 3]
    You are served: [a pancake, burnt side up]
    Probability other side is burnt?

  4. Avoid being clever
    • Intuition terrible guide to probability
    • No need to be clever; just ruthlessly apply conditional probability
    • Pr(want to know|already know)

  5. Pr(want to know|already know)
    In this case, we know the up side is burnt. We want to know whether or not the down side is burnt. The definition of conditional probability tells us:
    $$\Pr(\text{burnt down} \mid \text{burnt up}) = \frac{\Pr(\text{burnt up},\ \text{burnt down})}{\Pr(\text{burnt up})}$$
    This is just the definition of conditional probability, labeled with our pancake problem. We want to know if the down side is burnt, and the information we have is that the up side is burnt. We condition on the information, so we update our state of information in light of it. The definition tells us that the probability we want is just the probability of the burnt/burnt pancake divided by the probability of seeing a burnt side up. The probability of the burnt/burnt pancake is 1/3, because a pancake was selected at random. The probability the up side is burnt must average over each way we can get dealt a burnt top side of the pancake. This is:
    $$\Pr(\text{burnt up}) = \Pr(BB)(1) + \Pr(BU)(0.5) + \Pr(UU)(0) = (1/3) + (1/3)(1/2) = 0.5$$
    All that remains is the probability of getting the pancake that is burnt on both sides, and this is 1/3 from the problem definition. So all together:

  6. Pr(want to know|already know)
    So all together:
    $$\Pr(\text{burnt down} \mid \text{burnt up}) = \frac{1/3}{1/2} = \frac{2}{3}$$
    If you don't quite believe this answer, you can do a quick simulation to confirm it.
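
A quick simulation of the pancake problem can be sketched as follows. This is a minimal sketch in Python using only the standard library, not the lecture's own code; the function name `simulate_pancakes`, the seed, and the number of trials are illustrative assumptions. It draws a pancake at random, picks a random side to face up, and keeps only the draws where the up side is burnt.

```python
import random


def simulate_pancakes(n_trials=100_000, seed=1):
    """Estimate Pr(burnt down | burnt up) by simulation."""
    rng = random.Random(seed)
    # Each pancake is a pair of sides: 1 = burnt, 0 = unburnt.
    pancakes = [(1, 1), (1, 0), (0, 0)]
    burnt_up = 0
    burnt_both = 0
    for _ in range(n_trials):
        sides = list(rng.choice(pancakes))  # pick a pancake at random
        rng.shuffle(sides)                  # pick which side lands up
        up, down = sides
        if up == 1:                         # condition on what we already know
            burnt_up += 1
            burnt_both += down
    return burnt_both / burnt_up


estimate = simulate_pancakes()
print(round(estimate, 2))
```

The estimate should land close to 2/3, in line with the calculation from the definition of conditional probability.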

  9. [Figure: three pancakes, labeled 1, 2, 3]
    You are served: [a pancake, burnt side up]
    Probability other side is burnt?

  10. Getting Ruthless
    • Express information as constraints and distributions => let logic discover implications
    • No need to be clever
    • Examples:
      • Measurement error
      • Missing data

  11. Measurement error
    • Measurement always entails error
    • Typical linear regression: interpret sigma as “error” on outcome
    • What if error isn’t constant?
    • What if error is on predictors?

  12. Error on outcome
    • data(WaffleDivorce)
    • Consider error on outcome, divorce rate
    • Heterogeneity in error
    • Small State => large error
    [Figure: divorce rate against median age marriage, States of the United States. Vertical bars show the Gaussian uncertainty in measured divorce rate.]

  13. [Figure 15.1. Left: Divorce rate by median age of marriage, States of the United States. Vertical bars show plus and minus one standard deviation of the Gaussian uncertainty in measured divorce rate. Right: Divorce rate, again with standard deviations, against log population of each State. Smaller States produce more uncertain estimates.]

  14. [Figure 15.1, repeated: divorce rate with standard deviations, against median age marriage (left) and log population (right). Smaller States produce more uncertain estimates.]
    The process where the measurement error arises is just part of the statistical model, and likewise part of the causal model.
    Recall the causal model of the divorce example from Chapter 5. Let's take that same model and now add observation error on the outcome.
    [DAG: A → M, A → D, M → D, D → D_obs, N → D_obs]
    There's a lot going on here. But we can proceed one step at a time. The top triangle of this DAG is the same system that we worked with back in Chapter 5. Age at marriage A influences divorce D both directly and indirectly, passing through marriage rate M. Then we have the observation model. The true divorce rate D is observed as D_obs, which is a function of both the true rate and the population size of each State, N. States with smaller populations carry less information on the true rate, because there is less data.

  15. Error on outcome
    • Approach:
      • Treat true divorce rate as unknown parameter
      • Observed rate is sample from Gaussian distribution:
    $$D_{\text{obs},i} \sim \text{Normal}(D_{\text{true},i},\ D_{\text{SE},i})$$
    where D_obs,i is observed (data), D_true,i is true (parameter), and D_SE,i is the std error (data).
    Here is the key insight: if we don't know the true value, then we can just put a parameter there. In terms of the DAG, knowing N lets us assign a standard deviation to the observation process. For each observed divorce rate there will be one parameter D_true,i, defined by the distribution above. What this does is define the measurement D_obs,i as having the specified Gaussian distribution, centered on the unknown parameter D_true,i. So the above defines a probability for each observed divorce rate, given a known measurement error.
    This is a lot to take in. But we'll go one step at a time. Recall that the goal is to model divorce rate D as a linear function of age at marriage A and marriage rate M.

  16. Error on outcome: model
    $$
    \begin{aligned}
    D_{\text{obs},i} &\sim \text{Normal}(D_{\text{true},i},\ D_{\text{SE},i}) && \text{[distribution for observed values]} \\
    D_{\text{true},i} &\sim \text{Normal}(\mu_i,\ \sigma) && \text{[distribution for true values]} \\
    \mu_i &= \alpha + \beta_A A_i + \beta_M M_i && \text{[linear model]} \\
    \alpha &\sim \text{Normal}(0,\ 0.2) \\
    \beta_A &\sim \text{Normal}(0,\ 0.5) \\
    \beta_M &\sim \text{Normal}(0,\ 0.5) \\
    \sigma &\sim \text{Exponential}(1)
    \end{aligned}
    $$
    This is like a linear regression, but with the addition of the top line that connects the observation to the true value. Each D_true parameter also gets a second role as the mean of another distribution, one that predicts the observed measurement. A cool implication is that information flows in both directions.

  17. Error on outcome: model
    Annotations on the top line of the model, D_obs,i ~ Normal(D_true,i, D_SE,i):
    • D_true,i: estimate
    • D_SE,i: standard error of observation

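  18. Error on outcome: fitting
    # dlist must hold D_obs, D_sd, A, M and the number of States N
    m15.1 <- ulam(
        alist(
            # observed rate: Gaussian around the true rate, with known std error
            D_obs ~ dnorm( D_true , D_sd ),
            # one unknown true divorce rate per State
            vector[N]:D_true ~ dnorm( mu , sigma ),
            mu <- a + bA*A + bM*M,
            a ~ dnorm(0,0.2),
            bA ~ dnorm(0,0.5),
            bM ~ dnorm(0,0.5),
            sigma ~ dexp(1)
        ) , data=dlist , chains=4 , cores=4 )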

  20. • Divorce rate estimates move from observed values. Why?
    [Figure: shrinkage resulting from modeling the measurement error. The right panel shows divorce rate (std) against median age marriage (std), with States labeled.]

  21. [Figure 15.2. Left: Shrinkage resulting from modeling the measurement error (D_est – D_obs against D_sd). The less error in the original measurement, the less shrinkage in the posterior estimate. Right: Comparison of regression that ignores measurement error (dashed line and gray shading) with regression that incorporates measurement error (blue line and shading), plotted as divorce rate (std) against median age marriage (std).]
    • Shrinkage! Uncertain or extreme states shrink to regression line.
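
The direction and size of this shrinkage can be illustrated with a closed-form special case. The sketch below is a simplification under stated assumptions, not the fitted model: it treats the regression prediction `mu` and the spread `sigma` as known, in which case the posterior mean of the true rate is a precision-weighted average of the observed value and `mu`. The function name and all numeric values are hypothetical.

```python
def posterior_mean(d_obs, d_se, mu, sigma):
    """Posterior mean of the true value, given a Normal(mu, sigma) prior
    and a Normal(true, d_se) measurement model (conjugate Gaussian case)."""
    w_obs = 1.0 / d_se ** 2      # precision of the measurement
    w_prior = 1.0 / sigma ** 2   # precision of the regression prediction
    return (w_obs * d_obs + w_prior * mu) / (w_obs + w_prior)


mu, sigma = 0.0, 0.5             # hypothetical regression prediction and spread
d_obs = 1.5                      # hypothetical observed (standardized) rate

shifts = []
for d_se in (0.2, 0.6, 1.2):     # hypothetical standard errors
    d_est = posterior_mean(d_obs, d_se, mu, sigma)
    shifts.append(d_est - d_obs)  # same quantity as D_est - D_obs
    print(d_se, round(d_est - d_obs, 3))
```

The shift away from the observed value grows with the standard error: precise measurements barely move, while noisy ones are pulled most of the way to the regression prediction.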

  22. Error on predictor
    • What about error on predictor?
    • Many procedures invented
      • errors-in-variables
      • reduced major axis
      • total least squares
    • Our approach will be logical
      • State information
      • Deduce implications
    • Garbage in? You know what comes out.
    [Figure: marriage rate against log population]
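
One way to see what comes out when predictor error is ignored is a small simulation. This is a sketch under assumed values: the true slope of 1, the unit measurement error, the seed, and the helper `ols_slope` are all illustrative, and it uses ordinary least squares rather than the lecture's Bayesian model. When Gaussian noise is added to the predictor, the estimated slope is pulled toward zero.

```python
import random


def ols_slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


rng = random.Random(2)
n = 20_000
true_slope = 1.0

m_true = [rng.gauss(0, 1) for _ in range(n)]               # true predictor
d = [true_slope * m + rng.gauss(0, 0.5) for m in m_true]   # outcome
m_obs = [m + rng.gauss(0, 1) for m in m_true]              # predictor measured with error

slope_true = ols_slope(m_true, d)   # regression on the true predictor
slope_obs = ols_slope(m_obs, d)     # regression that ignores measurement error
print(round(slope_true, 2), round(slope_obs, 2))
```

With these settings the slope on the noisy predictor should come out near 0.5, because the measurement noise doubles the variance of the predictor without adding any covariance with the outcome.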

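  23. Error on predictor: model
    Here's the updated model. The new bits are the M_true,i term in the linear model and the two distributions for marriage rate:
    $$
    \begin{aligned}
    D_{\text{obs},i} &\sim \text{Normal}(D_{\text{true},i},\ D_{\text{SE},i}) \\
    D_{\text{true},i} &\sim \text{Normal}(\mu_i,\ \sigma) \\
    \mu_i &= \alpha + \beta_A A_i + \beta_M M_{\text{true},i} \\
    M_{\text{obs},i} &\sim \text{Normal}(M_{\text{true},i},\ M_{\text{SE},i}) \\
    M_{\text{true},i} &\sim \text{Normal}(0,\ 1) \\
    \alpha &\sim \text{Normal}(0,\ 0.2) \\
    \beta_A &\sim \text{Normal}(0,\ 0.5) \\
    \beta_M &\sim \text{Normal}(0,\ 0.5) \\
    \sigma &\sim \text{Exponential}(1)
    \end{aligned}
    $$
    The M_true parameters will hold the posterior distributions of the true marriage rates. Fitting the model is much like before.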

  24. Error on predictor: model
    Annotations on the model:
    • use estimates in regression: µ_i = α + β_A A_i + β_M M_true,i
    • likelihood for observed rate: M_obs,i ~ Normal(M_true,i, M_SE,i)
      • M_true,i: estimated marriage rate
      • M_SE,i: standard error of marriage rate

  25. Error on predictor: model
    • prior rates: M_true,i ~ Normal(0, 1)
    • Not the best approach: M and A are associated! Will do better later on.

  26. filled circles: observed
    open circles: estimated
    lines connect points for same State
    [Figure 15.3: divorce rate (std) against marriage rate (std)]

  27. Error on predictor
    • Both divorce rate and marriage rate shrink
    • Divorce shrinks more. Why?
    • Marriage rate not strongly associated with outcome => not much pooling through regression => not much shrinkage
    [Figure: divorce rate (std) against marriage rate (std)]

  28. Measurement error
    • Common malady: “data” come from uncertain procedure, but uncertainty discarded at analysis
    • Examples:
      • Predicting with averages
      • Parentage analysis
      • Phylogenetics: distribution of trees
      • Archaeology/paleontology/forensics: identification, sexing, aging, dating
    • Propagate uncertainty

  29. Missing data
    • Missing values commonplace
    • Usual approach: complete-case analysis
      • drop all cases with any missing values
    • Discards a lot of information
    • Alternatives
      • replace missing with mean of column: NEVER DO THIS
      • Multiple imputation
      • Bayesian imputation
      • others
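
One reason to avoid replacing missing values with the column mean is that it treats every imputed value as known exactly, so the column looks less variable than it is. The sketch below illustrates this with simulated data; the 40% missingness rate, the seed, and the variable names are hypothetical.

```python
import random
import statistics

rng = random.Random(3)
values = [rng.gauss(0, 1) for _ in range(1000)]

# Hypothetical missingness: each value is lost with probability 0.4.
observed = [v if rng.random() > 0.4 else None for v in values]

complete_cases = [v for v in observed if v is not None]
column_mean = statistics.mean(complete_cases)

# Mean imputation: every missing entry gets the same value.
mean_imputed = [column_mean if v is None else v for v in observed]

sd_complete = statistics.stdev(complete_cases)
sd_imputed = statistics.stdev(mean_imputed)
print(round(sd_complete, 2), round(sd_imputed, 2))
```

The imputed column has the same mean as the complete cases but a smaller standard deviation, so any downstream estimate of uncertainty that uses it will be too confident.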

  30. Why impute?
    • Missingness can be a confound
    All this will become clear if we draw some diagrams. Let's return to the primate milk example from Chapter 5. We used data(milk) to illustrate masking, using both neocortex percent and body mass to predict milk energy. One aspect of those data are missing values in the neocortex.perc column. We used a complete-case analysis back then, which means we dropped those cases from the analysis. That means we also dropped perfectly good body mass and milk energy values. That left us with only 17 cases to work with. Was that a bad idea?
    To answer that question, we need to think more clearly about why those values are missing. The basic DAG from this example is:
    [DAG: U → M, U → B, M → K, B → K]
    where M is body mass, B is neocortex percent (proportion of brain that is neocortex), K is milk energy (kcal), and U is some unobserved variable that renders M and B positively correlated. We want to add missingness to this graph. What that means is realizing that we haven't observed B, neocortex percent. We've instead observed B_obs, a partially observed set of values generated by B and some process. Let's name the process that generates missing values R_B, and now add it to our graph.

  31. Three Types of Missingness
    • MCAR: MISSING COMPLETELY AT RANDOM
      [DAG: U → M, U → B, M → K, B → K, B → B_obs, R_B → B_obs]
      Think of the observed-with-missingness B_obs as being a function of the complete-but-unobserved B and the missingness process R_B. We can treat R_B like another possible confound, and use the backdoor criterion to figure out when we need to condition on it. In the graph above, we need to condition on M to close the indirect path. But there is no path through R_B. So it doesn't confound inference.
    • MAR: MISSING AT RANDOM
      Another possibility is that some other variable influences the missingness process:
      [DAG: U → M, U → B, M → K, B → K, B → B_obs, R_B → B_obs, M → R_B]
      Now M influences R_B, which means, for example, that species with smaller bodies are more (or less) likely to have missing values in B_obs. This could happen if researchers are less interested in small species, and so do not often go through the trouble of making detailed brain measurements for them. What happens in this case? There is now a backdoor path from B_obs through R_B to K. So the missingness process can confound our inference, unless we can close the backdoor. In this case, we can shut the backdoor by conditioning on M. We might have done this anyway, because we want the direct influence of B on K. This type of missingness is known by another unfortunately awkward name, missing at random (MAR).
      Complete-case analysis didn't produce bias under MCAR. Now it will produce bias. With MCAR (the first type), imputation is useful. With MAR (this second type), it is mandatory.
    • MNAR: MISSING NOT AT RANDOM
      The third type of missingness is a true terror. Suppose now that certain values of B are the ones that tend to be missing. This could happen, for example, if species with a certain level of neocortex are studied exactly for that reason. There would be more measurements about such brains, but few precise measurements about brains with other values. The DAG might look like this:
      [DAG: U → M, U → B, M → K, B → K, B → B_obs, R_B → B_obs, B → R_B]
      This is a real problem. Now there is a backdoor from B_obs through R_B all the way to K. There is no way to close this backdoor, because we can't condition on the unobserved variable B. Conditioning on M doesn't help. If we can model the missingness process, there is still hope. But in general there are no guarantees here.
    Possibly most confusing statistical terms ever invented.
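
The three mechanisms can be compared in a small simulation. This sketch assumes a linear Gaussian version of the DAG (U influences both M and B) and hypothetical missingness rules: a coin flip for MCAR, missing when M is below zero for MAR, and missing when B is below zero for MNAR. It compares the mean of B among observed cases with the mean of the complete B, which is a simpler summary than the regression of K on B discussed in the lecture.

```python
import random
import statistics


def simulate(mechanism, n=20_000, seed=4):
    """Return (mean of complete B, mean of B among observed cases)."""
    rng = random.Random(seed)
    b_all, b_seen = [], []
    for _ in range(n):
        u = rng.gauss(0, 1)        # unobserved common cause U
        m = u + rng.gauss(0, 1)    # body mass M
        b = u + rng.gauss(0, 1)    # neocortex percent B
        if mechanism == "MCAR":
            missing = rng.random() < 0.5   # unrelated to any variable
        elif mechanism == "MAR":
            missing = m < 0                # M -> R_B
        elif mechanism == "MNAR":
            missing = b < 0                # B -> R_B
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        b_all.append(b)
        if not missing:
            b_seen.append(b)
    return statistics.mean(b_all), statistics.mean(b_seen)


results = {name: simulate(name) for name in ("MCAR", "MAR", "MNAR")}
for name, (mean_all, mean_seen) in results.items():
    print(name, round(mean_seen - mean_all, 2))
```

Under the MCAR rule the two means should roughly agree, while under the MAR and MNAR rules the observed cases overstate B. In the MAR case the gap arises only through M, which is why conditioning on M can repair it.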

  32. [DAG: U → M, U → B, M → K, B → K, B → B_obs, R_B → B_obs]
    • R_B: Missingness mechanism
    • B_obs: Observed B
    • B: True (unobserved) B
    The way to read this is to think of the observed-with-missingness B_obs as being a function of the complete-but-unobserved B and the missingness process R_B. We can treat R_B like another possible confound. Then we can use our friend the backdoor criterion to figure out when we need to condition on R_B. In the graph above, we want to infer the influence of B on K. To figure out when the estimate is confounded, we find all the paths from B to K. If any of them are backdoors, we need to close those.

  33. [DAG: U → M, U → B, M → K, B → K, B → B_obs, R_B → B_obs]
    Are there any backdoors from B_obs to K?

  34. [DAG: U → M, U → B, M → K, B → K, B → B_obs, R_B → B_obs]
    To figure out when the estimate is confounded, we find all the paths from B_obs to K. If any of them are backdoors, we need to close those to get the causal influence of B. In the case above, we need to condition on M to close the indirect path, just as earlier in the book. But there is no path through R_B. So it doesn't confound inference.
    Can condition on M for direct effect.
    Either way, R_B is ignorable.

  35. Missing Completely At Random
    • MCAR: K is unconditionally
    independent of R_B
    • Do not need to condition on
    anything for R_B not to be a
    confound
    • On right, no path through R_B,
    conditioning on B_obs
    • Do not NEED to impute
    • But imputation adds precision
    … where M is body mass, B is neocortex percent, K is milk energy, and U is a variable that
    renders M and B positively correlated. We want to add missingness to this graph. What that
    means is realizing that we haven't observed B, neocortex percent. We instead observed B_obs,
    a partially observed set of values generated by B and some process …
    [DAG: nodes B, B_obs, K, M, R_B, U]


  36. [DAG: nodes B, B_obs, K, M, R_B, U]
    Does MCAR ever happen in real data?
    Research assistant randomly deletes values?


  37. MISSING DATA
    Another possibility is that some other variable influences the missingness process:
    [DAG: nodes B, B_obs, K, M, R_B, U]
    Now M influences R_B, which means, for example, that species with smaller bodies are more
    (or less) likely to have missing values in B_obs. This could happen if researchers are less
    interested in small species and so do not often go through the trouble of making detailed
    brain measurements for them. What happens in this case? There is now a backdoor path from
    B_obs through R_B to K. So the missingness process can confound our inference, unless we can
    close the backdoor. In this case, we can shut the backdoor by conditioning on M. We might …
    Missing At Random
    Missingness more likely for specific values of M.
    How can this happen?


  38. [DAG: nodes B, B_obs, K, M, R_B, U; M influences R_B]
    Backdoor path from B_obs to K?
    Missing At Random


  39. [DAG: nodes B, B_obs, K, M, R_B, U; M influences R_B]
    Missing At Random
    Backdoor path from B_obs to K?
    Can condition on M to close.


  40. Missing (Simply) At Random
    • MAR: K is conditionally
    independent of R_B
    • Must condition on M for R_B
    not to be a confound
    • Still must impute to de-bias
    estimates
    • Why? If you delete cases of M/K
    where B is missing, missingness
    obscures causation.


  41. Missing Not At Random
    Missingness more likely for specific values of B.
    How can this happen?
    … With MAR, this second type, it is mandatory.
    The third type of missingness is a true terror. Suppose now that species with low values of
    B are the ones that tend to be missing. This could happen, for example, if species with lots
    of neocortex are studied exactly for that reason. There would be many precise measurements
    about such brains, but few precise measurements about brains with less neocortex. The DAG
    might look like this:
    [DAG: nodes B, B_obs, K, M, R_B, U]
    This is a real problem. Now there is a backdoor from B_obs through the mechanism R_B all the
    way to K. There is no way to close this backdoor, because we can't condition on the complete
    variable B. Conditioning on M doesn't help. If we can model the missingness mechanism R_B,
    there is still hope. But in general there are no guarantees here. This scenario is sometimes …


  42. Missing Not At Random
    No way to shut the backdoor!


  43. Missing Not At Random
    Can also arise through unobserved variables (right).
    … in general there are no guarantees here. This scenario is sometimes called missing not at
    random (MNAR). I know, these terms are … focus on is that MNAR arises when there is no set of
    variables to condition on … backdoors through R_B. Lots of different graphs can lead to that.
    Here …
    [DAG: nodes B, B_obs, K, M, R_B, U1, U2]
    Now it isn't the B values themselves that produce missingness. Rather … variable U2 that
    influences both B and missingness. U2 could be, for … similarity to humans. Humans have an
    unreasonable amount of neocortex … pay attention to it, and other primates closely related
    to us also tend to … If those primates are studied more intensely, B values will be missing …


  44. Missing Not At Random
    • MNAR: K is unconditionally
    dependent on R_B
    • If you can model R_B, might be
    okay
    • No guarantees


  45. MISSING COMPLETELY AT RANDOM: DOG EATS ANY HOMEWORK
    MISSING AT RANDOM: DOG EATS STUDENTS’ HOMEWORK
    MISSING NOT AT RANDOM: DOG EATS BAD HOMEWORK
    [Three DAGs, one per panel, each with nodes H*, A, D, H]
    H: Homework
    H*: Homework with missing values
    A: Attribute of student
    D: Dog (missingness mechanism)


  46. Milk imputation
    • data(milk)
    • 12 missing values for neocortex
    • Suppose values are Missing At
    Random (MAR)
    • Distribution of observed values
    provides information
    • Can use to impute missing values
    • Same procedure for MCAR
    kcal.per.g mass neocortex.perc
    1 0.49 1.95 55.16
    2 0.51 2.09 NA
    3 0.46 2.51 NA
    4 0.48 1.62 NA
    5 0.60 2.19 NA
    6 0.47 5.25 64.54
    7 0.56 5.37 64.54
    8 0.89 2.51 67.64
    9 0.91 0.71 NA
    10 0.92 0.68 68.85
    11 0.80 0.12 58.85
    12 0.46 0.47 61.69
    13 0.71 0.32 60.32
    14 0.71 0.60 NA
    15 0.73 3.47 NA
    16 0.68 1.55 69.97
    17 0.72 7.08 NA
    18 0.97 3.24 70.41
    19 0.79 7.94 NA
    20 0.84 12.30 73.40
    21 0.48 7.59 NA
    22 0.62 5.37 67.53
    23 0.51 10.72 NA
    24 0.54 35.48 71.26
    25 0.49 79.43 72.60
    26 0.53 97.72 NA
    27 0.48 40.74 70.24
    28 0.55 33.11 76.30
    29 0.71 54.95 75.49


  47. Milk energy MAR
    • Consider just neocortex variable:
    • Q: What is your best guess of each
    missing value?
    • A: Posterior distribution derived from
    remaining data
    neocortex.perc
    1 55.16
    2 NA
    3 NA
    4 NA
    5 NA
    6 64.54
    7 64.54
    8 67.64
    9 NA
    10 68.85
    11 58.85
    12 61.69
    13 60.32
    14 NA
    15 NA
    16 69.97
    17 NA
    18 70.41
    19 NA
    20 73.40
    21 NA
    22 67.53
    23 NA
    24 71.26
    25 72.60
    26 NA
    27 70.24
    28 76.30
    29 75.49


  48. Milk energy MCAR
    • Place a unique parameter for each
    missing value
    • NC1 ... NC12
    • These are values to be imputed
    neocortex.perc
    1 55.16
    2 NC1
    3 NC2
    4 NC3
    5 NC4
    6 64.54
    7 64.54
    8 67.64
    9 NC5
    10 68.85
    11 58.85
    12 61.69
    13 60.32
    14 NC6
    15 NC7
    16 69.97
    17 NC8
    18 70.41
    19 NC9
    20 73.40
    21 NC10
    22 67.53
    23 NC11
    24 71.26
    25 72.60
    26 NC12
    27 70.24
    28 76.30
    29 75.49
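
The bookkeeping behind "one parameter per missing value" can be sketched in a few lines. This is an illustrative Python version (the lecture itself uses R), built on the neocortex column shown on the slide. The helper `merge_missing_values` is a hypothetical name for this sketch and is not the rethinking function; the labels NC1 to NC12 stand in for parameters that a sampler would estimate.

```python
# neocortex.perc from the slide, with None marking each NA
neocortex_perc = [
    55.16, None, None, None, None, 64.54, 64.54, 67.64, None, 68.85,
    58.85, 61.69, 60.32, None, None, 69.97, None, 70.41, None, 73.40,
    None, 67.53, None, 71.26, 72.60, None, 70.24, 76.30, 75.49,
]


def merge_missing_values(observed, imputed):
    """Return a copy of `observed` with each None replaced, in order, by an imputed value."""
    miss_idx = [i for i, v in enumerate(observed) if v is None]
    if len(miss_idx) != len(imputed):
        raise ValueError("need exactly one imputed value per missing entry")
    merged = list(observed)
    for i, value in zip(miss_idx, imputed):
        merged[i] = value
    return merged


# 1-based row numbers of the missing entries, as displayed on the slide
miss_rows = [i + 1 for i, v in enumerate(neocortex_perc) if v is None]
labels = [f"NC{j}" for j in range(1, len(miss_rows) + 1)]
merged = merge_missing_values(neocortex_perc, labels)

print(miss_rows)
print(merged[:10])
```

The merged vector is the mixed object the model works with: observed numbers where they exist, and a placeholder for a parameter everywhere else.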


  49. Milk energy MAR: model
    The obstacle in practice is that we have to conceive of the predictor now as a mixed vector
    of data and parameters. In our case, the variable with missing values is neocortex percent.
    Again we'll call it B, for “brain”:
    B = [0.55, B_2, B_3, B_4, 0.65, 0.65, ..., 0.76, 0.75]
    For every index i at which there is a missing value, there is also a parameter B_i that will
    form a posterior distribution for it.
    This is the model we need, with the neocortex pieces in blue:
    K_i ∼ Normal(µ_i, σ)    [distribution …]
    µ_i = α + β_B B_i + β_M log M_i
    B_i ∼ Normal(ν, σ_B)    [distribution for …]
    α ∼ Normal(0, 0.5)
    β_B ∼ Normal(0, 0.5)
    β_M ∼ Normal(0, 0.5)
    σ ∼ Exponential(1)
    ν ∼ Normal(0.5, 1)
    σ_B ∼ Exponential(1)
    Note that when B_i is observed, then the third line above is a likelihood, just like a …
    regression. The model learns the distributions of ν and σ_B that are consistent w…


  50. Milk energy MAR: model
    linear model using
    mix of observed and
    imputed values


  51. Milk energy MAR: model
    when obs, likelihood;
    when imputed, prior
    mean neocortex
    (to be estimated)
    std dev of neocortex
    (to be estimated)
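
The "likelihood when observed, prior when imputed" reading has a simple closed form if every other parameter is held fixed. The sketch below is an assumption-laden illustration in Python, not the sampler the lecture uses: it combines the prior B_i ∼ Normal(ν, σ_B) with the information K_i carries about B_i through the linear model, which is a standard normal-normal update. The function name and the example numbers are hypothetical.

```python
import math


def conditional_posterior_B(K_i, M_i, a, bB, bM, sigma, nu, sigma_B):
    """Mean and sd of a missing B_i, given K_i, M_i and fixed values of all other parameters."""
    prior_precision = 1.0 / sigma_B**2   # from B_i ~ Normal(nu, sigma_B)
    lik_precision = bB**2 / sigma**2     # from K_i ~ Normal(a + bB*B_i + bM*M_i, sigma)
    precision = prior_precision + lik_precision
    mean = (nu * prior_precision + bB * (K_i - a - bM * M_i) / sigma**2) / precision
    return mean, math.sqrt(1.0 / precision)


# Hypothetical standardized values, chosen only to show the direction of the update.
low_K = conditional_posterior_B(K_i=-1.0, M_i=0.0, a=0.0, bB=0.5, bM=-0.5,
                                sigma=0.8, nu=0.0, sigma_B=1.0)
high_K = conditional_posterior_B(K_i=1.0, M_i=0.0, a=0.0, bB=0.5, bM=-0.5,
                                 sigma=0.8, nu=0.0, sigma_B=1.0)
print(low_K)
print(high_K)
```

With a positive bB, a case with higher milk energy gets a higher imputed B, so imputed values follow the regression trend. Note that M_i enters only through the residual of K_i: nothing in this version of the model describes an association between M and B.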


  52. Fitting
    m15.3 <- ulam(
        alist(
            K ~ dnorm( mu , sigma ),
            mu <- a + bB*B + bM*M,
            B ~ dnorm( nu , sigma_B ),
            c(a,nu) ~ dnorm( 0 , 0.5 ),
            c(bB,bM) ~ dnorm( 0, 0.5 ),
            sigma_B ~ dexp( 1 ),
            sigma ~ dexp( 1 )
        ) , data=dat_list , chains=4 , cores=4 )
    ulam detects NA values and tries to cope.
    More explicit example in text.


  53. When you start the model, it will notify you that it found NA values and is trying to impute
    them. Once it finishes, take a look at the posterior summary:
    precis( m15.3 , depth=2 )
    [precis output: columns mean, sd, 5.5%, 94.5%, n_eff, Rhat; rows include nu, sigma_B, sigma
    and B_impute[1] through B_impute[12]]
    Each of the imputed distributions for missing values is shown here, along with the ordinary
    regression parameters above them. To see how including all cases has …


  54. Compared to complete-cases
    … Comparing this posterior to the previous will be easier with a plot:
    plot( coeftab(m15.3,m15.4) , pars=c("bB","bM") )
    [Coefficient plot: bB and bM for m15.3 and m15.4; x-axis: Value]
    The model that imputes the missing values, m15.3, has narrower marginal distributions …
    effects. How could this happen? We used more information: the values of body mass … not
    missing but are discarded by m15.4. These values suggest a slightly smaller … of body mass,
    and this also cascades into … do some plotting to visualize what's happened here.
    m15.3: full sample (with imputation)
    m15.4: complete-cases only


  55. [Figure: left panel, kcal milk (std) against neocortex percent (std); right panel,
    neocortex percent (std) against log body mass (std)]
    Figure 15.4. Left: Inferred distribution of milk energy (vertical) and neocortex proportion
    (horizontal), with imputed values shown by open points. The line segments are posterior
    compatibility intervals. Right: Inferred distribution between the two predictors, neocortex
    proportion and log mass. Imputed values again shown by open points.
    Imputed values track regression trend


  56. [Figure 15.4: kcal milk (std) against neocortex percent (std), and neocortex percent (std)
    against log body mass (std)]
    Imputed values do not track other predictor!


  57. Results
    • Observed neocortex positively
    associated with observed body
    mass
    • Imputed neocortex NOT
    associated with observed body
    mass
    • Can do better
    • Imputation model can use full
    causal model
    [Figure 15.4: kcal milk (std) against neocortex percent (std), and neocortex percent (std)
    against log body mass (std)]
    plot( dat_list$M , dat_list$B , pch=16 , col=rangi2 ,
        ylab="neocortex percent (std)" , xlab="log body mass (std)" )
    Mi <- dat_list$M[miss_idx]


  58. Full Flavor Imputation
    [DAG: nodes B, B_obs, K, M, R_B, U]
    Both body mass M and neocortex B influence milk energy K. And … with one another, through
    some unknown mechanism U. This means … missing values for B, we might do a better job if we
    use knowledge of … U. So let's build a model now that better matches the DAG above.
    The notion is to change the imputation line of the model from …
    B_i ∼ Normal(ν, σ_B)
    to a bivariate normal that includes both M and B:
    (M_i, B_i) ∼ MVNormal((µ_M, µ_B), S)
    The S matrix is another covariance matrix, and it will measure the … and B using the
    observed cases, and then use that correlation to infer …
    Here's the ulam implementation. This is complex code, because … a variable that includes
    both the observed M values and the merged … imputed B values. I'll also do the merging more
    explicitly. In the O… end, I'll walk through how the Stan code works.


  59. m15.5 <- ulam(
        alist(
            # K as function of B and M
            K ~ dnorm( mu , sigma ),
            mu <- a + bB*B_merge + bM*M,
            # M and B correlation
            MB ~ multi_normal( c(muM,muB) , Rho_BM , Sigma_BM ),
            matrix[29,2]:MB <<- append_col( M , B_merge ),
            # define B_merge as mix of observed and imputed values
            vector[29]:B_merge <- merge_missing( B , B_impute ),
            # priors
            c(a,muB,muM) ~ dnorm( 0 , 0.5 ),
            c(bB,bM) ~ dnorm( 0, 0.5 ),
            sigma ~ dexp( 1 ),
            Rho_BM ~ lkj_corr(2),
            Sigma_BM ~ exponential(1)
        ) , data=dat_list , chains=4 , cores=4 )
    Full Flavor Imputation
    … have done this anyway, because we want the direct influence of … missingness is known by
    another unfortunately awkward name, missing at random …
    We don't need to discover the missingness process above. But … need to do. We need to impute
    missing values for B_obs. Why? If we … species with any missing values, then we are polluting
    the other v… process. This didn't happen in the previous (MCAR) example, be… didn't have any
    association with the other variables. Case deletion …


  60. [Figure: neocortex percent (std) against log body mass (std)]
    Figure 15.5. Same relationships as shown in Figure 15.4, but now for the imputation model
    that estimates the association between the predictors. The information in the association
    between predictors has been used to infer a stronger relationship between milk energy and
    the imputed values.
    [precis output: columns mean, sd, 5.5%, 94.5%, n_eff; rows include Rho_BM[1,1], Rho_BM[1,2],
    Rho_BM[2,1] and Rho_BM[2,2]]
    The slopes … haven't changed much. We're interested in that correlation … posterior
    correlation is quite strong … between M and B that we already knew exi… What does this
    correlation do to the i… code as before. Figure 15.5 displays the sa… imputation model. On
    the right, you can … preserve the positive association between n… this doesn't make a big
    difference in the inf…


  61. [Figure 15.5 shown beside Figure 15.4]
    MVNormal Normal
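
The difference between the two panels follows from the conditional distribution of a bivariate normal. As a hedged sketch in Python (the lecture's model is fitted in R with ulam, and this ignores the extra information that K contributes), the function below gives the mean and standard deviation of B given M for a correlation rho. The value rho = 0.6 in the example is an assumption for illustration, and the function name is hypothetical.

```python
import math


def conditional_B_given_M(m, mu_M, mu_B, sigma_M, sigma_B, rho):
    """Mean and sd of B given M = m when (M, B) is bivariate normal with correlation rho."""
    mean = mu_B + rho * (sigma_B / sigma_M) * (m - mu_M)
    sd = sigma_B * math.sqrt(1.0 - rho**2)
    return mean, sd


# Standardized variables: means 0, standard deviations 1.
for rho in (0.0, 0.6):
    for m in (-1.0, 1.0):
        mean, sd = conditional_B_given_M(m, 0.0, 0.0, 1.0, 1.0, rho)
        print(f"rho={rho:.1f}  M={m:+.1f}  ->  B mean={mean:+.2f}, sd={sd:.2f}")
```

With rho = 0 the expected B is the same for every M, which matches the simple Normal imputation model. With a positive rho the expected B rises with M and its spread shrinks, so imputed values line up with the observed association between the predictors.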


  62. Missing data
    • Can also impute discrete values,
    but need another technique
    (see text)
    • Extends to many model types:
    • Mark-recapture, occupancy
    (presence/absence)
    • Latent-state models (hidden
    Markov models)


  63. Final Homework
    • A little imputation practice
    • Finish for complete sense of
    accomplishment
    … differences. Make whatever additional calculations … pursuit of an answer.
    Bos primigenius


  64. The Golem of Prague
    “Even the most perfect of Golem, risen to
    life to protect us, can easily change into a
    destructive force. Therefore let us treat
    carefully that which is strong, just as we
    bow kindly and patiently to that which
    is weak.”
    Rabbi Judah Loew ben
    Bezalel (1512–1609)
    From Breath of Bones: A Tale of the Golem
