Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Statistical Rethinking 2023 - Lecture 03

Statistical Rethinking 2023 - Lecture 03

Richard McElreath

January 09, 2023
Tweet

More Decks by Richard McElreath

Other Decks in Education

Transcript

  1. Statistical Rethinking
    3. Geocentric Models
    2023

    View full-size slide

  2. Mars: June 2020 until Feb 2021 — Tunç Tezel — https://vimeo.com/user48630149

    View full-size slide

  3. MARS
    EARTH
    Prediction
    Without
    Explanation
    Geocentric Model

    View full-size slide

  4. Giuseppe Piazzi 1746–1826 Palermo Circle
    1.1.1801

    View full-size slide

  5. Carl Friedrich Gauss 1777–1855 (portrait in 1803)

    View full-size slide

  6. 1809 Bayesian argument
    for normal error and
    least-squares estimation

    View full-size slide

  7. Linear Regression
    Geocentric: Describes associations, makes
    predictions, but mechanistically wrong
    Gaussian: Abstracts from generative error
    model, replaces with normal distribution,
    mechanistically silent
    Useful when handled with care
    Many special cases: ANOVA, ANCOVA,
    t-test, others
    From Breath of Bones: A Tale of the Golem

    View full-size slide

  8. Positions Distribution

    View full-size slide

  9. Positions Distribution

    View full-size slide

  10. Why Normal?
    Two arguments
    (1) Generative: Summed fluctuations tend towards normal
    distribution
    (2) Inferential: For estimating mean and variance, normal
    distribution is least informative distribution (maxent)
    Variable does not have to be normally distributed for normal
    model to be useful. It’s a machine for estimating mean/variance.

    View full-size slide

  11. Making Geocentric Models
    Skill development goals:
    (1) Language for representing models
    (2) Calculate posterior distributions
    with multiple unknowns
    (3) Constructing & understanding
    linear models

    View full-size slide

  12. Owl-drawing workflow
    (1) State a clear question
    (2) Sketch your causal assumptions
    (3) Use the sketch to define a generative model
    (4) Use generative model to build estimator
    (5) Profit

    View full-size slide

  13. Linear Regression
    Drawing the Owl
    (1) Question/goal/estimand
    (2) Scientific model
    (3) Statistical model(s)
    (4) Validate model
    (5) Analyze data

    View full-size slide

  14. Linear Regression
    Drawing the Owl
    (1) Question/goal/estimand
    (2) Scientific model
    (3) Statistical model(s)
    (4) Validate model
    (5) Analyze data
    library(rethinking)
    data(Howell1)
    60 80 100 140 180
    10 20 30 40 50 60
    height (cm)
    weight (kg)

    View full-size slide

  15. Linear Regression
    Drawing the Owl
    (1) Question/goal/estimand
    Describe association between
    weight and height
    library(rethinking)
    data(Howell1)
    60 80 100 140 180
    10 20 30 40 50 60
    height (cm)
    weight (kg)

    View full-size slide

  16. Linear Regression
    Drawing the Owl
    (1) Question/goal/estimand
    Describe association between
    ADULT weight and height
    data(Howell1)
    d <- Howell1[Howell1$age>=18,]
    140 150 160 170 180
    30 35 40 45 50 55 60
    height (cm)
    weight (kg)

    View full-size slide

  17. Linear Regression
    (2) Scientific model
    How does height influence
    weight?
    140 150 160 170 180
    30 35 40 45 50 55 60
    height (cm)
    weight (kg)
    data(Howell1)
    d <- Howell1[Howell1$age>=18,]
    H W
    W = f(H)
    “Weight is some function of height”

    View full-size slide

  18. Generative models
    Options
    (1) Dynamic: Incremental growth of
    organism; both mass and height (length)
    derive from growth pattern; Gaussian
    variation result of summed fluctuations
    (2) Static: Changes in height result in
    changes in weight, but no mechanism;
    Gaussian variation result of growth history

    View full-size slide

  19. (2) Scientific model
    How does height influence
    weight?
    H W
    W = f(H,U)
    “Weight is some function of height and unobserved stuff”
    U
    unobserved

    View full-size slide

  20. For adults, weight is a proportion of height plus the influence
    of unobserved causes:
    Generative model: H → W
    H W U
    UIF QSPQPSUJPOBMJUZ DPOTUBOU *G JUT GPS FYBNQMF UIFO B QFSTPO XIP
    LH
    VTMZ OPU FWFSZPOF XJUI UIF TBNF IFJHIU IBT FYBDUMZ UIF TBNF XFJHIU
    TIPVME SFĘFDU UIJT 4P XF OFFE UP JOUSPEVDF TPNF WBSJBUJPO 8FMM VTF
    ćF XBZ *MM EP UIJT JT UP TJNVMBUF PVS VOPCTFSWFE 6 WBSJBCMF GSPN UI
    ćFO XF DBO DPNQVUF B QFSTPOT XFJHIU BT
    8 = β) + 6
    F DPEF UP EP UIJT
    n to simulate weights of individuals from height
    t <- function(H,b,sd) {

    View full-size slide

  21. Generative model: H → W
    ćF XBZ *MM EP UIJT JT UP TJNVMBUF PVS VOPCTFSWFE 6 WBSJBCMF GSPN UI
    ćFO XF DBO DPNQVUF B QFSTPOT XFJHIU BT
    8 = β) + 6
    F DPEF UP EP UIJT
    n to simulate weights of individuals from height
    t <- function(H,b,sd) {
    rnorm( length(H) , 0 , sd )
    b*H + U
    n(W)
    F E
    XIFSF β JT UIF QSPQPSUJPOBMJUZ DPOTUBOU *G JUT GPS FYBNQMF UIFO B QFSTPO XIP JT DN
    UBMM XFJHIT LH
    0CWJPVTMZ OPU FWFSZPOF XJUI UIF TBNF IFJHIU IBT FYBDUMZ UIF TBNF XFJHIU "OE UIF
    TJNVMBUJPO TIPVME SFĘFDU UIJT 4P XF OFFE UP JOUSPEVDF TPNF WBSJBUJPO 8FMM VTF (BVTTJBO
    WBSJBUJPO ćF XBZ *MM EP UIJT JT UP TJNVMBUF PVS VOPCTFSWFE 6 WBSJBCMF GSPN UIF QSFWJPVT
    TFDUJPO ćFO XF DBO DPNQVUF B QFSTPOT XFJHIU BT
    8 = β) + 6
    )FSFT TPNF DPEF UP EP UIJT
    3 DPEF
    # function to simulate weights of individuals from height
    sim_weight <- function(H,b,sd) {
    U <- rnorm( length(H) , 0 , sd )
    W <- b*H + U
    return(W)
    }
    ćF BSHVNFOU sd JT UIF TUBOEBSE EFWJBUJPO PG 6 *U EFUFSNJOFT IPX NVDI WBSJBUJPO UIFSF JT
    BSPVOE UIF FYQFDUFE WBMVF PG 8 GPS FBDI ) WBMVF
    5P NBLF UIJT TJNVMBUJPO XPSL XF BMTP OFFE UP TJNVMBUF IFJHIU *MM KVTU VTF B VOJGPSN
    Generative code:

    View full-size slide

  22. F
    )FSFT TPNF DPEF UP EP UIJT
    3 DPEF
    # function to simulate weights of individuals from height
    sim_weight <- function(H,b,sd) {
    U <- rnorm( length(H) , 0 , sd )
    W <- b*H + U
    return(W)
    }
    ćF BSHVNFOU sd JT UIF TUBOEBSE EFWJBUJPO PG 6 *U EFUFSNJOFT IPX NVDI WBSJBUJPO UIFSF JT
    BSPVOE UIF FYQFDUFE WBMVF PG 8 GPS FBDI ) WBMVF
    5P NBLF UIJT TJNVMBUJPO XPSL XF BMTP OFFE UP TJNVMBUF IFJHIU *MM KVTU VTF B VOJGPSN
    EJTUSJCVUJPO PG BEVMU IFJHIU GSPN DN UP DN BT BO FYBNQMF ćFO XF DBO TJNVMBUF BOE
    QMPU PVS TZOUIFUJD QFPQMF
    3 DPEF
    H <- runif( 200 , min=130 , max=170 )
    W <- sim_weight( H , b=0.5 , sd=5 )
    plot( W ~ H , col=2 , lwd=3 )

    W <- b*H + U
    return(W)
    }
    ćF BSHVNFOU sd JT UIF TUBOEBSE EFWJBUJPO PG 6 *U EFUFSNJOFT IPX NVDI WBSJBUJPO UIFSF JT
    BSPVOE UIF FYQFDUFE WBMVF PG 8 GPS FBDI ) WBMVF
    5P NBLF UIJT TJNVMBUJPO XPSL XF BMTP OFFE UP TJNVMBUF IFJHIU *MM KVTU VTF B VOJGPSN
    EJTUSJCVUJPO PG BEVMU IFJHIU GSPN DN UP DN BT BO FYBNQMF ćFO XF DBO TJNVMBUF BOE
    QMPU PVS TZOUIFUJD QFPQMF
    3 DPEF
    H <- runif( 200 , min=130 , max=170 )
    W <- sim_weight( H , b=0.5 , sd=5 )
    plot( W ~ H , col=2 , lwd=3 )

    View full-size slide

  23. QMPU PVS TZOUIFUJD QFPQMF
    3 DPEF
    H <- runif( 200 , min=130 , max=170 )
    W <- sim_weight( H , b=0.5 , sd=5 )
    plot( W ~ H , col=2 , lwd=3 )
    ę
    8)"5 *4 5)& */'-6&/$& 0' )&*()5 0/ 8&*()5
    130 140 150 160 170
    50 60 70 80 90
    H
    W
    ćF TJNVMBUJPO DBO QSPEVDF NBOZ EJČFSFOU SFMBUJPOTIJQT &YQFSJNFOU XJUI UIF b QBSBNFUFS

    View full-size slide

  24. Describing models
    Conventional statistical model notation:
    (1) List the variables
    (2) Define each variable as a deterministic or distributional
    function of the other variables

    View full-size slide

  25. Describing models
    F
    8 = β) + 6
    )FSFT TPNF DPEF UP EP UIJT
    3 DPEF
    # function to simulate weights of individuals from height
    sim_weight <- function(H,b,sd) {
    U <- rnorm( length(H) , 0 , sd )
    W <- b*H + U
    return(W)
    }
    ćF BSHVNFOU sd JT UIF TUBOEBSE EFWJBUJPO PG 6 *U EFUFSNJOFT IPX NVDI WBSJBUJPO UIFSF JT
    BSPVOE UIF FYQFDUFE WBMVF PG 8 GPS FBDI ) WBMVF
    5P NBLF UIJT TJNVMBUJPO XPSL XF BMTP OFFE UP TJNVMBUF IFJHIU *MM KVTU VTF B VOJGPSN
    EJTUSJCVUJPO PG BEVMU IFJHIU GSPN DN UP DN BT BO FYBNQMF ćFO XF DBO TJNVMBUF BOE
    QMPU PVS TZOUIFUJD QFPQMF
    3 DPEF
    H <- runif( 200 , min=130 , max=170 )
    W <- sim_weight( H , b=0.5 , sd=5 )
    QBSBNFUFS
    DSJCJOH NPEFMT #FGPSF NPWJOH PO UIF EFWFMPQJOH UIF FTUJNBUPS *
    EBSE XBZ PG EFTDSJCJOH NPEFMT MJLF UIF POF BCPWF VTJOH TUBUJTUJDBM O
    IFMQGVM GPS CPUI HFOFSBUJWF NPEFMT MJLF UIF POF BCPWF BOE GPS TUBU
    XFMM EFWFMPQ "OE JU JT WFSZ DPNNPO JO UIF TDJFODFT
    FJHIU TJNVMBUJPO BCPWF JT EFĕOFE BT
    8J = β)J + 6J
    6J ∼ /PSNBM(, σ)
    )J ∼ 6OJGPSN(, )
    OF JT UIF FRVBUJPO GPS 8 ćF MJUUMF J PO 8 ) BOE 6 JOEJDBUFT BO J

    View full-size slide

  26. EBSE XBZ PG EFTDSJCJOH NPEFMT MJLF UIF POF BCPWF VTJOH TUBUJTUJDBM O
    IFMQGVM GPS CPUI HFOFSBUJWF NPEFMT MJLF UIF POF BCPWF BOE GPS TUBU
    XFMM EFWFMPQ "OE JU JT WFSZ DPNNPO JO UIF TDJFODFT
    FJHIU TJNVMBUJPO BCPWF JT EFĕOFE BT
    8J = β)J + 6J
    6J ∼ /PSNBM(, σ)
    )J ∼ 6OJGPSN(, )
    OF JT UIF FRVBUJPO GPS 8 ćF MJUUMF J PO 8 ) BOE 6 JOEJDBUFT BO J
    OVNCFS :PV DBO SFBE JU MJLF iFBDI JOEJWJEVBMT 8w ćF TFDPOE MJOF E
    G UIPTF VOPCTFSWFE 6 WBMVFT POF GPS FBDI JOEJWJEVBM ćFTF XFSF TJ
    JTUSJCVUJPO XJUI TUBOEBSE EFWJBUJPO σ ćBU ∼ TZNCPM JOEJDBUFT B
    variables definitions

    View full-size slide

  27. EBSE XBZ PG EFTDSJCJOH NPEFMT MJLF UIF POF BCPWF VTJOH TUBUJTUJDBM O
    IFMQGVM GPS CPUI HFOFSBUJWF NPEFMT MJLF UIF POF BCPWF BOE GPS TUBU
    XFMM EFWFMPQ "OE JU JT WFSZ DPNNPO JO UIF TDJFODFT
    FJHIU TJNVMBUJPO BCPWF JT EFĕOFE BT
    8J = β)J + 6J
    6J ∼ /PSNBM(, σ)
    )J ∼ 6OJGPSN(, )
    OF JT UIF FRVBUJPO GPS 8 ćF MJUUMF J PO 8 ) BOE 6 JOEJDBUFT BO J
    OVNCFS :PV DBO SFBE JU MJLF iFBDI JOEJWJEVBMT 8w ćF TFDPOE MJOF E
    G UIPTF VOPCTFSWFE 6 WBMVFT POF GPS FBDI JOEJWJEVBM ćFTF XFSF TJ
    JTUSJCVUJPO XJUI TUBOEBSE EFWJBUJPO σ ćBU ∼ TZNCPM JOEJDBUFT B
    individuals

    View full-size slide

  28. EBSE XBZ PG EFTDSJCJOH NPEFMT MJLF UIF POF BCPWF VTJOH TUBUJTUJDBM O
    IFMQGVM GPS CPUI HFOFSBUJWF NPEFMT MJLF UIF POF BCPWF BOE GPS TUBU
    XFMM EFWFMPQ "OE JU JT WFSZ DPNNPO JO UIF TDJFODFT
    FJHIU TJNVMBUJPO BCPWF JT EFĕOFE BT
    8J = β)J + 6J
    6J ∼ /PSNBM(, σ)
    )J ∼ 6OJGPSN(, )
    OF JT UIF FRVBUJPO GPS 8 ćF MJUUMF J PO 8 ) BOE 6 JOEJDBUFT BO J
    OVNCFS :PV DBO SFBE JU MJLF iFBDI JOEJWJEVBMT 8w ćF TFDPOE MJOF E
    G UIPTF VOPCTFSWFE 6 WBMVFT POF GPS FBDI JOEJWJEVBM ćFTF XFSF TJ
    JTUSJCVUJPO XJUI TUBOEBSE EFWJBUJPO σ ćBU ∼ TZNCPM JOEJDBUFT B
    deterministic
    distributed as

    View full-size slide

  29. EBSE XBZ PG EFTDSJCJOH NPEFMT MJLF UIF POF BCPWF VTJOH TUBUJTUJDBM O
    IFMQGVM GPS CPUI HFOFSBUJWF NPEFMT MJLF UIF POF BCPWF BOE GPS TUBU
    XFMM EFWFMPQ "OE JU JT WFSZ DPNNPO JO UIF TDJFODFT
    FJHIU TJNVMBUJPO BCPWF JT EFĕOFE BT
    8J = β)J + 6J
    6J ∼ /PSNBM(, σ)
    )J ∼ 6OJGPSN(, )
    OF JT UIF FRVBUJPO GPS 8 ćF MJUUMF J PO 8 ) BOE 6 JOEJDBUFT BO J
    OVNCFS :PV DBO SFBE JU MJLF iFBDI JOEJWJEVBMT 8w ćF TFDPOE MJOF E
    G UIPTF VOPCTFSWFE 6 WBMVFT POF GPS FBDI JOEJWJEVBM ćFTF XFSF TJ
    JTUSJCVUJPO XJUI TUBOEBSE EFWJBUJPO σ ćBU ∼ TZNCPM JOEJDBUFT B
    Equation for expected weight

    View full-size slide

  30. EBSE XBZ PG EFTDSJCJOH NPEFMT MJLF UIF POF BCPWF VTJOH TUBUJTUJDBM O
    IFMQGVM GPS CPUI HFOFSBUJWF NPEFMT MJLF UIF POF BCPWF BOE GPS TUBU
    XFMM EFWFMPQ "OE JU JT WFSZ DPNNPO JO UIF TDJFODFT
    FJHIU TJNVMBUJPO BCPWF JT EFĕOFE BT
    8J = β)J + 6J
    6J ∼ /PSNBM(, σ)
    )J ∼ 6OJGPSN(, )
    OF JT UIF FRVBUJPO GPS 8 ćF MJUUMF J PO 8 ) BOE 6 JOEJDBUFT BO J
    OVNCFS :PV DBO SFBE JU MJLF iFBDI JOEJWJEVBMT 8w ćF TFDPOE MJOF E
    G UIPTF VOPCTFSWFE 6 WBMVFT POF GPS FBDI JOEJWJEVBM ćFTF XFSF TJ
    JTUSJCVUJPO XJUI TUBOEBSE EFWJBUJPO σ ćBU ∼ TZNCPM JOEJDBUFT B
    Gaussian error with
    standard deviation sigma

    View full-size slide

  31. EBSE XBZ PG EFTDSJCJOH NPEFMT MJLF UIF POF BCPWF VTJOH TUBUJTUJDBM O
    IFMQGVM GPS CPUI HFOFSBUJWF NPEFMT MJLF UIF POF BCPWF BOE GPS TUBU
    XFMM EFWFMPQ "OE JU JT WFSZ DPNNPO JO UIF TDJFODFT
    FJHIU TJNVMBUJPO BCPWF JT EFĕOFE BT
    8J = β)J + 6J
    6J ∼ /PSNBM(, σ)
    )J ∼ 6OJGPSN(, )
    OF JT UIF FRVBUJPO GPS 8 ćF MJUUMF J PO 8 ) BOE 6 JOEJDBUFT BO J
    OVNCFS :PV DBO SFBE JU MJLF iFBDI JOEJWJEVBMT 8w ćF TFDPOE MJOF E
    G UIPTF VOPCTFSWFE 6 WBMVFT POF GPS FBDI JOEJWJEVBM ćFTF XFSF TJ
    JTUSJCVUJPO XJUI TUBOEBSE EFWJBUJPO σ ćBU ∼ TZNCPM JOEJDBUFT B
    Height uniformly distributed
    from 130cm to 170cm

    View full-size slide

  32. F E
    XIFSF β JT UIF QSPQPSUJPOBMJUZ DPOTUBOU *G JUT GPS FYBNQMF UIFO B QFSTPO XIP JT DN
    UBMM XFJHIT LH
    0CWJPVTMZ OPU FWFSZPOF XJUI UIF TBNF IFJHIU IBT FYBDUMZ UIF TBNF XFJHIU "OE UIF
    TJNVMBUJPO TIPVME SFĘFDU UIJT 4P XF OFFE UP JOUSPEVDF TPNF WBSJBUJPO 8FMM VTF (BVTTJBO
    WBSJBUJPO ćF XBZ *MM EP UIJT JT UP TJNVMBUF PVS VOPCTFSWFE 6 WBSJBCMF GSPN UIF QSFWJPVT
    TFDUJPO ćFO XF DBO DPNQVUF B QFSTPOT XFJHIU BT
    8 = β) + 6
    )FSFT TPNF DPEF UP EP UIJT
    3 DPEF
    # function to simulate weights of individuals from height
    sim_weight <- function(H,b,sd) {
    U <- rnorm( length(H) , 0 , sd )
    W <- b*H + U
    return(W)
    }
    ćF BSHVNFOU sd JT UIF TUBOEBSE EFWJBUJPO PG 6 *U EFUFSNJOFT IPX NVDI WBSJBUJPO UIFSF JT
    BSPVOE UIF FYQFDUFE WBMVF PG 8 GPS FBDI ) WBMVF
    FJHIU TJNVMBUJPO BCPWF JT EFĕOFE BT
    8J = β)J + 6J
    6J ∼ /PSNBM(, σ)
    )J ∼ 6OJGPSN(, )
    OF JT UIF FRVBUJPO GPS 8 ćF MJUUMF J PO 8 ) BOE 6 JOEJDBUFT BO J
    OVNCFS :PV DBO SFBE JU MJLF iFBDI JOEJWJEVBMT 8w ćF TFDPOE MJOF E
    G UIPTF VOPCTFSWFE 6 WBMVFT POF GPS FBDI JOEJWJEVBM ćFTF XFSF TJ
    JTUSJCVUJPO XJUI TUBOEBSE EFWJBUJPO σ ćBU ∼ TZNCPM JOEJDBUFT B
    Q :PV DBO SFBE JU BT iUIF 6 WBMVFT BSF EJTUSJCVUFE BDDPSEJOH UP B OP
    NFBO [FSP BOE TUBOEBSE EFWJBUJPO σw ćF MBTU MJOF JOEJDBUFT BOPUIFS
    Q CVU GPS ) ćFTF XFSF VOJGPSN WBMVFT CFUXFFO BOE

    View full-size slide

  33. Linear Regression
    Drawing the Owl
    (1) Question/goal/estimand
    (2) Scientific model
    (3) Statistical model(s)
    (4) Validate model
    (5) Analyze data
    140 150 160 170 180
    30 35 40 45 50 55 60
    height (cm)
    weight (kg)
    data(Howell1)
    d <- Howell1[Howell1$age>=18,]

    View full-size slide

  34. Estimator
    We want to estimate how the average weight changes with
    height.
    F UIF PCTFSWFE CPEZ XFJHIUT #VU UIF HBSEFO EFQFOET VQPO NPSF UIBO POF VO
    ODF B OPSNBM EJTUSJCVUJPO EFQFOET VQPO CPUI B NFBO BOE WBSJBODF 4P UIFSF BS
    MJUJFT UP DPOTJEFS -VDLJMZ QSPCBCJMJUZ UIFPSZ DBO IBOEMF UIFN BMM FBTJMZ *MM U
    VDUJPO TMPX BOE UIFO UIFSF XJMM CF FYBNQMFT
    BHJOF B MJOF EFTDSJCJOH UIF SFMBUJPOTIJQ CFUXFFO BOZ QBSUJDVMBS WBMVF )J
    BOE
    8J
    GPS B TQFDJĕD )J
    XIJDI JT PęFO XSJUUFO &(8J|)J) " MJOF GPS UIJT FYQFDUB
    CZ
    &(8J|)J) = α + β)J
    SJBCMF α IFSF JT UIF ĶĻŁIJĿİIJĽŁ 8IFO )J = UIFO &(8J|)J) = α "OE β
    PG UIF MJOF
    F MBTU UIJOH XF OFFE CFGPSF XF DBO SFBMMZ CVJME UIF FTUJNBUPS UIF QPTUFSJPS EJTUSJ
    OTJEFS 6 UIF WBSJBUJPO BSPVOE UIF FYQFDUBUJPO ćF MBSHFS σ JT UIF NPSF WB
    NQMJFT B OPSNBM EJTUSJCVUJPO XJUI NFBO &(8J|)J) BOE TUBOEBSE EFWJBUJPO σ
    Average weight
    conditional on height
    intercept
    slope

    View full-size slide

  35. Posterior distribution
    CJMJUJFT .BZCF ĕY σ = BOE UIFO DPOTJEFS POMZ α = , β = BOE α =
    QQPTF XF BMSFBEZ IBWF B QSFMJNJOBSZ QPTUFSJPS EJTUSJCVUJPO 1S(α, β, σ) GP
    JUJFT /PX XF HFU UIF EBUB GPS B OFX JOEJWJEVBM )J
    BOE 8J
    8F XBOU UP VQ
    PS TP JU JODMVEFT UIJT JOEJWJEVBM 'PS BOZ DPNCJOBUJPO PG α BOE β BOE σ UIF Q
    MJUZ JT
    1S(α, β, σ|)J, 8J) =
    1S(8J|)J, α, β, σ) 1S(α, β, σ)
    ;
    ; JT PVS GSJFOEMZ OPSNBMJ[JOH DPOTUBOU ćF BDUJPO BT BMXBZT JT PO UPQ *O #
    XF UBLF UIF QSFWJPVT QPTUFSJPS QSPCBCJMJUZ UIF QSJPS QSPCBCJMJUZ
    PG UIJT DPNC
    , σ) BOE NVMUJQMZ JU CZ UIF SFMBUJWF OVNCFS PG XBZT QSPCBCJMJUZ
    UIBU UIJT DPNC
    FT DPVME QSPEVDF UIF PCTFSWBUJPO 8J
    XIJDI JT 1S(8J|)J, α, β, σ)
    T UBLF XIBU XF IBWF TP GBS BOE BDUVBMMZ EP TPNF DBMDVMBUJPOT -FUT NBLF JU F
    σ = BOE α = 4P UIFO XF KVTU OFFE UP DPOTJEFS EJČFSFOU QPTTJCMF WBMVF

    View full-size slide

  36. Posterior distribution
    posterior probability
    of specific line
    CJMJUJFT .BZCF ĕY σ = BOE UIFO DPOTJEFS POMZ α = , β = BOE α =
    QQPTF XF BMSFBEZ IBWF B QSFMJNJOBSZ QPTUFSJPS EJTUSJCVUJPO 1S(α, β, σ) GP
    JUJFT /PX XF HFU UIF EBUB GPS B OFX JOEJWJEVBM )J
    BOE 8J
    8F XBOU UP VQ
    PS TP JU JODMVEFT UIJT JOEJWJEVBM 'PS BOZ DPNCJOBUJPO PG α BOE β BOE σ UIF Q
    MJUZ JT
    1S(α, β, σ|)J, 8J) =
    1S(8J|)J, α, β, σ) 1S(α, β, σ)
    ;
    ; JT PVS GSJFOEMZ OPSNBMJ[JOH DPOTUBOU ćF BDUJPO BT BMXBZT JT PO UPQ *O #
    XF UBLF UIF QSFWJPVT QPTUFSJPS QSPCBCJMJUZ UIF QSJPS QSPCBCJMJUZ
    PG UIJT DPNC
    , σ) BOE NVMUJQMZ JU CZ UIF SFMBUJWF OVNCFS PG XBZT QSPCBCJMJUZ
    UIBU UIJT DPNC
    FT DPVME QSPEVDF UIF PCTFSWBUJPO 8J
    XIJDI JT 1S(8J|)J, α, β, σ)
    T UBLF XIBU XF IBWF TP GBS BOE BDUVBMMZ EP TPNF DBMDVMBUJPOT -FUT NBLF JU F
    σ = BOE α = 4P UIFO XF KVTU OFFE UP DPOTJEFS EJČFSFOU QPTTJCMF WBMVF

    View full-size slide

  37. Posterior distribution
    posterior probability
    of specific line
    CJMJUJFT .BZCF ĕY σ = BOE UIFO DPOTJEFS POMZ α = , β = BOE α =
    QQPTF XF BMSFBEZ IBWF B QSFMJNJOBSZ QPTUFSJPS EJTUSJCVUJPO 1S(α, β, σ) GP
    JUJFT /PX XF HFU UIF EBUB GPS B OFX JOEJWJEVBM )J
    BOE 8J
    8F XBOU UP VQ
    PS TP JU JODMVEFT UIJT JOEJWJEVBM 'PS BOZ DPNCJOBUJPO PG α BOE β BOE σ UIF Q
    MJUZ JT
    1S(α, β, σ|)J, 8J) =
    1S(8J|)J, α, β, σ) 1S(α, β, σ)
    ;
    ; JT PVS GSJFOEMZ OPSNBMJ[JOH DPOTUBOU ćF BDUJPO BT BMXBZT JT PO UPQ *O #
    XF UBLF UIF QSFWJPVT QPTUFSJPS QSPCBCJMJUZ UIF QSJPS QSPCBCJMJUZ
    PG UIJT DPNC
    , σ) BOE NVMUJQMZ JU CZ UIF SFMBUJWF OVNCFS PG XBZT QSPCBCJMJUZ
    UIBU UIJT DPNC
    FT DPVME QSPEVDF UIF PCTFSWBUJPO 8J
    XIJDI JT 1S(8J|)J, α, β, σ)
    T UBLF XIBU XF IBWF TP GBS BOE BDUVBMMZ EP TPNF DBMDVMBUJPOT -FUT NBLF JU F
    σ = BOE α = 4P UIFO XF KVTU OFFE UP DPOTJEFS EJČFSFOU QPTTJCMF WBMVF
    garden of forking data

    View full-size slide

  38. Posterior distribution
    posterior probability
    of specific line
    prior
    CJMJUJFT .BZCF ĕY σ = BOE UIFO DPOTJEFS POMZ α = , β = BOE α =
    QQPTF XF BMSFBEZ IBWF B QSFMJNJOBSZ QPTUFSJPS EJTUSJCVUJPO 1S(α, β, σ) GP
    JUJFT /PX XF HFU UIF EBUB GPS B OFX JOEJWJEVBM )J
    BOE 8J
    8F XBOU UP VQ
    PS TP JU JODMVEFT UIJT JOEJWJEVBM 'PS BOZ DPNCJOBUJPO PG α BOE β BOE σ UIF Q
    MJUZ JT
    1S(α, β, σ|)J, 8J) =
    1S(8J|)J, α, β, σ) 1S(α, β, σ)
    ;
    ; JT PVS GSJFOEMZ OPSNBMJ[JOH DPOTUBOU ćF BDUJPO BT BMXBZT JT PO UPQ *O #
    XF UBLF UIF QSFWJPVT QPTUFSJPS QSPCBCJMJUZ UIF QSJPS QSPCBCJMJUZ
    PG UIJT DPNC
    , σ) BOE NVMUJQMZ JU CZ UIF SFMBUJWF OVNCFS PG XBZT QSPCBCJMJUZ
    UIBU UIJT DPNC
    FT DPVME QSPEVDF UIF PCTFSWBUJPO 8J
    XIJDI JT 1S(8J|)J, α, β, σ)
    T UBLF XIBU XF IBWF TP GBS BOE BDUVBMMZ EP TPNF DBMDVMBUJPOT -FUT NBLF JU F
    σ = BOE α = 4P UIFO XF KVTU OFFE UP DPOTJEFS EJČFSFOU QPTTJCMF WBMVF
    garden of forking data

    View full-size slide

  39. Posterior distribution
    posterior probability
    of specific line normalizing constant
    prior
    CJMJUJFT .BZCF ĕY σ = BOE UIFO DPOTJEFS POMZ α = , β = BOE α =
    QQPTF XF BMSFBEZ IBWF B QSFMJNJOBSZ QPTUFSJPS EJTUSJCVUJPO 1S(α, β, σ) GP
    JUJFT /PX XF HFU UIF EBUB GPS B OFX JOEJWJEVBM )J
    BOE 8J
    8F XBOU UP VQ
    PS TP JU JODMVEFT UIJT JOEJWJEVBM 'PS BOZ DPNCJOBUJPO PG α BOE β BOE σ UIF Q
    MJUZ JT
    1S(α, β, σ|)J, 8J) =
    1S(8J|)J, α, β, σ) 1S(α, β, σ)
    ;
    ; JT PVS GSJFOEMZ OPSNBMJ[JOH DPOTUBOU ćF BDUJPO BT BMXBZT JT PO UPQ *O #
    XF UBLF UIF QSFWJPVT QPTUFSJPS QSPCBCJMJUZ UIF QSJPS QSPCBCJMJUZ
    PG UIJT DPNC
    , σ) BOE NVMUJQMZ JU CZ UIF SFMBUJWF OVNCFS PG XBZT QSPCBCJMJUZ
    UIBU UIJT DPNC
    FT DPVME QSPEVDF UIF PCTFSWBUJPO 8J
    XIJDI JT 1S(8J|)J, α, β, σ)
    T UBLF XIBU XF IBWF TP GBS BOE BDUVBMMZ EP TPNF DBMDVMBUJPOT -FUT NBLF JU F
    σ = BOE α = 4P UIFO XF KVTU OFFE UP DPOTJEFS EJČFSFOU QPTTJCMF WBMVF
    garden of forking data

    View full-size slide

  40. PS TP JU JODMVEFT UIJT JOEJWJEVBM 'PS BOZ DPNCJOBUJPO PG α BOE β BOE σ UIF Q
    MJUZ JT
    1S(α, β, σ|)J, 8J) =
    1S(8J|)J, α, β, σ) 1S(α, β, σ)
    ;
    ; JT PVS GSJFOEMZ OPSNBMJ[JOH DPOTUBOU ćF BDUJPO BT BMXBZT JT PO UPQ *O #
    XF UBLF UIF QSFWJPVT QPTUFSJPS QSPCBCJMJUZ UIF QSJPS QSPCBCJMJUZ
    PG UIJT DPNC
    , σ) BOE NVMUJQMZ JU CZ UIF SFMBUJWF OVNCFS PG XBZT QSPCBCJMJUZ
    UIBU UIJT DPNC
    FT DPVME QSPEVDF UIF PCTFSWBUJPO 8J
    XIJDI JT 1S(8J|)J, α, β, σ)
    T UBLF XIBU XF IBWF TP GBS BOE BDUVBMMZ EP TPNF DBMDVMBUJPOT -FUT NBLF JU F
    σ = BOE α = 4P UIFO XF KVTU OFFE UP DPOTJEFS EJČFSFOU QPTTJCMF WBMVF
    F β DPVME UBLF BOZ WBMVF CFUXFFO [FSP BOE POF 8F FYDMVEF OFHBUJWF WBMVFT
    X UIBU BWFSBHF XFJHIU JODSFBTFT XJUI IFJHIU "OE XF FYDMVEF WBMVFT HSFBUFS UI
    QFPQMF BSF KVTU OPU UIBU EFOTF‰B DN QFSTPO EPFT OPU XFJHIU PO BWFSBH
    EFWJBUJPO σ *O TUBUJTUJDBM NPEFM OPUBUJPO UIJT NFBOT
    8J ∼ /PSNBM(α + β)J, σ)
    EJTUSJCVUFE BDDPSEJOH UP B OPSNBM EJTUSJCVUJPO XJUI NFBO α + β)J
    BOE
    O σw *UT DVTUPNBSZ UP XSJUF EFĕOJUJPOT MJLF UIJT XJUI NPSF UIBO POF MJOF TP
    SFBE
    8J ∼ /PSNBM(µJ, σ)
    µJ = α + β)J
    BCMF µJ
    JT KVTU UIF FYQFDUFE WBMVF GPS JOEJWJEVBM J "OE JUT FOUJSFMZ B GVODUJ
    SJBCMFT *U JT UIFSF TP JU JT FBTJFS UP SFBE UIF NPEFM "OE UIJT JT BMNPTU BMX
    PEFMT BSF EFĕOFE 4P XFMM GPMMPX UIF DPOWFOUJPO
    W is distributed normally with mean
    that is a linear function of H

    View full-size slide

  41. Grid approximate posterior
    (&0$&/53*$ .0%&-4
    0 0.2 0.4 0.6 0.8 1
    beta
    posterior probability
    0.0 0.1 0.2 0.3 0.4 0.5
    FT UIJT JNQMZ BCPVU UIF MJOF UIPVHI -FUT QSPKFDU UIF QPTUFSJPS EJTUSJCVUJPO

    View full-size slide

  42. Grid approximate posterior
    (&0$&/53*$ .0%&-4
    0 0.2 0.4 0.6 0.8 1
    beta
    posterior probability
    0.0 0.1 0.2 0.3 0.4 0.5
    FT UIJT JNQMZ BCPVU UIF MJOF UIPVHI -FUT QSPKFDU UIF QPTUFSJPS EJTUSJCVUJPO
    130 140 150 160 170
    50 60 70 80 90
    height (cm)
    weight (kg)
    N = 3

    View full-size slide

  43. 130 140 150 160 170
    50 60 70 80 90
    height (cm)
    weight (kg)
    N = 1
    130 140 150 160 170
    50 60 70 80 90
    height (cm)
    weight (kg)
    N = 2
    130 140 150 160 170
    50 60 70 80 90
    height (cm)
    weight (kg)
    N = 3
    130 140 150 160 170
    50 60 70 80 90
    weight (kg)
    N = 10
    130 140 150 160 170
    50 60 70 80 90
    weight (kg)
    N = 20
    130 140 150 160 170
    50 60 70 80 90
    weight (kg)
    N = 89

    View full-size slide

  44. Updating the posterior

    View full-size slide

  45. Updating the posterior

    View full-size slide

  46. Enough grid approximation
    We’ll use quadratic approximation for the rest of the first half
    of the course.
    BMSFBEZ 8IBU XF BMTP OFFE UP EFĕOF JT UIF QSJPS *O UIF DPEF XFWF
    QVU UIF QSFWJPVT QPTUFSJPS PS VTF B QSJPS UIBU BTTJHOFE UIF TBNF JOJUJBM
    TJCJMJUZ 8IFO UIFSF BSF JOĕOJUF QPTTJCJMJUJFT HJWFO UIFN BMM UIF TBNF
    E JEFB $POTJEFS GPS FYBNQMF UIF TMPQF β 8F LOPX JU JT QPTJUJWF BOE
    POF 8F VTFE UIBU LOPXMFEHF XIFO XF CVJMU PVS MJTU PG QPTTJCJMJUJFT
    FYQSFTT TVDI LOPXMFEHF JO UIF NPEFM EFĕOJUJPO BT XFMM )FSFT NZ
    8J ∼ /PSNBM(µJ, σ)
    µJ = α + β)J
    α ∼ /PSNBM(, )
    β ∼ 6OJGPSN(, )
    σ ∼ 6OJGPSN(, )
    PU NPSF BCPVU UIFTF QSJPST JO B NPNFOU #VU IFSFT UIF DPEF GPS UIJT
    m3.1 <- quap(
    alist(
    W ~ dnorm(mu,sigma),
    mu <- a + b*H,
    a ~ dnorm(0,10),
    b ~ dunif(0,1),
    sigma ~ dunif(0,10)
    ) , data=list(W=W,H=H) )

    View full-size slide

  47. Enough grid approximation
    We’ll use quadratic approximation for the rest of the first half
    of the course.
    BMSFBEZ 8IBU XF BMTP OFFE UP EFĕOF JT UIF QSJPS *O UIF DPEF XFWF
    QVU UIF QSFWJPVT QPTUFSJPS PS VTF B QSJPS UIBU BTTJHOFE UIF TBNF JOJUJBM
    TJCJMJUZ 8IFO UIFSF BSF JOĕOJUF QPTTJCJMJUJFT HJWFO UIFN BMM UIF TBNF
    E JEFB $POTJEFS GPS FYBNQMF UIF TMPQF β 8F LOPX JU JT QPTJUJWF BOE
    POF 8F VTFE UIBU LOPXMFEHF XIFO XF CVJMU PVS MJTU PG QPTTJCJMJUJFT
    FYQSFTT TVDI LOPXMFEHF JO UIF NPEFM EFĕOJUJPO BT XFMM )FSFT NZ
    8J ∼ /PSNBM(µJ, σ)
    µJ = α + β)J
    α ∼ /PSNBM(, )
    β ∼ 6OJGPSN(, )
    σ ∼ 6OJGPSN(, )
    PU NPSF BCPVU UIFTF QSJPST JO B NPNFOU #VU IFSFT UIF DPEF GPS UIJT
    m3.1 <- quap(
    alist(
    W ~ dnorm(mu,sigma),
    mu <- a + b*H,
    a ~ dnorm(0,10),
    b ~ dunif(0,1),
    sigma ~ dunif(0,10)
    ) , data=list(W=W,H=H) )

    View full-size slide

  48. Prior predictive distribution
    Priors should express scientific knowledge, but softly
    When H = 0, W = 0
    Weight increases (on avg) with height
    Weight (kg) is less than height (cm)
    sigma must be positive
    F
    VTFE TP GBS XF DPVME JOQVU UIF QSFWJPVT QPTUFSJPS PS VTF B QSJPS
    QSPCBCJMJUZ UP FWFSZ QPTTJCJMJUZ 8IFO UIFSF BSF JOĕOJUF QPTTJCJM
    QSJPS QSPCBCJMJUZ JT B CBE JEFB $POTJEFS GPS FYBNQMF UIF TMPQF
    UIBU JU JTOU HSFBUFS UIBO POF 8F VTFE UIBU LOPXMFEHF XIFO XF
    8F XBOU UP CF BCMF UP FYQSFTT TVDI LOPXMFEHF JO UIF NPEFM E
    TUBSUJOH QSPQPTBM
    8J ∼ /PSNBM(µJ, σ)
    µJ = α + β)J
    α ∼ /PSNBM(, )
    β ∼ 6OJGPSN(, )
    σ ∼ 6OJGPSN(, )
    8FSF HPJOH UP UIJOL B MPU NPSF BCPVU UIFTF QSJPST JO B NPNFO
    TUBUJTUJDBM NPEFM

    View full-size slide

  49. Prior predictive distribution
    Understand the implications of priors
    through simulation
    What do the observable variables look
    like with these priors?
    F
    VTFE TP GBS XF DPVME JOQVU UIF QSFWJPVT QPTUFSJPS PS VTF B QSJPS
    QSPCBCJMJUZ UP FWFSZ QPTTJCJMJUZ 8IFO UIFSF BSF JOĕOJUF QPTTJCJM
    QSJPS QSPCBCJMJUZ JT B CBE JEFB $POTJEFS GPS FYBNQMF UIF TMPQF
    UIBU JU JTOU HSFBUFS UIBO POF 8F VTFE UIBU LOPXMFEHF XIFO XF
    8F XBOU UP CF BCMF UP FYQSFTT TVDI LOPXMFEHF JO UIF NPEFM E
    TUBSUJOH QSPQPTBM
    8J ∼ /PSNBM(µJ, σ)
    µJ = α + β)J
    α ∼ /PSNBM(, )
    β ∼ 6OJGPSN(, )
    σ ∼ 6OJGPSN(, )
    8FSF HPJOH UP UIJOL B MPU NPSF BCPVU UIFTF QSJPST JO B NPNFO
    TUBUJTUJDBM NPEFM

    View full-size slide

  50. F ES
    NFBOT UIF TFBSDI GBJMFE UP ĕOE UIF DPNCJOBUJPO PG VOLOPXOT UIBU NBYJNJ[FT QPTUFSJPS QSPCBCJMJUZ
    4UBSUJOH UIF TFBSDI GSPN BOPUIFS QPTTJCMZ SBOEPN MPDBUJPO PS JODSFBTJOH IPX MPOH UIF TFBSDI JT BM
    MPXFE UP DPOUJOVF DBO PęFO SFTPMWF UIJT #VU NPSF PęFO JU JOEJDBUFT UIBU UIF NPEFM JT NJTTQFDJĕFE
    ćF 'PML ćFPSFN PG 4UBUJTUJDBM $PNQVUJOH TPDBMMFE CZ "OESFX (FMNBO JT 8IFO ZPV IBWF DPN
    QVUBUJPOBM QSPCMFNT PęFO UIFSFT B QSPCMFN XJUI ZPVS NPEFM
    1SJPS QSFEJDUJWF EJTUSJCVUJPOT #FGPSF UBDLMJOH B SFBM TBNQMF BOE DPNQVUJOH BO FTUJ
    NBUF MFUT UIJOL NPSF BCPVU UIF QSJPS EJTUSJCVUJPO JO PVU MJUUMF HPMFN
    3 DPEF

    n <- 1e3
    a <- rnorm(n,0,10)
    b <- runif(n,0,1)
    plot( NULL , xlim=c(130,170) , ylim=c(50,90) ,
    xlab="height (cm)" , ylab="weight (kg)" )
    for ( j in 1:50 ) abline( a=a[j] , b=b[j] , lwd=2 , col=2 )
    ćF EBUB ćF EBUB DPOUBJOFE JO data(Howell1) BSF QBSUJBM DFOTVT EBUB GPS UIF %PCF
    BSFB ,VOH 4BO DPNQJMFE GSPN JOUFSWJFXT DPOEVDUFE CZ /BODZ )PXFMM JO UIF MBUF T
    ćF ,VOH 4BO BSF UIF NPTU GBNPVT GPSBHJOH QPQVMBUJPO PG UIF UXFOUJFUI DFOUVSZ MBSHFMZ
    CFDBVTF PG EFUBJMFE RVBOUJUBUJWF TUVEJFT CZ QFPQMF MJLF )PXFMM -PBE UIF EBUB BOE QMBDF UIFN
    JOUP B DPOWFOJFOU PCKFDU XJUI
    3 DPEF

    library(rethinking)
    data(Howell1)
    F
    VTFE TP GBS XF DPVME JOQVU UIF QSFWJPVT QPTUFSJPS PS VTF B QSJPS
    QSPCBCJMJUZ UP FWFSZ QPTTJCJMJUZ 8IFO UIFSF BSF JOĕOJUF QPTTJCJM
    QSJPS QSPCBCJMJUZ JT B CBE JEFB $POTJEFS GPS FYBNQMF UIF TMPQF
    UIBU JU JTOU HSFBUFS UIBO POF 8F VTFE UIBU LOPXMFEHF XIFO XF
    8F XBOU UP CF BCMF UP FYQSFTT TVDI LOPXMFEHF JO UIF NPEFM E
    TUBSUJOH QSPQPTBM
    8J ∼ /PSNBM(µJ, σ)
    µJ = α + β)J
    α ∼ /PSNBM(, )
    β ∼ 6OJGPSN(, )
    σ ∼ 6OJGPSN(, )
    8FSF HPJOH UP UIJOL B MPU NPSF BCPVU UIFTF QSJPST JO B NPNFO
    TUBUJTUJDBM NPEFM

    View full-size slide

  51. F ES
    NFBOT UIF TFBSDI GBJMFE UP ĕOE UIF DPNCJOBUJPO PG VOLOPXOT UIBU NBYJNJ[FT QPTUFSJPS QSPCBCJMJUZ
    4UBSUJOH UIF TFBSDI GSPN BOPUIFS QPTTJCMZ SBOEPN MPDBUJPO PS JODSFBTJOH IPX MPOH UIF TFBSDI JT BM
    MPXFE UP DPOUJOVF DBO PęFO SFTPMWF UIJT #VU NPSF PęFO JU JOEJDBUFT UIBU UIF NPEFM JT NJTTQFDJĕFE
    ćF 'PML ćFPSFN PG 4UBUJTUJDBM $PNQVUJOH TPDBMMFE CZ "OESFX (FMNBO JT 8IFO ZPV IBWF DPN
    QVUBUJPOBM QSPCMFNT PęFO UIFSFT B QSPCMFN XJUI ZPVS NPEFM
    1SJPS QSFEJDUJWF EJTUSJCVUJPOT #FGPSF UBDLMJOH B SFBM TBNQMF BOE DPNQVUJOH BO FTUJ
    NBUF MFUT UIJOL NPSF BCPVU UIF QSJPS EJTUSJCVUJPO JO PVU MJUUMF HPMFN
    3 DPEF

    n <- 1e3
    a <- rnorm(n,0,10)
    b <- runif(n,0,1)
    plot( NULL , xlim=c(130,170) , ylim=c(50,90) ,
    xlab="height (cm)" , ylab="weight (kg)" )
    for ( j in 1:50 ) abline( a=a[j] , b=b[j] , lwd=2 , col=2 )
    ćF EBUB ćF EBUB DPOUBJOFE JO data(Howell1) BSF QBSUJBM DFOTVT EBUB GPS UIF %PCF
    BSFB ,VOH 4BO DPNQJMFE GSPN JOUFSWJFXT DPOEVDUFE CZ /BODZ )PXFMM JO UIF MBUF T
    ćF ,VOH 4BO BSF UIF NPTU GBNPVT GPSBHJOH QPQVMBUJPO PG UIF UXFOUJFUI DFOUVSZ MBSHFMZ
    CFDBVTF PG EFUBJMFE RVBOUJUBUJWF TUVEJFT CZ QFPQMF MJLF )PXFMM -PBE UIF EBUB BOE QMBDF UIFN
    JOUP B DPOWFOJFOU PCKFDU XJUI
    3 DPEF

    library(rethinking)
    data(Howell1)
    130 140 150 160 170
    30 40 50 60 70
    height (cm)
    weight (kg)

    View full-size slide

  52. Sermon on priors
    There are no correct priors, only
    scientifically justifiable priors
    Justify with information outside the data —
    like rest of model
    Priors not so important in simple models
    Very important/useful in complex models
    Need to practice now: simulate, understand
    130 140 150 160 170
    30 40 50 60 70
    height (cm)
    weight (kg)

    View full-size slide

  53. Linear Regression
    Drawing the Owl
    (1) Question/goal/estimand
    (2) Scientific model
    (3) Statistical model(s)
    (4) Validate model
    (5) Analyze data
    140 150 160 170 180
    30 35 40 45 50 55 60
    height (cm)
    weight (kg)
    data(Howell1)
    d <- Howell1[Howell1$age>=18,]

    View full-size slide

  54. Simulation-Based Validation
    Bare minimum: Test statistical model
    with simulated observations from
    scientific model
    Golem might be broken
    Even working golems might not deliver
    what you hoped
    Strong test: Simulation-Based Calibration
    Fahrvergnügen

    View full-size slide

  55. # simulate a sample of 10 people
    set.seed(93)
    H <- runif(10,130,170)
    W <- sim_weight(H,b=0.5,sd=5)
    # run the model
    library(rethinking)
    m3.1 <- quap(
    alist(
    W ~ dnorm(mu,sigma),
    mu <- a + b*H,
    a ~ dnorm(0,10),
    b ~ dunif(0,1),
    sigma ~ dunif(0,10)
    ) , data=list(W=W,H=H) )
    # summary
    precis( m3.1 )
    mean sd 5.5% 94.5%
    a 5.19 9.43 -9.88 20.26
    b 0.49 0.07 0.38 0.59
    sigma 5.64 1.29 3.57 7.71
    Vary slope and make sure
    posterior mean tracks it
    Use a large sample to see
    that it converges to data
    generating value
    Same for other unknowns
    (parameters)

    View full-size slide

  56. Linear Regression
    Drawing the Owl
    (1) Question/goal/estimand
    (2) Scientific model
    (3) Statistical model(s)
    (4) Validate model
    (5) Analyze data 140 150 160 170 180
    30 35 40 45 50 55 60
    height (cm)
    weight (kg)
    data(Howell1)
    d <- Howell1[Howell1$age>=18,]

    View full-size slide

  57. Analyze the data
    dat <- list(W=d2$weight,H=d2$height)
    m3.2 <- quap(
    alist(
    W ~ dnorm(mu,sigma),
    mu <- a + b*H,
    a ~ dnorm(0,10),
    b ~ dunif(0,1),
    sigma ~ dunif(0,10)
    ) , data=dat )
    precis( m3.2 )
    mean sd 5.5% 94.5%
    a -43.38 4.17 -50.04 -36.71
    b 0.57 0.03 0.53 0.61
    sigma 4.25 0.16 3.99 4.51
    a
    0.50 0.55 0.60 0.65
    -1
    -60 -50 -40 -30
    0.11
    0.50 0.55 0.60 0.65
    b
    -0.11
    -60 -50 -40 -30
    3.8 4.0 4.2 4.4 4.6 4.8
    3.8 4.2 4.6
    sigma

    View full-size slide

  58. Obey The Law
    First Law of Statistical Interpretation:
    The parameters are not independent of
    one another and cannot always be
    independently interpreted
    Instead:
    Push out posterior predictions and
    describe/interpret those

    View full-size slide

  59. Posterior predictive distribution
    140 150 160 170 180
    30 35 40 45 50 55 60
    height (cm)
    weight (kg)
    post <- extract.samples(m3.2)
    plot( d2$height , d2$weight , col=2 , lwd=3 ,
    xlab="height (cm)" , ylab="weight (kg)" )
    for ( j in 1:20 )
    abline( a=post$a[j] , b=post$b[j] , lwd=1 )
    The posterior is full of lines

    View full-size slide

  60. 140 150 160 170 180
    30 35 40 45 50 55 60
    height (cm)
    weight (kg)
    N = 1
    140 150 160 170 180
    30 35 40 45 50 55 60
    height (cm)
    weight (kg)
    N = 5
    140 150 160 170 180
    30 35 40 45 50 55 60
    height (cm)
    weight (kg)
    N = 25
    140 150 160 170 180
    30 35 40 45 50 55 60
    height (cm)
    weight (kg)
    N = 50
    140 150 160 170 180
    30 35 40 45 50 55 60
    height (cm)
    weight (kg)
    N = 352

    View full-size slide

  61. 140 150 160 170 180
    30 35 40 45 50 55 60
    height (cm)
    weight (kg)
    Posterior predictive distribution
    post <- extract.samples(m3.2)
    plot( d2$height , d2$weight , col=2 , lwd=3 ,
    xlab="height (cm)" , ylab="weight (kg)" )
    for ( j in 1:20 )
    abline( a=post$a[j] , b=post$b[j] , lwd=1 )
    The posterior is full of lines
    The posterior is full of people
    height_seq <- seq(130,190,len=20)
    W_postpred <- sim( m3.2 ,
    data=list(H=height_seq) )
    W_PI <- apply( W_postpred , 2 , PI )
    lines( height_seq , W_PI[1,] , lty=2 , lwd=2 )
    lines( height_seq , W_PI[2,] , lty=2 , lwd=2 )

    View full-size slide

  62. Flexible Linear Thermometers
    Generative model
    How does height influence
    weight?
    W = f(H,U)
    “Weight is some function of height
    & unmeasured stuff”
    H W U

    View full-size slide

  63. Flexible Linear Thermometers
    Generative model
    How does height influence
    weight?
    W = f(H,U)
    “Weight is some function of height
    & unmeasured stuff”
    F E
    µJ = α + β)J
    :PVWF TFFO UIJT NVDI BMSFBEZ 8IBU XF BMTP OFFE UP EFĕOF JT UIF
    VTFE TP GBS XF DPVME JOQVU UIF QSFWJPVT QPTUFSJPS PS VTF B QSJPS UIBU
    QSPCBCJMJUZ UP FWFSZ QPTTJCJMJUZ 8IFO UIFSF BSF JOĕOJUF QPTTJCJMJUJFT
    QSJPS QSPCBCJMJUZ JT B CBE JEFB $POTJEFS GPS FYBNQMF UIF TMPQF β 8
    UIBU JU JTOU HSFBUFS UIBO POF 8F VTFE UIBU LOPXMFEHF XIFO XF CVJ
    8F XBOU UP CF BCMF UP FYQSFTT TVDI LOPXMFEHF JO UIF NPEFM EFĕO
    TUBSUJOH QSPQPTBM
    8J ∼ /PSNBM(µJ, σ)
    µJ = α + β)J
    α ∼ /PSNBM(, )
    β ∼ 6OJGPSN(, )
    σ ∼ 6OJGPSN(, )
    H W U
    Statistical model
    How does average weight
    change with height?

    View full-size slide

  64. Course Schedule
    Week 1 Bayesian inference Chapters 1, 2, 3
    Week 2 Linear models & Causal Inference Chapter 4
    Week 3 Causes, Confounds & Colliders Chapters 5 & 6
    Week 4 Overfitting / Interactions Chapters 7 & 8
    Week 5 MCMC & Generalized Linear Models Chapters 9, 10, 11
    Week 6 Integers & Other Monsters Chapters 11 & 12
    Week 7 Multilevel models I Chapter 13
    Week 8 Multilevel models II Chapter 14
    Week 9 Measurement & Missingness Chapter 15
    Week 10 Generalized Linear Madness Chapter 16
    https://github.com/rmcelreath/stat_rethinking_2023

    View full-size slide