Statistical Rethinking Fall 2017 Lecture 03

Week 2, Lecture 3, Statistical Rethinking: A Bayesian Course with Examples in R and Stan. This lecture covers Chapter 4 of the book.

Richard McElreath

November 01, 2017

Transcript

  1. Triumph of Geocentrism

    • Claudius Ptolemy (90–168), Egyptian mathematician
    • Accurate model of planetary motion
    • Epicycles: orbits on orbits
    • Fourier series
    [Figure: Ptolemaic model of planetary motion, labeled Earth, equant, planet, epicycle, deferent]
  2. Geocentrism / Regression

    Geocentrism
    • Descriptively accurate
    • Mechanistically wrong
    • General method of approximation
    • Known to be wrong

    Regression
    • Descriptively accurate
    • Mechanistically wrong
    • General method of approximation
    • Taken too seriously
  3. Linear regression

    • Simple statistical golems
    • Model of the mean and variance of a normally (Gaussian) distributed measure
    • Mean as an additive combination of predictors
    • Constant variance
  4. 1809 Bayesian argument for normal error and least-squares estimation

    [Image: scanned page from Mathematics Magazine on the parameters describing a planetary orbit. Recoverable text: aphelion is the point on the orbit furthest from the sun and perihelion is the point closest to the sun; the ellipse itself is determined by the length of its semimajor axis and its eccentricity; the position of the planet on the elliptical orbit is determined by the time of perihelion passage; collectively, six quantities are referred to as the elements of the orbit.]
  5. Why normal?

    • Why are normal (Gaussian) distributions so common in statistics?
      1. Easy to calculate with
      2. Common in nature
      3. Very conservative assumption
    [Figure: Gaussian density of x, horizontal axis from −4σ to 4σ, with a 95% region marked]
  6. [Figure 4.2: random walks, position (−6 to 6) plotted against step number (0 to 16), with densities of position after 4, 8, and 16 steps]
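The random-walk figure can be approximated with a few lines of base R. This is only a sketch under assumptions: each step is drawn from Uniform(−1, 1), and the number of walkers is arbitrary. It checks that the spread of final positions matches the value implied by adding independent steps, $\sqrt{k/3}$ after $k$ steps.

```r
set.seed(1)
n_walkers <- 10000  # number of simulated walkers (arbitrary choice)

# Final position after n_steps steps, each drawn from Uniform(-1, 1)
final_position <- function(n_steps) {
  replicate(n_walkers, sum(runif(n_steps, -1, 1)))
}

pos4  <- final_position(4)
pos8  <- final_position(8)
pos16 <- final_position(16)

# One Uniform(-1, 1) step has variance 1/3, so k summed steps have sd sqrt(k/3)
simulated <- c(sd(pos4), sd(pos8), sd(pos16))
expected  <- sqrt(c(4, 8, 16) / 3)
round(rbind(simulated, expected), 2)

# Share of 16-step positions within 2 sd of zero (about 0.95 for a Gaussian)
mean(abs(pos16) < 2 * sqrt(16 / 3))
```

Plotting `density(pos4)`, `density(pos8)`, and `density(pos16)` should show the shape approaching a bell curve as steps accumulate.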
  7. Why normal?

    • Processes that produce normal distributions
      • Addition
      • Products of small deviations
      • Logarithms of products
    [Image: Francis Galton’s 1894 “bean machine” for simulating normal distributions]
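The last two processes on this slide can be illustrated by simulation. The sketch below assumes 12 multiplicative factors per product and deviation sizes of up to 10% ("small") and up to 50% ("large"); these numbers are illustrative choices, not values from the lecture. Skewness near zero is used as a rough indicator of a symmetric, Gaussian-like shape.

```r
set.seed(2)

# Product of 12 factors, each 1 plus a small deviation (up to 10%)
small <- replicate(10000, prod(1 + runif(12, 0, 0.1)))

# Product of 12 factors with large deviations (up to 50%): noticeably skewed
big <- replicate(10000, prod(1 + runif(12, 0, 0.5)))

# The logarithm of a product is a sum of logarithms, so addition takes over again
log_big <- log(big)

# Sample skewness: close to 0 for a symmetric distribution
skewness <- function(x) mean((x - mean(x))^3) / sd(x)^3
round(c(small = skewness(small), big = skewness(big), log_big = skewness(log_big)), 2)
```

Small deviations multiply to something close to a sum, which is why the first product is nearly symmetric while the second only becomes so on the log scale.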
  8. Why normal?

    • Ontological perspective
      • Processes which add fluctuations result in dampening
      • Damped fluctuations end up Gaussian
      • No information left, except mean and variance
      • Can’t infer process from distribution!
    • Epistemological perspective
      • Know only mean and variance
      • Then the least surprising and most conservative (maximum entropy) distribution is Gaussian
      • Nature likes maximum entropy distributions
  9. Linear models

    • Models of normally distributed data are common
    • “General Linear Model”: t-test, single regression, multiple regression, ANOVA, ANCOVA, MANOVA, MANCOVA, yadda yadda yadda
    • All the same thing
    • Learn strategy, not procedure
  10. Language for modeling

    • Questions to answer
      1. What are the outcomes?
      2. How are the outcomes constrained (what is the likelihood)?
      3. What are the predictors, if any?
      4. How do predictors relate to the likelihood?
      5. What are the priors?
    [Image from Breath of Bones: A Tale of the Golem]
  11. Language for modeling

    Book excerpt (bracketed text is reconstructed where the page image was cropped): “[...] the model definitions. We will also be able to see natural ways to change these as[s]umptions, instead of feeling trapped within some procrustean model type, lik[e r]egression or multiple regression or ANOVA or ANCOVA or such. These are a[ll t]he same kind of model, and that fact becomes obvious once we know how to tal[k a]bout models as mappings of one set of variables through a probability distribu[ti]on onto another set of variables.

    Re-describing the globe tossing model. It’s good to work with examples. [R]ecall the proportion of water problem from previous chapters. The model in tha[t c]ase was always:

        $n_W \sim \mathrm{Binomial}(n, p)$
        $p \sim \mathrm{Uniform}(0, 1)$

    where $n_W$ was the observed count of water points, $n$ was the total number o[f p]oints, and $p$ was the proportion of water. Read the above statement as: The count $n_W$ is distributed binomially with sample size $n$ and probability $p$. The prior for $p$ is assumed to be uniform between zero and one. Once we know the model in this way, we automatically know all of its assump[tions] [...]”

    • Revisit globe tossing model. Slide labels on the two model lines:
      • $n_W$: outcome
      • $\sim$: “is distributed”
      • $\mathrm{Binomial}(n, p)$: likelihood, with its parameters
      • $p$: parameter to estimate
      • $\mathrm{Uniform}(0, 1)$: prior distribution
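To make the globe tossing definition concrete, here is a minimal grid-approximation sketch of the model $n_W \sim \mathrm{Binomial}(n, p)$, $p \sim \mathrm{Uniform}(0, 1)$. The observed counts (6 water points in 9 tosses) and the grid size are assumptions made for illustration; the slide does not state any data.

```r
# Hypothetical data: 6 water points out of 9 tosses (assumed, not from the slide)
n_W <- 6
n   <- 9

# Candidate values of p, the parameter to estimate
p_grid <- seq(0, 1, length.out = 1000)

# Prior: p ~ Uniform(0, 1)
prior <- dunif(p_grid, 0, 1)

# Likelihood: n_W ~ Binomial(n, p), evaluated at every candidate p
likelihood <- dbinom(n_W, size = n, prob = p_grid)

# Posterior is proportional to likelihood times prior; normalize to sum to 1
posterior <- likelihood * prior
posterior <- posterior / sum(posterior)

# Most plausible value of p on the grid
p_grid[which.max(posterior)]
```

Each line of the model definition maps onto one line of code, which is the point of writing models in this language.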
  12. Language for modeling

    [Book excerpt repeated from the previous slide, continuing:] “[...] Once we know the model in this way, we automatically know all of its assump[ti]ons. We know the binomial distribution assumes that each sample (globe toss) [is] independent of the others, and so we also know that the model assumes tha[t] [...]”

    • Revisit globe tossing model: The count $n_W$ is distributed binomially with sample size $n$ and probability $p$. The prior for $p$ is assumed to be uniform between zero and one.
  13. Some data: Kalahari foragers

    Book excerpt (bracketed text is reconstructed where the page image was cropped or digits were lost): “[...] census data for the Dobe area !Kung San, compiled from interviews conducted by Nancy Howell in the late [1960s]. For the non-anthropologi[sts] reading along, the !Kung San are the most famous foraging populatio[n] of the [20]th century, largely because of detailed quantitative studies b[y] people like Howell. Load the data and place them into a convenient object with:

        library(rethinking)
        data(Howell1)
        d <- Howell1

    What you have now is a data frame named simply d. I use the nam[e] d over and over again in this book to refer to the data frame we ar[e] working with at the moment. I keep its name short to save you typin[g]. A data frame is a special kind of object in R. It is a table with name[d] columns, corresponding to variables, and numbered rows, correspondin[g] to individual cases. In this example, the cases are individuals. Inspec[t] [...]”

             height   weight age male
        1   151.765 47.82561  63    1
        2   139.700 36.48581  63    0
        3   136.525 31.86484  65    0
        4   156.845 53.04191  41    1
        5   145.415 41.27687  51    0
        6   163.830 62.99259  35    1
        ...
        544 158.750 52.53162  68    1
  14. Gaussian model

    • A first model:

        $h_i \sim \mathrm{Normal}(\mu, \sigma)$

    [Figure: density of height (cm), horizontal axis 140 to 180]
    [Book excerpt behind the figure is cropped on both sides; the same passage appears more completely on the next slide.]
  15. Gaussian model

    Book excerpt (bracketed text is reconstructed where the page image was cropped or digits were lost): “[...] could be a mixture of different normal distributions, for example, and i[n that case] you won’t be able to detect the underlying normality just by eyeballin[g the den]sity plot. So the heights are approximately normally distributed, but which norma[l distrib]ution? There are an infinite number of them, with an infinite number o[f differe]nt means and variances. We’re ready to write down the general mode[l] [...] the unique parameters that maximize the likelihood. To define the h[eights as n]ormally distributed with a mean µ and standard deviation σ, we write:

        $h_i \sim \mathrm{Normal}(\mu, \sigma)$

    [In] many books you’ll see the same model written as $h_i \sim \mathcal{N}(\mu, \sigma)$, which m[eans the] same thing. The symbol h refers to the list of heights, and the subsc[ript i mea]ns each individual element of this list. It is conventional to use i beca[use it stan]ds for index. The index i takes on row numbers, and so in this examp[le can take] any value from [1] to [352] (the number of heights in d2$height). As suc[h, the mo]del above is saying that all the golem knows about each height measure[ment is d]efined by the same normal distribution, with a constant mean µ and stan[dard devi]ation σ. Before long, those little i’s are going to show up on the righthan[d] [...]”

    Slide labels on the model line:
      • $h_i$: outcome
      • $\sim$: “is distributed”
      • $\mathrm{Normal}(\mu, \sigma)$: likelihood, with mean $\mu$ and standard deviation $\sigma$
    • Height $h_i$ of an individual $i$ is distributed normally, with mean mu and standard deviation sigma.
  16. Gaussian model

    • Add priors:

    Book excerpt (bracketed text is reconstructed where the page image was cropped or digits were lost): “[...] highly correlated heights. But the overall distribution of female height remains [nor]mal. In such cases, i.i.d. remains perfectly useful, despite ignoring the correla[tions]. [...] [To complete th]e model, we’re going to need some priors. The parameters to be estimated [are both µ and σ, s]o we need a prior Pr(µ, σ), the joint prior probability for all parameters. [In most cases, prio]rs are specified independently for each parameter, which amounts to as[suming Pr(µ, σ) =] Pr(µ) Pr(σ). Then we can write:

        $h_i \sim \mathrm{Normal}(\mu, \sigma)$        [likelihood]
        $\mu \sim \mathrm{Normal}(178, 20)$            [µ prior]
        $\sigma \sim \mathrm{Uniform}(0, 50)$          [σ prior]

    [The labels on the ri]ght are not part of the model, but instead just notes to help you keep track [of the purpose of e]ach line. The prior for µ is a broad Gaussian prior, centered on [178] cm, [with 95% of probab]ility between [178 ± 40]. Your author is [...] cm tall. And the range from [...] cm to [...] cm encom[passes] [...] of plausible mean heights for human populations. So domain-specific [...]”

    [Figure: prior for mu, dnorm(x, 178, 20), plotted over 120 to 240]
    [Figure: prior for sigma, dunif(x, 0, 50), plotted over 0 to 80]
  17. Gaussian model

    • What do these priors imply about height, before we see data? Simulate! => prior predictive distribution
    • 100 cm = 3.3 feet; 200 cm = 6.5 feet

    R code:

        sample_mu <- rnorm( 1e4 , 178 , 20 )
        sample_sigma <- runif( 1e4 , 0 , 50 )
        prior_h <- rnorm( 1e4 , sample_mu , sample_sigma )
        dens( prior_h )

    Book excerpt: “The density plot you get shows a vaguely bell-shaped density with thick tails. It is the expected distribution of heights, averaged over the prior. Notice that the prior probability distribution of height is not itself Gaussian. This is okay. The distribution you see is not an empirical expectation, but rather the distribution of relative plausibilities of different heights, before seeing the data. Play around with the numbers in the priors above to explore their effects on the prior probability density of heights.

    Rethinking: A farewell to epsilon. Some readers will have already met an alternative notation for a Gaussian linear model:

        $h_i = \mu + \epsilon_i$
        $\epsilon_i \sim \mathrm{Normal}(0, \sigma)$ [...]”

    [Figure: density of prior_h, horizontal axis about 50 to 300, N = 10000, bandwidth = 2.256]
  18. Estimating mu and sigma

    • Aim for the posterior distribution, which is now 2-dimensional
    • Grid approximation: compute the posterior for many combinations of mu and sigma
    [Figure: samples from the grid posterior, mu (153 to 156) against sigma (7 to 9)]

    R code:

        dens( sample.mu )
        dens( sample.sigma )
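A minimal base-R sketch of grid approximation for mu and sigma follows. It assumes simulated heights as a stand-in for `d2$height` (the real data need the rethinking package), and the grid ranges and resolution are illustrative choices. The priors are the ones used in this lecture, Normal(178, 20) for mu and Uniform(0, 50) for sigma.

```r
set.seed(3)
# Simulated stand-in for the adult heights (values are assumptions)
height <- rnorm(352, mean = 154.6, sd = 7.7)

# Grid of candidate (mu, sigma) combinations
grid <- expand.grid(
  mu    = seq(150, 160, length.out = 200),
  sigma = seq(4, 12, length.out = 200)
)

# Log-likelihood of all heights at each combination
grid$log_lik <- sapply(seq_len(nrow(grid)), function(i) {
  sum(dnorm(height, grid$mu[i], grid$sigma[i], log = TRUE))
})

# Add the log priors: mu ~ Normal(178, 20), sigma ~ Uniform(0, 50)
grid$log_post <- grid$log_lik +
  dnorm(grid$mu, 178, 20, log = TRUE) +
  dunif(grid$sigma, 0, 50, log = TRUE)

# Relative posterior probability; subtracting the maximum avoids underflow
grid$prob <- exp(grid$log_post - max(grid$log_post))

# Sample parameter combinations in proportion to posterior probability
rows <- sample(seq_len(nrow(grid)), size = 1e4, replace = TRUE, prob = grid$prob)
sample.mu    <- grid$mu[rows]
sample.sigma <- grid$sigma[rows]
c(mean(sample.mu), mean(sample.sigma))
```

Working on the log scale is a design choice: multiplying hundreds of small densities directly would round to zero.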
  19. [Figure 4.3: marginal posterior densities of mu (153 to 156) and sigma (7 to 9)]
  20. Quadratic approximation

    • Approximate the posterior as Gaussian
    • Can estimate with two numbers:
      • Peak of posterior, maximum a posteriori (MAP)
      • Standard deviation of posterior
    • Lots of algorithms
    • With flat priors, same as conventional maximum likelihood estimation
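The two numbers can be found with a general-purpose optimizer. The sketch below is one possible approach, not necessarily how `map` works internally: it uses base R's `optim` to locate the peak and the inverse Hessian at the peak as the Gaussian covariance. The heights are simulated stand-ins, and the starting values are arbitrary.

```r
set.seed(4)
# Simulated stand-in for the adult heights (values are assumptions)
height <- rnorm(352, mean = 154.6, sd = 7.7)

# Negative log posterior for h_i ~ Normal(mu, sigma),
# mu ~ Normal(178, 20), sigma ~ Uniform(0, 50)
neg_log_post <- function(par) {
  mu <- par[1]
  sigma <- par[2]
  if (sigma <= 0 || sigma >= 50) return(1e10)  # outside the uniform prior
  -(sum(dnorm(height, mu, sigma, log = TRUE)) +
      dnorm(mu, 178, 20, log = TRUE) +
      dunif(sigma, 0, 50, log = TRUE))
}

# Find the peak (MAP) and the curvature there
fit <- optim(c(mu = 170, sigma = 10), neg_log_post, hessian = TRUE)

map_est  <- fit$par                  # peak of the posterior
post_cov <- solve(fit$hessian)       # Gaussian approximation's covariance
post_sd  <- sqrt(diag(post_cov))     # standard deviation of each parameter
rbind(MAP = map_est, StdDev = post_sd)
```

With this much data the MAP for mu should sit very close to the sample mean, and its standard deviation close to the sample standard deviation divided by the square root of the sample size.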
  21. Using map

    • Maximum a posteriori

    Book excerpt (bracketed text marks cropped or lost material): “[...] To complete the model, we’re going to need some priors. Th[e parameters to] be estimated are both µ and σ, so we need a prior Pr(µ, σ), the [joint prior prob]ability for all parameters. In most cases, priors are specified i[ndependently for] each parameter, which amounts to assuming Pr(µ, σ) = Pr(µ) [Pr(σ). Then we] can write:

        $h_i \sim \mathrm{Normal}(\mu, \sigma)$
        $\mu \sim \mathrm{Normal}([...], [...])$
        $\sigma \sim \mathrm{Uniform}([...], [...])$

    The labels on the right are not part of the model, but instead just [...] keep track of the purpose of each line. The prior for µ is a broa[d Gaussian prior,] centered on [...] cm, with [...] of probability between [...]. It’s a very good idea to plot your priors, so you have a sense [...] they build into the model. In this case:

        curve( dnorm( x , 156 , 10 ) , from=100 , to=200 )

    Execute that code yourself, to see that the golem is assuming [...]”

    Slide annotation on the prior: 178, 20

    [A second book excerpt on this slide (model definition, alist, and map fit) appears in full on the next slide.]
  22. Using map

    Book excerpt: “Now we’re ready to define the model, using R’s formula syntax. The model definition in this case is just as before, but now we’ll repeat it, with each corresponding line of R code shown on the righthand margin:

        $h_i \sim \mathrm{Normal}(\mu, \sigma)$          height ~ dnorm(mu,sigma)
        $\mu \sim \mathrm{Normal}(178, 20)$              mu ~ dnorm(156,10)
        $\sigma \sim \mathrm{Uniform}(0, 50)$            sigma ~ dunif(0,50)

    Now place the R code equivalents into an alist. Here’s an alist of the formulas above:

        flist <- alist(
            height ~ dnorm( mu , sigma ) ,
            mu ~ dnorm( 178 , 20 ) ,
            sigma ~ dunif( 0 , 50 )
        )

    Note the commas at the end of each line, except the last. These commas separate each line of the model definition. Fit the model to the data in the data frame d2 with:

        m4.1 <- map( flist , data=d2 )

    After executing this code, you’ll have a fit model stored in the symbol m4.1. Now take a look at the fit maximum a posteriori model:

        precis( m4.1 )

                Mean StdDev   5.5%  94.5%
        mu    154.61   0.41 153.95 155.27
        sigma   7.73   0.29   7.27   8.20”

    [Note: the margin code for mu is reproduced as printed, dnorm(156,10); the alist and the lecture’s prior use 178, 20. The digits of the prior in the printed math were lost in extraction and are filled in from the alist.]
  23. [Figure: marginal posterior densities of mu (153 to 156) and sigma (7 to 9), comparing Samples with the quadratic Approximation]

    Book excerpt (bracketed values are reconstructed from the table where digits were lost): “[...] at the fit maximum a posteriori model:

        precis( m4.1 )

                Mean StdDev   5.5%  94.5%
        mu    154.61   0.41 153.95 155.27
        sigma   7.73   0.29   7.27   8.20

    These numbers provide Gaussian approximations for each parameter’s marginal distribution. This means the plausibility of each value of µ, after averaging over the plausibilities of each value of σ, is given by a Gaussian distribution with mean [154.6] and standard deviation [0.4]. The [5.5%] and [94.5%] quantiles are percentile interval boundaries, corresponding to an [89%] interval. Why [89%]? It’s just the default. It displays a quite wide interval, so it shows a [...]”
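As a small worked check, the interval boundaries in the precis output follow directly from the Gaussian approximation. The sketch below takes the reported mean and standard deviation for mu and computes the 5.5% and 94.5% quantiles with base R's `qnorm`; small differences from the table are expected because the inputs are rounded.

```r
# Reported quadratic approximation for mu (rounded values from the output above)
mu_mean <- 154.61
mu_sd   <- 0.41

# 89% percentile interval: the 5.5% and 94.5% quantiles of the Gaussian
interval <- qnorm(c(0.055, 0.945), mean = mu_mean, sd = mu_sd)
round(interval, 2)
```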
  24. Scaffolds

    • map is a scaffold
    • Forces full specification of the model, so you learn it
    • Works with a very wide class of models
    • Not really a good way to approximate the posterior
  25. Adding a predictor variable

    • How does weight describe height?
    [Figure: scatterplot of d2$height (140 to 180) against d2$weight (30 to 60)]
  26. Adding a predictor variable

    • Use a linear model of the mean, mu:

    Book excerpt (bracketed text marks cropped or lost material): “[...] [weig]ht into a Gaussian model of height. Let x be the mathematical name [...] [weig]ht measurements, d2$weight. Now we have a predictor variable x, [...] of the same length as h. We’d like to say how knowing the values [...] [describ]e or predict the values in h. To get weight into the model in this [...] µ as a function of the values in x. This is what it looks like, with [explanation to follow]:

        $h_i \sim \mathrm{Normal}(\mu_i, \sigma)$        [likelihood]
        $\mu_i = \alpha + \beta x_i$                     [linear model]
        $\alpha \sim \mathrm{Normal}(178, 100)$          [α prior]
        $\beta \sim \mathrm{Normal}(0, 10)$              [β prior]
        $\sigma \sim \mathrm{Uniform}(0, 50)$            [σ prior]

    [...] [l]ine on the righthand side by the type of definition it encodes. We’ll [...] To decode all of this, let’s begin with just the likelihood, the first line [...]”

    [Note: the prior values were lost in extraction and are filled in from the R code and slide annotations later in this lecture.]
  27. Adding a predictor variable

    Slide annotations on the linear model $\mu_i = \alpha + \beta x_i$:
      • $\mu_i$: mean on row i
      • $x_i$: weight on row i
      • $\alpha$: mean when $x_i = 0$, the “intercept”
      • $\beta$: change in mean per unit change in $x_i$, the “slope”

    Book excerpt (continuing from the model definition): “Again, I’ve labeled each line on the righthand side by the type of definition it encodes. We’ll discuss them in turn.

    Likelihood. To decode all of this, let’s begin with just the likelihood, the first line of the model. This is nearly identical to before, except now there is a little index i on the µ as well as the h. This is necessary now, because the mean [...]”
  28. Linear regression priors

    • Horoscopic advice
    • Intercept, “alpha”: no idea where it might end up, so broad Gaussian prior
    • Slopes, “beta”: Gaussian, centered on zero, scaled so extreme estimates are ruled out, “regularization” (Chapter 6)
    • Scale, “sigma”: uniform with a reasonable upper bound is usually fine; later we’ll use Cauchy or exponential for regularization
    • Check the prior predictive for sanity
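The last bullet can be acted on with a short simulation. The sketch below draws from assumed priors, Normal(178, 100) for the intercept, Normal(0, 10) for the slope, and Uniform(0, 50) for sigma, and simulates the height of a hypothetical 45 kg person before seeing any data. The prior values and the 45 kg weight are assumptions chosen to match the kind of model discussed in this lecture.

```r
set.seed(7)
n <- 1e4

# Draw parameters from the (assumed) priors
a     <- rnorm(n, 178, 100)  # intercept, "alpha"
b     <- rnorm(n, 0, 10)     # slope, "beta"
sigma <- runif(n, 0, 50)     # scale, "sigma"

# Prior predictive heights (cm) for a hypothetical 45 kg person
prior_h <- rnorm(n, a + b * 45, sigma)

# Sanity checks: how much prior mass lands on absurd heights?
share_negative  <- mean(prior_h < 0)
share_plausible <- mean(prior_h > 100 & prior_h < 200)
c(negative = share_negative, between_100_and_200 = share_plausible)
```

A large share of negative heights signals that such broad priors encode little real knowledge, even if plentiful data later overwhelm them.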
  29. [Book excerpt: fitting the linear model with map]

    “[...] the switch. You just have to remember to use <- instead of = when defining a linear model. That’s it. And the above allows us to build the MAP model fit:

        # load data again, since it's a long way back
        library(rethinking)
        data(Howell1)
        d <- Howell1
        d2 <- d[ d$age >= 18 , ]

        # fit model
        m4.3 <- map(
            alist(
                height ~ dnorm( mu , sigma ) ,
                mu <- a + b*weight ,
                a ~ dnorm( 156 , 100 ) ,
                b ~ dnorm( 0 , 10 ) ,
                sigma ~ dunif( 0 , 50 )
            ) ,
            data=d2 )

    The parameter mu is no longer really a parameter here, because it has been replaced by the linear model, a+b*weight, where a is α and b is β and weight is of course our x in this [...] parameters to the start list. Let’s repeat the model definition, now with the R code on the righthand side:

        $h_i \sim \mathrm{Normal}(\mu_i, \sigma)$        height ~ dnorm(mu,sigma)
        $\mu_i = \alpha + \beta x_i$                     mu <- a + b*weight
        $\alpha \sim \mathrm{Normal}(178, 100)$          a ~ dnorm(156,100)
        $\beta \sim \mathrm{Normal}(0, 10)$              b ~ dnorm(0,10)
        $\sigma \sim \mathrm{Uniform}(0, 50)$            sigma ~ dunif(0,50)

    [Th]e linear model in the R code on the righthand side uses the R assignment [operator, e]ven though the mathematical definition uses the symbol =. This is a code [convention sh]ared by several Bayesian model fitting engines, so it’s worth getting used to [...]”

    Slide annotations:
      • 178 (written over each printed 156 in the prior for a)
      • These priors are terrible, but harmless here, because so much data
  30. [Figure 4.4: Height in centimeters (vertical, 140 to 180) plotted against weight in kilograms (horizontal, 30 to 60), with the maximum a posteriori line for the mean height at each weight plotted in black.]

    Book excerpt (bracketed values are reconstructed from the table where digits were lost): “[...] relative plausibility to each. This means [...] probability. It could be that there are ma[ny] [...] as the MAP line. Or it could be instead t[hat] [...] the MAP line.

    Tables of estimates. Before looking closely at the new table of estimates, it’s important to realize that models cannot in general be understood by tables of estimates. In this simple model, a lot can be learned from the summary output. But this is not a general property of models, Bayesian or not, because of the covariation among parameters. With the new linear regression fit to the Kalahari data, we inspect the estimates:

        precis( m4.3 )

                Mean StdDev   5.5%  94.5%
        a     113.90   1.91 110.85 116.94
        b       0.90   0.04   0.84   0.97
        sigma   5.07   0.19   4.77   5.38

    The first row gives the quadratic approximation for α, the second the approximation for β, and the third approximation for σ. Let’s try to make some sense of them, in this very simple model. Best to begin with b (β), because it’s the new parameter. Since β is a slope, the value [0.90] can be read as a person [1] kg heavier is expected to be [0.90] cm taller. [89%] of the posterior probability lies between [0.84] and [0.97]. That suggests that β values close to zero or greatly above one are highly incompatible with these data and this model. If you were thinking that perhaps there was no relationship at all between height and weight, then this estimate indicates strong evidence of a positive relationship instead. But maybe you just wanted as precise a measurement as possible of the relationship between height and weight. This estimate embodies that measurement, conditional on the model. For a different model, the measure of the relationship might be different.”
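The reading of the slope can be checked with a line of arithmetic. This sketch plugs the MAP estimates from the precis output into the linear model $\mu_i = \alpha + \beta x_i$; the example weights (40, 41, and 50 kg) are arbitrary illustrative choices.

```r
# MAP estimates taken from the precis( m4.3 ) output
a <- 113.90  # intercept: expected height (cm) when weight = 0
b <- 0.90    # slope: change in expected height (cm) per kg

# Expected height at a few illustrative weights (kg)
weight <- c(40, 41, 50)
mu <- a + b * weight
mu

# Two people 1 kg apart differ in expected height by the slope
mu[2] - mu[1]
```

The intercept on its own is hard to interpret, since no adult weighs 0 kg; the slope carries the substantive meaning here.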
  31. Sampling from the posterior

    • Want to get uncertainty onto that graph
    • Again, sample from the posterior
      1. Use the MAP and standard deviation to approximate the posterior
      2. Sample from the multivariate normal distribution of parameters
      3. Use the samples to generate predictions that “integrate over” the uncertainty
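The three steps above can be sketched in base R. The means and standard deviations below are the values reported by `precis( m4.3 )` in this lecture; the correlation between `a` and `b` is a hypothetical value, because the lecture does not show the covariance matrix. Sampling uses a Cholesky factor rather than any package function.

```r
set.seed(5)

# Step 1: Gaussian approximation of the posterior.
# Means and standard deviations as reported by precis( m4.3 )
map_est <- c(a = 113.90, b = 0.90, sigma = 5.07)
post_sd <- c(a = 1.91, b = 0.04, sigma = 0.19)

# ASSUMPTION: hypothetical correlation of -0.99 between a and b
# (the lecture does not show vcov(m4.3))
post_cor <- diag(3)
post_cor[1, 2] <- post_cor[2, 1] <- -0.99
post_cov <- diag(post_sd) %*% post_cor %*% diag(post_sd)

# Step 2: sample from the multivariate normal via a Cholesky factor
n_samples <- 1e4
z <- matrix(rnorm(n_samples * 3), ncol = 3)
post <- sweep(z %*% chol(post_cov), 2, map_est, "+")
colnames(post) <- names(map_est)

# Step 3: push every sample through the linear model at weight = 50 kg,
# which integrates over the uncertainty in a and b
mu_at_50 <- post[, "a"] + post[, "b"] * 50
quantile(mu_at_50, c(0.055, 0.945))
```

Because `a` and `b` are sampled jointly, their negative correlation is respected; sampling them independently would overstate the uncertainty in the mean.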
  32. Historical obstacles

    • Prior education impedes learning
    • “Sampling” in frequentist stats is a device to construct uncertainty around an estimate
    • “Sampling” in Bayesian stats is a way to perform integral calculus (or to simulate observations)
  33. Sampling from the posterior

    Book excerpt (shown twice on the slide; reproduced once): “[...] the MAP line. So how can we get that uncertainty onto the plot? Together, a combination of α and β define a line. And so we could sample a bunch of lines from the posterior distribution. Then we could display those lines on the plot, to visualize the uncertainty in the regression relationship. To better appreciate how the posterior distribution contains lines, extract some samples from the model:

        post <- extract.samples( m4.3 )

    Then inspect the first 5 rows of the samples:

        post[1:5,]

                 a         b    sigma
        1 114.7880 0.8822921 5.121102
        2 112.7115 0.9230855 4.907987
        3 114.4557 0.9018482 5.276036
        4 114.7696 0.8831561 5.021958
        5 112.6333 0.9383632 4.898554

    Each row is a correlated random sample from the joint posterior of all three parameters, using the covariances provided by vcov(m4.3). The paired values of a and b on each row define a line. The average of very many of these lines is the MAP line. But the scatter around that average is meaningful, because it alters our confidence in the relationship between the predictor and the outcome. So now let’s display a bunch of these lines, so you can see the scatter. This lesson will be easier to appreciate, if we use only some of the data to begin. Then you can see how adding [...]”
  34. Posterior is full of lines

    [Figure 4.5: four panels of height (140 to 180) against weight (30 to 60) with sampled posterior lines, for N = 10, N = 50, N = 150, and N = 352]

    Book excerpt behind the figure (cropped; an alternate printing of the passage on the previous slide): “[...] from the model:

        post <- extract.samples( m4.3 )

    Then inspect the first 5 rows of the samples:

        post[1:5,]

                 a    sigma         b
        1 115.1964 4.992267 0.8776393
        2 111.0389 5.169515 0.9758554
        3 115.4833 5.133463 0.8726757
        4 109.6488 5.005837 0.9812692
        5 112.4637 4.678314 0.9384814

    Each row is a correlated random sample from [...] using the covariances provided by vcov(m4.3) [...] define a line. The average of very many of these l[ines] [...] that average is meaningful, because it alters our [...] predictor and the outcome. So now let’s display a bunch of these lines, s[o] [...] easier to appreciate, if we use only some of the d[ata] [...] in more data changes the scatter of the lines. So [...] The following code extracts the first [10] cases and [...]”
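The narrowing of the lines as N grows can be sketched without the original data. The example below simulates a weight and height data set, refits the line to the first N cases, and samples lines from a Gaussian approximation of each fit. Two things are assumptions: the simulated data (all values invented), and the use of `lm` as a flat-prior stand-in for `map`, relying on the earlier point that flat priors give the same answer as maximum likelihood.

```r
set.seed(6)

# Simulated stand-in for the adult weight/height data (all values assumed)
n_total <- 352
weight <- runif(n_total, 30, 60)
height <- 114 + 0.9 * weight + rnorm(n_total, 0, 5)
dat <- data.frame(height, weight)

# Spread of sampled slopes when the model is fit to the first N cases
slope_spread <- function(N, n_lines = 1000) {
  fit <- lm(height ~ weight, data = dat[1:N, ])
  # Sample (intercept, slope) pairs from the Gaussian approximation
  z <- matrix(rnorm(n_lines * 2), ncol = 2)
  sampled <- sweep(z %*% chol(vcov(fit)), 2, coef(fit), "+")
  sd(sampled[, 2])  # standard deviation of the sampled slopes
}

N_values <- c(10, 50, 150, 352)
spread <- sapply(N_values, slope_spread)
names(spread) <- paste0("N=", N_values)
round(spread, 3)
```

Each sampled pair defines one line; drawing them with `abline` over a scatterplot of the first N cases would give a picture in the spirit of the four panels.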