L04 Statistical Rethinking Winter 2019

Lecture 04 of the Dec 2018 through March 2019 edition of Statistical Rethinking.

Richard McElreath

December 14, 2018

Transcript

  1. Tables of marginal distributions

    With the new linear regression trained on the Kalahari data, we inspect the marginal posterior distributions of the parameters:

    R code:
        precis( m4.3 )

                mean   sd   5.5%  94.5%
        a     154.60 0.27 154.17 155.03
        b       0.90 0.04   0.84   0.97
        sigma   5.07 0.19   4.77   5.38

    The first row gives the quadratic approximation for α, the second the approximation for β, and the third the approximation for σ. Let's try to make some sense of them.

    Let's focus on b (β), because it's the new parameter. Since β is a slope, the value 0.90 can be read as: a person 1 kg heavier is expected to be 0.90 cm taller. 89% of the posterior probability lies between 0.84 and 0.97. That suggests that β values close to zero or greatly above one are highly incompatible with these data and this model. It is most certainly not evidence that the relationship between weight and height is linear, because the model only considered lines. It just says that, if you are committed to a line, then lines with a slope around 0.9 are plausible ones.

    Remember, the numbers in the default precis output aren't sufficient to describe the quadratic posterior completely. For that, we also require the variance-covariance matrix. You can see the covariances among the parameters with vcov:

    R code:
        round( vcov( m4.3 ) , 3 )

    [Figure 4.6: scatterplot of height against weight with a single black line]

    Figure 4.6. Height in centimeters (vertical) plotted against weight in kilograms (horizontal), with the line at the posterior mean plotted in black.

    … the model says. But for even slightly more complex models, especially those that include interaction effects (Chapter […]), interpreting posterior distributions is hard. Combine with this the problem of incorporating the information in vcov into your interpretations, and the plots are irreplaceable.

    We're going to start with a simple version of that task, superimposing just the posterior mean values over the height and weight data. Then we'll slowly add more and more information to the prediction plots, until we've used the entire posterior distribution.

    We'll start with just the raw data and a single line. The code below plots the raw data, computes the posterior mean values for a and b, then draws the implied line:

    R code:
        plot( height ~ weight , data=d2 , col=rangi2 )
        post <- extract.samples( m4.3 )
        a_map <- mean(post$a)
        b_map <- mean(post$b)
        curve( a_map + b_map*(x - xbar) , add=TRUE )

    You can see the resulting plot in Figure 4.6. Each point in this plot is a single individual. The black line is defined by the mean slope β and mean intercept α. This is not a bad line. It certainly looks highly plausible. But there are an infinite number of other highly plausible lines
  2. Showing Uncertainty

    • Want to get uncertainty onto that graph
    • Again, sample from posterior
      1. Use mean and standard deviation to approximate posterior
      2. Sample from multivariate normal distribution of parameters
      3. Use samples to generate predictions that integrate over the uncertainty
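The three steps above can be sketched in base R without the rethinking package. This is a minimal sketch, not the lecture's code: the posterior means and standard deviations are the ones reported by precis( m4.3 ) on slide 1, but the zero-correlation covariance matrix, `xbar = 45`, the weight grid, and the helper name `rmvnorm_chol` are assumptions made for illustration.

```r
# Sketch: sample (a, b) from a bivariate normal approximation of the posterior,
# then turn every sample into a regression line.
set.seed(1)

post_mean <- c(a = 154.60, b = 0.90)        # posterior means from the precis table
# ASSUMPTION: zero correlation between a and b; the real vcov(m4.3) values are
# not legible in this transcript.
post_cov <- diag(c(0.27, 0.04)^2)

# Draw n samples from N(mu_vec, cov_mat) using the Cholesky factor of cov_mat.
rmvnorm_chol <- function(n, mu_vec, cov_mat) {
  z <- matrix(rnorm(n * length(mu_vec)), nrow = n)   # independent N(0, 1) draws
  sweep(z %*% chol(cov_mat), 2, mu_vec, `+`)         # scale/correlate, then shift
}

post <- rmvnorm_chol(1e4, post_mean, post_cov)
colnames(post) <- names(post_mean)

xbar <- 45                                  # ASSUMPTION: mean weight in kg
weight_seq <- seq(30, 60, by = 5)           # ASSUMPTION: illustrative grid

# Rows are posterior samples (one line each), columns are weights.
mu <- sapply(weight_seq, function(w) post[, "a"] + post[, "b"] * (w - xbar))

# Summaries that integrate over the uncertainty in a and b.
mu_mean <- colMeans(mu)
mu_pi   <- apply(mu, 2, quantile, probs = c(0.055, 0.945))
```

Replacing `post_cov` with the actual output of vcov( m4.3 ) would carry the correlation between a and b through to the predictions; nothing else in the sketch would need to change.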
  3. Sampling from the posterior

    … Then we could display those lines on the plot, to visualize the uncertainty in the regression relationship.

    To better appreciate how the posterior distribution contains lines, we work with all of the samples from the model. Let's take a closer look at the samples now:

    R code:
        post <- extract.samples( m4.3 )
        post[1:5,]

                 a         b    sigma
        1 154.5505 0.9222372 5.188631
        2 154.4965 0.9286227 5.278370
        3 154.4794 0.9490329 4.937513
        4 155.2289 0.9252048 4.869807
        5 154.9545 0.8192535 5.063672

    Each row is a correlated random sample from the joint posterior of all three parameters, using the covariances provided by vcov(m4.3). The paired values of a and b on each row define a line. The average of very many of these lines is the posterior mean line. But the scatter around that average is meaningful, because it alters our confidence in the relationship between the predictor and the outcome.

    So now let's display a bunch of these lines, so you can see the scatter. This lesson will be easier to appreciate if we use only some of the data to begin. Then you can see how adding in more data changes the scatter of the lines. So we'll begin with just the first 10 cases in d2. The following code extracts the first 10 cases and re-estimates the model:

    R code:
        N <- 10

    Each row is a line
  4. Posterior is full of lines

    [Figure 4.7: four panels of height against weight, N = 10, 50, 150, 352]

    R code (cut off on the slide):
        N <- 10
        dN <- d2[ 1:N , ]
        mN <- quap(
            alist(
                height ~ dnorm( mu , sigma ) , …
  5. Posterior is full of lines

    [Figure 4.7: four panels of height against weight, N = 10, 50, 150, 352]
  6. Posterior is full of lines

    [Figure 4.7: four panels of height against weight, N = 10, 50, 150, 352]
  7. Posterior is full of lines

    [Figure 4.7: four panels of height against weight, N = 10, 50, 150, 352]
  8. Predict mu

    Figure 4.7 is an appealing display, because it communicates uncertainty about the relationship in a way that many people find intuitive. But it's more common, and often much clearer, to see the uncertainty displayed by plotting an interval or contour around the average regression line. In this section, I'll walk you through how to compute any arbitrary interval you like, using the underlying cloud of regression lines embodied in the posterior distribution.

    Focus for the moment on a single weight value, say 50 kilograms. You can quickly make a list of […] values of µ for an individual who weighs 50 kilograms, by using your samples from the posterior:

    R code:
        post <- extract.samples( m4.3 )
        mu_at_50 <- post$a + post$b * ( 50 - xbar )

    The code to the right of the <- above takes its form from the equation for µ_i:

        $\mu_i = \alpha + \beta (x_i - \bar{x})$

    The value of x_i in this case is 50. Go ahead and take a look inside the result, mu_at_50. It's a vector of predicted means, one for each random sample from the posterior. Since joint a and b went into computing each, the variation across those means incorporates the uncertainty in and correlation between both parameters. It might be helpful at this point to actually plot the density for this vector of means:

    R code:
        dens( mu_at_50 , col=rangi2 , lwd=2 , xlab="mu|weight=50" )

    I reproduce this plot in Figure 4.8. Since the components of µ have distributions, so too does µ. And since the distributions of α and β are Gaussian, so too is the distribution of µ (adding Gaussian distributions always produces a Gaussian distribution).

    Since the posterior for µ is a distribution, you can find intervals for it, just like for any posterior distribution. To find the 89% highest posterior density interval of µ at 50 kg, just use the HPDI command as usual
  9. Predict mu

    [Figure 4.8: density plot; horizontal axis mu|weight=50, vertical axis Density]

    Figure 4.8. The quadratic approximate posterior distribution of the mean height, µ, when weight is 50 kg. This distribution represents the relative plausibility of different values of the mean.
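The "predict µ at one weight, then take an interval" step can be sketched in base R. This is an illustration under assumptions, not the book's code: independent normal draws stand in for extract.samples( m4.3 ), `xbar = 45` is assumed, and `hpdi` is a hypothetical helper that finds the narrowest interval containing the requested share of samples (the rethinking package ships its own HPDI function).

```r
# Sketch: distribution of mu at weight = 50 kg and two 89% intervals for it.
set.seed(2)
n <- 1e4

# ASSUMPTION: independent draws using the precis means and standard deviations.
post <- data.frame(a = rnorm(n, 154.60, 0.27),
                   b = rnorm(n, 0.90, 0.04))
xbar <- 45                                   # ASSUMPTION: mean weight in kg

# One predicted mean per posterior sample: mu = a + b * (x - xbar), with x = 50.
mu_at_50 <- post$a + post$b * (50 - xbar)

# Percentile interval: cut 5.5% of the samples off each tail.
pi89 <- quantile(mu_at_50, probs = c(0.055, 0.945))

# Sample-based highest posterior density interval (hypothetical helper):
# slide a window holding `prob` of the sorted samples and keep the narrowest.
hpdi <- function(x, prob = 0.89) {
  x <- sort(x)
  k <- ceiling(prob * length(x))             # samples inside the window
  starts <- seq_len(length(x) - k + 1)
  widths <- x[starts + k - 1] - x[starts]
  i <- which.min(widths)
  c(lower = x[i], upper = x[i + k - 1])
}
hpdi89 <- hpdi(mu_at_50)
```

For a symmetric, Gaussian-shaped distribution like this one the two intervals nearly coincide; they diverge for skewed posteriors.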
  10. Predict every mu

    … The default is 1000 samples, but you can use as many or as few as you like. Each column is a case (row) in the data. There are 352 rows in d2, corresponding to 352 individuals. So there are 352 columns in the matrix mu above.

    Now what can we do with this big matrix? Lots of things. The function link provides a posterior distribution of µ for each case we feed it. So above we have a distribution of µ for each individual in the original data. We actually want something slightly different: a distribution of µ for each unique weight value on the horizontal axis. It's only slightly harder to compute that, by just passing link some new data:

    R code:
        # define sequence of weights to compute predictions for
        # these values will be on the horizontal axis
        weight.seq <- seq( from=25 , to=70 , by=1 )

        # use link to compute mu
        # for each sample from posterior
        # and for each weight in weight.seq
        mu <- link( m4.3 , data=data.frame(weight=weight.seq) )
        str(mu)

        num [1:1000, 1:46] 136 136 138 136 137 ...

    And now there are only 46 columns in mu, because we fed it 46 different values for weight. To visualize what you've got here, let's plot the distribution of µ values at each weight, on the plot:

    R code:
        # use type="n" to hide raw data
        plot( height ~ weight , d2 , type="n" )

    We want a distribution for every value of x
  11. How link works

    • Sample from posterior
    • Define series of predictor (weight) values
    • For each predictor value
      • For each sample from posterior
        • Compute mu: a + b*(weight - xbar)
    • Summarize

    … accomplish the same thing for any model, fit by any means, by performing these steps yourself. This is how it'd look for m4.3:

    R code:
        post <- extract.samples(m4.3)
        mu.link <- function(weight) post$a + post$b*( weight - xbar )
        weight.seq <- seq( from=25 , to=70 , by=1 )
        mu <- sapply( weight.seq , mu.link )
        mu.mean <- apply( mu , 2 , mean )
        mu.HPDI <- apply( mu , 2 , HPDI , prob=0.89 )

    And the values in mu.mean and mu.HPDI should be very similar (allowing for simulation variance) to what you got the automated way, using link.

    Knowing this manual method is useful both for (1) understanding and (2) sheer power. Whatever the model you find yourself with, this approach can be used to generate posterior predictions for any …
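The bullet steps above map one-to-one onto two nested loops. The sketch below spells them out in base R so the structure is visible; it is an illustration, not the internals of link. The posterior draws are simulated from the precis means and standard deviations (an assumption, with zero correlation), `xbar = 45` is assumed, and a percentile interval stands in for HPDI.

```r
# Sketch: what "link" does, written as explicit loops.
set.seed(3)
n <- 1000

# Step 1: sample from the posterior.
# ASSUMPTION: independent normal draws stand in for extract.samples(m4.3).
post <- data.frame(a = rnorm(n, 154.60, 0.27),
                   b = rnorm(n, 0.90, 0.04))
xbar <- 45                                   # ASSUMPTION: mean weight in kg

# Step 2: define the series of predictor values.
weight.seq <- seq(from = 25, to = 70, by = 1)

# Steps 3-5: for each predictor value, for each sample, compute mu.
mu <- matrix(NA_real_, nrow = n, ncol = length(weight.seq))
for (j in seq_along(weight.seq)) {
  for (s in seq_len(n)) {
    mu[s, j] <- post$a[s] + post$b[s] * (weight.seq[j] - xbar)
  }
}

# Step 6: summarize each column (each weight value).
mu.mean <- apply(mu, 2, mean)
mu.PI   <- apply(mu, 2, quantile, probs = c(0.055, 0.945))

# The vectorized one-liner from the slide gives the same matrix.
mu.vec <- sapply(weight.seq, function(w) post$a + post$b * (w - xbar))
```

The loop version is slower but makes clear that every cell of `mu` is one posterior sample evaluated at one weight; the sapply form is what you would use in practice.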
  12. Figure 4.9

    [Figure 4.9: two panels of height against weight]

    Figure 4.9. Left: The first […] values in the distribution of µ at each weight value. Right: The !Kung height data again, now with 89% HPDI of the mean indicated by the shaded region. Compare this region to the distributions of …
  13. [Figure: six panels of height against weight, N = 10, 20, 50, 100, 200, 350]
  14. [Figure: six panels of height against weight, N = 10, 20, 50, 100, 200, 350]
  15. Figure 4.10

    89% prediction interval
    Nothing special about 95%
    Try 50%, 80%, 99%
    Interested in shape, not boundaries

    [Figure 4.10: height against weight, with the average line and two shaded regions]

    Figure 4.10 (caption partly cut off on the slide). 89% prediction interval for height, as a function of weight. The solid line is the average line for the mean height at each weight. The two shaded regions show different 89% plausible regions. The narrow shaded interval around the line is the distribution of µ. The wider shaded region represents the region within which the model expects to find 89% of actual heights in the population, at each weight.

    Let's plot everything we've built up: (1) the average line, (2) the shaded …
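The difference between the narrow interval for µ and the wide prediction interval for actual heights comes from adding σ to the simulation. The base-R sketch below illustrates that step and the slide's suggestion to try several interval widths. It is a sketch under assumptions: posterior draws are simulated independently from the precis means and standard deviations, `xbar = 45` is assumed, and `interval` is a hypothetical helper for central percentile intervals.

```r
# Sketch: interval for the mean (mu) versus prediction interval for heights.
set.seed(4)
n <- 1e4

# ASSUMPTION: independent draws stand in for extract.samples(m4.3).
post <- data.frame(a     = rnorm(n, 154.60, 0.27),
                   b     = rnorm(n, 0.90, 0.04),
                   sigma = rnorm(n, 5.07, 0.19))
xbar <- 45                                   # ASSUMPTION: mean weight in kg
weight.seq <- seq(from = 25, to = 70, by = 1)

# Uncertainty in the mean only.
mu <- sapply(weight.seq, function(w) post$a + post$b * (w - xbar))

# Uncertainty in the mean plus Gaussian scatter around it.
sim.height <- sapply(weight.seq, function(w)
  rnorm(n, mean = post$a + post$b * (w - xbar), sd = post$sigma))

# Central percentile interval holding `prob` of the samples (hypothetical helper).
interval <- function(x, prob) {
  quantile(x, probs = c((1 - prob) / 2, 1 - (1 - prob) / 2))
}

mu.PI     <- apply(mu, 2, interval, prob = 0.89)
height.PI <- apply(sim.height, 2, interval, prob = 0.89)

# Nothing special about 89% or 95%: compare widths at weight = 50 kg.
probs  <- c(0.50, 0.80, 0.89, 0.99)
j      <- which(weight.seq == 50)
widths <- sapply(probs, function(p) unname(diff(interval(sim.height[, j], p))))
```

Plotting `height.PI` for several values of `prob` gives nested bands, which shows the shape of the predictive distribution rather than one arbitrary boundary.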
  16. Curves From Lines

    • “Linear” models can make curves
    • Polynomial regression
      • Common
      • Badly behaved
    • Splines
      • Very flexible
      • Highly geocentric
  17. Polynomial regression

    • Purely descriptive (geocentric) strategy: use polynomial of predictor variable

    1st order (line): $\mu_i = \alpha + \beta_1 x_i$
    2nd order (parabola): $\mu_i = \alpha + \beta_1 x_i + \beta_2 x_i^2$

    Go ahead and plot height against weight. The relationship is visibly curved, now that we've included the non-adult individuals.

    There are many ways to model a curved relationship between two variables. Here I'll show you a very common one, polynomial regression. In this context, "polynomial" means equations for µ_i that add additional terms with squares, cubes, and even higher powers of the predictor variable. There's still only one predictor variable in the model, so this is still a bivariate regression. But the definition of µ_i has more parameters now.

    Here's the most common polynomial regression, a parabolic model of the mean:

        $\mu_i = \alpha + \beta_1 x_i + \beta_2 x_i^2$

    The above is a parabolic (second order) polynomial. The α + β_1 x_i part is the same linear function of x in a linear regression, just with a little "1" subscript added to the parameter name, so we can tell it apart from the new parameter. The additional term uses the square of x_i to construct a parabola, rather than a perfectly straight line. The new parameter β_2 measures the curvature of the relationship.

    Rethinking: Linear, additive, funky. The parabolic model of µ_i above is still a "linear model" of the mean. This is so even though the equation is clearly not of a …
  18. Polynomial regression

    • We’ll use full !Kung height/weight data

    [Figure: height against weight for the full data; slide annotation: adults (18+)]
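  19. Parabolic model of height

    … can suffer from these glitches, leading to mistaken estimates. These glitches are very common for polynomial regression, because the square or cube of a large number can be truly massive. Standardizing largely resolves this issue. It should be your default behavior.

    To define the parabolic model, just modify the definition of µ_i:

        $h_i \sim \mathrm{Normal}(\mu_i, \sigma)$
        $\mu_i = \alpha + \beta_1 x_i + \beta_2 x_i^2$
        $\alpha \sim \mathrm{Normal}(178, 20)$
        $\beta_1 \sim \mathrm{Log\text{-}Normal}(0, 1)$
        $\beta_2 \sim \mathrm{Normal}(0, 1)$
        $\sigma \sim \mathrm{Uniform}(0, 50)$

    The confusing issue here is assigning a prior for β_2, the parameter on the squared value of x. Unlike β_1, we don't want a positive constraint. In the problems at the end of the chapter, you'll use prior predictive simulation to understand why. These polynomial parameters are in general very difficult to set realistic priors for, which is another reason to avoid polynomial models.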
  20. Standardized predictors

    • Very helpful to standardize predictor variables before fitting
      • Makes estimation easier
      • Helps interpretation (sometimes)
    • To standardize:
      • subtract mean
      • divide by standard deviation
      • result: mean of zero and standard deviation of 1
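As a small worked illustration of the recipe above, the following base-R sketch standardizes a toy weight vector and checks the result. The six weights are made-up values for illustration, not the !Kung data.

```r
# Sketch: standardize a predictor by hand and confirm the result.
# ASSUMPTION: toy weights in kg, chosen only for illustration.
weight <- c(12.5, 20.1, 31.7, 44.9, 52.3, 60.8)

# Subtract the mean, then divide by the standard deviation.
weight_s <- (weight - mean(weight)) / sd(weight)

mean(weight_s)   # zero, up to floating-point error
sd(weight_s)     # one

# Base R's scale() performs the same transformation.
same_as_scale <- isTRUE(all.equal(weight_s, as.numeric(scale(weight))))

# Why this helps polynomial models: powers of the raw variable grow quickly,
# while powers of the standardized variable stay on a modest scale.
max_raw_cube <- max(weight^3)
max_std_cube <- max(abs(weight_s)^3)
```

Keeping the squared and cubed terms small is the practical reason the slides standardize weight before fitting the parabolic and cubic models.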
  21. Parabolic model of height

    The confusing issue here is assigning a prior for β_2, the parameter on the squared value of x. Unlike β_1, we don't want a positive constraint. In the problems at the end of the chapter, you'll use prior predictive simulation to understand why. These polynomial parameters are in general very difficult to set realistic priors for, which is another reason to avoid polynomial models.

    Approximating the posterior is straightforward. Just modify the definition of mu so that it contains both the linear and quadratic terms. But in general it is better to pre-process any variable transformations. So I'll also build the square of weight_s as a separate variable:

    R code:
        d$weight_s <- ( d$weight - mean(d$weight) )/sd(d$weight)
        d$weight_s2 <- d$weight_s^2
        m4.5 <- quap(
            alist(
                height ~ dnorm( mu , sigma ) ,
                mu <- a + b1*weight_s + b2*weight_s2 ,
                a ~ dnorm( 178 , 20 ) ,
                b1 ~ dlnorm( 0 , 1 ) ,
                b2 ~ dnorm( 0 , 1 ) ,
                sigma ~ dunif( 0 , 50 )
            ) , data=d )

    Now, since the relationship between the outcome height and the predictor weight depends upon two slopes, b1 and b2, it isn't so easy to read the relationship off a table of coefficients:

    R code:
        precis( m4.5 )

        (output cut off on the slide after the header: mean sd 5.5% 94.5%)
  22. [Figure: height against weight.s, panels N = 10, 20]
  23. [Figure: height against weight.s, panels N = 10, 20, 50]
  24. [Figure: height against weight.s, panels N = 10, 20, 50, 100]
  25. [Figure: height against weight.s, panels N = 10, 20, 50, 100, 300]
  26. [Figure: height against weight.s, panels N = 10, 20, 50, 100, 300, 544]
  27. Cubic model

    • Can go further down the rabbit hole:

    … some spectacularly poor predictions, at both very low and middle weights. Compare this to panel (b), our new parabolic regression. The curve does a much better job of finding a central path through the data.

    Panel (c) in Figure 4.11 shows a higher-order polynomial regression, a cubic regression on weight. The model is (again with lazy flat priors):

        $h_i \sim \mathrm{Normal}(\mu_i, \sigma)$
        $\mu_i = \alpha + \beta_1 x_i + \beta_2 x_i^2 + \beta_3 x_i^3$

    Fit the model with a slight modification of the parabolic model's code:

    R code:
        m4.6 <- map(
            alist(
                height ~ dnorm( mean=mu , sd=sigma ) ,
                mu <- a + b1*weight.s + b2*weight.s^2 + b3*weight.s^3
            ) ,
            data=d ,
            start=list( a=mean(d$height) , b1=0 , b2=0 , b3=0 , sigma=10 ) )

    Computing the curve and confidence intervals is similarly a small modification of the previous code. This cubic curve is even more flexible than the parabola, so …

    [Figure 4.11: three panels of height against weight_s, titled linear, quadratic, cubic]

    Figure 4.11. Polynomial regressions of height on weight (standardized)
  28. Polynomial grief

    • Polynomials make absurd predictions outside range of data
    • Parameters influence every part of curve, so hard to understand
    • Not actually very flexible: can’t have a monotonic curve!

    [Figure: height against weight.s, N = 10]
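Two of the complaints above follow directly from the algebra of a parabola, and a few lines of R make them concrete. The coefficients below are illustrative assumptions (round numbers on the scale of a height model with a standardized predictor), not estimates read from the slides.

```r
# Sketch: why a parabola cannot be monotonic and extrapolates absurdly.
# ASSUMPTION: illustrative coefficients, not fitted values from the lecture.
a  <- 146    # mean height (cm) at the average weight
b1 <- 21     # linear term
b2 <- -8     # quadratic term (negative: the curve bends downward)

mu <- function(x) a + b1 * x + b2 * x^2

# A parabola always turns around where its derivative b1 + 2*b2*x is zero.
x_turn <- -b1 / (2 * b2)

# Past the turning point the predicted mean height falls as weight rises.
mu_at_turn <- mu(x_turn)
mu_beyond  <- mu(x_turn + 1)

# Far outside the data the prediction becomes impossible (negative height).
mu_far_left <- mu(-4)
```

With these numbers the turning point sits at 1.3125 standard deviations above the mean weight, inside the plotted range of -2 to 2, so the curve predicts that the heaviest individuals are shorter than slightly lighter ones.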
  29. Going Local: B-Splines

    • Basis-Splines: Wiggly function built from many local, less wiggly functions
    • Basis function: A local function
    • Better than polynomials, but equally geocentric
    • Bayesian B-splines often called P-splines.
  30. Going Local: B-Splines

    • B-Splines are just linear models, but with some weird synthetic variables:
    • Weights w are like slopes
    • Basis functions B are synthetic variables
    • In spirit like squared or cubed terms
    • But observed data not used to build B
    • B values turn on weights in different regions of x variable

    Here's a longer explanation, with visual examples. Our goal is to approximate the temperature trend with a wiggly function. With B-splines, just like with polynomial regression, we do this by generating new predictor variables and using those in the linear model, µ_i. Unlike polynomial regression, B-splines do not directly transform the predictor by squaring or cubing it. Instead they invent a series of entirely new, synthetic predictor variables. Each of these variables serves to gradually turn a specific parameter on and off within a specific range of the predictor variables. Each of these variables is called a basis function. The linear model ends up looking very familiar:

        $\mu_i = \alpha + w_1 B_{i,1} + w_2 B_{i,2} + w_3 B_{i,3} + \dots$

    where B_{i,n} is the n-th basis function's value on row i, and the w parameters are corresponding weights for each. The parameters act like slopes, adjusting the influence of each basis function on the mean µ_i. So really this is just another linear regression, but with some fancy, synthetic predictor variables. These synthetic variables do some really elegant descriptive (geocentric) work for us.

    How do we construct these basis variables B? I display the simplest case in Figure 4.12, in which I approximate the temperature data with four different linear approximations. To do this, I divide the full range of the horizontal axis into four parts, using pivot points called knots. The knots are shown by the + symbols in the top plot. Then five different basis functions …
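The claim that a B-spline is "just a linear model" can be checked on a toy example: once the basis matrix B exists, the mean is an intercept plus a weighted sum of its columns. The matrix, intercept, and weights below are made-up values for illustration.

```r
# Sketch: the spline mean is a + B %*% w, an ordinary linear combination.
# ASSUMPTION: toy basis matrix with 5 observations (rows) and 3 basis
# functions (columns); each row says how "switched on" each basis is there.
B <- rbind(c(1.0, 0.0, 0.0),
           c(0.5, 0.5, 0.0),
           c(0.0, 1.0, 0.0),
           c(0.0, 0.5, 0.5),
           c(0.0, 0.0, 1.0))

a <- 6                    # ASSUMPTION: intercept
w <- c(-1, 0.5, 1)        # ASSUMPTION: one weight ("slope") per basis function

# Matrix form, as in the model formula mu <- a + B %*% w.
mu <- as.vector(a + B %*% w)

# The same thing written term by term, matching the equation for mu_i.
mu_terms <- a + w[1] * B[, 1] + w[2] * B[, 2] + w[3] * B[, 3]
```

Where a row of B has a single 1, the mean equals the intercept plus that basis function's weight; in between, neighbouring weights are blended, which is what produces a smooth transition.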
  31. B-Spline of Climate

    … because sometimes priors are thought of as penalties. This term will make more sense after Chapter […].

    To see how B-splines work, we'll need an example that is much wigglier (that's a scientific term) than the !Kung stature data. Let's load a thousand years of Japanese cherry blossom dates:

    R code:
        library(rethinking)
        data(cherry_blossoms)
        d <- cherry_blossoms
        precis(d)

        'data.frame': 1215 obs. of 5 variables:
                      mean     sd   5.5%   94.5%
        year       1408.00 350.88 867.77 1948.23
        doy         104.54   6.41  94.43  115.00
        temp          6.14   0.66   5.15    7.29
        temp_upper    7.19   0.99   5.90    8.90
        temp_lower    5.10   0.85   3.79    6.37

    See ?cherry_blossoms for details and sources. We're going to work with just the reconstructed temperature record, temp, for now. It runs from the year […] to […]. It is very wiggly. You should go ahead and plot temp against year to see. No parabolic curve is going to do the job here.

    The short explanation of B-splines is that they divide the full range of some predictor variable, like year, into parts. Then they assign a parameter to each part. These parameters are gradually turned on and off in a way that makes their sum into a fancy, wiggly curve. The long explanation contains lots more details. But all of those details just exist to achieve this goal of building up a big, curvy function from individually less curvy and locally acting functions.
  32. B-Spline of Climate

    [Figure: March temperature against year]
  33. B-Spline of Climate

    • B-Spline recipe:
      • Choose some knots: locations on predictor variable where the spline is anchored
      • Choose degree of basis functions: how wiggly
      • Find posterior distribution of weights
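The recipe can be sketched with the `splines` package that ships with R. Several things here are assumptions for illustration: the data are simulated rather than the cherry_blossoms record, knots are placed at evenly spaced quantiles of year, and an ordinary least-squares fit stands in for the posterior distribution of the weights (the lecture fits them with quap and priors instead).

```r
# Sketch: knots, basis degree, then weights, using simulated data.
library(splines)
set.seed(5)

# ASSUMPTION: a simulated wiggly series stands in for the temperature record.
year <- 851:1950
temp <- 6 + 0.5 * sin(year / 60) + rnorm(length(year), 0, 0.3)

# 1. Choose knots: here 15 knots at evenly spaced quantiles of the predictor.
num_knots <- 15
knot_list <- quantile(year, probs = seq(0, 1, length.out = num_knots))

# 2. Choose the degree of the basis functions (3 = cubic). The first and last
#    knots are the boundaries, so only the interior knots are passed to bs().
B <- bs(year,
        knots     = knot_list[-c(1, num_knots)],
        degree    = 3,
        intercept = TRUE)

# 3. Find the weights. Least squares is used as a stand-in for the posterior;
#    no separate intercept is included because the columns of B sum to one.
Bm     <- matrix(as.numeric(B), nrow = nrow(B))
w_hat  <- coef(lm(temp ~ 0 + Bm))
mu_hat <- as.vector(Bm %*% w_hat)
```

With 13 interior knots and cubic basis functions the matrix has 17 columns, one weight per basis function. Raising `num_knots` or `degree` makes the fitted curve wigglier, which is the overfitting concern the slides return to later.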
  34. Knots

    • More knots means more wiggle in global function

    [Figure: March temperature against year]
  35. Basis functions

    • Starter example: Linear basis functions

    [Figure 4.12: basis value against year for five linear basis functions numbered 1 to 5, with knots marked and the year 1306 highlighted; a second panel shows basis * weight against year]

    Each basis defines the local region where it influences the spline
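Linear basis functions are simple enough to build by hand, which makes their "local" behaviour easy to see. The sketch below assumes five evenly spaced knots over an illustrative year range (the figure places its knots differently), and `hat_basis` is a hypothetical helper name.

```r
# Sketch: degree-1 (linear) basis functions as "hat" functions around knots.
# ASSUMPTION: five evenly spaced knots; the lecture's knots are not evenly spaced.
knots <- seq(800, 2000, length.out = 5)     # 800 1100 1400 1700 2000

# Basis k equals 1 at knot k and falls linearly to 0 at the neighbouring knots.
# This shortcut is only valid for evenly spaced knots.
hat_basis <- function(x, k, knots) {
  h <- knots[2] - knots[1]                  # spacing between knots
  pmax(0, 1 - abs(x - knots[k]) / h)
}

year <- seq(800, 2000, by = 50)

# Basis matrix: one row per year, one column per basis function.
B <- sapply(seq_along(knots), function(k) hat_basis(year, k, knots))

# Exactly at a knot, only that knot's basis is switched on.
B_at_knot2 <- B[year == 1100, ]

# Halfway between knots 1 and 2, the two neighbours share the influence.
B_halfway <- B[year == 950, ]
```

At any year at most two basis functions are non-zero and they sum to one, so the spline at that year is a blend of the two nearest weights and nothing else.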
  36. Weights

    • Just an ordinary linear model now
    • Basis functions in a matrix B

    [Figure 4.13: March temperature against year]

    Figure 4.13. A cubic spline with 15 knots. The top plot is, just like in the previous figure, the basis functions. However now more of these overlap. The middle plot is again each basis weighted by its corresponding parameter. And the sum of these weighted basis functions, at each point, produces the spline shown at the bottom, displayed as a […]% posterior interval of µ.

    R code:
        m4.7 <- quap(
            alist(
                T ~ dnorm( mu , sigma ) ,
                mu <- a + B %*% w ,
                a ~ dnorm(6,10),
                w ~ dnorm(0,1),
                sigma ~ dexp(1)
            ),
            data=list( T=d2$temp , B=B ) ,
            start=list( w=rep( 0 , ncol(B) ) ) )

    You can look at the posterior means if you like with precis(m4.7, depth=2). But it won't reveal much. You should see […] w parameters. But you can't tell what the model thinks from …
  37. Weights

    … variables are used to gently transition from one part to the next. Essentially, these variables tell you which knot you are close to. Beginning on the left of the top plot, basis function 1 has value 1 and all of the others are set to zero. As we move rightwards towards the second knot, basis 1 declines and basis 2 increases. At knot 2, basis 2 has value 1 …

    [Figure 4.12: basis value against year (basis functions 1 to 5, year 1306 highlighted); basis * weight against year; March temperature against year]
  38. [Figure 4.12: three panels against year: basis value (basis functions 1 to 5, year 1306 highlighted), basis * weight, March temperature]
  39. Figure 4.13

    15 knots
    3rd degree basis functions
    (all code in text)

    [Figure 4.13: three panels against year: basis value, basis * weight, March temperature]
  40. [Figure 4.13: three panels against year: basis value, basis * weight, March temperature; the basis * weight panel is shown a second time]
  41. [Figure 4.13: three panels against year: basis value, basis * weight, March temperature; the basis * weight panel is shown a second time]
  42. Spline possibilities

    • Knots and basis degree are choices
    • Must worry about overfitting data (Chapter 7)
    • Other types of splines don’t require knots
    • Another idea: Gaussian Processes (Chapter 14)
    • All splines are descriptive, not mechanistic
  43. Work

    • Homework online later today
    • In the New Year:
      • Multiple regression
      • Causal graphs
      • Colliders
      • Overfitting
      • Multilevel models
      • Adventures in covariance