Statistical Rethinking - Lecture 02

Statistical Rethinking: A Bayesian Course with R Examples - Lecture 02, Chapters 2 and 3

Richard McElreath

January 08, 2015

Transcript

  1. Homework • Homework problems on the website • Due in

    one week • Need to install rethinking R package • Make sure R is up to date (3.1.2)
  2. Counts to plausibility • Plausibility is probability: a set of non-negative real numbers that sum to one.

    Probability theory is just a set of shortcuts for counting possibilities. • Book excerpt: "[...] sum of products. There's nothing special really about standardizing to one. Any value will do. But using the number 1 ends up making the mathematics more convenient. Consider again the table from before, now updated using our definitions of p and 'plausibility': [Table columns: possible composition, p, ways to produce data, plausibility.] You can quickly compute these plausibilities in R:
        ways <- c( 3 , 8 , 9 )
        ways/sum(ways)
        0.15 0.40 0.45
    These plausibilities are also probabilities: they are nonnegative (zero or positive) real numbers that sum to one. And all of the mathematical things you can do with [...]"
  3. Building a model • How to use probability to do

    typical statistical modeling? 1. Design the model (data story) 2. Condition on the data (update) 3. Evaluate the model (critique)
  4. Design > Condition > Evaluate • Data story motivates the

    model • How do the data arise? • For WLWWWLWLW: • Some true proportion of water, p • Toss globe, probability p of observing W, 1–p of L • Each toss therefore independent of other tosses • Translate data story into probability statements
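The data story on this slide can be run forward as a simulation. The following is a minimal sketch rather than code from the lecture: the value `p_true = 0.7`, the seed, and all variable names are assumptions chosen for illustration.

```r
# Simulate the globe-tossing data story: every toss is independent and
# shows water (W) with probability p, land (L) with probability 1 - p.
set.seed(1)          # arbitrary seed, only so the run is repeatable
p_true <- 0.7        # assumed "true" proportion of water (illustrative)
n_toss <- 9          # same number of tosses as WLWWWLWLW

tosses <- sample(c("W", "L"), size = n_toss, replace = TRUE,
                 prob = c(p_true, 1 - p_true))
n_water <- sum(tosses == "W")

print(tosses)        # one simulated sequence of W and L
print(n_water)       # count of W in that sequence
```

Running it with different values of `p_true` gives a feel for which sequences each conjecture tends to produce.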
  5. Design > Condition > Evaluate • Bayesian updating defines optimal

    learning in small world, converts prior into posterior • Give your golem an information state, before the data: Here, an initial confidence in each possible value of p between zero and one • Condition on data to update information state: New confidence in each value of p, conditional on data
  6. [Figure: plausibility of each value of p (proportion W), with the curve labeled "prior"; background panels show confidence over probability of water (0 to 1) for the tosses W L W W W L W L W.]
  7. [Figure: plausibility of each value of p (proportion W), with curves labeled "prior" and "posterior"; background panels show confidence over probability of water (0 to 1) for the tosses W L W W W L W L W.]
  8. [Figure: nine panels, n = 1 to n = 9, each showing confidence over probability of water (0 to 1) after one more toss in the sequence W L W W W L W L W; companion panels show plausibility over proportion water (0 to 1).]
  9. Design > Condition > Evaluate • Data order irrelevant, because

    golem assumes order irrelevant • All-at-once, one-at-a-time, shuffled order all give same posterior • Every posterior is a prior for next observation • Every prior is posterior of some other inference • [Figure 2.5. How a Bayesian model learns. Each toss of the globe produces an observation of water (W) or land (L). The model's estimate of the proportion of water on the globe is a plausibility for every possible value. The lines and curves in this figure are these collections of plausibilities. In each plot, previous plausibilities (dashed curve) are updated in light of the latest [...]]
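The claim that all-at-once, one-at-a-time, and reordered updating give the same posterior can be checked numerically. The sketch below is an illustration under assumptions, not the lecture's code: it uses a 1000-point grid and a flat starting prior, and the helper names `lik_one` and `update_sequentially` are hypothetical.

```r
p_grid <- seq(from = 0, to = 1, length.out = 1000)
tosses <- c("W", "L", "W", "W", "W", "L", "W", "L", "W")

# Likelihood of a single toss at every grid value of p
lik_one <- function(toss, p) if (toss == "W") p else 1 - p

# Update one toss at a time: each posterior is the prior for the next toss
update_sequentially <- function(tosses, p) {
  post <- rep(1, length(p))            # flat prior before any data
  for (toss in tosses) {
    post <- post * lik_one(toss, p)
    post <- post / sum(post)           # standardize so it sums to one
  }
  post
}

post_seq      <- update_sequentially(tosses, p_grid)
post_reversed <- update_sequentially(rev(tosses), p_grid)  # different order

# All at once: binomial likelihood of 6 W in 9 tosses, flat prior
post_all <- dbinom(6, size = 9, prob = p_grid)
post_all <- post_all / sum(post_all)

print(all.equal(post_seq, post_all))       # same posterior, up to rounding
print(all.equal(post_seq, post_reversed))  # order of tosses does not matter
```

The agreement follows from the model's assumption of independent tosses: the posterior depends on the data only through the product of the per-toss likelihoods, and a product does not care about order.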
  10. Design > Condition > Evaluate • Bayesian inference: Logical answer

    to a question in the form of a model • Golem must be supervised • Did the golem malfunction? • Does the golem’s answer make sense? • Does the question make sense? • Check sensitivity of answer to changes in assumptions
  11. Likelihood • Pr(data|assumptions) • Defines probability of each observation, conditional

    “|” on assumptions • i.e. relative count of number of ways of seeing data, given a particular conjecture • In this case, binomial probability: $\Pr(n_W \mid n, p) = \frac{n!}{n_W!\,(n - n_W)!}\, p^{n_W} (1 - p)^{n - n_W}$ • Read as: the count of W's, $n_W$, is distributed binomially, with probability p of a W on each toss and n tosses in total.
  12. Likelihood • $\Pr(n_W \mid n, p) = \frac{n!}{n_W!\,(n - n_W)!}\, p^{n_W} (1 - p)^{n - n_W}$

    The count of W’s is distributed binomially, with probability p of a W on each toss and n tosses total. • The binomial density formula is built into R, so you can easily compute the likelihood of the data (6 W's in 9 tosses) under any value of p. R code:
        dbinom( 6 , size=9 , prob=0.5 )
        0.1640625
    Here 6 is the count of W, 9 is the number of tosses, and 0.5 is the probability of W.
  13. Book excerpt: "[...] are no other events. The globe never gets stuck to the ceiling, for example. When we observe a sample of W's and L's of length N (9 in the actual sample), we need to say how likely that exact sample is, out of the universe of potential samples of the same length. That might sound challenging, but it's the kind of thing you get good at very quickly, once you start practicing.

    In this case, once we add our assumptions that (1) every toss is independent of the other tosses and (2) the probability of W is the same on every toss, probability theory provides a unique answer, known as the binomial distribution. This is the common "coin tossing" distribution. And so the probability of observing w W's in n tosses, with a probability p of W, is:

    $$\Pr(w \mid n, p) = \frac{n!}{w!\,(n - w)!}\, p^{w} (1 - p)^{n - w}$$

    Read the above as: The count of "water" observations w is distributed binomially, with probability p of "water" on each toss and n tosses in total. And the binomial distribution formula is built into R, so you can easily compute the likelihood of the data (6 W's in 9 tosses) under any value of p with the following R code:
        dbinom( 6 , size=9 , prob=0.5 )
        [1] 0.1640625
    That number is the relative number of ways to get 6 W's, holding p at 0.5 and n at 9. So it does the job of counting relative number of paths through the garden. Change the 0.5 to any other value, to see how the value changes.

    Sometimes, likelihoods are written L(p|w, n): the likelihood of p, conditional on w and n. Note however that this notation reverses what is on the left side of the | symbol. Just keep in mind that the job of the likelihood is to tell us the relative number of ways to see the data w, given values for p and n."
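To connect the binomial formula to the built-in function, the formula can be written out by hand and compared with `dbinom`. This is a small sketch for illustration; the helper name `binom_by_hand` is an assumption and does not appear in the lecture.

```r
# Binomial probability written directly from the formula:
# n! / (w! (n - w)!) * p^w * (1 - p)^(n - w)
binom_by_hand <- function(w, n, p) {
  choose(n, w) * p^w * (1 - p)^(n - w)
}

by_hand  <- binom_by_hand(w = 6, n = 9, p = 0.5)
built_in <- dbinom(6, size = 9, prob = 0.5)

print(c(by_hand = by_hand, built_in = built_in))  # both 0.1640625
```

With p = 0.5 the arithmetic is easy to follow: `choose(9, 6)` is 84 and `0.5^9` is 1/512, so the likelihood is 84/512 = 0.1640625.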
  14. Parameters • Likelihood contains symbols: nW , n, p •

    Some are data (nW , n) • Others parameters (p) • Define targets of inference, what is updated • These were the conjectures in the bag example • Which are data and which parameters depend upon your context and question • e.g. mark-recapture: know nW , must infer n, p
  15. Prior • What the golem believes before the data •

    Probability of p • Likelihood & prior define golem’s perspective on the data • [Figure: plausibility of each value of p (proportion W), with the curve labeled "prior".]
  16. Prior • Globe tossing model, a uniform (flat) prior: $\Pr(p) = \frac{1}{1 - 0} = 1$

    The prior probability of p is assumed to be uniform in the interval from zero to one. • Book excerpt (cropped at the left margin): "[...] doesn't resolve the problem of providing a prior, because at the dawn of [...] n = 0, the machine still had an initial estimate for the parameter p [...] specifying equal confidence in every possible value. You could write the [...] example as Pr(p) = 1/(1 − 0) = 1. The prior is a probability distribution for the parameter. In general, for a uniform [...] from a to b, the probability of any point in the interval is 1/(b − a). If you're [...] by the fact that the probability of every value of p is 1, hang on to [...]. You'll get the explanation in a later chapter. But also know that [...] perfectly normal and healthy. The prior is what the machine "believes" before it sees the data. It is part of the model, not a reflection necessarily of what you believe. Clearly the likelihood contains many assumptions that are unlikely to be exactly true [...]"
  17. Prior • Huge literature on choice of prior • Flat

    prior conventional, but hardly ever best choice • Always know something (before data) that can improve inference • Are zero and one plausible values for p? Is p < 0.5 as plausible as p > 0.5? • Don’t have to get it exactly right; just need to improve Late Cretaceous (90Mya)
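One way to see how the choice of prior matters is to run the same data through two priors and compare the posteriors. The sketch below is an illustration under assumptions: the step prior that rules out p < 0.5 is a hypothetical example of "knowing something before the data", and the helper `grid_posterior` is not from the lecture.

```r
p_grid     <- seq(from = 0, to = 1, length.out = 1000)
likelihood <- dbinom(6, size = 9, prob = p_grid)   # 6 W in 9 tosses

# Standardized product of likelihood and prior on the grid
grid_posterior <- function(prior) {
  unstd <- likelihood * prior
  unstd / sum(unstd)
}

flat_prior <- rep(1, length(p_grid))
step_prior <- ifelse(p_grid < 0.5, 0, 1)   # hypothetical: p < 0.5 ruled out

post_flat <- grid_posterior(flat_prior)
post_step <- grid_posterior(step_prior)

# Compare posterior means under the two priors
print(c(flat = sum(p_grid * post_flat),
        step = sum(p_grid * post_step)))
```

The same function can be reused to check how sensitive the answer is to any other prior vector of the same length.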
  18. Posterior • Bayesian estimate is always posterior distribution over parameters,

    Pr(parameters|data) • Here: $\Pr(p \mid n_W)$ • Compute using Bayes’ theorem: $\Pr(p \mid n_W) = \frac{\Pr(n_W \mid p)\,\Pr(p)}{\Pr(n_W)}$ • Book excerpt (cropped at the left margin): "[...] and the prior probability Pr(p). This is like saying that the probability [...] wind on the same day is equal to the probability of rain, when it's [...] probability of wind. This much is just definition. But it's just [...] $\Pr(n_W, p) = \Pr(p \mid n_W)\,\Pr(n_W)$. [...] done is reverse which probability is conditional, on the right-hand [...] a true definition. Now since both right-hand sides are equal to the [...], we can set them equal to one another and solve for the posterior probability [...]: $\Pr(p \mid n_W) = \frac{\Pr(n_W \mid p)\,\Pr(p)}{\Pr(n_W)}$. [...] is Bayes' theorem. It says that the probability of any particular value [...] the data, is equal to the product of the likelihood and prior, divided by this thing $\Pr(n_W)$, which I'll call the average likelihood. In word form: Posterior = Likelihood × Prior / Average Likelihood. The average likelihood, $\Pr(n_W)$, can be confusing. It is commonly called [...] or the "probability of the data," neither of which is a good name [...]"
  19. [Figure: three rows of panels over p from 0 to 1, each row showing prior × likelihood ∝ posterior.]
  20. Computing the posterior 1. Analytical approach (often impossible) 2. Grid

    approximation (very intensive) 3. Quadratic approximation (approximate) 4. Markov chain Monte Carlo (intensive)
  21. Grid approximation • The posterior is: standardized product of the

    likelihood and prior. • Grid approximation uses finite grid of parameter values instead of continuous space
  22. Quadratic approximation • Assume posterior is normally distributed • Can

    estimate with two numbers: • Peak of posterior, maximum a posteriori (MAP) • Standard deviation of posterior • Lots of algorithms • With flat priors, same as conventional maximum likelihood estimation
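The two numbers of the quadratic approximation can be found with a general-purpose optimizer. The sketch below uses base R's `optim` on the globe-tossing data (6 W in 9 tosses) with a flat prior; it is one possible implementation under those assumptions, not the routine used in the course, and the names `neg_log_post`, `map_est`, and `sd_est` are illustrative.

```r
n_water <- 6
n_toss  <- 9

# Negative log posterior. With a flat prior the log prior is a constant,
# so only the binomial log likelihood is needed.
neg_log_post <- function(p) {
  -dbinom(n_water, size = n_toss, prob = p, log = TRUE)
}

# Find the peak and the curvature at the peak (the Hessian)
fit <- optim(par = 0.5, fn = neg_log_post, method = "L-BFGS-B",
             lower = 0.001, upper = 0.999, hessian = TRUE)

map_est <- fit$par                       # peak of the posterior (MAP)
sd_est  <- sqrt(1 / fit$hessian[1, 1])   # sd of the approximating normal

print(c(MAP = map_est, SD = sd_est))
```

Because the prior is flat, the peak coincides with the maximum likelihood estimate 6/9, and the standard deviation from the curvature matches the usual standard error sqrt(p(1 − p)/n) evaluated at that peak.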
  23. Sampling from the posterior • Incredibly useful to sample randomly

    from the posterior • Visualize uncertainty • Compute confidence intervals • Simulate observations • MCMC produces only samples • Above all, easier to think with samples
  24. Sampling from the posterior • Recipe: 1. Compute or estimate

    posterior 2. Sample with replacement from posterior 3. Compute stuff from samples
  25. Compute posterior • Grid approximation

    R code:
        p <- seq( from=0 , to=1 , length.out=1000 )
        prior <- rep( 1 , 1000 )
        likelihood <- dbinom( 6 , size=9 , prob=p )
        posterior <- likelihood * prior
        posterior <- posterior / sum(posterior)

    Book excerpt: "Now we wish to draw 10 thousand samples from this posterior. Imagine the posterior is a bucket full of parameter values, numbers such as [...], etc. Within the bucket, each value exists in proportion to its posterior probability, such that values near the peak are much more common than those in the tails. We're going to scoop out 10 thousand values from the bucket. Provided the bucket is well mixed, the resulting samples will have the same proportions as the exact posterior density. Here's how you can do this in R, with one line of code: [...]"

    [Figure 3.1. Sampling parameter values from the posterior distribution. Left: 10 thousand samples from the posterior implied by the globe tossing data and model (axes: sample, probability of water). Right: The density of samples (vertical) at each parameter value (horizontal).]
  26. [Figure: the grid approximation vectors p, prior, likelihood, and posterior, each plotted against grid index (0 to 1000), shown over the R code for the grid approximation and for drawing samples.]
  27. Sample from posterior • R code:

        samples <- sample( p , prob=posterior , size=1e4 , replace=TRUE )

    The workhorse here is sample, which randomly pulls values from a vector. The vector in this case is p, the grid of parameter values. The resulting samples are displayed in Figure 3.1. On the left, all 10 thousand (1e4) random samples are shown sequentially. R code:
        plot( samples )
    [Figure 3.1, right panel: density of samples (vertical) over proportion water, p (horizontal).]
  28. Compute stuff • Summary tasks • How much posterior probability

    below/above/between specified parameter values? • Which parameter values contain 50%/80%/95% of posterior probability? “Confidence” intervals • Which parameter value maximizes posterior probability? Minimizes posterior loss? Point estimates • You decide the question
  29. Figure 3.2 • Intervals of defined boundary ask how much mass? • Intervals of defined mass ask which values?

    [Figure 3.2. Two kinds of confidence interval. Four panels of density over proportion water (p); the bottom panels are labeled "lower 80%" and "middle 80%". Top row: intervals of defined [...]]

    Book excerpt: "Once your model produces a posterior distribution, the model's work is done. But your work has just begun. It is necessary to summarize the posterior distribution. Exactly how it is summarized depends upon your purpose. But common questions include:
    • How much posterior probability lies below some parameter value?
    • How much posterior probability lies between two parameter values?
    • Which parameter value marks the lower [...] of the posterior probability?
    • Which range of parameter values contains [...] of the posterior probability?
    • Which parameter value has highest posterior probability?
    These simple questions can be usefully divided into (1) questions about intervals of defined boundaries, (2) questions about intervals of defined probability mass, and (3) questions about point estimates. We'll see how to approach these questions using samples from the posterior.

    Intervals of defined boundaries. Suppose I ask you for the posterior probability that the proportion of water is less than 0.5. Using the grid-approximate posterior, you can just add up all of the probabilities, where the corresponding parameter value is less than 0.5. R code:
        # add up posterior probability where p < 0.5
        sum( posterior[ p_grid < 0.5 ] )
        [1] 0.1718746
    So about 17% of the posterior probability is below 0.5. Couldn't be easier. But since grid approximation isn't practical in general, it won't always be so easy. Once there is more than one parameter in the posterior distribution (wait until the next chapter for that complication), even this simple sum is no longer very simple.

    So let's see how to perform the same calculation, using samples from the posterior. This approach does generalize to complex models with many parameters, and so you can use it everywhere. All you have to do is similarly add up all of the samples below 0.5, but also divide the resulting count by the total number of samples. In other words, find the frequency of parameter values below 0.5. R code:
        sum( samples < 0.5 ) / 1e4
        [1] 0.1726
    And that's nearly the same answer as the grid approximation provided, although your answer will not be exactly the same, because the exact samples you drew from the posterior will be different. This region is shown in the upper-left plot in Figure 3.2. Using the same approach, you can ask how much posterior probability lies between [...] and [...]"
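The excerpt closes by asking how much posterior probability lies between two boundaries. The sketch below shows that calculation both on the grid and from samples. The bounds 0.5 and 0.75 are assumed for illustration (the values in the excerpt did not survive), as is the seed; the setup repeats the grid approximation for 6 W in 9 tosses so the block runs on its own.

```r
set.seed(100)   # arbitrary seed so the sampled answer is repeatable

# Grid-approximate posterior for 6 W in 9 tosses, flat prior
p_grid     <- seq(from = 0, to = 1, length.out = 1000)
prior      <- rep(1, 1000)
likelihood <- dbinom(6, size = 9, prob = p_grid)
posterior  <- likelihood * prior
posterior  <- posterior / sum(posterior)

# 10 thousand samples from the posterior
samples <- sample(p_grid, size = 1e4, replace = TRUE, prob = posterior)

lower <- 0.5    # assumed boundaries, chosen for illustration
upper <- 0.75

# Mass between the boundaries: exact on the grid, approximate from samples
mass_grid    <- sum(posterior[p_grid > lower & p_grid < upper])
mass_samples <- mean(samples > lower & samples < upper)

print(c(grid = mass_grid, samples = mass_samples))
```

Taking the mean of a logical vector is the same as counting the TRUE values and dividing by the number of samples, which is the "find the frequency" step described in the excerpt.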
  30. Figure 3.2 • Intervals of defined boundary ask how much mass? • Intervals of defined mass ask which values?

    [Figure 3.2. Two kinds of confidence interval. Four panels of density over proportion water (p); the bottom panels are labeled "lower 80%" and "middle 80%". Top row: intervals of defined [...]]
  31. • Percentile intervals (PI): equal area in each tail •

    Highest posterior density intervals (HPDI): narrowest interval containing mass • [Figure 3.3. The difference between percentile and highest posterior density confidence intervals. The posterior density here corresponds to a flat prior and observing three water samples in three total tosses of the globe. Left: 50% percentile interval. This interval assigns equal mass (25%) to both the left and right tail. As a result, it omits the most probable parameter value, p = 1. Right: 50% highest posterior density interval, HPDI. This interval finds the narrowest region with 50% of the posterior probability. Such a region always includes the most probable parameter value.]
  32. Figure 3.3 • [Panels: 50% Percentile Interval and 50% HPDI, density over proportion water (p).]

    Book excerpt (cropped at the margins): "[...] with the data, they are not perfect. Consider the posterior distribution and different intervals in Figure 3.3. This posterior is consistent with observing 3 waters in 3 tosses and a uniform (flat) prior. It is highly skewed, having its maximum value at the boundary, p = 1. You can compute it, via grid approximation, with the following R code:
        p_grid <- seq( from=0 , to=1 , length.out=1000 )
        prior <- rep(1,1000)
        likelihood <- dbinom( 3 , size=3 , prob=p_grid )
        posterior <- likelihood * prior
        posterior <- posterior / sum(posterior)
        samples <- sample( p_grid , size=1e4 , replace=TRUE , prob=posterior )
    This code also goes ahead to sample from the posterior. Now, on the left of Figure 3.3, the 50% percentile confidence interval is shaded. You can conveniently compute this from samples with PI (part of rethinking). R code:
        PI( samples , prob=0.5 )
              25%       75%
        0.7037037 0.9329329
    This interval assigns 25% of the probability mass above and below the interval. So it provides the central 50% probability. But in this example, it ends up excluding the most probable parameter values, near p = 1. So in terms of describing the shape of the posterior distribution, which is really all these [...] be misleading.

    In contrast, the right-hand plot in Figure 3.3 [...] density interval (HPDI). The HPDI [...] probability mass. If you think about it, [...] with the same mass. But if you want an [...] most consistent with the data, then you [...] HPDI is. Compute it from the samples with [...]. R code:
        HPDI( samples , prob=0.5 )
        lower 0.5 upper 0.5
        0.8418418 1.0000000
    This interval captures the parameters with [...] noticeably narrower: [...] in width rather than [...]. So the HPDI has some advantages [...] interval are very similar. They only [...]"
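Both kinds of interval can also be computed without any add-on package. The following is a sketch in base R under stated assumptions: `percentile_interval` and `hpd_interval` are hypothetical helper names written for this illustration, and the sliding-window search is one simple way to find the narrowest interval, not necessarily how the rethinking package does it. The posterior is the skewed one from Figure 3.3 (3 W in 3 tosses, flat prior).

```r
set.seed(100)   # arbitrary seed for repeatability

# Posterior for 3 W in 3 tosses with a flat prior, then 10 thousand samples
p_grid    <- seq(from = 0, to = 1, length.out = 1000)
posterior <- dbinom(3, size = 3, prob = p_grid)
posterior <- posterior / sum(posterior)
samples   <- sample(p_grid, size = 1e4, replace = TRUE, prob = posterior)

# Percentile interval: equal probability mass left in each tail
percentile_interval <- function(x, prob = 0.5) {
  tail_mass <- (1 - prob) / 2
  quantile(x, probs = c(tail_mass, 1 - tail_mass), names = FALSE)
}

# Highest posterior density interval: slide a window holding `prob` of the
# sorted samples and keep the narrowest one
hpd_interval <- function(x, prob = 0.5) {
  x <- sort(x)
  n <- length(x)
  k <- ceiling(prob * n)                  # samples inside each window
  starts <- seq_len(n - k + 1)
  widths <- x[starts + k - 1] - x[starts]
  best   <- which.min(widths)
  c(lower = x[best], upper = x[best + k - 1])
}

pi50  <- percentile_interval(samples, prob = 0.5)
hpd50 <- hpd_interval(samples, prob = 0.5)

print(pi50)    # central 50%: excludes values near p = 1
print(hpd50)   # narrowest 50%: reaches up toward p = 1
```

For a roughly symmetric posterior the two functions return nearly the same interval; they differ noticeably only for skewed posteriors like this one.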
  33. Point estimates • Don’t usually want point estimates • Entire

    posterior contains more information • “Best” point depends upon purpose • Mean nearly always more sensible than mode • [Figure 3.3, left panel: 50% percentile interval of the posterior density over proportion water (p).]
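To make "best point depends upon purpose" concrete, the sketch below computes several candidate point estimates from one posterior. It assumes the skewed posterior of Figure 3.3 (3 W in 3 tosses, flat prior) and uses absolute loss as one example of a purpose; the variable names and the seed are illustrative.

```r
set.seed(100)   # arbitrary seed for repeatability

p_grid    <- seq(from = 0, to = 1, length.out = 1000)
posterior <- dbinom(3, size = 3, prob = p_grid)
posterior <- posterior / sum(posterior)
samples   <- sample(p_grid, size = 1e4, replace = TRUE, prob = posterior)

# Three familiar point estimates
map_point    <- p_grid[which.max(posterior)]   # mode (MAP)
mean_point   <- mean(samples)
median_point <- median(samples)

# Expected absolute loss of reporting each grid value d as the estimate,
# averaged over the posterior
abs_loss <- sapply(p_grid, function(d) sum(posterior * abs(d - p_grid)))
best_abs <- p_grid[which.min(abs_loss)]        # minimizes absolute loss

print(c(mode = map_point, mean = mean_point,
        median = median_point, min_abs_loss = best_abs))
```

Here the mode sits at the boundary p = 1 while the mean and median fall well below it, and the value that minimizes expected absolute loss lands at (nearly) the posterior median. A different loss function would single out a different point.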
  34. Predictive checks • Posterior probability never enough • Even the

    best model might make terrible predictions • Also want to check model assumptions • Predictive checks: Can use samples from posterior to simulate observations • NB: Assumption about sampling is assumption
  35. [Figure 3.4: posterior probability over probability of water (0 to 1) with three values marked, (A) p = 0.38, (B) p = 0.64, (C) p = 0.89; for each, a histogram of frequency over number of water samples (0 to 9); and the merged distribution combining A, B, and C.]
  36. Posterior predictions • One line of code • Will get

    harder, later. But strategy remains the same. • R code:
        nw <- rbinom( 1e4 , size=9 , prob=samples )
    Book excerpt: "The symbol samples above is the same list of random samples from the posterior density that you've used in previous sections. For each sampled value, a random binomial observation is generated. Since the sampled values appear in proportion to their posterior probabilities, the resulting simulated observations are averaged over the posterior. You can manipulate these simulated observations just like you manipulate samples from the posterior: you can compute intervals and [...]"

    [Figure 3.4. Simulating predictions from the total posterior. Left: The familiar posterior density for the globe-tossing data. Three example parameter values (0.38, 0.64, 0.89) are marked by the vertical lines. Middle column: Each of the three parameter values is used to simulate observations. Right: Combining simulated observation distributions for all parameter values (not just 0.38, 0.64, and 0.89), each weighted by its posterior probability, produces the posterior predictive density. This density propagates uncertainty about parameter to uncertainty about prediction. Observed value (6) highlighted.]
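Once the simulated observations exist, they can be tabulated and compared with the data. The sketch below rebuilds the posterior samples for 6 W in 9 tosses so it runs on its own, then summarizes the simulated counts. The seed and the names `pred_dist` and `share_observed` are assumptions for illustration.

```r
set.seed(100)   # arbitrary seed for repeatability

# Posterior samples for 6 W in 9 tosses, flat prior
p_grid    <- seq(from = 0, to = 1, length.out = 1000)
posterior <- dbinom(6, size = 9, prob = p_grid)
posterior <- posterior / sum(posterior)
samples   <- sample(p_grid, size = 1e4, replace = TRUE, prob = posterior)

# One simulated count of W (out of 9 tosses) per posterior sample
nw <- rbinom(1e4, size = 9, prob = samples)

# Posterior predictive distribution over counts 0..9
pred_dist <- table(factor(nw, levels = 0:9)) / length(nw)
print(round(pred_dist, 3))

# Share of simulations that reproduce the observed count of 6
share_observed <- mean(nw == 6)
print(share_observed)
```

The same simulated vector `nw` can feed other checks, for example comparing a different summary of the simulated data with the same summary of the observed data, depending on which model assumption is under scrutiny.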
  37. Predictive checks • Something like a significance test, but not

    • No universally best way to evaluate adequacy of model-based predictions • No way to justify always using a threshold like 5% • Good predictive checks always depend upon purpose and imagination “It would be very nice to have a formal apparatus that gives us some ‘optimal’ way of recognizing unusual phenomena and inventing new classes of hypotheses [...]; but this remains an art for the creative human mind.” —E.T. Jaynes (1922–1998)