
Statistical Rethinking - Lecture 11


Lecture 11 - Markov chain Monte Carlo - Statistical Rethinking: A Bayesian Course with R Examples

Richard McElreath

February 11, 2015

Transcript

  1. “Whatever Fortune has raised on high, she lifts but to bring low.” –Seneca, 1st century AD
    “How, therefore, is she good, who without discernment comes to both the good and to the bad? It profits one nothing to worship her if she is truly fortune...let the bad worship her...this supposed deity.” –St Augustine, 5th century AD
  2. Goals this week • Introduce Markov chain Monte Carlo •

    Understand the approach • Meet different strategies • Meet interface to MCMC, map2stan • See how to check chains for good health • See how to fix some common problems • Maybe start maximum entropy
  3. Contract: King Markov must visit each island in proportion to

    its population size. Here’s how he does it...
  4. (1) Flip a coin to choose the island on the left or right. Call it the “proposal” island.
  5. (2) Find the population of the proposal island.
  6. [build slide: islands labeled 1–7 shown]
  7. [build slide: proposal island’s population labeled p5]
  8. (3) Find the population of the current island: p4.
  9. (4) Move to the proposal, with probability = p5/p4.
  10. (5) Repeat from (1).
  11. This procedure ensures visiting each island in proportion to its population, in the long run.
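The five-step procedure above is easy to simulate. A minimal Python sketch, parallel to the lecture's R version (`king_markov` is a name invented here; island i has population proportional to i):

```python
import random
from collections import Counter

def king_markov(num_weeks=100000, num_islands=10, seed=1):
    """Metropolis on the archipelago: island i has population proportional to i."""
    random.seed(seed)
    current = num_islands          # start anywhere; here, the biggest island
    positions = []
    for _ in range(num_weeks):
        positions.append(current)                    # record current position
        proposal = current + random.choice([-1, 1])  # (1) coin flip: left or right
        if proposal < 1:                             # loop around the archipelago
            proposal = num_islands
        if proposal > num_islands:
            proposal = 1
        # (2)-(4): move with probability = proposal population / current population
        if random.random() < proposal / current:
            current = proposal
    return positions

visits = Counter(king_markov())
```

In the long run the visit counts grow in proportion to island number, exactly as the contract demands.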
  12. Metropolis algorithm [book excerpt:] …weeks, other times bouncing around without apparent pattern. But in the long run, this procedure guarantees that the king will be found on each island in proportion to its population size. You can prove this to yourself by simulating King Markov’s journey. Here’s a short piece of code to do this, storing the history of the king’s island positions in the vector positions.
    R code:
    num_weeks <- 1e5
    positions <- rep(0,num_weeks)
    current <- 10
    for ( i in 1:num_weeks ) {
        # record current position
        positions[i] <- current
        # flip coin to generate proposal
        proposal <- current + sample( c(-1,1) , size=1 )
        # now make sure he loops around the archipelago
        if ( proposal < 1 ) proposal <- 10
        if ( proposal > 10 ) proposal <- 1
        # move?
        prob_move <- proposal/current
        current <- ifelse( runif(1) < prob_move , proposal , current )
    }
  13. [Figure: histograms of the number of weeks spent on each island after 100, 500, and 2000 weeks, plus the island-by-week path over 10,000 weeks.]
  14. Markov’s chain of visits • Converges to correct proportions, in the long run • No matter which island he starts on • Works as long as proposals are symmetric • Example of the Metropolis algorithm [Figure: number of weeks on each island after 10,000 weeks]
  15. Metropolis and MCMC • Metropolis: Simple version of Markov chain Monte Carlo (MCMC) • Metropolis, Rosenbluth, Rosenbluth, Teller, and Teller (1953) [Shown: first page of “Equation of State Calculations by Fast Computing Machines,” Nicholas Metropolis, Arianna W. Rosenbluth, Marshall N. Rosenbluth, and Augusta H. Teller, Los Alamos Scientific Laboratory, and Edward Teller, Department of Physics, University of Chicago; J. Chem. Phys. 21(6), June 1953. Abstract: A general method, suitable for fast computing machines, for investigating such properties as equations of state for substances consisting of interacting individual molecules is described. The method consists of a modified Monte Carlo integration over configuration space. Results for the two-dimensional rigid-sphere system have been obtained on the Los Alamos MANIAC and are presented here. These results are compared to the free volume equation of state and to a four-term virial coefficient expansion.]
  16. MANIAC: Mathematical Analyzer, Numerical Integrator, and Computer MANIAC: 1000 pounds

    5 kilobytes of memory 70k multiplications/sec Your laptop: 4–7 pounds 2–8 million kilobytes Billions of multiplications/sec
  17. Metropolis and MCMC • Metropolis: Simple version of Markov chain

    Monte Carlo (MCMC) • Chain: Sequence of draws from distribution • Markov chain: History doesn’t matter, just where you are now • Monte Carlo: Random simulation Andrei Andreyevich Markov (Ма́рков) (1856–1922)
  18. Metropolis and MCMC • Usual use is to draw samples

    from a posterior distribution • “Islands”: parameter values • “Population size”: proportional to posterior probability • Works for any number of dimensions (parameters) • Works for continuous as well as discrete parameters
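The island metaphor carries over directly to a continuous parameter: the “island” becomes a parameter value and the “population” becomes the (unnormalized) posterior density. A hedged Python sketch of random-walk Metropolis in one dimension (`metropolis` is an illustrative name, not rethinking-package code):

```python
import math
import random

def metropolis(log_post, start=0.0, n_samples=20000, step=1.0, seed=2):
    """Random-walk Metropolis for one continuous parameter.
    The symmetric Gaussian proposal plays the role of the coin flip;
    the posterior density plays the role of island population."""
    random.seed(seed)
    x = start
    samples = []
    for _ in range(n_samples):
        proposal = x + random.gauss(0.0, step)
        # accept with probability min(1, post(proposal) / post(current))
        if math.log(random.random()) < log_post(proposal) - log_post(x):
            x = proposal
        samples.append(x)
    return samples

# target: an unnormalized standard-normal log-posterior
samples = metropolis(lambda x: -0.5 * x * x)
```

The same accept/reject rule works for any number of parameters; only the proposal changes.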
  19. Why MCMC? • Real value of MCMC seen when we

    can’t write an integrated likelihood function • Many kinds of problems are like this: • Many multilevel models • Networks/phylogenies • Some spatial models
  20. MCMC strategies • Metropolis: Granddaddy of them all • Metropolis-Hastings

    (MH): More general • Gibbs sampling (GS): Efficient version of MH • Hamiltonian Monte Carlo (HMC) • All remain useful • New methods being developed
  21. Gibbs sampling • Metropolis algorithm requires symmetrical proposals • Metropolis-Hastings (MH) does not • Gibbs sampling: Version of MH with very clever proposals • requires choosing certain kinds of priors (conjugate priors) • Basis of BUGS, JAGS
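Gibbs sampling's “clever proposals” are exact draws from each parameter's full conditional distribution, which conjugacy makes available in closed form. A minimal Python sketch for the mean and variance of Gaussian data, assuming the improper prior p(mu, sigma²) ∝ 1/sigma² so both conditionals are standard (`gibbs_normal` is a name invented here, not BUGS/JAGS code):

```python
import random
import statistics

def gibbs_normal(y, n_samples=4000, seed=3):
    """Gibbs sampler for the mean and variance of Gaussian data under the
    improper prior p(mu, sigma2) proportional to 1/sigma2, so that:
      mu     | sigma2, y ~ Normal(mean(y), sigma2 / n)
      sigma2 | mu, y     ~ Inverse-Gamma(n / 2, sum((y - mu)^2) / 2)"""
    random.seed(seed)
    n = len(y)
    ybar = sum(y) / n
    sigma2 = statistics.pvariance(y)      # starting value
    draws = []
    for _ in range(n_samples):
        # draw mu from its full conditional given sigma2
        mu = random.gauss(ybar, (sigma2 / n) ** 0.5)
        # draw sigma2 from its full conditional given mu:
        # 1/sigma2 ~ Gamma(n/2, rate=ss/2), so sigma2 = ss / (2 * Gamma(n/2, scale=1))
        ss = sum((yi - mu) ** 2 for yi in y)
        sigma2 = ss / (2.0 * random.gammavariate(n / 2.0, 1.0))
        draws.append((mu, sigma2))
    return draws
```

Each proposal is accepted with probability 1, which is why Gibbs can be so efficient when the conditionals are available.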
  22. [Figure, “Markov chain Monte Carlo estimation”: the first samples from Gibbs sampling (left) and fixed-step Metropolis-Hastings (right), plotted in (mu, sigma) space; the black plus signs in each plot locate …]
  23. Hamiltonian Monte Carlo • Problem with Gibbs sampling (GS) • Models with many parameters usually have lots of highly correlated parameters • GS gets stuck, degenerates towards random walk • At best, inefficient because it re-explores regions • Hamiltonian dynamics to the rescue • represent parameter state as a particle in n-dimensional space • flick it around a frictionless log-posterior surface • record positions — William Rowan Hamilton (1805–1865), commemorated on an Irish Euro coin
  24. Hamiltonian Monte Carlo • Population density curve: log-posterior • Position

    of car: parameter vector • Speed of car: momentum of parameter values • Go fast when high • Go slow when low • Samples of position through time comprise samples from posterior distribution
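The particle-and-surface metaphor corresponds to a leapfrog integrator plus a Metropolis accept/reject step on the total energy. A bare-bones Python sketch for a 2-dimensional standard-normal target (illustrative only; Stan adds adaptive step sizes and the no-U-turn criterion):

```python
import math
import random

def hmc_step(q, grad_log_post, log_post, eps=0.1, L=20):
    """One HMC transition: resample momentum ("flick the particle"), run L
    leapfrog steps over the frictionless log-posterior surface, then
    accept/reject on the total energy to correct discretization error."""
    p = [random.gauss(0.0, 1.0) for _ in q]
    q_new, p_new = list(q), list(p)
    g = grad_log_post(q_new)
    for _ in range(L):
        # leapfrog: half momentum step, full position step, half momentum step
        p_new = [pi + 0.5 * eps * gi for pi, gi in zip(p_new, g)]
        q_new = [qi + eps * pi for qi, pi in zip(q_new, p_new)]
        g = grad_log_post(q_new)
        p_new = [pi + 0.5 * eps * gi for pi, gi in zip(p_new, g)]
    h_old = -log_post(q) + 0.5 * sum(pi * pi for pi in p)
    h_new = -log_post(q_new) + 0.5 * sum(pi * pi for pi in p_new)
    return q_new if math.log(random.random()) < h_old - h_new else list(q)

# target: two independent standard normals (a stand-in "posterior")
log_post = lambda q: -0.5 * sum(qi * qi for qi in q)
grad_log_post = lambda q: [-qi for qi in q]

random.seed(4)
q, samples = [1.0, 1.0], []
for _ in range(2000):
    q = hmc_step(q, grad_log_post, log_post)
    samples.append(q)
```

Because each trajectory glides across the surface instead of taking a single random step, successive samples are nearly independent.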
  25. [Figure: samples in a 2-D parameter space (parameter 1 vs. parameter 2)]
  26. [Figure: the same parameter space explored by Gibbs sampling]
  27. [Figure: the same parameter space explored by HMC — “Hamiltonian awesome sauce”]
  28. MCMC from Hamiltonian dynamics [Figure 6 from Neal (2011): trace of the last position coordinate over 1000 iterations, random-walk Metropolis vs. Hamiltonian Monte Carlo. Caption: “Values for the variable with largest standard deviation for the 100-dimensional example, from a random-walk Metropolis run and an HMC run with L = 150. To match computation time, 150 updates were counted as one iteration for random-walk Metropolis. … works only when, as here, the distribution is tightly constrained in only one direction.”] Neal. 2011. MCMC from Hamiltonian dynamics.
  29. • mc-stan.org • Install RStan 1. Get C++ compiler 2.

    ??? 3. Profit Stanislaw Ulam (1909–1984)
  30. Stan is NUTS • No U-Turn Sampler (NUTS): Adaptive Hamiltonian Monte Carlo • Implemented in Stan (rstan: mc-stan.org) • We’ll use map2stan for now [Diagram: Formula → Stan model → C++ model → Reusable Sampler]
  31. HMC example • Back to terrain ruggedness, data(rugged) • Re-estimate

    using Stan • See how it works • See what to expect from healthy Markov chains
  32. MAP estimates [book excerpt:]
    dd <- d[ complete.cases(d$rgdppc_2000) , ]
    So you remember the old way: we’re going to repeat the procedure for fitting the interaction model. This model aims to predict log-GDP with terrain ruggedness, continent, and the interaction of the two. Here’s the way to do it with map, just like before.
    R code:
    m8.1 <- map(
        alist(
            log_gdp ~ dnorm( mu , sigma ) ,
            mu <- a + bR*rugged + bA*cont_africa + bAR*rugged*cont_africa ,
            a ~ dnorm(0,100) ,
            bR ~ dnorm(0,10) ,
            bA ~ dnorm(0,10) ,
            bAR ~ dnorm(0,10) ,
            sigma ~ dunif(0,10)
        ) ,
        data=dd )
    precis(m8.1)
           Mean StdDev  2.5% 97.5%
    a      9.22   0.14  8.95  9.49
    bR    -0.20   0.08 -0.35 -0.05
    bA    -1.95   0.22 -2.39 -1.51
    bAR    0.39   0.13  0.14  0.65
    sigma  0.93   0.05  0.83  1.03
    Just as you saw in the previous chapter. …
  33. HMC estimates • Can use same formula list as in map • Pre-process variable transformations • Make a trim data frame with only used variables
    [book excerpt:] But now we’ll also fit this model using Hamiltonian Monte Carlo. This means there will be no more quadratic approximation — if the posterior distribution is non-Gaussian, then we’ll get whatever non-Gaussian shape it has. You can use exactly the same formula list as before, but you need to do two additional things. (1) Preprocess all variable transformations. If the outcome is transformed somehow, like by taking the logarithm, then do this before fitting the model, by constructing a new variable in the data frame. Likewise, if any predictor variables are transformed, including squaring and cubing and such to build polynomial models, then compute these transformed values before fitting the model. (2) Once you’ve got all the variables ready, make a new trimmed-down data frame that contains only the variables you will actually use to fit the model. Technically, you don’t have to do this. But doing so avoids common problems. For example, if any of the unused variables have missing values (NA), then Stan will refuse to work. Here’s how to do both for the terrain ruggedness model.
    R code:
    dd$log_gdp <- log( dd$rgdppc_2000 )
    dd.trim <- dd[ , c("log_gdp","rugged","cont_africa") ]
    str(dd.trim)

    'data.frame': 170 obs. of 3 variables:
     $ log_gdp    : num 7.49 8.22 9.93 9.21 7.79 ...
     $ rugged     : num 0.858 3.427 0.769 0.775 2.688 ...
     $ cont_africa: int 1 0 0 0 0 0 0 0 0 1 ...
    The data frame dd.trim contains only the three variables we’re using.
  34. HMC estimates [book excerpt:] … The data frame dd.trim contains only the three variables we’re using. Estimation: now, provided you have the rstan package installed (mc-stan.org), you can get samples from the posterior distribution with this code.
    R code:
    m8.1stan <- map2stan(
        alist(
            log_gdp ~ dnorm( mu , sigma ) ,
            mu <- a + bR*rugged + bA*cont_africa + bAR*rugged*cont_africa ,
            a ~ dnorm(0,100) ,
            bR ~ dnorm(0,10) ,
            bA ~ dnorm(0,10) ,
            bAR ~ dnorm(0,10) ,
            sigma ~ dcauchy(0,2)
        ) ,
        data=dd.trim )
    There is one change to note here, but to explain later: the uniform prior on sigma has been changed to a half-Cauchy prior. The Cauchy distribution is a useful thick-tailed probability distribution related to the Student t distribution. There’s an Overthinking box about it later in the chapter. You can think of it as a good weakly-regularizing prior for standard deviations. We’ll use it again later in the chapter, and you’ll have many chances to get used to it as the book continues. After messages about translating, compiling, and sampling (see the Overthinking box further down for some explanations of these messages), map2stan returns an object that contains a bunch of summary information, as well as samples from the posterior density of …
  35. What is “Cauchy”? • One of many things named after

    Augustin-Louis Cauchy (KO-shee) • Ratio of two Gaussian samples • Useful distribution with thick tails • Parameters: location of mode, scale • Mean and variance undefined • Related to Lévy flights Baron Augustin-Louis Cauchy (1789–1857)
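The “ratio of two Gaussian samples” fact is easy to verify by simulation. A short Python sketch (`cauchy_draw` is a name invented here):

```python
import random
import statistics

def cauchy_draw(rng, loc=0.0, scale=1.0):
    """A Cauchy variate as the ratio of two independent standard Gaussians."""
    return loc + scale * rng.gauss(0, 1) / rng.gauss(0, 1)

rng = random.Random(5)
draws = [cauchy_draw(rng) for _ in range(10000)]
# the median recovers the location parameter even though the mean is undefined,
# and the thick tails throw extreme values far more often than a Gaussian would
med = statistics.median(draws)
tail_frac = sum(1 for x in draws if abs(x) > 10) / len(draws)
```

The running mean of these draws never settles down, which is exactly why the mean is undefined.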
  36. Informational Message: The current Metropolis proposal is about to be rejected because of the following issue: Error in function stan::prob::multi_normal_log(N4stan5…): Covariance matrix is not positive definite. Covariance matrix(0,0) is 0:0. If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine, but if this warning occurs often then your model may be either severely ill-conditioned or misspecified. Don’t freak out when you see this: if it happens sometimes during warmup (adaptation), that’s normal. If it happens a lot after warmup, then freak out.
  37. HMC estimates • Almost exactly the same (just this model/data) [book excerpt comparing the two fits:]
    HMC: precis(m8.1stan)
           Mean StdDev lower 0.95 upper 0.95
    a      9.24   0.14       8.99       9.53
    bR    -0.21   0.08      -0.39      -0.07
    bA    -1.97   0.22      -2.48      -1.61
    bAR    0.40   0.13       0.10       0.64
    sigma  0.95   0.05       0.86       1.06
    QA: precis(m8.1)
           Mean StdDev  2.5% 97.5%
    a      9.22   0.14  8.95  9.49
    bR    -0.20   0.08 -0.35 -0.05
    bA    -1.95   0.22 -2.39 -1.51
    bAR    0.39   0.13  0.14  0.65
    sigma  0.93   0.05  0.83  1.03
    These values are very similar to the quadratic approximation. Note that the confidence boundaries in the upper table are HPDI, not PCI. (The page also shows the map fit with explicit start values: start=list( a=mean(log(d$rgdppc_2000)), bR=0, bA=0, bAR=0, sigma=sd(log(d$rgdppc_2000)) ).)
  38. Extract samples • Once you have samples, they are just samples
    [book excerpt:] These values are very similar to the quadratic approximation. Note that the confidence boundaries in the table just above are HPDI, not PCI. Visualization: by plotting the samples, you can get a direct appreciation for how Gaussian (quadratic) the actual posterior density has turned out to be. To pull out the samples, use:
    R code:
    post <- extract.samples( m8.1stan )
    str(post)

    List of 5
    $ a : num [1:1000(1d)] 9.38 9.24 9.26 9.46 9.21 ...
    $ br : num [1:1000(1d)] -0.241 -0.184 -0.192 -0.376 -0.167 ...
    $ bA : num [1:1000(1d)] -2.19 -2.09 -1.91 -2.01 -1.95 ...
    $ brA : num [1:1000(1d)] 0.488 0.296 0.399 0.493 0.444 ...
    $ sigma: num [1:1000(1d)] 0.928 1 0.839 0.938 0.948 ...
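Because they are just samples, every summary in the tables above is an ordinary computation over a vector. A Python sketch of the narrowest-interval HPDI (the `hpdi` name is invented here; the rethinking package provides its own `HPDI` function):

```python
import random

def hpdi(samples, prob=0.95):
    """Highest posterior density interval: the narrowest interval that
    contains `prob` of the samples (sensible for unimodal posteriors)."""
    s = sorted(samples)
    n = len(s)
    k = max(1, int(prob * n))                    # samples inside the interval
    widths = [s[i + k - 1] - s[i] for i in range(n - k + 1)]
    i = widths.index(min(widths))
    return s[i], s[i + k - 1]

# demo on simulated "posterior" draws from a standard normal
random.seed(6)
post = [random.gauss(0, 1) for _ in range(20000)]
lo, hi = hpdi(post)
```

For a symmetric posterior the HPDI and the percentile interval nearly coincide; for skewed posteriors they can differ noticeably.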
  39. Check the chain • Sometimes it doesn’t work • First and most important check: trace plot
    [Figure from the book: “Trace plot of the Markov chain from the ruggedness model, m8.1stan. This is a clean, healthy Markov chain, both stationary and well-mixing.”] [book excerpt:] A trace plot merely plots the samples in sequential order, joined by a line — Markov’s path through the islands, in the metaphor at the start of the chapter. Inspecting the trace plot of each parameter can help to diagnose many common problems. And once you come to recognize a healthy, functioning Markov chain, quick checks provide a lot of peace of mind. A trace plot isn’t the last thing analysts do to inspect MCMC output. But it’s nearly always the first. In the terrain ruggedness example, the trace plot shows a very healthy chain; view it with plot(m8.1stan). You can think of the zig-zagging trace as the path the chain took through each dimension of parameter space. The gray region in each plot, the first one-thousand samples, marks the adaptation samples. During adaptation, the Markov chain is learning to more efficiently sample from the posterior distribution. So these samples are not necessarily reliable to use. They are automatically discarded by extract.samples, which returns only the samples in the white regions.
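A trace plot is a visual check, but a crude numeric stand-in makes the idea concrete: a stationary chain's early and late portions should have similar means relative to its spread. This sketch is illustrative only (`crude_stationarity_check` is a name invented here), not a replacement for looking at the plot or for proper diagnostics:

```python
def crude_stationarity_check(x, tol=0.3):
    """Crude stand-in for eyeballing a trace plot: a stationary chain's
    first and second halves should have similar means relative to its spread."""
    half = len(x) // 2
    m1 = sum(x[:half]) / half
    m2 = sum(x[half:]) / (len(x) - half)
    mean = sum(x) / len(x)
    sd = (sum((xi - mean) ** 2 for xi in x) / len(x)) ** 0.5
    return abs(m1 - m2) / sd < tol
```

A chain still drifting toward the typical set fails this check, which is one symptom a trace plot shows at a glance.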
  40. Warmup • What is “warmup”? • Adaptation to posterior for efficient sampling • Samples during warmup NOT from posterior • Automatically discarded by precis/summary and other functions • Warmup is NOT “burn in” — burn in is front part of Markov chain using Gibbs or Metropolis
  41. Convergence diagnostics • n_eff: “effective” number of samples • R-hat: Should approach 1 as chains converge • Both may mislead; treat as alarms, not security blankets
    [book excerpt:]
    precis(m8.1stan)
           Mean StdDev lower 0.95 upper 0.95 n_eff Rhat
    a      9.23   0.14       8.92       9.49   338    1
    bR    -0.21   0.08      -0.37      -0.06   324    1
    bA    -1.95   0.22      -2.39      -1.50   329    1
    bAR    0.40   0.13       0.20       0.70   318    1
    sigma  0.95   0.05       0.84       1.05   288    1
    These estimates are very similar to the quadratic approximation. But note a few new things. First, the confidence boundaries in the table just above are highest posterior density intervals (HPDI), not ordinary percentile intervals (PI). Second, there are two new columns …
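Both diagnostics are short computations, which helps demystify them and treat them as alarms rather than magic. A rough Python sketch, simplified relative to rstan's split-Rhat and smoothed n_eff:

```python
import random

def rhat(chains):
    """Basic Gelman-Rubin potential scale reduction factor."""
    m = len(chains)
    n = len(chains[0])
    means = [sum(c) / n for c in chains]
    grand = sum(means) / m
    b = n / (m - 1) * sum((mu - grand) ** 2 for mu in means)   # between-chain variance
    w = sum(sum((x - mu) ** 2 for x in c) / (n - 1)            # mean within-chain variance
            for c, mu in zip(chains, means)) / m
    var_plus = (n - 1) / n * w + b / n
    return (var_plus / w) ** 0.5

def n_eff(x):
    """Crude effective sample size: n / (1 + 2 * sum of positive
    autocorrelations), truncated at the first negative estimate."""
    n = len(x)
    mean = sum(x) / n
    var = sum((xi - mean) ** 2 for xi in x) / n
    acc = 0.0
    for lag in range(1, n):
        rho = sum((x[i] - mean) * (x[i + lag] - mean)
                  for i in range(n - lag)) / (n * var)
        if rho < 0.0:
            break
        acc += rho
    return n / (1.0 + 2.0 * acc)
```

Chains that agree give Rhat near 1; chains stuck in different places give a large Rhat, and sticky (autocorrelated) chains give n_eff far below the raw sample count.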
  42. How many samples? • Use warmup and iter to control number of samples • First run: Just getting comfortable • warmup=1000, iter=2000 is fine • Eventually: • Enough warmup that mixing is good and sampling is efficient • Enough samples for your purpose (n_eff) • Means? 200 usually enough • 99th percentile? 20,000 or more • Poorly mixing chains always require more samples
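The “200 for means, tens of thousands for extreme tails” rule of thumb reflects how much noisier tail quantiles are than means at the same sample size. A Python sketch that measures this on independent draws (names invented here; real chains are autocorrelated, which only makes the gap worse):

```python
import random
import statistics

def estimator_sd(estimator, n_draws, n_reps=200, seed=8):
    """Monte Carlo error of an estimator: its standard deviation across
    repeated batches of n_draws independent standard-normal samples."""
    rng = random.Random(seed)
    estimates = [estimator([rng.gauss(0, 1) for _ in range(n_draws)])
                 for _ in range(n_reps)]
    return statistics.pstdev(estimates)

def q99(xs):
    """Nearest-rank 99th percentile."""
    s = sorted(xs)
    return s[min(len(s) - 1, int(0.99 * len(s)))]

mean_err = estimator_sd(lambda xs: sum(xs) / len(xs), n_draws=200)
tail_err = estimator_sd(q99, n_draws=200)
# with 200 draws the mean is already precise, but the 99th percentile is not
```

Shrinking the tail-quantile error to the mean's level requires roughly the squared ratio more samples, which is why extreme tails demand so many draws.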
  43. How many chains? • Use chains to control number of chains • First run: One chain, see how it goes • Tune: Multiple chains, to check convergence/stationarity • Final run: One chain is fine, more if you have processors to spare • My heuristic: 1 short / 4 medium / 2 long