Upgrade to Pro — share decks privately, control downloads, hide ads and more …

L10 Statistical Rethinking Winter 2019

L10 Statistical Rethinking Winter 2019

Lecture 10 of the Dec 2018 through March 2019 edition of Statistical Rethinking. Covers Chapter 9, Markov chain Monte Carlo.

Richard McElreath

January 25, 2019
Tweet

More Decks by Richard McElreath

Other Decks in Education

Transcript

  1. probability of water 0 0.5 1 n = 1 W

    L W W W L W L W confidence probability of water 0 0.5 1 n = 2 W L W W W L W L W probability of water 0 0.5 1 n = 3 W L W W W L W L W probability of water 0 0.5 1 n = 4 W L W W W L W L W confidence probability of water 0 0.5 1 n = 5 W L W W W L W L W probability of water 0 0.5 1 n = 6 W L W W W L W L W probability of water 0 0.5 1 n = 7 W L W W W L W L W confidence probability of water 0 0.5 1 n = 8 W L W W W L W L W probability of water 0 0.5 1 n = 9 W L W W W L W L W proportion water 0 0.5 1 plausibility n = 0 proportion water 0 0.5 1 plausibility n = 0 W L W W W L W L W proportion water 0 0.5 1 plausibility n = 0 W L W W W L W L W proportion water 0 0.5 1 plausibility n = 0 W L W W W L W L W proportion water 0 0.5 1 plausibility n = 0 W L W W W L W L W proportion water 0 0.5 1 plausibility n = 0 W L W W W L W L W
  2. Computing the posterior 1. Analytical approach (often impossible) 2. Grid

    approximation (very intensive) 3. Quadratic approximation (limited) 4. Markov chain Monte Carlo (intensive)
  3. Contract: King Markov must visit each island in proportion to

    its population size. Here’s how he does it...
  4. (1) Flip a coin to choose island on left or

    right. Call it the “proposal” island. 1 2 1 2
  5. (2) Find population of proposal island. (1) Flip a coin

    to choose island on left or right. Call it the “proposal” island. proposal
  6. (2) Find population of proposal island. 1 2 3 4

    5 6 7 (1) Flip a coin to choose island on left or right. Call it the “proposal” island. proposal
  7. (2) Find population of proposal island. 1 2 3 4

    5 6 7 p5 (1) Flip a coin to choose island on left or right. Call it the “proposal” island. proposal
  8. (2) Find population of proposal island. 1 2 3 4

    5 6 7 p5 (1) Flip a coin to choose island on left or right. Call it the “proposal” island. (3) Find population of current island. p4 proposal
  9. (2) Find population of proposal island. 1 2 3 4

    5 6 7 p5 (1) Flip a coin to choose island on left or right. Call it the “proposal” island. (3) Find population of current island. p4 (4) Move to proposal, with probability = p5 p4 proposal
  10. (2) Find population of proposal island. 1 2 3 4

    5 6 7 (1) Flip a coin to choose island on left or right. Call it the “proposal” island. (3) Find population of current island. (4) Move to proposal, with probability = p5 p4 (5) Repeat from (1)
  11. (2) Find population of proposal island. 1 2 3 4

    5 6 7 (1) Flip a coin to choose island on left or right. Call it the “proposal” island. (3) Find population of current island. (4) Move to proposal, with probability = p5 p4 (5) Repeat from (1) This procedure ensures visiting each island in proportion to its population, in the long run.
  12. Markov chain Monte Carlo • Markov chain Monte Carlo (MCMC)

    • Understand the approach • Meet different algorithms • Interfaces to MCMC: Stan & ulam • How to sample responsibly • How to recognize and fix problems
  13. Metropolis algorithm HVBSBOUFFT UIBU UIF LJOH XJMM CF GPVOE PO

    FBDI JTMBOE JO QSPQPSUJPO UP JUT QPQVMBUJPO :PV DBO QSPWF UIJT UP ZPVSTFMG CZ TJNVMBUJOH ,JOH .BSLPWT KPVSOFZ )FSFT B TIPS PG DPEF UP EP UIJT TUPSJOH UIF IJTUPSZ PG UIF LJOHT JTMBOE QPTJUJPOT JO UIF WFDUPS +*.$/ )0(Ǿ2 &. ʚǶ ǎ ǒ +*.$/$*). ʚǶ - +ǿǍǢ)0(Ǿ2 &.Ȁ 0-- )/ ʚǶ ǎǍ !*- ǿ $ $) ǎǣ)0(Ǿ2 &. Ȁ ȃ ȕ - *- 0-- )/ +*.$/$*) +*.$/$*).ȁ$Ȃ ʚǶ 0-- )/ ȕ !'$+ *$) /* " ) -/ +-*+*.' +-*+*.' ʚǶ 0-- )/ ʔ .(+' ǿ ǿǶǎǢǎȀ Ǣ .$5 ʙǎ Ȁ ȕ )*2 (& .0- # '**+. -*0) /# -#$+ '"* $! ǿ +-*+*.' ʚ ǎ Ȁ +-*+*.' ʚǶ ǎǍ $! ǿ +-*+*.' ʛ ǎǍ Ȁ +-*+*.' ʚǶ ǎ ȕ (*1 Ǩ +-*Ǿ(*1 ʚǶ +-*+*.'ȅ0-- )/ 0-- )/ ʚǶ $! '. ǿ -0)$!ǿǎȀ ʚ +-*Ǿ(*1 Ǣ +-*+*.' Ǣ 0-- )/ Ȁ Ȅ
  14. 2 4 6 8 10 0 5 10 15 20

    25 island number of weeks after 100 weeks 2 4 6 8 10 0 20 40 60 80 100 island number of weeks after 500 weeks 2 4 6 8 10 0 100 200 300 400 island number of weeks after 2000 weeks 0 2000 4000 6000 8000 10000 2 4 6 8 10 week island
  15. Markov’s chain of visits • Converges to correct proportions, in

    the long run • No matter which island starts • As long as proposals are symmetric • Example of Metropolis algorithm 2 4 6 8 10 0 500 1000 1500 island number of weeks after 10000 weeks
  16. Metropolis and MCMC • Usual use is to draw samples

    from a posterior distribution • “Islands”: parameter values • “Population size”: proportional to posterior probability • Works for any number of dimensions (parameters) • Works for continuous as well as discrete parameters
  17. Metropolis and MCMC • Metropolis: Simple version of Markov chain

    Monte Carlo (MCMC) • Metropolis, Rosenbluth, Rosenbluth, Teller, and Teller (1953) into radicals (by collision, or possibly radiation, as in aromatic hydrocarbons). (b) The molecules dissociate, but the resulting radi- cals recombine without escaping from the liquid cage. (c) The molecules dissociate and escape from the cage. In this case we would not expect them to move more than a few molecular diameters through the dense medium before being thermalized. In accordance with the notation introduced by Burton, Magee, and Samuel,22 the molecules following 22 Burton, Magee, and Samuel, J. Chern. Phys. 20, 760 (1952). THE JOURNAL OF CHEMICAL PHYSICS tions that the ionized H2 0 molecules will become the H2 0t molecules, but this is not likely to be a complete correspondence. In conclusion we would like to emphasize that the qualitative result of this section is not critically de- pendent on the exact values of the physical parameters used. However, this treatment is classical, and a correct treatment must be wave mechanical; therefore the result of this section cannot be taken as an a priori theoretical prediction. The success of the radical diffu- sion model given above lends some plausibility to the occurrence of electron capture as described by this crude calculation. Further work is clearly needed. VOLUME 21, NUMBER 6 JUNE, 1953 Equation of State Calculations by Fast Computing Machines NICHOLAS METROPOLIS, ARIANNA W. ROSENBLUTH, MARSHALL N. ROSENBLUTH, AND AUGUSTA H. TELLER, Los Alamos Scientific Laboratory, Los Alamos, New Mexico AND EDWARD TELLER, * Department of Physics, University of Chicago, Chicago, Illinois (Received March 6, 1953) A general method, suitable for fast computing machines, for investigatiflg such properties as equations of state for substances consisting of interacting individual molecules is described. The method consists of a modified Monte Carlo integration over configuration space. Results for the two-dimensional rigid-sphere system have been obtained on the Los Alamos MANIAC and are presented here. These results are compared to the free volume equation of state and to a four-term virial coefficient expansion.
  18. MANIAC: Mathematical Analyzer, Numerical Integrator, and Computer MANIAC: 1000 pounds

    5 kilobytes of memory 70k multiplications/sec Your laptop: 4–7 pounds 2–8 million kilobytes Billions of multiplications/sec
  19. Metropolis and MCMC • Metropolis: Simple version of Markov chain

    Monte Carlo (MCMC) • Chain: Sequence of draws from distribution • Markov chain: History doesn’t matter, just where you are now • Monte Carlo: Random simulation Andrei Andreyevich Markov (Ма́рков) (1856–1922)
  20. Why MCMC? • Sometimes can’t write an integrated posterior •

    Even when can, often cannot use it • Many problems are like this: Multilevel models, networks, phylogenies, spatial models • Optimization not a good strategy in high dimensions — must have full distribution • MCMC is not fancy. It is old and essential.
  21. MCMC strategies • Metropolis: Granddaddy of them all • Metropolis-Hastings

    (MH): More general • Gibbs sampling (GS): Efficient version of MH • Metropolis and Gibbs are “guess and check” strategies • Hamiltonian Monte Carlo (HMC) fundamentally different • New methods being developed, but future belongs to the gradient
  22. Metropolis gets stuck   ."3,07 $)"*/ .0/5& $"3-0 -1.5

    -1.0 -0.5 0.0 0.5 1.0 1.5 -1.5 -1.0 -0.5 0.0 0.5 1.0 1.5 a1 a2 step size 0.1, accept rate 0.62 -1.5 -1.0 -0.5 0.0 0.5 1.0 1.5 -1.5 -1.0 -0.5 0.0 0.5 1.0 1.5 a1 a2 step size 0.25, accept rate 0.34 'ĶĴłĿIJ ƑƋ .FUSPQPMJT DIBJOT VOEFS IJHI DPSSFMBUJPO 'JMMFE QPJOUT JOEJ DBUF BDDFQUFE QSPQPTBMT 0QFO QPJOUT BSF SFKFDUFE QSPQPTBMT #PUI QMPUT TIPX  QSPQPTBMT VOEFS EJČFSFOU QSPQPTBM EJTUSJCVUJPO TUFQ TJ[FT -Fę Figure 9.3 small steps, slow walk bigs steps, low accept rate
  23. Hamiltonian Monte Carlo • Problem with Gibbs sampling (GS) •

    High dimension spaces are concentrated • GS gets stuck, degenerates towards random walk • Inefficient because re-explores • Hamiltonian dynamics to the rescue • represent parameter state as particle • flick it around frictionless log-posterior • record positions • no more “guess and check” • all proposals are good proposals William Rowan Hamilton (1805–1865) Commemorated on Irish Euro coin
  24. Hamiltonian parable • King Monty’s kingdom is a narrow valley

    N–S • Population distribution inversely proportional to altitude • Algorithm: • Start driving randomly N or S at random speed • Car speeds up as it goes downhill • Car slows as it goes uphill, might turn around • Drive for pre-specified duration, then stop • Repeat • Stopping positions will be proportional to population North South
  25. Hamiltonian parable • King Monty’s kingdom is a narrow valley

    N–S • Population distribution inversely proportional to altitude • Algorithm: • Start driving randomly N or S at random speed • Car speeds up as it goes downhill • Car slows as it goes uphill, might turn around • Drive for pre-specified duration, then stop • Repeat • Stopping positions will be proportional to population
  26. 0 100 200 300 time position 0 south north 'ĶĴłĿIJ

    Ƒƍ ,JOH .POUZT 3PZBM %SJWF ćF KPVSOFZ CFHJOT BU UJNF  PO UIF GBS MFę ćF WFIJDMF JT HJWFO B SBOEPN NPNFOUVN BOE B SBOEPN EJSFD UJPO FJUIFS OPSUI UPQ PS TPVUI CPUUPN  ćF UIJDLOFTT PG UIF QBUI TIPXT NPNFOUVN BU FBDI NPNFOU ćF WFIJDMF USBWFMT MPTJOH NPNFOUVN VQIJMM PS HBJOJOH JU EPXOIJMM "ęFS B ĕYFE BNPVOU PG UJNF UIFZ TUPQ BOE NBLF Figure 9.5
  27. Hamiltonian Monte Carlo • Location is parameter values • Really

    simulate motion on frictionless surface • Surface is minus-log-posterior • Series of simulations, each starting from previous • Stopping points comprise valid MCMC samples -3 -2 -1 0 1 2 3 1 2 3 4 5 x -log(dnorm(x)) -3 -2 -1 0 1 2 3 0.0 0.1 0.2 0.3 0.4 x dnorm(x)
  28. Figure 9.6 -0.3 -0.2 -0.1 0.0 0.1 0.2 0.3 -0.3

    mux -0.3 -0.2 -0.1 0.0 0.1 0.2 0.3 -0.3 mux -1.5 -1.0 -0.5 0.0 0.5 1.0 1.5 -1.5 -1.0 -0.5 0.0 0.5 1.0 1.5 a1 a2 1 2 3 4 Posterior correlation -0.9 -1.5 -1.0 -0.5 0.0 0.5 1.0 1.5 -1.5 -1.0 -0.5 0.0 0.5 1.0 1.5 a1 a2 50 trajectories 'ĶĴłĿIJ ƑƎ ).$ USBKFDUPSJFT 5PQMFę 8JUI UIF SJHIU DPNCJOBUJPO PG MFBQGSPH TUFQT BOE TUFQ TJ[F UIF JOEJWJEVBM QBUIT KVNQ BSPVOE BOE QSPEVDF IJHIMZ JOEFQFOEFOU TBNQMFT GSPN UIF QPTUFSJPS 5PQSJHIU 8JUI UIF XSPOH
  29. Hamiltonian Monte Carlo • Why does HMC work much better?

    • Doesn’t get stuck — follows gradient • Extra variables (momentum, energy) provide diagnostics • But also requires more • Gradients — curvature of log-posterior • “Mass” of particle • Number of leaps in a single trajectory • Size of individual leaps • These need to be tuned right • Gradients are unique to each model -0.3 -0.2 -0.1 0.0 0.1 0.2 0.3 -0.3 -0.2 -0. mux 1 3 4 -1.5 -1.0 -0.5 0.0 0.5 1.0 1.5 -1.5 -1.0 -0.5 0.0 0.5 1.0 1.5 a1 a2 1 2 3 4 Posterior correlation -0.9 'ĶĴłĿIJ ƑƎ ).$ USBKFDUPSJFT 5P MFBQGSPH TUFQT BOE TUFQ TJ[F UIF JOEJ IJHIMZ JOEFQFOEFOU TBNQMFT GSPN UI DPNCJOBUJPO TFRVFOUJBM TBNQMFT DBO DIBJO JO UIF UPQSJHIU XJMM TUJMM XPSL MFę ).$ SFBMMZ TIJOFT XIFO UIF Q
  30.   ."3,07 $)"*/ .0/5& $"3-0 3 DPEF  ȕ

     ) . /* - /0-) ) "Ƕ'*"Ƕ+-*$'$/4 (4Ǒ ʚǶ !0)/$*)ǿ , Ǣ ʙǍ Ǣ ʙǎ Ǣ &ʙǍ Ǣ ʙǎ Ȁ ȃ (04 ʚǶ ,ȁǎȂ (03 ʚǶ ,ȁǏȂ  ʚǶ .0(ǿ )*-(ǿ4Ǣ(04ǢǎǢ'*"ʙȀ Ȁ ʔ .0(ǿ )*-(ǿ3Ǣ(03ǢǎǢ'*"ʙȀ Ȁ ʔ )*-(ǿ(04ǢǢǢ'*"ʙȀ ʔ )*-(ǿ(03Ǣ&ǢǢ'*"ʙȀ - /0-)ǿ Ƕ Ȁ Ȅ /PX UIF HSBEJFOU GVODUJPO SFRVJSFT UXP QBSUJBM EFSJWBUJWFT -VDLJMZ (BVTTJBO EFSJWBUJWFT BSF WFSZ DMFBO ćF EFSJWBUJWF PG UIF MPHBSJUIN PG BOZ VOJWBSJBUF (BVTTJBO XJUI NFBO B BOE TUBOEBSE EFWJBUJPO C XJUI SFTQFDU UP B JT ∂ MPH /(Z|B, C) ∂B = Z − B C "OE TJODF UIF EFSJWBUJWF PG B TVN JT B TVN PG EFSJWBUJWFT UIJT JT BMM XF OFFE UP XSJUF UIF HSBEJFOUT ∂6 ∂µY = ∂ MPH /(Y|µY, ) ∂µY + ∂ MPH /(µY|, .) ∂µY = J YJ − µY  +  − µY . "OE UIF HSBEJFOU GPS µZ IBT UIF TBNF GPSN /PX JO DPEF GPSN 3 DPEF  ȕ "-$ )/ !0)/$*) ȕ )  1 /*- *! +-/$'  -$1/$1 . *!  2$/# - .+ / /* 1 /*- , (4Ǿ"-Ǒ ʚǶ !0)/$*)ǿ , Ǣ ʙǍ Ǣ ʙǎ Ǣ &ʙǍ Ǣ ʙǎ Ȁ ȃ (04 ʚǶ ,ȁǎȂ (03 ʚǶ ,ȁǏȂ ǎ ʚǶ .0(ǿ 4 Ƕ (04 Ȁ ʔ ǿ Ƕ (04ȀȅʟǏ ȕȅ(04 Ǐ ʚǶ .0(ǿ 3 Ƕ (03 Ȁ ʔ ǿ& Ƕ (03ȀȅʟǏ ȕȅ(04 - /0-)ǿ ǿ Ƕǎ Ǣ ǶǏ Ȁ Ȁ ȕ ) "/$1  ) -"4 $. ) "Ƕ'*"Ƕ+-* Ȅ ȕ / ./ / . /ǡ. ǿǔȀ 4 ʚǶ -)*-(ǿǒǍȀ 3 ʚǶ -)*-(ǿǒǍȀ 3 ʚǶ .ǡ)0( -$ǿ.' ǿ3ȀȀ 4 ʚǶ .ǡ)0( -$ǿ.' ǿ4ȀȀ ćF HSBEJFOU GVODUJPO BCPWF JTOU UPP CBE GPS UIJT NPEFM #VU JU DBO CF UFSSJGZJOH GPS B SFBTPOBCMZ DPNQMFY NPEFM ćBU JT XIZ UPPMT MJLF 4UBO CVJME UIF HSBEJFOUT EZOBNJDBMMZ VTJOH UIF NPEFM EFĕOJ UJPO /PX XF BSF SFBEZ UP WJTJU UIF IFBSU 5P VOEFSTUBOE TPNF PG UIF EFUBJMT IFSF ZPV TIPVME SFBE 3BEGPSE /FBMT DIBQUFS JO UIF )BOECPPL PG .BSLPW $IBJO .POUF $BSMP "SNFE XJUI UIF MPHQPTUFSJPS BOE HSBEJFOU GVODUJPOT IFSFT UIF DPEF UP QSPEVDF 'ĶĴłĿIJ ƑƎ 3 DPEF  '$--4ǿ.#+ Ȁ ȕ !*- !)4 --*2.  ʚǶ '$./ǿȀ ɶ, ʚǶ ǿǶǍǡǎǢǍǡǏȀ +- ʚǶ Ǎǡǐ +'*/ǿ  Ǣ 4'ʙǫ(04ǫ Ǣ 3'ʙǫ(03ǫ Ǣ 3'$(ʙǿǶ+-Ǣ+-Ȁ Ǣ 4'$(ʙǿǶ+-Ǣ+-Ȁ Ȁ ./ + ʚǶ ǍǡǍǐ ʚǶ ǎǎ ȕ ǍǡǍǐȅǏǕ !*- Ƕ/0-). ǶǶǶ ǎǎ !*- 2*-&$)" 3(+' )Ǿ.(+' . ʚǶ Ǒ +/#Ǿ*' ʚǶ *'ǡ'+#ǿǫ'&ǫǢǍǡǒȀ +*$)/.ǿ ɶ,ȁǎȂ Ǣ ɶ,ȁǏȂ Ǣ +#ʙǑ Ǣ *'ʙǫ'&ǫ Ȁ !*- ǿ $ $) ǎǣ)Ǿ.(+' . Ȁ ȃ  ʚǶ Ǐǿ (4Ǒ Ǣ (4Ǿ"-Ǒ Ǣ ./ + Ǣ Ǣ ɶ, Ȁ  )".*-50/*"/ .0/5& $"3-0  $! ǿ )Ǿ.(+' . ʚ ǎǍ Ȁ ȃ !*- ǿ % $) ǎǣ Ȁ ȃ Ǎ ʚǶ .0(ǿɶ+/-%ȁ%ǢȂʟǏȀȅǏ ȕ &$) /$ ) -"4 '$) .ǿ ɶ/-%ȁ%ǣǿ%ʔǎȀǢǎȂ Ǣ ɶ/-%ȁ%ǣǿ%ʔǎȀǢǏȂ Ǣ *'ʙ+/#Ǿ*' Ǣ '2ʙǎʔǏȉ Ǎ Ȁ Ȅ +*$)/.ǿ ɶ/-%ȁǎǣ ʔǎǢȂ Ǣ +#ʙǎǓ Ǣ *'ʙǫ2#$/ ǫ Ǣ  3ʙǍǡǐǒ Ȁ --*2.ǿ ɶ/-%ȁ ǢǎȂ Ǣ ɶ/-%ȁ ǢǏȂ Ǣ ɶ/-%ȁ ʔǎǢǎȂ Ǣ ɶ/-%ȁ ʔǎǢǏȂ Ǣ --ǡ' )"/#ʙǍǡǐǒ Ǣ --ǡ% ʙ Ǎǡǔ Ȁ / 3/ǿ ɶ/-%ȁ ʔǎǢǎȂ Ǣ ɶ/-%ȁ ʔǎǢǏȂ Ǣ $ Ǣ  3ʙǍǡǕ Ǣ +*.ʙǑ Ǣ *!!. /ʙǍǡǑ Ȁ Ȅ +*$)/.ǿ ɶ/-%ȁ ʔǎǢǎȂ Ǣ ɶ/-%ȁ ʔǎǢǏȂ Ǣ +#ʙ$! '. ǿ ɶ +/ʙʙǎ Ǣ ǎǓ Ǣ ǎ Ȁ Ǣ *'ʙ$! '. ǿ .ǿɶ ȀʛǍǡǎ Ǣ ǫ- ǫ Ǣ ǫ'&ǫ Ȁ Ȁ Ȅ ćF GVODUJPO Ǐ JT CVJMU JOUP - /#$)&$)" *U JT CBTFE VQPO POF PG 3BEGPSE /FBMT FYBNQMF TDSJQUT *U JTOU BDUVBMMZ UPP DPNQMJDBUFE -FUT UPVS UISPVHI JU POF TUFQ BU B UJNF UP UBLF UIF NBHJD BXBZ ćJT GVODUJPO SVOT B TJOHMF USBKFDUPSZ BOE TP QSPEVDFT B TJOHMF TBNQMF :PV OFFE UP VTF JU SFQFBUFEMZ UP CVJME B DIBJO ćBUT XIBU UIF MPPQ BCPWF EPFT ćF ĕSTU DIVOL PG UIF GVODUJPO DIPPTFT SBOEPN NPNFOUVN‰UIF ĘJDL PG UIF QBSUJDMF‰BOE JOJUJBMJ[FT UIF USBKFDUPSZ 3 DPEF  Ǐ ʚǶ !0)/$*) ǿǢ "-ǾǢ +.$'*)Ǣ Ǣ 0-- )/Ǿ,Ȁ ȃ , ʙ 0-- )/Ǿ, + ʙ -)*-(ǿ' )"/#ǿ,ȀǢǍǢǎȀ ȕ -)*( !'$& Ƕ + $. (*( )/0(ǡ 0-- )/Ǿ+ ʙ + ȕ &  #'! ./ + !*- (*( )/0( / /#  "$))$)" + ʙ + Ƕ +.$'*) ȉ "-Ǿǿ,Ȁ ȅ Ǐ ȕ $)$/$'$5 **&& +$)" Ƕ .1 . /-% /*-4 ,/-% ʚǶ (/-$3ǿǢ)-*2ʙ ʔǎǢ)*'ʙ' )"/#ǿ,ȀȀ +/-% ʚǶ ,/-% ,/-%ȁǎǢȂ ʚǶ 0-- )/Ǿ, +/-%ȁǎǢȂ ʚǶ + ćFO UIF BDUJPO DPNFT JO B MPPQ PWFS MFBQGSPH TUFQT TUFQT BSF UBLFO VTJOH UIF HSBEJFOU UP DPNQVUF B MJOFBS BQQSPYJNBUJPO PG UIF MPHQPTUFSJPS TVSGBDF BU FBDI QPJOU 3 DPEF  ȕ '/ -)/ !0'' ./ +. !*- +*.$/$*) ) (*( )/0( !*- ǿ $ $) ǎǣ Ȁ ȃ , ʙ , ʔ +.$'*) ȉ + ȕ 0'' ./ + !*- /# +*.$/$*) ȕ &  !0'' ./ + !*- /# (*( )/0(Ǣ 3 +/ / ) *! /-% /*-4 $! ǿ $Ǧʙ Ȁ ȃ + ʙ + Ƕ +.$'*) ȉ "-Ǿǿ,Ȁ +/-%ȁ$ʔǎǢȂ ʚǶ + Ȅ ,/-%ȁ$ʔǎǢȂ ʚǶ , Ȅ /PUJDF IPX UIF TUFQ TJ[F +.$'*) JT BEEFE UP UIF QPTJUJPO BOE NPNFOUVN WFDUPST *U JT JO UIJT XBZ UIBU UIF QBUI JT POMZ BO BQQSPYJNBUJPO CFDBVTF JU JT B TFSJFT PG MJOFBS KVNQT OPU BO BDUVBM TNPPUI DVSWF ćJT DBO IBWF JNQPSUBOU DPOTFRVFODFT JG UIF MPHQPTUFSJPS CFOET TIBSQMZ BOE UIF TJNVMBUJPO KVNQT PWFS B CFOE "MM UIBU SFNBJOT JT DMFBO VQ FOTVSF UIF QSPQPTBM JT TZNNFUSJD TP UIF .BSLPW DIBJO JT WBMJE BOE EFDJEF XIFUIFS UP BDDFQU PS SFKFDU UIF QSPQPTBM 3 DPEF  ȕ &  #'! ./ + !*- (*( )/0( / /# ) + ʙ + Ƕ +.$'*) ȉ "-Ǿǿ,Ȁ ȅ Ǐ +/-%ȁ ʔǎǢȂ ʚǶ +
  31. Figure 9.6  )".*-50/*"/ .0/5& $"3-0  -0.3 -0.2 -0.1

    0.0 0.1 0.2 0.3 -0.3 -0.2 -0.1 0.0 0.1 0.2 0.3 mux muy 1 2 3 4 2D Gaussian, L = 11 -0.3 -0.2 -0.1 0.0 0.1 0.2 0.3 -0.3 -0.2 -0.1 0.0 0.1 0.2 0.3 mux muy 2D Gaussian, L = 28 1.5 Posterior correlation -0.9 1.5 50 trajectories The U-Turn Problem
  32. Stan is NUTS • No U-Turn Sampler (NUTS2): Adaptive Hamiltonian

    Monte Carlo • Implemented in Stan (rstan: mc-stan.org) • Stan figures out gradient for you • autodiff, back-propagation Formula Stan model C++ model Reusable Sampler
  33. HMC Praxis • Back to terrain ruggedness • Re-approximate posterior

    • Practical details • What to expect from healthy Markov chains
  34. One hand QUAP’ing ɶ-0"" Ǿ./ ʚǶ ɶ-0""  ȅ (3ǿɶ-0""

    Ȁ ɶ$ ʚǶ $! '. ǿ ɶ*)/Ǿ!-$ʙʙǎ Ǣ ǎ Ǣ Ǐ Ȁ 4P ZPV SFNFNCFS UIF PME XBZ XFSF HPJOH UP SFQFBU UIF QSPDFEVSF GPS ĕUUJOH UIF JOUFSBDUJPO NPEFM ćJT NPEFM BJNT UP QSFEJDU MPH (%1 XJUI UFSSBJO SVHHFEOFTT DPOUJOFOU BOE UIF JOUFSBDUJPO PG UIF UXP )FSFT UIF XBZ UP EP JU XJUI ,0+ KVTU MJLF CFGPSF 3 DPEF  (Ǖǡǒ ʚǶ ,0+ǿ '$./ǿ '*"Ǿ"+Ǿ./ ʡ )*-(ǿ (0 Ǣ .$"( Ȁ Ǣ (0 ʚǶ ȁ$Ȃ ʔ ȁ$Ȃȉǿ -0"" Ǿ./ Ƕ ǍǡǏǎǒ Ȁ Ǣ ȁ$Ȃ ʡ )*-(ǿ ǎ Ǣ Ǎǡǎ Ȁ Ǣ ȁ$Ȃ ʡ )*-(ǿ Ǎ Ǣ Ǎǡǐ Ȁ Ǣ .$"( ʡ  3+ǿ ǎ Ȁ Ȁ Ǣ /ʙ Ȁ +- $.ǿ (Ǖǡǒ Ǣ  +/#ʙǏ Ȁ ( ) . ǒǡǒʉ ǖǑǡǒʉ ȁǎȂ ǍǡǕǖ ǍǡǍǏ ǍǡǕǓ Ǎǡǖǎ ȁǏȂ ǎǡǍǒ ǍǡǍǎ ǎǡǍǐ ǎǡǍǔ ȁǎȂ Ǎǡǎǐ ǍǡǍǔ ǍǡǍǎ ǍǡǏǒ ȁǏȂ ǶǍǡǎǑ ǍǡǍǒ ǶǍǡǏǐ ǶǍǡǍǓ .$"( Ǎǡǎǎ ǍǡǍǎ ǍǡǎǍ ǍǡǎǏ +VTU BT ZPV TBX JO UIF QSFWJPVT DIBQUFS  1SFQBSBUJPO #VU OPX XFMM BMTP ĕU UIJT NPEFM VTJOH )BNJMUPOJBO .POUF $BSMP ćJT NFBOT UIFSF XJMM CF OP NPSF RVBESBUJD BQQSPYJNBUJPO‰JG UIF QPTUFSJPS EJTUSJCVUJPO JT OPO
  35.   ."3,07 $)"*/ .0/5& $"3-0 EPOU IBWF UP EP

    UIJT #VU EPJOH TP BWPJET DPNNPO QSPCMFNT 'PS FYBNQMF JG BOZ PG UIF VOVTFE WBSJBCMFT IBWF NJTTJOH WBMVFT  UIFO 4UBO XJMM SFGVTF UP XPSL 8FWF BMSFBEZ QSFUSBOTGPSNFE BMM UIF WBSJBCMFT /PX XF OFFE B TMJN MJTU PG UIF WBSJBCMFT XF XJMM VTF DPEF  /Ǿ.'$( ʚǶ '$./ǿ '*"Ǿ"+Ǿ./ ʙ ɶ'*"Ǿ"+Ǿ./Ǣ -0"" Ǿ./ ʙ ɶ-0"" Ǿ./Ǣ $ ʙ .ǡ$)/ " -ǿ ɶ$ Ȁ Ȁ ./-ǿ/Ǿ.'$(Ȁ $./ *! ǐ ɶ '*"Ǿ"+Ǿ./ǣ )0( ȁǎǣǎǔǍȂ ǍǡǕǕ ǍǡǖǓǒ ǎǡǎǓǓ ǎǡǎǍǑ Ǎǡǖǎǒ ǡǡǡ ɶ -0"" Ǿ./ ǣ )0( ȁǎǣǎǔǍȂ ǍǡǎǐǕ Ǎǡǒǒǐ ǍǡǎǏǑ ǍǡǎǏǒ ǍǡǑǐǐ ǡǡǡ ɶ $ ǣ $)/ ȁǎǣǎǔǍȂ ǎ Ǐ Ǐ Ǐ Ǐ Ǐ Ǐ Ǐ Ǐ ǎ ǡǡǡ *U JT CFUUFS UP VTF B '$./ UIBO B /ǡ!-( CFDBVTF UIF FMFNFOUT JO B '$./ DBO CF BOZ MFOHUI *O B /ǡ!-( BMM UIF FMFNFOUT NVTU CF UIF TBNF MFOHUI 8JUI TPNF NPEFMT UP DPNF MBUFS MJLF NVMUJMFWFM NPEFMT JU JTOU VOVTVBM UP IBWF WBSJBCMFT PG EJČFSFOU MFOHUIT  4BNQMJOH GSPN UIF QPTUFSJPS /PX QSPWJEFE ZPV IBWF UIF -./) QBDLBHF JOTUBMMFE NDTUBOPSH ZPV DBO HFU TBNQMFT GSPN UIF QPTUFSJPS EJTUSJCVUJPO XJUI UIJT DPEF DPEF  (ǖǡǎ ʚǶ 0'(ǿ Hamiltonian Flows • Interface to Stan: ulam QMFT ZPV NBOBHFE UP HFU #/ JT B DPNQMJDBUFE FTUJNBUF PG UIF DPOWFSHFODF PG UIF .BSLPW DIBJOT UP UIF UBSHFU EJTUSJCVUJPO *U TIPVME BQQSPBDI  GSPN BCPWF XIFO BMM JT XFMM  4BNQMJOH BHBJO JO QBSBMMFM ćF FYBNQMF TP GBS JT B WFSZ FBTZ QSPCMFN GPS .$.$ 4P FWFO UIF EFGBVMU  TBNQMFT JT FOPVHI GPS BDDVSBUF JOGFSFODF *O GBDU BT GFX BT  FČFDUJWF TBNQMFT JT VTVBMMZ QMFOUZ GPS B HPPE BQQSPYJNBUJPO PG UIF QPTUFSJPS #VU XF BMTP XBOU UP SVO NVMUJQMF DIBJOT GPS SFBTPOT XFMM EJTDVTT JO NPSF EFQUI JO UIF OFYU TFDUJPOT ćFSF XJMM CF TQFDJĕD BEWJDF JO 4FDUJPO  QBHF   'PS OPX JUT XPSUI OPUJOH UIBU ZPV DBO FBTJMZ QBSBMMFMJ[F UIPTF DIBJOT BT XFMM ćFZ DBO BMM SVO BU UIF TBNF UJNF JOTUFBE PG JO TFRVFODF 4P BT MPOH BT ZPVS DPNQVUFS IBT GPVS DPSFT JU QSPCBCMZ EPFT JU XPOU UBLF MPOHFS UP SVO GPVS DIBJOT UIBO POF DIBJO 5P SVO GPVS JOEFQFOEFOU .BSLPW DIBJOT GPS UIF NPEFM BCPWF BOE UP EJTUSJCVUF UIFN BDSPTT TFQBSBUF DPSFT JO ZPVS DPNQVUFS KVTU JODSFBTF UIF OVNCFS PG #$). BOE BEE B *- . BSHVNFOU 3 DPEF  (ǖǡǎ ʚǶ 0'(ǿ '$./ǿ '*"Ǿ"+Ǿ./ ʡ )*-(ǿ (0 Ǣ .$"( Ȁ Ǣ (0 ʚǶ ȁ$Ȃ ʔ ȁ$Ȃȉǿ -0"" Ǿ./ Ƕ ǍǡǏǎǒ Ȁ Ǣ ȁ$Ȃ ʡ )*-(ǿ ǎ Ǣ Ǎǡǎ Ȁ Ǣ ȁ$Ȃ ʡ )*-(ǿ Ǎ Ǣ Ǎǡǐ Ȁ Ǣ .$"( ʡ  3+ǿ ǎ Ȁ Ȁ Ǣ /ʙ/Ǿ.'$( Ǣ #$).ʙǑ Ǣ *- .ʙǑ Ǣ $/ -ʙǎǍǍǍ Ȁ ćFSF BSF B CVODI PG PQUJPOBM BSHVNFOUT UIBU BMMPX VT UP UVOF BOE DVTUPNJ[F UIF QSPDFTT 8FMM CSJOH UIFN VQ BT UIFZ BSF OFFEFE 'PS OPX LFFQ JO NJOE UIBU .#*2 XJMM SFNJOE ZPV PG UIF NPEFM GPSNVMB BOE BMTP IPX MPOH FBDI DIBJO UPPL UP SVO
  36. Hamiltonian Flows • What happens when you use ulam? •

    Translates formula into raw Stan model code • Stan then builds a custom NUTS sampler • Sampler runs • Samples fed back to R ćFTF FTUJNBUFT BSF WFSZ TJNJMBS UP UIF RVBESBUJD BQQSPYJNBUJPO #VU OPUF UIBU UIFSF BSF UXP OFX DPMVNOT )Ǿ !! BOE #/ ćFTF DPMVNOT QSPWJEF .$.$ EJBHOPTUJD DSJUFSJB UP IFMQ ZPV UFMM IPX XFMM UIF TBNQMJOH XPSLFE 8FMM EJTDVTT UIFN JO EFUBJM MBUFS JO UIF DIBQUFS 'PS OPX JUT FOPVHI UP LOPX UIBU )Ǿ !! JT B DSVEF FTUJNBUF PG UIF OVNCFS PG JOEFQFOEFOU TBN QMFT ZPV NBOBHFE UP HFU #/ JT B DPNQMJDBUFE FTUJNBUF PG UIF DPOWFSHFODF PG UIF .BSLPW DIBJOT UP UIF UBSHFU EJTUSJCVUJPO *U TIPVME BQQSPBDI  GSPN BCPWF XIFO BMM JT XFMM  4BNQMJOH BHBJO JO QBSBMMFM ćF FYBNQMF TP GBS JT B WFSZ FBTZ QSPCMFN GPS .$.$ 4P FWFO UIF EFGBVMU  TBNQMFT JT FOPVHI GPS BDDVSBUF JOGFSFODF *O GBDU BT GFX BT  FČFDUJWF TBNQMFT JT VTVBMMZ QMFOUZ GPS B HPPE BQQSPYJNBUJPO PG UIF QPTUFSJPS #VU XF BMTP XBOU UP SVO NVMUJQMF DIBJOT GPS SFBTPOT XFMM EJTDVTT JO NPSF EFQUI JO UIF OFYU TFDUJPOT ćFSF XJMM CF TQFDJĕD BEWJDF JO 4FDUJPO  QBHF   'PS OPX JUT XPSUI OPUJOH UIBU ZPV DBO FBTJMZ QBSBMMFMJ[F UIPTF DIBJOT BT XFMM ćFZ DBO BMM SVO BU UIF TBNF UJNF JOTUFBE PG JO TFRVFODF 4P BT MPOH BT ZPVS DPNQVUFS IBT GPVS DPSFT JU QSPCBCMZ EPFT JU XPOU UBLF MPOHFS UP SVO GPVS DIBJOT UIBO POF DIBJO 5P SVO GPVS JOEFQFOEFOU .BSLPW DIBJOT GPS UIF NPEFM BCPWF BOE UP EJTUSJCVUF UIFN BDSPTT TFQBSBUF DPSFT JO ZPVS DPNQVUFS KVTU JODSFBTF UIF OVNCFS PG #$). BOE BEE B *- . BSHVNFOU 3 DPEF  (ǖǡǎ ʚǶ 0'(ǿ '$./ǿ '*"Ǿ"+Ǿ./ ʡ )*-(ǿ (0 Ǣ .$"( Ȁ Ǣ (0 ʚǶ ȁ$Ȃ ʔ ȁ$Ȃȉǿ -0"" Ǿ./ Ƕ ǍǡǏǎǒ Ȁ Ǣ ȁ$Ȃ ʡ )*-(ǿ ǎ Ǣ Ǎǡǎ Ȁ Ǣ ȁ$Ȃ ʡ )*-(ǿ Ǎ Ǣ Ǎǡǐ Ȁ Ǣ .$"( ʡ  3+ǿ ǎ Ȁ Ȁ Ǣ /ʙ/Ǿ.'$( Ǣ #$).ʙǑ Ǣ *- .ʙǑ Ǣ $/ -ʙǎǍǍǍ Ȁ ćFSF BSF B CVODI PG PQUJPOBM BSHVNFOUT UIBU BMMPX VT UP UVOF BOE DVTUPNJ[F UIF QSPDFTT 8FMM CSJOH UIFN VQ BT UIFZ BSF OFFEFE 'PS OPX LFFQ JO NJOE UIBU .#*2 XJMM SFNJOE ZPV PG
  37. Hamiltonian Flows /ʙ/Ǿ.'$( Ǣ #$).ʙǑ Ǣ *- .ʙǑ Ǣ $/

    -ʙǎǍǍǍ Ȁ ćFSF BSF B CVODI PG PQUJPOBM BSHVNFOUT UIBU BMMPX VT UP UVOF BOE DVTUPNJ[ 8FMM CSJOH UIFN VQ BT UIFZ BSF OFFEFE 'PS OPX LFFQ JO NJOE UIBU .#*2 XJMM UIF NPEFM GPSNVMB BOE BMTP IPX MPOH FBDI DIBJO UPPL UP SVO .#*2ǿ (ǖǡǎ Ȁ ($'/*)$) *)/ -'* ++-*3$(/$*) ǏǍǍǍ .(+' . !-*( Ǒ #$). (+'$)" 0-/$*). ǿ. *).Ȁǣ 2-(0+ .(+' /*/' #$)ǣǎ ǍǡǎǏ ǍǡǍǕ ǍǡǏǍ #$)ǣǏ ǍǡǎǏ ǍǡǍǕ Ǎǡǎǖ #$)ǣǐ ǍǡǎǏ ǍǡǍǕ ǍǡǏǍ #$)ǣǑ ǍǡǎǏ ǍǡǍǕ ǍǡǏǍ *-(0'ǣ '*"Ǿ"+Ǿ./ ʡ )*-(ǿ(0Ǣ .$"(Ȁ (0 ʚǶ ȁ$Ȃ ʔ ȁ$Ȃ ȉ ǿ-0"" Ǿ./ Ƕ ǍǡǏǎǒȀ (0 ʚǶ ȁ$Ȃ ʔ ȁ$Ȃȉǿ -0"" Ǿ./ Ƕ ǍǡǏǎǒ Ȁ Ǣ ȁ$Ȃ ʡ )*-(ǿ ǎ Ǣ Ǎǡǎ Ȁ Ǣ ȁ$Ȃ ʡ )*-(ǿ Ǎ Ǣ Ǎǡǐ Ȁ Ǣ .$"( ʡ  3+ǿ ǎ Ȁ Ȁ Ǣ /ʙ/Ǿ.'$( Ǣ #$).ʙǑ Ǣ *- .ʙǑ Ǣ $/ -ʙǎǍǍǍ Ȁ ćFSF BSF B CVODI PG PQUJPOBM BSHVNFOUT UIBU BMMPX VT UP UVOF BOE DVTUPNJ[F UI 8FMM CSJOH UIFN VQ BT UIFZ BSF OFFEFE 'PS OPX LFFQ JO NJOE UIBU .#*2 XJMM SFN UIF NPEFM GPSNVMB BOE BMTP IPX MPOH FBDI DIBJO UPPL UP SVO .#*2ǿ (ǖǡǎ Ȁ ($'/*)$) *)/ -'* ++-*3$(/$*) ǏǍǍǍ .(+' . !-*( Ǒ #$). (+'$)" 0-/$*). ǿ. *).Ȁǣ 2-(0+ .(+' /*/' #$)ǣǎ ǍǡǎǏ ǍǡǍǕ ǍǡǏǍ #$)ǣǏ ǍǡǎǏ ǍǡǍǕ Ǎǡǎǖ #$)ǣǐ ǍǡǎǏ ǍǡǍǕ ǍǡǏǍ #$)ǣǑ ǍǡǎǏ ǍǡǍǕ ǍǡǏǍ • Num samples = total minus warmup • Default warmup is half
  38. Hamiltonian Flows • n_eff: number of effective samples • can

    be larger than actual samples! • Rhat: Convergence diagnostic — “1” is good ȁ$Ȃ ʡ )*-(ǿǍǢ ǍǡǐȀ .$"( ʡ  3+ǿǎȀ ćFSF XFSF  TBNQMFT GSPN BMM  DIBJOT CFDBVTF FBDI  TBNQMF DIBJO ĕTU IBMG PG UIF TBNQMFT UP BEBQU 4PNFUIJOH DVSJPVT IBQQFOT XIFO XF MPP 3 DPEF  +- $.ǿ (ǖǡǎ Ǣ Ǐ Ȁ ( ) . ǒǡǒʉ ǖǑǡǒʉ )Ǿ !! #/ .$"( Ǎǡǎǎ ǍǡǍǎ ǍǡǎǍ ǍǡǎǏ ǏǓǑǎ ǎ ȁǎȂ Ǎǡǎǐ ǍǡǍǔ ǍǡǍǏ ǍǡǏǓ ǏǑǕǑ ǎ ȁǏȂ ǶǍǡǎǑ ǍǡǍǒ ǶǍǡǏǐ ǶǍǡǍǓ ǏǒǑǓ ǎ ȁǎȂ ǍǡǕǖ ǍǡǍǏ ǍǡǕǓ Ǎǡǖǎ ǐǐǐǎ ǎ ȁǏȂ ǎǡǍǒ ǍǡǍǎ ǎǡǍǐ ǎǡǍǔ ǐǏǑǐ ǎ *G UIFSF XFSF POMZ  TBNQMFT JO UPUBM IPX XF DBO IBWF NPSF UIBO  GPS FBDI QBSBNFUFS *UT OP NJTUBLF ćF BEBQUJWF /654 TBNQMFS UIBU 4UB JU DBO BDUVBMMZ QSPEVDF TFRVFOUJBM TBNQMFT UIBU BSF CFUUFS UIBO VODPSSFMB DBO FYQMPSF UIF QPTUFSJPS EJTUSJCVUJPO TP FďDJFOUMZ UIBU JU DBO CFBU SBOEP JO BDUJPO  7JTVBMJ[BUJPO #Z QMPUUJOH UIF TBNQMFT ZPV DBO HFU B EJSFDU BQQ (BVTTJBO RVBESBUJD UIF BDUVBM QPTUFSJPS EFOTJUZ IBT UVSOFE PVU UP CF 6TF
  39. Figure 9.7 sigma -0.1 0.1 0.3 0.84 0.88 0.92 0.09

    0.11 0.13 -0.1 0.1 0.3 -0.03 b.1 0 0 b.2 -0.30 -0.15 0.00 0.84 0.88 0.92 0 0.11 0 a.1 0.09 0.11 0.13 -0.08 0.06 -0.30 -0.15 0.00 -0.05 0.04 1.01 1.04 1.07 1.01 1.04 1.07 a.2 UIF NPEFM PCKFDU TP UIBU 3 LOPXT UP EJTQMBZ QBSBNFUFS OBNFT BOE QBSBNFUFS DPSSF EF  +$-.ǿ (ǖǡǎ Ȁ 'ĶĴłĿIJ ƑƏ TIPXT UIF SFTVMUJOH QMPU ćJT JT B QBJST QMPU TP JUT TUJMM B NBUSJY PG CJWBSJBU QMPUT #VU OPX BMPOH UIF EJBHPOBM UIF TNPPUIFE IJTUPHSBN PG FBDI QBSBNFUFS JT TIPX XJUI JUT OBNF "OE JO UIF MPXFS USJBOHMF PG UIF NBUSJY CFMPX UIF EJBHPOBM UIF DPS CFUXFFO FBDI QBJS PG QBSBNFUFST JT TIPXO XJUI TUSPOHFS DPSSFMBUJPOT JOEJDBUFE CZ TJ[F 'PS UIJT NPEFM BOE UIFTF EBUB UIF SFTVMUJOH QPTUFSJPS EJTUSJCVUJPO JT RVJUF OFBSM WBSJBUF (BVTTJBO ćF EFOTJUZ GPS .$"( JT DFSUBJOMZ TLFXFE JO UIF FYQFDUFE EJSFDUJ PUIFSXJTF UIF RVBESBUJD BQQSPYJNBUJPO EPFT BMNPTU BT XFMM BT )BNJMUPOJBO .POU ćJT JT B WFSZ TJNQMF LJOE PG NPEFM TUSVDUVSF PG DPVSTF XJUI (BVTTJBO QSJPST TP BO NBUFMZ RVBESBUJD QPTUFSJPS TIPVME CF OP TVSQSJTF -BUFS XFMM TFF TPNF NPSF FYPUJD Q EJTUSJCVUJPOT 0WFSUIJOLJOH 4UBO NFTTBHFT 8IFO ZPV ĕU B NPEFM VTJOH (+Ǐ./) 3 XJMM ĕSTU USBOT NPEFM GPSNVMB JOUP B 4UBO MBOHVBHF NPEFM ćFO JU TFOET UIBU NPEFM UP 4UBO ćF NFTT TFF JO ZPVS 3 DPOTPMF BSF TUBUVT VQEBUFT GSPN 4UBO 4UBO ĕSTU BHBJO USBOTMBUFT UIF NPEFM JOUP $ DPEF ćBU DPEF JT UIFO TFOU UP B $ DPNQJMFS UP CVJME BO FYFDVUBCMF ĕMF JT B TQ TBNQMJOH FOHJOF GPS ZPVS NPEFM ćFO 4UBO GFFET UIF EBUB BOE TUBSUJOH WBMVFT UP UIBU FYFDVU BOE JG BMM HPFT XFMM TBNQMJOH CFHJOT :PV XJMM TFF 4UBO DPVOU UISPVHI UIF JUFSBUJPOT %VSJOH T ZPV NJHIU PDDBTJPOBMMZ TFF B TDBSZ MPPLJOH XBSOJOH TPNFUIJOH MJLF UIJT
  40. Check the chains • Sometimes it doesn’t work • Good

    chains: • Converge to same target distribution • Once there, explore efficiently • Different ways to check • Trace plots • Convergence diagnostics (n_eff, Rhat) • Special warnings (divergent transitions)
  41. Check the chains • Trace plot: Plot of sequential samples

    in chain • Shows some problems, but not all • Want to see “hairy caterpillar”  &"4: ).$ ƽ  'ĶĴłĿIJ ƐƎ 5SBDF QMPU PG UIF .BSLPW DIBJO GSPN UIF SVHHFEOFTT NPEFM warmup 'PS OPX MFUT NFFU UIF NPTU CSPBEMZ VTFGVM UPPM GPS EJBHOPT ĽĹļŁ " USBDF QMPU NFSFMZ QMPUT UIF TBNQMFT JO TFRVFOUJBM PSEFS .BSLPWT QBUI UISPVHI UIF JTMBOET JO UIF NFUBQIPS BU UIF TUBSU PG U USBDF QMPU PG FBDI QBSBNFUFS JT PęFO UIF CFTU UIJOH GPS EJBHOPTJOH PODF ZPV DPNF UP SFDPHOJ[F B IFBMUIZ GVODUJPOJOH .BSLPW DIBJO QSPWJEF B MPU PG QFBDF PG NJOE " USBDF QMPU JTOU UIF MBTU UIJOH BOB PVUQVU #VU JUT OFBSMZ BMXBZT UIF ĕSTU *O UIF UFSSBJO SVHHFEOFTT FYBNQMF UIF USBDF QMPU TIPXT B WF XJUI 3 DPEF  /- +'*/ǿ (ǖǡǎ Ȁ ćF SFTVMU JT TIPXO JO 'ĶĴłĿIJ ƑƐ "DUVBMMZ UIF ĕHVSF TIPXT UIF U :PV DBO HFU UIJT CZ BEEJOH #$).ʙǎ UP UIF DBMM :PV DBO UIJOL FBDI QBSBNFUFS BT UIF QBUI UIF DIBJO UPPL UISPVHI FBDI EJNFOTJP
  42. 500 1000 1500 2000 8.8 9.0 9.2 9.4 9.6 n_eff

    = 267 a 500 1000 1500 2000 −0.4 −0.2 0.0 n_eff = 265 bR 500 1000 1500 2000 −2.5 −2.0 −1.5 n_eff = 296 bA 500 1000 1500 2000 0.0 0.2 0.4 0.6 0.8 n_eff = 294 bAR 500 1000 1500 2000 0.80 0.90 1.00 1.10 n_eff = 471 sigma “Hairy caterpillar ocular inspection test”
  43. Warmup • What is “warmup”? • Adaptation to posterior for

    efficient sampling • Figures out good step size • Samples during warmup NOT from posterior • Automatically discarded by precis/summary and other functions • Warmup is NOT “burn in” warmup   ."3,07 $)"*/ 500 1000 1500 2000 8.8 9.0 9.2 9.4 9.6 n_eff = 267 a 500 1000 1500 2000 −2.5 −2.0 −1.5 n_eff = 296 bA .80 0.90 1.00 1.10 n_eff = 471 sigma
  44.   ."3,07 $)"*/ .0/5& $"3-0 ȁ$Ȃ ʡ )*-(ǿǍǢ ǍǡǐȀ

    .$"( ʡ  3+ǿǎȀ ćFSF XFSF  TBNQMFT GSPN BMM  DIBJOT CFDBVTF FBDI  TBNQMF DIBJO VTFT CZ EFBVMU U ĕTU IBMG PG UIF TBNQMFT UP BEBQU 4PNFUIJOH DVSJPVT IBQQFOT XIFO XF MPPL BU UIF TVNNBS 3 DPEF  +- $.ǿ (ǖǡǎ Ǣ Ǐ Ȁ ( ) . ǒǡǒʉ ǖǑǡǒʉ )Ǿ !! #/ .$"( Ǎǡǎǎ ǍǡǍǎ ǍǡǎǍ ǍǡǎǏ ǏǓǑǎ ǎ ȁǎȂ Ǎǡǎǐ ǍǡǍǔ ǍǡǍǏ ǍǡǏǓ ǏǑǕǑ ǎ ȁǏȂ ǶǍǡǎǑ ǍǡǍǒ ǶǍǡǏǐ ǶǍǡǍǓ ǏǒǑǓ ǎ ȁǎȂ ǍǡǕǖ ǍǡǍǏ ǍǡǕǓ Ǎǡǖǎ ǐǐǐǎ ǎ ȁǏȂ ǎǡǍǒ ǍǡǍǎ ǎǡǍǐ ǎǡǍǔ ǐǏǑǐ ǎ *G UIFSF XFSF POMZ  TBNQMFT JO UPUBM IPX XF DBO IBWF NPSF UIBO  FČFDUJWF TBNQ Convergence diagnostics • n_eff: “effective” number of samples • n_eff/n < 0.1, be alarmed • R-hat • R-hat: crudely, ratio of variance between chains to variance within chains • Should approach 1 • Both may mislead
  45. A wild chain • Two observations: {–1,1} • Estimate mean

    and standard deviation DBO SFBDI  FWFO GPS BO JOWBMJE DIBJO 4P WJFX JU QFSIBQT BT B TJHOBM PG EBOHFS CVU OFWFS PG TBGFUZ 'PS DPOWFOUJPOBM NPEFMT UIFTF NFUSJDT UZQJDBMMZ XPSL XFMM  5BNJOH B XJME DIBJO 0OF DPNNPO QSPCMFN XJUI TPNF NPEFMT JT UIBU UIFSF BSF CSPBE ĘBU SFHJPOT PG UIF QPTUFSJPS EFOTJUZ ćJT IBQQFOT NPTU PęFO BT ZPV NJHIU HVFTT XIFO POF VTFT ĘBU QSJPST ćF QSPCMFN UIJT DBO HFOFSBUF JT B XJME XBOEFSJOH .BSLPW DIBJO UIBU FSSBUJDBMMZ TBNQMFT FYUSFNFMZ QPTJUJWF BOE FYUSFNFMZ OFHBUJWF QBSBNFUFS WBMVFT -FUT MPPL BU B TJNQMF FYBNQMF ćF DPEF CFMPX USJFT UP FTUJNBUF UIF NFBO BOE TUBOEBSE EFWJBUJPO PG UIF UXP (BVTTJBO PCTFSWBUJPOT − BOE  #VU JU VTFT UPUBMMZ ĘBU QSJPST 3 DPEF  4 ʚǶ ǿǶǎǢǎȀ . /ǡ. ǿǎǎȀ (ǖǡǏ ʚǶ 0'(ǿ '$./ǿ 4 ʡ )*-(ǿ (0 Ǣ .$"( Ȁ Ǣ (0 ʚǶ '+# Ǣ '+# ʡ )*-(ǿ Ǎ Ǣ ǎǍǍǍ Ȁ Ǣ .$"( ʡ  3+ǿ ǍǡǍǍǍǎ Ȁ Ȁ Ǣ /ʙ'$./ǿ4ʙ4Ȁ Ǣ #$).ʙǏ Ȁ /PX MFUT MPPL BU UIF +- $. PVUQVU 3 DPEF  +- $.ǿ (ǖǡǏ Ȁ ( ) . ǒǡǒʉ ǖǑǡǒʉ )Ǿ !! #/
  46. A wild chain CSPBE ĘBU SFHJPOT PG UIF QPTUFSJPS EFOTJUZ

    ćJT IBQQFOT NPTU PęFO BT ZPV NJHIU HVFTT XIFO POF VTFT ĘBU QSJPST ćF QSPCMFN UIJT DBO HFOFSBUF JT B XJME XBOEFSJOH .BSLPW DIBJO UIBU FSSBUJDBMMZ TBNQMFT FYUSFNFMZ QPTJUJWF BOE FYUSFNFMZ OFHBUJWF QBSBNFUFS WBMVFT -FUT MPPL BU B TJNQMF FYBNQMF ćF DPEF CFMPX USJFT UP FTUJNBUF UIF NFBO BOE TUBOEBSE EFWJBUJPO PG UIF UXP (BVTTJBO PCTFSWBUJPOT − BOE  #VU JU VTFT UPUBMMZ ĘBU QSJPST 3 DPEF  4 ʚǶ ǿǶǎǢǎȀ . /ǡ. ǿǎǎȀ (ǖǡǏ ʚǶ 0'(ǿ '$./ǿ 4 ʡ )*-(ǿ (0 Ǣ .$"( Ȁ Ǣ (0 ʚǶ '+# Ǣ '+# ʡ )*-(ǿ Ǎ Ǣ ǎǍǍǍ Ȁ Ǣ .$"( ʡ  3+ǿ ǍǡǍǍǍǎ Ȁ Ȁ Ǣ /ʙ'$./ǿ4ʙ4Ȁ Ǣ #$).ʙǏ Ȁ /PX MFUT MPPL BU UIF +- $. PVUQVU 3 DPEF  +- $.ǿ (ǖǡǏ Ȁ ( ) . ǒǡǒʉ ǖǑǡǒʉ )Ǿ !! #/ .$"( ǒǖǓǡǖǖ ǎǒǑǏǡǕǒ Ǐǡǎǔ ǏǑǓǎǡǑǑ ǒǔ ǎǡǍǓ '+# ǶǒǖǡǑǒ ǐǒǒǡǒǐ ǶǕǎǎǡǑǒ ǐǒǖǡǑǖ ǐǏ ǎǡǍǒ 8IPB ćJT QPTUFSJPS DBOU CF SJHIU ćF NFBO PG − BOE  JT [FSP TP XFSF IPQJOH UP HFU B NFBO WBMVF GPS '+# BSPVOE [FSP *OTUFBE XF HFU DSB[Z WBMVFT BOE JNQMBVTJCMZ XJEF JO UFSWBMT *OGFSFODF GPS .$"( JT OP CFUUFS ćF )Ǿ !! BOE #/ EJBHOPTUJDT EPOU MPPL HPPE FJUIFS 8F IBWF  TBNQMFT UP XPSL XJUI IFSF CVU UIF FTUJNBUFE FČFDUJWF TBNQMF TJ[FT BSF XIFO POF VTFT ĘBU QSJPST ćF QSPCMFN UIJT DBO HFOFSBUF JT B XJME XBOEFSJOH .BSLPW DIBJO UIBU FSSBUJDBMMZ TBNQMFT FYUSFNFMZ QPTJUJWF BOE FYUSFNFMZ OFHBUJWF QBSBNFUFS WBMVFT -FUT MPPL BU B TJNQMF FYBNQMF ćF DPEF CFMPX USJFT UP FTUJNBUF UIF NFBO BOE TUBOEBSE EFWJBUJPO PG UIF UXP (BVTTJBO PCTFSWBUJPOT − BOE  #VU JU VTFT UPUBMMZ ĘBU QSJPST 3 DPEF  4 ʚǶ ǿǶǎǢǎȀ . /ǡ. ǿǎǎȀ (ǖǡǏ ʚǶ 0'(ǿ '$./ǿ 4 ʡ )*-(ǿ (0 Ǣ .$"( Ȁ Ǣ (0 ʚǶ '+# Ǣ '+# ʡ )*-(ǿ Ǎ Ǣ ǎǍǍǍ Ȁ Ǣ .$"( ʡ  3+ǿ ǍǡǍǍǍǎ Ȁ Ȁ Ǣ /ʙ'$./ǿ4ʙ4Ȁ Ǣ #$).ʙǏ Ȁ /PX MFUT MPPL BU UIF +- $. PVUQVU 3 DPEF  +- $.ǿ (ǖǡǏ Ȁ ( ) . ǒǡǒʉ ǖǑǡǒʉ )Ǿ !! #/ .$"( ǒǖǓǡǖǖ ǎǒǑǏǡǕǒ Ǐǡǎǔ ǏǑǓǎǡǑǑ ǒǔ ǎǡǍǓ '+# ǶǒǖǡǑǒ ǐǒǒǡǒǐ ǶǕǎǎǡǑǒ ǐǒǖǡǑǖ ǐǏ ǎǡǍǒ 8IPB ćJT QPTUFSJPS DBOU CF SJHIU ćF NFBO PG − BOE  JT [FSP TP XFSF IPQJOH UP HFU B NFBO WBMVF GPS '+# BSPVOE [FSP *OTUFBE XF HFU DSB[Z WBMVFT BOE JNQMBVTJCMZ XJEF JO UFSWBMT *OGFSFODF GPS .$"( JT OP CFUUFS ćF )Ǿ !! BOE #/ EJBHOPTUJDT EPOU MPPL HPPE FJUIFS 8F IBWF  TBNQMFT UP XPSL XJUI IFSF CVU UIF FTUJNBUFE FČFDUJWF TBNQMF TJ[FT BSF  BOE 
  47. A wild chain '+# ʡ )*-(ǿ Ǎ Ǣ ǎǍǍǍ Ȁ

    Ǣ .$"( ʡ  3+ǿ ǍǡǍǍǍǎ Ȁ Ȁ Ǣ /ʙ'$./ǿ4ʙ4Ȁ Ǣ #$).ʙǏ Ȁ /PX MFUT MPPL BU UIF +- $. PVUQVU 3 DPEF  +- $.ǿ (ǖǡǏ Ȁ ( ) . ǒǡǒʉ ǖǑǡǒʉ )Ǿ !! #/ .$"( ǒǖǓǡǖǖ ǎǒǑǏǡǕǒ Ǐǡǎǔ ǏǑǓǎǡǑǑ ǒǔ ǎǡǍǓ '+# ǶǒǖǡǑǒ ǐǒǒǡǒǐ ǶǕǎǎǡǑǒ ǐǒǖǡǑǖ ǐǏ ǎǡǍǒ 8IPB ćJT QPTUFSJPS DBOU CF SJHIU ćF NFBO PG − BOE  JT [FSP TP XFSF IPQJOH UP HFU B NFBO WBMVF GPS '+# BSPVOE [FSP *OTUFBE XF HFU DSB[Z WBMVFT BOE JNQMBVTJCMZ XJEF JO UFSWBMT *OGFSFODF GPS .$"( JT OP CFUUFS ćF )Ǿ !! BOE #/ EJBHOPTUJDT EPOU MPPL HPPE FJUIFS 8F IBWF  TBNQMFT UP XPSL XJUI IFSF CVU UIF FTUJNBUFE FČFDUJWF TBNQMF TJ[FT BSF  BOE  :PV TIPVME BMTP TFF B XBSOJOH MJLF -)$)" ( .." .ǣ ǎǣ # - 2 - ǔǍ $1 -" )/ /-).$/$*). !/ - 2-(0+ǡ )- .$)" +/Ǿ '/ *1 Ǎǡǖǒ (4 # '+ǡ  #//+ǣȅȅ(Ƕ./)ǡ*-"ȅ($.ȅ2-)$)".ǡ#/('ȕ$1 -" )/Ƕ/-).$/$*).Ƕ!/ -Ƕ2-(0+ ćFSF JT VTFGVM BEWJDF BU UIF 63- ćF RVJDL WFSTJPO JT UIBU 4UBO EFUFDUFE QSPCMFNT JO FY QMPSJOH BMM PG UIF QPTUFSJPS ćFSF BSF QPSUJPOT PG JU UIBU JU NJTTFE ćFTF BSF ıĶŃIJĿĴIJĻŁ ŁĿĮĻŀĶŁĶļĻŀ *MM HJWF B NPSF UIPSPVHI FYQMBOBUJPO JO B MBUFS DIBQUFS 'PS TJNQMFS NPE FMT JODSFBTJOH UIF +/Ǿ '/ DPOUSPM QBSBNFUFS XJMM VTVBMMZ SFNPWF UIFN ćJT JT FY QMBJOFE NPSF JO UIF 0WFSUIJOLJOH CPY BU UIF FOE PG UIJT TFDUJPO :PV DBO USZ BEEJOH *)Ƕ /PX MFUT MPPL BU UIF +- $. PVUQVU 3  +- $.ǿ (ǖǡǏ Ȁ ( ) . ǒǡǒʉ ǖǑǡǒʉ )Ǿ !! #/ .$"( ǒǖǓǡǖǖ ǎǒǑǏǡǕǒ Ǐǡǎǔ ǏǑǓǎǡǑǑ ǒǔ ǎǡǍǓ '+# ǶǒǖǡǑǒ ǐǒǒǡǒǐ ǶǕǎǎǡǑǒ ǐǒǖǡǑǖ ǐǏ ǎǡǍǒ 8IPB ćJT QPTUFSJPS DBOU CF SJHIU ćF NFBO PG − BOE  JT [FSP TP XFSF IPQJOH UP HFU B NFBO WBMVF GPS '+# BSPVOE [FSP *OTUFBE XF HFU DSB[Z WBMVFT BOE JNQMBVTJCMZ XJEF JO UFSWBMT *OGFSFODF GPS .$"( JT OP CFUUFS ćF )Ǿ !! BOE #/ EJBHOPTUJDT EPOU MPPL HPPE FJUIFS 8F IBWF  TBNQMFT UP XPSL XJUI IFSF CVU UIF FTUJNBUFE FČFDUJWF TBNQMF TJ[FT BSF  BOE  :PV TIPVME BMTP TFF B XBSOJOH MJLF -)$)" ( .." .ǣ ǎǣ # - 2 - ǔǍ $1 -" )/ /-).$/$*). !/ - 2-(0+ǡ )- .$)" +/Ǿ '/ *1 Ǎǡǖǒ (4 # '+ǡ  #//+ǣȅȅ(Ƕ./)ǡ*-"ȅ($.ȅ2-)$)".ǡ#/('ȕ$1 -" )/Ƕ/-).$/$*).Ƕ!/ -Ƕ2-(0+ ćFSF JT VTFGVM BEWJDF BU UIF 63- ćF RVJDL WFSTJPO JT UIBU 4UBO EFUFDUFE QSPCMFNT JO FY QMPSJOH BMM PG UIF QPTUFSJPS ćFSF BSF QPSUJPOT PG JU UIBU JU NJTTFE ćFTF BSF ıĶŃIJĿĴIJĻŁ ŁĿĮĻŀĶŁĶļĻŀ *MM HJWF B NPSF UIPSPVHI FYQMBOBUJPO JO B MBUFS DIBQUFS 'PS TJNQMFS NPE FMT JODSFBTJOH UIF +/Ǿ '/ DPOUSPM QBSBNFUFS XJMM VTVBMMZ SFNPWF UIFN ćJT JT FY QMBJOFE NPSF JO UIF 0WFSUIJOLJOH CPY BU UIF FOE PG UIJT TFDUJPO :PV DBO USZ BEEJOH *)Ƕ /-*'ʙ'$./ǿ+/Ǿ '/ʙǍǡǖǖȀ UP UIF 0'( DBMM‰0'(T EFGBVMU JT  #VU JU XPOU IFMQ NVDI JO UIJT TQFDJĕD DBTF ćJT QSPCMFN SVOT EFFQFS XJUI UIF NPEFM JUTFMG :PV TIPVME BMTP TFF B TFDPOE XBSOJOH • What the what is a divergent transition? • Hamiltonian approximation “broke” • Chain has trouble exploring some part of posterior
  48. Figure 9.9 200 400 600 800 1000 0 10000 30000

    n_eff = 57 sigma 200 400 600 800 1000 -2000 0 1000 n_eff = 32 alpha 200 400 600 800 1000 1 2 3 4 5 6 7 n_eff = 317 sigma 200 400 600 800 1000 -4 -2 0 2 4 6 n_eff = 284 alpha 'ĶĴłĿIJ ƑƑ %JBHOPTJOH BOE IFBMJOH B TJDL .BSLPW DIBJO 5PQ SPX 5SBDF QMPU GSPN UXP JOEFQFOEFOU DIBJOT EFĕOFE CZ NPEFM (ǖǡǏ ćFTF DIBJOT
  49. A wild chain • Problem is flat priors • Flat

    means flat forever • Much probability out to thousands • Also a problem in BUGS/JAGS • Fix with weakly informative priors PDDBTJPOBMMZ UP FYUSFNF WBMVFT ćJT JT OPU B IFBMUIZ QBJS PG DIBJOT BOE EF VTFGVM TBNQMFT NF UIJT QBSUJDVMBS DIBJO CZ VTJOH XFBLMZ JOGPSNBUJWF QSJPST ćF SFBTPO ESJęT XJMEMZ JO CPUI EJNFOTJPOT JT UIBU UIFSF JT WFSZ MJUUMF EBUB KVTU UXP ĘBU QSJPST ćF ĘBU QSJPST TBZ UIBU FWFSZ QPTTJCMF WBMVF PG UIF QBSBNFUFS F B QSJPSJ 'PS QBSBNFUFST UIBU DBO UBLF B QPUFOUJBMMZ JOĕOJUF OVNCFS PG UIJT NFBOT UIF .BSLPW DIBJO OFFET UP PDDBTJPOBMMZ TBNQMF TPNF QSFUUZ BVTJCMF WBMVFT MJLF OFHBUJWF  NJMMJPO ćFTF FYUSFNF ESJęT PWFSXIFMN JLFMJIPPE XFSF TUSPOHFS UIFO UIF DIBJO XPVME CF ĕOF CFDBVTF JU XPVME  UBLF NVDI JOGPSNBUJPO JO UIF QSJPS UP TUPQ UIJT GPPMJTIOFTT FWFO XJUIPVU TF UIJT NPEFM ZJ ∼ /PSNBM(µ, σ) µ = α α ∼ /PSNBM(, ) σ ∼ &YQPOFOUJBM() 1000 _eff = 57 200 400 600 800 1000 -2000 0 1000 n_eff = 32 alpha 1000 eff = 317 200 400 600 800 1000 -4 -2 0 2 4 6 n_eff = 284 alpha IFBMJOH B TJDL .BSLPW DIBJO 5PQ SPX 5SBDF IBJOT EFĕOFE CZ NPEFM (ǖǡǏ ćFTF DIBJOT PU CF VTFE GPS JOGFSFODF #PUUPN SPX "EEJOH FF (ǖǡǐ DMFBST VQ UIF DPOEJUJPO SJHIU BXBZ PS JOGFSFODF
  50. A tame chain FBTJMZ PWFSDPNFT UIFTF QSJPST :FU UIF QPTUFSJPS

    DBOOPU CF TVDDFTTGVMMZ BQ QSPYJNBUFE XJUIPVU UIFN *WF KVTU BEEFE XFBLMZ JOGPSNBUJWF QSJPST GPS α BOE σ 8FMM QMPU UIFTF QSJPST JO B NPNFOU TP ZPV XJMM CF BCMF UP TFF KVTU IPX XFBL UIFZ BSF #VU MFUT SFBQQSPYJNBUF UIF QPTUFSJPS ĕSTU 3 DPEF  . /ǡ. ǿǎǎȀ (ǖǡǐ ʚǶ 0'(ǿ '$./ǿ 4 ʡ )*-(ǿ (0 Ǣ .$"( Ȁ Ǣ (0 ʚǶ '+# Ǣ '+# ʡ )*-(ǿ ǎ Ǣ ǎǍ Ȁ Ǣ .$"( ʡ  3+ǿ ǎ Ȁ Ȁ Ǣ /ʙ'$./ǿ4ʙ4Ȁ Ǣ #$).ʙǏ Ȁ +- $.ǿ (ǖǡǐ Ȁ ( ) . ǒǡǒʉ ǖǑǡǒʉ )Ǿ !! #/ .$"( ǎǡǒǐ Ǎǡǔǒ ǍǡǓǓ ǏǡǕǓ ǐǎǔ ǎ '+# ǍǡǍǑ ǎǡǎǑ Ƕǎǡǔǎ ǎǡǕǍ ǏǕǑ ǎ ćBUT NVDI CFUUFS 5BLF B MPPL BU UIF CPUUPN SPX JO 'ĶĴłĿIJ ƑƑ ćJT USBDF QMPU MPPLT IFBMUIZ #PUI DIBJOT BSF TUBUJPOBSZ BSPVOE UIF TBNF WBMVFT BOE NJYJOH JT HPPE /P NPSF XJME EFUPVST JOUP UIF UIPVTBOET "OE UIPTF EJWFSHFOU USBOTJUJPOT TIPVME CF HPOF 5P BQQSFDJBUF XIBU IBT IBQQFOFE UBLF B MPPL BU UIF QSJPST EBTIFE BOE QPTUFSJPST CMVF JO 'ĶĴłĿIJ ƑƉƈ #PUI UIF (BVTTJBO QSJPS GPS α BOE UIF FYQPOFOUJBM QSJPS GPS σ DPOUBJO WFSZ HSBEVBM EPXOIJMM TMPQFT ćFZ BSF TP HSBEVBM UIBU FWFO XJUI POMZ UXP PCTFSWBUJPOT BT JO
  51. Figure 9.9 200 400 600 800 1000 0 10000 30000

    n_eff = 57 sigma 200 400 600 800 1000 -2000 0 1000 n_eff = 32 alpha 200 400 600 800 1000 1 2 3 4 5 6 7 n_eff = 317 sigma 200 400 600 800 1000 -4 -2 0 2 4 6 n_eff = 284 alpha 'ĶĴłĿIJ ƑƑ %JBHOPTJOH BOE IFBMJOH B TJDL .BSLPW DIBJO 5PQ SPX 5SBDF QMPU GSPN UXP JOEFQFOEFOU DIBJOT EFĕOFE CZ NPEFM (ǖǡǏ ćFTF DIBJOT
  52. A tame chain Even with only 2 observations, these priors

    have no effect on inference! Except to allow you to make inferences...  $"3& "/% '&&%*/( 0' :063 ."3,07 $)"*/  -15 -10 -5 0 5 10 15 0.0 0.1 0.2 0.3 0.4 alpha Density posterior prior 0 2 4 6 8 10 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 sigma Density 'ĶĴłĿIJ ƑƉƈ 1SJPS EBTIFE BOE QPTUFSJPS CMVF GPS UIF NPEFM XJUI XFBLMZ JOGPSNBUJWF QSJPST (ǖǡǐ &WFO XJUI POMZ UXP PCTFSWBUJPOT UIF MJLFMJIPPE FBTJMZ PWFSDPNFT UIFTF QSJPST :FU UIF QPTUFSJPS DBOOPU CF TVDDFTTGVMMZ BQ QSPYJNBUFE XJUIPVU UIFN Figure 9.10
  53. The Folk Theorem of Statistical Computing “When you have computational

    problems, often there’s a problem with your model.” –Andrew Gelman
  54.  $"3& "/% '&&%*/( 0' :063 ."3,07 $)"*/  4

    ʡ )*-(ǿ (0 Ǣ .$"( Ȁ Ǣ (0 ʚǶ ǎ ʔ Ǐ Ǣ ǎ ʡ )*-(ǿ Ǎ Ǣ ǎǍǍǍ ȀǢ Ǐ ʡ )*-(ǿ Ǎ Ǣ ǎǍǍǍ ȀǢ .$"( ʡ  3+ǿ ǎ Ȁ Ȁ Ǣ /ʙ'$./ǿ4ʙ4Ȁ Ǣ #$).ʙǏ Ȁ +- $.ǿ (ǖǡǑ Ȁ ( ) . ǒǡǒʉ ǖǑǡǒʉ )Ǿ !! #/ .$"( ǎǡǍǒ ǍǡǍǕ Ǎǡǖǎ ǎǡǎǔ ǒ ǎǡǐǏ Ǐ ǏǎǡǑǓ Ǐǖǎǡǖǔ ǶǑǔǐǡǔǔ Ǒǔǒǡǒǖ Ǐ ǏǡǍǎ ǎ ǶǏǎǡǏǔ ǏǖǎǡǖǓ ǶǑǔǒǡǒǑ ǑǔǐǡǖǓ Ǐ ǏǡǍǎ ćPTF FTUJNBUFT MPPL TVTQJDJPVT BOE UIF )Ǿ !! BOE #/ WBMVFT BSF UFSSJCMF ćF NFBOT GPS Unidentified chains Ǣ .ʙǎ Ȁ OPX UIF SJHIU BOTXFS ćFO XF ĕU UIJT NPEFM ZJ ∼ /PSNBM(µ, σ) µ = α + α σ ∼ &YQPOFOUJBM() P QBSBNFUFST α BOE α XIJDI DBOOPU CF JEFOUJĕFE 0OMZ UIFJS TIPVME CF BCPVU [FSP BęFS FTUJNBUJPO BJO BOE TFF XIBU IBQQFOT ćJT DIBJO JT HPJOH UP UBLF NVDI T #VU JU TIPVME TUJMM ĕOJTI BęFS B GFX NJOVUFT TVDI QBSBNFUFST MPPL MJLF JOTJEF PG B .BSLPW DI UIFN JO QSJODJQMF CZ VTJOH B MJUUMF QSJPS JOGPSNBUJ DIBJOT QSPEVDFE JO UIJT FYBNQMF XJMM FYIJCJU DIB UIF TBNF QBUUFSO JO ZPVS PXO NPEFMT ZPVMM IBWF 5P DPOTUSVDU B OPOJEFOUJĕBCMF NPEFM XF ĕST JBO EJTUSJCVUJPO XJUI NFBO [FSP BOE TUBOEBSE EFW 3 DPEF  . /ǡ. ǿǑǎȀ 4 ʚǶ -)*-(ǿ ǎǍǍ Ǣ ( )ʙǍ Ǣ .ʙǎ Ȁ #Z TJNVMBUJOH UIF EBUB XF LOPX UIF SJHIU BOTXFS ZJ ∼ /PSNBM( µ = α + α σ ∼ &YQPOFO ćF MJOFBS NPEFM DPOUBJOT UXP QBSBNFUFST α BOE TVN DBO CF JEFOUJĕFE BOE JU TIPVME CF BCPVU [FSP -FUT SVO UIF .BSLPW DIBJO BOE TFF XIBU IB MPOHFS UIBO UIF QSFWJPVT POFT #VU JU TIPVME TUJMM ĕ 3 DPEF  (ǖǡǑ ʚǶ 0'(ǿ '$./ǿ #Z TJNVMBUJOH UIF EBUB XF LOPX UIF SJHIU BOTXFS ćFO XF ĕU UIJT NPEFM ZJ ∼ /PSNBM(µ, σ) µ = α + α σ ∼ &YQPOFOUJBM() ćF MJOFBS NPEFM DPOUBJOT UXP QBSBNFUFST α BOE α XIJDI DBOOPU CF JEFOUJĕFE 0OMZ UIFJS TVN DBO CF JEFOUJĕFE BOE JU TIPVME CF BCPVU [FSP BęFS FTUJNBUJPO -FUT SVO UIF .BSLPW DIBJO BOE TFF XIBU IBQQFOT ćJT DIBJO JT HPJOH UP UBLF NVDI MPOHFS UIBO UIF QSFWJPVT POFT #VU JU TIPVME TUJMM ĕOJTI BęFS B GFX NJOVUFT 3 DPEF  (ǖǡǑ ʚǶ 0'(ǿ '$./ǿ
  55. Figure 9.11 200 400 600 800 1000 0.9 1.1 1.3

    n_eff = 5 sigma 200 400 600 800 1000 -500 0 500 n_eff = 2 a2 200 400 600 800 1000 -500 0 500 n_eff = 2 a1 200 400 600 800 1000 0.9 1.1 1.3 n_eff = 287 sigma 200 400 600 800 1000 -20 0 10 20 n_eff = 244 a2 200 400 600 800 1000 -20 0 10 20 n_eff = 245 a1 'ĶĴłĿIJ ƑƉƉ -Fę DPMVNO " DIBJO XJUI XBOEFSJOH QBSBNFUFST ǎ BOE Ǐ
  56. Unidentified chains CVU JU XPOU IFMQ NVDI ćFSF JT TPNFUIJOH

    TFSJPVTMZ XSPOH IFSF -PPLJOH BU UIF USBDF QMPU SFWFBMT NPSF ćF MFę DPMVNO JO 'ĶĴłĿIJ ƑƉƉ TIPXT UXP .BSLPW DIBJOT GSPN UIF NPEFM BCPWF ćFTF DIBJOT EP OPU MPPL MJLF UIFZ BSF TUBUJPOBSZ OPS EP UIFZ TFFN UP CF NJYJOH WFSZ XFMM *OEFFE XIFO ZPV TFF B QBUUFSO MJLF UIJT JU JT SFBTPO UP XPSSZ %POU VTF UIFTF TBNQMFT "HBJO XFBL QSJPST DBO SFTDVF VT /PX UIF NPEFM ĕUUJOH DPEF JT 3 DPEF  (ǖǡǒ ʚǶ 0'(ǿ '$./ǿ 4 ʡ )*-(ǿ (0 Ǣ .$"( Ȁ Ǣ (0 ʚǶ ǎ ʔ Ǐ Ǣ ǎ ʡ )*-(ǿ Ǎ Ǣ ǎǍ ȀǢ Ǐ ʡ )*-(ǿ Ǎ Ǣ ǎǍ ȀǢ .$"( ʡ  3+ǿ ǎ Ȁ Ȁ Ǣ /ʙ'$./ǿ4ʙ4Ȁ Ǣ #$).ʙǏ Ȁ +- $.ǿ (ǖǡǒ Ȁ ( ) . ǒǡǒʉ ǖǑǡǒʉ )Ǿ !! #/ .$"( ǎǡǍǑ ǍǡǍǕ Ǎǡǖǐ ǎǡǎǕ ǏǕǔ ǎ Ǐ ǍǡǏǏ ǔǡǏǐ ǶǎǍǡǓǔ ǎǏǡǏǓ ǏǑǑ ǎ ǎ ǶǍǡǍǐ ǔǡǏǏ ǶǎǏǡǍǖ ǎǍǡǕǔ ǏǑǒ ǎ
  57. Figure 9.11 200 400 600 800 1000 0.9 1.1 1.3

    n_eff = 5 sigma 200 400 600 800 1000 -500 0 500 n_eff = 2 a2 200 400 600 800 1000 -500 0 500 n_eff = 2 a1 200 400 600 800 1000 0.9 1.1 1.3 n_eff = 287 sigma 200 400 600 800 1000 -20 0 10 20 n_eff = 244 a2 200 400 600 800 1000 -20 0 10 20 n_eff = 245 a1 'ĶĴłĿIJ ƑƉƉ -Fę DPMVNO " DIBJO XJUI XBOEFSJOH QBSBNFUFST ǎ BOE Ǐ
  58. Homeward • Updated book PDF (20 Jan) & rethinking 1.82

    • data(Wines2012): Interactions and Markov chains • Next week: Maximizing Entropy for Inferential Justice Generalized Linear Models (GLMs)