Typical Sets: What They Are and How to (Hopefully) Find Them

Josh Speagle
September 20, 2017

Although typical sets are important in understanding how/why sampling algorithms (do not) work, they are rarely taught when most astronomers are introduced to sampling methods such as Markov Chain Monte Carlo (MCMC). I introduce the idea of typical sets using some basic examples and show why they make sampling difficult in higher dimensions. I then outline how their behavior shapes various MCMC algorithms such as (Adaptive) Metropolis-Hastings, ensemble sampling, and Hamiltonian Monte Carlo. See https://github.com/joshspeagle/typical_sets for additional resources.

Transcript

  1. Typical Sets: What They Are and How to (Hopefully) Find Them. Josh Speagle, jspeagle@cfa.harvard.edu. Based on this talk by Michael Betancourt at StanCon.
  2. Intended Audience • Some experience with the basics of Bayesian statistics.
  3. Intended Audience • Some experience with the basics of Bayesian statistics. • Some experience using MCMC for research.
  4. Intended Audience • Some experience with the basics of Bayesian statistics. • Some experience using MCMC for research. • Have heard of ensemble sampling methods such as emcee.
  5. Bayesian Inference

  6. Bayesian Inference. Bayes’ Theorem: $\Pr(\Theta \mid D, M) = \dfrac{\Pr(D \mid \Theta, M)\,\Pr(\Theta \mid M)}{\Pr(D \mid M)}$
  7. Bayesian Inference. Bayes’ Theorem (same equation). Labeled: Parameters ($\Theta$).
  8. Bayesian Inference. Bayes’ Theorem (same equation). Labeled: Data ($D$), Parameters ($\Theta$).
  9. Bayesian Inference. Bayes’ Theorem (same equation). Labeled: Data ($D$), Parameters ($\Theta$), Model ($M$).
  10. Bayesian Inference. Bayes’ Theorem (same equation).
  11. Bayesian Inference. Bayes’ Theorem (same equation). Labeled: Prior, $\Pr(\Theta \mid M)$.
  12. Bayesian Inference. Bayes’ Theorem (same equation). Labeled: Prior, $\Pr(\Theta \mid M)$; Likelihood, $\Pr(D \mid \Theta, M)$.
  13. Bayesian Inference. Bayes’ Theorem (same equation). Labeled: Prior, $\Pr(\Theta \mid M)$; Likelihood, $\Pr(D \mid \Theta, M)$; Posterior, $\Pr(\Theta \mid D, M)$.
  14. Bayesian Inference. Bayes’ Theorem (same equation). Labeled: Prior, $\Pr(\Theta \mid M)$; Likelihood, $\Pr(D \mid \Theta, M)$; Posterior, $\Pr(\Theta \mid D, M)$; Evidence, $\Pr(D \mid M)$.
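  15. Bayesian Inference. Bayes’ Theorem: $P(\Theta) = \dfrac{\mathcal{L}(\Theta)\,\pi(\Theta)}{\mathcal{Z}}$, with $\mathcal{Z} \equiv \int_{\Omega} \mathcal{L}(\Theta)\,\pi(\Theta)\,d\Theta$. Labeled: Posterior, Likelihood, Prior, Evidence.
  16. Bayesian Inference. Bayes’ Theorem: $P(\Theta) = \dfrac{\mathcal{L}(\Theta)\,\pi(\Theta)}{\mathcal{Z}}$, with $\mathcal{Z} \equiv \int_{\Omega} \mathcal{L}(\Theta)\,\pi(\Theta)\,d\Theta$. Labeled: Posterior, Likelihood, Prior, Evidence.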
  17. Where is the posterior? ≡ Ω

  18. Where is the posterior? ≡ {: =}

  19. Where is the posterior? ≡ 0 ∞

  20. Where is the posterior? ≡ 0 ∞ =

  21. Where is the posterior? ≡ 0 ∞ “Amplitude” “Volume” =

  22. = Where is the posterior? ≡ 0 ∞ “Typical Set”

  23. Typical Sets: Gaussian Example

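  24. Typical Sets: Gaussian Example. $\propto \int_0^\infty e^{-r^2/2}\,dV(r)$
  25. Typical Sets: Gaussian Example. $\propto \int_0^\infty e^{-r^2/2}\,dV(r) \propto \int_0^\infty e^{-r^2/2}\,r^{D-1}\,dr$
  26. Typical Sets: Gaussian Example. Labeled: Typical Distance.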
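
The Gaussian example can be checked numerically. The sketch below is not from the talk: it assumes a D-dimensional standard normal as the target, so the radial integrand is $r^{D-1} e^{-r^2/2}$, and the function names are hypothetical. It draws samples and looks at how far they sit from the mode.

```python
import numpy as np

def radial_density_mode(ndim):
    """Radius maximizing r^(D-1) * exp(-r^2 / 2), where volume and amplitude balance."""
    return np.sqrt(ndim - 1)

def sample_radii(ndim, nsamples=10000, seed=0):
    """Distances from the origin of draws from a D-dimensional standard normal."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((nsamples, ndim))
    return np.linalg.norm(x, axis=1)

for ndim in (1, 2, 10, 100):
    r = sample_radii(ndim)
    # The mean radius grows roughly like sqrt(D), while the spread stays of order one.
    print(ndim, radial_density_mode(ndim), r.mean(), r.std())
```

In this setup the samples move away from the mode as D grows and concentrate in a thin band of radii, which is the "shell" picture used in the slides that follow.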

  27. = Where is the posterior? ≡ 0 ∞ “Typical Set”

  28. = Where is the posterior? ≡ 0 ∞ “Typical Set”

  29. = Where is the posterior? ≡ 0 ∞ “Typical Set”

  30. = Where is the posterior? ≡ 0 ∞ “Typical Set”

    MCMC wants to draw samples from this “shell”
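  31. Tension in the Metropolis Update. $A(\Theta' \mid \Theta) = \min\left[1, \dfrac{P(\Theta')}{P(\Theta)}\,\dfrac{Q(\Theta \mid \Theta')}{Q(\Theta' \mid \Theta)}\right]$
  32. Tension in the Metropolis Update (same equation). Labeled: Proposal.
  33. Tension in the Metropolis Update (same equation). Labeled: “Volume”.
  34. Tension in the Metropolis Update (same equation). Labeled: “Volume”, “Amplitude”.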
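
A small numerical illustration of this tension, as a sketch under assumptions that are not in the slides: a 100-dimensional standard normal target, a symmetric Gaussian proposal with a hypothetical step size of 0.1, and a starting point placed at radius $\sqrt{D}$. The acceptance probability is the usual Metropolis-Hastings form, the minimum of 1 and the posterior ratio times the reverse-to-forward proposal ratio.

```python
import numpy as np

def log_target(x):
    """Log-density (up to a constant) of a standard normal; an assumed toy target."""
    return -0.5 * np.sum(x**2, axis=-1)

def acceptance_prob(logp_new, logp_old, logq_reverse=0.0, logq_forward=0.0):
    """Metropolis-Hastings acceptance probability, computed in log space."""
    log_ratio = (logp_new - logp_old) + (logq_reverse - logq_forward)
    return np.exp(np.minimum(0.0, log_ratio))

def outward_vs_accept(ndim=100, step=0.1, nprop=20000, seed=1):
    """Propose many moves from one point at radius sqrt(D)."""
    rng = np.random.default_rng(seed)
    x = np.zeros(ndim)
    x[0] = np.sqrt(ndim)
    proposals = x + step * rng.standard_normal((nprop, ndim))
    # "Volume": more of the proposal's mass lies at larger radius.
    outward = np.sum(proposals**2, axis=1) > np.sum(x**2)
    # "Amplitude": the target density falls with radius, so outward moves are penalized.
    accept = acceptance_prob(log_target(proposals), log_target(x))
    return outward, accept

outward, accept = outward_vs_accept()
print("fraction of proposals moving outward:", outward.mean())
print("mean acceptance, inward moves:", accept[~outward].mean())
print("mean acceptance, outward moves:", accept[outward].mean())
```

Most proposals land farther from the mode because there is more volume there, while the acceptance step favors the inward moves. The balance between the two is what keeps the chain in the shell.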
  35. Metropolis-Hastings

  36. Metropolis-Hastings ′ = Normal ′ = , =

  37. Metropolis-Hastings ′ = Normal ′ = , = Typical Distance

  38. Metropolis-Hastings ′ = Normal ′ = , = Typical Distance

  39. Metropolis-Hastings ′ = Normal ′ = , =

  40. Metropolis-Hastings ′ = Normal ′ = , =

  41. Ideal Metropolis-Hastings ′ = Normal ′ = , = Typical

    Separation
  42. Ideal Metropolis-Hastings ′ = Normal ′ = , = Typical

    Separation M-H
  43. Ideal Metropolis-Hastings ′ = Normal ′ = , = s

    Typical Separation Adaptive M-H
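
A minimal random-walk Metropolis sketch shows how the proposal scale interacts with dimension. The target (a standard normal), the isotropic proposal, and the step sizes are assumptions for illustration; the covariance used on the slides is not recoverable from the transcript. The $2.38/\sqrt{D}$ scaling is the standard optimal-scaling result for this kind of target, not something taken from the talk.

```python
import numpy as np

def log_target(x):
    """Log-density (up to a constant) of a standard normal; an assumed toy target."""
    return -0.5 * np.dot(x, x)

def random_walk_metropolis(ndim, step, nsteps=2000, seed=0):
    """Random-walk Metropolis with proposal Normal(current position, step^2 * I)."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(ndim)  # start from a draw of the target itself
    logp = log_target(x)
    chain = np.empty((nsteps, ndim))
    naccept = 0
    for i in range(nsteps):
        prop = x + step * rng.standard_normal(ndim)
        logp_prop = log_target(prop)
        # Symmetric proposal, so the acceptance ratio reduces to the density ratio.
        if np.log(rng.uniform()) < logp_prop - logp:
            x, logp = prop, logp_prop
            naccept += 1
        chain[i] = x
    return chain, naccept / nsteps

for ndim in (2, 20, 200):
    _, acc_fixed = random_walk_metropolis(ndim, step=1.0)
    _, acc_scaled = random_walk_metropolis(ndim, step=2.38 / np.sqrt(ndim))
    print(ndim, acc_fixed, acc_scaled)
```

With a fixed step the acceptance fraction collapses as D grows. Shrinking the step keeps proposals inside the shell, at the cost of moving only a small fraction of the typical separation per step.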
  44. Ensemble Sampling

  45. Ensemble Sampling

  46. Ensemble Sampling

  47. Ensemble Sampling

  48. Ensemble Sampling

  49. Ensemble Sampling

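  50. emcee. $A(\Theta' \mid \Theta) = \min\left[1, Z^{D-1}\,\dfrac{P(\Theta')}{P(\Theta)}\right]$, where the “Stretch” factor $Z \sim g(z) \propto 1/\sqrt{z}$ for $z$ from $1/a$ to $a$, and $0$ otherwise.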
  51. Ideal Typical Separation emcee M-H

  52. Ideal Typical Separation emcee M-H emcee

  53. Ideal Typical Separation emcee M-H emcee

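  54. Ideal Typical Separation emcee M-H emcee After weighting by acceptance probability
  55. emcee. $A(\Theta' \mid \Theta) = \min\left[1, Z^{D-1}\,\dfrac{P(\Theta')}{P(\Theta)}\right]$, where the “Stretch” factor $Z \sim g(z) \propto 1/\sqrt{z}$ for $z$ from $1/a$ to $a$, and $0$ otherwise.
  56. emcee. $A(\Theta' \mid \Theta) = \min\left[1, Z^{D-1}\,\dfrac{P(\Theta')}{P(\Theta)}\right]$, where the “Stretch” factor $Z \sim g(z) \propto 1/\sqrt{z}$ for $z$ from $1/a$ to $a$, and $0$ otherwise.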
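
The stretch move can be sketched in a few lines. This is a simplified serial version of the affine-invariant move, written for illustration; it is not emcee's own implementation, and the function names, the standard normal target, and the choice $a = 2$ are assumptions. The acceptance probability is the minimum of 1 and $z^{D-1}$ times the posterior ratio, with $z$ drawn from a density proportional to $1/\sqrt{z}$ on $[1/a, a]$.

```python
import numpy as np

def log_target(x):
    """Log-density (up to a constant) of a standard normal; an assumed toy target."""
    return -0.5 * np.sum(x**2, axis=-1)

def sample_stretch(a, size, rng):
    """Draw z with density proportional to 1/sqrt(z) on [1/a, a] by inverting the CDF."""
    u = rng.uniform(size=size)
    return ((a - 1.0) * u + 1.0) ** 2 / a

def stretch_sweep(walkers, a, rng):
    """Update every walker once, in place, and return the acceptance fraction."""
    nwalkers, ndim = walkers.shape
    naccept = 0
    for k in range(nwalkers):
        # Pick a different walker j to define the stretch direction.
        j = rng.integers(nwalkers - 1)
        if j >= k:
            j += 1
        z = sample_stretch(a, None, rng)
        prop = walkers[j] + z * (walkers[k] - walkers[j])
        # The z^(D-1) factor accounts for the volume change of the stretch.
        log_ratio = (ndim - 1) * np.log(z) + log_target(prop) - log_target(walkers[k])
        if np.log(rng.uniform()) < log_ratio:
            walkers[k] = prop
            naccept += 1
    return naccept / nwalkers

rng = np.random.default_rng(0)
for ndim in (2, 20, 100):
    walkers = rng.standard_normal((2 * ndim + 2, ndim))
    fracs = [stretch_sweep(walkers, 2.0, rng) for _ in range(100)]
    print(ndim, np.mean(fracs))
```

Because the proposal is a rescaling along the line between two walkers, the $z^{D-1}$ term grows or shrinks very quickly in high dimensions, which is where the volume part of the tension enters for this sampler.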
  57. Summary • Volume scales as $r^D$. • The posterior density depends on both volume and amplitude. • Most of the posterior is concentrated in a “shell” around the best solution called the typical set. • MCMC draws samples from the typical set.
  58. But what about corner plots?

  59. But what about corner plots? 2-dimensional projection of D-dimensional shell

  60. But what about corner plots? 2-dimensional projection of D-dimensional shell

  61. But what about corner plots? 2-dimensional projection of D-dimensional shell
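
The projection effect is easy to reproduce. In the sketch below, which assumes a 100-dimensional standard normal as a stand-in for the posterior and uses a hypothetical helper name, every sample sits far from the mode in the full space, yet the first two coordinates look like an ordinary 2-D Gaussian blob centered on it.

```python
import numpy as np

def shell_and_projection(ndim=100, nsamples=20000, seed=0):
    """Radii of standard normal draws in the full space and in a 2-D projection."""
    rng = np.random.default_rng(seed)
    samples = rng.standard_normal((nsamples, ndim))
    r_full = np.linalg.norm(samples, axis=1)         # distance in all D dimensions
    r_proj = np.linalg.norm(samples[:, :2], axis=1)  # distance in one corner-plot panel
    return r_full, r_proj

r_full, r_proj = shell_and_projection()
print("smallest full-space radius:", r_full.min())
print("median projected radius:", np.median(r_proj))
print("fraction of projected points within r < 1:", (r_proj < 1.0).mean())
```

A corner plot only shows marginals like the projected radii here, so a filled-in 2-D contour is consistent with the samples living on a shell in the full space.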

  62. Hamiltonian Monte Carlo

  63. Hamiltonian Monte Carlo

  64. Hamiltonian Monte Carlo

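  65. Hamiltonian Monte Carlo. Treat the particle at position q as a point mass with mass matrix M and momentum p. Hamiltonian: $\Pr(q, p) \propto e^{-H(q, p)} = P(q)\,\exp\left(-\tfrac{1}{2}\,p^{T} M^{-1} p\right)$
  66. Hamiltonian Monte Carlo. Treat the particle at position q as a point mass with mass matrix M and momentum p. Hamiltonian: $\Pr(q, p) \propto e^{-H(q, p)} = P(q)\,\exp\left(-\tfrac{1}{2}\,p^{T} M^{-1} p\right)$. Hamilton’s Equations: $\dfrac{dq}{dt} = \dfrac{\partial H}{\partial p} = M^{-1} p$, $\dfrac{dp}{dt} = -\dfrac{\partial H}{\partial q} = \nabla \ln P(q)$
  67. Hamiltonian Monte Carlo. $A(q', -p' \mid q, p) = \min\left[1, \dfrac{\Pr(q', -p')}{\Pr(q, p)}\right]$, with $p \sim \mathrm{Normal}(\mu = 0, \Sigma = M)$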
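  68. Hamiltonian Monte Carlo. $p \sim \mathrm{Normal}(\mu = 0, \Sigma = M)$. Labeled: Typical Distance.
  69. Hamiltonian Monte Carlo. $p \sim \mathrm{Normal}(\mu = 0, \Sigma = M)$. Labeled: Typical Distance.
  70. Hamiltonian Monte Carlo. $p \sim \mathrm{Normal}(\mu = 0, \Sigma = M)$. Labeled: Ideal, Typical Separation, M-H, emcee.
  71. Hamiltonian Monte Carlo. $p \sim \mathrm{Normal}(\mu = 0, \Sigma = M)$. Labeled: Ideal, Typical Separation, M-H, emcee, HMC.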
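
A compact HMC sketch makes the comparison with random-walk proposals concrete. It assumes a standard normal target (so $\nabla \ln P(q) = -q$), an identity mass matrix, a leapfrog integrator, and hypothetical step settings; none of these choices come from the slides. The Hamiltonian is $H(q, p) = -\ln P(q) + \tfrac{1}{2} p^T p$ and a trajectory is accepted with probability $\min[1, e^{H(q, p) - H(q', -p')}]$.

```python
import numpy as np

def grad_log_target(q):
    """Gradient of ln P(q) for a standard normal; an assumed toy target."""
    return -q

def hamiltonian(q, p):
    """H = -ln P(q) + p.p / 2 with identity mass matrix (constants dropped)."""
    return 0.5 * q @ q + 0.5 * p @ p

def leapfrog(q, p, eps, nleap):
    """Integrate Hamilton's equations for nleap steps of size eps."""
    q, p = q.copy(), p.copy()
    p += 0.5 * eps * grad_log_target(q)   # initial half step in momentum
    for _ in range(nleap - 1):
        q += eps * p                      # full step in position
        p += eps * grad_log_target(q)     # full step in momentum
    q += eps * p
    p += 0.5 * eps * grad_log_target(q)   # final half step in momentum
    return q, p

def run_hmc(ndim=100, eps=0.1, nleap=15, nsteps=200, seed=0):
    """Return the acceptance fraction and the mean distance moved per step."""
    rng = np.random.default_rng(seed)
    q = rng.standard_normal(ndim)
    naccept, jumps = 0, []
    for _ in range(nsteps):
        p = rng.standard_normal(ndim)     # resample momentum, p ~ Normal(0, I)
        q_new, p_new = leapfrog(q, p, eps, nleap)
        log_accept = hamiltonian(q, p) - hamiltonian(q_new, -p_new)
        if np.log(rng.uniform()) < log_accept:
            jumps.append(np.linalg.norm(q_new - q))
            q = q_new
            naccept += 1
        else:
            jumps.append(0.0)
    return naccept / nsteps, np.mean(jumps)

acc, jump = run_hmc()
print("acceptance fraction:", acc)
print("mean distance moved per step:", jump)
```

In this toy setup the trajectories follow the shell instead of stepping off it, so each accepted move covers a distance comparable to the typical separation while the acceptance fraction stays high.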