
An Introduction to (Dynamic) Nested Sampling

Josh Speagle
September 26, 2017


Nested Sampling is a relatively new method for estimating the Bayesian evidence (with the posterior estimated as a byproduct) that integrates over the posterior by sampling in nested "shells" of constant likelihood. Its ability to sample from complex, multi-modal distributions in a flexible yet efficient way, combined with the availability of several sampling packages, has contributed to its growing popularity in (astro)physics. In this talk I outline the basic motivation and theory behind Nested Sampling, derive various statistical properties associated with the method, and discuss how it is applied in practice. I also talk about how the overall framework can be extended in Dynamic Nested Sampling to accommodate adding samples "dynamically" during the course of a run. These samples can be allocated to maximize arbitrary objective functions, allowing Dynamic Nested Sampling to function as a posterior-oriented sampling method like MCMC, but with the added benefit of well-defined stopping criteria. I end with an application of Dynamic Nested Sampling to a variety of synthetic and real-world problems using an open-source Python package I've been developing (https://github.com/joshspeagle/dynesty/).


Transcript

  1. Background: Bayes' Theorem, $\Pr(\Theta \mid D, M) = \dfrac{\Pr(D \mid \Theta, M)\,\Pr(\Theta \mid M)}{\Pr(D \mid M)}$.
  2. Background: Bayes' Theorem, highlighting the prior $\Pr(\Theta \mid M)$.
  3. Background: Bayes' Theorem, highlighting the prior and the likelihood $\Pr(D \mid \Theta, M)$.
  4. Background: Bayes' Theorem, highlighting the prior, likelihood, and posterior $\Pr(\Theta \mid D, M)$.
  5. Background: Bayes' Theorem, highlighting the prior, likelihood, posterior, and evidence $\Pr(D \mid M)$.
  6. Posterior Estimation via Sampling: a set of samples $\{\Theta_N, \Theta_{N-1}, \dots, \Theta_2, \Theta_1\} \in \Omega_\Theta$ together with weights $\{w_N, w_{N-1}, \dots, w_2, w_1\}$ gives the estimate $P(\Theta) \approx \sum_{i=1}^{N} w_i\,\delta(\Theta - \Theta_i)$. Samples + Weights.
  7. Posterior Estimation via Sampling: MCMC returns equally weighted samples, $\{1, 1, \dots, 1, 1\}$.
  8. Posterior Estimation via Sampling: Importance Sampling returns sample-dependent weights $\{w(\Theta_N), w(\Theta_{N-1}), \dots, w(\Theta_2), w(\Theta_1)\}$.
  9. Posterior Estimation via Sampling: what are the samples and weights for Nested Sampling?
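Whatever the sampler, the posterior enters downstream analysis only through this samples-plus-weights representation. A minimal sketch of that idea (helper names and example distributions are mine, not from the talk):

```python
# A minimal sketch: posterior expectations from samples plus weights.
# Any sampler that returns {theta_i} and {w_i} estimates
#   E_P[f] ~= sum_i w_i f(theta_i) / sum_i w_i,
# with MCMC being the equal-weight special case.
import numpy as np

def weighted_expectation(samples, weights, f=lambda x: x):
    """Estimate E_P[f(theta)] from samples and (unnormalized) weights."""
    p = np.asarray(weights, dtype=float)
    p = p / p.sum()                              # normalized importance weights p_i
    vals = np.array([f(s) for s in samples])
    return np.sum(p * vals)

rng = np.random.default_rng(0)

# "MCMC-like": equal weights on draws from the posterior (here a unit Gaussian).
post = rng.normal(size=20_000)
print(weighted_expectation(post, np.ones_like(post), f=lambda x: x**2))   # ~1.0

# "Importance-sampling-like": draws from a wide N(0, 2) proposal, reweighted
# by the (unnormalized) target/proposal density ratio exp(-3 x^2 / 8).
draws = rng.normal(scale=2.0, size=20_000)
print(weighted_expectation(draws, np.exp(-0.375 * draws**2), f=lambda x: x**2))  # ~1.0
```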
  10. Motivation: Integrating the Posterior. The evidence is $Z \equiv \int_{\Omega_\Theta} \mathcal{L}(\Theta)\,\pi(\Theta)\,d\Theta$; define the "prior volume" $X(\lambda) \equiv \int_{\Theta:\,\mathcal{L}(\Theta) > \lambda} \pi(\Theta)\,d\Theta$. [Feroz et al. (2013)]
  11. Motivation: Integrating the Posterior. In terms of the prior volume, $Z = \int_0^\infty X(\lambda)\,d\lambda$.
  12. Motivation: Integrating the Posterior. Equivalently, $Z = \int_0^1 \mathcal{L}(X)\,dX$: a one-dimensional integral over the prior volume.
  13. Motivation: Integrating the Posterior. Numerically, $Z \approx \sum_{i=1}^{N} \mathcal{L}_i\,\Delta X_i$, with amplitude $\mathcal{L}_i$ and differential volume $\Delta X_i$; the $\Delta X_i$ can be rectangles, trapezoids, etc.
  14. Motivation: Integrating the Posterior. $Z \approx \sum_{i=1}^{N} w_i$ with $w_i = \mathcal{L}_i\,\Delta X_i$.
  15. Motivation: Integrating the Posterior. The weights $w_i = \mathcal{L}_i\,\Delta X_i$ are directly proportional to the typical set, and the importance weights $p_i = w_i / Z$ mean we get posteriors "for free".
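A small illustrative sketch (not from the talk; the function name and the trapezoid/rectangle choice are mine) of turning a sequence of likelihoods and prior volumes into the sum $Z \approx \sum_i \mathcal{L}_i\,\Delta X_i$:

```python
# A small sketch: estimate Z ~= sum_i L_i * dX_i from a sequence of
# (likelihood, prior volume) pairs with 1 = X_0 > X_1 > X_2 > ... > X_N.
import numpy as np

def evidence_from_volumes(log_like, volumes, rule="trapezoid"):
    """Approximate Z = int_0^1 L(X) dX given ln L_i at decreasing volumes X_i."""
    L = np.exp(np.asarray(log_like, dtype=float))
    X = np.concatenate([[1.0], np.asarray(volumes, dtype=float)])  # prepend X_0 = 1
    dX = X[:-1] - X[1:]                          # differential volumes dX_i > 0
    if rule == "trapezoid":
        L_edge = np.concatenate([[0.0], L])      # take L ~ 0 at the prior edge X_0
        amp = 0.5 * (L_edge[:-1] + L_edge[1:])   # trapezoidal amplitudes
    else:                                        # simple rectangles
        amp = L
    w = amp * dX                                 # importance weights w_i = L_i dX_i
    return w.sum(), w

# Analytic check: L(X) = exp(-X) gives Z = 1 - exp(-1) ~= 0.632.
X_grid = np.linspace(0.999, 0.001, 500)
Z_hat, w = evidence_from_volumes(-X_grid, X_grid)
print(Z_hat, 1.0 - np.exp(-1.0))
```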
  16. Motivation: Sampling the Posterior. Sampling directly from the likelihood $\mathcal{L}$ is hard. [Pictures from this 2010 talk by Skilling.]
  17. Motivation: Sampling the Posterior. Sampling uniformly within the bound $\mathcal{L} > \lambda$ is easier. [Pictures from this 2010 talk by Skilling.]
  18. Motivation: Sampling the Posterior. Sampling uniformly within the bound $\mathcal{L} > \lambda_{i-1}$ is easier.
  19. Motivation: Sampling the Posterior. Sampling uniformly within the bound $\mathcal{L} > \lambda_{i-1}$ is easier.
  20. Motivation: Sampling the Posterior. Sampling uniformly within the bound $\mathcal{L} > \lambda_{i-1}$ is easier.
  21. Motivation: Sampling the Posterior. Sampling uniformly within the bound $\mathcal{L} > \lambda_{i+1}$ is easier.
  22. Motivation: Sampling the Posterior. Sampling uniformly within the bound $\mathcal{L} > \lambda_{i+1}$ is easier. MCMC: solving a Hard Problem once, vs. Nested Sampling: solving an Easier Problem many times.
  23. Estimating the Prior Volume. $Z \approx \sum_{i=1}^{N} w_i$ with $w_i = \mathcal{L}_i\,\Delta X_i$ and $X(\lambda) \equiv \int_{\Theta:\,\mathcal{L}(\Theta) > \lambda} \pi(\Theta)\,d\Theta$ the "prior volume". [Feroz et al. (2013)]
  24. Estimating the Prior Volume. The likelihood values $\mathcal{L}_i$ are known (we evaluate them at each sample).
  25. Estimating the Prior Volume. But the differential volumes $\Delta X_i$ are not: ???
  26. Estimating the Prior Volume. Probability Integral Transform: if $\Theta \sim \pi$ (PDF), then $X = F(\Theta) \sim \mathrm{Unif}(0, 1)$, where $F$ is the corresponding CDF.
  27. Estimating the Prior Volume. Probability Integral Transform, as above: the prior volumes behave like uniform random variables. We need to sample from the constrained prior.
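To make the probability integral transform concrete, here is a hedged sketch of a "prior transform" that maps uniform draws on the unit cube to prior draws via inverse CDFs (the example distributions are arbitrary choices of mine; this is also the style in which priors are specified for samplers like dynesty, though the exact interface belongs to that package's documentation):

```python
# Sketch of the probability integral transform: if u ~ Unif(0, 1), then
# theta = F^{-1}(u) follows the distribution with CDF F. The two example
# priors below are arbitrary choices for illustration.
import numpy as np
from scipy import stats

def prior_transform(u):
    """Map a point u in the unit square to a 2-D prior:
    theta_0 ~ Uniform(-10, 10), theta_1 ~ Normal(0, 3)."""
    x0 = -10.0 + 20.0 * u[0]                        # inverse CDF of Uniform(-10, 10)
    x1 = stats.norm.ppf(u[1], loc=0.0, scale=3.0)   # inverse CDF of Normal(0, 3)
    return np.array([x0, x1])

# Uniform draws on the unit square become prior draws after the transform.
rng = np.random.default_rng(1)
print(np.array([prior_transform(u) for u in rng.uniform(size=(5, 2))]))
```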
  28. Estimating the Prior Volume. Points drawn uniformly within successive likelihood bounds give volumes that shrink as $X_{i+1} = \prod_{j=0}^{i} U_j\,X_0$ with $U_0, \dots, U_i \sim \mathrm{Unif}(0, 1)$ i.i.d. (illustrated on the posterior). [Pictures from this 2010 talk by Skilling.]
  29. Estimating the Prior Volume. $X_{i+1} = \prod_{j=0}^{i} U_j\,X_0$, $U_0, \dots, U_i \sim \mathrm{Unif}(0,1)$ i.i.d., with $X_0 \equiv 1$ (the full prior).
  30. Estimating the Prior Volume. $X_{i+1} = \prod_{j=0}^{i} U_j\,X_0$, $U_0, \dots, U_i \sim \mathrm{Unif}(0,1)$ i.i.d., $X_0 \equiv 1$.
  31. Estimating the Prior Volume. The $\Delta X_i$ in $Z \approx \sum_{i=1}^{N} w_i = \sum_i \mathcal{L}_i\,\Delta X_i$ therefore remain unknown individually (???)...
  32. Estimating the Prior Volume. ...but they follow a known joint distribution $\Pr(X_1, X_2, \dots, X_N)$ whose 1st and 2nd moments can be computed.
  33. Nested Sampling Algorithm. Build an ordered likelihood ladder $\mathcal{L}_N > \mathcal{L}_{N-1} > \dots > \mathcal{L}_2 > \mathcal{L}_1 > 0$ with associated samples $\Theta_1, \Theta_2, \dots, \Theta_{N-1}$.
  34. Nested Sampling Algorithm (Ideal). $\mathcal{L}_N > \mathcal{L}_{N-1} > \dots > \mathcal{L}_2 > \mathcal{L}_1 > 0$: samples are sequentially drawn from the constrained prior $\pi(\Theta \mid \mathcal{L} > \mathcal{L}_i)$.
  35. Nested Sampling Algorithm (Naïve). $\Theta_1 \sim \pi$: 1. Samples are sequentially drawn from the prior $\pi$. 2. A new point is only accepted if $\mathcal{L}_{i+1} > \mathcal{L}_i$.
  36. Nested Sampling Algorithm (Naïve). $\Theta_1 \sim \pi$, then $\Theta_2$: 1. Samples are sequentially drawn from the prior $\pi$. 2. A new point is only accepted if $\mathcal{L}_{i+1} > \mathcal{L}_i$.
  37. Nested Sampling Algorithm (Naïve). Continuing up the ladder to $\Theta_{N-1}$: 1. Samples are sequentially drawn from the prior $\pi$. 2. A new point is only accepted if $\mathcal{L}_{i+1} > \mathcal{L}_i$.
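A toy, deliberately inefficient sketch of the naïve scheme just described (the likelihood, prior, and loop structure are illustrative choices of mine):

```python
# Toy sketch of the naive scheme: draw from the prior, keep a point only if its
# likelihood beats the current threshold. Exponentially inefficient, but it
# produces the ordered ladder L_1 < L_2 < ... < L_N described above.
import numpy as np

rng = np.random.default_rng(2)

def log_like(theta):                    # toy 1-D Gaussian likelihood
    return -0.5 * theta**2

def sample_prior():                     # toy prior: Uniform(-10, 10)
    return rng.uniform(-10.0, 10.0)

ladder = []                             # accepted (theta, ln L) pairs
logl_current = -np.inf
n_tries = 100_000
for _ in range(n_tries):
    theta = sample_prior()
    logl = log_like(theta)
    if logl > logl_current:             # accept only strictly increasing likelihoods
        ladder.append((theta, logl))
        logl_current = logl

print(f"accepted {len(ladder)} points out of {n_tries} prior draws")
```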
  38. Components of the Algorithm: 1. Adding more particles. 2. Knowing when to stop. 3. What to do after stopping.
  39. Adding More Particles. A single run with one live point gives a ladder $\mathcal{L}_{N_1}^{(1)} > \dots > \mathcal{L}_2^{(1)} > \mathcal{L}_1^{(1)} > 0$. [Skilling (2006)]
  40. Adding More Particles. Its prior volumes shrink as $X_{i+1}^{(1)} = \prod_{j=0}^{i} U_j^{(1)}$. [Skilling (2006)]
  41. Adding More Particles. A second, independent run gives its own ladder $\mathcal{L}_{N_2}^{(2)} > \dots > \mathcal{L}_2^{(2)} > \mathcal{L}_1^{(2)} > 0$. [Skilling (2006)]
  42. Adding More Particles. With $X_{i+1}^{(1)} = \prod_{j=0}^{i} U_j^{(1)}$ and $X_{i+1}^{(2)} = \prod_{j=0}^{i} U_j^{(2)}$. [Skilling (2006)]
  43. Adding More Particles. The two runs can be merged into a single interleaved ladder, e.g. $\mathcal{L}_{N_1}^{(1)} > \mathcal{L}_{N_2}^{(2)} > \dots > \mathcal{L}_2^{(1)} > \mathcal{L}_2^{(2)} > \mathcal{L}_1^{(2)} > \mathcal{L}_1^{(1)} > 0$. [Skilling (2006)]
  44. Adding More Particles. The merged ladder $\mathcal{L}_{N_1}^{(1)} > \mathcal{L}_{N_2}^{(2)} > \dots > \mathcal{L}_1^{(2)} > \mathcal{L}_1^{(1)} > 0$. [Skilling (2006)]
  45. Adding More Particles. The merged ladder $\mathcal{L}_{N_1}^{(1)} > \mathcal{L}_{N_2}^{(2)} > \dots > \mathcal{L}_1^{(2)} > \mathcal{L}_1^{(1)} > 0$. [Skilling (2006)]
  46. Adding More Particles. The merged ladder $\mathcal{L}_{N_1}^{(1)} > \mathcal{L}_{N_2}^{(2)} > \dots > \mathcal{L}_1^{(2)} > \mathcal{L}_1^{(1)} > 0$. [Skilling (2006)]
  47. Adding More Particles. The merged ladder $\mathcal{L}_{N_1}^{(1)} > \mathcal{L}_{N_2}^{(2)} > \dots > \mathcal{L}_1^{(2)} > \mathcal{L}_1^{(1)} > 0$. [Skilling (2006)]
  48. Adding More Particles. For each individual run, $\ln X_{i+1} = \sum_{j=0}^{i} \ln U_j$ with $U_0, \dots, U_i \sim \mathrm{Unif}(0,1)$. [Skilling (2006)]
  49. Adding More Particles. For the merged run, each shrinkage factor is the larger of two uniforms: $u_1, u_2 \sim \mathrm{Unif}(0,1) \Rightarrow u_{(1)}, u_{(2)}$ (order statistics), i.e. $U_j \sim u_{(2)}$. [Skilling (2006)]
  50. Adding More Particles. Equivalently, $\ln X_{i+1} = \sum_{j=0}^{i} \ln U_j$ with $U_0, \dots, U_i \sim \mathrm{Beta}(2, 1)$. [Skilling (2006)]
  51. Adding More Particles. The merged ladder $\mathcal{L}_{N_1}^{(1)} > \mathcal{L}_{N_2}^{(2)} > \dots > \mathcal{L}_1^{(2)} > \mathcal{L}_1^{(1)} > 0$. [Skilling (2006)]
  52. Adding More Particles. One run with 2 “live points” = 2 runs with 1 live point (live points vs. dead points). [Skilling (2006)]
  53. Adding More Particles. One run with K “live points” = K runs with 1 live point (live points vs. dead points). [Skilling (2006)]
  54. Adding More Particles. $\ln X_{i+1} = \sum_{j=0}^{i} \ln U_j$ with $U_j \sim \mathrm{Beta}(2, 1)$, from the order statistics $u_1, u_2 \sim \mathrm{Unif}(0,1) \Rightarrow u_{(1)}, u_{(2)}$ (live points vs. dead points). [Skilling (2006)]
  55. Adding More Particles. In general, $\ln X_{i+1} = \sum_{j=0}^{i} \ln U_j$ with $U_0, \dots, U_i \sim \mathrm{Beta}(K, 1)$, from the order statistics $u_1, \dots, u_K \sim \mathrm{Unif}(0,1) \Rightarrow u_{(1)}, \dots, u_{(K)}$. [Skilling (2006)]
  56. Adding More Particles. Since $\mathrm{E}[\ln U_j] = -1/K$, the expected shrinkage is $\mathrm{E}[\ln X_{i+1}] = -(i+1)/K$. [Skilling (2006)]
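A quick simulation (mine, as a sanity check rather than anything from the talk) of the shrinkage statistics quoted above:

```python
# Simulation sanity check: with K live points the per-iteration shrinkage is
# t = max(u_1, ..., u_K) ~ Beta(K, 1), so E[ln t] = -1/K and E[ln X_i] = -i/K.
import numpy as np

rng = np.random.default_rng(3)
K, n_iter, n_runs = 20, 200, 1000

t = rng.uniform(size=(n_runs, n_iter, K)).max(axis=-1)   # shrinkage factor per step
ln_X = np.log(t).cumsum(axis=1)                          # ln X_i = sum_j ln t_j

print(np.log(t).mean(), -1.0 / K)                        # E[ln t]       vs  -1/K
print(ln_X[:, -1].mean(), -n_iter / K)                   # E[ln X_N]     vs  -N/K
print(ln_X[:, -1].std(), np.sqrt(n_iter) / K)            # sigma(ln X_N) ~   sqrt(N)/K
```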
  57. “Recycling” the Final Set of Particles. At termination the remaining live points $\dots, \Theta_{N+1}, \dots \sim \pi(\Theta \mid \mathcal{L} > \mathcal{L}_N)$, so their relative prior volumes $u_1, \dots, u_K \sim \mathrm{Unif}(0, 1)$. Ladder so far: $\mathcal{L}_N > \mathcal{L}_{N-1} > \dots > \mathcal{L}_2 > \mathcal{L}_1 > 0$.
  58. “Recycling” the Final Set of Particles. Adding them extends the ladder: $\mathcal{L}_{N+K} > \dots > \mathcal{L}_{N+1} > \mathcal{L}_N > \mathcal{L}_{N-1} > \dots > \mathcal{L}_2 > \mathcal{L}_1 > 0$.
  59. “Recycling” the Final Set of Particles. The k-th recycled point has prior volume $X_{N+k} = X_N\,u_{(K-k+1)}$.
  60. “Recycling” the Final Set of Particles. That is, $\dots, X_{N+k}, \dots \sim X_N\,u_{(K-k+1)}$ with $u_1, \dots, u_K \sim \mathrm{Unif}(0,1)$ and $u_{(1)} < \dots < u_{(K)}$ the order statistics.
  61. “Recycling” the Final Set of Particles. These order statistics have a convenient form: the Rényi Representation.
  62. “Recycling” the Final Set of Particles. Rényi Representation: $u_{(k)} \sim \dfrac{\sum_{j=1}^{k} E_j}{\sum_{j=1}^{K+1} E_j}$ with $E_1, \dots, E_{K+1} \sim \mathrm{Expo}(1)$.
  63. “Recycling” the Final Set of Particles. Rényi Representation, illustrated with the exponential spacings $E_1, E_2, E_3, \dots, E_{K-1}, \dots$
  64. “Recycling” the Final Set of Particles. Rényi Representation, illustrated with the exponential spacings $E_1, E_2, E_3, \dots, E_{K+1}, \dots$
  65. “Recycling” the Final Set of Particles. Putting the two pieces together: $\ln X_{N+k} = \sum_{i=1}^{N} \ln\dfrac{K}{K+1} + \sum_{j=1}^{k} \ln\dfrac{K-j+1}{K-j+2}$ (exponential shrinkage + uniform shrinkage).
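A small sketch (names and the expected-shrinkage approximation are mine) of the combined bookkeeping on slide 65:

```python
# Sketch of the combined bookkeeping: N iterations of exponential shrinkage with
# K live points, followed by uniform shrinkage for the K recycled live points.
# (Expected-shrinkage approximation; the names are mine.)
import numpy as np

def log_volumes_with_recycling(N, K):
    """Return ln X_1, ..., ln X_{N+K}."""
    # Exponential shrinkage: each iteration multiplies X by ~ K/(K+1).
    ln_X_run = np.cumsum(np.full(N, np.log(K / (K + 1.0))))
    # Uniform shrinkage: the k-th recycled point has X_{N+k} ~ X_N (K-k+1)/(K+1),
    # i.e. successive factors (K-j+1)/(K-j+2) for j = 1, ..., K.
    j = np.arange(1, K + 1)
    ln_X_live = ln_X_run[-1] + np.cumsum(np.log((K - j + 1.0) / (K - j + 2.0)))
    return np.concatenate([ln_X_run, ln_X_live])

ln_X = log_volumes_with_recycling(N=1000, K=100)
print(ln_X[999], ln_X[-1])              # ln X_N and ln X_{N+K}
```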
  66. Nested Sampling Uncertainties. Statistical uncertainties: unknown prior volumes. [Pictures from this 2010 talk by Skilling.]
  67. Nested Sampling Uncertainties. Statistical uncertainties: unknown prior volumes. Sampling uncertainties: number of samples (counting), discrete point estimates for contours, particle path dependencies. [Pictures from this 2010 talk by Skilling.]
  68. Sampling Error: Poisson Uncertainties. How many iterations $N$ does it take to cross from the prior to the posterior? Roughly $N = (\text{“distance” from prior to posterior})/\Delta\ln X$, with the “distance” and $\sigma(N)$ still to be defined. [Based on Skilling (2006) and Keeton (2011)]
  69. Sampling Error: Poisson Uncertainties. Define $H \equiv \int_{\Omega_\Theta} P(\Theta)\,\ln\dfrac{P(\Theta)}{\pi(\Theta)}\,d\Theta$: the Kullback-Leibler divergence from $\pi$ to $P$, i.e. the “information gained”. [Based on Skilling (2006) and Keeton (2011)]
  70. Sampling Error: Poisson Uncertainties. $H \equiv \int_{\Omega_\Theta} P \ln\dfrac{P}{\pi}\,d\Theta = \dfrac{1}{Z}\int_0^1 \mathcal{L}(X)\,\ln\mathcal{L}(X)\,dX - \ln Z$. [Based on Skilling (2006) and Keeton (2011)]
  71. Sampling Error: Poisson Uncertainties. The bulk of the posterior sits at $\ln X \approx -H$, so $N = H/\Delta\ln X$.
  72. Sampling Error: Poisson Uncertainties. The number of steps needed to get there is approximately Poisson, so $\sigma(N) \sim \sqrt{N}$.
  73. Sampling Error: Poisson Uncertainties. Hence $\sigma^2(\ln X) \sim N\,(\Delta\ln X)^2$.
  74. Sampling Error: Poisson Uncertainties. That is, $\sigma(\ln X) \sim \sqrt{N}\,\Delta\ln X$.
  75. Sampling Error: Poisson Uncertainties. With $K$ live points $\Delta\ln X = 1/K$, so $N \sim KH \pm \sqrt{KH}$ and $\sigma(\ln Z) \sim \sqrt{H/K}$. [Based on Skilling (2006) and Keeton (2011)]
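A hedged sketch (function name mine) of estimating the information $H$ and the rough error $\sigma(\ln Z) \sim \sqrt{H/K}$ from a finished run, in the spirit of Skilling (2006) and Keeton (2011):

```python
# Sketch: estimate the information H and the rough evidence uncertainty
# sigma(ln Z) ~ sqrt(H / K) from a run's log-likelihoods and volume intervals.
import numpy as np

def evidence_and_information(log_like, d_volumes, n_live):
    """log_like: ln L_i; d_volumes: dX_i > 0; n_live: number of live points K."""
    logL = np.asarray(log_like, dtype=float)
    logw = logL + np.log(np.asarray(d_volumes, dtype=float))   # ln w_i = ln L_i + ln dX_i
    shift = logw.max()                                         # for numerical stability
    w = np.exp(logw - shift)
    lnZ = shift + np.log(w.sum())
    p = w / w.sum()                                            # posterior weights p_i
    H = np.sum(p * logL) - lnZ                                 # H ~= (1/Z) int L ln L dX - ln Z
    return lnZ, H, np.sqrt(H / n_live)                         # ln Z, H, sigma(ln Z)
```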
  76. Sampling Error: Monte Carlo Noise. The evidence as a functional of the run: $Z = \int_{\Omega_\Theta} \mathcal{L}(\Theta)\,\pi(\Theta)\,d\Theta = \int_0^1 \mathcal{L}(X)\,dX$. [Formalism following Higson et al. (2017) and Chopin and Robert (2010)]
  77. Sampling Error: Monte Carlo Noise. $Z = \int_0^1 \mathcal{L}(X)\,dX$, where the mapping $X(\mathcal{L})$ and its inverse $\mathcal{L}(X)$ relate likelihood levels to prior volumes.
  78. Sampling Error: Monte Carlo Noise. $Z = \int_0^1 \mathcal{L}(X)\,dX$ with $X \leftrightarrow \mathcal{L}$ as above (illustrated).
  79. Sampling Error: Monte Carlo Noise. $Z = \int_0^1 \mathcal{L}(X)\,dX$ with $X \leftrightarrow \mathcal{L}$ as above (illustrated).
  80. Sampling Error: Monte Carlo Noise. In practice we use the Monte Carlo estimator $Z \approx \sum_{i=1}^{N+K} w_i$, which carries its own noise from the particular samples realized. [Formalism following Higson et al. (2017) and Chopin and Robert (2010)]
  81. Exploring Sampling Uncertainties. One run with K “live points” = K runs with 1 live point! Full ladder: $\mathcal{L}_N > \mathcal{L}_{N-1} > \dots > \mathcal{L}_2 > \mathcal{L}_1 > 0$.
  82. Exploring Sampling Uncertainties. One run with K “live points” = K runs with 1 live point! Split into $\mathcal{L}_{N_1}^{(1)} > \dots > \mathcal{L}_2^{(1)} > \mathcal{L}_1^{(1)} > 0$ and $\mathcal{L}_{N_2}^{(2)} > \dots > \mathcal{L}_2^{(2)} > \mathcal{L}_1^{(2)} > 0$, etc.
  83. Exploring Sampling Uncertainties. The original run decomposes into “strands”: $\mathcal{L}^{(\cdot)} = \{\mathcal{L}^{(1)}, \mathcal{L}^{(2)}, \dots\}$.
  84. Exploring Sampling Uncertainties. The original run decomposes into “strands”: $\mathcal{L}^{(\cdot)} = \{\mathcal{L}^{(1)}, \mathcal{L}^{(2)}, \dots\}$.
  85. Exploring Sampling Uncertainties. We would like to sample K new paths $\mathcal{L}^{(\cdot)\prime} = \{\mathcal{L}^{(1)\prime}, \mathcal{L}^{(2)\prime}, \dots\}$ from the set of all possible paths $P(\mathcal{L}, \dots)$. However, we don't have access to it.
  86. Exploring Sampling Uncertainties. Use a bootstrap estimator: resample the observed strands with replacement, e.g. $\mathcal{L}^{(\cdot)\prime} = \{\mathcal{L}^{(1)}, \mathcal{L}^{(1)}, \mathcal{L}^{(2)}, \dots\}$.
  87. Exploring Sampling Uncertainties. Each resampled run gives $Z' \approx \sum_{i} w_i'$ with $w_i' = \mathcal{L}_i'\,\Delta X_i'$, from which the scatter in the estimates can be measured.
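A simplified sketch of the strand bootstrap for a standard run (mine; a full implementation would track or simulate per-strand volumes rather than using the mean shrinkage $X_i \approx e^{-i/K}$ for the merged run):

```python
# Simplified sketch of the strand bootstrap: resample the K strands with
# replacement, merge their dead points, and recompute ln Z.
import numpy as np

def bootstrap_lnZ(strand_loglikes, n_boot=200, seed=0):
    """strand_loglikes: list of K arrays, the sorted ln L values of each strand."""
    rng = np.random.default_rng(seed)
    K = len(strand_loglikes)
    lnZ_samples = []
    for _ in range(n_boot):
        picks = rng.integers(0, K, size=K)                 # strands, with replacement
        logL = np.sort(np.concatenate([strand_loglikes[k] for k in picks]))
        N = logL.size
        X = np.exp(-np.arange(1, N + 1) / K)               # mean prior volumes
        dX = np.concatenate([[1.0], X[:-1]]) - X           # dX_i = X_{i-1} - X_i
        logw = logL + np.log(dX)
        shift = logw.max()
        lnZ_samples.append(shift + np.log(np.exp(logw - shift).sum()))
    return np.array(lnZ_samples)

# Usage: lnZ_boot = bootstrap_lnZ(strands); print(lnZ_boot.mean(), lnZ_boot.std())
```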
  88. Method 0: Sampling from the Prior. Sampling from the prior becomes exponentially more inefficient as time goes on. [Higson et al. (2017), arXiv:1704.03459]
  89. Method 1: Constrained Uniform Sampling. Proposal: bound the iso-likelihood contours in real time and sample from the newly constrained prior. [Feroz et al. (2009)]
  90. Method 1: Constrained Uniform Sampling. Issues: How do we ensure the bounds always encompass the iso-likelihood contours? How do we generate flexible bounds? [Feroz et al. (2009)]
  91. Method 1: Constrained Uniform Sampling. Issues: How do we ensure the bounds always encompass the iso-likelihood contours? (Bootstrapping.) How do we generate flexible bounds? (Easier with a uniform, i.e. transformed, prior.) [Feroz et al. (2009)]
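As a rough illustration of the bounding idea, here is a single-ellipsoid sketch (all names are mine; actual codes such as MultiNest use multiple overlapping ellipsoids plus further safeguards):

```python
# Rough single-ellipsoid sketch of constrained uniform sampling: bound the live
# points with an enlarged covariance ellipsoid and draw uniformly inside it
# until the likelihood constraint is satisfied.
import numpy as np

def sample_within_bound(live_points, log_like, logl_min, enlarge=1.5, rng=None):
    """live_points: (n_live, ndim) array; log_like: callable; logl_min: threshold."""
    rng = rng if rng is not None else np.random.default_rng()
    ndim = live_points.shape[1]
    center = live_points.mean(axis=0)
    cov = np.cov(live_points, rowvar=False)
    chol = np.linalg.cholesky(cov)
    # Mahalanobis radius that just contains the most distant live point, then expand.
    d = live_points - center
    r_max = np.sqrt(np.max(np.einsum("ij,jk,ik->i", d, np.linalg.inv(cov), d)))
    r_max *= enlarge
    while True:
        z = rng.normal(size=ndim)
        z *= rng.uniform() ** (1.0 / ndim) / np.linalg.norm(z)   # uniform in the unit ball
        theta = center + r_max * (chol @ z)                      # map ball -> ellipsoid
        if log_like(theta) > logl_min:
            return theta
```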
  92. Method 2: “Evolving” Previous Samples. Proposal: generate independent samples subject to the likelihood constraint by “evolving” copies of the current live points.
  93. Method 2: “Evolving” Previous Samples. Proposal: generate independent samples subject to the likelihood constraint by “evolving” copies of the current live points.
  94. Method 2: “Evolving” Previous Samples. Options include random walks (i.e. MCMC), slice sampling (e.g. PolyChord), and random trajectories (i.e. HMC).
  95. Method 2: “Evolving” Previous Samples. Issues: How do we ensure samples are independent (thinning) and properly distributed within the likelihood constraint? How do we generate efficient proposals? Options: random walks (i.e. MCMC), slice sampling (e.g. PolyChord), random trajectories (i.e. HMC).
  96. Summary: (Static) Nested Sampling. 1. Estimates the evidence $Z$. 2. Estimates the posterior $P(\Theta)$. 3. Possesses well-defined stopping criteria.
  97. Summary: (Static) Nested Sampling. 1. Estimates the evidence $Z$. 2. Estimates the posterior $P(\Theta)$. 3. Possesses well-defined stopping criteria. 4. Combining runs improves inference.
  98. Summary: (Static) Nested Sampling. 1. Estimates the evidence $Z$. 2. Estimates the posterior $P(\Theta)$. 3. Possesses well-defined stopping criteria. 4. Combining runs improves inference. 5. Sampling and statistical uncertainties can be simulated from a single run.
  99. Dynamic Nested Sampling. Strands no longer need to span the whole prior: one strand can cover only $\mathcal{L}_{\max}^{(1)} > \mathcal{L}_{N_1}^{(1)} > \dots > \mathcal{L}_2^{(1)} > \mathcal{L}_1^{(1)} > \mathcal{L}_{\min}^{(1)}$, while another spans $\mathcal{L}_{N_2}^{(2)} > \dots > \mathcal{L}_2^{(2)} > \mathcal{L}_1^{(2)} > 0$.
  100. Dynamic Nested Sampling. The merged run then has a varying number of live points along the ladder $\mathcal{L}_{N_2}^{(2)} > \dots > \mathcal{L}_{N_1}^{(1)} > \dots > \mathcal{L}_1^{(1)} > \mathcal{L}_2^{(2)} > \mathcal{L}_1^{(2)} > 0$ (regions with 2, 1 + 2, and 2 live points, as labeled).
  101. Dynamic Nested Sampling. The volume bookkeeping keeps the same form, $\ln X_{i+1} = \sum_{j=1}^{i+1} \ln U_j$, but the distribution of each $U_j$ now depends on the local number of live points (regions with 2, 1 + 2, and 2 live points).
  102. Dynamic Nested Sampling. Where the number of live points is constant or increasing, $K_{j+1} \geq K_j$: $U_{j+1} \sim \mathrm{Beta}(K_{j+1}, 1)$.
  103. Dynamic Nested Sampling. Where the number of live points is decreasing (points removed without replacement), $K_{j+k} < K_j$: $U_{j+1}, \dots, U_{j+k} \sim u_{(K_j)}, \dots, u_{(K_j - k + 1)}$, a decreasing sequence of uniform order statistics.
  104. Dynamic Nested Sampling. Expected volumes follow by chaining the segments, e.g. $\ln X = \sum_{j=1}^{N_1} \ln\dfrac{K_1}{K_1+1} + \sum_{j=1}^{N_2} \ln\dfrac{K_1 - j + 1}{K_1 - j + 2} + \dots + \sum_{j=1}^{N_{\rm final}} \ln\dfrac{K_{\rm final} - j + 1}{K_{\rm final} - j + 2}$ (exponential shrinkage and uniform shrinkage pieces).
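A minimal sketch (mine) of the expected-volume bookkeeping when the number of live points varies along the run, using the simple mean shrinkage of $-1/K_i$ per iteration; segments where points are removed without replacement would use the uniform-shrinkage factors above instead:

```python
# Sketch: expected log prior volumes when the number of live points varies,
# using the mean shrinkage of -1/K_i per iteration.
import numpy as np

def expected_log_volumes(n_live_per_iter):
    K = np.asarray(n_live_per_iter, dtype=float)
    return -np.cumsum(1.0 / K)              # ln X_i ~= -sum_{j<=i} 1/K_j

# Example: a baseline run with 100 live points, with 400 extra live points
# injected over an interior range of the run (as in dynamic nested sampling).
K_per_iter = np.concatenate([np.full(200, 100), np.full(300, 500), np.full(200, 100)])
ln_X = expected_log_volumes(K_per_iter)
print(ln_X[199], ln_X[499], ln_X[-1])
```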
  105. Benefits of Dynamic Nested Sampling. Can accommodate new “strands” within a particular range of prior volumes without changing the overall statistical framework. Particles can be adaptively added until stopping criteria are reached, allowing targeted estimation.
  106. Benefits of Dynamic Nested Sampling. Can accommodate new “strands” within a particular range of prior volumes without changing the overall statistical framework. Particles can be adaptively added until stopping criteria are reached, allowing targeted estimation.
  107. Sampling Uncertainties (Static). The run decomposes into “strands” $\mathcal{L}^{(\cdot)} = \{\mathcal{L}^{(1)}, \mathcal{L}^{(2)}, \dots\}$. We would like to sample K paths from the set of all possible paths $P(\mathcal{L}, \dots)$; however, we don't have access to it, so we use a bootstrap estimator, e.g. $\mathcal{L}^{(\cdot)\prime} = \{\mathcal{L}^{(1)}, \mathcal{L}^{(1)}, \mathcal{L}^{(2)}, \dots\}$.
  108. Sampling Uncertainties (Dynamic). The strands now come in two types: those with $\mathcal{L}_{\min}^{(1)} = -\infty$ (originated from the prior) and those with $\mathcal{L}_{\min}^{(2)} > -\infty$ (originated interior to the prior). Original run: $\mathcal{L}^{(\cdot)} = \{\mathcal{L}^{(1)}, \mathcal{L}^{(2)}, \dots\}$.
  109. Sampling Uncertainties (Dynamic). We would like to sample the right number of paths of each type: paths from the distribution of strands that started from the prior and paths from the distribution of strands that started interior to it.
  110. Sampling Uncertainties (Dynamic). Use a stratified bootstrap estimator: resample each type of strand separately, e.g. $\mathcal{L}^{(\cdot)\prime} = \{\mathcal{L}^{(1)}, \mathcal{L}^{(1)}, \mathcal{L}^{(2)}, \dots\}$.
  111. Dynamic Nested Sampling. Can accommodate new “strands” within a particular range of prior volumes without changing the overall statistical framework. Particles can be adaptively added until stopping criteria are reached, allowing targeted estimation.
  112. Dynamic Nested Sampling. Can accommodate new “strands” within a particular range of prior volumes without changing the overall statistical framework. Particles can be adaptively added until stopping criteria are reached, allowing targeted estimation.
  113. Allocating Samples. Weight function: $I_i(f) = f\,I_i^{\rm post} + (1 - f)\,I_i^{Z}$, a convex combination of a posterior weight and an evidence weight; new samples are allocated where $I_i$ is large. [Higson et al. (2017), arXiv:1704.03459]
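A hedged sketch of such a weight function (the posterior term uses the normalized importance weights $p_i$; the evidence term here is a simple stand-in proportional to the remaining prior volume, which differs in detail from the forms used in Higson et al. 2017 and in dynesty):

```python
# Sketch of a weight function I_i = f * I_post,i + (1 - f) * I_Z,i. The
# posterior term is the normalized importance weight p_i; the evidence term
# is a crude stand-in proportional to the remaining prior volume X_i.
import numpy as np

def allocation_weights(log_like, log_volumes, pfrac=0.8):
    logL = np.asarray(log_like, dtype=float)
    lnX = np.asarray(log_volumes, dtype=float)
    dX = np.exp(np.concatenate([[0.0], lnX[:-1]])) - np.exp(lnX)   # dX_i > 0
    logw = logL + np.log(dX)
    p = np.exp(logw - logw.max())
    p /= p.sum()                                  # posterior importance I_post,i
    z = np.exp(lnX) / np.exp(lnX).sum()           # crude evidence importance I_Z,i
    return pfrac * p + (1.0 - pfrac) * z          # I_i = f*I_post + (1-f)*I_Z
```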
  114. How Many Samples is Enough? In any sampling-based approach to estimating $P$ with $\hat{P}$, how many samples are necessary?
  115. How Many Samples is Enough? Assume the general case: we want the D-dimensional $P$ and $\hat{P}_i$ densities to be “close”, with the “true” posterior constructed over the same “domain”.
  116. How Many Samples is Enough? One measure of “close”: $H(\hat{P}\,\|\,P) \equiv \int_{\Omega_\Theta} \hat{P}(\Theta)\,\ln\dfrac{\hat{P}(\Theta)}{P(\Theta)}\,d\Theta$.
  117. How Many Samples is Enough? In terms of the importance weights, $H = \sum_{i=1}^{N} \hat{p}_i \ln\dfrac{\hat{p}_i}{p_i}$.
  118. How Many Samples is Enough? We want access to $P(H)$, but we don't know the true $P$ (or the true weights $p_i$).
  119. How Many Samples is Enough? Use a bootstrap estimator: $H' = \sum_{i=1}^{N} p_i' \ln\dfrac{p_i'}{\hat{p}_i}$.
  120. How Many Samples is Enough? $H'$ is a random variable, so its distribution can be simulated.
  121. How Many Samples is Enough? Possible stopping criterion: the fractional (%) variation in H.
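A tiny sketch of the stopping-rule idea (mine): compute $H$, or $\ln Z$, over many bootstrap realizations and stop once its fractional scatter drops below a threshold (the applications later quote 2%); the realizations themselves would come from something like the strand resampler sketched earlier.

```python
# Sketch of the stopping rule: stop when the fractional scatter of the quantity
# of interest (H here, or ln Z) across bootstrap realizations falls below a
# threshold, e.g. 2%.
import numpy as np

def should_stop(bootstrap_values, threshold=0.02):
    """bootstrap_values: H (or ln Z) estimates from bootstrap realizations."""
    vals = np.asarray(bootstrap_values, dtype=float)
    frac_scatter = vals.std() / np.abs(vals.mean())
    return frac_scatter < threshold, frac_scatter
```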
  122. Dynamic Nested Sampling Summary. 1. Can sample from multi-modal distributions. 2. Can simultaneously estimate the evidence $Z$ and posterior $P(\Theta)$. 3. Combining independent runs improves inference (“trivially parallelizable”). 4. Can simulate uncertainties (sampling and statistical) from a single run. 5. Enables adaptive sample allocation during runtime using arbitrary weight functions. 6. Possesses evidence/posterior-based stopping criteria.
  123. Dynamic Nested Sampling with dynesty (dynesty.readthedocs.io). Pure Python. Easy to use. Modular. Open source. Parallelizable. Flexible bounding/sampling methods. Thorough documentation!
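A minimal usage sketch along the lines of the dynesty documentation; keyword names and defaults may differ between dynesty versions, so treat this as indicative and check dynesty.readthedocs.io:

```python
# Minimal dynesty usage sketch on a toy problem (indicative only; see the docs).
import numpy as np
import dynesty

ndim = 3

def loglike(theta):
    return -0.5 * np.sum(theta**2)          # toy isotropic Gaussian likelihood

def prior_transform(u):
    return 20.0 * u - 10.0                  # Uniform(-10, 10) prior in each dimension

# Static sampler: estimates ln Z, with the posterior as a byproduct.
sampler = dynesty.NestedSampler(loglike, prior_transform, ndim, nlive=500)
sampler.run_nested()
res = sampler.results
print(res.logz[-1], res.logzerr[-1])        # evidence estimate and its error

# Dynamic sampler: allocates extra live points according to a weight function
# (here weighted 100% toward the posterior, as in the talk's applications).
dsampler = dynesty.DynamicNestedSampler(loglike, prior_transform, ndim)
dsampler.run_nested(wt_kwargs={"pfrac": 1.0})
dres = dsampler.results
```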
  124. Applications. All results are preliminary but agree with results from MCMC methods (derived using emcee). Samples were allocated with 100% posterior weight and an automated stopping criterion (2% fractional error in the simulated KLD). dynesty was substantially (~3-6x) more efficient at generating good samples than emcee, before thinning.
  125. Application: Modeling Galaxy SEDs. Parameters $\Theta = \{\ln M_\ast, \ln Z, \dots\}$ (plus additional groups of 5, 6, and 2 parameters), D = 15. With: Joel Leja, Ben Johnson, Charlie Conroy.
  126. Application: Supernova Light Curves. Parameters $\Theta = \{\dots\}$ (groups of 4, 3, 3, and 2 parameters), D = 12. [Fig: Open Supernova Catalog (LSQ12dlf), James Guillochon] With: James Guillochon, Kaisey Mandel.
  127. Dynamic Nested Sampling with dynesty (dynesty.readthedocs.io). Pure Python. Easy to use. Modular. Open source. Parallelizable. Flexible bounding/sampling methods. Thorough documentation!