
An Introduction to (Dynamic) Nested Sampling

Josh Speagle
September 26, 2017


Nested Sampling is a relatively new method for estimating the Bayesian evidence (with the posterior estimated as a byproduct) that integrates over the posterior by sampling in nested "shells" of constant likelihood. Its ability to sample from complex, multi-modal distributions in a flexible yet efficient way, combined with several available sampling packages, has contributed to its growing popularity in (astro)physics. In this talk I outline the basic motivation and theory behind Nested Sampling, derive various statistical properties associated with the method, and discuss how it is applied in practice. I also talk about how the overall framework can be extended in Dynamic Nested Sampling to accommodate adding samples "dynamically" during the course of a run. These samples can be allocated to maximize arbitrary objective functions, allowing Dynamic Nested Sampling to function like posterior-oriented sampling methods such as MCMC, but with the added benefit of well-defined stopping criteria. I end with an application of Dynamic Nested Sampling to a variety of synthetic and real-world problems using an open-source Python package I've been developing (https://github.com/joshspeagle/dynesty/).


Transcript

  1.–5. Background

     Bayes' Theorem:

        Pr(Θ | D, M) = Pr(D | Θ, M) Pr(Θ | M) / Pr(D | M)

     Prior: Pr(Θ | M). Likelihood: Pr(D | Θ, M). Posterior: Pr(Θ | D, M). Evidence: Pr(D | M).
  6.–9. Posterior Estimation via Sampling

     A set of samples with weights, {(Θ_N, w_N), (Θ_{N−1}, w_{N−1}), …, (Θ_2, w_2), (Θ_1, w_1)} ∈ Ω,
     gives the posterior estimate

        P(Θ) ≈ Σ_{i=1}^{N} ŵ_i δ(Θ − Θ_i),

     where the Θ_i are the samples and the ŵ_i the (normalized) weights.
     MCMC: equal weights, {(Θ_N, 1), …, (Θ_1, 1)}.
     Importance Sampling: weights given by density ratios, {(Θ_N, p_N/q_N), …, (Θ_1, p_1/q_1)}.
     Nested Sampling: ?
  10.–12. Motivation: Integrating the Posterior

     The evidence is an integral over the prior,

        Z ≡ ∫_Ω ℒ(Θ) π(Θ) dΘ,

     and the "prior volume" enclosed by the iso-likelihood contour at level λ is

        X(λ) ≡ ∫_{Θ : ℒ(Θ) > λ} π(Θ) dΘ.             [Feroz et al. (2013)]

     Changing variables turns the evidence into a one-dimensional integral:

        Z = ∫_0^∞ X(λ) dλ = ∫_0^1 ℒ(X) dX.

  13.–15. Motivation: Integrating the Posterior

     In practice the integral is approximated by a sum,

        Z ≈ Σ_{i=1}^{N} ŵ_i,   ŵ_i = f(ℒ_i, …, ΔX_i, …),

     where the weights f(ℒ_i, …, ΔX_i, …) can be rectangles, trapezoids, etc., built from an
     amplitude ℒ_i and a differential volume ΔX_i. Each importance weight p̂_i ∝ ŵ_i is directly
     proportional to the typical set, so we get posteriors "for free".
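     The quadrature above is simple to implement. A minimal sketch (all inputs here are toy
     assumptions: dead-point log-likelihoods in increasing order and mean log prior volumes
     ln X_i = −i/K) of turning amplitudes and differential volumes into an evidence estimate and
     normalized posterior weights:

        import numpy as np

        # Toy inputs (assumed): log-likelihoods of dead points in increasing order
        # and their associated log prior volumes, here the expected values -i/K.
        K, N = 100, 2000
        logX = -np.arange(1, N + 1) / K
        logL = -0.5 * (np.exp(logX) / 0.01) ** 2   # hypothetical likelihood "profile" L(X)

        # Differential volumes via the trapezoid rule (rectangles also work).
        X = np.exp(logX)
        dX = 0.5 * (np.append(1.0, X[:-1]) - np.append(X[1:], 0.0))

        # Un-normalized weights w_i = L_i * dX_i; their sum estimates the evidence Z.
        logw = logL + np.log(dX)
        logZ = np.logaddexp.reduce(logw)

        # Normalizing the weights gives the posterior "for free".
        p = np.exp(logw - logZ)
        print(logZ, p.sum())   # p sums to 1 by construction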
  16. Motivation: Sampling the Posterior

     Sampling directly from the likelihood ℒ is hard. Pictures from this 2010 talk by Skilling.

  17.–21. Motivation: Sampling the Posterior

     Sampling uniformly within the bound ℒ > λ is easier: each new draw shrinks the enclosed
     prior volume from X_{i−1} to X_i to X_{i+1}. Pictures from this 2010 talk by Skilling.

  22. Motivation: Sampling the Posterior

     MCMC: solving a Hard Problem once. vs. Nested Sampling: solving an Easier Problem many times.
  23.–25. Estimating the Prior Volume

     The quadrature Z ≈ Σ_{i=1}^{N} ŵ_i with ŵ_i = f(ℒ_i, …, ΔX_i, …) [Feroz et al. (2013)]
     requires the prior volumes, but X_i = ???

  26.–27. Estimating the Prior Volume

     Probability Integral Transform: passing a random variable through its own CDF (built from
     its PDF) gives a Unif(0, 1) variate. So if Θ is drawn uniformly from the prior interior to
     the current iso-likelihood contour, its associated prior volume satisfies X ∼ Unif(0, X_i).
     We therefore need to sample from the constrained prior.
  28.–30. Estimating the Prior Volume

     Pictures from this 2010 talk by Skilling (prior volume and posterior panels). With a single
     particle the volumes shrink geometrically:

        X_{i+1} = Π_{j=0}^{i} t_j X_0,   t_0, …, t_i ∼ Unif(0, 1) i.i.d.,   X_0 ≡ 1.

  31.–32. Estimating the Prior Volume

     So although the individual X_i in ŵ_i = f(ℒ_i, …, ΔX_i, …) are unknown [Feroz et al.
     (2013)], their joint distribution Pr(X_1, X_2, …, X_N) is known, and its 1st & 2nd moments
     can be computed.
  33. Nested Sampling Algorithm

     Generate an increasing sequence of likelihood levels,

        ℒ_N > ℒ_{N−1} > ⋯ > ℒ_2 > ℒ_1 > 0,

     with associated shrinking prior volumes X_1, X_2, …, X_{N−1}.

  34. Nested Sampling Algorithm (Ideal)

     ℒ_N > ℒ_{N−1} > ⋯ > ℒ_2 > ℒ_1 > 0, with samples sequentially drawn from the constrained
     prior π(Θ | ℒ > ℒ_i).

  35.–37. Nested Sampling Algorithm (Naïve)

     ℒ_N > ℒ_{N−1} > ⋯ > ℒ_2 > ℒ_1 > 0, starting from Θ_1 ∼ π(Θ):
     1. Samples are sequentially drawn from the prior π(Θ).
     2. A new point is only accepted if ℒ_{i+1} > ℒ_i.
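     A minimal sketch of the naïve scheme on a hypothetical 1-D problem (uniform prior on
     [0, 1], Gaussian likelihood; all choices here are illustrative). It works, but the cost of
     each accepted point grows roughly like 1/X_i, which is why the constrained-prior methods
     discussed later are needed.

        import numpy as np

        rng = np.random.default_rng(42)

        def loglike(theta):
            # Toy Gaussian likelihood centered at 0.5 (purely illustrative).
            return -0.5 * ((theta - 0.5) / 0.1) ** 2

        logL_dead = []        # the increasing sequence L_1 < L_2 < ... < L_N
        logL_cur = -np.inf
        for i in range(10):
            # 1. Draw from the (unconstrained) prior.
            # 2. Only accept the new point if its likelihood exceeds the current level.
            while True:
                theta = rng.uniform(0.0, 1.0)
                if loglike(theta) > logL_cur:
                    break
            logL_cur = loglike(theta)
            logL_dead.append(logL_cur)

        # With a single particle, E[ln X_i] = -i, so the prior volume (and the
        # acceptance rate) shrinks by a factor of ~e per iteration.
        print(np.round(logL_dead, 3))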
  38. Components of the Algorithm

     1. Adding more particles. 2. Knowing when to stop. 3. What to do after stopping.
  39.–42. Adding More Particles

     Run two independent single-particle runs [Skilling (2006)]:

        ℒ_{N_1}^(1) > ⋯ > ℒ_2^(1) > ℒ_1^(1) > 0,   X_{i+1}^(1) = Π_{j=0}^{i} t_j^(1),
        ℒ_{N_2}^(2) > ⋯ > ℒ_2^(2) > ℒ_1^(2) > 0,   X_{i+1}^(2) = Π_{j=0}^{i} t_j^(2).

  43.–47. Adding More Particles

     Merging the two runs interleaves their likelihood levels into a single ordered sequence,
     e.g. ℒ_{N_1}^(1) > ℒ_{N_2}^(2) > ⋯ > ℒ_2^(1) > ℒ_2^(2) > ℒ_1^(2) > ℒ_1^(1) > 0.
     [Skilling (2006)]

  48.–50. Adding More Particles

     For the merged run, ln X_{i+1} = Σ_{j=0}^{i} ln t_j, but now each shrinkage factor is the
     larger of two uniform draws: with u_1, u_2 ∼ Unif(0, 1), the order statistics (u_(1), u_(2))
     give t_j ∼ Beta(2, 1). [Skilling (2006)]

  51.–53. Adding More Particles

     One run with 2 "live points" = 2 runs with 1 live point; more generally, one run with K
     "live points" = K runs with 1 live point. The discarded points are the dead points.
     [Skilling (2006)]

  54.–56. Adding More Particles

     With K live points, u_1, …, u_K ∼ Unif(0, 1) and their order statistics u_(1), …, u_(K)
     give shrinkage factors t_0, …, t_i ∼ Beta(K, 1), so that

        ln X_{i+1} = Σ_{j=0}^{i} ln t_j,   E[ln X_{i+1}] = −(i + 1)/K.

     [Skilling (2006)]
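     A quick numerical check of these shrinkage statistics (a sketch; the values of K, the
     number of iterations, and the number of repeats are arbitrary): each t_j is the largest of
     K uniforms, i.e. Beta(K, 1), so E[ln X_{i+1}] = −(i + 1)/K with standard deviation
     √(i + 1)/K.

        import numpy as np

        rng = np.random.default_rng(0)
        K, niter, nrepeat = 50, 200, 5000

        # t_j ~ Beta(K, 1) is the maximum of K i.i.d. Unif(0, 1) draws.
        t = rng.beta(K, 1.0, size=(nrepeat, niter))
        lnX = np.cumsum(np.log(t), axis=1)       # ln X_{i+1} = sum_{j<=i} ln t_j

        i = niter - 1
        print(lnX[:, i].mean(), -(i + 1) / K)        # expect ~ -(i+1)/K = -4
        print(lnX[:, i].std(), np.sqrt(i + 1) / K)   # expect ~ sqrt(i+1)/K ~ 0.28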
  57.–59. "Recycling" the Final Set of Particles

     When the run stops after N iterations, the K remaining live points …, Θ_{N+1}, … are
     themselves draws from the constrained prior π(Θ | ℒ > ℒ_N), so they can be added to the
     sequence in likelihood order:

        ℒ_{N+K} > ⋯ > ℒ_{N+1} > ℒ_N > ℒ_{N−1} > ⋯ > ℒ_2 > ℒ_1 > 0.

     Their prior volumes follow the uniform order statistics u_(1), …, u_(K) of
     u_1, …, u_K ∼ Unif(0, 1):

        X_{N+k} = u_(K−k+1) X_N.

  60.–64. "Recycling" the Final Set of Particles

     Rényi Representation: the uniform order statistics can be written as normalized partial
     sums of exponentials,

        u_(k) ∼ (Σ_{j=1}^{k} e_j) / (Σ_{j=1}^{K+1} e_j),   e_1, …, e_{K+1} ∼ Expo(1).

  65. "Recycling" the Final Set of Particles

     Combining the two phases,

        ln X_{N+k} = Σ_{i=1}^{N} ln[K/(K + 1)] + Σ_{j=1}^{k} ln[(K − j + 1)/(K − j + 2)],

     i.e. exponential shrinkage during the main run plus uniform shrinkage over the recycled
     live points.
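     A short sketch of the Rényi representation in action (K and the assumed number of main-run
     iterations are arbitrary): the K uniform order statistics are cumulative sums of K + 1
     exponentials divided by their total, which directly gives the volumes of the recycled live
     points.

        import numpy as np

        rng = np.random.default_rng(1)
        K = 100

        # Renyi representation: u_(k) = (e_1 + ... + e_k) / (e_1 + ... + e_{K+1}).
        e = rng.exponential(size=K + 1)
        u_sorted = np.cumsum(e)[:K] / e.sum()     # u_(1) <= ... <= u_(K)

        # Volumes of the recycled points: X_{N+k} = u_(K-k+1) * X_N (decreasing in k).
        X_N = np.exp(-2000.0 / K)                 # e.g. after N = 2000 iterations
        X_recycled = X_N * u_sorted[::-1]
        print(X_recycled[:3] / X_N)               # the first few shrinkage fractions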
  66.–67. Nested Sampling Uncertainties

     Pictures from this 2010 talk by Skilling.
     Statistical uncertainties: unknown prior volumes.
     Sampling uncertainties: number of samples (counting), discrete point estimates for
     contours, particle path dependencies.
  68. Sampling Error: Poisson Uncertainties

     How large is σ(ln Z)? It should scale like some number of steps times the step size
     Δ ln X, which requires knowing the "distance" from the prior to the posterior.
     [Based on Skilling (2006) and Keeton (2011)]

  69.–70. Sampling Error: Poisson Uncertainties

     That distance is the information

        H ≡ ∫_Ω P(Θ) ln[P(Θ)/π(Θ)] dΘ = (1/Z) ∫_0^1 ℒ(X) ln ℒ(X) dX − ln Z,

     the Kullback-Leibler divergence from π to P, i.e. the "information gained".
     [Based on Skilling (2006) and Keeton (2011)]

  71.–75. Sampling Error: Poisson Uncertainties

     The bulk of the posterior sits near ln X ∼ −H, so reaching it takes on the order of
     N ∼ KH steps of expected size Δ ln X = 1/K. The number of steps behaves like a Poisson
     counting process, σ(N) ∼ √(KH), so the accumulated uncertainty in ln X, and hence in
     Z ≈ Σ_{i=1}^{N+K} ŵ_i, is

        σ(ln Z) ∼ √(KH) Δ ln X = √(H/K).
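     A sketch of forming this error estimate from a run's output (the function below assumes
     arrays of dead-point log-likelihoods and log prior volumes plus the number of live points
     K; the toy inputs are purely illustrative): compute Z and H by quadrature, then
     σ(ln Z) ≈ √(H/K).

        import numpy as np

        def evidence_error(logL, logX, K):
            """Return (ln Z, H, sigma(ln Z) ~ sqrt(H/K)), after Skilling (2006)/Keeton (2011)."""
            X = np.exp(logX)
            dX = 0.5 * (np.append(1.0, X[:-1]) - np.append(X[1:], 0.0))
            logw = logL + np.log(dX)                 # ln(L_i * dX_i)
            logZ = np.logaddexp.reduce(logw)
            p = np.exp(logw - logZ)                  # normalized importance weights
            H = np.sum(p * (logL - logZ))            # KL divergence from prior to posterior
            return logZ, H, np.sqrt(H / K)

        # Toy inputs (assumed): mean volumes ln X_i = -i/K and a Gaussian-like L(X).
        K, N = 100, 2000
        logX = -np.arange(1, N + 1) / K
        logL = -0.5 * (np.exp(logX) / 0.01) ** 2
        print(evidence_error(logL, logX, K))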
  76.–79. Sampling Error: Monte Carlo Noise

     Formalism following Higson et al. (2017) and Chopin and Robert (2010). For a function
     f(Θ), the posterior expectation can be written as a one-dimensional integral over the
     prior volume,

        E_P[f] = ∫_Ω f(Θ) P(Θ) dΘ = (1/Z) ∫_0^1 f̃(X) ℒ(X) dX,

     where f̃(X) = E_π[f(Θ) | ℒ(Θ) = ℒ(X)] averages f over the iso-likelihood contour at
     volume X.

  80. Sampling Error: Monte Carlo Noise

     In practice this is again approximated by a weighted sum over the samples,
     E_P[f] ≈ Σ_{i=1}^{N+K} ŵ_i f(Θ_i). [Higson et al. (2017); Chopin and Robert (2010)]
  81.–84. Exploring Sampling Uncertainties

     One run with K "live points" = K runs with 1 live point! The original run
     ℒ_N > ℒ_{N−1} > ⋯ > ℒ_2 > ℒ_1 > 0 decomposes into single-particle "strands",

        ℒ^(·) = {ℒ^(1), ℒ^(2), …},

     e.g. ℒ_{N_1}^(1) > ⋯ > ℒ_1^(1) > 0 and ℒ_{N_2}^(2) > ⋯ > ℒ_1^(2) > 0.

  85.–87. Exploring Sampling Uncertainties

     We would like to sample K paths ℒ^(·)′ = {ℒ^(1)′, ℒ^(2)′, …} from the set of all possible
     paths P(ℒ^(i), …). However, we don't have access to it. Use a bootstrap estimator
     instead: resample the strands with replacement, e.g. ℒ^(·)′ = {ℒ^(1), ℒ^(1), ℒ^(2), …},
     merge them, and recompute

        Z′ ≈ Σ_{i=1}^{N′} ŵ_i′,   ŵ_i′ = f(ℒ_i′, X_i′).
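     A sketch of the strand bootstrap (all names and inputs here are hypothetical: each strand
     is an increasing array of its dead-point log-likelihoods, and for simplicity the merged
     run is assigned the mean volumes ln X_i ≈ −i/K rather than re-simulated ones): resample K
     strands with replacement, merge, recompute ln Z′, and repeat to build up its distribution.

        import numpy as np

        rng = np.random.default_rng(2)

        def logz_from_run(logL_sorted, K):
            """Evidence from a merged (increasing) likelihood sequence with K strands."""
            N = len(logL_sorted)
            X = np.exp(-np.arange(1, N + 1) / K)     # mean-volume approximation
            dX = 0.5 * (np.append(1.0, X[:-1]) - np.append(X[1:], 0.0))
            return np.logaddexp.reduce(logL_sorted + np.log(dX))

        def bootstrap_logz(strands, nboot=200):
            """strands: list of per-particle log-likelihood arrays ("paths")."""
            K = len(strands)
            logz = []
            for _ in range(nboot):
                pick = rng.integers(0, K, size=K)    # resample strands with replacement
                merged = np.sort(np.concatenate([strands[j] for j in pick]))
                logz.append(logz_from_run(merged, K))
            return np.array(logz)

        # Usage with hypothetical strands: sigma_lnZ = bootstrap_logz(strands).std()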
  88. Method 0: Sampling from the Prior

     Higson et al. (2017), arxiv:1704.03459. Sampling from the prior becomes exponentially
     more inefficient as time goes on.

  89.–91. Method 1: Constrained Uniform Sampling

     Feroz et al. (2009). Proposal: bound the iso-likelihood contours in real time and sample
     from the newly constrained prior. Issues: How to ensure the bounds always encompass the
     iso-likelihood contours? Bootstrapping. How to generate flexible bounds? Easier with a
     uniform (transformed) prior.

  92.–95. Method 2: "Evolving" Previous Samples

     Proposal: generate independent samples subject to the likelihood constraint by "evolving"
     copies of current live points, e.g. random walks (i.e. MCMC), slice sampling (as in
     PolyChord), or random trajectories (i.e. HMC). Issues: How to ensure samples are
     independent (thinning) and properly distributed within the likelihood constraint? How to
     generate efficient proposals?
  96.–98. Summary: (Static) Nested Sampling

     1. Estimates the evidence Z.
     2. Estimates the posterior P.
     3. Possesses well-defined stopping criteria.
     4. Combining runs improves inference.
     5. Sampling and statistical uncertainties can be simulated from a single run.
  99. Dynamic Nested Sampling

     Strands no longer have to start from the prior: a new strand can be run only between
     likelihood bounds ℒ_min^(1) and ℒ_max^(1),

        ℒ_max^(1) > ℒ_{N_1}^(1) > ⋯ > ℒ_2^(1) > ℒ_1^(1) > ℒ_min^(1),

     alongside a standard strand ℒ_{N_2}^(2) > ⋯ > ℒ_2^(2) > ℒ_1^(2) > 0.

  100.–103. Dynamic Nested Sampling

     Merging such strands gives a run whose number of live points varies with likelihood
     (e.g. 2 live points, then 1 + 2 live points, then 2 live points). The volumes still
     satisfy ln X_{i+1} = Σ_{j=1}^{i} ln t_j, where:
     • if the number of live points is constant or increasing (K_{i+1} ≥ K_i), then
       t_{i+1} ∼ Beta(K_{i+1}, 1);
     • if it is decreasing, then the shrinkages (t_{i+1}, …, t_{i+k}) follow the uniform order
       statistics (u_(K), …, u_(K−k+1)), as in the recycling phase.
  104. Dynamic Nested Sampling

     The expected log-volumes are then a sum of exponential-shrinkage and uniform-shrinkage
     pieces,

        ln X_i = Σ_{j=1}^{N_1} ln[K_1/(K_1 + 1)] + Σ_{j=1}^{N_2} ln[(K_1 − j + 1)/(K_1 − j + 2)]
                 + ⋯ + Σ_{j=1}^{N_final} ln[(K_final − j + 1)/(K_final − j + 2)].
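     A sketch of simulating prior volumes for a run whose number of live points varies (the
     schedule of live-point counts below is made up). It uses the equivalent fact that each
     successive shrinkage factor is an independent Beta(K_i, 1) draw, with K_i the number of
     live points active at that step, which covers the constant, increasing, and decreasing
     cases above.

        import numpy as np

        rng = np.random.default_rng(3)

        # Hypothetical live-point schedule: a baseline run of 100 points, a batch of
        # 400 extra points over the posterior bulk, then the final "recycling" phase.
        K = np.concatenate([np.full(500, 100),
                            np.full(1000, 500),
                            np.arange(500, 0, -1)])

        # Each shrinkage factor t_i ~ Beta(K_i, 1); ln X_i is the running sum of ln t_i.
        lnt = np.log(rng.beta(K, 1.0))
        lnX = np.cumsum(lnt)

        print(lnX[-1], -np.sum(1.0 / K))   # simulated vs expected E[ln X_final]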
  105.–106. Benefits of Dynamic Nested Sampling

     • Can accommodate new "strands" within a particular range of prior volumes without
       changing the overall statistical framework.
     • Particles can be adaptively added until stopping criteria are reached, allowing
       targeted estimation.
  107. Sampling Uncertainties (Static)

     As before, the original run decomposes into strands ℒ^(·) = {ℒ^(1), ℒ^(2), …}. We would
     like to sample K paths from the set of all possible paths P(ℒ^(i), …), but we don't have
     access to it, so we use a bootstrap estimator, e.g. ℒ^(·)′ = {ℒ^(1), ℒ^(1), ℒ^(2), …}.

  108.–110. Sampling Uncertainties (Dynamic)

     Strands now differ in where they originate: ℒ_min^(1) = −∞ (originated from the prior)
     versus ℒ_min^(2) > 0 (originated interior to the prior). We would like to sample the
     prior-originated paths and the interior paths from their respective distributions, so we
     use a stratified bootstrap estimator: resample each group of strands separately, e.g.
     ℒ^(·)′ = {ℒ^(1), ℒ^(1), ℒ^(2), …}.
  111.–112. Dynamic Nested Sampling

     • Can accommodate new "strands" within a particular range of prior volumes without
       changing the overall statistical framework.
     • Particles can be adaptively added until stopping criteria are reached, allowing
       targeted estimation.
  113. Allocating Samples

     Higson et al. (2017), arxiv:1704.03459. New samples are allocated according to a weight
     function that mixes a posterior weight I_i^P ∼ p̂_i with an evidence weight I_i^Z,

        W_i(f) = f I_i^P + (1 − f) I_i^Z,

     so the fraction f controls whether the run targets the posterior or the evidence. See the
     sketch below.
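     A sketch of such a weight function (illustrative only, not dynesty's internals; the inputs
     are assumed to be per-point posterior and evidence importances computed elsewhere):

        import numpy as np

        def combine_weights(i_post, i_evid, f=0.8):
            """W_i(f) = f * I_i^P + (1 - f) * I_i^Z (illustrative).

            i_post : per-point posterior importances (e.g. the importance weights p_i)
            i_evid : per-point evidence importances (e.g. each point's share of the
                     evidence still unaccounted for)
            f      : fraction of new samples devoted to the posterior
            """
            i_post = np.asarray(i_post, dtype=float)
            i_evid = np.asarray(i_evid, dtype=float)
            weight = f * i_post / i_post.sum() + (1.0 - f) * i_evid / i_evid.sum()
            # New live points would then be allocated over the likelihood range where
            # this combined weight is above some fraction of its maximum.
            return weight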
  114.–115. How Many Samples is Enough?

     In any sampling-based approach to estimating P with P̂, how many samples are necessary?
     Assume the general case: we want the D-dimensional P and P̂ densities to be "close", with
     the "true" posterior constructed over the same "domain".

  116.–118. How Many Samples is Enough?

     Measure closeness with the KL divergence,

        H(P̂ ‖ P) ≡ ∫_Ω P̂(Θ) ln[P̂(Θ)/P(Θ)] dΘ ≈ Σ_{i=1}^{N} ŵ_i ln[P̂(Θ_i)/P(Θ_i)].

     We want access to Pr(H), but we don't know P.

  119.–121. How Many Samples is Enough?

     Use a bootstrap estimator,

        H′ = Σ_{i=1}^{N′} ŵ_i′ ln[P̂′(Θ_i′)/P̂(Θ_i′)],

     which is itself a random variable. Possible stopping criterion: the fractional (%)
     variation in H.
  122. Dynamic Nested Sampling Summary

     1. Can sample from multi-modal distributions.
     2. Can simultaneously estimate the evidence Z and posterior P.
     3. Combining independent runs improves inference ("trivially parallelizable").
     4. Can simulate uncertainties (sampling and statistical) from a single run.
     5. Enables adaptive sample allocation during runtime using arbitrary weight functions.
     6. Possesses evidence/posterior-based stopping criteria.
  123. Dynamic Nested Sampling with dynesty

     dynesty.readthedocs.io
     • Pure Python. • Easy to use. • Modular. • Open source. • Parallelizable.
     • Flexible bounding/sampling methods. • Thorough documentation!
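     A minimal usage sketch (a toy 3-D Gaussian likelihood with a uniform prior; argument
     values are illustrative, see dynesty.readthedocs.io for the full API):

        import numpy as np
        from dynesty import DynamicNestedSampler

        ndim = 3

        def loglike(x):
            # Toy target: independent unit Gaussians.
            return -0.5 * np.sum(x**2) - 0.5 * ndim * np.log(2.0 * np.pi)

        def prior_transform(u):
            # Map the unit cube to a uniform prior on [-10, 10] in each dimension.
            return 20.0 * u - 10.0

        dsampler = DynamicNestedSampler(loglike, prior_transform, ndim)
        dsampler.run_nested(wt_kwargs={'pfrac': 1.0})        # 100% posterior weight
        results = dsampler.results

        print(results.logz[-1], results.logzerr[-1])         # evidence and uncertainty
        samples = results.samples                            # posterior samples
        weights = np.exp(results.logwt - results.logz[-1])   # importance weights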
  124. Application

     • All results are preliminary but agree with results from MCMC methods (derived using
       emcee).
     • Samples allocated with 100% posterior weight, automated stopping criterion (2%
       fractional error in simulated KLD).
     • dynesty was substantially (~3-6x) more efficient at generating good samples than emcee,
       before thinning.
  125. Application: Modeling Galaxy SEDs

     Θ includes ln M_∗, a second log-parameter, and further blocks of 5, 6, and 2 parameters;
     D = 15. With: Joel Leja, Ben Johnson, Charlie Conroy.

  126. Application: Supernovae Light Curves

     Θ consists of parameter blocks of 4, 3, 3, and 2 parameters; D = 12.
     Fig: Open Supernova Catalog (LSQ12dlf), James Guillochon.
     With: James Guillochon, Kaisey Mandel.
  127. Dynamic Nested Sampling with dynesty

     dynesty.readthedocs.io
     • Pure Python. • Easy to use. • Modular. • Open source. • Parallelizable.
     • Flexible bounding/sampling methods. • Thorough documentation!