Exoplanet population inference: a tutorial

My talk at the KITP exostar conference

Dan Foreman-Mackey

May 22, 2019

Transcript

  1. Exoplanet Population Inference: A Tutorial. Dan Foreman-Mackey, CCA@Flatiron // dfm.io

  2. Today I'll mostly talk about transiting exoplanets*. The methods can apply more broadly.

    * This is what I know about and work on!
  3. 1 Exoplanet population inference

  4. [Figure: planet radius [Earth radii] versus orbital period [days]; data: NASA Exoplanet Archive]

  5. Burke, Christiansen et al. (2015), Figure 1: Fractional completeness model for the host to Kepler-22b (KIC: 10593626) in the Q1-Q16 pipeline run using the analytic model described in Section 2.

  6. Take these catalogs and get the physics of planet formation

    and evolution.
  7. That's hard.

  8. [Figure: planet radius [Earth radii] versus orbital period [days]; data: NASA Exoplanet Archive]

  9. Fulton & Petigura (2018), Figure 5: The distribution of close-in planet sizes. The top panel shows the distribution from Fulton et al. (2017) and the bottom panel is the updated distribution from this work. The solid line shows the number of planets per star with orbital periods less than 100 days as a function of planet size.

  10. 2 What is an occurrence rate?

  11. 1 The expected number of planets per star.

  12. 2 The fraction of stars with planets.

  13. 3 The expected number of planets per star per unit planet property.

  14. 4 etc.

  15. None of these definitions is inherently better than the others.

  16. But. They are all different.

  17. They have different units.

  18. They all depend on a specific (often unstated) definition of "planets".

  19. So. It can be hard to compare and understand how

    they relate.
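
As a small illustration of how two of these definitions relate, the sketch below converts an expected number of planets per star into a fraction of stars with at least one planet. It assumes the number of planets per star is Poisson distributed, which is an assumption made only for this example and is not part of the talk; the function name is hypothetical.

```python
import math


def fraction_of_stars_with_planets(expected_planets_per_star):
    """Fraction of stars hosting at least one planet.

    Assumes (for illustration only) a Poisson-distributed number of
    planets per star with the given mean.
    """
    return 1.0 - math.exp(-expected_planets_per_star)


for rate in (0.1, 0.5, 1.0, 2.0):
    frac = fraction_of_stars_with_planets(rate)
    print(f"{rate:.1f} planets per star -> {frac:.3f} of stars have planets")
```

Under this assumption the two numbers nearly agree when the rate is small and diverge as it grows, so a bare "10%" is ambiguous unless the definition is stated.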
  20. Them*: "The occurrence rate is 10%." Y'all: "What does it all mean?!?1?"

    * Including me and others in the room.
  22. Fulton & Petigura (2018), Figure 5: The distribution of close-in planet sizes. The top panel shows the distribution from Fulton et al. (2017) and the bottom panel is the updated distribution from this work. The solid line shows the number of planets per star with orbital periods less than 100 days as a function of planet size.

  23. Same figure. What do these numbers mean?

  24. Same figure. What do these numbers mean? The expected number of planets per star with a period in the range 0–100 days and radius in the given bin.

  25. Simulations (github.com/dfm/exostar19). [Figure: expected number of planets per star]

  26. 3 How to estimate an occurrence rate?

  27. (1) Inverse detection efficiency (2) Probabilistic modeling (3) Approximate Bayesian Computation

  28. 1 Inverse detection efficiency: $N_\mathrm{expect} = \frac{1}{N_\mathrm{tot}} \sum_{j=1}^{N} \frac{1}{P_\mathrm{det}(x_j)}$

    Note: don't do this!
  29. 2 Probabilistic modeling: $N_\mathrm{expect} = \arg\max_{N_\mathrm{expect}}\, p(N_\mathrm{obs}, \{x_j\} \mid N_\mathrm{expect}, N_\mathrm{tot})$

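
A minimal numerical sketch of the "arg max" step, written for the occurrence rate Q rather than for N_expect. It assumes the log-likelihood has the form N0 log(1 − Q P0) + N1 log Q, which is the expression given in the extras at the end of this talk. The counts, the value of P0, the function names, and the brute-force grid search are all hypothetical choices for illustration.

```python
import math


def log_likelihood(q, n0, n1, p0):
    """N0 log(1 - Q P0) + N1 log Q, dropping the Q-independent constant."""
    return n0 * math.log(1.0 - q * p0) + n1 * math.log(q)


def grid_argmax(n0, n1, p0, n_grid=1000):
    """Brute-force maximum over Q on a regular grid in (0, 1)."""
    grid = [i / n_grid for i in range(1, n_grid)]
    return max(grid, key=lambda q: log_likelihood(q, n0, n1, p0))


# Hypothetical numbers: 900 stars without detections, 100 with, P0 = 0.5.
n0, n1, p0 = 900, 100, 0.5
q_grid = grid_argmax(n0, n1, p0)
q_closed_form = n1 / ((n0 + n1) * p0)
print(q_grid, q_closed_form)
```

The grid maximum can be compared against N1 / ((N0 + N1) P0), the closed form quoted later in the talk.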
  30. 3 Approximate Bayesian Computation

  32. (1) Inverse detection efficiency (2) Probabilistic modeling (3) Approximate Bayesian Computation

  33. (1) Inverse detection efficiency (2) Probabilistic modeling (3) Approximate Bayesian Computation [slide annotations: ≈, =]

  35. Want: $P(q_j)$, where $q_j$ is the true number of planets. Have: $n_j$, the observed number of planets, and $x_j$, the properties of the planets and the star.

  36. $P(n_j \mid x_j, q_j)$, where $n_j$ is the observed number of planets, $x_j$ is the properties of the planets and the star, and $q_j$ is the true number of planets.

  37. Start with either zero or one planet(s).

  38. There are four options.

  39. Value of $P(n_j \mid x_j, q_j)$, by true number of planets $q_j$ and observed number of planets $n_j$:

                     q_j = 0    q_j = 1
        n_j = 0      1          1 − P_det(x_j)
        n_j = 1      0          P_det(x_j)
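
The four options can be written as a small lookup. This is a sketch for the zero-or-one-planet case only; the function name and the example detection probability are hypothetical.

```python
def p_observed_given_true(n_obs, q_true, p_det):
    """P(n_j | x_j, q_j) for n_j, q_j in {0, 1}; p_det stands for P_det(x_j)."""
    if q_true == 0:
        # No planet: nothing can be detected.
        return 1.0 if n_obs == 0 else 0.0
    # One planet: detected with probability p_det, missed otherwise.
    return p_det if n_obs == 1 else 1.0 - p_det


p_det = 0.3  # hypothetical detection probability for one star
for q_true in (0, 1):
    row = [p_observed_given_true(n, q_true, p_det) for n in (0, 1)]
    print(f"q_j = {q_true}: P(n_j = 0) = {row[0]}, P(n_j = 1) = {row[1]}")
```

For each value of q_j the two entries sum to one, which is a quick sanity check on the table.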
  40. But. We don't know the true number of planets.

  41. Marginalize!

  42. $P(n_j \mid x_j) = \sum_{q_j \in \{0, 1\}} P(q_j)\, P(n_j \mid x_j, q_j) = Q\, P(n_j \mid x_j, q_j = 1) + (1 - Q)\, P(n_j \mid x_j, q_j = 0)$

  44. $P(n_j \mid x_j) = \sum_{q_j \in \{0, 1\}} P(q_j)\, P(n_j \mid x_j, q_j) = Q\, P(n_j \mid x_j, q_j = 1) + (1 - Q)\, P(n_j \mid x_j, q_j = 0)$

    $Q$ is the parameter that we want to fit for!
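
Substituting the four values of $P(n_j \mid x_j, q_j)$ (1, 0, 1 − P_det, P_det) into this sum makes the dependence on $Q$ explicit. This is a worked step under the same zero-or-one-planet assumption, with $Q = P(q_j = 1)$, not an additional result:

```latex
\begin{align}
P(n_j = 1 \mid x_j) &= Q\,P_\mathrm{det}(x_j) + (1 - Q)\cdot 0 = Q\,P_\mathrm{det}(x_j) \\
P(n_j = 0 \mid x_j) &= Q\,[1 - P_\mathrm{det}(x_j)] + (1 - Q)\cdot 1 = 1 - Q\,P_\mathrm{det}(x_j)
\end{align}
```

The two cases sum to one, as they must.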
  45. But. We don't know the properties of the unobserved planets.

  46. Marginalize!

  47. Systems with detected planets: $P(n_j = 1) = p(x_j)\, P(n_j = 1 \mid x_j) = p(x_j)\, Q\, P(n_j = 1 \mid x_j, q_j = 1)$

    Systems with no detected planets: $P(n_j = 0) = \int p(x_j)\, P(n_j = 0 \mid x_j)\, \mathrm{d}x_j = 1 - Q \int p(x_j)\, P(n_j = 1 \mid x_j, q_j = 1)\, \mathrm{d}x_j = 1 - Q\, P_0$
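  48. Systems with detected planets: $P(n_j = 1) = p(x_j)\, P(n_j = 1 \mid x_j) = p(x_j)\, Q\, P(n_j = 1 \mid x_j, q_j = 1)$

    Systems with no detected planets: $P(n_j = 0) = \int p(x_j)\, P(n_j = 0 \mid x_j)\, \mathrm{d}x_j = 1 - Q \int p(x_j)\, P(n_j = 1 \mid x_j, q_j = 1)\, \mathrm{d}x_j = 1 - Q\, P_0$, where $P(n_j = 1 \mid x_j, q_j = 1)$ is the detection probability.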
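
The integral defining $P_0$ is an average of the detection probability over $p(x_j)$, so it can be estimated by drawing samples from $p(x_j)$. The sketch below assumes a one-dimensional property with a Uniform(0, 1) distribution and a linear detection probability; both are hypothetical stand-ins chosen so that the exact answer (0.5) is known.

```python
import random


def p_det(x):
    """Hypothetical detection probability, rising linearly with x in [0, 1]."""
    return x


def estimate_p0(n_samples, rng):
    """Monte Carlo estimate of P0 = integral of p(x) P_det(x) dx.

    Draws x from p(x), here assumed Uniform(0, 1), and averages P_det(x).
    """
    total = 0.0
    for _ in range(n_samples):
        total += p_det(rng.random())
    return total / n_samples


rng = random.Random(42)
p0 = estimate_p0(100_000, rng)
print(f"P0 is approximately {p0:.3f} (exact value for this toy: 0.5)")
```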
  49. Put it all together. An exercise for the reader…

  50. $Q = \frac{N_1}{N_0 + N_1}\, \frac{1}{P_0} \neq \frac{1}{N_0 + N_1} \sum_{j=1}^{N_1} \frac{1}{P_j}$, where $Q$ is the occurrence rate and $\frac{N_1}{N_0 + N_1}$ is the fraction of stars with observed planets.

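  51. $Q = \frac{N_1}{N_0 + N_1}\, \frac{1}{P_0} \neq \frac{1}{N_0 + N_1} \sum_{j=1}^{N_1} \frac{1}{P_j}$, where $Q$ is the occurrence rate and $\frac{N_1}{N_0 + N_1}$ is the fraction of stars with observed planets.

    $P_0 = \int p(x_j)\, P(n_j = 1 \mid x_j, q_j = 1)\, \mathrm{d}x_j$ is the detection probability averaged over the distribution of planet and stellar properties.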
  53. see: dfm.io/posts/histogram1

  54. Truth: 50. Inverse-detection-efficiency gives: 28.5 ± 5.5. See: dfm.io/posts/histogram1

  55. Truth: 50. Inverse-detection-efficiency gives: 28.5 ± 5.5. Maximum-likelihood gives: 54.0 ± 10.4. See: dfm.io/posts/histogram1

  56. Inverse detection efficiency is not the right estimator.

  57. Instead, take the fraction of detections and divide by the average detection efficiency*.

    * Averaged over the correct distribution for all planet and star properties.
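
A toy sketch of this estimator on simulated data. Everything about the simulation is an assumption made for illustration: each star hosts zero or one planet, a single property x is drawn from Uniform(0, 1), and the detection probability is a hypothetical linear function of x. It does not attempt to reproduce the numbers quoted on the earlier slides.

```python
import random


def p_det(x):
    """Hypothetical detection probability as a function of a property x."""
    return 0.1 + 0.8 * x


def simulate_detections(n_stars, q_true, rng):
    """Return the properties x_j of detected planets (zero or one per star)."""
    detected = []
    for _ in range(n_stars):
        if rng.random() < q_true:          # the star hosts a planet
            x = rng.random()               # x ~ Uniform(0, 1), i.e. p(x) = 1
            if rng.random() < p_det(x):    # the planet is detected
                detected.append(x)
    return detected


rng = random.Random(0)
n_stars, q_true = 20_000, 0.3
detected = simulate_detections(n_stars, q_true, rng)

# Average detection efficiency: integral of (0.1 + 0.8 x) over Uniform(0, 1).
p0 = 0.5

# Fraction of detections divided by the average detection efficiency.
q_hat = (len(detected) / n_stars) / p0
print(f"truth: {q_true}, estimate: {q_hat:.3f}")
```

Note that the average is taken over the assumed population distribution p(x), not over the detected planets.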
  58. The key ingredient is the detection efficiency model.

  59. Burke, Christiansen et al. (2015), Figure 1: Fractional completeness model for the host to Kepler-22b (KIC: 10593626) in the Q1-Q16 pipeline run using the analytic model described in Section 2.

  60. Remember: an occurrence rate depends on a lot of decisions!

  61. (1) Stellar sample (2) Range of planet parameters (3) Units (4) Planet multiplicity

  62. 4 Complications

  63. (1) Multiplicity (2) Uncertainties (planetary and stellar) (3) False positives (4) Heterogeneous catalogs

  64. You end up needing to do an integral over all the properties of all the planets and false positives that you didn't observe.

  65. 1 Mathematica™ can't do that integral.

  66. 2 Eric Agol can't do that integral.

  67. 3 MCMC can't do that integral*. * in finite time.

  68. This is where you use approximate Bayesian computation (ABC).

  69. This is where you use approximate Bayesian computation (ABC), a.k.a. likelihood-free inference.

  70. Likelihood-free inference is a method for doing rigorous inference with stochastic models.

  71. If you can simulate it (a realistic catalog), then you can do inference. The promise of "likelihood-free inference".

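
A minimal sketch of the idea using rejection ABC, one simple variant of likelihood-free inference; the talk does not say which algorithm is used in practice. The forward model (zero or one planet per star, Uniform(0, 1) property, linear detection probability), the summary statistic (number of detections), the flat prior, and the tolerance are all hypothetical choices for illustration.

```python
import random


def p_det(x):
    """Hypothetical detection probability as a function of a property x."""
    return 0.2 + 0.6 * x


def simulate_catalog(q, n_stars, rng):
    """Forward model: number of detected planets among n_stars stars."""
    n_detected = 0
    for _ in range(n_stars):
        # A planet exists with probability q and is then detected with
        # probability p_det(x), where x ~ Uniform(0, 1).
        if rng.random() < q and rng.random() < p_det(rng.random()):
            n_detected += 1
    return n_detected


def abc_rejection(n_observed, n_stars, n_draws, tolerance, rng):
    """Keep prior draws of Q whose simulated catalog resembles the data."""
    accepted = []
    for _ in range(n_draws):
        q = rng.random()  # prior: Q ~ Uniform(0, 1)
        if abs(simulate_catalog(q, n_stars, rng) - n_observed) <= tolerance:
            accepted.append(q)
    return accepted


rng = random.Random(1)
n_stars = 400
n_observed = simulate_catalog(0.4, n_stars, rng)  # stand-in "data", Q = 0.4
samples = abc_rejection(n_observed, n_stars, n_draws=3000, tolerance=5, rng=rng)
if samples:
    print(f"{len(samples)} accepted, posterior mean Q = {sum(samples) / len(samples):.3f}")
```

No likelihood is ever evaluated: the only requirement is a simulator and a way to compare simulated and observed catalogs.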
  72. Hsu et al. (2019), Figure 2: Inferred occurrence rates for Kepler's DR25 planet candidates associated with high-quality FGK target stars. These rates are based on a combined detection and vetting efficiency model that was fit to flux-level planet injection tests.

  73. There's still lots to do!

  74. EPOS; Mulders et al. (2018), Figure 10: Comparison of simulated planets for the example model (blue) with detected planets (orange). The comparison region (black box) excludes hot …

  76. 5 Take homes

  77. An occurrence rate needs to come with a lot of

    metadata.
  78. Comparing occurrence rates: Check the units. Check the parameter ranges.

  79. Don't sum the inverse detection probabilities for your planets!*

    * A more reliable estimator is just as easy to compute!
  80. If you're using a method that seems intuitive, make sure the math checks out!

  81. Likelihood-free inference* seems like a promising way forward.

    * a.k.a. Approximate Bayesian Computation (ABC)
  82. It's over.

  83. Extras.

  84. $p(\{n_j\}, \{x_j\} \mid Q) = [1 - Q\, P_0]^{N_0} \left[ \prod_{j=1}^{N_1} Q\, p(x_j)\, P(n_j = 1 \mid x_j, q_j = 1) \right]$

  85. $\log p(\{n_j\}, \{x_j\} \mid Q) = N_0 \log(1 - Q\, P_0) + N_1 \log Q + \mathrm{constant}$

  86. $\log p(\{n_j\}, \{x_j\} \mid Q) = N_0 \log(1 - Q\, P_0) + N_1 \log Q + \mathrm{constant}$

    $Q = \frac{N_1}{N_0 + N_1}\, \frac{1}{P_0} \neq \frac{1}{N_0 + N_1} \sum_{j=1}^{N_1} \frac{1}{P_j}$
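
One way to get from the log-likelihood to the expression for $Q$ is to set its derivative with respect to $Q$ to zero. The steps below are a sketch of that calculation, using only the symbols already defined on these slides:

```latex
\begin{align}
\frac{\mathrm{d}}{\mathrm{d}Q} \log p(\{n_j\}, \{x_j\} \mid Q)
  &= -\frac{N_0\,P_0}{1 - Q\,P_0} + \frac{N_1}{Q} = 0 \\
\Rightarrow\quad N_1\,(1 - Q\,P_0) &= N_0\,P_0\,Q \\
\Rightarrow\quad Q &= \frac{N_1}{N_0 + N_1}\,\frac{1}{P_0}
\end{align}
```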
  87. Simulations (github.com/dfm/exostar19). [Figure: "truth", fraction of stars with planets, and expected number of planets per star]

  88. Note: this is preliminary & really just a toy… Assuming: no mutual inclination; only geometric transit probability; $0.5 < R_P / R_\mathrm{Earth} < 8$; $10 < a / R_\mathrm{star} < 30$. Kepler data: github.com/dfm/exostar19

  89. $0.5 < R_P / R_\mathrm{Earth} < 8$; $10 < a / R_\mathrm{star} < 30$. Kepler data: github.com/dfm/exostar19. Note: this is preliminary & really just a toy… Assuming: no mutual inclination; only geometric transit probability.