Slide 1

Slide 1 text

EXOPLANET POPULATIONS Inferring from noisy, incomplete catalogs Dan Foreman-Mackey CCPP@NYU // github.com/dfm // @exoplaneteer // dfm.io

Slide 2

Slide 2 text

Engineering

Slide 3

Slide 3 text

Data Science

Slide 4

Slide 4 text

Photo credit: James Silvester silvesterphoto.tumblr.com not data science.

Slide 5

Slide 5 text

Photo credit: Flickr user Marcin Wichary (CC BY) data science.

Slide 6

Slide 6 text

data science.

Slide 7

Slide 7 text

Data Science

Slide 8

Slide 8 text

Exoplanet population inference and the abundance of Earth analogs from noisy, incomplete catalogs Foreman-Mackey, Hogg & Morton (arXiv:1406.3020)

Slide 9

Slide 9 text

The Punchline
Existing methods for population inference are not robust.
Current data don't support claims of high Earth analog abundance.
There should be some transiting Earth analogs in the Kepler data.

Slide 10

Slide 10 text

The Question What can we say about the population of exoplanets given all of the pixels ever downloaded by Kepler?

Slide 11

Slide 11 text

Photo credit: NASA Kepler

Slide 12

Slide 12 text

Photo credit: NASA

Slide 13

Slide 13 text

Kepler-32

Slide 14

Slide 14 text

Kepler-32

Slide 15

Slide 15 text

No content

Slide 16

Slide 16 text

... ...

Slide 17

Slide 17 text

Data from: NASA Exoplanet Archive

Slide 18

Slide 18 text

Earth

Slide 19

Slide 19 text

... ...

Slide 20

Slide 20 text

Jupiter

Slide 21

Slide 21 text

Data from: NASA Exoplanet Archive

Slide 22

Slide 22 text

[Figure: survey completeness C (%) as a function of orbital period (days) and planet size (Earth-radii). Figure credit: Petigura, Howard & Marcy (2013)]

Slide 23

Slide 23 text

The Question What can we say about the population of exoplanets given all of the pixels ever downloaded by Kepler?

Slide 24

Slide 24 text

[Figure: survey completeness C (%) as a function of orbital period (days) and planet size (Earth-radii). Figure credit: Petigura, Howard & Marcy (2013)]

Slide 25

Slide 25 text

[Figure: planet occurrence (%) in bins of orbital period (6.25 to 400 days) and planet size (0.5 to 16 Earth-radii). Figure credit: Petigura, Howard & Marcy (2013)]

Slide 26

Slide 26 text

the method: inverse-detection-efficiency / maximum likelihood

Slide 27

Slide 27 text

the method
inverse-detection-efficiency (“non-parametric”): Howard et al. (2011), Dressing & Charbonneau (2013), Petigura et al. (2013), and more…
$\theta_j = \frac{1}{\Delta_j} \sum_{k=1}^{K} \frac{\mathbf{1}[w_k \in \Delta_j]}{Q(w_k)}$
maximum likelihood

Slide 28

Slide 28 text

the method
inverse-detection-efficiency (“non-parametric”): Howard et al. (2011), Dressing & Charbonneau (2013), Petigura et al. (2013), and more…
$\theta_j = \frac{1}{\Delta_j} \sum_{k=1}^{K} \frac{\mathbf{1}[w_k \in \Delta_j]}{Q(w_k)}$
maximum likelihood (“parametric”): Tabachnik & Tremaine (2002), Youdin (2011), and more…
$p(\{w_k\} \,|\, \theta) = \exp\left(-\int \hat{\Gamma}_\theta(w)\,\mathrm{d}w\right) \prod_{k=1}^{K} \hat{\Gamma}_\theta(w_k)$

Slide 29

Slide 29 text

The Inverse-Detection-Efficiency Procedure
$\theta_j = \frac{1}{\Delta_j} \sum_{k=1}^{K} \frac{\mathbf{1}[w_k \in \Delta_j]}{Q(w_k)}$
(a weighted histogram)
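The weighted histogram on this slide can be sketched in a few lines of NumPy. This is only an illustration under assumptions: the function name, the one-dimensional parameter `w`, and the toy numbers are hypothetical, and `q` is taken to hold the detection efficiency Q(w_k) already evaluated for each catalog entry.

```python
import numpy as np

def inverse_detection_efficiency(w, q, bin_edges):
    """Weighted histogram: theta_j = (1 / Delta_j) * sum_k 1[w_k in Delta_j] / Q(w_k).

    w         : measured parameter of each catalog entry (1-D, hypothetical)
    q         : detection efficiency Q(w_k) for each entry (assumed > 0)
    bin_edges : edges of the bins Delta_j
    """
    w = np.asarray(w, dtype=float)
    q = np.asarray(q, dtype=float)
    edges = np.asarray(bin_edges, dtype=float)
    # Each object counts as 1 / Q(w_k) objects.
    weighted_counts, _ = np.histogram(w, bins=edges, weights=1.0 / q)
    # Divide by the bin widths to turn counts into a density.
    return weighted_counts / np.diff(edges)

# Toy catalog (made-up numbers, not Kepler data).
w = np.array([0.1, 0.4, 0.6, 1.5])
q = np.array([0.5, 0.5, 0.25, 1.0])
theta = inverse_detection_efficiency(w, q, [0.0, 1.0, 2.0])
print(theta)
```

Note that every weight depends on Q evaluated at a single measured point, so an entry that lands where Q is small gets a very large weight.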

Slide 30

Slide 30 text

The Inverse-Detection-Efficiency Procedure
$\theta_j = \frac{1}{\Delta_j} \sum_{k=1}^{K} \frac{\mathbf{1}[w_k \in \Delta_j]}{Q(w_k)}$
(a weighted histogram)
BAD IDEA

Slide 31

Slide 31 text

The Inverse-Detection-Efficiency Procedure

Slide 32

Slide 32 text

The Inverse-Detection-Efficiency Procedure
truth: 50
inverse-detection-efficiency gives: 28.5 ± 5.5
maximum-likelihood gives: 54.0 ± 10.4

Slide 33

Slide 33 text

A Better Inverse-Detection-Efficiency Procedure (see: dfm.io/posts/histogram1)
$\theta_j = \frac{N_j}{\int_{\Delta_j} Q(w)\,\mathrm{d}w}$
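A minimal sketch of this estimator, assuming a one-dimensional parameter, that N_j is the number of catalog entries falling in bin Δ_j, and that the detection efficiency is available as a callable. The function name, the `completeness` argument, and the linear toy efficiency are hypothetical.

```python
import numpy as np

def binned_rate(w, bin_edges, completeness, n_grid=1001):
    """theta_j = N_j / integral over Delta_j of Q(w) dw.

    completeness : callable returning Q(w) for an array of w (hypothetical)
    n_grid       : number of grid points per bin for the trapezoid rule
    """
    w = np.asarray(w, dtype=float)
    edges = np.asarray(bin_edges, dtype=float)
    counts, _ = np.histogram(w, bins=edges)  # N_j
    theta = np.empty(len(edges) - 1)
    for j, (lo, hi) in enumerate(zip(edges[:-1], edges[1:])):
        grid = np.linspace(lo, hi, n_grid)
        q = completeness(grid)
        # Trapezoid rule for the integral of Q over the bin
        # (assumed to be strictly positive).
        integral = np.sum(0.5 * (q[1:] + q[:-1]) * np.diff(grid))
        theta[j] = counts[j] / integral
    return theta

# Toy example: efficiency falling linearly from 1 at w = 0 to 0 at w = 2.
def toy_q(w):
    return 1.0 - 0.5 * w

theta = binned_rate([0.1, 0.4, 0.6, 1.5], [0.0, 1.0, 2.0], toy_q)
print(theta)
```

Here the efficiency is averaged over the whole bin, so no single object's position sets the weight.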

Slide 34

Slide 34 text

but that's not the WHOLE STORY yet

Slide 35

Slide 35 text

Exoplanet population inference and the abundance of Earth analogs from noisy, incomplete catalogs Foreman-Mackey, Hogg & Morton (arXiv:1406.3020)

Slide 36

Slide 36 text

[Figure: survey completeness C (%) as a function of orbital period (days) and planet size (Earth-radii), annotated with a typical error bar. Figure credit: Petigura, Howard & Marcy (2013)]

Slide 37

Slide 37 text

How do you make a histogram of noisy measurements?

Slide 38

Slide 38 text

How do you infer the True distribution of noisy measurements?

Slide 39

Slide 39 text

[Figure: p(w) versus w, showing the truth]

Slide 40

Slide 40 text

[Figure: p(w) versus w, comparing the truth with the result of ignoring uncertainties]

Slide 41

Slide 41 text

[Figure: p(w) versus w, comparing the truth with the results of ignoring uncertainties and of intuitive resampling]

Slide 42

Slide 42 text

[Figure: p(w) versus w, comparing the truth with the results of ignoring uncertainties and of intuitive resampling]
BAD IDEA
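A small simulation can illustrate why both shortcuts mislead. It assumes one reading of "intuitive resampling": draw a new value from each measurement's error bar and histogram the draws. The Gaussian population, the noise level, and the variable names are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200_000
true_sigma, noise_sigma = 1.0, 0.5  # hypothetical population width and noise

# True values, noisy measurements, and "resampled" measurements.
w_true = rng.normal(0.0, true_sigma, n)
w_obs = w_true + rng.normal(0.0, noise_sigma, n)
w_resampled = w_obs + rng.normal(0.0, noise_sigma, n)

var_true = w_true.var()
var_obs = w_obs.var()              # truth convolved with the noise once
var_resampled = w_resampled.var()  # truth convolved with the noise twice

print(var_true, var_obs, var_resampled)
```

Under these assumptions the histogram of measurements is already broader than the truth, and resampling broadens it again instead of undoing the noise.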

Slide 43

Slide 43 text

Hierarchical Inference

Slide 44

Slide 44 text

The Question What can we say about the population of exoplanets given all of the pixels ever downloaded by Kepler?

Slide 45

Slide 45 text

Graphical model, for k = 1, …, K: global population θ → per-object parameters w_k (period, radius, etc.) → per-object observations x_k
$p(\{x_k\} \,|\, \theta) = \int p(\{x_k\}, \{w_k\} \,|\, \theta)\,\mathrm{d}\{w_k\}$

Slide 46

Slide 46 text

Graphical model, for k = 1, …, K: global population θ → per-object parameters w_k (period, radius, etc.) → per-object observations x_k
$p(\{x_k\} \,|\, \theta) = \int p(\{x_k\}, \{w_k\} \,|\, \theta)\,\mathrm{d}\{w_k\}$

Slide 47

Slide 47 text

$p(\{x_k\} \,|\, \theta) = \int p(\{x_k\}, \{w_k\} \,|\, \theta)\,\mathrm{d}\{w_k\} = \int p(\{x_k\} \,|\, \{w_k\})\, p(\{w_k\} \,|\, \theta)\,\mathrm{d}\{w_k\}$

Slide 48

Slide 48 text

$p(\{x_k\} \,|\, \theta) = \int p(\{x_k\}, \{w_k\} \,|\, \theta)\,\mathrm{d}\{w_k\} = \int p(\{x_k\} \,|\, \{w_k\})\, p(\{w_k\} \,|\, \theta)\,\mathrm{d}\{w_k\}$
this is HARD™ (generally impossible)

Slide 49

Slide 49 text

Hogg, Myers, & Bovy (2010) Inferring the eccentricity distribution [1008.4146]

Slide 50

Slide 50 text

What is a catalog?
posterior samples: $w_k^{(n)} \sim p(w_k \,|\, x_k, \alpha)$
interim prior: $p(w_k \,|\, \alpha)$
… we will reuse the hard work that went into building the catalog … each entry in a catalog is a representation of the posterior
$p(w_k \,|\, x_k, \alpha) = \frac{p(x_k \,|\, w_k)\, p(w_k \,|\, \alpha)}{p(x_k \,|\, \alpha)}$
… parameters $w_k$ conditioned on the observations of that object … reminder that the catalog was produced under a specific … interim prior $p(w_k \,|\, \alpha)$. This prior … is different from the likelihood $p(w_k \,|\, \theta)$ from Equation (2). … we can use these posterior measurements to simplify Equation … in many common cases, be evaluated efficiently …

Slide 51

Slide 51 text

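$p(\{x_k\} \,|\, \theta) = \int p(\{x_k\} \,|\, \{w_k\})\, p(\{w_k\} \,|\, \theta)\,\mathrm{d}\{w_k\}$
maths, using posterior samples $w_k^{(n)} \sim p(w_k \,|\, x_k, \alpha)$:
$\frac{p(\{x_k\} \,|\, \theta)}{p(\{x_k\} \,|\, \alpha)} \approx \exp\left(-\int \hat{\Gamma}_\theta(w)\,\mathrm{d}w\right) \prod_{k=1}^{K} \frac{1}{N_k} \sum_{n=1}^{N_k} \frac{\hat{\Gamma}_\theta(w_k^{(n)})}{p(w_k^{(n)} \,|\, \alpha)}$
includes completeness & uncertainties!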

Slide 52

Slide 52 text

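The "Money Equation™"
$\frac{p(\{x_k\} \,|\, \theta)}{p(\{x_k\} \,|\, \alpha)} \approx \exp\left(-\int \hat{\Gamma}_\theta(w)\,\mathrm{d}w\right) \prod_{k=1}^{K} \frac{1}{N_k} \sum_{n=1}^{N_k} \frac{\hat{\Gamma}_\theta(w_k^{(n)})}{p(w_k^{(n)} \,|\, \alpha)}$
$p(\{x_k\} \,|\, \theta)$: likelihood of pixels given population
$\int \hat{\Gamma}_\theta(w)\,\mathrm{d}w$: expected # of observable exoplanets
$\hat{\Gamma}_\theta$: the observable rate density
$p(w_k^{(n)} \,|\, \alpha)$: the original "interim" prior
$\prod_{k}$: product over objects
$\sum_{n}$: sum over posterior samples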
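The right-hand side of this equation can be evaluated directly from per-object posterior samples. The sketch below is a hedged illustration and is not taken from the exopop code: the function name and argument layout are hypothetical, and it assumes the caller has already evaluated ln Γ̂_θ and the log interim prior at every sample, plus the integral of Γ̂_θ.

```python
import numpy as np

def log_likelihood_ratio(log_rate_obs, log_interim_prior, expected_count):
    """ln of the importance-sampling estimate of p({x_k}|theta) / p({x_k}|alpha).

    log_rate_obs      : list (one array per object k) of ln Gamma_hat_theta(w_k^(n))
    log_interim_prior : list (one array per object k) of ln p(w_k^(n) | alpha)
    expected_count    : integral of Gamma_hat_theta(w) dw
    Assumes every object has at least one sample with non-zero rate.
    """
    total = -expected_count
    for log_rate, log_prior in zip(log_rate_obs, log_interim_prior):
        log_ratio = np.asarray(log_rate) - np.asarray(log_prior)
        # ln[(1 / N_k) * sum_n exp(log_ratio_n)], computed stably.
        peak = np.max(log_ratio)
        total += peak + np.log(np.mean(np.exp(log_ratio - peak)))
    return total

# Toy check with two objects and two posterior samples each (made-up values).
log_rate = [np.log([2.0, 2.0]), np.log([1.0, 3.0])]
log_prior = [np.zeros(2), np.zeros(2)]
value = log_likelihood_ratio(log_rate, log_prior, expected_count=3.0)
print(value)
```

Working in log space keeps the product over objects from underflowing when K is large.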

Slide 53

Slide 53 text

github.com/dfm/exopop

Slide 54

Slide 54 text

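The "Money Equation™"
$\frac{p(\{x_k\} \,|\, \theta)}{p(\{x_k\} \,|\, \alpha)} \approx \exp\left(-\int \hat{\Gamma}_\theta(w)\,\mathrm{d}w\right) \prod_{k=1}^{K} \frac{1}{N_k} \sum_{n=1}^{N_k} \frac{\hat{\Gamma}_\theta(w_k^{(n)})}{p(w_k^{(n)} \,|\, \alpha)}$
$p(\{x_k\} \,|\, \theta)$: likelihood of pixels given population
$\int \hat{\Gamma}_\theta(w)\,\mathrm{d}w$: expected # of observable exoplanets
$\hat{\Gamma}_\theta$: the observable rate density
$p(w_k^{(n)} \,|\, \alpha)$: the original "interim" prior
$\prod_{k}$: product over objects
$\sum_{n}$: sum over posterior samples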

Slide 55

Slide 55 text

The Rate Density Model
The "observable" rate density: $\hat{\Gamma}_\theta(w) = Q_c(w)\,\Gamma_\theta(w)$ where $\Gamma_\theta(w) = \frac{\mathrm{d}N}{\mathrm{d}w}$
… thing to note here is that $\hat{\Gamma}_\theta$ is the rate density of exoplanets … observe taking into account the geometric transit probability … efficiencies. In practice, we can model the observable rate density … $\hat{\Gamma}_\theta(w) = Q_c(w)\,\Gamma_\theta(w)$ … the detection efficiency (including transit probability) at $w$ … want to infer: the true occurrence rate density. … form for $\Gamma_\theta(w)$ and all of this derivation is equally applicable … density as, for example, a broken power law or a histogram. … rate density $\hat{\Gamma}$ is a quantitative description of the rate … in the Petigura et al. (2013b) catalog; it is not a description …

Slide 56

Slide 56 text

The Rate Density Model
… Instead, we could use a functional form for the … parameters along with the parameters of the rate density. … all the results in this Article, we'll model the rate density … function
$\Gamma_\theta(w) = \begin{cases} \exp(\theta_1) & w \in \Delta_1, \\ \exp(\theta_2) & w \in \Delta_2, \\ \cdots & \\ \exp(\theta_J) & w \in \Delta_J, \\ 0 & \text{otherwise} \end{cases}$
… parameters $\theta_j$ are the log step heights and the bins $\Delta_j$ are fixed …
(looks like a histogram)
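Evaluating such a step function at arbitrary points comes down to a bin lookup. The following is a minimal sketch assuming a two-dimensional grid in (ln P, ln R); the function name, the argument names, and the toy grid are hypothetical rather than the bins used in the paper.

```python
import numpy as np

def log_step_rate(ln_p, ln_r, theta, p_edges, r_edges):
    """ln Gamma_theta(w) for a piecewise-constant model on a 2-D grid.

    theta[i, j] is the log step height in period bin i and radius bin j.
    Points outside every bin get -inf, i.e. a rate of zero.
    """
    ln_p = np.atleast_1d(ln_p).astype(float)
    ln_r = np.atleast_1d(ln_r).astype(float)
    theta = np.asarray(theta, dtype=float)
    # np.digitize returns 1-based bin indices; 0 or len(edges) means outside.
    i = np.digitize(ln_p, p_edges) - 1
    j = np.digitize(ln_r, r_edges) - 1
    inside = (i >= 0) & (i < theta.shape[0]) & (j >= 0) & (j < theta.shape[1])
    out = np.full(ln_p.shape, -np.inf)
    out[inside] = theta[i[inside], j[inside]]
    return out

# Toy grid: two period bins, one radius bin (made-up heights).
theta = np.array([[-1.0], [-2.0]])
values = log_step_rate([0.5, 1.5, 2.5], [0.5, 0.5, 0.5],
                       theta, p_edges=[0.0, 1.0, 2.0], r_edges=[0.0, 1.0])
print(values)
```

Returning the log rate keeps the step heights θ_j directly usable as the sampled parameters.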

Slide 57

Slide 57 text

The Rate Density Model
prior on the log bin heights: $p(\theta) = \mathcal{GP}(\theta;\ \ldots)$
$p(\theta \,|\, \{x_k\}) \propto p(\theta)\, p(\{x_k\} \,|\, \theta)$
use MCMC to sample the posterior PDF for the bin heights
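The slide does not say which sampler is used, so the sketch below substitutes a plain random-walk Metropolis sampler as a stand-in, and a toy Gaussian log-posterior takes the place of the Gaussian-process prior times the likelihood. All names and numbers are hypothetical.

```python
import numpy as np

def metropolis(log_prob, theta0, n_steps, step_size, rng):
    """Random-walk Metropolis: returns the chain and the acceptance fraction."""
    theta = np.array(theta0, dtype=float)
    lp = log_prob(theta)
    chain = np.empty((n_steps, theta.size))
    n_accept = 0
    for t in range(n_steps):
        proposal = theta + step_size * rng.normal(size=theta.size)
        lp_new = log_prob(proposal)
        # Accept with probability min(1, exp(lp_new - lp)).
        if np.log(rng.uniform()) < lp_new - lp:
            theta, lp = proposal, lp_new
            n_accept += 1
        chain[t] = theta
    return chain, n_accept / n_steps

# Toy stand-in for ln p(theta) + ln p({x_k} | theta): a unit Gaussian
# centred on two hypothetical log bin heights.
target = np.array([1.0, -2.0])

def toy_log_posterior(theta):
    return -0.5 * np.sum((theta - target) ** 2)

rng = np.random.default_rng(0)
chain, accept_frac = metropolis(toy_log_posterior, [0.0, 0.0], 20000, 1.0, rng)
posterior_mean = chain[5000:].mean(axis=0)  # discard burn-in
print(posterior_mean, accept_frac)
```

In a real analysis `toy_log_posterior` would be replaced by the sum of the log prior and the log likelihood, and the chain would need convergence checks.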

Slide 58

Slide 58 text

[Figure: planet occurrence (%) in bins of orbital period (6.25 to 400 days) and planet size (0.5 to 16 Earth-radii). Figure credit: Petigura, Howard & Marcy (2013)]

Slide 59

Slide 59 text

[Figure: ln R/R⊕ versus ln P/day (R in R⊕, P in days), with marginal panels p(ln P/day) and p(ln R/R⊕). Figure credit: Foreman-Mackey, Hogg & Morton (2014)]

Slide 60

Slide 60 text

[Figure: ln R/R⊕ versus ln P/day (R in R⊕, P in days), with marginal panels p(ln P/day) and p(ln R/R⊕). Figure credit: Foreman-Mackey, Hogg & Morton (2014)]

Slide 61

Slide 61 text

[Figure: Γ(ln R/R⊕) versus ln R/R⊕, for all periods and for 6.25 ≤ P/day < 25, 25 ≤ P/day < 100, and 100 ≤ P/day < 400. Figure credit: Foreman-Mackey, Hogg & Morton (2014)]

Slide 62

Slide 62 text

[Figure: Γ(ln P/day) versus ln P/day, for all radii and for 0.5 ≤ R/R⊕ < 2, 2 ≤ R/R⊕ < 8, and 8 ≤ R/R⊕ < 32. Figure credit: Foreman-Mackey, Hogg & Morton (2014)]

Slide 63

Slide 63 text

[Figure: ln R/R⊕ versus ln P/day (R in R⊕, P in days), with marginal panels p(ln P/day) and p(ln R/R⊕). Figure credit: Foreman-Mackey, Hogg & Morton (2014)]

Slide 64

Slide 64 text

[Figure: posterior p(ln Γ⊕) versus ln Γ⊕. Figure credit: Foreman-Mackey, Hogg & Morton (2014)]

Slide 65

Slide 65 text

[Figure: posterior p(ln Γ⊕) versus ln Γ⊕. Figure credit: Foreman-Mackey, Hogg & Morton (2014)]
… our result has large fractional uncertainty … This is shown in Figure 9 where we compare the … to the published value and uncertainty. … Earth analogs is
$\Gamma_\oplus = 0.019^{+0.019}_{-0.010}\ \mathrm{nat}^{-2}$
… indicates that this quantity is a rate density, per … radius. Converted to these units, Petigura …
the expected number of Earth-like exoplanets per Sun-like star per ln-radius per ln-period

Slide 66

Slide 66 text

[Figure: comparison of ln Γ⊕ estimates: Petigura et al. (2013), Dong & Zhu (2013), linear extrapolation, negligible uncertainties, and Foreman-Mackey et al. (2014). Figure credit: Foreman-Mackey, Hogg & Morton (2014)]

Slide 67

Slide 67 text

[Figure credit: Petigura, Howard & Marcy (2013)]
… of stars having nearly Earth-size planets (1−2 R⊕) with any orbital period up to a maximum period, P, on the … size (1−2 R⊕) are included. This cumulative distribution reaches 20.2% at P = 50 d, meaning 20.4% of Sun-like …
Petigura et al. assumed that the period distribution of small planets is flat from 50 d to 400 d

Slide 68

Slide 68 text

[Figure: Γ(ln P/day) versus ln P/day, for all radii and for 0.5 ≤ R/R⊕ < 2, 2 ≤ R/R⊕ < 8, and 8 ≤ R/R⊕ < 32. Figure credit: Foreman-Mackey, Hogg & Morton (2014)]

Slide 69

Slide 69 text

[Figure: planet occurrence (%) in bins of orbital period (6.25 to 400 days) and planet size (0.5 to 16 Earth-radii). Figure credit: Petigura, Howard & Marcy (2013)]

Slide 70

Slide 70 text

[Figure: comparison of ln Γ⊕ estimates: Petigura et al. (2013), Dong & Zhu (2013), linear extrapolation, negligible uncertainties, and Foreman-Mackey et al. (2014). Figure credit: Foreman-Mackey, Hogg & Morton (2014)]

Slide 71

Slide 71 text

[Figure: comparison of ln Γ⊕ estimates: Petigura et al. (2013), Dong & Zhu (2013), linear extrapolation, negligible uncertainties, and Foreman-Mackey et al. (2014). Figure credit: Foreman-Mackey, Hogg & Morton (2014)]

Slide 72

Slide 72 text

[Figure: comparison of ln Γ⊕ estimates: Petigura et al. (2013), Dong & Zhu (2013), linear extrapolation, negligible uncertainties, and Foreman-Mackey et al. (2014). Figure credit: Foreman-Mackey, Hogg & Morton (2014)]

Slide 73

Slide 73 text

[Figure: ln R/R⊕ versus ln P/day (R in R⊕, P in days), with marginal panels p(ln P/day) and p(ln R/R⊕). Figure credit: Foreman-Mackey, Hogg & Morton (2014)]

Slide 74

Slide 74 text

[Figure: ln R/R⊕ versus ln P/day (R in R⊕, P in days), with marginal panels p(ln P/day) and p(ln R/R⊕). Figure credit: Foreman-Mackey, Hogg & Morton (2014)]

Slide 75

Slide 75 text

[Figure: ln R/R⊕ versus ln P/day (R in R⊕, P in days), with marginal panels p(ln P/day) and p(ln R/R⊕). Figure credit: Foreman-Mackey, Hogg & Morton (2014)]
… density is exactly what Petigura's extrapolation model predicts but, for comparison, we also integrate our inferred rate density over their choice of “Earth-like” bin (200 ≤ P/day < 400 and 1 ≤ R/R⊕ < 2) to find a rate of Earth analogs. The published rate is 0.057+… (Petigura et al. 2013b) and our posterior constraint is
$\int_{P=200\,\mathrm{day}}^{400\,\mathrm{day}} \int_{R=1\,R_\oplus}^{2\,R_\oplus} \Gamma_\theta(\ln P, \ln R)\,\mathrm{d}[\ln R]\,\mathrm{d}[\ln P] = 0.019^{+0.010}_{-0.008}$
9. Comparison with previous work
Our inferred rate density of Earth analogs (Equation 22) is not consistent with previously published results. In particular, our result is completely inconsistent with the earlier r… based on exactly the same dataset (Petigura et al. 2013b). This inconsistency is due to the different assumptions made but it merits some investigation. The two key differences between our analysis and previous work are (a) the form of the extrapolation function and (b) the presence of measurement uncertainties on the planet radii. The two main assumptions that we relax in this Article are the extrapolation …
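As a rough units check, a rate integrated over the "Earth-like" bin can be compared with a rate density by dividing by the bin's area in natural-log units. The arithmetic below assumes, purely for illustration, that the rate density is constant across the bin, and it uses only the central published value of 0.057 quoted on the slide.

```latex
\int_{\ln 200}^{\ln 400} \int_{\ln 1}^{\ln 2} \mathrm{d}[\ln R]\,\mathrm{d}[\ln P]
  = \ln\!\left(\tfrac{400}{200}\right) \ln\!\left(\tfrac{2}{1}\right)
  = (\ln 2)^2 \approx 0.48\ \mathrm{nat}^2,
\qquad
\frac{0.057}{0.48\ \mathrm{nat}^2} \approx 0.12\ \mathrm{nat}^{-2}
```

Under that assumption, the published rate corresponds to a mean density of roughly 0.12 nat⁻² over the bin, which is the kind of number that can be set beside a rate density measured in nat⁻².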

Slide 76

Slide 76 text

The Number of Transiting Earths
… places a probabilistic constraint on the number of … existing Kepler dataset. If we adopt the definition of “… 2013b, 200 ≤ P/day < 400 and 1 ≤ R/R⊕ < 2), and integ… density function and the geometric transit probability (Equa…, the expected number of Earth-like exoplanets transiting … stars chosen by Petigura et al. (2013b) is
$N_{\oplus,\,\mathrm{transiting}} = 10.6^{+5.9}_{-4.5}$
… uncertainties are only on the expectation value and don't inc… This is an exciting result because it means that, if we … planet search pipelines to small planets orbiting on long … Earth analogs in the existing data. Furthermore, because … systems in the catalog, the true expected number of … orbiting Sun-like stars is probably larger than the values in …
Let's go find them!

Slide 77

Slide 77 text

The Punchline
Existing methods for population inference are not robust.
Current data don't support claims of high Earth analog abundance.
There should be some transiting Earth analogs in the Kepler data.

Slide 78

Slide 78 text

Extras

Slide 79

Slide 79 text

[Figure: ln R/R⊕ versus ln P/day (R in R⊕, P in days), with marginal panels p(ln P/day) and p(ln R/R⊕). Figure credit: Foreman-Mackey, Hogg & Morton (2014)]

Slide 80

Slide 80 text

[Figure: posterior p(ln Γ⊕) versus ln Γ⊕. Figure credit: Foreman-Mackey, Hogg & Morton (2014)]

Slide 81

Slide 81 text

[Figure: ln R/R⊕ versus ln P/day (R in R⊕, P in days), with marginal panels p(ln P/day) and p(ln R/R⊕). Figure credit: Foreman-Mackey, Hogg & Morton (2014)]

Slide 82

Slide 82 text

[Figure: posterior p(ln Γ⊕) versus ln Γ⊕. Figure credit: Foreman-Mackey, Hogg & Morton (2014)]