Exoplanet population inference

Slide 1

Slide 1 text

EXOPLANET POPULATIONS: Inferring from noisy, incomplete catalogs. Dan Foreman-Mackey, CCPP@NYU // github.com/dfm // @exoplaneteer // dfm.io

Slide 2

Slide 2 text

I have a confession to make…

Slide 3

Slide 3 text

HELLO my name is DFM and…

Slide 4

Slide 4 text

I — NOISE

Slide 5

Slide 5 text

Fast Gaussian processes Ambikasaran, DFM, et al. (arXiv:1403.6015)

Slide 6

Slide 6 text

“Aren’t kernel matrices Hierarchical Off-Diagonal Low-Rank?” (no astronomer ever)

Slide 7

Slide 7 text

$K^{(3)} = K_3 \times K_2 \times K_1 \times K_0$. Legend: full rank; low-rank; identity matrix; zero matrix. Ambikasaran, DFM, et al. (arXiv:1403.6015)
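The claim behind this factorization is that the off-diagonal blocks of a kernel matrix are numerically low-rank. A minimal sketch of how one might check that, assuming a squared-exponential kernel and evenly sampled times (both are illustrative choices, not taken from the slides; `offdiag_numerical_rank` is a hypothetical helper, not part of any library):

```python
import numpy as np

def offdiag_numerical_rank(x, length_scale=1.0, tol=1e-10):
    """Numerical rank of the upper-right off-diagonal block of a kernel matrix."""
    # Squared-exponential kernel: an assumption made for illustration only.
    K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / length_scale**2)
    half = len(x) // 2
    block = K[:half, half:]                      # couples the first half to the second half
    s = np.linalg.svd(block, compute_uv=False)   # singular values, in descending order
    rank = int(np.sum(s > tol * s[0]))           # count values above a relative tolerance
    return rank, block.shape[0]

x = np.linspace(0.0, 10.0, 200)   # sorted, evenly sampled "times"
rank, size = offdiag_numerical_rank(x)
print(f"off-diagonal block: {size} x {size}, numerical rank {rank}")
```

When the numerical rank is much smaller than the block size, the block can be stored and applied as a product of thin matrices, which is what a hierarchical solver exploits at every level.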

Slide 8

Slide 8 text

[Figure: horizontal axis is time [days].] Ambikasaran, DFM, et al. (arXiv:1403.6015)

Slide 9

Slide 9 text

Barclay, Endl, Huber, DFM, et al. (submitted)

Slide 10

Slide 10 text

github.com/dfm/george

Slide 11

Slide 11 text

Slide 12

Slide 12 text

Hierarchical inference of exoplanet populations DFM, Hogg & Morton (arXiv:1406.3020)

Slide 13

Slide 13 text

Given a set of light curves and/or radial velocities, what can we say about the population of exoplanets?

Slide 14

Slide 14 text

Occurrence rate: the true rate (or frequency) of exoplanets as a function of their physical parameters.

Slide 15

Slide 15 text

just make a HISTOGRAM

Slide 16

Slide 16 text

Occurrence rate histogram: the catalog is complete & independent; the rate density function is piecewise constant; the measurement uncertainties are negligible. * also ignoring false positives, etc.
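Under exactly these conditions the occurrence rate is just a normalized count. A minimal sketch, assuming a catalog of point estimates in ln P and ln R and a known number of searched stars (`occurrence_histogram`, `n_stars`, and the toy numbers are hypothetical, introduced only for illustration):

```python
import numpy as np

def occurrence_histogram(ln_period, ln_radius, period_edges, radius_edges, n_stars):
    """Planets per star in each bin, and the same divided by the bin area."""
    counts, _, _ = np.histogram2d(
        ln_period, ln_radius, bins=[period_edges, radius_edges]
    )
    # Area of each (ln P, ln R) cell; rows follow period bins, columns radius bins.
    area = np.outer(np.diff(period_edges), np.diff(radius_edges))
    per_star = counts / n_stars      # expected planets per star in each bin
    density = per_star / area        # per star, per ln P, per ln R
    return per_star, density

# Toy catalog: three planets found around ten searched stars.
per_star, density = occurrence_histogram(
    np.array([0.5, 0.6, 1.5]), np.array([0.2, 0.3, 0.2]),
    np.array([0.0, 1.0, 2.0]), np.array([0.0, 0.5, 1.0]), n_stars=10,
)
print(per_star)
print(density)
```

Nothing here corrects for missed planets or noisy parameters, which is why each condition on the slide matters.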

Slide 17

Slide 17 text

1. Measurement uncertainties

Slide 18

Slide 18 text

Given a set of light curves and/or radial velocities, what can we say about the population of exoplanets?

Slide 19

Slide 19 text

Given a set of light curves and/or radial velocities, what can we say about the population of exoplanets? $p(\mathrm{data} \mid \mathrm{population})$

Slide 20

Slide 20 text

Hierarchical? $p(\mathrm{data} \mid \mathrm{catalog})\; p(\mathrm{catalog} \mid \mathrm{population})$

Slide 21

Slide 21 text

Graphical model, for $k = 1, \cdots, K$: $\theta$ (global population); $w_k$ (per-object parameters: period, radius, etc.); $x_k$ (per-object observations).

$$p(\{x_k\} \mid \theta) = \int p(\{x_k\}, \{w_k\} \mid \theta)\, \mathrm{d}\{w_k\}$$

Slide 22

Slide 22 text

Annotations on the slide: “??”, “measurements”.

$$p(\{x_k\} \mid \theta) = \int p(\{x_k\}, \{w_k\} \mid \theta)\, \mathrm{d}\{w_k\} = \int p(\{x_k\} \mid \{w_k\})\, p(\{w_k\} \mid \theta)\, \mathrm{d}\{w_k\}$$

Slide 23

Slide 23 text

this is HARD™

Slide 24

Slide 24 text

Hogg, Myers, & Bovy (2010) Inferring the eccentricity distribution [1008.4146]

Slide 25

Slide 25 text

What is a catalog? Posterior samples $w_k^{(n)}$ drawn under an interim prior $\alpha$: $w_k^{(n)} \sim p(w_k \mid x_k, \alpha)$

Slide 26

Slide 26 text

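The maths…

$$
\begin{aligned}
p(\{x_k\} \mid \theta) &= \prod_{k=1}^{K} \int p(x_k \mid w_k)\, p(w_k \mid \theta)\, \mathrm{d}w_k \\
&= \prod_{k=1}^{K} \int p(x_k \mid w_k)\, p(w_k \mid \theta)\, \frac{p(w_k \mid x_k, \alpha)}{p(w_k \mid x_k, \alpha)}\, \mathrm{d}w_k \\
&= Z_\alpha \prod_{k=1}^{K} \int \frac{p(w_k \mid \theta)}{p(w_k \mid \alpha)}\, p(w_k \mid x_k, \alpha)\, \mathrm{d}w_k \\
&\approx Z_\alpha \prod_{k=1}^{K} \frac{1}{N} \sum_{n=1}^{N} \frac{p(w_k^{(n)} \mid \theta)}{p(w_k^{(n)} \mid \alpha)}
\end{aligned}
$$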
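The last line of this derivation is an importance-sampling average over each object's posterior samples, so it is cheap to evaluate once the samples exist. A minimal sketch in log space, assuming the samples are held as one array per object; the function name, the toy population density, and the fake samples are all hypothetical stand-ins:

```python
import numpy as np

def log_marginal_likelihood(samples, log_pop, log_interim):
    """Return ln p({x_k} | theta) - ln Z_alpha from per-object posterior samples.

    samples     : list of K arrays; samples[k] holds the draws w_k^(n)
    log_pop     : callable returning ln p(w | theta) for an array of w
    log_interim : callable returning ln p(w | alpha) for an array of w
    """
    total = 0.0
    for w in samples:
        log_ratio = log_pop(w) - log_interim(w)
        # ln[(1/N) sum_n exp(log_ratio_n)], computed stably in log space
        total += np.logaddexp.reduce(log_ratio) - np.log(len(w))
    return total

# Toy setup: eccentricities in [0, 1), a uniform interim prior, and an assumed
# one-parameter population density p(e | theta) = (1 + theta) * (1 - e)**theta.
rng = np.random.default_rng(0)
fake_samples = [rng.uniform(0.0, 1.0, size=50) for _ in range(5)]
theta = 2.0
value = log_marginal_likelihood(
    fake_samples,
    log_pop=lambda e: np.log1p(theta) + theta * np.log1p(-e),
    log_interim=lambda e: np.zeros_like(e),
)
print(value)
```

The constant $Z_\alpha$ does not depend on $\theta$, so it can be dropped when optimizing or sampling over population parameters.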

Slide 27

Slide 27 text

[Figure: Hogg, Myers, & Bovy (2010), Fig. 2. Panels of frequency p(e) versus eccentricity e for 300 stars, showing ML estimates and the inferred distribution. Caption fragment: “True, maximum-likelihood … of 300 ersatz exoplanets. …”]

Slide 28

Slide 28 text

The probabilistic histogram: the catalog is complete & independent; the rate density function is piecewise constant; the measurement uncertainties are non-negligible & known. * also ignoring false positives, etc.

Slide 29

Slide 29 text

2. Detection efficiency

Slide 30

Slide 30 text

the dreaded DETECTION EFFICIENCY

Slide 31

Slide 31 text

the method: inverse-detection-efficiency / maximum likelihood

Slide 32

Slide 32 text

the method: inverse-detection-efficiency / maximum likelihood. “non-parametric”: Howard et al. (2011), Dressing & Charbonneau (2013), Petigura et al. (2013), and more…

Slide 33

Slide 33 text

the method: inverse-detection-efficiency / maximum likelihood. “non-parametric”: Howard et al. (2011), Dressing & Charbonneau (2013), Petigura et al. (2013), and more… “parametric”: Tabachnik & Tremaine (2002), Youdin (2011), and more…

Slide 34

Slide 34 text

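The inhomogeneous Poisson process (in exoplanets: Tabachnik & Tremaine 2002, Youdin 2011, etc.)

$$p(\{w_k\} \mid \theta) = \exp\left(-\int \hat{\Gamma}_\theta(w)\, \mathrm{d}w\right) \prod_{k=1}^{K} \hat{\Gamma}_\theta(w_k)$$

The observable rate density: $\hat{\Gamma}_\theta(w) \equiv \Gamma_\theta(w)\, Q_c(w)$

For example: $\Gamma_\theta(\ln P, \ln R) = \dfrac{\mathrm{d}N}{\mathrm{d}\ln P\; \mathrm{d}\ln R}$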
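Writing Γ for the true rate density and Q_c for the detection efficiency, the log of this likelihood is minus the expected number of detections plus the summed log observable rate at each detection. A minimal sketch that approximates the integral with a midpoint sum on a regular grid; the function name, the grid arguments, and the constant toy rate and efficiency are assumptions made for illustration:

```python
import numpy as np

def poisson_process_lnlike(w_obs, rate, completeness, grid_centers, cell_volume):
    """ln p({w_k} | theta) for an inhomogeneous Poisson process.

    rate(w)         : true rate density, evaluated on an array of w
    completeness(w) : detection efficiency Q_c, evaluated on an array of w
    grid_centers    : midpoints of a regular grid covering the search domain
    cell_volume     : volume of one grid cell
    """
    # Expected number of detections: integral of rate * Q_c (midpoint rule).
    expected = np.sum(rate(grid_centers) * completeness(grid_centers)) * cell_volume
    # Observable rate density evaluated at the detected objects.
    observable = rate(w_obs) * completeness(w_obs)
    return -expected + np.sum(np.log(observable))

# Toy 1-D example: constant rate 2 and constant efficiency 0.5 on [0, 3].
edges = np.linspace(0.0, 3.0, 301)
centers = 0.5 * (edges[1:] + edges[:-1])
lnlike = poisson_process_lnlike(
    np.array([0.4, 1.1, 2.0, 2.7]),
    rate=lambda w: np.full(np.shape(w), 2.0),
    completeness=lambda w: np.full(np.shape(w), 0.5),
    grid_centers=centers,
    cell_volume=edges[1] - edges[0],
)
print(lnlike)
```

A detection placed where the efficiency is exactly zero gives a log of zero, which correctly signals an impossible catalog under the model.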

Slide 35

Slide 35 text

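The censored histogram

The true rate density model:

$$
\Gamma_\theta(w) =
\begin{cases}
\exp(\theta_1) & w \in \Delta_1, \\
\exp(\theta_2) & w \in \Delta_2, \\
\quad\cdots \\
\exp(\theta_J) & w \in \Delta_J, \\
0 & \text{otherwise}
\end{cases}
$$

The Poisson log-likelihood:

$$L(\theta) = \ln p(\{w_k\} \mid \theta) = -\sum_{j=1}^{J} e^{\theta_j} \int_{\Delta_j} Q_c(w)\, \mathrm{d}w + \sum_{k=1}^{K} \sum_{j=1}^{J} \mathbf{1}[w_k \in \Delta_j]\, \left[\ln Q_c(w_k) + \theta_j\right]$$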

Slide 36

Slide 36 text

The censored histogram: the maximum likelihood result

$$\exp(\theta_j^{*}) = \frac{N_j}{\int_{\Delta_j} Q_c(w)\, \mathrm{d}w}$$
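This result says each bin height is the count in that bin divided by the detection efficiency integrated over the bin. A minimal one-dimensional sketch, assuming any normalization by the number of target stars is already folded into Q_c (the function name, `n_sub`, and the toy inputs are hypothetical):

```python
import numpy as np

def censored_histogram(w_obs, edges, completeness, n_sub=1000):
    """Maximum-likelihood bin heights: N_j / integral of Q_c over bin j."""
    counts, _ = np.histogram(w_obs, bins=edges)
    rates = np.empty(len(edges) - 1)
    for j in range(len(edges) - 1):
        lo, hi = edges[j], edges[j + 1]
        dw = (hi - lo) / n_sub
        mid = lo + dw * (np.arange(n_sub) + 0.5)      # midpoints inside bin j
        integral = np.sum(completeness(mid)) * dw     # midpoint-rule integral of Q_c
        rates[j] = counts[j] / integral               # undefined if Q_c vanishes over the bin
    return rates

# Toy example: constant efficiency 0.5, two unit-width bins, three detections.
rates = censored_histogram(
    np.array([0.1, 0.2, 1.5]),
    edges=np.array([0.0, 1.0, 2.0]),
    completeness=lambda w: np.full(np.shape(w), 0.5),
)
print(rates)
```

Dividing by the integrated efficiency, rather than weighting each planet by its own inverse efficiency, is what follows from maximizing the Poisson log-likelihood for a piecewise-constant rate.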

Slide 37

Slide 37 text

The censored histogram: the catalog is complete & independent; the rate density function is piecewise constant; the measurement uncertainties are negligible. * also ignoring false positives, etc.

Slide 38

Slide 38 text

3. Putting it all together

Slide 39

Slide 39 text

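The FML maths

$$p(\{x_k\} \mid \theta) = \int p(\{x_k\} \mid \{w_k\})\, p(\{w_k\} \mid \theta)\, \mathrm{d}\{w_k\}$$

$$\frac{p(\{x_k\} \mid \theta)}{p(\{x_k\} \mid \alpha)} \approx \exp\left(-\int \hat{\Gamma}_\theta(w)\, \mathrm{d}w\right) \prod_{k=1}^{K} \frac{1}{N_k} \sum_{n=1}^{N_k} \frac{\hat{\Gamma}_\theta(w_k^{(n)})}{p(w_k^{(n)} \mid \alpha)}$$

with posterior samples $w_k^{(n)} \sim p(w_k \mid x_k, \alpha)$.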
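This expression combines the two earlier pieces: the Poisson normalization term and a per-object average of observable rate over interim prior. A minimal sketch of its logarithm, with Γ the true rate density and Q_c the detection efficiency, again using a midpoint sum for the integral; every name and toy value below is a hypothetical stand-in:

```python
import numpy as np

def lnlike_noisy_censored(samples, rate, completeness, interim_prior,
                          grid_centers, cell_volume):
    """ln[p({x_k} | theta) / p({x_k} | alpha)] from per-object posterior samples.

    samples       : list of K arrays; samples[k] holds the draws w_k^(n)
    rate          : true rate density, callable on arrays
    completeness  : detection efficiency Q_c, callable on arrays
    interim_prior : density p(w | alpha) under which the samples were drawn
    """
    # Normalization: expected number of detections over the search domain.
    expected = np.sum(rate(grid_centers) * completeness(grid_centers)) * cell_volume
    total = -expected
    for w in samples:
        ratio = rate(w) * completeness(w) / interim_prior(w)
        total += np.log(np.mean(ratio))      # (1/N_k) sum over the samples of object k
    return total

# Toy 1-D example on [0, 3]: constant rate, efficiency, and interim prior.
edges = np.linspace(0.0, 3.0, 301)
centers = 0.5 * (edges[1:] + edges[:-1])
toy_samples = [np.array([0.4, 0.5, 0.6]), np.array([2.0, 2.1])]
value = lnlike_noisy_censored(
    toy_samples,
    rate=lambda w: np.full(np.shape(w), 2.0),
    completeness=lambda w: np.full(np.shape(w), 0.5),
    interim_prior=lambda w: np.full(np.shape(w), 0.5),
    grid_centers=centers,
    cell_volume=edges[1] - edges[0],
)
print(value)
```

For long sample chains or very peaked rates, averaging in log space (as in a log-sum-exp) would be the more robust choice; the plain mean is kept here for readability.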

Slide 40

Slide 40 text

The probabilistic, censored “histogram”: the catalog is complete & independent; the rate density function is anything; the measurement uncertainties are non-negligible & known. * also ignoring false positives, etc.

Slide 41

Slide 41 text

The probabilistic, censored “histogram”: the catalog is complete & independent; the rate density function is piecewise constant; the measurement uncertainties are non-negligible & known. * also ignoring false positives, etc.

Slide 42

Slide 42 text

4. The data

Slide 43

Slide 43 text

[Figure: survey completeness C (0 to 100%), color-coded as a function of orbital period (5 to 400 days) and planet size (0.5 to 20 Earth-radii). Petigura, Howard & Marcy (2013), Fig. 1.] “The survey completeness for small planets is a complicated function of P and RP.”

Slide 44

Slide 44 text

Only detected the one most detectable signal in each light curve Petigura, Howard & Marcy (2013)

Slide 45

Slide 45 text

[Figure: planet occurrence (color scale 0% to 8%) in cells of orbital period (6.25 to 400 days) and planet size (0.5 to 16 Earth-radii), with a value and uncertainty printed in each cell; caption text clipped. Petigura, Howard & Marcy (2013), Fig. 2.]

Slide 46

Slide 46 text

Petigura et al. assumed… the catalog is complete & independent; the rate density function is piecewise constant; the measurement uncertainties are negligible. * also ignoring false positives, etc.

Slide 47

Slide 47 text

[Figure: real data. Catalog in the ln P/day versus ln R/R⊕ plane, with marginal distributions p(ln P/day) and p(ln R/R⊕); secondary axes P [days] and R [R⊕].]

Slide 48

Slide 48 text

[Figure: real data. Rate as a function of ln P/day (logarithmic vertical axis; secondary axis P/day), shown for all radii and for the bins 0.5 ≤ R/R⊕ < 2, 2 ≤ R/R⊕ < 8, and 8 ≤ R/R⊕ < 32.]

Slide 49

Slide 49 text

[Figure: real data. Rate as a function of ln R/R⊕ (logarithmic vertical axis; secondary axis R/R⊕), shown for all periods and for the bins 6.25 ≤ P/day < 25, 25 ≤ P/day < 100, and 100 ≤ P/day < 400.]

Slide 50

Slide 50 text

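[Figure: real data. Rate as a function of R/R⊕ on a linear radius axis (0 to 16; logarithmic vertical axis), shown for all periods and for the bins 6.25 ≤ P/day < 25, 25 ≤ P/day < 100, and 100 ≤ P/day < 400.]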

Slide 51

Slide 51 text

why should I CARE?

Slide 52

Slide 52 text

[Figure: simulated catalog A. Catalog in the ln P/day versus ln R/R⊕ plane, with marginal distributions p(ln P/day) and p(ln R/R⊕); secondary axes P [days] and R [R⊕].]

Slide 53

Slide 53 text

[Figure: simulated catalog B. Catalog in the ln P/day versus ln R/R⊕ plane, with marginal distributions p(ln P/day) and p(ln R/R⊕); secondary axes P [days] and R [R⊕].]

Slide 54

Slide 54 text

The rate density of Earth analogs

Excerpt from the paper (clipped at the margins): “6. Extrapolation to Earth. … well as inferring the occurrence distribution of exoplanets, this dataset … constrain the rate density of Earth analogs. Explicitly, we constrain the … [de]nsity of exoplanets orbiting “Sun-like” stars, evaluated at the location …

$$\Gamma_\oplus = \Gamma(\ln P_\oplus, \ln R_\oplus) = \left.\frac{\mathrm{d}N}{\mathrm{d}\ln P\; \mathrm{d}\ln R}\right|_{R=R_\oplus,\; P=P_\oplus}$$

… is the rate density of exoplanets around a Sun-like star (expected … per star per natural logarithm of period per natural logarithm of radius … [p]eriod and radius of Earth. … Equation (23), we define “Earth analog” in terms of measurable quantit…”

Slide 55

Slide 55 text

Extrapolation: Petigura, Howard & Marcy (2013). Caption excerpt (clipped at the margins): “… [fr]action of stars having nearly Earth-size planets (1 − 2 R⊕) with any orbital period up to a maximum period, P, on the horiz… [nea]rly Earth size (1 − 2 R⊕) are included. This cumulative distribution reaches 20.2% at P = 50 d, meaning 20.4% of Sun-like stars …”

Slide 56

Slide 56 text

Extrapolation: Foreman-Mackey, Hogg & Morton (submitted). The rate function should be “smooth”: use a Gaussian Process.
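One way to encode “smooth” is a Gaussian Process prior over the log bin heights, so that neighbouring bins are correlated. A minimal sketch, assuming a squared-exponential kernel over bin centres with a constant mean; the kernel choice, the hyperparameter names `amp`, `scale`, and `mean`, and the jitter value are assumptions for illustration and are not taken from the slides:

```python
import numpy as np

def gp_log_prior(theta, centers, amp, scale, mean, jitter=1e-8):
    """ln N(theta | mean, K) with a squared-exponential kernel over bin centres.

    theta   : (J,) log bin heights
    centers : (J, D) bin centres, e.g. in (ln P, ln R)
    amp     : kernel variance
    scale   : kernel length scale (scalar, or one value per dimension)
    """
    diff = centers[:, None, :] - centers[None, :, :]
    d2 = np.sum(diff**2 / np.asarray(scale) ** 2, axis=-1)
    K = amp * np.exp(-0.5 * d2) + jitter * np.eye(len(theta))   # jitter for stability
    L = np.linalg.cholesky(K)                # K = L @ L.T
    r = theta - mean
    z = np.linalg.solve(L, r)                # whitened residual, so z @ z = r^T K^-1 r
    log_det_half = np.sum(np.log(np.diag(L)))   # equals 0.5 * ln det K
    return -0.5 * z @ z - log_det_half - 0.5 * len(theta) * np.log(2.0 * np.pi)

# Toy comparison on five bins in one dimension: flat versus jagged heights.
centers = np.arange(5.0)[:, None]
smooth = gp_log_prior(np.zeros(5), centers, amp=1.0, scale=2.0, mean=0.0)
jagged = gp_log_prior(np.array([1.0, -1.0, 1.0, -1.0, 1.0]), centers,
                      amp=1.0, scale=2.0, mean=0.0)
print(smooth, jagged)
```

Adding this log prior to the log-likelihood penalizes bin-to-bin jumps that the data do not require, which is what lets the rate be carried toward regions with few detections.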

Slide 57

Slide 57 text

[Figure: simulated catalog A. Catalog in the ln P/day versus ln R/R⊕ plane, with marginal distributions p(ln P/day) and p(ln R/R⊕); secondary axes P [days] and R [R⊕].]

Slide 58

Slide 58 text

[Figure: simulated catalog A. Posterior p(ln Γ⊕) versus ln Γ⊕; secondary axis Γ⊕ [%].]

Slide 59

Slide 59 text

[Figure: simulated catalog B. Catalog in the ln P/day versus ln R/R⊕ plane, with marginal distributions p(ln P/day) and p(ln R/R⊕); secondary axes P [days] and R [R⊕].]

Slide 60

Slide 60 text

[Figure: simulated catalog B. Posterior p(ln Γ⊕) versus ln Γ⊕; secondary axis Γ⊕ [%].]

Slide 61

Slide 61 text

[Figure: real data. Catalog in the ln P/day versus ln R/R⊕ plane, with marginal distributions p(ln P/day) and p(ln R/R⊕); secondary axes P [days] and R [R⊕].]

Slide 62

Slide 62 text

[Figure: real data. Posterior p(ln Γ⊕) versus ln Γ⊕, with secondary axis Γ⊕ [%], comparing Foreman-Mackey, Hogg & Morton with Petigura, Howard & Marcy.]

Slide 63

Slide 63 text

[Figure: real data. Posterior p(ln Γ⊕) versus ln Γ⊕, with secondary axis Γ⊕ [%], comparing Foreman-Mackey, Hogg & Morton with Petigura, Howard & Marcy.]

Excerpt from the paper (clipped at the margins): “… on the rate density of Earth analogs (as defined … our result has large fractional uncertainty … This is shown in Figure 9 where we compare the … for Γ⊕ to the published value and uncertainty. … [densit]y of Earth analogs is

$$\Gamma_\oplus = 0.017^{+0.018}_{-0.009}\ \mathrm{nat}^{-2}$$

… [in]dicates that this quantity is a rate density, per natur[al] … [logarith]mic radius. Converted to these units, Petigura et a[l.] … the same quantity (indicated as the vertical lines in … [w]hat Petigura’s extrapolation model predicts but, for … [in]ferred rate density over their choice of “Earth-like” …”

Slide 64

Slide 64 text

5. Conclusions & Summary

Slide 65

Slide 65 text

the catalog is complete & independent; the rate density function is anything; the measurement uncertainties are non-negligible & known. * also ignoring false positives, etc.