How to find a transiting exoplanet

A colloquium about noise.

Dan Foreman-Mackey

May 16, 2017

Transcript

  1. Dan Foreman-Mackey, Sagan Fellow / University of Washington; @exoplaneteer / dfm.io / github.com/dfm

    How to find a transiting exoplanet: data-driven discovery in the astronomical time domain
  2. Dan Foreman-Mackey, Sagan Fellow / University of Washington; @exoplaneteer / dfm.io / github.com/dfm

    Noise models and some more noise models
  3. Let me introduce myself…

  4. I build tools. And when I say "tools" I actually mean "software"…
  5. None
  6. None
  7. Exoplanets

  8. How We Find Exoplanets

  9. Confirmed planets by detection method: transit 2712; radial velocity 692; direct imaging 52; microlensing 40; timing 25. Data Source: The Open Exoplanet Catalogue
  10. [Figure: confirmed exoplanets versus year, 2000 to 2015, by method: transit, RV, microlensing, direct imaging, timing.] Data Source: The Open Exoplanet Catalogue
  11. Kepler Credit: NASA

  12. [Figure: planet radius [R⊕], 1 to 10, versus orbital period [days], 1 to 1000.] Data Source: The NASA Exoplanet Archive; Kepler DR25; 5/13/2017
  13. So what?

  14. The Population of Exoplanets

  15. The population of exoplanets: (1) occurrence rates; (2) physics

  16. Burke et al. (2015), The Astrophysical Journal, 809:8 (19pp), 2015 August 10 [excerpts from the paper, partly cropped on the slide]

    […] the data, rises toward small planets with $\alpha_2 = -1.8$ and has a break near the edge of the parameter space. Given the low numbers of observed planet candidates in the smallest planet bins, the full posterior allowed behavior (1σ orange region; 3σ […] Figure 6) […] the occurrence rates in the smallest $R_p$ bins. (b) The more complicated model ensures the ability to adapt to variations in the PLDF in the sensitivity analysis of Section 6.2. (c) Previous work on Kepler planet occurrence rates indicated a break in the planet population for $2.0 \lesssim R_p \lesssim 2.8\,R_\oplus$ (Fressin et al. 2013; Petigura et al. 2013a, 2013b; Silburt et al. 2015). (d) Finally, extending this work to a larger parameter space and for alternative target selection samples, such as the Kepler M dwarf sample where a sharp break at $R_p \sim 2.5\,R_\oplus$ is observed (Dressing & Charbonneau 2013; Burke et al. 2015), the double power law in $R_p$ is strongly (BIC > 10) warranted.

    Symptomatic of the weak evidence for a broken power law model over the $0.75 \leq R_p \leq 2.5\,R_\oplus$ range, $R_{\rm brk}$ is not constrained within the prior $R_p$ limits of the parameter space. When $R_{\rm brk}$ is near the lower and upper $R_p$ limits, $\alpha_1$ and $\alpha_2$ also become poorly constrained, respectively. To provide a more meaningful constraint on the average power law behavior for $R_p$ in the double power law PLDF model, we introduce $\alpha_{\rm avg}$, which we set to $\alpha_{\rm avg} = \alpha_1$ if $R_{\rm brk} \geq R_{\rm mid}$ and $\alpha_{\rm avg} = \alpha_2$ otherwise, where $R_{\rm mid}$ is the midpoint between the upper and lower limits of $R_p$. We find $\alpha_{\rm avg} = -1.54 \pm 0.5$ and $\beta = -0.68 \pm 0.17$ for our baseline result. We use $\alpha_{\rm avg}$ as a summary statistic for the model parameters only to enable a simpler comparison of our results to independent analyses of planet occurrence rates and to approximate the behavior for the power law $R_p$ dependence if we had used the simpler single power law model. The results for a single power law model in both $R_p$ and $P_{\rm orb}$ are equivalent to the results for the double […]

    Figure 7. Same as Figure 6, but marginalized over $0.75 < R_p < 2.5\,R_\oplus$ and bins of $dP_{\rm orb} = 31.25$ days.

    Figure 8. Shows the underlying planet occurrence rate model. Marginalized over $50 < P_{\rm orb} < 300$ days and bins of $dR_p = 0.25\,R_\oplus$ planet occurrence rates for the model parameters that maximize the likelihood (white dash line). Posterior distribution for the underlying planet occurrence rate for the median (blue solid line), 1σ region (orange region), and 3σ region (blue region). An approximate PLDF based upon results from Petigura et al. (2013a) for comparison (dash dot line).

    Figure 9. Same as Figure 8, but marginalized over $0.75 < R_p < 2.5\,R_\oplus$ and bins of $dP_{\rm orb} = 31.25$ days.
  17. Kepler and the Transit Method (the spacecraft) 

  18. Credit: NASA/European Space Agency

  19. Jupiter Credit: NASA/European Space Agency

  20. Jupiter Earth Credit: NASA/European Space Agency

  21. None
  22. [Figure: relative brightness [ppm], 0 to 100, versus time since transit [days], -1.0 to 1.0.]
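
The parts-per-million scale on this plot follows from simple geometry: to first order, a transit dims the star by the fraction of the stellar disk that the planet covers, (Rp / R*)^2. The short sketch below works that out for Jupiter and Earth crossing a Sun-like star. It is a back-of-the-envelope illustration, not something shown in the talk: it ignores limb darkening, and the function name and the nominal radii are assumptions introduced here.

```python
# Nominal radii in kilometres (assumed values, used only for this illustration).
R_SUN_KM = 695_700.0
R_JUPITER_KM = 71_492.0
R_EARTH_KM = 6_378.1


def transit_depth_ppm(planet_radius_km, star_radius_km=R_SUN_KM):
    """Depth (Rp / R*)^2 in parts per million, for a uniformly bright stellar disk."""
    return 1e6 * (planet_radius_km / star_radius_km) ** 2


jupiter_ppm = transit_depth_ppm(R_JUPITER_KM)
earth_ppm = transit_depth_ppm(R_EARTH_KM)
print(f"Jupiter: {jupiter_ppm:.0f} ppm, Earth: {earth_ppm:.0f} ppm")
```

Under these assumptions a Jupiter-size planet blocks about one percent of the light, while an Earth-size planet blocks less than 100 ppm, which is why the noise model matters so much for small planets.
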
  23. …but this is the real world. A few problems: (1) Timing; (2) Geometry; (3) Spacecraft motion; (4) Intrinsic brightness variation
  24. …but this is the real world. A few problems: (1) Timing; (2) Geometry; (3) Spacecraft motion; (4) Intrinsic brightness variation. Annotation: transit probability
  25. …but this is the real world. A few problems: (1) Timing; (2) Geometry; (3) Spacecraft motion; (4) Intrinsic brightness variation. Annotations: transit probability; noise!
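
To put a number on the "transit probability" annotation: for a circular orbit and a planet much smaller than its star, the chance that a randomly oriented orbit crosses the stellar disk is roughly R* / a, where a is the orbital distance. The sketch below combines that with Kepler's third law. The function names and constants are assumptions made for this illustration, and the formula is the standard approximation rather than anything specific to the talk.

```python
# Solar radius expressed in astronomical units (nominal values, assumed here).
R_SUN_AU = 695_700.0 / 149_597_870.7


def semi_major_axis_au(period_days, star_mass_msun=1.0):
    """Kepler's third law in solar units: a^3 = M * P^2 (a in AU, P in years)."""
    period_years = period_days / 365.25
    return (star_mass_msun * period_years ** 2) ** (1.0 / 3.0)


def transit_probability(period_days, star_radius_rsun=1.0, star_mass_msun=1.0):
    """Approximate geometric transit probability R* / a for a circular orbit."""
    a_au = semi_major_axis_au(period_days, star_mass_msun)
    return star_radius_rsun * R_SUN_AU / a_au


p_one_year = transit_probability(365.25)   # Earth-like orbit around a Sun-like star
p_three_days = transit_probability(3.0)    # short-period orbit
print(f"1 year: {100 * p_one_year:.2f}%, 3 days: {100 * p_three_days:.1f}%")
```

With these inputs a one-year orbit transits for fewer than one in two hundred random orientations, while a three-day orbit transits roughly one time in ten, which is one reason transit surveys favor short periods.
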
  26. Credit: NASA

  27. 190,000 stars for 4 years at 30-minute cadence with 10^-3 pixel pointing precision. Credit: NASA
  28. [Figure: planet radius [R⊕], 1 to 10, versus orbital period [days], 1 to 1000.] Data Source: The NASA Exoplanet Archive; Kepler DR25; 5/13/2017
  29. (1) Kepler (2009); (2) K2 (2014); (3) TESS (2018); (4) PLATO (2025)
  30. Population Inference

  31. Ingredients: (1) Systematic target selection & catalog of stellar properties; (2) Systematic catalog of planets; (3) Quantified completeness & reliability; (4) False positive rates & other effects (e.g. multiplicity)
  32. [Figure: planet radius [R⊕], 1 to 10, versus orbital period [days], 1 to 1000.] Data Source: The NASA Exoplanet Archive; Kepler DR25; 5/13/2017
  33. Burke, et al. (2015). Figure 1. Fractional completeness model for the host to Kepler-22b (KIC: 10593626) in the Q1-Q16 pipeline run using the analytic model described in Section 2.
  34. We need: (1) Fully automated methods for planet discovery; (2) Rigorous methods for population inference
  35. How to Find a Transiting Exoplanet

  36. Science. physics data

  37. Science. physics data a model

  38. None
  39. None
  40. None
  41. None
  42. star

  43. spacecraft star

  44. detector spacecraft star

  45. detector spacecraft star planet?

  46. planet + star + spacecraft + detector = observation
  47. planet + star + spacecraft + detector = observation. Label: PHYSICS
  48. planet + star + spacecraft + detector = observation. Labels: PHYSICS; ????
  49. The way we draw transits…

  50. …and the way we should draw transits interesting boring boring

  51. planet + star + spacecraft + detector = observation. Labels: PHYSICS; DATA-DRIVEN MODELS
  52. planet + star + spacecraft + detector = observation. Labels: PHYSICS; DATA-DRIVEN MODELS (Gaussian Process)
  53. How to find a transiting exoplanet: (1) Fit & remove data-driven noise model; (2) Matched filter grid search for candidate signals; (3) Vet candidates to remove false alarms
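
The three steps can be sketched end to end on simulated data. This is a toy version under loudly stated assumptions, not the pipeline from the talk: a running median stands in for the data-driven noise model, the matched filter uses a box-shaped template of fixed duration, and "vetting" is reduced to a signal-to-noise threshold. All names, the injected signal, and every numerical value are hypothetical.

```python
import math
import random
import statistics

rng = random.Random(42)

# Simulated light curve; every number here is an illustrative assumption.
CADENCE = 0.02                    # days between samples (about 30 minutes)
N_POINTS = 1500                   # about 30 days of data
TRUE_PERIOD, TRUE_T0 = 3.7, 1.3   # days
DURATION = 0.12                   # days
DEPTH, SIGMA = 1000.0, 200.0      # ppm


def in_transit(t, period, t0, duration):
    """True when time t falls inside a box-shaped transit window."""
    phase = (t - t0 + 0.5 * period) % period - 0.5 * period
    return abs(phase) < 0.5 * duration


times = [i * CADENCE for i in range(N_POINTS)]
flux = []
for t in times:
    trend = 1500.0 * math.sin(2.0 * math.pi * t / 20.0)   # slow "stellar" signal
    dip = -DEPTH if in_transit(t, TRUE_PERIOD, TRUE_T0, DURATION) else 0.0
    flux.append(trend + dip + rng.gauss(0.0, SIGMA))

# Step 1: fit and remove a noise model (running median as a simple stand-in).
HALF_WINDOW = 37   # about 0.75 days per side, much longer than one transit
resid = []
for i in range(N_POINTS):
    window = flux[max(0, i - HALF_WINDOW): i + HALF_WINDOW + 1]
    resid.append(flux[i] - statistics.median(window))
noise = statistics.pstdev(resid)


# Step 2: box matched filter over a grid of trial periods and phases.
def box_search(times, resid, noise, periods, duration):
    """Return (snr, period, phase) of the strongest box-shaped dip."""
    bin_width = 0.5 * duration
    best = (-math.inf, None, None)
    for period in periods:
        n_bins = int(period / bin_width) + 1
        sums = [0.0] * n_bins
        counts = [0] * n_bins
        for t, r in zip(times, resid):
            k = int((t % period) / bin_width)
            sums[k] += r
            counts[k] += 1
        for k in range(n_bins):
            j = (k + 1) % n_bins              # two adjacent bins form one box
            n_in = counts[k] + counts[j]
            if n_in == 0:
                continue
            snr = -(sums[k] + sums[j]) / (noise * math.sqrt(n_in))
            if snr > best[0]:
                best = (snr, period, k * bin_width)
    return best


periods = [2.0 + 0.02 * i for i in range(201)]
best_snr, best_period, best_phase = box_search(times, resid, noise, periods, DURATION)

# Step 3: vetting, reduced here to a bare threshold on the detection statistic.
is_candidate = best_snr > 7.0
print(f"period = {best_period:.2f} d, S/N = {best_snr:.1f}, candidate = {is_candidate}")
```

Folding the residuals into phase bins keeps the cost per trial period linear in the number of data points, which is what makes a dense period grid affordable even in this toy setting.
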
  54. Scalable Methods An aside...

  55. Medium data; big questions… (approximately…)

    1. Kepler: 190,000 stars; 60,000 obs. per star
    2. K2: 250,000 stars; 4,000 obs. per star
    3. TESS: 500,000 stars; 20,000 obs. per star
  56. Scaling of Gaussian Processes. O(N^3): Cholesky factorization

  57. Scaling of Gaussian Processes. O(N^3): Cholesky factorization. O(N log^2 N): Approximate methods; Ambikasaran, DFM, et al. (2016); arXiv:1403.6015
  58. Scaling of Gaussian Processes. O(N^3): Cholesky factorization. O(N log^2 N): Approximate methods; Ambikasaran, DFM, et al. (2016); arXiv:1403.6015. O(N): Exploiting structure of specific 1D kernels; DFM, et al. (submitted); arXiv:1703.09710
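
To make the O(N^3) entry concrete, the sketch below evaluates a Gaussian Process log-likelihood the direct way, with a hand-written Cholesky factorization that counts its innermost multiply-subtract steps. It is a pure-Python illustration under assumed choices (an exponential kernel, evenly spaced times, the function names); it is not celerite and not the approximate solver cited on the slide.

```python
import math


def kernel(t1, t2, amp=1.0, scale=1.0):
    """Exponential kernel amp * exp(-|t1 - t2| / scale); an assumed example kernel."""
    return amp * math.exp(-abs(t1 - t2) / scale)


def cholesky(matrix):
    """Return lower-triangular L with L L^T = matrix, plus the inner-loop step count."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    ops = 0
    for i in range(n):
        for j in range(i + 1):
            s = matrix[i][j]
            for k in range(j):          # third nested loop: the source of O(N^3)
                s -= lower[i][k] * lower[j][k]
                ops += 1
            lower[i][j] = math.sqrt(s) if i == j else s / lower[j][j]
    return lower, ops


def gp_log_likelihood(times, y, yerr):
    """Zero-mean GP log-likelihood via direct Cholesky; returns (value, step count)."""
    n = len(times)
    cov = [[kernel(ti, tj) + (yerr ** 2 if i == j else 0.0)
            for j, tj in enumerate(times)] for i, ti in enumerate(times)]
    lower, ops = cholesky(cov)
    z = []                               # forward substitution: solve L z = y
    for i in range(n):
        s = y[i] - sum(lower[i][k] * z[k] for k in range(i))
        z.append(s / lower[i][i])
    log_det = 2.0 * sum(math.log(lower[i][i]) for i in range(n))
    value = -0.5 * sum(v * v for v in z) - 0.5 * log_det - 0.5 * n * math.log(2.0 * math.pi)
    return value, ops


def demo(n):
    times = [0.1 * i for i in range(n)]
    y = [math.sin(t) for t in times]
    return gp_log_likelihood(times, y, yerr=0.1)


ll_small, ops_small = demo(20)
ll_large, ops_large = demo(40)
print(f"step count grows by a factor of {ops_large / ops_small:.2f} when N doubles")
```

Doubling N multiplies the step count by about eight, so at tens of thousands of observations per star the direct approach becomes impractical, which motivates the faster methods listed on the slide.
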
  59. DFM, et al. (submitted); arXiv:1703.09710. [Figure: computational cost [seconds] versus number of data points [N], comparing direct and O(N) methods; legend values 1, 2, 4, 8, 16, 32, 64, 128, 256.] github.com/dfm/celerite
  60. Pause… time for some examples!

  61. The Frequency of Jupiter Analogs 1

  62. In collaboration with… Tim Morton (Princeton), David Hogg (NYU), Eric Agol (UW), Bernhard Schölkopf (MPIS). DFM, et al. (2016); arXiv:1607.08237
  63. [Figure: planet radius [R⊕], 1 to 10, versus orbital period [days], 1 to 100.] Data Source: The NASA Exoplanet Archive
  64. [Figure: planet radius [R⊕], 1 to 10, versus orbital period [days], 1 to 100.] Data Source: The NASA Exoplanet Archive
  65. [Figure: planet radius [R⊕], 1 to 10, versus orbital period [days], 1 to 10000.] Data Source: The NASA Exoplanet Archive
  66. [Figure: planet radius [R⊕], 1 to 10, versus orbital period [days], 1 to 10000.] Data Source: The NASA Exoplanet Archive
  67. Why Kepler? Radial velocity, microlensing, etc. better suited…

  68. (1) Systematic target selection & catalog of stellar properties; (2) Systematic catalog of planets; (3) Quantified completeness & reliability; (4) False positive rates & other effects (e.g. multiplicity)
  69. [Figure: planet radius [R⊕], 1 to 10, versus orbital period [days], 1 to 10000.] Data Source: The NASA Exoplanet Archive
  70. DFM et al. (2016); arXiv:1607.08237. [Figure: planet radius [R⊕], 1 to 10, versus orbital period [days], 1 to 10000.] Data Source: The NASA Exoplanet Archive
  71. How to find a transiting exoplanet: (1) Fit & remove data-driven noise model; (2) Matched filter grid search for candidate signals; (3) Vet candidates to remove false alarms
  72. planet + star + spacecraft + detector = observation. Labels: PHYSICS; GAUSSIAN PROCESS; CAUSAL MODEL (PCA); PHOTON NOISE
  73. How to find a transiting exoplanet: (1) Fit & remove data-driven noise model; (2) Matched filter grid search for candidate signals; (3) Vet candidates to remove false alarms
  74. DFM, et al. (2016). [Figure: four example light-curve segments, each plotted against hours since event, -40 to 40: (a) variability, KIC 7220674; (b) step, KIC 8631697; (c) box, KIC 5521451; (d) transit, KIC 8505215.]
  75. Foreman-Mackey, Hogg, Morton, et al. Figure 3. Sections of PDC light curve centered on each candidate (black) with the posterior-median transit model over-plotted (orange). Candidates with two transits are folded on the posterior-median […] Panel labels: 10321319, 10287723, 8505215, 6551440, 8738735, 8800954, 10187159, 3218908, 4754460, 8410697, 10842718, 11709124, 3239945, 8426957, 9306307, 10602068. DFM, et al. (2016)
  76. (1) Systematic target selection & catalog of stellar properties; (2) Systematic catalog of planets; (3) Quantified completeness & reliability; (4) False positive rates & other effects (e.g. multiplicity)
  77. nuisance boring boring

  78. DFM, et al. (2016). [Figure: grid over period [years] (ticks 3, 5, 10, 20) and R_P/R_J (ticks 0.2, 0.5, 1.0, 2.0). Cell values as extracted, in rows of seven:]

    0.048 0.211 0.499 0.669 0.727 0.710 0.635
    0.046 0.194 0.468 0.616 0.657 0.630 0.569
    0.043 0.193 0.460 0.605 0.623 0.591 0.520
    0.038 0.174 0.433 0.529 0.529 0.492 0.427
  79. DFM, et al. (2016). Occurrence rate: 2.00 ± 0.72 planets per G/K-dwarf in the range 2 – 25 years, 0.1 – 1 R_J
  80. EVEREST: A Noise Model for K2 2

  81. Credit: NASA R.I.P. Kepler

  82. K2. Image credit (CC BY-NC-SA): Flickr user Aamir Choudhry

  83. https://keplerscience.arc.nasa.gov/k2-fields.html

  84. [Figure: number of targets versus baseline for TESS, K2, and Kepler.] Adapted from a similar figure by Ian Crossfield
  85. [Figure: log10 g versus log10 Teff, 3.4 to 4.0, in two panels: Kepler and K2.] Data Source: The NASA Exoplanet Archive; 5/13/2017
  86. None
  87. None
  88. [Figure: EPIC 201374602; Kp = 11.5 mag; relative brightness [ppm] versus time [BJD - 2456808]. Raw: 301 ppm; residuals: 35 ppm.]
  89. Luger, et al. (2016, 2017); led by… Rodrigo Luger & Ethan Kruse. Image credit (CC BY-NC-SA): Flickr user Aamir Choudhry
  90. planet + star + spacecraft + detector = observation. Labels: PHYSICS; GAUSSIAN PROCESS; CAUSAL MODEL + PIXEL-LEVEL DECORRELATION; PHOTON NOISE. Inspired by: Vanderburg & Johnson (2014); Crossfield, et al. (2015); Aigrain, et al. (2015); DFM, et al. (2015); Deming, et al. (2015); + more
  91. Figure credit: Rodrigo Luger Ideal Observed

  92. Pixel-level decorrelation (PLD): if background is correctly subtracted, and astrophysical signal is multiplicative, then the fractional astrophysical contribution is equal in all pixels. Deming, et al. (2015); Luger, et al. (2016, 2017)

    $$\hat{p}_n(t) = \frac{p_n(t)}{\sum_{k=1}^{N} p_k(t)}$$

    where $\hat{p}_n(t)$ is the estimator for the instrumental signal, $\sum_{k=1}^{N} p_k(t)$ is the estimator for the astrophysical signal, and $p_n(t)$ is the pixel time series.
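
The normalization on this slide can be checked on toy data: if the astrophysical signal multiplies every pixel equally, dividing each pixel by the summed flux cancels it, leaving only the instrumental part. The sketch below shows that step alone, under assumed inputs (three pixels, a hypothetical pointing drift, an injected one percent dip). It does not include the regression that PLD then performs with these normalized series, and none of the names come from the EVEREST code.

```python
import math


def normalized_pixels(pixels):
    """pixels[t][n] is the flux in pixel n at time t; returns p_n(t) / sum_k p_k(t)."""
    normalized = []
    for row in pixels:
        total = sum(row)
        normalized.append([p / total for p in row])
    return normalized


def simulate(n_times, with_transit):
    """Toy data: three pixels whose shares of the starlight shift with pointing drift."""
    pixels = []
    for t in range(n_times):
        drift = 0.1 * math.sin(0.3 * t)                        # hypothetical spacecraft motion
        shares = [0.2 - 0.5 * drift, 0.6, 0.2 + 0.5 * drift]   # shares always sum to 1
        star = 1000.0
        if with_transit and 20 <= t < 25:
            star *= 0.99                                       # multiplicative 1% dip
        pixels.append([star * share for share in shares])
    return pixels


with_planet = normalized_pixels(simulate(50, with_transit=True))
without_planet = normalized_pixels(simulate(50, with_transit=False))

# The normalized pixel series are the same with or without the transit.
max_diff = max(abs(a - b)
               for row_a, row_b in zip(with_planet, without_planet)
               for a, b in zip(row_a, row_b))
print(f"largest difference between normalized pixel series: {max_diff:.1e}")
```

Because the normalized series carry the pointing drift but not the transit, they can serve as regressors for the instrumental signal without absorbing the astrophysical one, at least in this idealized setting.
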
  93. Pixel-level decorrelation (PLD). [Figure: pixel time series ÷ summed flux = normalized pixel time series.] Figure credit: Rodrigo Luger; Deming, et al. (2015); Luger, et al. (2016, 2017)
  94. planet + star + spacecraft + detector = observation. Labels: PHYSICS; GAUSSIAN PROCESS; CAUSAL MODEL + PIXEL-LEVEL DECORRELATION; PHOTON NOISE
  95. Luger, et al. (2016); see also Aigrain, et al. (2015)

  96. EVEREST: planet + star + spacecraft + detector = observation. Labels: PHYSICS; GAUSSIAN PROCESS; CAUSAL MODEL + PIXEL-LEVEL DECORRELATION; PHOTON NOISE
  97. EVEREST: planet + star + spacecraft + detector = observation. Labels: PHYSICS; GAUSSIAN PROCESS; CAUSAL MODEL + PIXEL-LEVEL DECORRELATION; PHOTON NOISE
  98. Luger, Kruse, DFM, et al. (2017). Fig. 3: Cross-validation procedure for first order PLD o[…]03150 (WASP-47 e), a campaign 3 planet host. Show[…] in the validation set (red) and the scatter in the […] (blue) as a function of […], the prior amplitude for […]
  99. Luger, Kruse, DFM, et al. (2017) [excerpt, heavily cropped on the slide]: […] Kp = 15; for campaigns 3, 4, and 8, EVEREST recovers the Kepler precision dow[…] of (variable) giant stars, leading to a higher average CDPP, while campaign 7 […] change in the orientation of the spacecraft and excess jitter. Fig. 20: The same as Figure 19, but comparing the CDPP of all K2 stars to that of Kepler. EVEREST 2.0 recovers the original Kepler photometric precision down to at least Kp = 14, and past […]
  100. Luger, Kruse, DFM, et al. (2017), EVEREST 2.0 [cropped excerpt from the paper; the text is not recoverable from the slide]
  101. [Figure: planet radius [R⊕], 1 to 10, versus orbital period [days], 1 to 1000.] Data Source: The NASA Exoplanet Archive; Kepler DR25; 5/13/2017
  102. [Figure: planet radius [R⊕], 1 to 10, versus orbital period [days], 1 to 1000.] Data Source: The NASA Exoplanet Archive; Kepler DR25; 5/13/2017. Kruse, et al. (in prep)
  103. [Figure: planet radius [R⊕], 1 to 10, versus orbital period [days], 1 to 1000.] Data Source: The NASA Exoplanet Archive; Kepler DR25; 5/13/2017. Kruse, et al. (in prep): 800 candidates, 500 new
  104. Population inference? as a function of host star properties?

  105. TRAPPIST-1h: Luger, Sestovic, Kruse, et al. (2017); arXiv:1703.04166 (embargoed)

    Figure 1: a, b: Long cadence K2 light curve detrended with EVEREST and with stellar variability removed. Data points are in black, and our highest likelihood transit model for all seven planets […]

    [Figure panels a to d. Axes: relative brightness versus Barycentric Julian Date − 2,457,700 [day] (K2 long cadence data); relative brightness versus time from mid-transit [day]; orbital separation [AU]. Labels: planets 1b to 1h; transit 1, transit 2, transit 3, transit 4, folded lightcurve.]
  106. These Noise Models are Models of Stars 3

  107. nuisance! interesting interesting

  108. Led by… Ruth Angus (Columbia), in collaboration with… Suzanne Aigrain (Oxford), Vinesh Rajpaul (Oxford), Eric Agol (UW), Sivaram Ambikasaran (Indian Inst. of Sci.). Angus, et al. (submitted); DFM, et al. (submitted)
  109. [Figure: rotation period (days) versus age (Gyr), log axes. Points: Coma Berenices, Praesepe, Hyades, NGC 6811, NGC 6819, The Sun, Asteroseismic targets, M67 (Esselstein, in prep).] Figure credit: Ruth Angus
  110. Angus, et al. (submitted); github.com/RuthAngus/GProtation

  111. Angus, et al. (submitted); github.com/RuthAngus/GProtation [excerpt from the paper; the cropped left-hand column is omitted]

    […] summit of the Mauna Loa volcano in Hawaii (data from Keeling and Whorf 2004) using a kernel which is the product of a periodic and a SE kernel: the QP kernel. This kernel is defined as

    $$k_{i,j} = A \exp\left[ -\frac{(x_i - x_j)^2}{2 l^2} - \Gamma^2 \sin^2\left( \frac{\pi (x_i - x_j)}{P} \right) \right] + \sigma^2 \delta_{ij}. \qquad (2)$$

    It is the product of the SE kernel function, which describes the overall covariance decay, and an exponentiated, squared, sinusoidal kernel function that describes the periodic covariance structure. $P$ can be interpreted as the rotation period of the star, and $\Gamma$ controls the amplitude of the $\sin^2$ term. If $\Gamma$ is very large, only points almost exactly one period away are tightly correlated and points that are slightly more or less than one period away are very loosely correlated. If $\Gamma$ is small, points separated by one period are tightly correlated, and points separated by slightly more or less are still highly correlated, although less so. In other words, large values of $\Gamma$ lead to periodic variations with increasingly complex harmonic content. This kernel function allows two data points that are separated in time by one rotation period to be tightly correlated, while also allowing points separated by half a period to be weakly correlated. The additional parameter $\sigma$ captures white noise by adding a term to the diagonal of the covariance matrix. This can be interpreted to represent underestimation of observational uncertainties (if the uncertainties reported on the data are too small, it will be non-zero) or it can capture any remaining "jitter," or residuals not captured by the effective GP model. We use this QP kernel function (Equation 2) to produce the GP model that fits the Kepler light curve […]

    [Figure panels: Kepler light curve (relative flux [ppt] versus time [days]); power spectrum (S(ω) versus ω [days^-1]); k(τ); rotation period [days], 3.50 to 4.25.]
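
The quasi-periodic kernel described in this excerpt is straightforward to evaluate directly. The sketch below builds the covariance matrix for a handful of times and checks the behavior the text describes: points one period apart stay strongly correlated, while points half a period apart are only weakly correlated when Γ is large. The function name, argument names, and parameter values are assumptions chosen for this illustration and are not taken from the GProtation code.

```python
import math


def qp_kernel_matrix(x, amp, length, gamma, period, sigma):
    """Quasi-periodic covariance matrix with white noise added on the diagonal.

    k_ij = amp * exp(-(x_i - x_j)^2 / (2 length^2)
                     - gamma^2 * sin^2(pi (x_i - x_j) / period)) + sigma^2 * delta_ij
    """
    n = len(x)
    cov = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dx = x[i] - x[j]
            decay = dx * dx / (2.0 * length ** 2)
            periodic = gamma ** 2 * math.sin(math.pi * dx / period) ** 2
            cov[i][j] = amp * math.exp(-decay - periodic)
            if i == j:
                cov[i][j] += sigma ** 2      # white-noise ("jitter") term
    return cov


# Hypothetical parameter values: a 4-day period and a slow 20-day decay.
x = [0.0, 2.0, 4.0]                          # zero lag, half a period, one period
cov = qp_kernel_matrix(x, amp=1.0, length=20.0, gamma=2.0, period=4.0, sigma=0.1)

half_period_cov = cov[0][1]
one_period_cov = cov[0][2]
print(f"half a period apart: {half_period_cov:.3f}, one period apart: {one_period_cov:.3f}")
```

Lowering `gamma` in this sketch raises the half-period covariance toward the one-period value, which matches the excerpt's description of smoother, less harmonically complex variability.
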
  112. [Figure: ln(Recovered Period) versus ln(Injected Period), with a scale for ln(Amplitude).] Angus, et al. (submitted); github.com/RuthAngus/GProtation
  113. Open science, on github.com/

    dfm/peerless: Jupiter analogs
    rodluger/everest: K2 de-trending
    RuthAngus/GProtation: GP models of rotation
    dfm/celerite: fast 1D GPs
  114. Summary: Build data-driven noise models and… (1) Find exoplanets; (2) Learn about stars

    Dan Foreman-Mackey, Sagan Fellow / University of Washington; @exoplaneteer / dfm.io / github.com/dfm