
Nathan Hara

(Université de Genève)

https://s3-seminar.github.io/seminars/nathan-hara/

Title — An optimal exoplanet detection criterion

Abstract — Over 4000 exoplanets have been detected so far. They have deeply transformed our understanding of planetary system formation and expanded our possibilities to search for life outside of Earth. The smaller exoplanets are, and the farther they orbit from their host stars, the harder they are to detect. Earth « twins » orbiting solar-type stars are still out of reach because of very complex astrophysical and instrumental noise. To overcome this difficulty, we need new methods to analyse unevenly sampled, multivariate time series: better models, computational methods and decision rules to claim detections. In this talk I will mostly focus on the last of these. Exoplanet detections are claimed based on the value of a statistical significance metric: if it exceeds a certain threshold, a detection is claimed. I will address the question of the optimal significance metric in the general setting of detection of parametric signals, and advocate for a Bayesian hypothesis testing framework where hypotheses are indexed by continuous variables.

Biography — Nathan Hara has been a research fellow at the University of Geneva since 2017, which he joined after a PhD with Jacques Laskar and Gwenaël Boué at Paris Observatory. He works on statistical techniques to detect exoplanets, in particular Earth twins, which are prime candidates for the detection of life outside of Earth, and on observational programs to unveil multi-planetary systems.

S³ Seminar

April 22, 2022

Transcript

  1. An optimal detection criterion for parametric signals Nathan Hara Université

    de Genève GPR V Oxford 29 March 2022 With Thibault de Poyferré, Jean-Baptiste Delisle, Marc Hoffmann, Nicolas Unger, Rodrigo Díaz, Damien Ségransan
  2. Radial velocity [Figure labels: star, spectrograph, observer.]

  3. Detecting exoplanets in RV data SCMA VII

    Radial velocities. [Figure, two examples: motion of the star in the observational reference frame (x, y, z in AU, direction to the observer marked) and the resulting radial velocity, i.e. velocity along the z axis (m/s), as a function of time (days).] Decomposition of the signal in periodic components. The amplitude of a periodic component is proportional to the planet's projected mass.
  4. Signal shape

    The signal shape depends on the orbital eccentricity. [Figure: orbit (star, planet) and corresponding radial velocity signal for nearly circular, eccentric and very eccentric orbits. Credit: Perryman 2011.]
  5. Data model

    Signal: radial velocity time series (figure credit: Perryman 2011).
    $y(t) = \sum_{i=1}^{n_p} K_i \left(\cos(v_i(t) + \omega_i) + e_i \cos\omega_i\right) + \text{other deterministic terms} + \text{noise}$
    The sum runs over the Keplerian components; the noise is the stochastic term.
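The Keplerian sum on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the speaker's code: the function names, the dictionary keys (`K`, `P`, `e`, `omega`, `M0`) and the constant `offset` standing in for the other deterministic terms are all assumptions made here. The true anomaly $v_i(t)$ is obtained by solving Kepler's equation with Newton iterations.

```python
import math

def true_anomaly(t, period, ecc, m0):
    """True anomaly v(t) for a given period, eccentricity and mean anomaly at t = 0."""
    mean_anom = (m0 + 2.0 * math.pi * t / period) % (2.0 * math.pi)
    # Solve Kepler's equation E - e sin E = M with Newton iterations.
    ecc_anom = mean_anom if ecc < 0.8 else math.pi
    for _ in range(50):
        step = (ecc_anom - ecc * math.sin(ecc_anom) - mean_anom) / (
            1.0 - ecc * math.cos(ecc_anom)
        )
        ecc_anom -= step
        if abs(step) < 1e-12:
            break
    # Convert the eccentric anomaly to the true anomaly.
    return 2.0 * math.atan2(
        math.sqrt(1.0 + ecc) * math.sin(ecc_anom / 2.0),
        math.sqrt(1.0 - ecc) * math.cos(ecc_anom / 2.0),
    )

def rv_model(t, planets, offset=0.0):
    """Sum of Keplerian components K (cos(v + omega) + e cos(omega)) plus a constant.

    `planets` is a list of dicts with hypothetical keys K, P, e, omega, M0;
    `offset` is a stand-in for the other deterministic terms.
    """
    total = offset
    for p in planets:
        v = true_anomaly(t, p["P"], p["e"], p["M0"])
        total += p["K"] * (math.cos(v + p["omega"]) + p["e"] * math.cos(p["omega"]))
    return total
```

For a circular orbit the model reduces to a pure cosine of the mean anomaly, which gives a quick sanity check of the implementation.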
  6. Noise I

    Photon noise. Instrumental systematics (example: SOPHIE drift as a function of time; credit: François Bouchy).
    Error on measurement: nominal error bars, $\sigma_i$; nominal error bars + jitter (instrumental and/or stellar), $\sqrt{\sigma_i^2 + \sigma_J^2}$.
    Not on all spectrographs!
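As an illustration of how the jitter term enters an analysis, here is a minimal sketch of a Gaussian log-likelihood in which the variance of each measurement is $\sigma_i^2 + \sigma_J^2$. It assumes independent Gaussian errors, which is a simplification compared with the correlated noise discussed on the following slides; the function name and arguments are hypothetical.

```python
import math

def log_likelihood(residuals, sigmas, jitter):
    """Log-likelihood of residuals (data minus model) for independent Gaussian noise.

    Each point has variance sigma_i**2 + jitter**2 (nominal error bar plus jitter).
    """
    total = 0.0
    for r, s in zip(residuals, sigmas):
        var = s * s + jitter * jitter
        total += -0.5 * (r * r / var + math.log(2.0 * math.pi * var))
    return total
```

A residual much larger than its nominal error bar is penalised far less once a jitter term is allowed, which is why the jitter is usually fitted together with the planetary parameters.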
  7. Noise II: stellar activity

    Convection cells on the surface of the star (granulation): creates correlated noise.
    Stochastic appearance of spots and faculae on the surface + inhibition of the convective blueshift: creates noise at the time-scale of the stellar rotation period.
    Saar & Donahue 1997, Meunier et al. 2010, Boisse et al. 2011, Dumusque et al. 2014. See also Cegla 2019.
    [Figure labels: approaching limb, receding limb; super granulation, meso granulation, granulation, p-modes. Credit: NASA/SDO; from Dumusque et al. 2011.]
  8. Stellar activity effect

    Simulated observations: System 1, RV fitting challenge (Dumusque et al. 2017). [Figure: ideal and noisy RV signals (RV in m/s vs. time in days) and their generalized Lomb-Scargle periodograms (normalized RSS vs. period in days), with the periods of the injected planets, low-frequency structures and the stellar rotation marked.]
  9. Radial velocity time-series: summary

    Main characteristics: • Unevenly sampled: close to ~1 day sampling step with missing samples • ~40 - 1000 data points • Corrupted by uncorrelated and complex correlated noise
    Objectives: • How many planets? • With what orbital elements?
    [Figure: ΔRV (m/s) vs. date (BJD - 2,450,000.0, in days).]
  10. Challenge II: dealing with the complex noises

    Sun radial velocities observed by HARPS-N: the Sun is observed as a planet-hosting star (Dumusque et al. 2021, Collier-Cameron et al. 2019). [Figure: solar radial velocities compared with the expected signal due to the Earth, stellar surface + instrumental effects. Credits: Annelies Mortier, NASA/SDO.]
    We need to deal with these complex noises to detect exo-Earths.
  11. RV data analysis in a nutshell

    $p(\theta, \eta \mid y) \approx p(\theta, \eta \mid I, \widehat{RV}) = \frac{p(I, \widehat{RV} \mid \theta, \eta)\, p(\theta, \eta)}{p(I, \widehat{RV})}$, with $\widehat{RV} = RV_{\text{center of mass}} + RV_{\text{contam}} + \text{measurement error}$.
    Notation: $\theta$ planet parameters; $\eta$ other parameters; $y$ data; $I$ shape variation indicators.
    $RV_{\text{center of mass}}$ is a pure Doppler shift; other effects also affect the spectral shape.
    • How to reduce the spectrum? • What model do I use? • How do I compute everything? • Based on this, how do I take a decision on the number of planets?
    Upcoming review with Eric Ford.
  12. RV data analysis in a nutshell reduce model compute decide

    (reduce model decide) compute
  13. Decision: how many planets?

    Periodograms (Lomb 1976, Ferraz-Mello 1981, Scargle 1982, Baluev 2008, 2009, 2013, 2015, Zechmeister & Kürster 2009, Sulis 2016): fast and numerically stable, but look for one planet at a time. Detection criterion: FAP < 0.1 %.
    Bayesian techniques (Gregory 2007, Gregory & Ford 2007, Tuomi et al. 2011, Diaz et al. 2016): more precise, but with a much heavier computational workload, convergence that is not trivial to ensure, and no information on the period. Evidence: $E_{n\ \text{planets}} = \int p(y \mid \theta, n)\, p(\theta \mid n)\, d\theta$. Detection criterion: $E_{n+1\ \text{planets}} / E_{n\ \text{planets}} > 150$.
    [Figure: generalized Lomb-Scargle periodogram (normalized RSS vs. period in days) of 3 sines with SNR 10, showing the periodogram, the true spectrum and the tallest peak.]
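To make the periodogram side of this comparison concrete, the sketch below computes a generalized Lomb-Scargle-type periodogram by weighted least squares: at each trial frequency a constant plus a sinusoid is fitted, and the power is the relative decrease of the weighted residual sum of squares with respect to a constant-only fit. This is one reading of the "normalized RSS" axis, assumed here rather than taken from the slides; `gls_periodogram` is a hypothetical name and NumPy is assumed to be available.

```python
import numpy as np

def gls_periodogram(t, y, sigma, omegas):
    """Relative decrease in weighted RSS when adding a sinusoid at each angular frequency."""
    t, y, sigma = (np.asarray(a, dtype=float) for a in (t, y, sigma))
    w = 1.0 / sigma
    # Baseline: weighted fit of a constant only.
    mean = np.sum(y * w**2) / np.sum(w**2)
    chi2_0 = np.sum(((y - mean) * w) ** 2)
    power = np.empty(len(omegas))
    for i, om in enumerate(omegas):
        # Linear model: constant + cosine + sine at frequency om.
        A = np.column_stack([np.ones_like(t), np.cos(om * t), np.sin(om * t)])
        coef, *_ = np.linalg.lstsq(A * w[:, None], y * w, rcond=None)
        chi2 = np.sum(((y - A @ coef) * w) ** 2)
        power[i] = (chi2_0 - chi2) / chi2_0
    return power
```

Because only one sinusoid is fitted at a time, the tallest peak is searched for, removed, and the procedure repeated, which is the "one planet at a time" limitation noted on the slide.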
  14. Take 1: sparse recovery

  15. ℓ1 periodogram

    Radial velocity data analysis with compressed sensing techniques (Hara, Boué, Laskar, Correia 2017). Based on sparse recovery techniques (Chen & Donoho 1998).
    Analytical estimate of false alarm probability: Efficient modelling of correlated noise I (Delisle, Hara, Ségransan 2020).
    Nelson, Ford et al. 2020: 6 systems with 200 points in 22 s.
    [Figure: generalized Lomb-Scargle periodogram (normalized RSS) and ℓ1 periodogram (RV in m/s) vs. period (days) for 3 sines with SNR 10, each compared with the true spectrum.]
  16-23. ℓ1 periodogram (figure-only slides)

  24. Application

    HD 158259: 5 to 6 planets. The SOPHIE search for northern extrasolar planets. XVI. HD 158259: A compact planetary system in a near-3:2 mean motion resonance chain, Hara et al. 2020, A&A. SOPHIE radial velocities; close to a 3:2 mean motion resonance. [Figure: classical periodogram and ℓ1 periodogram of HD 158259 vs. period (days), noise model with best cross validation score; planets marked as transiting (TESS) or not transiting.]
    TOI 178: 6 planets. Six transiting planets and a chain of Laplace resonances in TOI-178, Leleu, Alibert, Hara et al. 2021, A&A. ESPRESSO (PI) + CHEOPS GTO; 5 outer planets in a chain of Laplace resonances. [Figure axes: equilibrium temperature (K), density (g/cm3).]
    https://github.com/nathanchara/l1periodogram
  25. Works well, but what would work best? Take 2: optimal

    detection criterion
  26. Question

    • What do we mean by optimal detection criterion? • What is the optimal solution? • How does it perform? Hara, de Poyferré, Delisle, Hoffmann 2022 (submitted, arXiv:2203.04957); Hara, Unger, Delisle, Díaz, Ségransan 2021 (A&A, accepted)
  27. What do we mean by « optimal detection criterion »?

  28. Definition of a detection

    General likelihood model: $p(y \mid (\theta_j)_{j=1..n}, \eta)$, where $y$ is the data, $\theta_j$ the vector of orbital elements of planet $j$, $n$ the number of planets in the model, and $\eta$ the other parameters (offsets, trends, hyperparameters of a Gaussian process).
    We define a detection claim as « There are $n$ planets, one planet with orbital elements $\theta \in \Theta_1$, …, one planet with orbital elements $\theta \in \Theta_n$ ». The $\Theta_i$ are regions of the parameter space. [Figure: parameter space with regions $\Theta_1$, $\Theta_2$.]
  29. General framework

    General likelihood model: $p(y \mid (\theta_j)_{j=1..n}, \eta)$, where $y$ is the data, $\theta_j$ the vector of parameters of pattern $j$, $n$ the number of patterns in the model, and $\eta$ the nuisance parameters.
    We define a detection claim as « There are $n$ patterns, one pattern with parameters $\theta \in \Theta_1$, …, one pattern with parameters $\theta \in \Theta_n$ ». The $\Theta_i$ are regions of the parameter space. [Figure: parameter space with regions $\Theta_1$, $\Theta_2$.]
  30. Definition of a detection: RV case

    $p(y \mid (\theta_j)_{j=1..n}, \eta)$, where $y$ is the time series of spectra or RV, and $\theta_j$ contains $K_j$, $e_j$, $\varpi_j$, $M_{0j}$ and the orbital frequency $\omega_j = 2\pi/P_j$.
    Example: we claim the detection of two planets with a certain accuracy on their frequencies. There is one planet with orbital frequency between $\omega_1 - \Delta\omega/2$ and $\omega_1 + \Delta\omega/2$. There is one planet with orbital frequency between $\omega_2 - \Delta\omega/2$ and $\omega_2 + \Delta\omega/2$.
  31. False and missed detections

    [Figure: orbital frequency axis; there are truly three planets at the marked frequencies.]
    Correct detection: I claimed that there is a planet with orbital frequency between $\omega_1 - \Delta\omega/2$ and $\omega_1 + \Delta\omega/2$, and there is one.
    False detection: I claimed that there is one planet with orbital frequency between $\omega_1 - \Delta\omega/2$ and $\omega_1 + \Delta\omega/2$, but there is none.
    Missed detections: I claimed that there are two planets, but there are three (one missed detection). Alternately: I missed two planets truly present.
  32. What do we mean by optimal detection criterion?

    A decision rule selecting the $\Theta_i$, $i = 1..n$, in the claim « There are $n$ planets, one planet with orbital elements $\theta \in \Theta_1$, …, one planet with orbital elements $\theta \in \Theta_n$ » by
    • maximizing the expected value of the utility (Von Neumann and Morgenstern 1947), or equivalently minimizing Cost = Number of false detections + Number of missed detections, with a relative weight $\gamma$ between the two terms, as a function of $\gamma$;
    • or minimizing the expected number of missed detections with the constraint Expected number of false detections < $x$, as a function of $x$.
    The expectation is taken on the posterior probability $p((\theta_j)_{j=1..n}, \eta \mid y)$.
  33. Computing the cost function

    « There are $n$ planets, one planet with orbital elements $\theta \in \Theta_1$, …, one planet with orbital elements $\theta \in \Theta_n$ »
  34. What is the optimal solution?

  35. The complicated case: overlapping detections

    This case concentrates most of the theoretical complications. If the prior forbids two signals from being too close, everything becomes simple. [Figure: orbital frequency axes with intervals $\Delta\omega$, $\Delta\omega$ and $2\Delta\omega$.]
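  36. Solution for exoplanets I

    • Minimize Cost function = Number of false detections + Number of missed detections, with a relative weight $\gamma$ between the two terms, as a function of $\gamma$.
    • Minimize the number of missed detections with a constraint on the expected number of false detections, Expected number of false detections < $x$, as a function of $x$.
    Both problems have the same solution. [Figure: orbital frequency axis with intervals $\Delta\omega$, $\Delta\omega$ and $2\Delta\omega$.]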
  37. Solution for exoplanets II

    Simply compute the posterior probability to have a planet in a frequency interval of width $\Delta\omega$:
    $\mathrm{TIP} = p(\text{planet with frequency in interval } [\omega - \Delta\omega/2,\ \omega + \Delta\omega/2] \mid y)$, $\mathrm{FIP} = 1 - \mathrm{TIP}$.
    TIP = True inclusion probability; FIP = False inclusion probability. Data from Lovis et al. 2011.
  38. Computing the TIP/FIP

    Simply compute the posterior probability to have a planet in a frequency interval:
    $\mathrm{TIP} = p(\text{planet with frequency in } [\omega - \Delta\omega/2,\ \omega + \Delta\omega/2] \mid y) = \sum_{k=1}^{n_{\max}} p(\text{planet with frequency in } [\omega - \Delta\omega/2,\ \omega + \Delta\omega/2] \mid y, k \text{ planets})\; p(k \text{ planets} \mid y)$
    $p(k \text{ planets} \mid y) = \frac{p(y \mid k \text{ planets})\, p(k \text{ planets})}{\sum_{j=1}^{n_{\max}} p(y \mid j \text{ planets})\, p(j \text{ planets})}$
    Both factors are by-products of Bayesian evidence calculations. We use Polychord (Handley et al. 2015a, b), with $\Delta\omega = 2\pi / T_{\text{obs}}$.
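The sum over models on this slide can be sketched as follows. The sketch assumes that, for each number of planets k, equally weighted posterior samples of the orbital frequencies are available (nested sampling outputs are weighted, so they would first need resampling), together with the posterior probabilities p(k planets | y). The function name and the data layout are hypothetical.

```python
def tip(omega, delta_omega, samples_by_k, pnp):
    """True inclusion probability for the interval [omega - dw/2, omega + dw/2].

    samples_by_k[k]: list of posterior samples of the k-planet model,
                     each sample being a list of k orbital frequencies.
    pnp[k]:          posterior probability of the k-planet model.
    """
    lo, hi = omega - delta_omega / 2.0, omega + delta_omega / 2.0
    total = 0.0
    for k, samples in samples_by_k.items():
        if not samples:
            continue
        # Fraction of samples with at least one planet in the interval.
        hits = sum(1 for freqs in samples if any(lo <= f <= hi for f in freqs))
        total += pnp[k] * hits / len(samples)
    return total
```

Evaluating `1 - tip(...)` on a grid of frequencies gives the FIP as a function of frequency, which is what the later slides call the FIP periodogram.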
  39. Computational trick

    $RV_{1\ \text{planet}} = A \cos\nu + B \sin\nu$, so the radial velocity model is $y = M(\theta)x$, where $\theta$ gathers the parameters on which the model depends non-linearly (eccentricity, period…).
    With a Gaussian mixture prior on $x$ knowing $\theta$, $P(y \mid \theta) = \int P(y \mid \theta, x)\, p(x \mid \theta)\, dx$ has an analytical expression.
    Need to explore 3 parameters per planet instead of 5. Gaussian mixture: possibility to have a multimodal prior on mass.
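The analytical marginalisation can be illustrated under explicit assumptions: Gaussian noise with covariance `noise_cov`, and a prior on the linear parameters x that is a mixture of Gaussians. For a single component N(mu, Lambda), y is then Gaussian with mean M mu and covariance noise_cov + M Lambda M^T; the mixture case is a weighted sum of such terms. The function names are hypothetical, and the design matrix shown is for a circular orbit only, where the true anomaly is linear in time.

```python
import numpy as np

def design_matrix(t, omega):
    """Columns cos(omega t), sin(omega t): circular-orbit case of A cos(nu) + B sin(nu)."""
    t = np.asarray(t, dtype=float)
    return np.column_stack([np.cos(omega * t), np.sin(omega * t)])

def log_marginal_likelihood(y, M, noise_cov, prior_mean, prior_cov):
    """log p(y | theta) with x ~ N(prior_mean, prior_cov) integrated out analytically."""
    y = np.asarray(y, dtype=float)
    resid = y - M @ prior_mean
    cov = noise_cov + M @ prior_cov @ M.T
    _, logdet = np.linalg.slogdet(cov)
    quad = resid @ np.linalg.solve(cov, resid)
    return -0.5 * (quad + logdet + len(y) * np.log(2.0 * np.pi))

def log_marginal_mixture(y, M, noise_cov, weights, means, covs):
    """Same quantity for a Gaussian mixture prior: log of the weighted sum of components."""
    terms = [
        np.log(w) + log_marginal_likelihood(y, M, noise_cov, m, c)
        for w, m, c in zip(weights, means, covs)
    ]
    return np.logaddexp.reduce(terms)
```

With x integrated out this way, a sampler only has to explore the non-linear parameters, which is the reduction from 5 to 3 parameters per planet mentioned on the slide.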
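  40. How do we decide on n_max?

    $\mathrm{TIP} = \sum_{k=1}^{n_{\max}} p(\text{planet with frequency in } [\omega - \Delta\omega/2,\ \omega + \Delta\omega/2] \mid y, k \text{ planets})\; p(k \text{ planets} \mid y)$
    $p(k \text{ planets} \mid y) = \frac{p(y \mid k \text{ planets})\, p(k \text{ planets})}{\sum_{j=1}^{n_{\max}} p(y \mid j \text{ planets})\, p(j \text{ planets})}$
    For fixed n_max: several runs. Increment n_max: does the FIP periodogram change?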
  41. How does our new detection criterion perform?

  42. FIP: performances

    Simulation: 1000 systems with 0, 1 or 2 planets generated on 80 time-stamps (circular, log-uniform period, Rayleigh prior on K, uniform on phase).
    Search for planets with different methods with correct priors: • periodogram or ℓ1 periodogram + false alarm probability (FAP) or Bayes factor • FIP periodogram + FIP, FAP or Bayes factor
    Claims of the form « There is one planet with orbital frequency between $\omega_1 - \Delta\omega$ and $\omega_1 + \Delta\omega$ » are classified as true detections or false detections.
    [Figure: generalized Lomb-Scargle periodogram and ℓ1 periodogram for 3 sines with SNR 10, compared with the true spectrum.] Hara et al. 2017, github.com/nathanchara/l1periodogram
  43. FIP: performances Simulation: 1000 systems with 0,1 or 2 planets

    White noise simulation Red noise simulation (exponential kernel)
  44. Interpretation Simulation: 1000 systems with 0,1 or 2 planets On

    average, among N independent detections with TIP = p, pN detections are correct TIP: True inclusion probability
  45. Robustness to a prior change We analyse the data with

    the wrong prior. Dashed lines: FIP periodogram + Bayes factor. Solid lines: FIP periodogram + FIP. Data generated with • Periods log-uniform on 1-100 days • Semi-amplitude Rayleigh prior with σ
  46. HD 10180

    Planets  log(E)   σ log(E)  PNP        Runtime
    0        -882.45  0.23      3.82e-108  17 s
    1        -839.36  0.19      1.08e-93   1 min 56 s
    2        -789.03  0.24      3.72e-76   6 min 57 s
    3        -736.95  0.04      1.19e-57   17 min 39 s
    4        -677.56  0.15      7.27e-36   38 min 41 s
    5        -603.42  0.13      1.12e-07   1 h 33 min 31 s
    6        -590.14  0.30      6.60e-02   4 h 46 min 46 s
    7        -587.49  0.22      9.34e-01   14 h 59 min 57 s

    Favours 7 planets and the evidence keeps increasing
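The step from log-evidences to a posterior probability of the number of planets can be sketched as a normalised exponential. Two assumptions are made here and are not stated on the slide: the log(E) values are natural logarithms, and every number of planets has the same prior probability. No claim is made about how the PNP column of the slide was actually computed.

```python
import math

def posterior_number_of_planets(log_evidences, log_priors=None):
    """p(k planets | y) from log-evidences, computed stably by subtracting the maximum."""
    if log_priors is None:
        # Assumption: equal prior probability for every number of planets.
        log_priors = [0.0] * len(log_evidences)
    log_post = [le + lp for le, lp in zip(log_evidences, log_priors)]
    shift = max(log_post)
    weights = [math.exp(v - shift) for v in log_post]
    norm = sum(weights)
    return [w / norm for w in weights]

# log(E) column of the HD 10180 table, for 0 to 7 planets.
log_e = [-882.45, -839.36, -789.03, -736.95, -677.56, -603.42, -590.14, -587.49]
pnp = posterior_number_of_planets(log_e)
```

Under these assumptions the entries for 6 and 7 planets come out near 0.066 and 0.934, in line with the last two rows of the table.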
  47. How do I compute all this?

  48. Numerical aspects

    Computationally heavy
  49. Numerical methods: S+LEAF matrices

    Matrix inversions are typically in $O(N^3)$ but faster for certain matrices. For semi-separable matrices, the inversion is in $O(N)$.
    CELERITE kernels yield semi-separable covariance matrices (Foreman-Mackey et al. 2017):
    $k(t_i, t_j) = k(\Delta t) = \sum_{s<n_c} \left(a_s \cos(\nu_s \Delta t) + b_s \sin(\nu_s \Delta t)\right) e^{-\lambda_s \Delta t}$
    S+LEAF matrices = semi-separable matrix + Leaf matrix. Inversion still in $O(N)$ (Delisle, Hara, Ségransan 2020b).
    $V = \mathrm{diag}(A) + \mathrm{tril}(U S^T) + \mathrm{triu}(S U^T)$
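To make the kernel formula concrete, the sketch below evaluates a sum of exponentially damped cosine and sine terms and fills a dense covariance matrix with it. The names and the tuple layout `(a, b, nu, lam)` are hypothetical. The dense construction is for illustration only: it costs O(N^2) in memory and does not exploit the semi-separable structure that gives the O(N) inversion discussed on the slide.

```python
import math

def kernel(dt, terms):
    """k(dt) = sum_s (a_s cos(nu_s |dt|) + b_s sin(nu_s |dt|)) exp(-lam_s |dt|).

    terms: list of (a, b, nu, lam) tuples, one per kernel component.
    """
    dt = abs(dt)
    return sum(
        (a * math.cos(nu * dt) + b * math.sin(nu * dt)) * math.exp(-lam * dt)
        for a, b, nu, lam in terms
    )

def covariance_matrix(times, sigmas, terms):
    """Dense covariance: kernel between all pairs of times plus measurement variances."""
    n = len(times)
    return [
        [
            kernel(times[i] - times[j], terms) + (sigmas[i] ** 2 if i == j else 0.0)
            for j in range(n)
        ]
        for i in range(n)
    ]
```

A single term with `nu = 0` and `b = 0` reduces to a plain exponential kernel, the red-noise model mentioned in the simulations earlier in the talk.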
  50. Complexity

    $V = \mathrm{diag}(A) + \mathrm{tril}(U S^T) + \mathrm{triu}(S U^T)$, where $S$ and $U$ are $n \times r$, $r \leqslant n$. [Labels: semi-separable component; Leaf component.]
    $b_i$: non-zero extra diagonal coefficients.
    Inversion cost of a S+LEAF matrix: $O((\bar{b}^2 + r\bar{b} + r^2)N)$.
    Leaf matrices can model calibration noise.
  51. Semi-separable + Leaf matrices

    [Figure panels: generated quasi-periodic signal; simulated data with calibration noise; Gaussian process prediction with quasi-periodic kernel, calibration noise ignored; Gaussian process prediction with quasi-periodic kernel and calibration component.]
    Calibration component: important for densely sampled data.
  52. Gaussian processes

    From Rajpaul et al. 2015. $X$ is a Gaussian process; the radial velocity and the spectroscopic indicators are modelled as
    $RV = V_c X(t) + V_r \dot{X}(t)$; $\log R'_{HK} = L_c X(t)$; $BIS = B_c X(t) + B_r \dot{X}(t)$.
    Augmented data: $(RV, \log R'_{HK}, BIS)^T$.
    [Figure: schematic CCFs (relative intensity vs. wavelength lag) for granular and inter-granular regions; sum of CCFs and bisector, with FWHM, BIS bottom and BIS top.]
  53. Gaussian processes: a data driven approach

    From Jones et al. 2017. $X$ is a Gaussian process, linked to the radial velocity and the indicators. Let the data select which $a_{ij}$ are non-zero with BIC, AIC, cross validation…
    [Figure: schematic CCFs (relative intensity vs. wavelength lag) for granular and inter-granular regions; sum of CCFs and bisector, with FWHM, BIS bottom and BIS top.]
  54. Multivariate Gaussian processes

    S+LEAF still in $O(N)$ for multivariate time series. See Delisle et al. 2022. https://gitlab.unige.ch/Jean-Baptiste.Delisle/spleaf
  55. Back to the general case

    General likelihood model: $p(y \mid (\theta_j)_{j=1..n}, \eta)$, where $y$ is the data, $\theta_j$ the vector of parameters of pattern $j$, $n$ the number of patterns in the model, and $\eta$ the nuisance parameters.
    We define a detection claim as « There are $n$ patterns, one pattern with parameters $\theta \in \Theta_1$, …, one pattern with parameters $\theta \in \Theta_n$ ». The $\Theta_i$ are regions of the parameter space. [Figure: parameter space with regions $\Theta_1$, $\Theta_2$.]
  56. Conclusion

    $p(y \mid (\theta_j)_{j=1..n}, \eta)$. Have $n$ discrete hypotheses $H_i$, $i = 1..n$. How many are true? The $\theta$s are indices: $\theta_j = (i)_{i=1..m}$.
  57. General context

    Works for $y = Ax + \epsilon$. $p(y \mid (\theta_j)_{j=1..n}, \eta)$. The $\theta$s are indices and amplitudes: $\theta_j = (i, x_i)_{i=1..m}$. Generalises the Barbieri & Berger 2004 framework.
  58. Conclusion Bayes factors are cool but why settle for second

    best? Just like Bayes factors, the result heavily depends on the model -> average over models, check residuals. Probability(what I’m interested in | data)