
Exoplanet direct imaging data challenge

Presentation given at the Paris-Saclay Center for Data Science (INRIA Saclay) on the topic of data challenges for astronomical high-contrast imaging and direct detection of exoplanets.


  1. Exoplanet direct imaging data challenge. Carlos Alberto Gomez Gonzalez, Paris-Saclay Center for Data Science, 11/04/2018
  2. Grenoble Alpes Data Institute
    ▪ WP0: Coordination
    ▪ WP1: Data Science for Earth, Space and Environmental Sciences
    ▪ WP2: Data Science for Life Sciences
    ▪ WP3: Massive and Rich Data for Humanities
    ▪ WP4: Data Science, Social Media and Social Sciences
    ▪ WP5: Data Governance, Data Protection and Privacy
    Research domains: MSTIC - Mathematics, Information and Communication Sciences and Technologies; CBS - Chemistry, Biology, Health; PAGE - Particle physics, Astrophysics, Geosciences, Environment and ecology; SHS - Humanities and Social Sciences; PSS - Social Sciences. Also: USMB – Univ Savoie Mont Blanc.
    19 labs involved, covering 5 research domains. IDEX Cross-disciplinary Program • 1.7 million euros • From 2017 to 2020
  3. Grenoble Alpes Data Institute
    • Data Challenge: epigenetic & High-dimension Mediation Data Challenge; audio-visual diarization; cancer research; home-made framework (CodaLab / Jupyter)?
    • Data Science in the Alpes (March 20)
    • R in Grenoble group
    • PySciDataGre group
    • Data club
    • Data Carpentry
  4. What do I do
    • Interdisciplinary research.
    • Exoplanetary science and astrophysics with CS & ML.
    • Integrating cutting-edge ML developments.
    • Ensuring the use of robust statistical approaches and well-suited metrics.
    • Open-source development.
    • Data challenges.
  5. Mostly, we rely on indirect methods for detecting exoplanets … it’s very hard to “see” them
  6. [Animation] Credit: NASA

  7. Power of direct imaging. [Figure: HR8799 planets b, c, d, e in L’ band, scale 20 AU / 0.5” (Marois et al. 2010); further panels credited to Milli et al. 2016, Konopacky et al. 2013 and Bowler 2016]
  8. SPHERE (Vigan et al. 2015), Very Large Telescope (VLT), Chile

  9. Processing pipeline, from raw data to detection:
    • Raw astronomical images
    • Basic calibration and “cosmetics”: dark/bias subtraction, flat fielding, sky or thermal background subtraction, bad pixel correction
    • Sequence of calibrated images
    • Image recentering
    • Bad frames removal
    • PSF modeling: median; pairwise, ANDROMEDA; LOCI; PCA, NMF; LLSG
    • Model PSF subtraction
    • De-rotation (ADI) or rescaling (mSDI)
    • Image combination
    • Detection on final residual image
    • Characterization of detected companions
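
The simplest branch of the pipeline on this slide (median PSF modeling, model PSF subtraction, de-rotation, image combination) can be sketched in a few lines. This is a toy illustration in pure Python, not the implementation used in any real pipeline: the helper names (`median_frame`, `rotate_nearest`, `adi_median`), the synthetic speckle pattern, the nearest-neighbour rotation and the sign convention for the rotation angles are all assumptions made for the example. Real data would need sub-pixel interpolation and the instrument's own parallactic-angle convention.

```python
import math
from statistics import median


def median_frame(cube):
    """Pixel-wise median over a list of equally sized 2-D frames."""
    n_rows, n_cols = len(cube[0]), len(cube[0][0])
    return [[median(frame[y][x] for frame in cube) for x in range(n_cols)]
            for y in range(n_rows)]


def rotate_nearest(frame, angle_deg):
    """Rotate a frame about its centre by angle_deg (nearest-neighbour sampling)."""
    n_rows, n_cols = len(frame), len(frame[0])
    cy, cx = (n_rows - 1) / 2, (n_cols - 1) / 2
    cos_a = math.cos(math.radians(angle_deg))
    sin_a = math.sin(math.radians(angle_deg))
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for y in range(n_rows):
        for x in range(n_cols):
            # Inverse mapping: which source pixel ends up at (x, y)?
            dx, dy = x - cx, y - cy
            sx = round(cx + cos_a * dx + sin_a * dy)
            sy = round(cy - sin_a * dx + cos_a * dy)
            if 0 <= sx < n_cols and 0 <= sy < n_rows:
                out[y][x] = frame[sy][sx]
    return out


def adi_median(cube, angles_deg):
    """Median-PSF ADI: model the PSF, subtract it, de-rotate, combine."""
    psf_model = median_frame(cube)  # static speckles survive the median
    derotated = []
    for frame, angle in zip(cube, angles_deg):
        residual = [[v - m for v, m in zip(row, model_row)]
                    for row, model_row in zip(frame, psf_model)]
        derotated.append(rotate_nearest(residual, -angle))  # align the sky
    return median_frame(derotated)


# Toy data (assumed): a fixed speckle pattern plus a planet that rotates
# with the field of view from frame to frame.
SIZE, RADIUS, FLUX = 11, 3, 10.0
angles = [0, 90, 180, 270]  # multiples of 90 deg keep the toy rotation exact
speckles = [[float((7 * x + 13 * y) % 5) for x in range(SIZE)]
            for y in range(SIZE)]
cube = []
for angle in angles:
    frame = [row[:] for row in speckles]
    px = round(SIZE // 2 + RADIUS * math.cos(math.radians(angle)))
    py = round(SIZE // 2 + RADIUS * math.sin(math.radians(angle)))
    frame[py][px] += FLUX
    cube.append(frame)

final = adi_median(cube, angles)
print(final[SIZE // 2][SIZE // 2 + RADIUS])  # value at the planet position
```

In this idealized case the speckle pattern is perfectly static, so the median model removes it completely and the planet flux is recovered in full. With real speckle noise the subtraction is only approximate, which is what motivates the more elaborate PSF models listed on the slide.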
  10. Angular differential imaging - bright synthetic planet. [Animation: calibrated images; labels: planet, speckle noise]
  11. HR8799 bcde (Marois et al. 2008-2010): one of the lucky cases! Final images after post-processing (several epochs). [Animation]
  12. Vortex Image Processing library (Gomez Gonzalez et al. 2017) • Available on PyPI • Documentation (Sphinx) • Jupyter tutorial • Python 2/3 compatibility • Continuous integration (Travis CI) and automated testing (pytest)
  13. Algo-ZOO. [Figure: collage of post-processing algorithms, credited to Marois et al. 2007; Lafrenière et al. 2007; Mugnier et al. 2009; Soummer et al. 2012; Amara & Quanz 2012; Absil et al. 2013; Marois et al. 2014; Hagelberg et al. 2015; Cantalloube et al. 2015; Gomez Gonzalez et al. 2016; Gomez Gonzalez et al. 2017]
  14. Open Science & reproducibility • Open source

  15. “Today, software is to scientific research what Galileo’s telescope was to astronomy: a tool, combining science and engineering. It lies outside the central field of principal competence among the researchers that rely on it. … it builds upon scientific progress and shapes our scientific vision.” Pradal et al. 2015
  16. [Figure: image sequence, final residual image, detection. Candidate sources in the residual image are marked “?”: speckles (?), real planet, synthetic planet]

  18. “Essentially, all models are wrong, but some are useful.” George Box. “…if the model is going to be wrong anyway, why not see if you can get the computer to ‘quickly’ learn a model from the data, rather than have a human laboriously derive a model from a lot of thought.” Peter Norvig
  19. ML in a nutshell • Supervised: regression, classification • Unsupervised: dimensionality reduction (figure axes: PC 1, PC 2), clustering • Density estimation and reinforcement learning
  20. Supervised learning: given a labeled dataset, the goal is to learn a mapping from the input samples to the labels (Goodfellow et al. 2016)
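
As a minimal illustration of "learning a mapping from samples to labels", the sketch below fits a logistic-regression classifier with plain gradient descent on a tiny hand-made dataset. Everything here is assumed for the example: the function names (`train_logistic`, `predict`), the learning rate, the number of epochs and the two toy clusters. It stands in for the general idea only, not for the deep networks discussed in the following slides.

```python
import math


def train_logistic(samples, labels, lr=0.1, epochs=2000):
    """Fit weights w and bias b of a logistic model by batch gradient descent."""
    n_features = len(samples[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(samples, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of label 1
            for j, xj in enumerate(x):
                grad_w[j] += (p - y) * xj
            grad_b += p - y
        w = [wi - lr * g / len(samples) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(samples)
    return w, b


def predict(w, b, x):
    """Learned mapping: return label 1 if the linear score is positive, else 0."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0


# Labeled dataset (assumed toy values): two well-separated 2-D clusters.
samples = [(0.0, 0.2), (0.5, -0.1), (-0.3, 0.4), (0.2, 0.6),
           (3.0, 2.8), (2.6, 3.3), (3.4, 3.1), (2.9, 2.5)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = train_logistic(samples, labels)
print([predict(w, b, x) for x in samples])
print(predict(w, b, (0.1, 0.1)), predict(w, b, (3.0, 3.0)))
```

The last line applies the learned mapping to two samples that were not in the training set, which is the point of the exercise: the model is judged on how it labels new inputs.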
  21. Deep neural networks

  22. Supervised detection of exoplanets (Gomez Gonzalez et al. 2018)

  23. [Diagram: noisy and unlabelled images; model PSF subtraction gives combined residuals; supervised detection (SODINN) uses data transformation + adequate (ML) model for higher sensitivity]
  24. Data-driven performance assessment

  25. No need to reinvent the wheel :)

  26. Data challenges
    Old school:
    • Small committee takes care of most of the planning.
    • Main organizer takes care of logistics/leaderboard.
    • Main organizer writes a review-type paper.
    Open science:
    • Community effort.
    • Using a robust framework for creating data challenges.
    • Hands-on sessions.
    • Workshop for analyzing results and learning from different approaches.
  27. Exoplanet DI challenge: Motivation
    • Low detection rate so far - observational bias or reality?
    • Several instruments/surveys with large databases.
    • New instruments coming online in the next few years.
    • ~13 years of image processing techniques.
    • Discovering new techniques!
  28. Exoplanet DI challenge
    • Benchmark datasets, metrics, sub-challenges.
    • Finding the right platform (RAMP?).
    • Avoid excluding “old” code/pipelines: IDL, MATLAB, etc.
    Timeline:
    • Planning: benchmark datasets, sub-challenges, metrics
    • Kick-off one-day session (RAMP?)
    • Submission of results
    • Final leaderboard; comparison of results/approaches
    • Workshop: image processing for exoplanet DI
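
The slide leaves the choice of metrics open. One plausible candidate, sketched here purely as an assumption, is to score each submission by matching its reported detections against the known positions of injected synthetic planets, then counting true positives, false positives and misses. The function name `score_detections`, the 2-pixel matching tolerance and the example coordinates are all hypothetical.

```python
import math


def score_detections(injected, detected, tol=2.0):
    """Match detections to injected (x, y) positions within tol pixels."""
    unmatched = list(injected)
    tp = fp = 0
    for det in detected:
        # Take the first still-unmatched injection close enough to this detection.
        match = next((inj for inj in unmatched if math.dist(det, inj) <= tol), None)
        if match is None:
            fp += 1  # nothing was injected here: false positive
        else:
            unmatched.remove(match)  # each injection can be claimed only once
            tp += 1
    fn = len(unmatched)  # injections nobody found
    tpr = tp / len(injected) if injected else 0.0
    return {"tp": tp, "fp": fp, "fn": fn, "tpr": tpr}


# Hypothetical example: three injected planets, three reported detections.
injected = [(10, 10), (30, 40), (50, 20)]
detected = [(10.5, 9.5), (31, 41), (70, 70)]
result = score_detections(injected, detected)
print(result)
```

Sweeping the detection threshold of an algorithm and recording the true positive rate against the number of false positives at each step would give a ROC-style curve, one way to compare pipelines written in different languages on the same benchmark datasets.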
  29. Exoplanet DI challenge
    • Possible sub-challenges.
    • Each observing technique: different data format/dimensionality, variability to exploit.
    • Detection & characterization.
    • Focus on exoplanets (excluding disks?).
    Observing techniques: ADI (NACO, SPHERE/IRDIS, NIRC2, LMIRcam, etc.); ADI + mSDI (SPHERE/IFS, GPI, etc.); RDI, other techniques?
  30. Thank you! carlgogo carlosalbertogomezgonzalez