
A data scientist’s guide to direct imaging of exoplanets

Invited talk at the RADA big data workshop (https://as595.github.io/RADABigData/) in Medellín, Colombia. The workshop is part of the Radio Astronomy for Development in the Americas (RADA) Big Data program, which is jointly funded by the UK Newton Fund and the Ministerio de Tecnologías de la Información y las Comunicaciones de Colombia. Feb 12, 2019.

Carlos Alberto Gomez Gonzalez

February 12, 2019

Transcript

  1. A data scientist’s guide to direct imaging of exoplanets. Carlos Alberto Gomez Gonzalez. RADA big data workshop (Medellín), Feb 12, 2019.
  2. A data scientist’s guide to direct imaging of exoplanets. “Exo” from extrasolar (outside of the Solar System).
  3. http://exoplanetarchive.ipac.caltech.edu, 25 Jan 2018. Task: try to spot this color and estimate the total contribution of direct-imaging detections… about 1%!
  4. Why direct imaging? [Figure: HR 8799 system, Marois et al. 2010; 20 AU, 0.5″; Konopacky et al. 2013; Bowler 2016; Milli et al. 2016]
  5. From raw astronomical images to a sequence of calibrated images (raw → clean):
     • Basic calibration and “cosmetics”: dark/bias subtraction, flat fielding, sky/thermal background subtraction, bad pixel correction
     • Image recentering
     • Bad frame removal
     • Image combination
     • PSF modeling: median; pairwise, ANDROMEDA; LOCI; PCA/KLIP, NMF; LLSG
     • Model PSF subtraction
     • Detection on the residual frame or on a detection map
     • Characterization of “detected” companions: (R1, θ1, F1), (R2, θ2, F2), (R3, θ3, F3), (R4, θ4, F4)
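The classical reduction sketched above (build a PSF model, subtract it, derotate, combine) can be illustrated with the simplest case, median subtraction. This is a minimal sketch for illustration: `cube`, `angles`, and the use of SciPy's `rotate` for derotation are assumptions, not the pipeline's actual code.

```python
import numpy as np
from scipy.ndimage import rotate  # assumed available for derotation

def median_psf_subtraction(cube, angles):
    """Classical ADI median subtraction: model the stellar PSF as the
    per-pixel median of the image sequence, subtract it, rotate the
    residual frames back to a common sky orientation, and combine them.

    cube: (n_frames, ny, nx) sequence of calibrated images (hypothetical)
    angles: parallactic angle of each frame, in degrees (hypothetical)
    """
    psf_model = np.median(cube, axis=0)       # M: quasi-static speckle model
    residuals = cube - psf_model              # R = D - M, per frame
    derotated = np.array([rotate(frame, -ang, reshape=False)
                          for frame, ang in zip(residuals, angles)])
    return np.median(derotated, axis=0)       # final residual frame

# Random noise standing in for a real ADI sequence
cube = np.random.normal(size=(10, 32, 32))
angles = np.linspace(0.0, 40.0, 10)
final = median_psf_subtraction(cube, angles)
print(final.shape)  # (32, 32)
```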
  6. Algorithms, ten years of research:
     • Median frame subtraction
     • Pairwise subtraction
     • Least-squares image combination
     • PCA (with forward modeling), NMF
     • Low-rank plus sparse decompositions
     • Matched filtering
     • Maximum likelihood estimation
  7. Building blocks… D − M = R: a noise reduction algorithm produces the model M, and R is the residual image containing the exoplanet signal.
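A minimal sketch of the D − M = R idea with a low-rank model M built via PCA, in the spirit of PCA/KLIP. This is an illustrative simplification (no reference library selection, no forward modeling); the function name and shapes are hypothetical.

```python
import numpy as np

def pca_psf_residuals(cube, ncomp=5):
    """Build a low-rank PSF model M by projecting each frame onto the
    first `ncomp` principal components of the image sequence, then
    return R = D - M, which keeps signal not captured by the model."""
    n, ny, nx = cube.shape
    X = cube.reshape(n, ny * nx)          # D: frames as row vectors
    X_mean = X.mean(axis=0)
    Xc = X - X_mean
    # SVD of the centered sequence gives the principal components
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:ncomp]                        # top `ncomp` components
    model = Xc @ V.T @ V + X_mean         # M: low-rank projection
    return (X - model).reshape(n, ny, nx) # R: residual cube

cube = np.random.normal(size=(20, 16, 16))
R = pca_psf_residuals(cube, ncomp=3)
print(R.shape)  # (20, 16, 16)
```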
  8. Building blocks… D − M = R again, but now an observer (or a classifier) inspects R, the residuals produced by the noise reduction algorithm that contain the exoplanet signal.
  9. Detection: from the image sequence to the final residual image. Which of the blobs are speckles, which one is a real planet, and which are synthetic (injected) planets?
  10. Supervised learning. Training data (x_i, y_i), i = 1, …, n (e.g. the “chihuahua or muffin” images). Learn f : X → Y with
      f = argmin_{f_θ, θ ∈ Θ} Σ_{i=1}^{n} L(y_i, f_θ(x_i)) + g(θ)
  11. Supervised learning, same objective f = argmin_{f_θ, θ ∈ Θ} Σ_{i=1}^{n} L(y_i, f_θ(x_i)) + g(θ), highlighting the model architecture: the choice of the function family f_θ, θ ∈ Θ.
  12. Supervised learning, same objective, highlighting the loss function L(y_i, f_θ(x_i)) and the regularization term g(θ).
  13. Supervised learning, same objective, highlighting the optimization: the argmin over θ ∈ Θ.
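The objective on these slides, empirical risk plus a regularizer, can be made concrete with a toy linear sigmoid classifier. The quadratic regularizer g(θ) = λ‖θ‖² and all names here are illustrative choices, not from the talk.

```python
import numpy as np

def objective(theta, X, y, lam=0.1):
    """Sum over training pairs (x_i, y_i) of a loss L(y_i, f_theta(x_i))
    plus a regularizer g(theta) = lam * ||theta||^2.  Here f_theta is a
    linear model with a sigmoid output and L is binary cross-entropy."""
    preds = 1.0 / (1.0 + np.exp(-X @ theta))   # f_theta(x_i)
    eps = 1e-12                                # numerical safety only
    loss = -np.sum(y * np.log(preds + eps)
                   + (1 - y) * np.log(1 - preds + eps))
    return loss + lam * np.dot(theta, theta)   # + g(theta)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = (X[:, 0] > 0).astype(float)
print(objective(np.zeros(3), X, y))  # risk of the all-zero parameters
```

Training means minimizing this quantity over θ, e.g. with (stochastic) gradient descent, which is exactly what the "optimization" slide highlights.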
  14. A network as a stack of transformations: input X → 1st layer (data transformation) → 2nd layer (data transformation) → … → Nth layer (data transformation). ▸ Activation function
  15. Same stack: input X → 1st layer → 2nd layer → … → Nth layer, now with common building blocks. ▸ Max pooling ▸ Dropout ▸ BatchNorm
  16. The training loop: input X flows through the layers (1st, 2nd, …, Nth, each with its own weights) to produce predictions Ŷ; the loss function compares Ŷ against the labels Y to give a loss score, and the optimizer uses that score to update the weights.
  17. Reframing the problem: from unsupervised to supervised learning.
      • Sequences of images without labels
      • Not enough archival data (observed stars)
      • We can generate semi-synthetic data by injecting a planet (PSF) template!
      • We grab patches: signal/noise, using the PSF template
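The semi-synthetic labeling idea above (inject a scaled PSF template into noise patches to create positive samples) might look like the sketch below. The Gaussian stand-in for the instrumental PSF, the patch shapes, and the flux value are all assumptions for illustration.

```python
import numpy as np

def inject_template(patch_cube, psf_template, flux):
    """Create a positive (planet-containing) training sample by adding a
    scaled PSF template to a sequence of noise patches.  The negative
    class is the noise patches alone."""
    return patch_cube + flux * psf_template

# Negative class: patches of residual speckle noise (random stand-in here)
noise_patch = np.random.normal(scale=1.0, size=(10, 11, 11))

# Simple 2D Gaussian standing in for the instrumental PSF template
yy, xx = np.mgrid[-5:6, -5:6]
psf = np.exp(-(xx**2 + yy**2) / (2 * 1.5**2))

positive_sample = inject_template(noise_patch, psf, flux=5.0)
print(positive_sample.shape)  # (10, 11, 11)
```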
  18. SODINN library:
      • DataLabeler: flux vs. S/N sampling; flux/contrast estimation; training data generation; data augmentation; data persistence (load/save with HDF5)
      • Model: network creation (Keras and TensorFlow); model training; model persistence (load/save with HDF5)
      • Predictor: target sample generation; predictions (based on the trained model); probability map inspection; results to HDF5
      The design enables reproducible results, hyper-parameter and network architecture tuning, and comparison of labeling strategies.
  19. Learning a mapping function with a discriminative model: a neural network. Goal: make correct predictions on new samples, f : X → Y, ŷ = p(c+ | MLAR sample). SGD with a binary cross-entropy loss:
      L = −Σ_n [ y_n ln(ŷ_n) + (1 − y_n) ln(1 − ŷ_n) ]
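The binary cross-entropy loss from the slide, written out directly; the clipping by `eps` is a standard numerical-safety addition, not part of the formula.

```python
import numpy as np

def binary_cross_entropy(y, y_hat, eps=1e-12):
    """L = -sum_n [ y_n ln(y_hat_n) + (1 - y_n) ln(1 - y_hat_n) ],
    with y_hat clipped away from 0 and 1 so the logs stay finite."""
    y_hat = np.clip(y_hat, eps, 1 - eps)
    return -np.sum(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

y = np.array([1.0, 0.0, 1.0])        # true labels (planet / no planet)
y_hat = np.array([0.9, 0.1, 0.8])    # predicted probabilities p(c+ | sample)
print(round(binary_cross_entropy(y, y_hat), 4))  # 0.4339
```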
  20. Candidate architectures (X and y split into train/test/validation sets):
      • 3D branch: 3D convolutional layer (kernel 3x3x3, 40 filters) → 3D max pooling (2x2x2) → 3D convolutional layer (kernel 2x2x2, 80 filters) → 3D max pooling (2x2x2) → dense layer (128 units, ReLU activation + dropout) → output dense layer (1 unit, sigmoid activation)
      • 2D branch: stacked 2D convolutional layers (kernel 3x3, 40 filters; then kernel 2x2, 80 filters), each followed by 2D max pooling (2x2) → dense layer (128 units, ReLU activation + dropout) → output dense layer (1 unit, sigmoid activation)
      • Recurrent variant: B-LSTM / B-GRU layer over the image sequence
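A sketch of the 2D branch above in Keras: two convolution + max-pooling stages with the slide's filter counts and kernel sizes, a 128-unit dense layer with ReLU and dropout, and a single sigmoid output unit for the binary planet / no-planet decision. The input size, padding, and dropout rate are assumptions; the actual SODINN models may differ.

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(32, 32, 1)),                 # assumed patch size
    layers.Conv2D(40, (3, 3), activation="relu", padding="same"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(80, (2, 2), activation="relu", padding="same"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.5),                            # assumed rate
    layers.Dense(1, activation="sigmoid"),          # p(c+ | sample)
])
model.compile(optimizer="adam", loss="binary_crossentropy")
print(model.output_shape)  # (None, 1)
```

Training would then be `model.fit(X_train, y_train, ...)` on the labeled patch samples described earlier.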
  21. Exoplanet imaging data challenge:
      • Data from the most representative instruments
      • Metrics
      • Phases
      • Will run on CodaLab
      https://carlgogo.github.io/exoimaging_challenge/
      https://github.com/carlgogo/exoimaging_challenge_extras
  22. Connections with other fields: “Convolutional LSTMs for Cloud-Robust Segmentation of Remote Sensing Imagery” (https://arxiv.org/pdf/1811.02471.pdf), a two-component convolutional long short-term memory (LSTM) network.