Fran Bartolić
January 30, 2019

Gaussian process models of correlated noise in microlensing light curves

Talk at the 23rd International Microlensing Conference at the Flatiron Institute in NYC.

Transcript

1. Gaussian process models of correlated noise in microlensing light curves

23rd International Microlensing Conference @ Flatiron CCA
Fran Bartolić, University of St Andrews (fbartolic)

3. Introduction

• Deterministic forward model: the mean function m(t; θ) predicts the flux at time t given model parameters θ
• Probabilistic noise model
• Likelihood (assuming independent noise):

ln p(F | θ) = -1/2 Σ_i [ (F_i - m(t_i; θ))^2 / σ_i^2 + ln(2π σ_i^2) ]

where N is the number of data points, F_i are the measured fluxes, and σ_i are the reported error bars.
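The independent-noise log-likelihood above takes only a few lines of NumPy. A minimal sketch (function and variable names are illustrative, not from the talk):

```python
import numpy as np

def log_likelihood(fluxes, model_fluxes, sigmas):
    """Gaussian log-likelihood assuming independent (uncorrelated) noise.

    fluxes       -- measured fluxes F_i
    model_fluxes -- mean-function predictions m(t_i; theta)
    sigmas       -- reported error bars sigma_i
    """
    resid = fluxes - model_fluxes
    return -0.5 * np.sum(resid**2 / sigmas**2 + np.log(2 * np.pi * sigmas**2))
```

Each data point contributes an independent Gaussian term, so the total is a simple sum; the correlated-noise case replaces this sum with a matrix expression, as shown on a later slide.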
4. Point Source Point Lens: deterministic forward model

• Mean function: F(t) = F_s A(u(t)) + F_b, with magnification A(u) = (u^2 + 2) / (u sqrt(u^2 + 4)) and lens-source separation u(t) = sqrt(u_0^2 + ((t - t_0)/t_E)^2)
• Parameters: (t_0, u_0, t_E, F_s, F_b)
• Reparametrization is necessary for efficient sampling
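The standard PSPL mean function can be sketched directly from these formulas (parameter names are illustrative):

```python
import numpy as np

def pspl_flux(t, t0, u0, tE, F_s, F_b):
    """Point Source Point Lens mean function.

    u(t) -- lens-source separation in Einstein radii
    A(u) -- point-source point-lens magnification
    Returns the blended model flux F_s * A(u(t)) + F_b.
    """
    u = np.sqrt(u0**2 + ((t - t0) / tE) ** 2)
    A = (u**2 + 2) / (u * np.sqrt(u**2 + 4))
    return F_s * A + F_b
```

At the peak (t = t_0) the separation is u_0, so for u_0 = 1 the magnification is 3/sqrt(5) ≈ 1.34; far from the peak the magnification tends to 1 and the flux approaches F_s + F_b.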
5. Point Source Point Lens: deterministic forward model

• Change of parametrization (Dominik 2009)
6. The likelihood function

• Probabilistic noise model with covariance matrix C
• Likelihood (assuming independent noise), written in matrix form:

ln p(F | θ) = -1/2 r^T C^{-1} r - 1/2 ln det(2πC)

where r = F - m(θ) is the vector of residuals and C = diag(σ_1^2, …, σ_N^2)
7. Gaussian processes (GPs): theory

• Formally, GPs are probability distributions over functions
• In practice, a GP evaluated at a discrete set of points is a multivariate Gaussian with a covariance matrix specified by a covariance (kernel) function
• GPs are used extensively in machine learning, statistics, and astronomy for both regression and classification problems
8. Gaussian processes (GPs): theory

• The covariance function models the covariance between any two points
• Covariance functions are parametrized with "hyperparameters"
• Squared exponential kernel: k(t_i, t_j) = σ^2 exp(-(t_i - t_j)^2 / (2ℓ^2))
9. Gaussian processes (GPs): theory

• Matérn 3/2 kernel: k(t_i, t_j) = σ^2 (1 + sqrt(3) |t_i - t_j| / ℓ) exp(-sqrt(3) |t_i - t_j| / ℓ)
• The covariance function models the covariance between any two points
• Covariance functions are parametrized with "hyperparameters"
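Both kernels can be written out directly as dense covariance matrices. A minimal sketch (function names are illustrative):

```python
import numpy as np

def squared_exponential(t1, t2, sigma, ell):
    """k(t_i, t_j) = sigma^2 * exp(-(t_i - t_j)^2 / (2 * ell^2))"""
    tau = t1[:, None] - t2[None, :]
    return sigma**2 * np.exp(-tau**2 / (2 * ell**2))

def matern32(t1, t2, sigma, ell):
    """k(t_i, t_j) = sigma^2 * (1 + sqrt(3)|tau|/ell) * exp(-sqrt(3)|tau|/ell)"""
    r = np.sqrt(3) * np.abs(t1[:, None] - t2[None, :]) / ell
    return sigma**2 * (1 + r) * np.exp(-r)
```

Both produce symmetric positive semidefinite matrices with sigma^2 on the diagonal; the hyperparameters sigma (amplitude) and ell (length scale) control how large and how smooth the correlated noise is.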
10. Gaussian processes (GPs): the likelihood function

• Probabilistic noise model: evaluating the GP likelihood requires inverting the covariance matrix
• Inverting a dense N × N matrix scales as O(N^3)
• Fortunately, Dan Foreman-Mackey's celerite library evaluates GP likelihoods in O(N) for a restricted class of kernels
• Other approximate schemes, such as variationally sparse GPs as implemented in GPflow, are potentially interesting
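A naive GP log-likelihood, evaluated via a Cholesky factorization of the dense covariance matrix, makes the O(N^3) cost concrete; celerite computes the same quantity in O(N) for its kernel family. A minimal NumPy sketch using a Matérn 3/2 kernel (names are illustrative):

```python
import numpy as np

def gp_log_likelihood(t, y, yerr, sigma, ell):
    """GP log-likelihood with a Matern 3/2 kernel plus white measurement noise.

    Builds the dense N x N covariance matrix and factorizes it with a
    Cholesky decomposition, so the cost scales as O(N^3).
    """
    r = np.sqrt(3) * np.abs(t[:, None] - t[None, :]) / ell
    K = sigma**2 * (1 + r) * np.exp(-r) + np.diag(yerr**2)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # K^{-1} y
    log_det = 2 * np.sum(np.log(np.diag(L)))             # ln det K
    n = len(t)
    return -0.5 * (y @ alpha + log_det + n * np.log(2 * np.pi))
```

Here y would be the residuals after subtracting the mean function. Swapping this for celerite leaves the returned number unchanged (for a celerite-representable kernel) but replaces the cubic-cost factorization with a linear-cost one.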
11. Priors

• There is no such thing as a non-informative prior
• Priors can only be understood in the context of the likelihood
• Prior predictive checks are a great way of testing assumptions
• Using Bayesian methods and GPs doesn't mean you can't overfit; you need an informative prior, such as an Inverse Gamma prior, for the length scale hyperparameter (Fuglstad et al. 2018)
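The Inverse Gamma prior suppresses implausibly short length scales, which would otherwise let the GP absorb the signal itself. A quick SciPy check, with shape and scale values chosen purely for illustration:

```python
from scipy.stats import invgamma

# Hypothetical hyperparameters; in practice, tune them so only a small
# prior mass sits below the smallest time separation in the data.
alpha, beta = 2.0, 5.0
prior = invgamma(alpha, scale=beta)

# The density vanishes rapidly as the length scale -> 0, penalizing
# overfitting, while remaining broad around the mode beta / (alpha + 1).
print(prior.pdf(0.01))       # essentially zero
print(prior.pdf(beta / (alpha + 1)))  # density at the mode
print(prior.cdf(0.5))        # small prior mass at very short length scales
```

A prior predictive check here is as simple as drawing length scales from this prior, generating GP samples with each, and asking whether the simulated light curves look like plausible data.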
12. Sampling the posterior with Hamiltonian Monte Carlo (HMC)

• To sample the posterior, I use Hamiltonian Monte Carlo (HMC); it is orders of magnitude more efficient than Metropolis-Hastings or affine-invariant samplers (emcee)
• HMC is pretty much the only thing that works in high dimensions (tens to hundreds of parameters)
• HMC requires the gradient of the log-likelihood with respect to all model parameters, so automatic differentiation is key
• Don't write your own HMC sampler; use existing libraries such as PyMC3, Stan, or TensorFlow, which will complain when sampling fails (which happens very often!)

14. Results

• Generally, including a GP in the model leads to a different posterior over the physical parameters of interest
• Just how different the posterior is depends on data quality
15. Extending the model

• How to deal with outliers? Robust GPs with Student's t noise, a mixture model, sigma clipping?
• How to deal with the reported error bars? A hierarchical model for rescaling factors?
• How to incorporate other information in the noise model? Simultaneously fitting GPs to other stars in the field, tractable approximations of multi-dimensional GPs?
• GPs with binary lens events? Sort out the model without GPs first
16. Take-home messages

• Modeling assumptions matter
• If you're doing Bayesian analysis, state your likelihood function and your priors
• Clever parametrizations can speed up MCMC by several orders of magnitude
• GPs provide an elegant framework for handling correlated noise, and recent innovations make them computationally tractable
• If you want to use gradient-based optimizers or samplers, look into machine learning frameworks such as TensorFlow and PyTorch

Hack session ideas:

• Microlensing data handling infrastructure, cross-matching catalogs
• Interfacing VBBinaryLensing with the DNest4 Diffusive Nested Sampling code
• Forward modeling light curves with an inverse-ray-shooting algorithm built on TensorFlow