• GPs are probability distributions over functions
• In practice, a GP evaluated at a finite set of points is a multivariate Gaussian with a covariance matrix specified by a covariance (kernel) function
• GPs are used extensively in machine learning, statistics, and astronomy for both regression and classification problems
(Figure: discrete vs. continuous draws from a covariance (kernel) function)
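The second bullet can be made concrete with a short sketch: evaluating a GP prior at a grid of points and drawing samples from the resulting multivariate Gaussian. The squared-exponential kernel and all hyperparameter values here are illustrative choices, not from the talk.

```python
import numpy as np

def sq_exp_kernel(x1, x2, amp=1.0, ell=1.0):
    """Squared-exponential covariance: k(x, x') = amp^2 exp(-(x - x')^2 / (2 ell^2))."""
    dx = x1[:, None] - x2[None, :]
    return amp**2 * np.exp(-0.5 * (dx / ell) ** 2)

rng = np.random.default_rng(42)
x = np.linspace(0.0, 10.0, 100)

# A GP evaluated at finitely many points is just a multivariate Gaussian
# whose covariance matrix is the kernel evaluated pairwise on the grid
# (plus a small jitter on the diagonal for numerical stability).
K = sq_exp_kernel(x, x) + 1e-10 * np.eye(len(x))
samples = rng.multivariate_normal(np.zeros(len(x)), K, size=3)
```

Each row of `samples` is one random smooth function drawn from the prior; the kernel's length scale `ell` controls how wiggly those functions are.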
• Inverting an N × N covariance matrix scales as O(N³)
• Fortunately, Dan Foreman-Mackey's celerite library does GPs in O(N)
• Other approximate schemes, such as the variationally sparse GPs implemented in GPflow, are potentially interesting
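To see where the cubic cost comes from, here is a minimal numpy/scipy sketch of the direct GP log-likelihood via a Cholesky factorization, which is the O(N³) step that celerite sidesteps (celerite's own API is not used here; this is just the naive baseline).

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def gp_log_likelihood(y, K):
    """Naive GP log-likelihood: O(N^3) time and O(N^2) memory.

    The Cholesky factorization of the N x N covariance K dominates the cost.
    celerite exploits special kernel structure to get the same answer in O(N).
    """
    n = len(y)
    cho = cho_factor(K, lower=True)
    alpha = cho_solve(cho, y)                          # solves K alpha = y
    log_det = 2.0 * np.sum(np.log(np.diag(cho[0])))    # log|K| from the factor
    return -0.5 * (y @ alpha + log_det + n * np.log(2.0 * np.pi))
```

For white noise (`K` diagonal) this reduces to the usual independent-Gaussian log-likelihood, which is a convenient sanity check.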
• Priors can only be understood in the context of the likelihood
• Prior predictive checks are a great way of testing assumptions
• Using Bayesian methods and GPs doesn't mean you can't overfit: use an informative prior for the length-scale hyperparameter (Fuglstad et al. 2018)
(Figure: Inverse-Gamma prior)
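A prior predictive check for the length-scale prior can be sketched as follows: draw the length scale from an Inverse-Gamma prior, then draw a function from the GP and ask whether it looks like a plausible light curve. The kernel choice and the Inverse-Gamma shape/scale values are illustrative assumptions, not the talk's actual numbers.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)

# Inverse-Gamma prior on the length scale: puts negligible mass near zero,
# which penalizes implausibly wiggly (overfitting) GP draws.
# a=3, scale=5 are illustrative values only.
ell_prior = stats.invgamma(a=3.0, scale=5.0)

def sq_exp(x, amp, ell):
    dx = x[:, None] - x[None, :]
    return amp**2 * np.exp(-0.5 * (dx / ell) ** 2)

# Prior predictive check: hyperparameters from the prior, then data from the
# model. Inspecting these draws tests the assumptions before seeing any data.
draws = []
for _ in range(4):
    ell = ell_prior.rvs(random_state=rng)
    K = sq_exp(x, 1.0, ell) + 1e-9 * np.eye(len(x))
    draws.append(rng.multivariate_normal(np.zeros(len(x)), K))
draws = np.array(draws)
```

If the draws look nothing like real light curves (far too wiggly, or far too flat), the prior needs revisiting before any fitting happens.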
• To sample the posterior, I use Hamiltonian Monte Carlo (HMC); it's orders of magnitude more efficient than Metropolis-Hastings or affine-invariant samplers (emcee)
• HMC is pretty much the only thing that works in high dimensions (10s to 100s of parameters)
• HMC requires the gradient of the log-likelihood with respect to all model parameters, so automatic differentiation is key
• Don't write your own HMC sampler; use existing libraries such as PyMC3, Stan, or TensorFlow, which will complain when sampling fails, and that happens very often!
(Figure: Bayes' theorem, model parameters)
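Purely to illustrate what HMC does under the hood (the slide's advice stands: use PyMC3 or Stan in practice), here is a toy single-transition sketch with leapfrog integration. All tuning values are arbitrary, and real libraries add the diagnostics and adaptation this omits.

```python
import numpy as np

def hmc_step(q, log_prob, grad_log_prob, step_size=0.1, n_leapfrog=20, rng=None):
    """One HMC transition on position q (illustration only, not production code)."""
    if rng is None:
        rng = np.random.default_rng()
    p = rng.standard_normal(q.shape)                    # resample momentum
    q_new, p_new = q.copy(), p.copy()
    # Leapfrog integration of Hamiltonian dynamics; this is where the
    # gradient of the log-density (via autodiff, in real frameworks) enters.
    p_new += 0.5 * step_size * grad_log_prob(q_new)     # half step, momentum
    for _ in range(n_leapfrog - 1):
        q_new += step_size * p_new                      # full step, position
        p_new += step_size * grad_log_prob(q_new)       # full step, momentum
    q_new += step_size * p_new
    p_new += 0.5 * step_size * grad_log_prob(q_new)     # final half step
    # Metropolis accept/reject on the Hamiltonian (negative log joint).
    h_old = -log_prob(q) + 0.5 * p @ p
    h_new = -log_prob(q_new) + 0.5 * p_new @ p_new
    if np.log(rng.uniform()) < h_old - h_new:
        return q_new, True
    return q, False
```

On a standard-normal target the leapfrog integrator conserves the Hamiltonian almost exactly, so nearly every proposal is accepted even for long trajectories; that is the source of HMC's efficiency relative to random-walk Metropolis.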
• How to deal with outliers?
  - Robust GPs with Student-t noise, mixture models, sigma clipping?
• How to deal with reported error bars?
  - Hierarchical model for rescaling factors?
• How to incorporate other information into the noise model?
  - Simultaneously fitting GPs to other stars in the field; tractable approximations of multi-dimensional GPs?
• GPs with binary-lens events?
  - Sort out the model without GPs first
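On the Student-t idea above, a two-line comparison shows why heavy-tailed noise is robust to outliers: the Gaussian penalizes a discrepant point so harshly that it drags the fit toward it, while the t distribution does not. The residual values, scale, and degrees of freedom below are made up for illustration.

```python
import numpy as np
from scipy import stats

# Hypothetical residuals in units of magnitudes; the last point is an outlier.
resid = np.array([0.1, -0.2, 0.05, 5.0])
sigma = 0.3

gauss_ll = stats.norm.logpdf(resid, scale=sigma)       # Gaussian noise model
robust_ll = stats.t.logpdf(resid, df=3, scale=sigma)   # Student-t noise model

# The Gaussian log-density falls off quadratically, so the outlier dominates
# the total log-likelihood; the t density falls off only polynomially,
# effectively downweighting the outlier instead of chasing it.
```

A mixture model (inlier Gaussian plus broad outlier component) or iterative sigma clipping are alternative routes to the same robustness.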
• If you're doing Bayesian analysis, state your likelihood function and your priors
• Clever parametrizations can speed up MCMC by several orders of magnitude
• GPs provide an elegant framework for handling correlated noise; recent innovations make them computationally tractable
• If you want to use gradient-based optimizers or samplers, look into machine learning frameworks such as TensorFlow and PyTorch

Hack session ideas:
• Microlensing data handling infrastructure, cross-matching catalogs
• Interfacing VBBinaryLensing with the DNest4 diffusive nested sampling code
• Forward modeling light curves with an inverse-ray-shooting algorithm built on TensorFlow
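The "clever parametrizations" bullet can be illustrated with the standard non-centered reparametrization of a hierarchical Gaussian, a common trick for the funnel geometries HMC otherwise struggles with (the specific numbers here are illustrative).

```python
import numpy as np

rng = np.random.default_rng(1)

# Centered parametrization samples theta ~ Normal(mu, sigma) directly; when
# sigma is itself a parameter, theta and sigma become strongly correlated
# ("funnel" geometry) and samplers slow down dramatically. The non-centered
# form samples a standard normal and transforms it deterministically:
#     theta_raw ~ Normal(0, 1),   theta = mu + sigma * theta_raw
mu, sigma = 2.0, 0.5
theta_raw = rng.standard_normal(10_000)
theta = mu + sigma * theta_raw

# Same marginal distribution either way; only the geometry the sampler sees
# changes, which is where the orders-of-magnitude speedups come from.
```

In PyMC3 or Stan this is a one-line change to the model, and it is often the difference between a chain that mixes and one that stalls.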