
Bayesian model comparison (for exoplanets)

Ben Nelson
August 18, 2016


Talk from the 2016 Sagan Summer Workshop and the GNOME Data Analysis Bootcamp at CIERA/NU. Heavily modeled after https://www.youtube.com/watch?v=uaztY3Lbr4A

There were some animations in the original presentations that won't appear here.



Transcript

  1. Bayesian model comparison (wrt exoplanets). Ben Nelson, Data Science Scholar at Northwestern University, @exobenelson. August 18, 2016, GNOME Data Analysis Bootcamp
  2. Outline: 1. “Evidence”, Bayes factors, and decision making. 2. How to efficiently compute “evidence”. 3. Cross-validation as an alternative to BIC/BFs.
  4. Bayes' theorem: p(θ|d) = p(θ) p(d|θ) / p(d), where d: data and θ: parameters; p(θ) is the prior, p(d|θ) the likelihood, and p(θ|d) the posterior.
  5. Conditioning explicitly on a model M: p(θ|d, M) = p(θ|M) p(d|θ, M) / p(d|M), where d: data, θ: parameters, M: model; p(θ|M) is the prior, p(d|θ, M) the likelihood, and p(θ|d, M) the posterior.
  7. In p(θ|d, M) = p(θ|M) p(d|θ, M) / p(d|M), the normalization p(d|M) = ∫ p(θ|M) p(d|θ, M) dθ is the fully marginalized likelihood (FML), or “evidence”. Without something to compare it to, the FML is not very useful...
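As a toy illustration of the evidence integral (this 1-D Gaussian example is mine, not from the talk), brute-force quadrature works when θ is one-dimensional, which is exactly what stops scaling in higher dimensions:

```python
import numpy as np

# Toy 1-D setup (illustrative): prior θ ~ N(0, 1), likelihood d|θ ~ N(θ, 1), observed d = 1
d = 1.0
theta = np.linspace(-10.0, 10.0, 100_001)
dtheta = theta[1] - theta[0]

prior = np.exp(-0.5 * theta**2) / np.sqrt(2 * np.pi)
like = np.exp(-0.5 * (d - theta)**2) / np.sqrt(2 * np.pi)

# p(d|M) = ∫ p(θ|M) p(d|θ, M) dθ, approximated by a Riemann sum on the grid
evidence = np.sum(prior * like) * dtheta

# For this toy case the integral is known in closed form: p(d) = N(d; 0, 2)
exact = np.exp(-0.25 * d**2) / np.sqrt(4 * np.pi)
print(evidence, exact)
```

A grid with m points per dimension costs m^D evaluations in D dimensions, which is why the talk turns to smarter estimators below.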
  10. Let's have two models compete! With M_A: model A and M_B: model B, the Bayes factor is the ratio of evidences: p(d|M_A) / p(d|M_B) = ∫ p(θ_A|M_A) p(d|θ_A, M_A) dθ_A / ∫ p(θ_B|M_B) p(d|θ_B, M_B) dθ_B. Multiplying by the prior on the models gives the posterior odds ratio: [p(d|M_A) / p(d|M_B)] × [p(M_A) / p(M_B)] = p(M_A|d) / p(M_B|d).
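Once the two evidences are in hand, the rest is arithmetic; a minimal sketch (the log-evidence values here are made up for illustration):

```python
import math

# Hypothetical log-evidences log p(d|M) for two competing models
log_Z_A = -120.3  # model A
log_Z_B = -123.9  # model B

# Bayes factor: ratio of evidences, computed in log space for numerical stability
log_BF = log_Z_A - log_Z_B
BF = math.exp(log_BF)

# Posterior odds ratio = Bayes factor × prior odds on the models
prior_odds = 1.0  # p(M_A)/p(M_B): equal prior belief in both models here
posterior_odds = BF * prior_odds

print(BF, posterior_odds)
```

Working in log space matters in practice: real log-evidences are often large in magnitude and exponentiating them directly can overflow.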
  13. “So, what do I do with this?” If the BF/POR is really huge, favor M_A. If it is really small, favor M_B. Otherwise, it's not very decisive. Takeaway #1: model comparison ≠ model selection.
  15. Why model comparison ≠ model selection: model comparison just gives probabilities; model selection is a decision based on other (outside) factors, i.e. a cost function/utility. The most rigorous thing to do is to average over all models rather than select the most probable one: p(θ|d) = Σ_{k=1..K} p(θ|d, M_k) p(M_k|d).
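Model averaging can be sketched by resampling: draw from each model's posterior in proportion to p(M_k|d). The posterior samples and model probabilities below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical posterior samples of a shared parameter θ under two models
samples_A = rng.normal(1.0, 0.1, size=10_000)  # stand-in for p(θ|d, M_A)
samples_B = rng.normal(1.3, 0.2, size=10_000)  # stand-in for p(θ|d, M_B)

# Hypothetical posterior model probabilities p(M_k|d); they must sum to 1
p_A, p_B = 0.7, 0.3

# Model-averaged posterior: pick a model per draw with probability p(M_k|d),
# then draw θ from that model's posterior samples
n = 10_000
pick_A = rng.random(n) < p_A
averaged = np.where(pick_A, rng.choice(samples_A, n), rng.choice(samples_B, n))

print(averaged.mean())  # near the probability-weighted mean 0.7*1.0 + 0.3*1.3
```

The averaged distribution can be multimodal even when each model's posterior is not, which is exactly the uncertainty that picking a single "best" model throws away.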
  16. Outline: 1. “Evidence”, Bayes factors, and decision making. 2. How to efficiently compute “evidence”. 3. Cross-validation as an alternative to BIC/BFs.
  18. Computing the FML in practice: p(d|M) = ∫ p(θ|M) p(d|θ, M) dθ. Takeaway #2: this integral is HARD* (*but there's an entire literature on how to compute it efficiently).
  21. “This is too expensive/difficult. Can't I just compute maximum likelihood estimates?” There's a 3-parameter model that can fit any scatterplot exactly: y = A cos(kx + φ).
  24. Thermodynamic Integration (Theory). 1. Start with parallel-tempering MCMC, i.e. multiple MCMCs with likelihoods taken to different powers β: p_β(θ|d) ∝ p(θ) p(d|θ)^β. 2. The FML at β is p_β(d) = ∫ p(θ) p(d|θ)^β dθ. 3. Ultimately, derive p(d) ≈ exp[ ∫₀¹ dβ ⟨log p(d|θ)⟩_β ], where ⟨log p(d|θ)⟩_β is the “average” log-likelihood at β. (Earl & Deem 2010, Phys Chem Chem Phys)
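A minimal sketch of step 3, assuming the per-temperature averages ⟨log p(d|θ)⟩_β have already been estimated from the tempered chains (the smooth curve below is a made-up stand-in for those averages):

```python
import numpy as np

# Hypothetical ladder of inverse temperatures, 0 = β_0 < ... < β_K = 1
betas = np.linspace(0.0, 1.0, 21)

# Stand-in for <log p(d|θ)>_β, the mean log-likelihood of the chain at each β
# (a real run would average log p(d|θ) over the samples of that chain)
mean_loglike = -100.0 + 20.0 * np.sqrt(betas)

# log p(d) ≈ ∫₀¹ <log p(d|θ)>_β dβ, here via the trapezoid rule on the ladder
widths = np.diff(betas)
log_evidence = float(np.sum(0.5 * (mean_loglike[1:] + mean_loglike[:-1]) * widths))

print(log_evidence)  # the exact integral of this toy curve is -100 + 40/3 ≈ -86.67
```

The ladder spacing matters: ⟨log p(d|θ)⟩_β typically changes fastest near β = 0, so real implementations concentrate temperatures there rather than using a uniform grid.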
  25. Thermodynamic Integration (Practice): p(d) ≈ exp[ ∫₀¹ dβ ⟨log p(d|θ)⟩_β ]. Advantages: 1. a nice side effect of performing PTMCMC; 2. already implemented in emcee*. Caveat: you need a robust estimate of ⟨log p(d|θ)⟩_β at every β. (*dan.iel.fm/emcee/current/user/pt)
  26. Importance Sampling (Theory): draw θ_i ∼ g(θ) from an importance density g, then p̂(d) ≈ (1/N) Σ_{θ_i∼g(θ)} p(θ_i) p(d|θ_i) / g(θ_i).
  28. Importance Sampling (Theory), continued: with the importance density g_τ(θ) restricted to a subspace τ, f_MCMC × p̂(d) ≈ (1/N) Σ_{θ_i∼g_τ(θ)} p(θ_i) p(d|θ_i) / g_τ(θ_i), where f_MCMC is the fraction of MCMC samples that reside in the subspace τ. (PC Guo Thesis 2012; Weinberg+ 2013; Nelson+ 2016)
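A minimal, self-contained sketch of the basic estimator on a toy 1-D problem where the true evidence is known in closed form (the Gaussian prior, likelihood, and importance density are illustrative choices, not the setup of Nelson+ 2016):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 200_000

# Toy 1-D problem: prior θ ~ N(0, 1), likelihood d|θ ~ N(θ, 1), observed d = 1
d = 1.0

def log_prior(t):
    return -0.5 * t**2 - 0.5 * np.log(2 * np.pi)

def log_like(t):
    return -0.5 * (d - t)**2 - 0.5 * np.log(2 * np.pi)

# Importance density g(θ): a broad Gaussian roughly covering the posterior
g_mu, g_sd = 0.5, 1.5
theta = rng.normal(g_mu, g_sd, N)
log_g = -0.5 * ((theta - g_mu) / g_sd)**2 - np.log(g_sd) - 0.5 * np.log(2 * np.pi)

# p̂(d) ≈ (1/N) Σ p(θ_i) p(d|θ_i) / g(θ_i), accumulated in log space for stability
log_w = log_prior(theta) + log_like(theta) - log_g
log_evidence = np.logaddexp.reduce(log_w) - np.log(N)

# Closed form for this toy case: p(d) = N(d; 0, 2)
true_log_evidence = -0.25 * d**2 - 0.5 * np.log(4 * np.pi)
print(log_evidence, true_log_evidence)
```

The choice of g(θ) is the whole game: a g that is too narrow misses posterior mass, while one with lighter tails than the target gives weights with huge (even infinite) variance, which motivates the truncated-subspace variant on the slide above.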
  30. Importance Sampling (Practice): p̂(d) ≈ (1/N) Σ_{θ_i∼g(θ)} p(θ_i) p(d|θ_i) / g(θ_i), or p̂(d) ≈ (1/(N f_MCMC)) Σ_{θ_i∼g_τ(θ)} p(θ_i) p(d|θ_i) / g_τ(θ_i). Advantages: 1. embarrassingly parallel; 2. have a posterior sample? You're already partway there! Caveats: 1. performance depends on the chosen g(θ) or g_τ(θ); 2. needs a robust value of f_MCMC.
  31. Importance Sampling (Gliese 876): Nelson+ 2016, with Seth Pritchard (undergrad at UT San Antonio).
  32. Importance Sampling (Tutorial): Nelson+ 2016, github.com/benelson/FML. Features: generate synthetic RVs of an input planetary system; MCMC with an n-body model; a step-by-step importance sampling tutorial.
  34. More methods. Nested Sampling. Science: determining the evidence for exomoons (Kipping+ 2013), the functional form of the eccentricity distribution (Kipping 2013), testing n planets in RV observations (Brewer & Donovan 2015). Publicly available code: MultiNest (Feroz & Hobson 2008, Feroz+ 2009), DNest3/4 (Brewer+ 2010), transdimensional MCMC (Brewer & Donovan 2015). Geometric Path Monte Carlo. Science: testing n planets in RV observations (Hou, Goodman & Hogg 2014). Savage-Dickey Density Ratio: specializes in comparing nested models with a 1-2 parameter difference. Science: the mass of the Mars-sized Kepler-138b (Jontof-Hutter+ 2016). Takeaway #3: computing the FML is an active field in itself.
  35. Outline: 1. “Evidence”, Bayes factors, and decision making. 2. How to efficiently compute “evidence”. 3. Cross-validation as an alternative to BIC/BFs.
  36. [Chart comparing AIC, BIC, and cross-validation along two axes, computational difficulty vs. assumptions made, with regimes of few vs. many parameters.]
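For reference on the cheap end of that chart, AIC and BIC are point-estimate criteria built from the maximized likelihood (the log-likelihood, parameter count, and sample size below are made-up numbers):

```python
import math

# Hypothetical maximized log-likelihood, number of parameters k, number of data n
max_loglike = -105.2
k, n = 5, 40

AIC = 2 * k - 2 * max_loglike            # penalty grows with k only
BIC = k * math.log(n) - 2 * max_loglike  # penalty also grows with the data size n

print(AIC, BIC)  # lower is better for both criteria
```

Both require only a single optimization per model, which is why they suit the "many rough decisions quickly" regime in the recommendations below, at the cost of strong asymptotic assumptions.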
  40. Cross-validation: let θ^(d) be the parameters that optimize the fit on the data WITHOUT point d. Then: cvl = 1.; for (d in data){ get parameters θ^(d) that optimize on data WITHOUT d; cvl *= p(d|θ^(d), M); }. The model with the largest cross-validation likelihood (cvl) is preferred.
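The pseudocode above can be made concrete. Here is a leave-one-out sketch on an invented toy problem (constant model vs. straight line, Gaussian likelihood with a known noise level; everything here is illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(0.0, 1.0, 20)
y = 2.0 * x + rng.normal(0.0, 0.3, x.size)  # data actually follow a line
sigma = 0.3                                  # assumed known noise level

def log_like(yi, pred):
    return -0.5 * ((yi - pred) / sigma)**2 - np.log(sigma) - 0.5 * np.log(2 * np.pi)

def cvl_log(fit, predict):
    """Log of the cross-validation likelihood: sum over d of log p(d | θ^(d), M)."""
    total = 0.0
    for i in range(x.size):
        keep = np.arange(x.size) != i        # all data WITHOUT point i
        params = fit(x[keep], y[keep])       # optimize on the remaining data
        total += log_like(y[i], predict(params, x[i]))
    return total

# Model A: constant mean; Model B: straight line (both fit by least squares)
fit_const = lambda xs, ys: (ys.mean(),)
pred_const = lambda p, xi: p[0]
fit_line = lambda xs, ys: tuple(np.polyfit(xs, ys, 1))
pred_line = lambda p, xi: p[0] * xi + p[1]

cvl_A = cvl_log(fit_const, pred_const)
cvl_B = cvl_log(fit_line, pred_line)
print(cvl_A, cvl_B)  # the model with the larger cvl is preferred
```

Because each point is predicted from a fit that never saw it, an overfitting model (recall the y = A cos(kx + φ) trap earlier) gets no credit for interpolating its own training data.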
  41. Takeaway #4: General Recommendations. 1. Do you need to make many rough decisions quickly (i.e. in milliseconds)? Use AIC/BIC. 2. Do you have decent computational resources and really understand your priors/utility? Use a Bayes factor/posterior odds ratio. 3. Is your problem somewhere in between? Use cross-validation.
  42. Conclusions. Model comparison ≠ model selection: how to decide depends on your utility. For models with 3+ parameters, computing the FML is hard, but it's an active problem in exoplanet research. For a tutorial on using importance sampling to compute FMLs: github.com/benelson/FML.