
Bayesian model comparison (for exoplanets)

Ben Nelson
August 18, 2016


Talk from the 2016 Sagan Summer Workshop and the GNOME Data Analysis Bootcamp at CIERA/NU. Heavily modeled after https://www.youtube.com/watch?v=uaztY3Lbr4A

There were some animations in the original presentations that won't appear here.



Transcript

  1. Bayesian model comparison (wrt exoplanets). Ben Nelson, Data Science Scholar at Northwestern University, @exobenelson. August 18, 2016, GNOME Data Analysis Bootcamp
  2. Outline: 1. “Evidence”, Bayes factors, and decision making. 2. How to efficiently compute “evidence”. 3. Cross-validation as an alternative to BIC/BFs.
  4. Bayes' theorem: p(θ|d) = p(θ) p(d|θ) / p(d), where d: data and θ: parameters; p(θ) is the prior, p(d|θ) the likelihood, and p(θ|d) the posterior.
  5. Conditioning explicitly on a model M: p(θ|d, M) = p(θ|M) p(d|θ, M) / p(d|M), where d: data, θ: parameters, M: model; p(θ|M) is the prior, p(d|θ, M) the likelihood, and p(θ|d, M) the posterior.
  7. In p(θ|d, M) = p(θ|M) p(d|θ, M) / p(d|M), the normalization p(d|M) = ∫ p(θ|M) p(d|θ, M) dθ is the fully marginalized likelihood (FML), or “evidence”. Without something to compare it to, the FML is not very useful...
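As a toy illustration of the evidence integral (this 1-D Gaussian example is mine, not from the talk), brute-force quadrature works when θ is one-dimensional, which is exactly what stops scaling in higher dimensions:

```python
import numpy as np

# Toy 1-D setup (illustrative): prior θ ~ N(0, 1), likelihood d|θ ~ N(θ, 1), observed d = 1
d = 1.0
theta = np.linspace(-10.0, 10.0, 100_001)
dtheta = theta[1] - theta[0]

prior = np.exp(-0.5 * theta**2) / np.sqrt(2 * np.pi)
like = np.exp(-0.5 * (d - theta)**2) / np.sqrt(2 * np.pi)

# p(d|M) = ∫ p(θ|M) p(d|θ, M) dθ, approximated by a Riemann sum on the grid
evidence = np.sum(prior * like) * dtheta

# For this toy case the integral is known in closed form: p(d) = N(d; 0, 2)
exact = np.exp(-0.25 * d**2) / np.sqrt(4 * np.pi)
print(evidence, exact)
```

A grid with m points per dimension costs m^D evaluations in D dimensions, which is why the talk turns to smarter estimators below.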
  10. Let's have two models compete! With M_A: model A and M_B: model B, the Bayes factor is the ratio of evidences: p(d|M_A) / p(d|M_B) = ∫ p(θ_A|M_A) p(d|θ_A, M_A) dθ_A / ∫ p(θ_B|M_B) p(d|θ_B, M_B) dθ_B. Multiplying by the prior on the models gives the posterior odds ratio: [p(d|M_A) / p(d|M_B)] × [p(M_A) / p(M_B)] = p(M_A|d) / p(M_B|d).
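Once the two evidences are in hand, the rest is arithmetic; a minimal sketch (the log-evidence values here are made up for illustration):

```python
import math

# Hypothetical log-evidences log p(d|M) for two competing models
log_Z_A = -120.3  # model A
log_Z_B = -123.9  # model B

# Bayes factor: ratio of evidences, computed in log space for numerical stability
log_BF = log_Z_A - log_Z_B
BF = math.exp(log_BF)

# Posterior odds ratio = Bayes factor × prior odds on the models
prior_odds = 1.0  # p(M_A)/p(M_B): equal prior belief in both models here
posterior_odds = BF * prior_odds

print(BF, posterior_odds)
```

Working in log space matters in practice: real log-evidences are often large in magnitude and exponentiating them directly can overflow.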
  13. “So, what do I do with this?” If the BF/POR is really huge, favor M_A. If it is really small, favor M_B. Otherwise, it's not very decisive. Takeaway #1: model comparison ≠ model selection.
  15. Why model comparison ≠ model selection: model comparison just gives probabilities; model selection is a decision based on other (outside) factors, i.e. a cost function/utility. The most rigorous thing to do is to average over all models rather than select the most probable one: p(θ|d) = Σ_{k=1..K} p(θ|d, M_k) p(M_k|d).
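Model averaging can be sketched by resampling: draw from each model's posterior in proportion to p(M_k|d). The posterior samples and model probabilities below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical posterior samples of a shared parameter θ under two models
samples_A = rng.normal(1.0, 0.1, size=10_000)  # stand-in for p(θ|d, M_A)
samples_B = rng.normal(1.3, 0.2, size=10_000)  # stand-in for p(θ|d, M_B)

# Hypothetical posterior model probabilities p(M_k|d); they must sum to 1
p_A, p_B = 0.7, 0.3

# Model-averaged posterior: pick a model per draw with probability p(M_k|d),
# then draw θ from that model's posterior samples
n = 10_000
pick_A = rng.random(n) < p_A
averaged = np.where(pick_A, rng.choice(samples_A, n), rng.choice(samples_B, n))

print(averaged.mean())  # near the probability-weighted mean 0.7*1.0 + 0.3*1.3
```

The averaged distribution can be multimodal even when each model's posterior is not, which is exactly the uncertainty that picking a single "best" model throws away.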
  16. Outline: 1. “Evidence”, Bayes factors, and decision making. 2. How to efficiently compute “evidence”. 3. Cross-validation as an alternative to BIC/BFs.
  18. Computing the FML in practice: p(d|M) = ∫ p(θ|M) p(d|θ, M) dθ. Takeaway #2: this integral is HARD* (*but there's an entire literature on how to compute it efficiently).
  21. “This is too expensive/difficult. Can't I just compute maximum likelihood estimates?” There's a 3-parameter model that can fit any scatterplot exactly: y = A cos(kx + φ).
  24. Thermodynamic Integration (Theory). 1. Start with parallel-tempering MCMC, i.e. multiple MCMCs with likelihoods taken to different powers β: p_β(θ|d) ∝ p(θ) p(d|θ)^β. 2. The FML at β is p_β(d) = ∫ p(θ) p(d|θ)^β dθ. 3. Ultimately, derive p(d) ≈ exp[ ∫₀¹ dβ ⟨log p(d|θ)⟩_β ], where ⟨log p(d|θ)⟩_β is the “average” log-likelihood at β. (Earl & Deem 2010, Phys Chem Chem Phys)
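A minimal sketch of step 3, assuming the per-temperature averages ⟨log p(d|θ)⟩_β have already been estimated from the tempered chains (the smooth curve below is a made-up stand-in for those averages):

```python
import numpy as np

# Hypothetical ladder of inverse temperatures, 0 = β_0 < ... < β_K = 1
betas = np.linspace(0.0, 1.0, 21)

# Stand-in for <log p(d|θ)>_β, the mean log-likelihood of the chain at each β
# (a real run would average log p(d|θ) over the samples of that chain)
mean_loglike = -100.0 + 20.0 * np.sqrt(betas)

# log p(d) ≈ ∫₀¹ <log p(d|θ)>_β dβ, here via the trapezoid rule on the ladder
widths = np.diff(betas)
log_evidence = float(np.sum(0.5 * (mean_loglike[1:] + mean_loglike[:-1]) * widths))

print(log_evidence)  # the exact integral of this toy curve is -100 + 40/3 ≈ -86.67
```

The ladder spacing matters: ⟨log p(d|θ)⟩_β typically changes fastest near β = 0, so real implementations concentrate temperatures there rather than using a uniform grid.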
  25. Thermodynamic Integration (Practice): p(d) ≈ exp[ ∫₀¹ dβ ⟨log p(d|θ)⟩_β ]. Advantages: 1. a nice side effect of performing PTMCMC; 2. already implemented in emcee*. Caveat: you need a robust estimate of ⟨log p(d|θ)⟩_β at every β. (*dan.iel.fm/emcee/current/user/pt)
  26. Importance Sampling (Theory): draw θ_i ∼ g(θ) from an importance density g, then p̂(d) ≈ (1/N) Σ_{θ_i∼g(θ)} p(θ_i) p(d|θ_i) / g(θ_i).
  28. Importance Sampling (Theory), continued: with the importance density g_τ(θ) restricted to a subspace τ, f_MCMC × p̂(d) ≈ (1/N) Σ_{θ_i∼g_τ(θ)} p(θ_i) p(d|θ_i) / g_τ(θ_i), where f_MCMC is the fraction of MCMC samples that reside in the subspace τ. (PC Guo Thesis 2012; Weinberg+ 2013; Nelson+ 2016)
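A minimal, self-contained sketch of the basic estimator on a toy 1-D problem where the true evidence is known in closed form (the Gaussian prior, likelihood, and importance density are illustrative choices, not the setup of Nelson+ 2016):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 200_000

# Toy 1-D problem: prior θ ~ N(0, 1), likelihood d|θ ~ N(θ, 1), observed d = 1
d = 1.0

def log_prior(t):
    return -0.5 * t**2 - 0.5 * np.log(2 * np.pi)

def log_like(t):
    return -0.5 * (d - t)**2 - 0.5 * np.log(2 * np.pi)

# Importance density g(θ): a broad Gaussian roughly covering the posterior
g_mu, g_sd = 0.5, 1.5
theta = rng.normal(g_mu, g_sd, N)
log_g = -0.5 * ((theta - g_mu) / g_sd)**2 - np.log(g_sd) - 0.5 * np.log(2 * np.pi)

# p̂(d) ≈ (1/N) Σ p(θ_i) p(d|θ_i) / g(θ_i), accumulated in log space for stability
log_w = log_prior(theta) + log_like(theta) - log_g
log_evidence = np.logaddexp.reduce(log_w) - np.log(N)

# Closed form for this toy case: p(d) = N(d; 0, 2)
true_log_evidence = -0.25 * d**2 - 0.5 * np.log(4 * np.pi)
print(log_evidence, true_log_evidence)
```

The choice of g(θ) is the whole game: a g that is too narrow misses posterior mass, while one with lighter tails than the target gives weights with huge (even infinite) variance, which motivates the truncated-subspace variant on the slide above.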
  30. Importance Sampling (Practice): p̂(d) ≈ (1/N) Σ_{θ_i∼g(θ)} p(θ_i) p(d|θ_i) / g(θ_i), or p̂(d) ≈ (1/(N f_MCMC)) Σ_{θ_i∼g_τ(θ)} p(θ_i) p(d|θ_i) / g_τ(θ_i). Advantages: 1. embarrassingly parallel; 2. have a posterior sample? You're already partway there! Caveats: 1. performance depends on the chosen g(θ) or g_τ(θ); 2. needs a robust value of f_MCMC.
  31. Importance Sampling (Gliese 876): Nelson+ 2016, with Seth Pritchard (undergrad at UT San Antonio).
  32. Importance Sampling (Tutorial): Nelson+ 2016, github.com/benelson/FML. Features: generate synthetic RVs of an input planetary system; MCMC with an n-body model; a step-by-step importance sampling tutorial.
  34. More methods. Nested Sampling. Science: determining the evidence for exomoons (Kipping+ 2013), the functional form of the eccentricity distribution (Kipping 2013), testing n planets in RV observations (Brewer & Donovan 2015). Publicly available code: MultiNest (Feroz & Hobson 2008, Feroz+ 2009), DNest3/4 (Brewer+ 2010), transdimensional MCMC (Brewer & Donovan 2015). Geometric Path Monte Carlo. Science: testing n planets in RV observations (Hou, Goodman & Hogg 2014). Savage-Dickey Density Ratio: specializes in comparing nested models with a 1-2 parameter difference. Science: the mass of the Mars-sized Kepler-138b (Jontof-Hutter+ 2016). Takeaway #3: computing the FML is an active field in itself.
  35. Outline: 1. “Evidence”, Bayes factors, and decision making. 2. How to efficiently compute “evidence”. 3. Cross-validation as an alternative to BIC/BFs.
  36. [Chart comparing AIC, BIC, and cross-validation along two axes, computational difficulty vs. assumptions made, with regimes of few vs. many parameters.]
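For reference on the cheap end of that chart, AIC and BIC are point-estimate criteria built from the maximized likelihood (the log-likelihood, parameter count, and sample size below are made-up numbers):

```python
import math

# Hypothetical maximized log-likelihood, number of parameters k, number of data n
max_loglike = -105.2
k, n = 5, 40

AIC = 2 * k - 2 * max_loglike            # penalty grows with k only
BIC = k * math.log(n) - 2 * max_loglike  # penalty also grows with the data size n

print(AIC, BIC)  # lower is better for both criteria
```

Both require only a single optimization per model, which is why they suit the "many rough decisions quickly" regime in the recommendations below, at the cost of strong asymptotic assumptions.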
  40. Cross-validation: let θ^(d) be the parameters that optimize the fit on the data WITHOUT point d. Then: cvl = 1.; for (d in data){ get parameters θ^(d) that optimize on data WITHOUT d; cvl *= p(d|θ^(d), M); }. The model with the largest cross-validation likelihood (cvl) is preferred.
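The pseudocode above can be made concrete. Here is a leave-one-out sketch on an invented toy problem (constant model vs. straight line, Gaussian likelihood with a known noise level; everything here is illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(0.0, 1.0, 20)
y = 2.0 * x + rng.normal(0.0, 0.3, x.size)  # data actually follow a line
sigma = 0.3                                  # assumed known noise level

def log_like(yi, pred):
    return -0.5 * ((yi - pred) / sigma)**2 - np.log(sigma) - 0.5 * np.log(2 * np.pi)

def cvl_log(fit, predict):
    """Log of the cross-validation likelihood: sum over d of log p(d | θ^(d), M)."""
    total = 0.0
    for i in range(x.size):
        keep = np.arange(x.size) != i        # all data WITHOUT point i
        params = fit(x[keep], y[keep])       # optimize on the remaining data
        total += log_like(y[i], predict(params, x[i]))
    return total

# Model A: constant mean; Model B: straight line (both fit by least squares)
fit_const = lambda xs, ys: (ys.mean(),)
pred_const = lambda p, xi: p[0]
fit_line = lambda xs, ys: tuple(np.polyfit(xs, ys, 1))
pred_line = lambda p, xi: p[0] * xi + p[1]

cvl_A = cvl_log(fit_const, pred_const)
cvl_B = cvl_log(fit_line, pred_line)
print(cvl_A, cvl_B)  # the model with the larger cvl is preferred
```

Because each point is predicted from a fit that never saw it, an overfitting model (recall the y = A cos(kx + φ) trap earlier) gets no credit for interpolating its own training data.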
  41. Takeaway #4: General Recommendations. 1. Do you need to make many rough decisions quickly (i.e. in milliseconds)? Use AIC/BIC. 2. Do you have decent computational resources and really understand your priors/utility? Use a Bayes factor/posterior odds ratio. 3. Is your problem somewhere in between? Use cross-validation.
  42. Conclusions. Model comparison ≠ model selection: how to decide depends on your utility. For models with 3+ parameters, computing the FML is hard, but it's an active problem in exoplanet research. For a tutorial on using importance sampling to compute FMLs: github.com/benelson/FML.