Data analysis with MCMC

Dan Foreman-Mackey

April 02, 2013

Transcript

  1. data analysis with Markov chain Monte Carlo Dan Foreman-Mackey CCPP@NYU

  2.    dan.iel.fm dfm @exoplaneteer Dan Foreman-Mackey

  3. I'm a grad student in Camp Hogg at NYU. David W. Hogg

  4. I'm an engineer in Camp Hogg at NYU. David W. Hogg

  5. David W. Hogg: I build tools for data analysis in astronomy (mostly)

  6. introducing emcee: the MCMC Hammer. arxiv.org/abs/1202.3665 dan.iel.fm/emcee it's hammer time!

  7. I work on: the Local Group, exoplanets, variable stars, image modeling, calibration

  8. I work on: the Local Group, exoplanets, variable stars, image modeling, calibration. not writing papers

  9. I work on: the Local Group, exoplanets, variable stars, image modeling, calibration. not writing papers
  10. Physics, Data: the graphical model of my research. a sketch of

  11. Physics, Data: the graphical model of my research. a sketch of the generative view of data analysis

  12. for example, a line. synthetic data: ŷ_n = m x_n + b + ε_n; noise model: ε_n ~ N(0, σ_n²); likelihood: p({y_n, x_n, σ_n²} | m, b) ∝ exp(−χ²/2), where χ² = Σ_{n=1}^N [y_n − (m x_n + b)]² / σ_n²
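
The χ² sum above is easy to play with numerically. Here is a minimal sketch (the slope, intercept, and error bars below are invented for illustration) that generates synthetic data from the line model and evaluates χ²:

```python
import numpy as np

# A sketch of the generative view: draw data from the line model, then
# evaluate the chi^2 that appears in the likelihood exp(-chi^2 / 2).
rng = np.random.default_rng(42)
m_true, b_true = 1.3, 4.5  # invented "true" parameters

# Synthetic data: y_n = m x_n + b + eps_n with eps_n ~ N(0, sigma_n^2).
N = 50
x = np.sort(10 * rng.random(N))
yerr = 0.1 + 0.5 * rng.random(N)
y = m_true * x + b_true + yerr * rng.standard_normal(N)

def chi2(m, b):
    """chi^2 = sum_n [y_n - (m x_n + b)]^2 / sigma_n^2."""
    return np.sum(((y - (m * x + b)) / yerr) ** 2)

# The true parameters should give a much smaller chi^2 (~N) than a bad guess.
print(chi2(m_true, b_true), chi2(0.0, 0.0))
```
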
  13. Physics, Data: the graphical model of my research. a sketch of inference

  14. likelihood function / generative model: p(data | physics). posterior probability: p(physics | data) ∝ p(physics) p(data | physics)

  15. good/bad news: physics is everything, including things we don't care about!
  16. a more realistic model: p(D | θ, α), where D is the data, θ is the cool physics, and α is the garbage

  17. what if we underestimated our error bars (by some additive variance)? p({y_n, x_n, σ_n²} | m, b, s²) = Π_{n=1}^N [2π(σ_n² + s²)]^(−1/2) exp(−[y_n − (m x_n + b)]² / [2(σ_n² + s²)]). m and b are the "physics"; the "jitter" s² is not physics

  18. marginalize: p(D | θ) ∝ ∫ p(α) p(D | θ, α) dα. Do The Right Thing™

  19. in our example: p({y_n, x_n, σ_n} | m, b) = ∫ ds² p(s²) Π_{n=1}^N [2π(σ_n² + s²)]^(−1/2) exp(−[y_n − (m x_n + b)]² / [2(σ_n² + s²)]). BOOM!
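
To see the marginalization in action, here is a toy numerical sketch: invented straight-line data whose true scatter exceeds the quoted error bars, a uniform prior p(s²) = 1 on [0, 1], and the jitter variance integrated out on a grid:

```python
import numpy as np

# Toy sketch: marginalize the jitter variance s^2 out of the likelihood
# numerically. All data values here are invented for illustration.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 100)
yerr = 0.3 * np.ones_like(x)
# True scatter includes an extra variance of 0.25 beyond the quoted errors.
y = 1.0 * x + 2.0 + np.sqrt(yerr**2 + 0.25) * rng.standard_normal(x.size)

def lnlike(m, b, s2):
    """ln of prod_n N(y_n; m x_n + b, sigma_n^2 + s^2)."""
    var = yerr**2 + s2
    return -0.5 * np.sum((y - (m * x + b)) ** 2 / var + np.log(2 * np.pi * var))

# p(D | m, b) = int_0^1 p(s^2) p(D | m, b, s^2) ds^2, via the trapezoid rule.
s2_grid = np.linspace(0.0, 1.0, 201)
vals = np.exp([lnlike(1.0, 2.0, s2) for s2 in s2_grid])
marginal = np.sum(0.5 * (vals[1:] + vals[:-1]) * np.diff(s2_grid))
print(marginal)
```

Because the quoted error bars underestimate the true scatter, the likelihood at the true jitter (s² = 0.25) beats the no-jitter model by a wide margin, which is why ignoring s² would bias the fit.
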
  20. What do you really want?

  21. What do you really want? E_p[f(θ)] = (1/Z) ∫ f(θ) p(θ) p(D | θ) dθ

  22. probably. What do you really want? E_p[f(θ)] = (1/Z) ∫ f(θ) p(θ) p(D | θ) dθ

  23. expectation: E_p[f(θ)] = (1/Z) ∫ f(θ) p(θ) p(D | θ) dθ. marginalization: p(D | θ) ∝ ∫ p(α) p(D | θ, α) dα

  24. expectation: E_p[f(θ)] = (1/Z) ∫ f(θ) p(θ) p(D | θ) dθ. marginalization: p(D | θ) ∝ ∫ p(α) p(D | θ, α) dα. large number of dimensions

  25. expectation: E_p[f(θ)] = (1/Z) ∫ f(θ) p(θ) p(D | θ) dθ. marginalization: p(D | θ) ∝ ∫ p(α) p(D | θ, α) dα. large number of dimensions. whoa!

  26. expectation: E_p[f(θ)] = (1/Z) ∫ f(θ) p(θ) p(D | θ) dθ. marginalization: p(D | θ) ∝ ∫ p(α) p(D | θ, α) dα. large number of dimensions. whoa! This is HARD (in general)
  27. OK now that we agree…

  28. ∫ f(x) p(x) dx ≈ (1/N) Σ_{n=1}^N f(x_n), where x_n ~ p(x). error ∝ 1/√N′, where N′ is the number of independent samples (as you learned in middle school)

  29. ∫ f(x) p(x) dx ≈ (1/N) Σ_{n=1}^N f(x_n), where x_n ~ p(x). error ∝ 1/√N′, where N′ is the number of independent samples (as you learned in middle school)
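
This estimator is easy to check empirically. A small sketch with a standard normal p(x) and f(x) = x², whose exact expectation is 1:

```python
import numpy as np

# Monte Carlo estimate of int f(x) p(x) dx ~ (1/N) sum_n f(x_n), x_n ~ p(x).
# Here p is a standard normal and f(x) = x^2, so the exact answer is 1.
rng = np.random.default_rng(1)

def mc_estimate(n_samples):
    x = rng.standard_normal(n_samples)
    return np.mean(x ** 2)

# The error shrinks like 1/sqrt(N): 100x more samples, ~10x smaller error.
for n in (100, 10_000, 1_000_000):
    print(n, abs(mc_estimate(n) - 1.0))
```
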
  30. MCMC draws samples from a probability function, and all you need to be able to do is evaluate the function (up to a constant)
  31. Metropolis–Hastings

  32. Metropolis–Hastings in an ideal world

  33. Metropolis–Hastings in an ideal world: start here, perhaps

  34. Metropolis–Hastings in an ideal world: propose a new position x′ ~ q(x′; x)

  35. Metropolis–Hastings in an ideal world: propose a new position x′ ~ q(x′; x). accept? p(accept) = min(1, [p(x′) q(x; x′)] / [p(x) q(x′; x)])

  36. Metropolis–Hastings in an ideal world: propose a new position x′ ~ q(x′; x). accept? p(accept) = min(1, [p(x′) q(x; x′)] / [p(x) q(x′; x)]). definitely.

  37. Metropolis–Hastings in an ideal world: propose a new position x′ ~ q(x′; x). accept? p(accept) = min(1, [p(x′) q(x; x′)] / [p(x) q(x′; x)]). only relative probabilities

  38. Metropolis–Hastings in an ideal world: x′ ~ q(x′; x)

  39. Metropolis–Hastings in an ideal world: x′ ~ q(x′; x)

  40. Metropolis–Hastings in an ideal world: x′ ~ q(x′; x). accept? p(accept) = min(1, [p(x′) q(x; x′)] / [p(x) q(x′; x)])

  41. Metropolis–Hastings in an ideal world: x′ ~ q(x′; x). accept? p(accept) = min(1, [p(x′) q(x; x′)] / [p(x) q(x′; x)]). not this time.
  42. Metropolis–Hastings in an ideal world

  43. double count! Metropolis–Hastings in an ideal world

  44. Metropolis–Hastings in an ideal world

  45. Metropolis–Hastings in an ideal world

  46. Metropolis–Hastings in the real world

  47. Metropolis–Hastings in the real world

  48. Metropolis–Hastings in the real world

  49. Metropolis–Hastings in the real world

  50. Metropolis–Hastings in the real world

  51. Metropolis–Hastings in the real world the Small Acceptance Fraction problem

  52. Metropolis–Hastings in the real world

  53. Metropolis–Hastings in the real world

  54. Metropolis–Hastings in the real world the Huge Acceptance Fraction problem

  55. a brief aside.

  56. YOUR LIKELIHOOD FUNCTION IS EXPENSIVE

  57. Metropolis–Hastings in the real world

  58. YOUR LIKELIHOOD FUNCTION IS EXPENSIVE REMEMBER?

  59. Metropolis–Hastings in the real world

  60. Metropolis–Hastings in the real world: that's D(D+1)/2 tuning parameters

  61. Metropolis–Hastings in the real world: that's D(D+1)/2 tuning parameters, where D is the dimension of your problem

  62. Metropolis–Hastings in the real world: that's D(D+1)/2 tuning parameters, where D is the dimension of your problem. (plot: Scientific Awesomeness vs. how hard is MCMC (~number of parameters), comparing Metropolis–Hastings with how things Should be)
  63. that being said, go code up your own Metropolis–Hastings code right now. (well not right right now)
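
If you do code it up, the whole algorithm fits in a few lines. A bare-bones sketch for a one-dimensional standard normal target, using a symmetric Gaussian proposal so the q terms cancel and the acceptance probability reduces to min(1, p(x′)/p(x)):

```python
import numpy as np

# Bare-bones Metropolis-Hastings for a 1-D target, as a sketch. Only
# relative probabilities are needed, so lnprob can drop constants.
rng = np.random.default_rng(2)

def lnprob(x):
    # Target: standard normal, up to a constant.
    return -0.5 * x ** 2

def metropolis_hastings(lnprob, x0, n_steps, step=1.0):
    chain = np.empty(n_steps)
    x, lp = x0, lnprob(x0)
    n_accept = 0
    for i in range(n_steps):
        x_new = x + step * rng.standard_normal()  # propose x' ~ q(x'; x)
        lp_new = lnprob(x_new)
        if np.log(rng.random()) < lp_new - lp:    # accept w.p. min(1, p(x')/p(x))
            x, lp = x_new, lp_new
            n_accept += 1
        chain[i] = x                              # on rejection, count x again
    return chain, n_accept / n_steps

chain, acc = metropolis_hastings(lnprob, 0.0, 50_000)
print(np.mean(chain), np.std(chain), acc)  # mean ~ 0, std ~ 1
```

Note the "double count" from the slides: a rejected proposal means the current position is recorded again, which is required for the chain to have the right stationary distribution.
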
  64. Jonathan Goodman & Jonathan Weare, "Ensemble samplers with affine invariance" (dfm.io/mcmc-gw10)

  65. Ensemble Samplers in the real world

  66. Ensemble Samplers in the real world

  67. Ensemble samplers with affine invariance

  68. Affine Transformation: y = A x + b. hard, easy

  69. Affine Transformation: y = A x + b. hard and easy are THE SAME
  70. Ensemble Samplers in the real world

  71. Ensemble Samplers in the real world

  72. choose a helper Ensemble Samplers in the real world

  73. choose a helper Ensemble Samplers in the real world

  74. Ensemble Samplers in the real world: choose a helper, propose a new position

  75. Ensemble Samplers in the real world: choose a helper, propose a new position. accept? p(accept) = min(1, z^(D−1) p(x′) / p(x))
  76. Ensemble Samplers in the real world

  77. Ensemble Samplers in the real world

  78. choose a helper Ensemble Samplers in the real world

  79. Ensemble Samplers in the real world

  80. go code up your own Ensemble Sampler code right now?
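
For reference, here is a from-scratch sketch of the Goodman & Weare stretch move that emcee implements (the serial, one-walker-at-a-time variant), for a 2-D Gaussian target; the target and all numbers are invented for illustration:

```python
import numpy as np

# Sketch of the stretch move: walker x_k picks a helper x_j from the rest
# of the ensemble and proposes x' = x_j + z (x_k - x_j), with z drawn from
# g(z) ~ 1/sqrt(z) on [1/a, a]; accept w.p. min(1, z^(D-1) p(x') / p(x_k)).
rng = np.random.default_rng(3)

def lnprob(x):
    # Target: D-dimensional standard normal, up to a constant.
    return -0.5 * np.sum(x ** 2)

def stretch_move_step(walkers, lnprob, a=2.0):
    n_walkers, ndim = walkers.shape
    for k in range(n_walkers):
        j = rng.choice([i for i in range(n_walkers) if i != k])  # helper
        z = ((a - 1.0) * rng.random() + 1.0) ** 2 / a            # z ~ g(z)
        proposal = walkers[j] + z * (walkers[k] - walkers[j])
        ln_accept = (ndim - 1) * np.log(z) + lnprob(proposal) - lnprob(walkers[k])
        if np.log(rng.random()) < ln_accept:
            walkers[k] = proposal
    return walkers

ndim, n_walkers = 2, 20
walkers = 0.1 * rng.standard_normal((n_walkers, ndim))
samples = []
for step in range(2000):
    walkers = stretch_move_step(walkers, lnprob)
    if step > 500:                    # discard burn-in
        samples.append(walkers.copy())
samples = np.concatenate(samples)
print(samples.mean(axis=0), samples.std(axis=0))  # ~ [0, 0], ~ [1, 1]
```

The affine invariance comes from the proposal being built only from differences of walker positions, so there are no per-dimension step sizes to tune, just the single parameter a.
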

  81. introducing emcee: the MCMC Hammer. arxiv.org/abs/1202.3665 dan.iel.fm/emcee it's hammer time!

  82. introducing emcee: the MCMC Hammer. arxiv.org/abs/1202.3665 dan.iel.fm/emcee to install: pip install emcee. it's hammer time!

  83. using emcee is easy, let me show you... it's hammer time!
  84. p({y_n, x_n, σ_n²} | m, b, s²) = Π_{n=1}^N [2π(σ_n² + s²)]^(−1/2) exp(−[y_n − (m x_n + b)]² / [2(σ_n² + s²)])

  85. import numpy as np
      import emcee

      def lnprobfn(p, x, y, yerr):
          m, b, d2 = p
          if not 0 <= d2 <= 1:
              return -np.inf
          ivar = 1.0 / (yerr ** 2 + d2)
          chi2 = np.sum((y - m * x - b) ** 2 * ivar)
          return -0.5 * (chi2 - np.sum(np.log(ivar)))

      # Load data.
      # x, y, yerr = ...

      # Set up sampler and initialize the walkers.
      nwalkers, ndim = 100, 3
      sampler = emcee.EnsembleSampler(nwalkers, ndim, lnprobfn,
                                      args=(x, y, yerr))
      p0 = [[np.random.rand(), 10 * np.random.rand(), np.random.rand()]
            for k in range(nwalkers)]

      # Go.
      sampler.run_mcmc(p0, 1000)
      m, b, d2 = sampler.flatchain.T

  86. the model being evaluated: ln p({x_n, y_n, σ_n} | m, b, s²) = −χ²/2 − (1/2) Σ_{n=1}^N ln(σ_n² + s²), with χ² = Σ_{n=1}^N [y_n − (m x_n + b)]² / (σ_n² + s²), and prior p(s²) = 1 if 0 ≤ s² ≤ 1, 0 otherwise.

      import numpy as np

      def lnprobfn(p, x, y, yerr):
          m, b, d2 = p
          if not 0 <= d2 <= 1:
              return -np.inf
          ivar = 1.0 / (yerr ** 2 + d2)
          chi2 = np.sum((y - m * x - b) ** 2 * ivar)
          return -0.5 * (chi2 - np.sum(np.log(ivar)))

  87. import emcee

      # initialize the sampler
      nwalkers, ndim = 100, 3
      sampler = emcee.EnsembleSampler(nwalkers, ndim, lnprobfn,
                                      args=(x, y, yerr))

      # initialize each walker: m_k ~ U(0, 1), b_k ~ U(0, 10), s²_k ~ U(0, 1)
      p0 = [[np.random.rand(), 10 * np.random.rand(), np.random.rand()]
            for k in range(nwalkers)]

      # run MCMC for 1000 steps
      sampler.run_mcmc(p0, 1000)

      # get the samples
      m, b, d2 = sampler.flatchain.T
  88. burn-in

  89. (figure)
  90. (figure)
  91. convergence

  92. acceptance fraction?

  93. acceptance fraction? whoa!

  94. acceptance fraction? whoa! that's a lot of significant digits...

  95. dfm/acor

  96. displaying results

  97. (figure)
  98. (figure)
  99. dfm/triangle.py

  100. sharing results you need a number for your abstract?!?

  101. 1. sort the samples 2. compute moments/quantiles (0.16, 0.5, 0.84)

  102. 1. sort the samples 2. compute moments/quantiles (0.16, 0.5, 0.84). X = X̄ (+σ₊, −σ₋). what does it MEAN?

  103. 1. sort the samples 2. compute moments/quantiles (0.16, 0.5, 0.84). X = X̄ (+σ₊, −σ₋). what does it MEAN? 3. machine-readable samplings (including prior values)

  104. 1. sort the samples 2. compute moments/quantiles (0.16, 0.5, 0.84). X = X̄ (+σ₊, −σ₋). what does it MEAN? 3. machine-readable samplings (including prior values). my pipe dream
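
Steps 1 and 2 amount to a couple of lines of NumPy. A sketch with a stand-in Gaussian "chain" whose true center and spread are invented for illustration:

```python
import numpy as np

# Turn a chain of samples into "a number for your abstract": the median
# with the 16th/84th percentiles as a central 68% credible interval.
rng = np.random.default_rng(4)
samples = rng.normal(1.3, 0.2, size=100_000)  # stand-in for an MCMC chain

q16, q50, q84 = np.percentile(samples, [16, 50, 84])
print(f"X = {q50:.2f} +{q84 - q50:.2f} -{q50 - q16:.2f}")
```
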
  105. Brendon Brewer's words of wisdom...

  106. Remember: emcee isn't always The Right Choice™. Brendon Brewer's words of wisdom...
  107. what about multimodal densities? (plot: p(θ | D) vs. θ)

  108. what about multimodal densities? (plot: p(θ | D) vs. θ)

  109. what about multimodal densities? (plot: p(θ | D) vs. θ)

  110. what about multimodal densities? (plot: p(θ | D) vs. θ, with p(accept) marked)

  111. what about multimodal densities? (plot: p(θ | D) vs. θ, with p(accept) marked)

  112. what if we want to compare models? p(θ | D, M) = [1/Z(M)] p(θ | M) p(D | θ, M), where Z = p(D | M) = ∫ p(D, θ | M) dθ

  113. what if we want to compare models? p(θ | D, M) = [1/Z(M)] p(θ | M) p(D | θ, M), where Z = p(D | M) = ∫ p(D, θ | M) dθ
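
In one dimension the evidence integral can be brute-forced on a grid, which makes the definition concrete (the Gaussian toy data and the uniform prior on [−1, 1] below are invented); in many dimensions this integral is exactly the hard part that motivates other algorithms:

```python
import numpy as np

# Toy sketch of the evidence Z = p(D | M) = int p(theta) p(D | theta) dtheta
# for a 1-D model (unknown mean, known unit variance), via quadrature.
rng = np.random.default_rng(5)
data = rng.normal(0.5, 1.0, size=20)  # invented data

def lnlike(theta):
    return -0.5 * np.sum((data - theta) ** 2 + np.log(2 * np.pi))

# Uniform prior p(theta) = 1/2 on [-1, 1]; trapezoid rule over a fine grid.
theta_grid = np.linspace(-1.0, 1.0, 2001)
integrand = 0.5 * np.exp([lnlike(t) for t in theta_grid])
Z = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(theta_grid))
print(Z)
```

Note that a plain MCMC chain never gives you Z, because the acceptance rule only ever uses probability ratios; that is why model comparison needs a different tool.
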
  114. what to do?

  115. what to do? 1. burn-in & priors (plot: p(θ | D) vs. θ)

  116. what to do? 1. burn-in & priors 2. use a different algorithm (gasp!): Z = p(D | M) = ∫ p(D, θ | M) dθ

  117. what to do? 1. burn-in & priors 2. use a different algorithm (gasp!): Z = p(D | M) = ∫ p(D, θ | M) dθ github.com/eggplantbren/DNest3
  118. sometimes it is useful

  119. emcee is community supported.

  120. emcee is community supported.

  121. emcee has good documentation.

  122. emcee has a live support team. danfm@nyu.edu

  123. Alex Conley (UC Boulder) Jason Davies (Jason Davies Ltd.) Will

    Meierjurgen Farr (Northwestern) David W. Hogg (NYU) Dustin Lang (CMU) Phil Marshall (Oxford) Ilya Pashchenko (ASC LPI, Moscow) Adrian Price-Whelan (Columbia) Jeremy Sanders (Cambridge) Joe Zuntz (Oxford) Eric Agol (UW) Jo Bovy (IAS) Jacqueline Chen (MIT) John Gizis (Delaware) Jonathan Goodman (NYU) Marius Millea (UC Davis) Jennifer Piscionere (Vanderbilt) contributors
  124. take home. your model: p(D | θ, α)

  125. take home. your model: p(D | θ, α). marginalize: p(D | θ) ∝ ∫ p(α) p(D | θ, α) dα. it's hammer time!

  126. take home. your model: p(D | θ, α). marginalize: p(D | θ) ∝ ∫ p(α) p(D | θ, α) dα. θ = it's hammer time!
  127. Alex Conley (UC Boulder) Jason Davies (Jason Davies Ltd.) Will

    Meierjurgen Farr (Northwestern) David W. Hogg (NYU) Dustin Lang (CMU) Phil Marshall (Oxford) Ilya Pashchenko (ASC LPI, Moscow) Adrian Price-Whelan (Columbia) Jeremy Sanders (Cambridge) Joe Zuntz (Oxford) Eric Agol (UW) Jo Bovy (IAS) Jacqueline Chen (MIT) John Gizis (Delaware) Jonathan Goodman (NYU) Marius Millea (UC Davis) Jennifer Piscionere (Vanderbilt) thanks!