
"Tools for probabilistic data analysis in Python" from the PyAstro16 conference

Dan Foreman-Mackey

March 23, 2016

Transcript

  1. Physics → Data: mean model (physical parameters → predicted data) plus noise (stochastic; instrument, systematics, etc.). Data → Physics: inference (parameter estimation).
  2. Linear regression if you have: a linear mean model and

    known Gaussian uncertainties and you want: "best" parameters and uncertainties
  3. Linear (mean) models: y = m x + b; y = a_2 x² + a_1 x + a_0
  4. Linear (mean) models: y = m x + b; y = a_2 x² + a_1 x + a_0; y = a sin(x + ω)
  5. Linear (mean) models: y = m x + b; y = a_2 x² + a_1 x + a_0; y = a sin(x + ω), plus known Gaussian uncertainties
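     Each of these models is linear in its coefficients, which is what makes the closed-form solution on the next slides possible; for the sinusoid, a sin(x + ω) = (a cos ω) sin(x) + (a sin ω) cos(x). A minimal sketch (not from the slides; the x array is illustrative) of the corresponding design matrices:

     import numpy as np

     x = np.linspace(0, 10, 50)                  # illustrative inputs

     # y = m x + b  ->  columns [x, 1]
     A_line = np.vander(x, 2)

     # y = a_2 x^2 + a_1 x + a_0  ->  columns [x^2, x, 1]
     A_quad = np.vander(x, 3)

     # y = a sin(x + ω) = (a cos ω) sin(x) + (a sin ω) cos(x)
     # -> linear in the coefficients (a cos ω, a sin ω)
     A_sine = np.column_stack([np.sin(x), np.cos(x)])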
  6. Linear regression

     # x, y, yerr are numpy arrays of the same shape
     import numpy as np
     A = np.vander(x, 2)
     ATA = np.dot(A.T, A / yerr[:, None]**2)
     sigma_w = np.linalg.inv(ATA)
     mean_w = np.linalg.solve(ATA, np.dot(A.T, y / yerr**2))
  7. Linear regression

     # x, y, yerr are numpy arrays of the same shape
     import numpy as np
     A = np.vander(x, 2)
     ATA = np.dot(A.T, A / yerr[:, None]**2)
     sigma_w = np.linalg.inv(ATA)
     mean_w = np.linalg.solve(ATA, np.dot(A.T, y / yerr**2))

     A = [[x_1, 1], [x_2, 1], …, [x_N, 1]]
  8. Linear regression

     # x, y, yerr are numpy arrays of the same shape
     import numpy as np
     A = np.vander(x, 2)
     ATA = np.dot(A.T, A / yerr[:, None]**2)
     sigma_w = np.linalg.inv(ATA)
     mean_w = np.linalg.solve(ATA, np.dot(A.T, y / yerr**2))

     A = [[x_1, 1], [x_2, 1], …, [x_N, 1]]
     w = (m, b)^T
  9. Linear regression

     # x, y, yerr are numpy arrays of the same shape
     import numpy as np
     A = np.vander(x, 2)
     ATA = np.dot(A.T, A / yerr[:, None]**2)
     sigma_w = np.linalg.inv(ATA)
     mean_w = np.linalg.solve(ATA, np.dot(A.T, y / yerr**2))

     That's it! (in other words: "Don't use MCMC for linear regression!")

     A = [[x_1, 1], [x_2, 1], …, [x_N, 1]]
     w = (m, b)^T
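     Because the model is linear with known Gaussian uncertainties (and, implicitly, a flat prior), mean_w and sigma_w above are the posterior mean and covariance of w = (m, b), so parameter uncertainties come for free. A minimal follow-up sketch, reusing the variables from the snippet above:

     import numpy as np

     # mean_w and sigma_w come from the linear-regression snippet above
     print("m = {0:.3f} +/- {1:.3f}".format(mean_w[0], np.sqrt(sigma_w[0, 0])))
     print("b = {0:.3f} +/- {1:.3f}".format(mean_w[1], np.sqrt(sigma_w[1, 1])))

     # posterior samples of (m, b), e.g. for plotting a credible band of lines
     samples = np.random.multivariate_normal(mean_w, sigma_w, size=1000)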
  10. Maximum likelihood if you have: a non-linear mean model and/or

    non-Gaussian/unknown noise and you want: "best" parameters
  11. Example likelihood function:

      ln p({y_n} | θ) = -1/2 Σ_{n=1}^{N} [y_n - f_θ(x_n)]² / σ_n² + constant

      where ln p({y_n} | θ) is the log-likelihood and f_θ is the mean model
  12. Likelihoods

      # x, y, yerr are numpy arrays of the same shape
      import numpy as np
      from scipy.optimize import minimize

      def model(theta, x):
          a, b, c = theta
          return a / (1 + np.exp(-b * (x - c)))

      def neg_log_like(theta):
          return 0.5 * np.sum(((model(theta, x) - y) / yerr)**2)

      r = minimize(neg_log_like, [1.0, 10.0, 1.5])
      print(r)

      ln p({y_n} | θ) = -1/2 Σ_{n=1}^{N} [y_n - f_θ(x_n)]² / σ_n² + constant
  13. Likelihoods

      # x, y, yerr are numpy arrays of the same shape
      import numpy as np
      from scipy.optimize import minimize

      def model(theta, x):
          a, b, c = theta
          return a / (1 + np.exp(-b * (x - c)))

      def neg_log_like(theta):
          return 0.5 * np.sum(((model(theta, x) - y) / yerr)**2)

      r = minimize(neg_log_like, [1.0, 10.0, 1.5])
      print(r)

      f_θ(x_n) = a / (1 + e^{-b (x_n - c)})
      ln p({y_n} | θ) = -1/2 Σ_{n=1}^{N} [y_n - f_θ(x_n)]² / σ_n² + constant
  14. Likelihoods

      # x, y, yerr are numpy arrays of the same shape
      import numpy as np
      from scipy.optimize import minimize

      def model(theta, x):
          a, b, c = theta
          return a / (1 + np.exp(-b * (x - c)))

      def neg_log_like(theta):
          return 0.5 * np.sum(((model(theta, x) - y) / yerr)**2)

      r = minimize(neg_log_like, [1.0, 10.0, 1.5])
      print(r)

      f_θ(x_n) = a / (1 + e^{-b (x_n - c)})
      ln p({y_n} | θ) = -1/2 Σ_{n=1}^{N} [y_n - f_θ(x_n)]² / σ_n² + constant
      ln p({y_n} | θ) is the quantity being maximized
  15. AutoDiff to the rescue! "The most criminally underused tool in

    the [PyAstro] toolkit" — adapted from justindomke.wordpress.com
  16. HIPS/autograd just works

      import autograd.numpy as np
      from autograd import elementwise_grad

      def f(x):
          y = np.exp(-x)
          return (1.0 - y) / (1.0 + y)

      df = elementwise_grad(f)
      ddf = elementwise_grad(df)
  17. HIPS/autograd just works

      import autograd.numpy as np
      from autograd import elementwise_grad

      def f(x):
          y = np.exp(-x)
          return (1.0 - y) / (1.0 + y)

      df = elementwise_grad(f)
      ddf = elementwise_grad(df)

      [plot of f(x), f′(x), f″(x) over x from -4 to 4]
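     Since f(x) = (1 - e^{-x}) / (1 + e^{-x}) is just tanh(x/2), the autograd result can be checked against the analytic derivative. A small verification sketch (not from the slides):

     import autograd.numpy as np
     from autograd import elementwise_grad

     def f(x):
         y = np.exp(-x)
         return (1.0 - y) / (1.0 + y)

     df = elementwise_grad(f)

     x = np.linspace(-4, 4, 9)
     analytic = 0.5 * (1.0 - np.tanh(0.5 * x)**2)   # d/dx tanh(x/2)
     print(np.allclose(df(x), analytic))            # expect True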
  18. before autograd

      # x, y, yerr are numpy arrays of the same shape
      import numpy as np
      from scipy.optimize import minimize

      def model(theta, x):
          a, b, c = theta
          return a / (1 + np.exp(-b * (x - c)))

      def neg_log_like(theta):
          r = (y - model(theta, x)) / yerr
          return 0.5 * np.sum(r*r)

      r = minimize(neg_log_like, [1.0, 10.0, 1.5])
      print(r)
  19. after autograd

      # x, y, yerr are numpy arrays of the same shape
      from autograd import grad
      import autograd.numpy as np
      from scipy.optimize import minimize

      def model(theta, x):
          a, b, c = theta
          return a / (1 + np.exp(-b * (x - c)))

      def neg_log_like(theta):
          r = (y - model(theta, x)) / yerr
          return 0.5 * np.sum(r*r)

      r = minimize(neg_log_like, [1.0, 10.0, 1.5],
                   jac=grad(neg_log_like))
      print(r)
  20. after autograd (same code as the previous slide)
  21. after autograd (same code as slide 19, annotated with call counts)

      115 calls → 66 calls
  22. HIPS/autograd just works, but… HIPS/autograd is not super fast; you might need to drop down to a compiled language
  23. HIPS/autograd just works, but… HIPS/autograd is not super fast; you might need to drop down to a compiled language… or…
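     One possible alternative (an assumption here, not something the slides spell out): for this particular neg_log_like the gradient is simple enough to write by hand and pass to minimize via jac=, keeping everything in plain numpy.

     import numpy as np
     from scipy.optimize import minimize

     # x, y, yerr are numpy arrays of the same shape, as in the earlier slides

     def neg_log_like_grad(theta):
         a, b, c = theta
         s = 1.0 / (1.0 + np.exp(-b * (x - c)))   # logistic factor
         r = (y - a * s) / yerr                   # weighted residuals
         df_da = s
         df_db = a * s * (1.0 - s) * (x - c)
         df_dc = -a * s * (1.0 - s) * b
         # d(neg_log_like)/d(theta_i) = -sum(r / yerr * df/d(theta_i))
         return np.array([-np.sum(r / yerr * df_da),
                          -np.sum(r / yerr * df_db),
                          -np.sum(r / yerr * df_dc)])

     # r = minimize(neg_log_like, [1.0, 10.0, 1.5], jac=neg_log_like_grad)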
  24. Uncertainty quantification if you have: a non-linear mean model and/or

    non-Gaussian/unknown noise and you want: parameter uncertainties
  25. Uncertainty: p(physics | data) ∝ p(data | physics) × p(physics), the distribution of physical parameters consistent with the data; p(data | physics) is the likelihood and p(physics) is the prior
  26. MCMC sampling with emcee

      # x, y, yerr are numpy arrays of the same shape
      import emcee
      import numpy as np

      def model(theta, x):
          a, b, c = theta
          return a / (1 + np.exp(-b * (x - c)))

      def log_prob(theta):
          log_prior = 0.0
          r = (y - model(theta, x)) / yerr
          return -0.5 * np.sum(r*r) + log_prior

      ndim, nwalkers = 3, 32
      p0 = np.array([1.0, 10.0, 1.5])
      p0 = p0 + 0.01*np.random.randn(nwalkers, ndim)
      sampler = emcee.EnsembleSampler(nwalkers, ndim, log_prob)
      sampler.run_mcmc(p0, 1000)
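     A minimal post-processing sketch (an assumption, using emcee 2.x-style attributes): discard an initial burn-in, flatten the chain, and report percentile-based parameter estimates.

     import numpy as np

     # sampler.chain has shape (nwalkers, nsteps, ndim) in emcee 2.x
     flat = sampler.chain[:, 200:, :].reshape(-1, ndim)   # drop 200 burn-in steps
     for name, samples in zip("abc", flat.T):
         lo, mid, hi = np.percentile(samples, [16, 50, 84])
         print("{0} = {1:.3f} +{2:.3f} -{3:.3f}".format(name, mid, hi - mid, mid - lo))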
  27. MCMC sampling with emcee: corner plot of the posterior over (a, b, c) for f_θ(x_n) = a / (1 + e^{-b (x_n - c)}), made using github.com/dfm/corner.py
  28. Other MCMC samplers in Python: (1) pymc-devs/pymc3, (2) stan-dev/pystan, (3) JohannesBuchner/PyMultiNest, (4) eggplantbren/DNest4
  29. Other MCMC samplers in Python: (1) pymc-devs/pymc3, (2) stan-dev/pystan (hierarchical inference); (3) JohannesBuchner/PyMultiNest, (4) eggplantbren/DNest4
  30. Other MCMC samplers in Python: (1) pymc-devs/pymc3, (2) stan-dev/pystan (hierarchical inference); (3) JohannesBuchner/PyMultiNest, (4) eggplantbren/DNest4 (nested sampling)
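     As an illustration, a minimal pymc-devs/pymc3 sketch of the same logistic model (an assumption, not from the slides; x, y, yerr are the arrays used earlier, and the broad Normal priors are placeholders):

     import pymc3 as pm

     with pm.Model():
         a = pm.Normal("a", mu=1.0, sd=10.0)
         b = pm.Normal("b", mu=10.0, sd=10.0)
         c = pm.Normal("c", mu=1.5, sd=10.0)
         mu = a / (1.0 + pm.math.exp(-b * (x - c)))
         pm.Normal("obs", mu=mu, sd=yerr, observed=y)
         trace = pm.sample(1000)   # NUTS by default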
  31. Physics → Data: mean model (physical parameters → predicted data) plus noise (stochastic; instrument, systematics, etc.); Data → Physics: inference (parameter estimation). If your data analysis problem looks like this…* (* it probably does)
  32. … now you know how to solve it!* (* in theory)

      https://speakerdeck.com/dfm/pyastro16