
# pyastro16

"Tools for probabilistic data analysis in Python" from the PyAstro16 conference

March 23, 2016

## Transcript

1. ### #pyastro16
2. ### Tools for Probabilistic Data Analysis in Python * Dan Foreman-Mackey

3. ### #pyastro16 * in 15 minutes

7. ### Physics → Data: mean model (physical parameters → predicted data) + noise (stochastic; instrument, systematics, etc.)
8. ### Physics → Data: mean model (physical parameters → predicted data) + noise (stochastic; instrument, systematics, etc.); Data → Physics: inference (parameter estimation)

a few examples

11. ### Linear regression
    if you have: a linear mean model and known Gaussian uncertainties
    and you want: "best" parameters and uncertainties

13. ### Linear (mean) models
    $y = m x + b$
    $y = a_2 x^2 + a_1 x + a_0$
14. ### Linear (mean) models
    $y = m x + b$
    $y = a_2 x^2 + a_1 x + a_0$
    $y = a \sin(x + w)$
15. ### Linear (mean) models
    $y = m x + b$
    $y = a_2 x^2 + a_1 x + a_0$
    $y = a \sin(x + w)$
    + known Gaussian uncertainties

18. ### Linear regression

```python
# x, y, yerr are numpy arrays of the same shape
import numpy as np

A = np.vander(x, 2)
ATA = np.dot(A.T, A / yerr[:, None]**2)
sigma_w = np.linalg.inv(ATA)
mean_w = np.linalg.solve(ATA, np.dot(A.T, y / yerr**2))
```

19. ### Linear regression: same code, annotating the design matrix
    $A = \begin{pmatrix} x_1 & 1 \\ x_2 & 1 \\ \vdots & \vdots \\ x_n & 1 \end{pmatrix}$
20. ### Linear regression: same code, annotating the weight vector
    $w = \begin{pmatrix} m \\ b \end{pmatrix}$
21. ### Linear regression: same code. That's it! (in other words: "Don't use MCMC for linear regression!")

23. ### Maximum likelihood
    if you have: a non-linear mean model and/or non-Gaussian/unknown noise
    and you want: "best" parameters
24. ### Likelihoods: "probability of the data given physics", p(data | physics), parameterized by some parameters
25. ### Example likelihood function
    $\ln p(\{y_n\} \mid \theta) = -\frac{1}{2} \sum_{n=1}^{N} \frac{[y_n - f_\theta(x_n)]^2}{\sigma_n^2} + \mathrm{constant}$
    (the log-likelihood; $f_\theta$ is the mean model)

28. ### Likelihoods

```python
# x, y, yerr are numpy arrays of the same shape
import numpy as np
from scipy.optimize import minimize

def model(theta, x):
    a, b, c = theta
    return a / (1 + np.exp(-b * (x - c)))

def neg_log_like(theta):
    return 0.5 * np.sum(((model(theta, x) - y) / yerr)**2)

r = minimize(neg_log_like, [1.0, 10.0, 1.5])
print(r)
```

    $\ln p(\{y_n\} \mid \theta) = -\frac{1}{2} \sum_{n=1}^{N} \frac{[y_n - f_\theta(x_n)]^2}{\sigma_n^2} + \mathrm{constant}$

29. ### Likelihoods: same code, annotating the mean model
    $f_\theta(x_n) = \frac{a}{1 + e^{-b (x_n - c)}}$
30. ### Likelihoods: same code, annotating that neg_log_like computes $-\ln p(\{y_n\} \mid \theta)$ (up to the constant)
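As a sanity check, here is the same maximum-likelihood fit run end to end on synthetic data; this sketch is not from the talk, and the true parameter values, noise level, and starting guess are invented for illustration.

```python
# Self-contained run of the maximum-likelihood fit on fake data.
# True parameters (a, b, c) = (1.0, 10.0, 1.5) are made up for this demo.
import numpy as np
from scipy.optimize import minimize

np.random.seed(0)
x = np.sort(3 * np.random.rand(100))
yerr = 0.05 * np.ones_like(x)
y = 1.0 / (1 + np.exp(-10.0 * (x - 1.5))) + yerr * np.random.randn(len(x))

def model(theta, x):
    a, b, c = theta
    return a / (1 + np.exp(-b * (x - c)))

def neg_log_like(theta):
    return 0.5 * np.sum(((model(theta, x) - y) / yerr)**2)

r = minimize(neg_log_like, [0.8, 8.0, 1.3])
print(r.x)  # should land near (1.0, 10.0, 1.5)
```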

36. ### AutoDiff to the rescue! "The most criminally underused tool in the [PyAstro] toolkit" (adapted from justindomke.wordpress.com)

38. ### AutoDiff "Compile" time chain rule

cos(x.value) )
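The idea that slide illustrates is easy to reconstruct: forward-mode autodiff carries a value and its derivative through every operation, applying the chain rule as it goes, which is exactly where a term like cos(x.value) comes from when differentiating sin. A toy sketch of the mechanism (my reconstruction, not the slide's original code):

```python
# A toy forward-mode autodiff: each Dual carries a value and the derivative
# of that value with respect to the input, updated by the chain rule.
import math

class Dual:
    def __init__(self, value, grad=0.0):
        self.value = value
        self.grad = grad  # d(value)/d(input)

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.value + other.value, self.grad + other.grad)

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.value * other.value,
                    self.grad * other.value + self.value * other.grad)

def sin(x):
    # chain rule: d sin(x) = cos(x.value) * dx  <- the cos(x.value) on the slide
    return Dual(math.sin(x.value), math.cos(x.value) * x.grad)

x = Dual(0.5, grad=1.0)          # seed: dx/dx = 1
f = sin(x * x)                   # f(x) = sin(x^2)
print(f.value, f.grad)           # f'(x) = 2 x cos(x^2)
print(2 * 0.5 * math.cos(0.25))  # matches
```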
40. ### AutoDiff: "compile" time exact gradients (HIPS/autograd just works)

```python
import autograd.numpy as np
from autograd import elementwise_grad

def f(x):
    y = np.exp(-x)
    return (1.0 - y) / (1.0 + y)

df = elementwise_grad(f)    # exact first derivative
ddf = elementwise_grad(df)  # exact second derivative
```

    [figure: f(x), f'(x), and f''(x) plotted over x in (-4, 4)]
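A quick way to convince yourself these gradients are exact is to compare against a finite difference. This check is my addition, and the test points and step size are arbitrary:

```python
# Check autograd's derivative of f against a centered finite difference.
import autograd.numpy as np
from autograd import elementwise_grad

def f(x):
    y = np.exp(-x)
    return (1.0 - y) / (1.0 + y)

df = elementwise_grad(f)
x = np.linspace(-4, 4, 9)
eps = 1e-6
fd = (f(x + eps) - f(x - eps)) / (2 * eps)
print(np.max(np.abs(df(x) - fd)))  # tiny: autodiff is exact, FD is not
```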
44. ### before autograd

```python
# x, y, yerr are numpy arrays of the same shape
import numpy as np
from scipy.optimize import minimize

def model(theta, x):
    a, b, c = theta
    return a / (1 + np.exp(-b * (x - c)))

def neg_log_like(theta):
    r = (y - model(theta, x)) / yerr
    return 0.5 * np.sum(r*r)

r = minimize(neg_log_like, [1.0, 10.0, 1.5])
print(r)
```

45. ### after autograd

```python
# x, y, yerr are numpy arrays of the same shape
from autograd import grad
import autograd.numpy as np
from scipy.optimize import minimize

def model(theta, x):
    a, b, c = theta
    return a / (1 + np.exp(-b * (x - c)))

def neg_log_like(theta):
    r = (y - model(theta, x)) / yerr
    return 0.5 * np.sum(r*r)

r = minimize(neg_log_like, [1.0, 10.0, 1.5],
             jac=grad(neg_log_like))
print(r)
```

46. ### after autograd: same code (build slide with highlighting)
47. ### after autograd: same code, annotated with the function-call counts (115 calls before, 66 calls after)

49. ### HIPS/autograd just works, but… HIPS/autograd is not super fast; you might need to drop down to a compiled language
50. ### HIPS/autograd just works, but… HIPS/autograd is not super fast; you might need to drop down to a compiled language or...

53. ### Uncertainty quantification
    if you have: a non-linear mean model and/or non-Gaussian/unknown noise
    and you want: parameter uncertainties
54. ### Uncertainty
    $p(\mathrm{physics} \mid \mathrm{data}) \propto p(\mathrm{data} \mid \mathrm{physics}) \, p(\mathrm{physics})$
    posterior (distribution of physical parameters consistent with data) ∝ likelihood × prior

SAMPLE

59. ### MCMC sampling with emcee

```python
# x, y, yerr are numpy arrays of the same shape
import emcee
import numpy as np

def model(theta, x):
    a, b, c = theta
    return a / (1 + np.exp(-b * (x - c)))

def log_prob(theta):
    log_prior = 0.0
    r = (y - model(theta, x)) / yerr
    return -0.5 * np.sum(r*r) + log_prior

ndim, nwalkers = 3, 32
p0 = np.array([1.0, 10.0, 1.5])
p0 = p0 + 0.01 * np.random.randn(nwalkers, ndim)
sampler = emcee.EnsembleSampler(nwalkers, ndim, log_prob)
sampler.run_mcmc(p0, 1000)
```
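To pull posterior samples out afterwards, something like the following works; the attribute name depends on the emcee version (this is the classic emcee 2.x spelling from around the talk's era), and the burn-in length here is a made-up choice:

```python
# Discard burn-in and flatten the walkers into one set of posterior samples.
# (emcee 2.x API; in emcee 3 use sampler.get_chain(discard=100, flat=True).)
samples = sampler.chain[:, 100:, :].reshape(-1, ndim)
print(samples.shape)  # (nwalkers * (1000 - 100), 3)
```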
60. ### MCMC sampling with emcee
    [corner plot of the posterior samples in a, b, and c for the model $f_\theta(x_n) = \frac{a}{1 + e^{-b (x_n - c)}}$; made using github.com/dfm/corner.py]
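Reproducing a plot like that from the samples above is essentially a one-liner with corner.py; the axis labels are just the parameter names:

```python
# Make the corner plot of the posterior samples from the emcee run above.
import corner

fig = corner.corner(samples, labels=["a", "b", "c"])
fig.savefig("corner.png")
```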

63. ### Other MCMC samplers in Python: 1. pymc-devs/pymc3, 2. stan-dev/pystan, 3. JohannesBuchner/PyMultiNest, 4. eggplantbren/DNest4
64. ### Other MCMC samplers in Python: the same list, tagging pymc3 and pystan for hierarchical inference
65. ### Other MCMC samplers in Python: the same list, also tagging PyMultiNest and DNest4 for nested sampling
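For flavor, here is roughly what the same logistic-curve fit looks like in pymc3. This sketch is not from the talk, and the priors (wide normals) and the number of draws are invented choices:

```python
# The same mean model, sampled with pymc3 instead of emcee (a sketch).
# x, y, yerr are numpy arrays of the same shape, as before.
import pymc3 as pm

with pm.Model():
    a = pm.Normal("a", mu=0.0, sd=10.0)
    b = pm.Normal("b", mu=0.0, sd=10.0)
    c = pm.Normal("c", mu=0.0, sd=10.0)
    mu = a / (1 + pm.math.exp(-b * (x - c)))
    pm.Normal("obs", mu=mu, sd=yerr, observed=y)  # Gaussian likelihood
    trace = pm.sample(1000)
```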

67. ### Physics → Data: mean model (physical parameters → predicted data) + noise (stochastic; instrument, systematics, etc.); Data → Physics: inference (parameter estimation). If your data analysis problem looks like this…* (* it probably does)
68. ### … now you know how to solve it!* (* in theory) https://speakerdeck.com/dfm/pyastro16