TensorFlow for astronomers

Dan Foreman-Mackey

April 06, 2018

Transcript

  1. TensorFlow 101 for Astronomers
    A non-traditional introduction to model building with TensorFlow
    Dan Foreman-Mackey
    CCA@Flatiron / dfm.io / @exoplaneteer / github.com/dfm

  2. WARNING

    This is not a talk about Machine
    Learning™. If you're only here for the
    neural networks, please feel free to leave
    now <3

  3. (image slide)

  4. (image slide)

  5. WARNING

    I am not a TensorFlow expert. I only
    started learning a few months ago. That
    being said, I'm stoked.

  6. (image slide)

  7. So why am I here?

  8. Blame Peter.

  9. [Google Trends: search interest in "TensorFlow" vs. "astronomy", January 2016 to April 2018]
    https://trends.google.com/trends/explore?date=2016-01-01%202018-04-01&q=TensorFlow,astronomy

    View Slide

  10. But seriously...
    I think that TensorFlow has the potential to be broadly used
    for data analysis problems all across astronomy.

  11. What is TensorFlow?
    NumPy + special sauce, basically.
    This is a very biased description...

  12. What is TensorFlow?
    NumPy + special sauce, basically.
    Fast linear algebra, GPU support, automatic differentiation, ...
    This is a very biased description...

  13. My goals for today
    Introduce the why and how of TensorFlow.
    Demonstrate that TensorFlow isn't just for machine learning.
    Sketch the procedure for extending TensorFlow with custom C++ operations.

  14. Why should you care?
    GPU support for some operations (esp. linear algebra).
    Extensible with custom C/C++ code.
    Simple (Python) interface for flexible but high-performance
    model building + gradients.
    Did I mention GRADIENTS?!?
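
    To make "gradients" concrete, here is a minimal sketch (assuming the
    TF 1.x API used throughout this deck) of getting a derivative for free:

    import tensorflow as tf

    x = tf.Variable(3.0, dtype=tf.float64)
    y = x ** 2
    grad = tf.gradients(y, [x])[0]  # builds the graph for dy/dx = 2 x

    with tf.Session() as session:
        session.run(tf.global_variables_initializer())
        print(session.run(grad))  # 6.0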

  15. All of science
    p(data | model)

  16. All of science
    p(data | model)
    Then, use some numerical algorithm to fit for the model parameters...
    e.g. scipy.optimize.minimize, MCMC, etc.

  17. All of science
    p(data | model)
    A lot of code...
    Then, use some numerical algorithm to fit for the model parameters...
    e.g. scipy.optimize.minimize, MCMC, etc.

  18. All of science
    Then, use some numerical algorithm to fit for the model parameters...
    e.g. scipy.optimize.minimize, MCMC, etc.
    d p(data | model) / d model
    A lot of code...

  19. All of science
    OMG SO MANY CODEZ!!!!!
    Then, use some numerical algorithm to fit for the model parameters...
    e.g. scipy.optimize.minimize, MCMC, etc.
    d p(data | model) / d model
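
    To see why this means "so many codez", here is a hedged sketch of doing it
    by hand with scipy: you write the objective and derive its gradient
    yourself (all names below are illustrative, not from the slides):

    import numpy as np
    from scipy.optimize import minimize

    def nll(theta, x, y):
        # negative log-likelihood for y = m*x + b with unit error bars
        m, b = theta
        r = y - (m * x + b)
        return 0.5 * np.sum(r ** 2)

    def grad_nll(theta, x, y):
        # the gradient you must derive (and re-derive!) by hand
        m, b = theta
        r = y - (m * x + b)
        return np.array([-np.sum(r * x), -np.sum(r)])

    x = np.linspace(0, 1, 50)
    y = 0.5 * x - 0.2
    result = minimize(nll, [0.0, 0.0], args=(x, y), jac=grad_nll)
    print(result.x)  # approximately [0.5, -0.2]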

  20. Now what if you want to change the parameterization?

  21. TensorFlow offers
    Simple Python interface, but C speed (or faster)
    easy to implement/debug, fits into existing Python stack easily, etc.
    Automatic differentiation of models
    no need to manually compute gradients, even if parameterization changes
    Extensibility and customization
    you can incorporate your existing C code (with some caveats)

  22. Automatic differentiation
    An aside

  23. y = f(x)
    your computer program

    schematically...
    y = f_1(f_2(\cdots f_N(x) \cdots))

    y' = f_1'(f_2(\cdots f_N(x) \cdots)) \, f_2'(\cdots f_N(x) \cdots) \cdots f_N'(x)
    hint: this is the chain rule

  26. Automatic differentiation
    The chain rule + compiled code...
    ...leads to efficient calculation of gradients of complicated models.
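
    A quick sketch (TF 1.x) checking the chain rule numerically, with
    f1 = sin and f2 = exp as stand-in functions:

    import numpy as np
    import tensorflow as tf

    x = tf.constant(0.5, dtype=tf.float64)
    y = tf.sin(tf.exp(x))            # y = f1(f2(x))
    grad = tf.gradients(y, [x])[0]   # autodiff applies the chain rule

    with tf.Session() as session:
        print(session.run(grad))                  # autodiff answer
        print(np.cos(np.exp(0.5)) * np.exp(0.5))  # cos(exp(x)) * exp(x), by hand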

  27. Linear regression
    A simple example

  28. To start
    pip install tensorflow

  29. Linear regression
    import numpy as np
    ivar = 1.0 / yerr**2
    A = np.vander(x, 2)
    ATA = np.dot(A.T, A*ivar[:, None])
    ATy = np.dot(A.T, y*ivar)
    w = np.linalg.solve(ATA, ATy)
    print(w)

  30. Linear regression
    w = (A^T C^{-1} A)^{-1} (A^T C^{-1} y)

    import numpy as np
    ivar = 1.0 / yerr**2
    A = np.vander(x, 2)
    ATA = np.dot(A.T, A*ivar[:, None])
    ATy = np.dot(A.T, y*ivar)
    w = np.linalg.solve(ATA, ATy)
    print(w)
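
    The slides assume x, y, and yerr already exist; a made-up setup (purely
    illustrative, not from the talk) that makes the snippets runnable:

    import numpy as np

    np.random.seed(42)
    x = np.sort(np.random.uniform(0, 10, 50))
    yerr = 0.1 * np.ones_like(x)
    y = 0.5 * x - 0.2 + yerr * np.random.randn(len(x))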

  31. Linear regression
    import tensorflow as tf
    ivar = 1.0 / yerr**2
    A = np.vander(x, 2)
    ATA = tf.matmul(A, A*ivar[:, None], transpose_a=True)
    ATy = tf.matmul(A, (y*ivar)[:, None], transpose_a=True)
    w = tf.linalg.solve(ATA, ATy)
    print(tf.Session().run(w))

  32. Linear regression
    import tensorflow as tf
    ivar = 1.0 / yerr**2
    A = np.vander(x, 2)
    ATA = tf.matmul(A, A*ivar[:, None], transpose_a=True)
    ATy = tf.matmul(A, (y*ivar)[:, None], transpose_a=True)
    w = tf.linalg.solve(ATA, ATy)
    print(tf.Session().run(w))

    >>> print(w)
    Tensor("MatrixSolve_12:0", shape=(2, 1), dtype=float64)

  33. Linear regression
    # define operation "graph"
    import tensorflow as tf
    ivar = 1.0 / yerr**2
    A = np.vander(x, 2)
    ATA = tf.matmul(A, A*ivar[:, None], transpose_a=True)
    ATy = tf.matmul(A, (y*ivar)[:, None], transpose_a=True)
    w = tf.linalg.solve(ATA, ATy)
    print(tf.Session().run(w))

    >>> print(w)
    Tensor("MatrixSolve_12:0", shape=(2, 1), dtype=float64)

  34. Linear regression
    # define operation "graph"
    import tensorflow as tf
    ivar = 1.0 / yerr**2
    A = np.vander(x, 2)
    ATA = tf.matmul(A, A*ivar[:, None], transpose_a=True)
    ATy = tf.matmul(A, (y*ivar)[:, None], transpose_a=True)
    w = tf.linalg.solve(ATA, ATy)
    # execute the operations
    print(tf.Session().run(w))

    >>> print(w)
    Tensor("MatrixSolve_12:0", shape=(2, 1), dtype=float64)
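
    One payoff of the define-then-execute split (a sketch, assuming TF 1.x
    placeholders; not shown in the talk): the same graph can be re-run on
    new data without being rebuilt.

    import numpy as np
    import tensorflow as tf

    x_in = tf.placeholder(tf.float64, shape=(None,))  # data slot, filled at run time
    y_out = 2.0 * x_in + 1.0                          # graph defined once

    with tf.Session() as session:
        print(session.run(y_out, feed_dict={x_in: np.array([1.0, 2.0])}))
        print(session.run(y_out, feed_dict={x_in: np.array([3.0, 4.0, 5.0])}))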

  35. Linear regression with unknown error bars
    Another simple example
    \mathcal{L} = -\frac{1}{2} \sum_{n=1}^{N} \left[ \frac{(y_n - m x_n - b)^2}{s^2} + \log(2 \pi s^2) \right]

  36. m = tf.Variable(0.5, dtype=tf.float64)
    b = tf.Variable(-0.2, dtype=tf.float64)
    s = tf.Variable(0.1, dtype=tf.float64)
    model = m * x + b
    log_like = -0.5 * tf.reduce_sum(((y - model) / s) ** 2)
    log_like -= 0.5 * len(x) * tf.log(2*np.pi*s**2)
    \mathcal{L} = -\frac{1}{2} \sum_{n=1}^{N} \left[ \frac{(y_n - m x_n - b)^2}{s^2} + \log(2 \pi s^2) \right]

  37. m = tf.Variable(0.5, dtype=tf.float64)
    b = tf.Variable(-0.2, dtype=tf.float64)
    s = tf.Variable(0.1, dtype=tf.float64)
    model = m * x + b
    log_like = -0.5 * tf.reduce_sum(((y - model) / s) ** 2)
    log_like -= 0.5 * len(x) * tf.log(2*np.pi*s**2)
    with tf.Session() as session:
        session.run(tf.global_variables_initializer())
        print(session.run(log_like))

  38. m = tf.Variable(0.5, dtype=tf.float64)
    b = tf.Variable(-0.2, dtype=tf.float64)
    s = tf.Variable(0.1, dtype=tf.float64)
    model = m * x + b
    log_like = -0.5 * tf.reduce_sum(((y - model) / s) ** 2)
    log_like -= 0.5 * len(x) * tf.log(2*np.pi*s**2)
    with tf.Session() as session:
        session.run(tf.global_variables_initializer())
        print(session.run(log_like))
        print(session.run(tf.gradients(log_like, [m, b, s])))

  39. m = tf.Variable(0.5, dtype=tf.float64)
    b = tf.Variable(-0.2, dtype=tf.float64)
    log_s = tf.Variable(np.log(0.1), dtype=tf.float64)
    s = tf.exp(log_s)
    model = m * x + b
    log_like = -0.5 * tf.reduce_sum(((y - model) / s) ** 2)
    log_like -= 0.5 * len(x) * tf.log(2*np.pi*s**2)
    with tf.Session() as session:
        session.run(tf.global_variables_initializer())
        print(session.run(log_like))
        print(session.run(tf.gradients(log_like, [m, b, log_s])))

  40. To fit...
    opt = tf.contrib.opt.ScipyOptimizerInterface(
        -log_like, [m, b, log_s])
    opt.minimize(session)

    INFO:tensorflow:Optimization terminated with:
    Message: b'CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL'
    Objective function value: -20.356501
    Number of iterations: 19
    Number of functions evaluations: 21
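
    ScipyOptimizerInterface updates the variables in place, so (assuming the
    session from the previous slides) the best-fit values can be read off with:

    print(session.run([m, b, log_s]))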

  41. Extending TensorFlow
    You already have code written in C/C++/Fortran/etc.
    You can use that inside of TensorFlow (with only a little boilerplate).

  42. Extending TensorFlow
    A solver for Kepler's equation...
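
    Kepler's equation relates the mean anomaly M to the eccentric anomaly E
    through M = E - e sin(E), which must be solved for E numerically. The C++
    kernel on the next slide calls a kepler() helper that isn't shown; a sketch
    of what such a Newton solver might look like (in Python, with an assumed
    interface matching the call site):

    import numpy as np

    def kepler(M, e, maxiter=200, tol=1e-10):
        # Newton's method on g(E) = E - e*sin(E) - M, with g'(E) = 1 - e*cos(E)
        E = M if e < 0.8 else np.pi  # a common starting guess
        for _ in range(maxiter):
            delta = (E - e * np.sin(E) - M) / (1.0 - e * np.cos(E))
            E -= delta
            if np.abs(delta) < tol:
                break
        return E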

  43. https://github.com/dfm/AstroFlow/blob/master/astroflow/ops/kepler_op.cc
    template <typename T>
    class KeplerOp : public OpKernel {
     public:
      explicit KeplerOp(OpKernelConstruction* context) : OpKernel(context) {
        OP_REQUIRES_OK(context, context->GetAttr("maxiter", &maxiter_));
        OP_REQUIRES(context, maxiter_ >= 0,
                    errors::InvalidArgument("Need maxiter >= 0, got ", maxiter_));
        OP_REQUIRES_OK(context, context->GetAttr("tol", &tol_));
      }

      void Compute(OpKernelContext* context) override {
        const Tensor& M_tensor = context->input(0);
        const Tensor& e_tensor = context->input(1);
        const int64 N = M_tensor.NumElements();
        Tensor* E_tensor = NULL;
        OP_REQUIRES_OK(context,
                       context->allocate_output(0, M_tensor.shape(), &E_tensor));
        const auto M = M_tensor.template flat<T>();
        const auto e = e_tensor.template scalar<T>()(0);
        auto E = E_tensor->template flat<T>();
        for (int64 n = 0; n < N; ++n) {
          E(n) = kepler(M(n), e, maxiter_, tol_);
        }
      }

     private:
      int maxiter_;
      float tol_;
    };

  44. https://github.com/dfm/AstroFlow/blob/master/astroflow/ops/kepler_op.cc
    (same code as the previous slide)

  45. What about gradients...?

  46. https://github.com/dfm/AstroFlow/blob/master/astroflow/astroflow.py
    @tf.RegisterGradient("Kepler")
    def _kepler_grad(op, *grads):
        M, e = op.inputs
        E = op.outputs[0]
        bE = grads[0]
        bM = bE / (1.0 - e * tf.cos(E))
        be = tf.reduce_sum(tf.sin(E) * bM)
        return [bM, be]
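
    Those two lines come from implicit differentiation of Kepler's equation,
    M = E - e sin E, rather than autodiff through the iterative solver:

    \frac{\partial E}{\partial M} = \frac{1}{1 - e \cos E},
    \qquad
    \frac{\partial E}{\partial e} = \frac{\sin E}{1 - e \cos E}

    Reverse mode then multiplies the upstream gradient bE by these factors,
    summing over all elements for the scalar e (hence the tf.reduce_sum).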

  47. My goals for today
    Introduce the why and how of TensorFlow.
    Demonstrate that TensorFlow isn't just for machine learning.
    Sketch the procedure for extending TensorFlow with custom C++ operations.

  48. Up next...
    github.com/dfm/tf-tutorial
