TensorFlow for astronomers


Dan Foreman-Mackey

April 06, 2018

Transcript

  1. TensorFlow 101 for Astronomers: a non-traditional introduction to model building with TensorFlow. Dan Foreman-Mackey, CCA@Flatiron / dfm.io / @exoplaneteer / github.com/dfm
  2. WARNING ⚠ This is not a talk about Machine Learning™.

    If you're only here for the neural networks, please feel free to leave now <3
  3. None
  4. None
  5. WARNING ⚠ I am not a TensorFlow expert. I only

    started learning a few months ago. That being said, I'm stoked.
  6. None
  7. So why am I here?

  8. Blame Peter.

  9. [Google Trends plot comparing searches for "TensorFlow" and "astronomy", 2016-01 to 2018-04] https://trends.google.com/trends/explore?date=2016-01-01%202018-04-01&q=TensorFlow,astronomy

  10. But seriously... I think that TensorFlow has the potential to be broadly used for data analysis problems all across astronomy.
  11. What is TensorFlow? NumPy + special sauce, basically. (This is a very biased description...)
  12. What is TensorFlow? NumPy + special sauce, basically: fast linear algebra, GPU support, automatic differentiation, ... (This is a very biased description...)
  13. My goals for today: introduce the why and how of TensorFlow; demonstrate that TensorFlow isn't just for machine learning; sketch the procedure for extending TensorFlow with custom C++ operations.
  14. Why should you care? GPU support for some operations (esp. linear algebra). Extensible with custom C/C++ code. Simple (Python) interface for flexible but high-performance model building + gradients. Did I mention GRADIENTS?!?
  15. All of science: p(data | model)

  16. All of science: p(data | model). Then, use some numerical algorithm to fit for the model parameters, e.g. scipy.optimize.minimize, MCMC, etc.
  17. All of science: p(data | model) (a lot of code...). Then, use some numerical algorithm to fit for the model parameters, e.g. scipy.optimize.minimize, MCMC, etc.
  18. All of science: p(data | model) (a lot of code...) and its gradient $\frac{\mathrm{d}\,p(\mathrm{data} \mid \mathrm{model})}{\mathrm{d}\,\mathrm{model}}$. Then, use some numerical algorithm to fit for the model parameters, e.g. scipy.optimize.minimize, MCMC, etc.
  19. All of science: p(data | model) and its gradient $\frac{\mathrm{d}\,p(\mathrm{data} \mid \mathrm{model})}{\mathrm{d}\,\mathrm{model}}$ (OMG SO MANY CODEZ!!!!!). Then, use some numerical algorithm to fit for the model parameters, e.g. scipy.optimize.minimize, MCMC, etc.
  20. Now what if you want to change the parameterization?

  21. TensorFlow offers:
    Simple Python interface, but C speed (or faster): easy to implement/debug, fits into the existing Python stack easily, etc.
    Automatic differentiation of models: no need to manually compute gradients, even if the parameterization changes.
    Extensibility and customization: you can incorporate your existing C code (with some caveats).
  22. An aside: automatic differentiation

  23. y = f(x) (your computer program)

  24. Schematically: $y = f_1(f_2(\cdots f_N(x) \cdots))$
  25. $y' = f_1'(f_2(\cdots f_N(x) \cdots)) \, f_2'(\cdots f_N(x) \cdots) \cdots f_N'(x)$ (hint: this is the chain rule)
  26. Automatic differentiation: the chain rule + compiled code leads to efficient calculation of gradients of complicated models.
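    To make this concrete, here is a minimal sketch (not from the deck, using the same TensorFlow 1.x API as the rest of the talk) comparing tf.gradients with the chain rule written out by hand:

        import tensorflow as tf

        t = tf.Variable(0.3, dtype=tf.float64)
        f = tf.sin(tf.exp(t ** 2))          # f(t) = f1(f2(f3(t))) with f1 = sin, f2 = exp, f3 = square
        auto = tf.gradients(f, [t])[0]      # TensorFlow applies the chain rule automatically

        # the same derivative by hand: f'(t) = cos(exp(t^2)) * exp(t^2) * 2t
        by_hand = tf.cos(tf.exp(t ** 2)) * tf.exp(t ** 2) * 2.0 * t

        with tf.Session() as session:
            session.run(tf.global_variables_initializer())
            print(session.run([auto, by_hand]))  # the two values agree to machine precision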
  27. A simple example: linear regression

  28. To start: pip install tensorflow

  29. Linear regression

        import numpy as np

        ivar = 1.0 / yerr**2
        A = np.vander(x, 2)
        ATA = np.dot(A.T, A * ivar[:, None])
        ATy = np.dot(A.T, y * ivar)
        w = np.linalg.solve(ATA, ATy)
        print(w)
  30. Linear regression: $w = (A^\mathrm{T} C^{-1} A)^{-1} \, (A^\mathrm{T} C^{-1} y)$ (same code as the previous slide)
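    All of these snippets assume that x, y, and yerr already exist. A hypothetical setup (not from the deck) that makes them runnable end to end:

        import numpy as np

        np.random.seed(42)                        # hypothetical synthetic dataset
        x = np.sort(np.random.uniform(0, 10, 50))
        yerr = 0.1 + 0.4 * np.random.rand(50)     # heteroscedastic error bars
        y = 0.5 * x - 0.2 + yerr * np.random.randn(50)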
  31. Linear regression

        import tensorflow as tf

        ivar = 1.0 / yerr**2
        A = np.vander(x, 2)
        ATA = tf.matmul(A, A * ivar[:, None], transpose_a=True)
        ATy = tf.matmul(A, (y * ivar)[:, None], transpose_a=True)
        w = tf.linalg.solve(ATA, ATy)
        print(tf.Session().run(w))
  32. Linear regression (same code as the previous slide); note that w itself is a graph node, not a number:

        >>> print(w)
        Tensor("MatrixSolve_12:0", shape=(2, 1), dtype=float64)
  33. Linear regression: the tf.matmul / tf.linalg.solve calls define an operation "graph" (same code as slide 31).
  34. Linear regression: the tf.* calls define the operation "graph", and tf.Session().run(w) executes the operations (same code as slide 31).
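    A quick sanity check (not from the deck, assuming the synthetic data sketch above): the graph-based solve should reproduce the NumPy result.

        w_np = np.linalg.solve(np.dot(A.T, A * ivar[:, None]), np.dot(A.T, y * ivar))
        with tf.Session() as session:
            w_tf = session.run(w)           # shape (2, 1)
        print(np.allclose(w_np, w_tf[:, 0]))  # expect True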
  35. Another simple example: linear regression with unknown error bars

        $\mathcal{L} = -\frac{1}{2} \sum_{n=1}^{N} \left[ \frac{(y_n - m\,x_n - b)^2}{s^2} + \log(2\pi s^2) \right]$
  36. $\mathcal{L} = -\frac{1}{2} \sum_{n=1}^{N} \left[ \frac{(y_n - m\,x_n - b)^2}{s^2} + \log(2\pi s^2) \right]$

        m = tf.Variable(0.5, dtype=tf.float64)
        b = tf.Variable(-0.2, dtype=tf.float64)
        s = tf.Variable(0.1, dtype=tf.float64)

        model = m * x + b
        log_like = -0.5 * tf.reduce_sum(((y - model) / s) ** 2)
        log_like -= 0.5 * len(x) * tf.log(2 * np.pi * s**2)
  37. (same model as the previous slide, now evaluated in a session)

        with tf.Session() as session:
            session.run(tf.global_variables_initializer())
            print(session.run(log_like))
  38. (same model, now also evaluating the gradients)

        with tf.Session() as session:
            session.run(tf.global_variables_initializer())
            print(session.run(log_like))
            print(session.run(tf.gradients(log_like, [m, b, s])))
  39. (changing the parameterization: fit for log_s instead of s)

        m = tf.Variable(0.5, dtype=tf.float64)
        b = tf.Variable(-0.2, dtype=tf.float64)
        log_s = tf.Variable(np.log(0.1), dtype=tf.float64)
        s = tf.exp(log_s)

        model = m * x + b
        log_like = -0.5 * tf.reduce_sum(((y - model) / s) ** 2)
        log_like -= 0.5 * len(x) * tf.log(2 * np.pi * s**2)

        with tf.Session() as session:
            session.run(tf.global_variables_initializer())
            print(session.run(log_like))
            print(session.run(tf.gradients(log_like, [m, b, log_s])))
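    The point of the reparameterization is that nothing else needs to change: tf.gradients now returns the derivative with respect to log_s automatically. A quick check (not from the deck) using the chain-rule identity dL/d(log s) = s dL/ds:

        g_log_s = tf.gradients(log_like, [log_s])[0]
        g_s = tf.gradients(log_like, [s])[0]   # gradient w.r.t. the intermediate tensor s
        with tf.Session() as session:
            session.run(tf.global_variables_initializer())
            print(session.run([g_log_s, s * g_s]))  # the two numbers agree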
  40. To fit...

        opt = tf.contrib.opt.ScipyOptimizerInterface(
            -log_like, [m, b, log_s])
        opt.minimize(sess)

        INFO:tensorflow:Optimization terminated with:
          Message: b'CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL'
          Objective function value: -20.356501
          Number of iterations: 19
          Number of functions evaluations: 21
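    A minimal end-to-end version of the fit (a sketch, not from the deck): ScipyOptimizerInterface minimizes, hence the negated log-likelihood, and it needs a live session.

        loss = -log_like
        opt = tf.contrib.opt.ScipyOptimizerInterface(loss, var_list=[m, b, log_s])
        with tf.Session() as session:
            session.run(tf.global_variables_initializer())
            opt.minimize(session)
            print(session.run([m, b, tf.exp(log_s)]))  # maximum-likelihood m, b, s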
  41. Extending TensorFlow: you already have code written in C/C++/Fortran/etc. You can use that inside of TensorFlow (with only a little boilerplate).
  42. Extending TensorFlow: a solver for Kepler's equation...
  43. https://github.com/dfm/AstroFlow/blob/master/astroflow/ops/kepler_op.cc

        template <typename T>
        class KeplerOp : public OpKernel {
         public:
          explicit KeplerOp(OpKernelConstruction* context) : OpKernel(context) {
            OP_REQUIRES_OK(context, context->GetAttr("maxiter", &maxiter_));
            OP_REQUIRES(context, maxiter_ >= 0,
                        errors::InvalidArgument("Need maxiter >= 0, got ", maxiter_));
            OP_REQUIRES_OK(context, context->GetAttr("tol", &tol_));
          }

          void Compute(OpKernelContext* context) override {
            const Tensor& M_tensor = context->input(0);
            const Tensor& e_tensor = context->input(1);
            const int64 N = M_tensor.NumElements();

            Tensor* E_tensor = NULL;
            OP_REQUIRES_OK(context,
                           context->allocate_output(0, M_tensor.shape(), &E_tensor));

            const auto M = M_tensor.template flat<T>();
            const auto e = e_tensor.template scalar<T>()(0);
            auto E = E_tensor->template flat<T>();
            for (int64 n = 0; n < N; ++n) {
              E(n) = kepler<T>(M(n), e, maxiter_, tol_);
            }
          }

         private:
          int maxiter_;
          float tol_;
        };
  44. (verbatim repeat of the previous slide)
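    For completeness, a hedged sketch (not from the deck) of how a compiled kernel like this is typically loaded from Python with tf.load_op_library; the shared-library path and the generated op name kepler are assumptions based on the kernel above (the AstroFlow repo wraps this differently):

        import numpy as np
        import tensorflow as tf

        kepler_module = tf.load_op_library("./kepler_op.so")    # hypothetical path to the compiled op
        M = tf.constant(np.linspace(0, 2 * np.pi, 5))           # mean anomalies
        e = tf.constant(0.3, dtype=tf.float64)                  # scalar eccentricity
        E = kepler_module.kepler(M, e, maxiter=200, tol=1e-12)  # attrs match the C++ kernel above
        print(tf.Session().run(E))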
  45. What about gradients...?

  46. https://github.com/dfm/AstroFlow/blob/master/astroflow/astroflow.py

        @tf.RegisterGradient("Kepler")
        def _kepler_grad(op, *grads):
            M, e = op.inputs
            E = op.outputs[0]
            bE = grads[0]
            bM = bE / (1.0 - e * tf.cos(E))
            be = tf.reduce_sum(tf.sin(E) * bM)
            return [bM, be]
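    Where do these expressions come from? Implicitly differentiating Kepler's equation (a step not spelled out on the slide), with $M = E - e \sin E$:

        \frac{\partial E}{\partial M} = \frac{1}{1 - e \cos E}, \qquad
        \frac{\partial E}{\partial e} = \frac{\sin E}{1 - e \cos E}

    so reverse mode scales the incoming gradient $\bar{E}$ by $1/(1 - e \cos E)$ to get bM, and sums $\sin E$ times bM over elements to get be for the scalar e.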
  47. My goals for today (recap): introduce the why and how of TensorFlow; demonstrate that TensorFlow isn't just for machine learning; sketch the procedure for extending TensorFlow with custom C++ operations.
  48. Up next... github.com/dfm/tf-tutorial