TensorFlow for astronomers

Dan Foreman-Mackey

April 06, 2018

Transcript

  1. TensorFlow 101 for Astronomers
    A non-traditional introduction to model building with TensorFlow
    Dan Foreman-Mackey
    CCA@Flatiron / dfm.io / @exoplaneteer / github.com/dfm

  2. WARNING

    This is not a talk about Machine
    Learning™. If you're only here for the
    neural networks, please feel free to leave
    now <3

  3. (image slide)

  4. (image slide)

  5. WARNING

    I am not a TensorFlow expert. I only
    started learning a few months ago. That
    being said, I'm stoked.

  6. (image slide)

  7. So why am I here?

  8. Blame Peter.

  9. [Google Trends: search interest in "TensorFlow" vs. "astronomy", January 2016 to April 2018]
    https://trends.google.com/trends/explore?date=2016-01-01%202018-04-01&q=TensorFlow,astronomy

    View Slide

  10. But seriously...
    I think that TensorFlow has the potential to be broadly used
    for data analysis problems all across astronomy.

  11. What is TensorFlow?
    NumPy + special sauce, basically.
    This is a very biased description...

  12. What is TensorFlow?
    NumPy + special sauce, basically.
    Fast linear algebra, GPU support, automatic differentiation, ...
    This is a very biased description...

  13. My goals for today
    Introduce the why and how of TensorFlow.
    Demonstrate that TensorFlow isn't just for machine learning.
    Sketch the procedure for extending TensorFlow with custom C++ operations.

  14. Why should you care?
    GPU support for some operations (esp. linear algebra).
    Extensible with custom C/C++ code.
    Simple (Python) interface for flexible but high-performance
    model building + gradients.
    Did I mention GRADIENTS?!?
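
    To make "gradients" concrete, here is a minimal sketch (assuming the
    TF 1.x API used throughout this deck) of getting a derivative for free:

    import tensorflow as tf

    x = tf.Variable(3.0, dtype=tf.float64)
    y = x ** 2
    grad = tf.gradients(y, [x])[0]  # builds the graph for dy/dx = 2 x

    with tf.Session() as session:
        session.run(tf.global_variables_initializer())
        print(session.run(grad))  # 6.0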

  15. All of science
    p(data | model)

  16. All of science
    p(data | model)
    Then, use some numerical algorithm to fit for the model parameters...
    e.g. scipy.optimize.minimize, MCMC, etc.

  17. All of science
    p(data | model)
    A lot of code...
    Then, use some numerical algorithm to fit for the model parameters...
    e.g. scipy.optimize.minimize, MCMC, etc.

  18. All of science
    Then, use some numerical algorithm to fit for the model parameters...
    e.g. scipy.optimize.minimize, MCMC, etc.
    d p(data | model) / d model
    A lot of code...

  19. All of science
    OMG SO MANY CODEZ!!!!!
    Then, use some numerical algorithm to fit for the model parameters...
    e.g. scipy.optimize.minimize, MCMC, etc.
    d p(data | model) / d model
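
    To see why this means "so many codez", here is a hedged sketch of doing it
    by hand with scipy: you write the objective and derive its gradient
    yourself (all names below are illustrative, not from the slides):

    import numpy as np
    from scipy.optimize import minimize

    def nll(theta, x, y):
        # negative log-likelihood for y = m*x + b with unit error bars
        m, b = theta
        r = y - (m * x + b)
        return 0.5 * np.sum(r ** 2)

    def grad_nll(theta, x, y):
        # the gradient you must derive (and re-derive!) by hand
        m, b = theta
        r = y - (m * x + b)
        return np.array([-np.sum(r * x), -np.sum(r)])

    x = np.linspace(0, 1, 50)
    y = 0.5 * x - 0.2
    result = minimize(nll, [0.0, 0.0], args=(x, y), jac=grad_nll)
    print(result.x)  # approximately [0.5, -0.2]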

  20. Now what if you want to change the parameterization?

  21. TensorFlow offers
    Simple Python interface, but C speed (or faster)
    easy to implement/debug, fits into existing Python stack easily, etc.
    Automatic differentiation of models
    no need to manually compute gradients, even if parameterization changes
    Extensibility and customization
    you can incorporate your existing C code (with some caveats)

  22. Automatic differentiation
    An aside

  23. y = f(x)
    your computer program

    schematically...
    y = f_1(f_2(\cdots f_N(x) \cdots))

    y' = f_1'(f_2(\cdots f_N(x) \cdots)) \, f_2'(\cdots f_N(x) \cdots) \cdots f_N'(x)
    hint: this is the chain rule

  26. Automatic differentiation
    The chain rule + compiled code...
    ...leads to efficient calculation of gradients of complicated models.
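
    A quick sketch (TF 1.x) checking the chain rule numerically, with
    f1 = sin and f2 = exp as stand-in functions:

    import numpy as np
    import tensorflow as tf

    x = tf.constant(0.5, dtype=tf.float64)
    y = tf.sin(tf.exp(x))            # y = f1(f2(x))
    grad = tf.gradients(y, [x])[0]   # autodiff applies the chain rule

    with tf.Session() as session:
        print(session.run(grad))                  # autodiff answer
        print(np.cos(np.exp(0.5)) * np.exp(0.5))  # cos(exp(x)) * exp(x), by hand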

  27. Linear regression
    A simple example

  28. To start
    pip install tensorflow

  29. Linear regression
    import numpy as np
    ivar = 1.0 / yerr**2
    A = np.vander(x, 2)
    ATA = np.dot(A.T, A*ivar[:, None])
    ATy = np.dot(A.T, y*ivar)
    w = np.linalg.solve(ATA, ATy)
    print(w)

  30. Linear regression
    w = (A^T C^{-1} A)^{-1} (A^T C^{-1} y)

    import numpy as np
    ivar = 1.0 / yerr**2
    A = np.vander(x, 2)
    ATA = np.dot(A.T, A*ivar[:, None])
    ATy = np.dot(A.T, y*ivar)
    w = np.linalg.solve(ATA, ATy)
    print(w)
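
    The slides assume x, y, and yerr already exist; a made-up setup (purely
    illustrative, not from the talk) that makes the snippets runnable:

    import numpy as np

    np.random.seed(42)
    x = np.sort(np.random.uniform(0, 10, 50))
    yerr = 0.1 * np.ones_like(x)
    y = 0.5 * x - 0.2 + yerr * np.random.randn(len(x))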

  31. Linear regression
    import tensorflow as tf
    ivar = 1.0 / yerr**2
    A = np.vander(x, 2)
    ATA = tf.matmul(A, A*ivar[:, None], transpose_a=True)
    ATy = tf.matmul(A, (y*ivar)[:, None], transpose_a=True)
    w = tf.linalg.solve(ATA, ATy)
    print(tf.Session().run(w))

  32. Linear regression
    import tensorflow as tf
    ivar = 1.0 / yerr**2
    A = np.vander(x, 2)
    ATA = tf.matmul(A, A*ivar[:, None], transpose_a=True)
    ATy = tf.matmul(A, (y*ivar)[:, None], transpose_a=True)
    w = tf.linalg.solve(ATA, ATy)
    print(tf.Session().run(w))

    >>> print(w)
    Tensor("MatrixSolve_12:0", shape=(2, 1), dtype=float64)

  33. Linear regression
    # define operation "graph"
    import tensorflow as tf
    ivar = 1.0 / yerr**2
    A = np.vander(x, 2)
    ATA = tf.matmul(A, A*ivar[:, None], transpose_a=True)
    ATy = tf.matmul(A, (y*ivar)[:, None], transpose_a=True)
    w = tf.linalg.solve(ATA, ATy)
    print(tf.Session().run(w))

    >>> print(w)
    Tensor("MatrixSolve_12:0", shape=(2, 1), dtype=float64)

  34. Linear regression
    # define operation "graph"
    import tensorflow as tf
    ivar = 1.0 / yerr**2
    A = np.vander(x, 2)
    ATA = tf.matmul(A, A*ivar[:, None], transpose_a=True)
    ATy = tf.matmul(A, (y*ivar)[:, None], transpose_a=True)
    w = tf.linalg.solve(ATA, ATy)
    # execute the operations
    print(tf.Session().run(w))

    >>> print(w)
    Tensor("MatrixSolve_12:0", shape=(2, 1), dtype=float64)
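
    One payoff of the define-then-execute split (a sketch, assuming TF 1.x
    placeholders; not shown in the talk): the same graph can be re-run on
    new data without being rebuilt.

    import numpy as np
    import tensorflow as tf

    x_in = tf.placeholder(tf.float64, shape=(None,))  # data slot, filled at run time
    y_out = 2.0 * x_in + 1.0                          # graph defined once

    with tf.Session() as session:
        print(session.run(y_out, feed_dict={x_in: np.array([1.0, 2.0])}))
        print(session.run(y_out, feed_dict={x_in: np.array([3.0, 4.0, 5.0])}))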

  35. Linear regression with unknown error bars
    Another simple example
    \mathcal{L} = -\frac{1}{2} \sum_{n=1}^{N} \left[ \frac{(y_n - m x_n - b)^2}{s^2} + \log(2 \pi s^2) \right]

  36. m = tf.Variable(0.5, dtype=tf.float64)
    b = tf.Variable(-0.2, dtype=tf.float64)
    s = tf.Variable(0.1, dtype=tf.float64)
    model = m * x + b
    log_like = -0.5 * tf.reduce_sum(((y - model) / s) ** 2)
    log_like -= 0.5 * len(x) * tf.log(2*np.pi*s**2)
    \mathcal{L} = -\frac{1}{2} \sum_{n=1}^{N} \left[ \frac{(y_n - m x_n - b)^2}{s^2} + \log(2 \pi s^2) \right]

  37. m = tf.Variable(0.5, dtype=tf.float64)
    b = tf.Variable(-0.2, dtype=tf.float64)
    s = tf.Variable(0.1, dtype=tf.float64)
    model = m * x + b
    log_like = -0.5 * tf.reduce_sum(((y - model) / s) ** 2)
    log_like -= 0.5 * len(x) * tf.log(2*np.pi*s**2)
    with tf.Session() as session:
        session.run(tf.global_variables_initializer())
        print(session.run(log_like))

  38. m = tf.Variable(0.5, dtype=tf.float64)
    b = tf.Variable(-0.2, dtype=tf.float64)
    s = tf.Variable(0.1, dtype=tf.float64)
    model = m * x + b
    log_like = -0.5 * tf.reduce_sum(((y - model) / s) ** 2)
    log_like -= 0.5 * len(x) * tf.log(2*np.pi*s**2)
    with tf.Session() as session:
        session.run(tf.global_variables_initializer())
        print(session.run(log_like))
        print(session.run(tf.gradients(log_like, [m, b, s])))

  39. m = tf.Variable(0.5, dtype=tf.float64)
    b = tf.Variable(-0.2, dtype=tf.float64)
    log_s = tf.Variable(np.log(0.1), dtype=tf.float64)
    s = tf.exp(log_s)
    model = m * x + b
    log_like = -0.5 * tf.reduce_sum(((y - model) / s) ** 2)
    log_like -= 0.5 * len(x) * tf.log(2*np.pi*s**2)
    with tf.Session() as session:
        session.run(tf.global_variables_initializer())
        print(session.run(log_like))
        print(session.run(tf.gradients(log_like, [m, b, log_s])))

  40. To fit...
    opt = tf.contrib.opt.ScipyOptimizerInterface(
        -log_like, [m, b, log_s])
    opt.minimize(session)

    INFO:tensorflow:Optimization terminated with:
    Message: b'CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL'
    Objective function value: -20.356501
    Number of iterations: 19
    Number of functions evaluations: 21
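
    ScipyOptimizerInterface updates the variables in place, so (assuming the
    session from the previous slides) the best-fit values can be read off with:

    print(session.run([m, b, log_s]))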

  41. Extending TensorFlow
    You already have code written in C/C++/Fortran/etc.
    You can use that inside of TensorFlow (with only a little boilerplate).

  42. Extending TensorFlow
    A solver for Kepler's equation...
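
    Kepler's equation relates the mean anomaly M to the eccentric anomaly E
    through M = E - e sin(E), which must be solved for E numerically. The C++
    kernel on the next slide calls a kepler() helper that isn't shown; a sketch
    of what such a Newton solver might look like (in Python, with an assumed
    interface matching the call site):

    import numpy as np

    def kepler(M, e, maxiter=200, tol=1e-10):
        # Newton's method on g(E) = E - e*sin(E) - M, with g'(E) = 1 - e*cos(E)
        E = M if e < 0.8 else np.pi  # a common starting guess
        for _ in range(maxiter):
            delta = (E - e * np.sin(E) - M) / (1.0 - e * np.cos(E))
            E -= delta
            if np.abs(delta) < tol:
                break
        return E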

  43. https://github.com/dfm/AstroFlow/blob/master/astroflow/ops/kepler_op.cc
    template <typename T>
    class KeplerOp : public OpKernel {
     public:
      explicit KeplerOp(OpKernelConstruction* context) : OpKernel(context) {
        OP_REQUIRES_OK(context, context->GetAttr("maxiter", &maxiter_));
        OP_REQUIRES(context, maxiter_ >= 0,
                    errors::InvalidArgument("Need maxiter >= 0, got ", maxiter_));
        OP_REQUIRES_OK(context, context->GetAttr("tol", &tol_));
      }

      void Compute(OpKernelContext* context) override {
        const Tensor& M_tensor = context->input(0);
        const Tensor& e_tensor = context->input(1);
        const int64 N = M_tensor.NumElements();
        Tensor* E_tensor = NULL;
        OP_REQUIRES_OK(context,
                       context->allocate_output(0, M_tensor.shape(), &E_tensor));
        const auto M = M_tensor.template flat<T>();
        const auto e = e_tensor.template scalar<T>()(0);
        auto E = E_tensor->template flat<T>();
        for (int64 n = 0; n < N; ++n) {
          E(n) = kepler(M(n), e, maxiter_, tol_);
        }
      }

     private:
      int maxiter_;
      float tol_;
    };

  44. https://github.com/dfm/AstroFlow/blob/master/astroflow/ops/kepler_op.cc
    (same code as the previous slide)

  45. What about gradients...?

  46. https://github.com/dfm/AstroFlow/blob/master/astroflow/astroflow.py
    @tf.RegisterGradient("Kepler")
    def _kepler_grad(op, *grads):
        M, e = op.inputs
        E = op.outputs[0]
        bE = grads[0]
        bM = bE / (1.0 - e * tf.cos(E))
        be = tf.reduce_sum(tf.sin(E) * bM)
        return [bM, be]
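
    Those two lines come from implicit differentiation of Kepler's equation,
    M = E - e sin E, rather than autodiff through the iterative solver:

    \frac{\partial E}{\partial M} = \frac{1}{1 - e \cos E},
    \qquad
    \frac{\partial E}{\partial e} = \frac{\sin E}{1 - e \cos E}

    Reverse mode then multiplies the upstream gradient bE by these factors,
    summing over all elements for the scalar e (hence the tf.reduce_sum).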

  47. My goals for today
    Introduce the why and how of TensorFlow.
    Demonstrate that TensorFlow isn't just for machine learning.
    Sketch the procedure for extending TensorFlow with custom C++ operations.

  48. Up next...
    github.com/dfm/tf-tutorial
