
Automatic differentiation in Scala by Xiayun Sun

Shannon
April 25, 2019

Transcript

  1. AUTO DIFFERENTIATION
    AND DOING IT IN SCALA
    XIAYUN SUN | “JOY” | BABYLON HEALTH


  2. THIS IS A
    USELESS TALK
    - BUT IT’S KINDA FUN


  3. PROBLEM STATEMENT
    ▸ Differentiate any (differentiable) function, to any order,
    automatically
    ▸ But isn’t this just the chain rule?


  4. DIFFERENTIATE PROGRAMMING FUNCTIONS!
    ▸ We want something like this:
    @autodiff
    def f(x):
        y = 0
        for i in 1 to 4:
            y = sin(x+y)
        return y
    ▸ Or, in Scala:
    @autodiff
    def f: Double => Double = x =>
      (1 to 4).foldLeft(0d) { case (y, _) => sin(x + y) }


  5. THIS IS HOW IT LOOKS IN PYTORCH
    ▸ Relax — this is still a Scala talk


  6. BUT WHY
    ▸ Because it’s pretty cool — let machines do everything! \o/
    ▸ Have you heard of “deep learning” / “neural network” /
    $other_buzzwords?
    ▸ Define f: input => loss
    ▸ Minimise loss by moving inputs slightly against the direction of the
    gradient of f (“gradient descent”)
    ▸ Now imagine writing arbitrary code for f without worrying about how to
    differentiate it
    ▸ PyTorch; TensorFlow
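
The gradient descent loop described above can be sketched in a few lines of Scala. This is only an illustration: the loss function, its hand-written derivative `lossGrad`, and the names `minimise` and `learningRate` are all assumptions made up for this example, not something from the talk. The derivative is written by hand here, which is exactly the step automatic differentiation is meant to remove.

```scala
object GradientDescentSketch {
  // Hypothetical loss for illustration only: its minimum is at x = 3.
  def loss(x: Double): Double = (x - 3.0) * (x - 3.0)

  // Hand-written derivative of `loss`; AD would derive this for us.
  def lossGrad(x: Double): Double = 2.0 * (x - 3.0)

  // Repeatedly take a small step against the gradient.
  def minimise(start: Double, learningRate: Double, steps: Int): Double =
    (1 to steps).foldLeft(start) { case (x, _) => x - learningRate * lossGrad(x) }
}
```

Calling `GradientDescentSketch.minimise(0.0, 0.1, 200)` should land very close to 3, the minimum of this toy loss.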


  7. AVAILABLE APPROACHES
    ▸ Manual / Symbolic / Numerical
    ▸ Auto diff (“AD”)
    https://arxiv.org/pdf/1404.7456v1.pdf
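
For comparison with AD, the “Numerical” approach can be sketched with a central difference. The helper name `numericalDerivative` and the step size `h` are assumptions chosen for this example. The result is only an approximation whose accuracy depends on `h`, which is one reason to prefer AD.

```scala
object NumericalDiffSketch {
  import scala.math.sin

  // The function from the earlier slide, written as plain Scala.
  val f: Double => Double = x =>
    (1 to 4).foldLeft(0d) { case (y, _) => sin(x + y) }

  // Central difference: g'(x) is approximated by (g(x + h) - g(x - h)) / (2 * h).
  def numericalDerivative(g: Double => Double, h: Double = 1e-5): Double => Double =
    x => (g(x + h) - g(x - h)) / (2 * h)
}
```

For example, `NumericalDiffSketch.numericalDerivative(NumericalDiffSketch.f)` returns a `Double => Double` that approximates f’.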


  8. “DUAL NUMBER” TRICK
    ▸ Replace each numerical variable by a pair of (self, derivative)
    ▸ Set `derivative` = 1 for the variable to differentiate with respect to, 0 otherwise
    ▸ These pairs follow a set of algebraic rules, e.g.:
    (a, a’) + (b, b’) = (a + b, a’ + b’)
    (a, a’) × (b, b’) = (a·b, a’·b + a·b’)
    ▸ Now apply the function to the dual number pair, using its algebra:
    ▸ Magically, with x’ = 1: f(x) => f((x, x’)) = (f(x), f’(x))
    ▸ The math behind it is quite neat; see the references at the end
    https://en.wikipedia.org/wiki/Automatic_differentiation
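
The dual number trick can be sketched as a small Scala case class. This is a minimal illustration, not the code from the live demo: the names `Dual`, `variable`, `constant` and the example function `g` are assumptions introduced here, and only `+`, `*` and `sin` are covered.

```scala
object DualNumberSketch {
  // A dual number: a value together with the derivative carried alongside it.
  final case class Dual(value: Double, derivative: Double) {
    // Sum rule: (a, a') + (b, b') = (a + b, a' + b')
    def +(that: Dual): Dual =
      Dual(value + that.value, derivative + that.derivative)

    // Product rule: (a, a') * (b, b') = (a * b, a' * b + a * b')
    def *(that: Dual): Dual =
      Dual(value * that.value, derivative * that.value + value * that.derivative)
  }

  // Chain rule for sin: sin((a, a')) = (sin(a), cos(a) * a')
  def sin(d: Dual): Dual =
    Dual(math.sin(d.value), math.cos(d.value) * d.derivative)

  // Seed the derivative with 1 for the variable we differentiate on...
  def variable(x: Double): Dual = Dual(x, 1.0)

  // ...and with 0 for everything else.
  def constant(c: Double): Dual = Dual(c, 0.0)

  // Hypothetical example: g(x) = x * x + sin(x), so g'(x) = 2x + cos(x).
  def g(x: Dual): Dual = x * x + sin(x)
}
```

Evaluating `DualNumberSketch.g(DualNumberSketch.variable(2.0))` yields a `Dual` whose `value` is g(2) and whose `derivative` is g’(2), both computed in a single pass.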


  9. DO IT IN SCALA
    ▸ Operator overloading for dual number via implicits
    ▸ Source transformation via Scalameta
    ▸ Natural higher order functions
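
Two of the points above, implicits and higher-order functions, can be sketched together. This is an assumption-laden illustration rather than the speaker's demo code: `Dual`, `doubleToDual`, `derivative` and `g` are hypothetical names, and the Scalameta source transformation is not shown. The implicit conversion lets plain `Double` literals mix with dual numbers, and `derivative` is an ordinary function that takes a function and returns its derivative.

```scala
object ImplicitDualSketch {
  import scala.language.implicitConversions

  final case class Dual(value: Double, derivative: Double) {
    def +(that: Dual): Dual =
      Dual(value + that.value, derivative + that.derivative)
    def *(that: Dual): Dual =
      Dual(value * that.value, derivative * that.value + value * that.derivative)
  }

  // Plain Doubles are lifted to constants, i.e. duals with derivative 0.
  implicit def doubleToDual(c: Double): Dual = Dual(c, 0.0)

  def sin(d: Dual): Dual =
    Dual(math.sin(d.value), math.cos(d.value) * d.derivative)

  // Higher-order function: turns f into its derivative by seeding x with 1.
  def derivative(f: Dual => Dual): Double => Double =
    x => f(Dual(x, 1.0)).derivative

  // The function from the earlier slide, now written over dual numbers.
  val f: Dual => Dual = x =>
    (1 to 4).foldLeft(Dual(0.0, 0.0)) { case (y, _) => sin(x + y) }

  // Mixed arithmetic: 3.0 and 1.0 are converted to Dual implicitly.
  val g: Dual => Dual = x => x * 3.0 + 1.0
}
```

With this in place, `ImplicitDualSketch.derivative(ImplicitDualSketch.g)(5.0)` evaluates to `3.0`, and `derivative(f)` differentiates through the `foldLeft` loop without any special handling.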


  10. ⚠ LIVE CODING ⚠


  11. MORE INTERESTING STUFF
    ▸ What we did is “forward AD”, there’s also “reverse AD”
    ▸ Full gradient vector instead of single derivative
    ▸ Actually useful
    ▸ Poke at the AST at the compiler level
    ▸ S4TF (Swift for TensorFlow)
    ▸ Lambda calculus thingy
    ▸ “Lambda the Ultimate Backpropagator”
    ▸ http://www-bcl.cs.may.ie/~barak/papers/toplas-reverse.pdf
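
To give a feel for how “reverse AD” produces a full gradient vector from one evaluation, here is a deliberately naive sketch. It is not taken from the talk or from the paper linked above: `Node`, `backward` and `gradient` are hypothetical names, and the recursive back-propagation revisits shared sub-expressions once per path, which a real implementation would avoid by processing nodes in topological order.

```scala
object ReverseAdSketch {
  // A node in the computation graph.
  // `parents` holds each input node with the local partial derivative w.r.t. it.
  final class Node(val value: Double, val parents: List[(Node, Double)] = Nil) {
    var grad: Double = 0.0

    def +(that: Node): Node =
      new Node(value + that.value, List((this, 1.0), (that, 1.0)))

    def *(that: Node): Node =
      new Node(value * that.value, List((this, that.value), (that, value)))
  }

  def sin(n: Node): Node =
    new Node(math.sin(n.value), List((n, math.cos(n.value))))

  // Push the output's sensitivity back along every path to the inputs,
  // multiplying by local derivatives and summing contributions at each node.
  def backward(node: Node, seed: Double = 1.0): Unit = {
    node.grad += seed
    node.parents.foreach { case (p, local) => backward(p, seed * local) }
  }

  // One forward evaluation plus one backward sweep gives all partial derivatives.
  def gradient(f: Seq[Node] => Node, inputs: Seq[Double]): Seq[Double] = {
    val vars = inputs.map(x => new Node(x)).toVector
    backward(f(vars))
    vars.map(_.grad)
  }
}
```

For a hypothetical f(x, y) = x * y + sin(x), `ReverseAdSketch.gradient(v => v(0) * v(1) + ReverseAdSketch.sin(v(0)), Seq(2.0, 3.0))` returns both partial derivatives, (y + cos(x), x), from a single backward sweep.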


  12. LINKS & REFERENCES
    ▸ Code from the live demo: search for “autodiff.scala” in the gists from @xysun
    ▸ Wikipedia on AD: https://en.wikipedia.org/wiki/Automatic_differentiation
    ▸ Really nice & easy paper: https://arxiv.org/pdf/1404.7456v1.pdf
    ▸ Math behind dual numbers (section “forward mode”):
    https://alexey.radul.name/ideas/2013/introduction-to-automatic-differentiation/
    ▸ Lambda paper: http://www-bcl.cs.may.ie/~barak/papers/toplas-reverse.pdf
    ▸ S4TF AD notes:
    https://gist.github.com/rxwei/30ba75ce092ab3b0dce4bde1fc2c9f1d
    ▸ Really interesting take on different neural networks and functional programming
    constructs: http://colah.github.io/posts/2015-09-NN-Types-FP/
