
Automatic differentiation in Scala by Xiayun Sun

Shannon
April 25, 2019


Transcript

  1. PROBLEM STATEMENT
     ▸ Differentiate any (differentiable) function, to any order, automatically
     ▸ But isn’t this just the chain rule?
  2. DIFFERENTIATE PROGRAMMING FUNCTIONS!
     ▸ We want something like this (pseudocode):

         @autodiff
         def f(x):
           y = 0
           for i in 1 to 4:
             y = sin(x + y)
           return y

     ▸ Or, in Scala:

         @autodiff
         def f: Double => Double =
           x => (1 to 4).foldLeft(0d) { case (y, _) => sin(x + y) }
  3. BUT WHY
     ▸ Because it’s pretty cool: let machines do everything! \o/
     ▸ Have you heard of “deep learning” / “neural networks” / $other_buzzwords?
     ▸ Define f: input => loss
     ▸ Minimise the loss by moving the inputs slightly against the gradient of f (“gradient descent”)
     ▸ Now imagine writing arbitrary code for f without worrying about how to differentiate it (see the sketch below)
     ▸ PyTorch; TensorFlow
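
A minimal sketch of the gradient-descent idea above (illustrative only, not the talk’s code): assume some AD facility hands us the derivative of a scalar loss; the names, learning rate, and step count below are made up for the example.

    // Gradient descent on a 1-D loss, assuming `df` is the derivative of the
    // loss, e.g. produced by autodiff.
    def gradientDescent(
        df: Double => Double,  // derivative of the loss
        x0: Double,            // starting point
        lr: Double = 0.1,      // learning rate (step size)
        steps: Int = 100
    ): Double =
      (1 to steps).foldLeft(x0) { (x, _) => x - lr * df(x) }

    // Example: minimise f(x) = (x - 3)^2; its derivative is 2 * (x - 3),
    // so the iterates converge towards 3.0.
    val xMin = gradientDescent(x => 2 * (x - 3), 0.0)
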
  4. AVAILABLE APPROACHES
     ▸ Manual / Symbolic / Numerical (a finite-difference sketch of the numerical approach follows below)
     ▸ Automatic differentiation (“AD”): https://arxiv.org/pdf/1404.7456v1.pdf
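
For contrast with AD, here is a minimal sketch of the numerical approach (central finite differences); it only approximates the derivative, and the step size h is an illustrative choice.

    // Numerical differentiation: approximate f'(x) with a central difference.
    // Accuracy is limited by truncation and floating-point rounding error,
    // which is part of what motivates AD.
    def numericalDerivative(f: Double => Double, x: Double, h: Double = 1e-6): Double =
      (f(x + h) - f(x - h)) / (2 * h)

    // Example: the derivative of sin at 0 is cos(0) = 1; this returns ~1.0.
    val approx = numericalDerivative(math.sin, 0.0)
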
  5. “DUAL NUMBER” TRICK
     ▸ Replace each numerical variable by a pair of (self, derivative)
     ▸ Set `derivative` = 1 for the variable to differentiate on, 0 otherwise
     ▸ These pairs follow a set of algebraic rules, e.g. (a, a’) + (b, b’) = (a + b, a’ + b’) and (a, a’) * (b, b’) = (a * b, a’ * b + a * b’)
     ▸ Apply the function with the dual number pair and its algebra (see the sketch below)
     ▸ Magically: f(x) => f((x, x’)) = (f(x), f’(x))
     ▸ The math behind it is quite neat, see the references at the end: https://en.wikipedia.org/wiki/Automatic_differentiation
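
A minimal Scala sketch of the dual-number trick (illustrative, not the live-demo code): each value carries its derivative, and the arithmetic operations propagate derivatives by the chain rule.

    // A dual number: the value together with its derivative.
    final case class Dual(value: Double, deriv: Double) {
      def +(that: Dual): Dual = Dual(value + that.value, deriv + that.deriv)
      def -(that: Dual): Dual = Dual(value - that.value, deriv - that.deriv)
      def *(that: Dual): Dual =
        Dual(value * that.value, deriv * that.value + value * that.deriv)
    }

    object Dual {
      // sin((x, x')) = (sin x, cos x * x'), by the chain rule.
      def sin(d: Dual): Dual = Dual(math.sin(d.value), math.cos(d.value) * d.deriv)

      // Differentiate f at x: seed the derivative with 1 and read it back out.
      def derivative(f: Dual => Dual)(x: Double): Double = f(Dual(x, 1.0)).deriv
    }

    // Example: f(x) = x * x, so f'(3) = 6.
    val fPrimeAt3 = Dual.derivative(d => d * d)(3.0) // 6.0
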
  6. DO IT IN SCALA
     ▸ Operator overloading for dual numbers via implicits (sketched below)
     ▸ Source transformation via Scalameta
     ▸ Natural higher-order functions
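
A hedged sketch of the implicits bullet, reusing the Dual class above: an implicit conversion lets plain Double constants take part in Dual arithmetic, so the foldLeft example from slide 2 can be differentiated as-is. The @autodiff macro from the talk (built with Scalameta) is not shown; DualSyntax and its contents are made-up names for illustration.

    object DualSyntax {
      import scala.language.implicitConversions

      // Constants have derivative 0 with respect to the variable of interest.
      implicit def doubleToDual(x: Double): Dual = Dual(x, 0.0)

      def sin(d: Dual): Dual = Dual.sin(d)
    }

    import DualSyntax._

    // The nested-sin function from slide 2, written once against Dual:
    val f: Dual => Dual =
      x => (1 to 4).foldLeft[Dual](0d) { case (y, _) => sin(x + y) }

    // Its derivative at x = 1, via the forward-mode seed (1.0):
    val dfAt1 = Dual.derivative(f)(1.0)
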
  7. MORE INTERESTING STUFF
     ▸ What we did is “forward AD”; there’s also “reverse AD” (see the sketch below)
     ▸ Full gradient vector instead of a single derivative
     ▸ Actually useful
     ▸ Poke the AST at the compiler level: S4TF (Swift for TensorFlow)
     ▸ Lambda calculus thingy: “Lambda the Ultimate Backpropagator”, http://www-bcl.cs.may.ie/~barak/papers/toplas-reverse.pdf
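
A small illustrative sketch of reverse-mode AD (not from the talk): the forward pass records how each value was computed, and a backward pass pushes adjoints from the output back to every input, giving the whole gradient in one sweep. Real implementations use a tape or a topological ordering; this naive recursion re-walks shared subexpressions.

    // A value in the computation graph, remembering its inputs and the
    // local derivative of this node with respect to each input.
    final class RVar(val value: Double) {
      var grad: Double = 0.0
      private var inputs: List[(Double, RVar)] = Nil

      def +(that: RVar): RVar =
        record(value + that.value, List((1.0, this), (1.0, that)))
      def *(that: RVar): RVar =
        record(value * that.value, List((that.value, this), (value, that)))

      private def record(v: Double, ins: List[(Double, RVar)]): RVar = {
        val out = new RVar(v)
        out.inputs = ins
        out
      }

      // Accumulate the adjoint and push it to the inputs (chain rule).
      def backward(seed: Double = 1.0): Unit = {
        grad += seed
        inputs.foreach { case (local, in) => in.backward(seed * local) }
      }
    }

    // Example: f(x, y) = x * y + x; the gradient at (2, 3) is (y + 1, x) = (4, 2).
    val x = new RVar(2.0)
    val y = new RVar(3.0)
    val out = x * y + x
    out.backward()
    // x.grad == 4.0, y.grad == 2.0
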
  8. LINKS & REFERENCES
     ▸ Code from the live demo: search for “autodiff.scala” in the gists from @xysun
     ▸ Wikipedia on AD: https://en.wikipedia.org/wiki/Automatic_differentiation
     ▸ Really nice & easy paper: https://arxiv.org/pdf/1404.7456v1.pdf
     ▸ Math behind dual numbers (section “forward mode”): https://alexey.radul.name/ideas/2013/introduction-to-automatic-differentiation/
     ▸ Lambda paper: http://www-bcl.cs.may.ie/~barak/papers/toplas-reverse.pdf
     ▸ S4TF AD notes: https://gist.github.com/rxwei/30ba75ce092ab3b0dce4bde1fc2c9f1d
     ▸ Really interesting take on different neural networks and functional programming constructs: http://colah.github.io/posts/2015-09-NN-Types-FP/