Automatic differentiation in Scala by Xiayun Sun

AUTO DIFFERENTIATION AND DOING IT IN SCALA
XIAYUN SUN | “JOY” | BABYLON HEALTH

THIS IS A USELESS TALK - BUT IT’S KINDA FUN

PROBLEM STATEMENT
▸ Differentiate any (differentiable) function, to any order, automatically
▸ But isn’t this just the chain rule?

DIFFERENTIATE PROGRAMMING FUNCTIONS!
▸ We want something like this:
    @autodiff
    def f(x):
        y = 0
        for i in 1 to 4:
            y = sin(x+y)
        return y
▸ Or, in Scala:
    @autodiff
    def f: Double => Double = x => (1 to 4).foldLeft(0d) { case (y, _) => sin(x + y) }

THIS IS HOW IT LOOKS IN PYTORCH
▸ Relax, this is still a Scala talk

BUT WHY
▸ Because it’s pretty cool: let machines do everything! \o/
▸ Have you heard of “deep learning” / “neural network” / $other_buzzwords?
▸ Define f: input => loss
▸ Minimise loss by moving inputs slightly against the direction of the gradient of f (“gradient descent”)
▸ Now imagine writing arbitrary code for f and not worrying about how to differentiate it
▸ PyTorch; TensorFlow
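
A minimal sketch of the gradient descent loop described above, not taken from the talk. The loss function, its hand-written derivative, the learning rate and the name `GradientDescentSketch` are all assumptions chosen for illustration; the point of AD is that `lossGradient` would no longer have to be written by hand.

```scala
object GradientDescentSketch {
  // Hypothetical loss with a known minimum at x = 3.
  def loss(x: Double): Double = (x - 3.0) * (x - 3.0)

  // Derivative written by hand here; this is the part AD would generate.
  def lossGradient(x: Double): Double = 2.0 * (x - 3.0)

  // Take `steps` small steps against the gradient, starting from `start`.
  def minimise(start: Double, learningRate: Double, steps: Int): Double =
    (1 to steps).foldLeft(start) { case (x, _) => x - learningRate * lossGradient(x) }
}
```

Calling `GradientDescentSketch.minimise(0.0, 0.1, 200)` should land very close to 3.0, the minimum of this particular loss.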

AVAILABLE APPROACHES
▸ Manual / Symbolic / Numerical
▸ Auto diff (“AD”)
https://arxiv.org/pdf/1404.7456v1.pdf
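
For contrast with AD, the numerical approach can be sketched as a central finite difference. This is an illustration only: the step size `h` and the name `NumericalDiffSketch` are assumptions, and the result is an approximation whose accuracy depends on `h`, whereas AD computes the derivative exactly up to floating-point rounding.

```scala
object NumericalDiffSketch {
  // Approximates f'(x) by the slope between two nearby points around x.
  def centralDifference(f: Double => Double, x: Double, h: Double = 1e-5): Double =
    (f(x + h) - f(x - h)) / (2 * h)
}
```

For example, `NumericalDiffSketch.centralDifference(x => math.sin(x), 1.0)` should be close to `math.cos(1.0)`, but not exactly equal to it.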

“DUAL NUMBER” TRICK
▸ Replace each numerical variable with a pair of (self, derivative)
▸ Set `derivative` = 1 for the variable to differentiate on, 0 otherwise
▸ These pairs follow a set of algebraic rules:
▸ Apply the function now with the dual number pair and its algebra:
▸ Magically: f(x) => f((x, x’)) = (f(x), f’(x))
▸ The math behind it is quite neat; see the references at the end
https://en.wikipedia.org/wiki/Automatic_differentiation
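
A minimal sketch of the dual number trick, assuming a hand-rolled `Dual` case class; this is not the code from the live demo. The algebraic rules are the usual sum, product, quotient and chain rules applied to the `derivative` component, and all names here (`Dual`, `DualNumberSketch`, `fAndDerivativeAt`) are made up for illustration. The function `f` is the iterated `sin(x + y)` from the earlier slide.

```scala
object DualNumberSketch {
  // A value paired with its derivative with respect to the chosen variable.
  final case class Dual(value: Double, derivative: Double) {
    // Sum rule: (u + v)' = u' + v'
    def +(that: Dual): Dual = Dual(value + that.value, derivative + that.derivative)
    // Difference rule: (u - v)' = u' - v'
    def -(that: Dual): Dual = Dual(value - that.value, derivative - that.derivative)
    // Product rule: (u * v)' = u' * v + u * v'
    def *(that: Dual): Dual =
      Dual(value * that.value, derivative * that.value + value * that.derivative)
    // Quotient rule: (u / v)' = (u' * v - u * v') / v^2
    def /(that: Dual): Dual =
      Dual(
        value / that.value,
        (derivative * that.value - value * that.derivative) / (that.value * that.value)
      )
  }

  // Chain rule for sin: (sin u)' = cos(u) * u'
  def sin(d: Dual): Dual = Dual(math.sin(d.value), math.cos(d.value) * d.derivative)

  // The function from the slides, written against Dual instead of Double.
  def f(x: Dual): Dual =
    (1 to 4).foldLeft(Dual(0.0, 0.0)) { case (y, _) => sin(x + y) }

  // Seed derivative = 1 for the variable we differentiate on.
  def fAndDerivativeAt(x: Double): (Double, Double) = {
    val result = f(Dual(x, 1.0))
    (result.value, result.derivative)
  }
}
```

`DualNumberSketch.fAndDerivativeAt(0.5)` returns the pair (f(0.5), f’(0.5)) in a single pass through the loop, with no symbolic expression ever being built.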

DO IT IN SCALA
▸ Operator overloading for dual numbers via implicits
▸ Source transformation via Scalameta
▸ Natural higher-order functions
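
A sketch of how the first and third bullets might fit together, assuming Scala 2 style implicits; it is not the demo code, and the names `ImplicitDualSketch`, `constant` and `derivative` are hypothetical. An implicit conversion lifts plain `Double` constants into duals so mixed expressions typecheck, and a higher-order function turns a function on duals into its derivative. The Scalameta source transformation is left out here.

```scala
object ImplicitDualSketch {
  import scala.language.implicitConversions

  final case class Dual(value: Double, derivative: Double) {
    def +(that: Dual): Dual = Dual(value + that.value, derivative + that.derivative)
    def *(that: Dual): Dual =
      Dual(value * that.value, derivative * that.value + value * that.derivative)
  }

  // Constants have derivative 0; this lets `x * 2.0` and `2.0 * x` both compile.
  implicit def constant(c: Double): Dual = Dual(c, 0.0)

  // Chain rule for sin: (sin u)' = cos(u) * u'
  def sin(d: Dual): Dual = Dual(math.sin(d.value), math.cos(d.value) * d.derivative)

  // Higher-order function: takes f on duals, returns f' as a plain function.
  def derivative(f: Dual => Dual): Double => Double =
    x => f(Dual(x, 1.0)).derivative
}
```

With the implicit in scope, `ImplicitDualSketch.derivative(x => x * x + 3.0 * x)` gives a `Double => Double` that evaluates 2x + 3. Going to higher orders this way would need duals nested inside duals, which this `Double`-only sketch does not support.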

⚠ LIVE CODING ⚠

MORE INTERESTING STUFF
▸ What we did is “forward AD”; there’s also “reverse AD”
▸ Full gradient vector instead of single derivative
▸ Actually useful
▸ Poke AST at compiler level
▸ S4TF
▸ Lambda calculus thingy
▸ “Lambda the Ultimate Backpropagator”
▸ http://www-bcl.cs.may.ie/~barak/papers/toplas-reverse.pdf
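
To make the forward/reverse distinction concrete, here is a deliberately naive sketch of reverse AD, not drawn from the talk or the linked paper. Each `Node` remembers its inputs and the local partial derivatives; one backward pass then fills in the derivative of the output with respect to every input at once. All names are assumptions, and the recursive `backward` revisits shared sub-expressions once per path, which a real implementation would avoid by walking the graph in topological order.

```scala
object ReverseModeSketch {
  // A node in the computation graph. `parents` pairs each input node with the
  // local partial derivative of this node with respect to that input.
  final class Node(val value: Double, val parents: List[(Node, Double)]) {
    // Accumulated d(output)/d(this node), filled in by `backward`.
    var adjoint: Double = 0.0

    def +(that: Node): Node =
      new Node(value + that.value, List((this, 1.0), (that, 1.0)))
    def *(that: Node): Node =
      new Node(value * that.value, List((this, that.value), (that, value)))
  }

  def variable(x: Double): Node = new Node(x, Nil)

  def sin(n: Node): Node =
    new Node(math.sin(n.value), List((n, math.cos(n.value))))

  // Push the upstream derivative back along every path to the inputs,
  // multiplying by local partials and summing contributions per node.
  def backward(node: Node, upstream: Double = 1.0): Unit = {
    node.adjoint += upstream
    node.parents.foreach { case (parent, local) => backward(parent, upstream * local) }
  }

  // Full gradient of a two-argument function from a single backward pass.
  def gradient(f: (Node, Node) => Node)(x: Double, y: Double): (Double, Double) = {
    val vx = variable(x)
    val vy = variable(y)
    backward(f(vx, vy))
    (vx.adjoint, vy.adjoint)
  }
}
```

For f(x, y) = x * y + sin(x), `gradient` should return (y + cos(x), x), i.e. both partial derivatives from one pass, where the forward approach needs one pass per input.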

LINKS & REFERENCES
▸ Code during live demo: search for “autodiff.scala” in gist from @xysun
▸ Wikipedia on AD: https://en.wikipedia.org/wiki/Automatic_differentiation
▸ Really nice & easy paper: https://arxiv.org/pdf/1404.7456v1.pdf
▸ Math behind dual numbers (section “forward mode”): https://alexey.radul.name/ideas/2013/introduction-to-automatic-differentiation/
▸ Lambda paper: http://www-bcl.cs.may.ie/~barak/papers/toplas-reverse.pdf
▸ S4TF AD notes: https://gist.github.com/rxwei/30ba75ce092ab3b0dce4bde1fc2c9f1d
▸ Really interesting take on different neural networks and functional programming constructs: http://colah.github.io/posts/2015-09-NN-Types-FP/
