The People Behind This Project
• Darrell Chua (@cfgt), Data Scientist, OnDeck
• Gareth Seneque (@garethseneque), Data Engineer, ABC
• Makoto Ito (@ynqa), Machine Learning Engineer, Mercari
• Xuanyi Chew (@chewxy), Chief Data Scientist, Ordermentum

Why Go?
• Many re-implementations of AlphaGo.
• All in Python and with TensorFlow.
• If only there were a library for deep learning in Go out there…!

Gorgonia
The Gorgonia family of libraries for Deep Learning:
• gorgonia.org/gorgonia
• gorgonia.org/tensor
• gorgonia.org/cu
• gorgonia.org/dawson
• gorgonia.org/randomkit
• gorgonia.org/vecf64
• gorgonia.org/vecf32

How does Gorgonia Work?
1. Create an expression graph.
2. Populate the expression graph with values.
3. Walk towards the root.

[Figure: expression graph with leaf nodes x = 1, w = 2 and b = 3, and operation nodes mul, add and σ.]
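The three steps above can be sketched with a toy expression graph in plain Go. This is not Gorgonia's API: `Node`, `Leaf`, `Apply`, `Eval` and `BuildExample` are hypothetical names made up for illustration, and the graph shape σ(x·w + b) is an assumption read off the node labels in the figure.

```go
package main

import (
	"fmt"
	"math"
)

// Node is one vertex of a toy expression graph. Leaves hold a value;
// interior nodes hold an operation over their children.
type Node struct {
	name     string
	op       func(args []float64) float64 // nil for leaves
	children []*Node
	value    float64
}

// Leaf creates a placeholder node that has no value yet (step 1).
func Leaf(name string) *Node { return &Node{name: name} }

// Apply creates an interior node that applies op to its children (step 1).
func Apply(name string, op func([]float64) float64, children ...*Node) *Node {
	return &Node{name: name, op: op, children: children}
}

// Eval walks from the leaves towards the root (step 3): every child is
// evaluated before the operation that consumes it.
func Eval(n *Node) float64 {
	if n.op == nil {
		return n.value
	}
	args := make([]float64, len(n.children))
	for i, c := range n.children {
		args[i] = Eval(c)
	}
	n.value = n.op(args)
	return n.value
}

func mul(a []float64) float64     { return a[0] * a[1] }
func add(a []float64) float64     { return a[0] + a[1] }
func sigmoid(a []float64) float64 { return 1 / (1 + math.Exp(-a[0])) }

// BuildExample builds the assumed graph σ(x·w + b) and returns the root
// together with its three leaves.
func BuildExample() (root, x, w, b *Node) {
	x, w, b = Leaf("x"), Leaf("w"), Leaf("b")
	xw := Apply("mul", mul, x, w)
	sum := Apply("add", add, xw, b)
	root = Apply("σ", sigmoid, sum)
	return root, x, w, b
}

func main() {
	root, x, w, b := BuildExample()
	// Step 2: populate the graph with the values shown on the slide.
	x.value, w.value, b.value = 1, 2, 3
	fmt.Printf("σ(x·w + b) = %.4f\n", Eval(root)) // prints σ(x·w + b) = 0.9933
}
```

Separating graph construction from evaluation is the point of the sketch: the same graph can be re-evaluated with different leaf values without being rebuilt.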

Deep Neural Network Architectures
Deep neural networks are formed by many layers.

[Figure: Input, many layers in between (Convolution Layer, Fully Connected Layer), Prediction.]

Two Components of AlphaGo
• Neural network detects patterns on the game board and makes decisions on where to best place a piece

[Figure, built up over several slides: Input, Convolution Layers, Residual Layers, and two outputs, Policy and Value. Example outputs: Policy = 0.1 0.1 0.1 0.1 0.2 0.1 0.1 0.1 ..., Value = 0.8.]
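The two outputs can be sketched as follows, assuming the usual arrangement in which the policy output is a softmax over per-move scores and the value output is a tanh of a single score. The slides do not state these activations, and `Softmax`, `ValueHead` and the input scores are hypothetical, chosen so the result resembles the numbers in the figure.

```go
package main

import (
	"fmt"
	"math"
)

// Softmax turns raw per-move scores (logits) into a probability
// distribution. Subtracting the largest logit keeps Exp numerically stable.
func Softmax(logits []float64) []float64 {
	hi := math.Inf(-1)
	for _, l := range logits {
		if l > hi {
			hi = l
		}
	}
	out := make([]float64, len(logits))
	sum := 0.0
	for i, l := range logits {
		out[i] = math.Exp(l - hi)
		sum += out[i]
	}
	for i := range out {
		out[i] /= sum
	}
	return out
}

// ValueHead squashes a raw score into the range [-1, 1], read here as
// "how good is this position for the player to move".
func ValueHead(raw float64) float64 { return math.Tanh(raw) }

func main() {
	// Hypothetical logits: one move scored higher than the other eight.
	logits := []float64{0, 0, 0, 0, math.Log(2), 0, 0, 0, 0}
	policy := Softmax(logits)
	value := ValueHead(math.Atanh(0.8)) // raw score chosen to land near 0.8

	fmt.Printf("policy: %.1f\n", policy)
	fmt.Printf("value:  %.1f\n", value)
}
```

The policy sums to 1 across moves, while the value is a single number for the whole position.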

Two Components of AlphaGo
• Neural network detects patterns on the game board and makes decisions on where to best place a piece
• Monte Carlo tree search for best play

[Figure, built up over several slides: a board with cells numbered 0 1 2 3 4 5 6 7 8. First X and O are placed on the board. Then the network output for the position is shown: Policy = 0.1 0.1 0.2 0.1 0.1 0.1 0.1 0.1 0.1, Value = 0.8. Then a candidate move marked "O?" appears, followed by a position with pieces X, O, O, X.]
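Since the policy in the figure assigns a probability to every cell, including occupied ones, a search would typically ignore the occupied cells. The sketch below assumes the common practice of masking illegal moves and renormalising; the slides do not state this step, and `Board`, `MaskPolicy` and the piece placement are hypothetical.

```go
package main

import "fmt"

// Board holds the nine cells numbered 0 to 8. A zero byte means empty;
// 'X' or 'O' marks an occupied cell.
type Board [9]byte

// MaskPolicy zeroes the probability of occupied cells and rescales the
// remaining entries so that the legal moves sum to 1 again.
func MaskPolicy(b Board, policy [9]float64) [9]float64 {
	var out [9]float64
	sum := 0.0
	for i, p := range policy {
		if b[i] == 0 {
			out[i] = p
			sum += p
		}
	}
	if sum == 0 {
		return out // no legal move carries any probability
	}
	for i := range out {
		out[i] /= sum
	}
	return out
}

func main() {
	var b Board
	b[0], b[4] = 'X', 'O' // hypothetical placement; the slide's is not recoverable

	policy := [9]float64{0.1, 0.1, 0.2, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1}
	fmt.Printf("%.3f\n", MaskPolicy(b, policy))
}
```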

What AlphaGo Does
• Neural network detects patterns on the game board and makes decisions on where to best place a piece
• Monte Carlo tree search for best play
• Take action
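One way the network and the tree search fit together is a PUCT-style selection rule, in which each candidate move is scored by its average value so far plus an exploration bonus weighted by the policy prior. The sketch below shows only that selection step and a backup, not a full search. `Edge`, `SelectMove`, `Backup` and the constant `c` are hypothetical, and the `+1` under the square root is a choice made here so that priors matter before any move has been visited.

```go
package main

import (
	"fmt"
	"math"
)

// Edge holds the search statistics for one candidate move.
type Edge struct {
	Prior    float64 // P: policy probability from the network
	Visits   int     // N: how often the search tried this move
	ValueSum float64 // W: sum of values backed up through this move
}

// Q is the mean value observed so far (0 for an unvisited move).
func (e Edge) Q() float64 {
	if e.Visits == 0 {
		return 0
	}
	return e.ValueSum / float64(e.Visits)
}

// SelectMove returns the index that maximises Q + c·P·sqrt(ΣN + 1)/(1 + N).
// Ties go to the lowest index; an empty slice yields -1.
func SelectMove(edges []Edge, c float64) int {
	total := 0
	for _, e := range edges {
		total += e.Visits
	}
	best, bestScore := -1, math.Inf(-1)
	for i, e := range edges {
		u := c * e.Prior * math.Sqrt(float64(total)+1) / float64(1+e.Visits)
		if s := e.Q() + u; s > bestScore {
			best, bestScore = i, s
		}
	}
	return best
}

// Backup records the result of one simulation for the chosen move.
func Backup(e *Edge, value float64) {
	e.Visits++
	e.ValueSum += value
}

func main() {
	edges := []Edge{{Prior: 0.1}, {Prior: 0.2}, {Prior: 0.1}}
	const c = 1.5 // hypothetical exploration constant

	first := SelectMove(edges, c) // the highest prior wins before any visits
	Backup(&edges[first], -1)     // pretend that simulation went badly
	second := SelectMove(edges, c)

	fmt.Println(first, second) // prints: 1 0
}
```

After enough simulations, "take action" would usually mean playing the most-visited move; that step is left out here.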

AlphaZero
AlphaZero is AlphaGo without training data from humans.
1. Self-play creates training data.
2. Train on self-play data.
3. Pit the old version of the AlphaZero neural network against the new version.
4. Goto 1.
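The loop above can be sketched in Go with stand-in types. Everything here is hypothetical: the `Agent` interface, the `Arena` function type, the `stubAgent` used to make the example runnable, and in particular the win-rate `threshold`, since the slide does not say how the winner of step 3 is decided or what happens to the loser.

```go
package main

import "fmt"

// Example is one training sample produced by self-play.
type Example struct {
	Board  [9]int8    // encoded position
	Policy [9]float64 // move probabilities recorded during the game
	Value  float64    // final outcome from the mover's point of view
}

// Agent is a hypothetical interface over the network plus search.
type Agent interface {
	SelfPlay(games int) []Example // step 1
	Train(data []Example) Agent   // step 2: returns a new candidate
}

// Arena plays candidate against current and returns the candidate's win rate.
type Arena func(current, candidate Agent, games int) float64

// Iterate runs the self-play / train / compare loop for a fixed number of
// rounds and returns whichever agent is current at the end.
func Iterate(current Agent, arena Arena, rounds, games int, threshold float64) Agent {
	for r := 0; r < rounds; r++ {
		data := current.SelfPlay(games)  // 1. self-play creates training data
		candidate := current.Train(data) // 2. train on self-play data
		if arena(current, candidate, games) >= threshold {
			current = candidate // 3. the new version replaces the old one
		}
	} // 4. goto 1
	return current
}

// stubAgent only counts how many times it has been retrained.
type stubAgent struct{ generation int }

func (s stubAgent) SelfPlay(games int) []Example { return make([]Example, games) }
func (s stubAgent) Train(data []Example) Agent   { return stubAgent{s.generation + 1} }

// candidateAlwaysWins and candidateAlwaysLoses are fake arenas for the demo.
func candidateAlwaysWins(current, candidate Agent, games int) float64  { return 1 }
func candidateAlwaysLoses(current, candidate Agent, games int) float64 { return 0 }

func main() {
	final := Iterate(stubAgent{}, candidateAlwaysWins, 3, 10, 0.55)
	fmt.Println(final.(stubAgent).generation) // prints: 3
}
```

The sketch stops after a fixed number of rounds so that it terminates; the slide's version loops indefinitely.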

Interesting Questions and Outcomes
• How to improve training speed?
• Better training and optimization methodologies.
• What is the goal?
• Play Go well
• Take a cue from transfer learning
• Multi-task learning
• What is AlphaGo good for?
• Solving problems with large search spaces
• Drug discovery
• Neural network weights?

How Close Is AlphaGo to The Big Picture Goal?

Ability To                                               Humans   AlphaGo
Understand cause and effect                              ✓
Compute                                                  ✓
Tackle a diverse array of causal computation problems    ✓

How does AlphaGo Work?
• Neural network detects patterns on the game board and makes decisions on where to best place a piece
• Monte Carlo tree search for best play
• Take action

Is AlphaGo a Causal Reasoner?

Causal Reasoner                    AlphaGo
• See patterns                     • Convolutional neural network
• Imagine alternative scenarios    • Monte Carlo tree search
• Interfere and take actions       • Take action

How Close Is AlphaGo to The Big Picture Goal?

Ability To                                               Humans   AlphaGo
Understand cause and effect                              ✓        ✓*
Compute                                                  ✓        ???
Tackle a diverse array of causal computation problems    ✓        Possible

*Contra Judea Pearl