Slide 1

Slide 1 text

Yuji Isobe — Enjoy Deep Learning by JavaScript
TokyoJS 2.1 @ Abeja Inc. https://speakerdeck.com/yujiosaka/hitasurale-sitedeipuraningu

Slide 2

Slide 2 text

[ “Node.js”, “MongoDB”, “AngularJS”, “socket.io”, “React.js”, “Emotion Intelligence” ]
@yujiosaka
JavaScript

Slide 3

Slide 3 text

emin = Emotion Intelligence — in search of technology to understand human emotion.
Emotion Intelligence develops “intelligence that reads the subtleties of people’s feelings from their unconscious behavior,” using applied artificial-intelligence and machine-learning technology, and puts it to use in business.

Slide 4

Slide 4 text

ZenClerk Series: ZenClerk Lite / ZenClerk / Interest Widget

Slide 5

Slide 5 text

ZenClerk provides online customers with an exciting shopping experience, personalized by machine learning that detects their growing desire to buy.

Slide 6

Slide 6 text

I haven’t studied ML before… (´・ω・`)

Slide 7

Slide 7 text

Introduction to ML

Slide 8

Slide 8 text

Bayesian probability / k-nearest neighbors / Generalized linear model / Neural Network / Support Vector Machine / Great Wall
OK! OK! OK! I understand… Sounds cool!

Slide 9

Slide 9 text

My skill set (want to develop): Data Visualization / Machine Learning / Mathematics / Statistics / Computer Science / Communication / Domain Knowledge

Slide 10

Slide 10 text

No content

Slide 11

Slide 11 text

No content

Slide 12

Slide 12 text

Let’s try this

Slide 13

Slide 13 text

Digit Recognizer 101
✓ Classification of MNIST (handwritten digit data)
✓ 28 × 28 px
✓ 60,000 training samples
✓ 10,000 test samples

Slide 14

Slide 14 text

Aim for 99% accuracy http://yann.lecun.com/exdb/mnist/

Slide 15

Slide 15 text

But it didn’t look fun at all

Slide 16

Slide 16 text

I really wanted to enjoy it

Slide 17

Slide 17 text

No content

Slide 18

Slide 18 text

FF6, just for the fun of it

Slide 19

Slide 19 text

Let’s play FF6 with rules:
1. Do not battle unless necessary
2. Do not steal items
3. Do not pick up items

Slide 20

Slide 20 text

Let’s play Kaggle with rules:
1. Use deep learning
2. Use only JavaScript
3. Do not use machine learning libraries

Slide 21

Slide 21 text

Begin with Google Search

Slide 22

Slide 22 text

http://neuralnetworksanddeeplearning.com/index.html

Slide 23

Slide 23 text

Neural Networks and Deep Learning
✓ Online book
✓ History from neural networks to deep learning
✓ Example implementation in Python on GitHub

Slide 24

Slide 24 text

Make strategy

Slide 25

Slide 25 text

Python → CoffeeScript → ES2015
Python → CoffeeScript: sed & manual replacement
CoffeeScript → ES2015: Decaf JS & manual replacement
Goal: a deep learning library written in ES2015 JavaScript (Babel), published to npm

Slide 26

Slide 26 text

Aren’t Python and CoffeeScript very similar?

Slide 27

Slide 27 text

Python

def update_mini_batch(self, mini_batch, eta):
    nabla_b = [np.zeros(b.shape) for b in self.biases]
    nabla_w = [np.zeros(w.shape) for w in self.weights]
    for x, y in mini_batch:
        delta_nabla_b, delta_nabla_w = self.backprop(x, y)
        nabla_b = [nb + dnb for nb, dnb in zip(nabla_b, delta_nabla_b)]
        nabla_w = [nw + dnw for nw, dnw in zip(nabla_w, delta_nabla_w)]
    self.weights = [w - (eta / len(mini_batch)) * nw
                    for w, nw in zip(self.weights, nabla_w)]
    self.biases = [b - (eta / len(mini_batch)) * nb
                   for b, nb in zip(self.biases, nabla_b)]

Slide 28

Slide 28 text

CoffeeScript

updateMiniBatch: (miniBatch, eta) ->
  nablaB = (Matrix.zeros(b.rows, b.cols) for b in @biases)
  nablaW = (Matrix.zeros(w.rows, w.cols) for w in @weights)
  for [x, y] in miniBatch
    [deltaNablaB, deltaNablaW] = @backprop(x, y)
    nablaB = (nb.plus(dnb) for [nb, dnb] in _.zip(nablaB, deltaNablaB))
    nablaW = (nw.plus(dnw) for [nw, dnw] in _.zip(nablaW, deltaNablaW))
  @weights = (w.minus(nw.mulEach(eta / miniBatch.length)) for [w, nw] in _.zip(@weights, nablaW))
  @biases = (b.minus(nb.mulEach(eta / miniBatch.length)) for [b, nb] in _.zip(@biases, nablaB))

Slide 29

Slide 29 text

Implement NumPy’s API

Slide 30

Slide 30 text

numpy.nan_to_num

nanToNum() {
  let thisData = this.data, rows = this.rows, cols = this.cols;
  let row, col, result = new Array(rows);
  for (row = 0; row < rows; row++) {
    result[row] = new Array(cols);
    for (col = 0; col < cols; col++) {
      // Replace NaN with 0, as numpy.nan_to_num does
      let value = thisData[row][col];
      result[row][col] = isNaN(value) ? 0 : value;
    }
  }
  return new Matrix(result);
}

Slide 31

Slide 31 text

numpy.ravel

ravel() {
  let thisData = this.data, rows = this.rows, cols = this.cols;
  let a = new Array(rows * cols);
  // Flatten the matrix row by row into a single array
  for (let i = 0, jBase = 0; i < rows; i++, jBase += cols) {
    for (let j = 0; j < cols; j++) {
      a[jBase + j] = thisData[i][j];
    }
  }
  return a;
}

Slide 32

Slide 32 text

https://github.com/juliankrispel/decaf

Slide 33

Slide 33 text

Manual Replacement

Slide 34

Slide 34 text

Slide 35

Slide 35 text

It worked…lol

Slide 36

Slide 36 text

It’s about time to study

Slide 37

Slide 37 text

What is a Neural Network?
“A neural network (神経回路網, English: neural network, NN) is a mathematical model that aims to express some of the characteristics of brain function through simulation on a computer.” — Wikipedia (https://ja.wikipedia.org/wiki/ニューラルネットワーク)

Slide 38

Slide 38 text

Perceptron Neuron Model
x1, x2, x3 → output (weights w1, w2, w3, bias b)
output = 0 if Σj wj·xj + b ≤ 0
output = 1 if Σj wj·xj + b > 0
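The rule above can be sketched in a few lines of plain JavaScript. This is an illustrative sketch, not code from the deck; the weights and threshold mirror the festival example on the next slide.

```javascript
// Perceptron: fires 1 only when the weighted sum of inputs plus the bias
// exceeds zero, otherwise 0 (the step rule on this slide).
function perceptron(weights, inputs, bias) {
  const sum = weights.reduce((acc, w, j) => acc + w * inputs[j], 0) + bias;
  return sum > 0 ? 1 : 0;
}

// Festival example: threshold 5 (bias -5), weights 6, 2, 2 for
// good weather, girlfriend coming, place near a station.
perceptron([6, 2, 2], [1, 0, 0], -5); // → 1: good weather alone is enough
perceptron([6, 2, 2], [0, 1, 1], -5); // → 0: 2 + 2 - 5 ≤ 0
```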

Slide 39

Slide 39 text

Perceptron Neuron Model (example)
Is the weather good? (weight 6) — Yes/No
Does your girlfriend come? (weight 2) — Yes/No
Is the place near a station? (weight 2) — Yes/No
Go to the fest if the weighted sum exceeds the threshold 5.

Slide 40

Slide 40 text

Sigmoid Neuron Model
x1, x2, x3 → output (weights w1, w2, w3, bias b)
output = 1 / (1 + exp(−(Σj wj·xj + b)))
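A minimal JavaScript sketch of the sigmoid neuron (illustrative only, not the deck's code): the weighted sum is the same as the perceptron's, but it is squashed smoothly into (0, 1) instead of being thresholded.

```javascript
// Sigmoid neuron: output = 1 / (1 + exp(-(Σj wj·xj + b)))
function sigmoidNeuron(weights, inputs, bias) {
  const z = weights.reduce((acc, w, j) => acc + w * inputs[j], 0) + bias;
  return 1 / (1 + Math.exp(-z));
}

// Same festival inputs as the perceptron example, but now the answer
// is graded rather than a hard 0/1 step.
sigmoidNeuron([6, 2, 2], [1, 0, 0], -5); // ≈ 0.73 instead of a hard 1
```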

Slide 41

Slide 41 text

Step Function (Perceptron)

Slide 42

Slide 42 text

Sigmoid Function

Slide 43

Slide 43 text

What’s the difference?
✓ The sigmoid function produces output between 0 and 1
✓ A small change in input produces a small change in output
✓ In other words, the sigmoid function is differentiable

Slide 44

Slide 44 text

Structure
w + Δw, b + Δb → output + Δoutput

Slide 45

Slide 45 text

Training neurons
✓ Improve accuracy by modifying the weights (w) and bias (b) of each neuron
✓ Techniques like backpropagation were invented for that purpose
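To make the idea concrete, here is one gradient-descent step for a single sigmoid neuron with a squared-error cost — an illustrative sketch, not the deck's code; `trainStep` and its parameters are hypothetical names.

```javascript
function sigmoid(z) { return 1 / (1 + Math.exp(-z)); }

// One gradient-descent step: nudge w and b opposite to the cost gradient.
// For C = (out - target)^2 / 2, dC/dz = (out - target) * out * (1 - out).
function trainStep(w, b, x, target, eta) {
  const out = sigmoid(w * x + b);
  const delta = (out - target) * out * (1 - out);
  return { w: w - eta * delta * x, b: b - eta * delta };
}

// Repeated small nudges drive the output toward the target (here 0).
let w = 0.5, b = 0.5;
for (let i = 0; i < 1000; i++) ({ w, b } = trainStep(w, b, 1, 0, 3.0));
// sigmoid(w * 1 + b) is now close to 0
```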

Slide 46

Slide 46 text

What is Deep Learning?

Slide 47

Slide 47 text

Neural Network

Slide 48

Slide 48 text

Deep Learning

Slide 49

Slide 49 text

Why so popular?
✓ New techniques have been invented recently
✓ Overfitting can be avoided even when adding layers
✓ Adding layers improves expressive power

Slide 50

Slide 50 text

Let’s implement it

Slide 51

Slide 51 text

Convolutional Neural Network

Slide 52

Slide 52 text

Problem: two images that differ by only 1px are recognized as different from each other

Slide 53

Slide 53 text

Solution

Slide 54

Slide 54 text

Structure: convolutional layer → pooling layer
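A pooling layer fits in a few lines of JavaScript. This illustrative `maxPool2x2` (an assumption, not the deck's code) keeps only the strongest activation in each 2×2 window, which is what makes the output tolerant of small shifts like the 1px problem above.

```javascript
// 2x2 max pooling: downsample a feature map by taking the maximum
// of each non-overlapping 2x2 window (assumes even dimensions).
function maxPool2x2(map) {
  const out = [];
  for (let r = 0; r < map.length; r += 2) {
    const row = [];
    for (let c = 0; c < map[r].length; c += 2) {
      row.push(Math.max(map[r][c], map[r][c + 1],
                        map[r + 1][c], map[r + 1][c + 1]));
    }
    out.push(row);
  }
  return out;
}

maxPool2x2([
  [1, 0, 2, 3],
  [4, 6, 6, 8],
  [3, 1, 1, 0],
  [1, 2, 2, 4],
]); // → [[6, 8], [3, 4]]
```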

Slide 55

Slide 55 text

Other techniques
✓ Other activation functions (Softmax / ReLU)
✓ Regularization (L2 regularization / Dropout)
✓ Cross-entropy cost function
✓ Improved weight initialization
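Two of the listed techniques are nearly one-liners in JavaScript (illustrative sketches, not the deck's code):

```javascript
// ReLU: an alternative activation that passes positives through unchanged.
const relu = (z) => Math.max(0, z);

// Cross-entropy cost for a single sigmoid output `a` and label `y`:
// large when the network is confidently wrong, so learning stays fast.
const crossEntropy = (a, y) =>
  -(y * Math.log(a) + (1 - y) * Math.log(1 - a));

relu(-2);             // → 0
relu(3);              // → 3
crossEntropy(0.9, 1); // ≈ 0.105 (nearly right → small cost)
crossEntropy(0.1, 1); // ≈ 2.303 (confidently wrong → large cost)
```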

Slide 56

Slide 56 text

Deep Learning is a set of techniques
There is no single “deep learning algorithm” — you improve accuracy by assembling many techniques like a jigsaw puzzle.

Slide 57

Slide 57 text

Problems I encountered and how I overcame them

Slide 58

Slide 58 text

Problem 1
 Allergy to mathematical expression Once I wrote the code, It was actually easy to understand. function sigmoid(z) { return 1 / (1 + Math.exp(-z)); } let output = sigmoid(w.dot(a).plus(b)); 
 FYQ ЄKXKYKC

Slide 59

Slide 59 text

Problem 2: I didn’t know the differentiation formula
I copied and pasted from StackOverflow answers, and it actually worked.

costDelta(y) {
  return this.outputDropout.minus(y);
}

Slide 60

Slide 60 text

Problem 3: The textbook didn’t tell me
Softmax causes numeric overflow if you follow textbooks literally. Again, I got the answer from StackOverflow, and it worked.

let max = _.max(vector),
    tmp = _.map(vector, (v) => { return Math.exp(v - max); }),
    sum = _.sum(tmp);
return _.map(tmp, (v) => { return v / sum; });

Slide 61

Slide 61 text

Problem 4: My computing speed was too slow
The Python reference implementation takes only 1 hour, but mine on Node.js takes more than 24 hours. I learned that NumPy does some crazy tricks for you. I used a small data set in the development environment.

Slide 62

Slide 62 text

Problem 5: Python libraries are too sophisticated
Implementations with Theano and TensorFlow are hard to use as references because their APIs are too advanced. WTH is automatic differentiation!? I became familiar with the Python libraries.

Slide 63

Slide 63 text

Slide 64

Slide 64 text

WIP

Slide 65

Slide 65 text

https://github.com/yujiosaka/jsmind

Slide 66

Slide 66 text

Demo

Slide 67

Slide 67 text

99.1% accuracy, but it takes 24 hours to run

Slide 68

Slide 68 text

Why did I do this?

Slide 69

Slide 69 text

My initial motivations
1. To get GitHub stars (of course!)
2. To understand how deep learning works

Slide 70

Slide 70 text

I didn’t think it was useful, but I changed my mind

Slide 71

Slide 71 text

Sometimes you want to
 do prediction on browsers

Slide 72

Slide 72 text

You don’t have to train in JavaScript, but you may want to predict with it: you can load data trained in Python and use it for prediction in the browser.

Load trained layers’ data

Promise.all([
  jsmind.Network.load('/path/to/layers.json'),
  jsmind.MnistLoader.loadTestDataWrapper()
]).spread(function(net, testData) {
  var accuracy = net.accuracy(testData);
  console.log('Test accuracy ' + accuracy);
});

Slide 73

Slide 73 text

It is useful when you do online learning on Node.js

Slide 74

Slide 74 text

I personally don’t like language lock-in (｀^´)

Slide 75

Slide 75 text

Anyone should be able to do ML in any language.

Slide 76

Slide 76 text

Let’s Enjoy Deep Learning!