Slide 1

Slide 1 text

Deep Learning: An Introduction for Ruby Developers

Slide 2

Slide 2 text

@geoffreylitt

Slide 3

Slide 3 text

No content

Slide 4

Slide 4 text

No content

Slide 5

Slide 5 text

No content

Slide 6

Slide 6 text

Machine learning = learning functions from data

Slide 7

Slide 7 text

No content

Slide 8

Slide 8 text

No content

Slide 9

Slide 9 text

No content

Slide 10

Slide 10 text

No content

Slide 11

Slide 11 text

What we’ll cover today
• Some fundamental intuition for how machine learning works
• Recent developments in deep neural networks
• Ruby tools for machine learning, including a new emerging project

Slide 12

Slide 12 text

No content

Slide 13

Slide 13 text

Apartment size (sq ft)    Monthly rent ($)
500                       1000
650                       1200
800                       1650
1000                      2200
1050                      2250
1200                      2350
1300                      2600

Slide 14

Slide 14 text

Chart: scatter plot of monthly rent ($) vs. apartment size (sq ft) for the data above.

Slide 15

Slide 15 text

No content

Slide 16

Slide 16 text

No content

Slide 17

Slide 17 text

Chart: monthly rent vs. apartment size with a fitted line. m = -1, b = 2500. Error = High

Slide 18

Slide 18 text

Chart: monthly rent vs. apartment size with a fitted line. m = 0, b = 1750. Error = Lower

Slide 19

Slide 19 text

Chart: monthly rent vs. apartment size with a fitted line. m = 2.2, b = -200. Error = Low

Slide 20

Slide 20 text

Chart: cost (aka how far off are we?) as a function of m and b.
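To make the fitting process concrete, here is a minimal Ruby sketch of gradient descent on the rent data from the table above. The learning rate, iteration count, and variable names are my own assumptions for illustration, not something specified in the slides.

# Rent data from the earlier slide: [apartment size (sq ft), monthly rent ($)]
data = [[500, 1000], [650, 1200], [800, 1650], [1000, 2200],
        [1050, 2250], [1200, 2350], [1300, 2600]]

m = 0.0                    # slope
b = 0.0                    # intercept
learning_rate = 0.0000001  # tiny, because apartment sizes are in the hundreds

10_000.times do
  # Gradients of mean squared error with respect to m and b
  grad_m = data.sum { |size, rent| 2 * (m * size + b - rent) * size } / data.size
  grad_b = data.sum { |size, rent| 2 * (m * size + b - rent) } / data.size

  # Step downhill on the cost surface pictured above
  m -= learning_rate * grad_m
  b -= learning_rate * grad_b
end

puts "rent ~ #{m.round(2)} * size + #{b.round(2)}"

Each iteration nudges m and b in the direction that reduces the error, which is the walk down the cost surface this slide is describing.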

Slide 21

Slide 21 text

No content

Slide 22

Slide 22 text

No content

Slide 23

Slide 23 text

No content

Slide 24

Slide 24 text

No content

Slide 25

Slide 25 text

Image from http://neuralnetworksanddeeplearning.com/

Slide 26

Slide 26 text

No content

Slide 27

Slide 27 text

Diagram: a neuron with inputs[0], inputs[1], inputs[2] and an output.

Slide 28

Slide 28 text

Diagram: a neuron with inputs[0], inputs[1], inputs[2], weights[0], weights[1], weights[2], and an output.

Slide 29

Slide 29 text

Diagram: a neuron with inputs[0], inputs[1], inputs[2], weights[0], weights[1], weights[2], and an output.

Slide 30

Slide 30 text

Diagram: a neuron with inputs[0], inputs[1], inputs[2], weights[0], weights[1], weights[2], and an output.

Slide 31

Slide 31 text

Diagram: a neuron with inputs[0], inputs[1], inputs[2], weights[0], weights[1], weights[2], and an output.

Slide 32

Slide 32 text

Diagram: a neuron with inputs[0], inputs[1], inputs[2], weights[0], weights[1], weights[2], and an output.
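In Ruby terms, the neuron pictured above is just a weighted sum of the inputs passed through an activation function. A rough sketch follows; the sigmoid activation and the bias term are standard choices I am assuming here, not details taken from the slide.

# A single artificial neuron: weighted sum of inputs, squashed by an activation
def sigmoid(x)
  1.0 / (1.0 + Math.exp(-x))
end

def neuron(inputs, weights, bias = 0.0)
  weighted_sum = inputs.zip(weights).sum { |input, weight| input * weight } + bias
  sigmoid(weighted_sum)
end

inputs  = [0.5, 0.3, 0.9]
weights = [0.4, -0.2, 0.7]
puts neuron(inputs, weights)  # => a value between 0 and 1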

Slide 33

Slide 33 text

Learning Weights (image from http://cs231n.github.io/). Network outputs P(cat) and P(dog); weight updates flow backwards through the network. Error = High

Slide 34

Slide 34 text

Learning Weights (image from http://cs231n.github.io/). Network outputs P(cat) and P(dog).

Slide 35

Slide 35 text

Learning Weights (image from http://cs231n.github.io/). Network outputs P(cat) and P(dog). Error = Lower

Slide 36

Slide 36 text

Learning Weights (image from http://cs231n.github.io/). Network outputs P(cat) and P(dog). Error = Lower

Slide 37

Slide 37 text

Learning Weights (image from http://cs231n.github.io/). Network outputs P(cat) and P(dog). Error = Low
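Full backpropagation is more involved than these slides show, but the core move is the same as in the rent example: nudge each weight against the gradient of the error. Below is a simplified single-neuron update, reusing the neuron helper from the earlier sketch; it is my own illustration, not the network in the image.

# One gradient step for a single sigmoid neuron on one training example.
# target is the desired output, e.g. 1.0 for "cat" and 0.0 for "dog".
def update_weights(inputs, weights, target, learning_rate = 0.1)
  prediction = neuron(inputs, weights)            # forward pass
  error_gradient = (prediction - target) *
                   prediction * (1 - prediction)  # error slope times sigmoid derivative
  weights.map.with_index do |weight, i|
    weight - learning_rate * error_gradient * inputs[i]  # step downhill
  end
end

Calling update_weights repeatedly over many labeled examples is what drives the error from High to Low across the slides above.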

Slide 38

Slide 38 text

“Deep Learning” = neural nets with more layers = more parameters to train = more data and compute needed to train = things we have a ton of in 2016

Slide 39

Slide 39 text

Convolutional neural networks

Slide 40

Slide 40 text

Convolutional neural networks. Chart: ImageNet error rate by year (2011-2015), with the human benchmark, AlexNet (2012), and GoogLeNet (2014) marked.

Slide 41

Slide 41 text

Recurrent neural networks. Vinyals et al. (2014). Show and Tell: A Neural Image Caption Generator. CoRR, abs/1411.4555.

Slide 42

Slide 42 text

Recurrent neural networks. Karpathy (2015). Deep Visual-Semantic Alignments for Generating Image Descriptions. CVPR.

Slide 43

Slide 43 text

Recurrent neural networks. http://karpathy.github.io/2015/05/21/rnn-effectiveness/

Slide 44

Slide 44 text

Recurrent neural networks. http://karpathy.github.io/2015/05/21/rnn-effectiveness/

Slide 45

Slide 45 text

Deep Learning = Less Feature Engineering (pictured: fancy audio feature engineering, fancy text feature engineering)

Slide 46

Slide 46 text

Deep Learning = Less Feature Engineering Early part of network learns features better than humans can create

Slide 47

Slide 47 text

No content

Slide 48

Slide 48 text

Option 1: Roll your own

Slide 49

Slide 49 text

Source: https://blog.intercom.io/machine-learning-way-easier-than-it-looks/

Slide 50

Slide 50 text

Pros (+): • Avoid new dependencies • Flexibility • Understand your solution
Cons (-): • Lots of work • Testing/debugging • Performance

Slide 51

Slide 51 text

Option 2: Use a high-level library

Slide 52

Slide 52 text

https://davidcel.is/recommendable/
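For context, using a high-level gem like Recommendable looks roughly like the snippet below. This is a sketch from memory of the project's README; treat the exact macro and method names as assumptions to verify against the docs.

# Gemfile: gem 'recommendable'
class User < ActiveRecord::Base
  recommends :movies   # Recommendable keeps likes/dislikes in Redis
end

user.like(movie)          # record a preference
user.recommended_movies   # collaborative-filtering suggestions, no ML code written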

Slide 53

Slide 53 text

No content

Slide 54

Slide 54 text

Pros (+): • Quick results • No knowledge needed
Cons (-): • Often basic algorithms • Only set use cases

Slide 55

Slide 55 text

Option 3: Use a low-level library

Slide 56

Slide 56 text

Chart: number of machine learning libraries. Examples listed: • scipy • numpy • NLTK • scikit-learn • keras • theano • caffe • tensorflow. Source: https://github.com/josephmisiti/awesome-machine-learning

Slide 57

Slide 57 text

No content

Slide 58

Slide 58 text

ruby-fann (thin Ruby wrapper) around libfann (written in C)
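A rough sketch of training a tiny network with ruby-fann is below. XOR is my own toy example, and the argument names follow the gem's documented API as best I recall, so double-check against the ruby-fann README.

require 'ruby-fann'

# XOR training data: two inputs, one output
train = RubyFann::TrainData.new(
  inputs:          [[0, 0], [0, 1], [1, 0], [1, 1]],
  desired_outputs: [[0],    [1],    [1],    [0]]
)

# 2 inputs -> one hidden layer of 3 neurons -> 1 output
fann = RubyFann::Standard.new(num_inputs: 2, hidden_neurons: [3], num_outputs: 1)

# train_on_data(data, max_epochs, epochs_between_reports, desired_error)
fann.train_on_data(train, 1000, 100, 0.001)

puts fann.run([1, 0]).inspect  # => something close to [1.0]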

Slide 59

Slide 59 text

Chart: GitHub stars (9/5/16) for fann, theano, and caffe.

Slide 60

Slide 60 text

Chart: GitHub stars (9/5/16) for fann, theano, caffe, and tensorflow (annotated with a "?").

Slide 61

Slide 61 text

No content

Slide 62

Slide 62 text

No content

Slide 63

Slide 63 text

No content

Slide 64

Slide 64 text

No content

Slide 65

Slide 65 text

Diagram: core graph execution (C++), with Python and Ruby front-ends connected via SWIG. “The system includes front-ends for specifying TensorFlow computations in Python and C++, and we expect other front-ends to be added over time in response to the desires of both internal Google users and the broader open-source community.”

Slide 66

Slide 66 text

No content

Slide 67

Slide 67 text

tensorflow.rb

Slide 68

Slide 68 text

No content

Slide 69

Slide 69 text

Next steps…
• Complete enough functionality to train basic neural networks
• Gradient descent optimizer
• Create easy install flow for dependencies
• Add more test coverage
• (Still far from production ready)

Slide 70

Slide 70 text

To learn more…
• Andrew Ng’s Machine Learning Coursera class
• Google’s Udacity Tensorflow class
• Contributors to tensorflow.rb welcome!: github.com/somaticio/tensorflow.rb
• I’ll tweet slides: @geoffreylitt

Slide 71

Slide 71 text

Thanks!