An Intro to Deep Learning
Olivier Grisel - Neurospin 2017
Outline
• ML, DL & Artificial Intelligence
• Deep Learning
• Computer Vision
• Natural Language Understanding and Machine Translation
• Other possible applications
Machine Learning, Deep Learning and Artificial Intelligence
[Diagram (roughly): Deep Learning is a subset of Machine Learning, itself a subset of Artificial Intelligence; hand-crafted symbolic reasoning systems are the non-ML part of AI, and Predictive Modeling (Data Analytics) overlaps with Machine Learning]
Deep Learning
• Neural networks from the 90's, rebranded from 2006 onwards
• "Neuron" is only a loose inspiration (the biological analogy is not important)
• Stacked architecture of modules that compute internal, abstract representations of the data
• Parameters are tuned from labeled examples
Deep Learning in the 90’s
sources: LeNet5 & Stanford Deep Learning Tutorial
• All modules are differentiable
• w.r.t. the module inputs
• w.r.t. the module parameters
• Training by (Stochastic) Gradient Descent
• Chain rule: the backpropagation algorithm
• Tune the parameters to minimize a classification loss (see the sketch below)
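To make this loop concrete, here is a minimal sketch, not from the slides, of two stacked differentiable modules trained by backpropagation and stochastic gradient descent using only NumPy; the layer sizes, learning rate and synthetic data are illustrative choices.

# Minimal sketch: linear -> ReLU -> linear -> softmax, trained by SGD + backprop.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 2-class problem: 256 points in 20 dimensions.
X = rng.normal(size=(256, 20))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Parameters of the two stacked modules.
W1 = rng.normal(scale=0.1, size=(20, 32)); b1 = np.zeros(32)
W2 = rng.normal(scale=0.1, size=(32, 2));  b2 = np.zeros(2)

lr = 0.1
for epoch in range(100):
    for i in rng.permutation(len(X)):            # stochastic: one example at a time
        x, t = X[i:i + 1], y[i]

        # Forward pass: each module computes its output from the previous one.
        h_pre = x @ W1 + b1
        h = np.maximum(h_pre, 0.0)               # ReLU hidden representation
        logits = h @ W2 + b2
        p = np.exp(logits - logits.max())
        p /= p.sum()                             # softmax probabilities

        # Backward pass: chain rule, from the classification loss back to the inputs.
        d_logits = p.copy()
        d_logits[0, t] -= 1.0                    # gradient of cross-entropy w.r.t. logits
        dW2 = h.T @ d_logits; db2 = d_logits[0]
        d_h = d_logits @ W2.T
        d_hpre = d_h * (h_pre > 0)               # gradient through the ReLU module
        dW1 = x.T @ d_hpre; db1 = d_hpre[0]

        # SGD update: tune the parameters to minimize the loss.
        W1 -= lr * dW1; b1 -= lr * db1
        W2 -= lr * dW2; b2 -= lr * db2

pred = np.argmax(np.maximum(X @ W1 + b1, 0.0) @ W2 + b2, axis=1)
print("training accuracy:", (pred == y).mean())

Each module only needs its output and its gradients w.r.t. inputs and parameters; the chain rule stitches the stack together.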
Recent success
• 2009: state-of-the-art acoustic models for speech recognition
• 2011: state-of-the-art road sign classification
• 2012: state-of-the-art object classification
• 2013/14: end-to-end speech recognition, object detection
• 2014/15: state-of-the-art machine translation, getting closer for Natural Language Understanding in general
Why now?
• More labeled data
• More compute power (optimized BLAS and GPUs)
• Improvements to algorithms
source: Alec Radford on RNNs
Deep Learning for Computer Vision
Deep Learning in the 90's
• Yann LeCun invented Convolutional Networks
• First NN successfully trained with many layers (see the sketch below)
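As an illustration of that kind of architecture, here is a minimal sketch, not from the slides, of a LeNet-style stack of convolution, pooling and fully-connected layers written with PyTorch; the layer sizes are only loosely modeled on LeNet-5.

# Minimal sketch of a LeNet-style ConvNet for 32x32 grayscale images.
import torch
import torch.nn as nn

class LeNetLike(nn.Module):
    def __init__(self, n_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5), nn.Tanh(), nn.AvgPool2d(2),   # 32x32 -> 28x28 -> 14x14
            nn.Conv2d(6, 16, kernel_size=5), nn.Tanh(), nn.AvgPool2d(2),  # 14x14 -> 10x10 -> 5x5
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(16 * 5 * 5, 120), nn.Tanh(),
            nn.Linear(120, 84), nn.Tanh(),
            nn.Linear(84, n_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = LeNetLike()
dummy = torch.zeros(1, 1, 32, 32)   # one fake 32x32 grayscale image
print(model(dummy).shape)           # torch.Size([1, 10])

The convolution layers share their weights across spatial positions, which keeps the parameter count small enough for such deep stacks to be trainable.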
Early success at OCR
Natural image classification until 2012
[Pipeline diagram, repeated over several build slides: a data-independent "Feature Extraction" stage feeds a "Supervised Learning" classifier that outputs labels such as "dog" or "cat"; a sketch of this classic recipe follows]
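A minimal sketch of that pre-2012 recipe, not from the slides: hand-crafted, data-independent feature extraction (here HOG descriptors) followed by a supervised linear classifier. It assumes scikit-learn and scikit-image are installed and uses the small digits dataset purely for illustration.

# Traditional pipeline: hand-crafted features + supervised classifier.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC
from skimage.feature import hog

digits = load_digits()
features = np.array([
    hog(img, pixels_per_cell=(4, 4), cells_per_block=(1, 1))  # data-independent extraction
    for img in digits.images                                  # 8x8 grayscale images
])
X_train, X_test, y_train, y_test = train_test_split(
    features, digits.target, random_state=0)

clf = LinearSVC().fit(X_train, y_train)   # only this stage is learned from labels
print("test accuracy:", clf.score(X_test, y_test))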
Image classification today
[Pipeline diagram, repeated over several build slides: a stack of "NN Layer" blocks trained with supervised learning maps the image directly to labels such as "dog" or "cat"; the feature extraction itself is learned rather than hand-crafted]
ImageNet Challenge 2012
• 1.2M images labeled with 1000 object categories
• AlexNet, from the deep learning team at U. of Toronto, wins with a 15% error rate vs 26% for the runner-up (a traditional CV pipeline)
ImageNet Challenge 2013
• Clarifai's ConvNet model wins at 11% error rate
• Many other participants used ConvNets
GoogLeNet vs Andrej
• Andrej Karpathy evaluated human performance (himself): ~5% error rate
• "It is clear that humans will soon only be able to outperform state of the art image classification models by use of significant effort, expertise, and time."
source: What I learned from competing against a ConvNet on ImageNet
ImageNet Challenge 2015
• Microsoft Research Asia wins with networks with depths ranging from 34 to 152 layers
• New record: 3.6% error rate (a usage sketch of such a pretrained network follows)
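Such very deep residual networks are now routinely reused off the shelf. A minimal sketch, assuming a recent torchvision with downloadable ImageNet weights; "cat.jpg" is a placeholder file name.

# Classify one image with an ImageNet-pretrained ResNet.
# Older torchvision versions used models.resnet50(pretrained=True)
# and hand-written preprocessing instead of the weights enum below.
import torch
from torchvision import models
from PIL import Image

weights = models.ResNet50_Weights.DEFAULT          # ImageNet-pretrained residual network
model = models.resnet50(weights=weights).eval()
preprocess = weights.transforms()                  # resize, crop and normalize as expected

img = Image.open("cat.jpg")                        # placeholder input image
batch = preprocess(img).unsqueeze(0)               # add the batch dimension

with torch.no_grad():
    probs = model(batch).softmax(dim=1)
top = probs.argmax(dim=1).item()
print(weights.meta["categories"][top], probs[0, top].item())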
Applications of RNNs
• Natural Language Processing
(e.g. Language Modeling, Sentiment Analysis)
• Machine Translation
(e.g. English to French)
• Speech recognition: audio to text
• Speech synthesis: text to audio
• Biological sequence modeling (DNA, Proteins)
Language modeling
source: The Unreasonable Effectiveness of RNNs
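A minimal sketch of character-level language modeling in the spirit of char-rnn, not from the slides; the toy corpus, GRU cell and hyper-parameters are arbitrary choices.

# Character-level RNN language model: predict the next character, then sample.
import torch
import torch.nn as nn

text = "to be or not to be that is the question "
chars = sorted(set(text))
stoi = {c: i for i, c in enumerate(chars)}
data = torch.tensor([stoi[c] for c in text])

class CharRNN(nn.Module):
    def __init__(self, vocab, hidden=64):
        super().__init__()
        self.embed = nn.Embedding(vocab, hidden)
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab)

    def forward(self, x, h=None):
        y, h = self.rnn(self.embed(x), h)
        return self.out(y), h

model = CharRNN(len(chars))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

# Train to predict the next character at every position of the corpus.
x, y = data[:-1].unsqueeze(0), data[1:].unsqueeze(0)
for step in range(300):
    logits, _ = model(x)
    loss = loss_fn(logits.reshape(-1, len(chars)), y.reshape(-1))
    opt.zero_grad(); loss.backward(); opt.step()

# Sample a few characters from the trained model.
idx, h, out = torch.tensor([[stoi["t"]]]), None, "t"
for _ in range(40):
    logits, h = model(idx, h)
    probs = torch.softmax(logits[0, -1], dim=0)
    idx = torch.multinomial(probs, 1).view(1, 1)
    out += chars[idx.item()]
print(out)

Trained on a large corpus instead of a single sentence, the same loop produces the Shakespeare-like and source-code-like samples shown on the next slides.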
Shakespeare
source: The Unreasonable Effectiveness of RNNs
Linux source code
Attentional architectures for Machine Translation
Neural MT
source: From language modeling to machine translation
Attentional Neural MT
source: From language modeling to machine translation
source: Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Attention == Alignment
source: Neural MT by Jointly Learning to Align and Translate
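A minimal sketch of the alignment computation behind such attentional decoders, not from the slides; this uses additive (Bahdanau-style) scoring with random toy parameters and dimensions.

# Attention step: score each encoder state against the decoder state,
# softmax the scores into alignment weights, take the weighted average.
import numpy as np

rng = np.random.default_rng(0)

T, d = 6, 8                              # source length, hidden size (toy values)
enc_states = rng.normal(size=(T, d))     # encoder hidden states h_1..h_T
dec_state = rng.normal(size=(d,))        # current decoder hidden state s_t

# Learned parameters of the scoring MLP (random here, for illustration).
W_enc = rng.normal(scale=0.1, size=(d, d))
W_dec = rng.normal(scale=0.1, size=(d, d))
v = rng.normal(scale=0.1, size=(d,))

scores = np.tanh(enc_states @ W_enc + dec_state @ W_dec) @ v   # one score per source position
weights = np.exp(scores - scores.max())
weights /= weights.sum()                                       # softmax: alignment over source words
context = weights @ enc_states                                 # weighted average of encoder states

print("alignment weights:", np.round(weights, 3))
print("context vector shape:", context.shape)

The weight vector is exactly the soft alignment visualized in the figure: for each target word, it says how much each source word contributes.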
source: Show, Attend and Tell
Learning to answer questions
Paraphrases from web news
source: Teaching Machines to Read and Comprehend
source: Teaching Machines to Read and Comprehend
Medical Imaging
Challenges for NeuroImaging
• DL needs many labeled images
• Few subjects per study (costly)
• Poor labels: low inter-rater agreement (e.g. autism)
• fMRI: low SNR of the input data itself
• 3D data: huge GPU memory requirements (see the rough estimate below)
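A back-of-the-envelope sketch of that last point; the volume size, number of feature maps and batch size are arbitrary assumptions, not from the slides.

# Activation memory for one 3D conv layer on a 64x64x64 volume,
# 32 feature maps, float32, batch of 8.
voxels = 64 * 64 * 64
feature_maps = 32
batch = 8
bytes_per_float = 4
activation_bytes = voxels * feature_maps * batch * bytes_per_float
print(f"{activation_bytes / 1024**3:.2f} GiB for a single layer's activations")
# ~0.25 GiB per layer; tens of layers (plus gradients) quickly exhaust GPU memory.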
Conclusion
• ML and DL progress is fast-paced
• Many applications are already in production (e.g. speech, image indexing, translation, face recognition)
• Machine Learning is now moving from pattern recognition to higher-level reasoning
• Lack of high-quality labeled data is still a limitation for some applications