An Intro to Deep Learning
Olivier Grisel - Neurospin 2017
Outline
• ML, DL & Artificial Intelligence
• Deep Learning
• Computer Vision
• Natural Language Understanding and Machine Translation
• Other possible applications
Machine Learning, Deep Learning and Artificial Intelligence
[Diagram (roughly): Deep Learning is a subset of Machine Learning, itself a subset of Artificial Intelligence; hand-crafted symbolic reasoning systems are the non-ML part of AI, and Predictive Modeling (Data Analytics) overlaps with Machine Learning]
Deep Learning
• Neural networks from the 90's, rebranded from 2006 onwards
• "Neuron" is only a loose inspiration (the biological analogy is not important)
• Stacked architecture of modules that compute internal, abstract representations of the data
• Parameters are tuned from labeled examples
Deep Learning in the 90’s
sources: LeNet5 & Stanford Deep Learning Tutorial
• All modules are differentiable
• w.r.t. the module inputs
• w.r.t. the module parameters
• Training by (Stochastic) Gradient Descent
• Chain rule: the backpropagation algorithm
• Tune the parameters to minimize a classification loss (see the sketch below)
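To make this loop concrete, here is a minimal sketch, not from the slides, of two stacked differentiable modules trained by backpropagation and stochastic gradient descent using only NumPy; the layer sizes, learning rate and synthetic data are illustrative choices.

# Minimal sketch: linear -> ReLU -> linear -> softmax, trained by SGD + backprop.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 2-class problem: 256 points in 20 dimensions.
X = rng.normal(size=(256, 20))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Parameters of the two stacked modules.
W1 = rng.normal(scale=0.1, size=(20, 32)); b1 = np.zeros(32)
W2 = rng.normal(scale=0.1, size=(32, 2));  b2 = np.zeros(2)

lr = 0.1
for epoch in range(100):
    for i in rng.permutation(len(X)):            # stochastic: one example at a time
        x, t = X[i:i + 1], y[i]

        # Forward pass: each module computes its output from the previous one.
        h_pre = x @ W1 + b1
        h = np.maximum(h_pre, 0.0)               # ReLU hidden representation
        logits = h @ W2 + b2
        p = np.exp(logits - logits.max())
        p /= p.sum()                             # softmax probabilities

        # Backward pass: chain rule, from the classification loss back to the inputs.
        d_logits = p.copy()
        d_logits[0, t] -= 1.0                    # gradient of cross-entropy w.r.t. logits
        dW2 = h.T @ d_logits; db2 = d_logits[0]
        d_h = d_logits @ W2.T
        d_hpre = d_h * (h_pre > 0)               # gradient through the ReLU module
        dW1 = x.T @ d_hpre; db1 = d_hpre[0]

        # SGD update: tune the parameters to minimize the loss.
        W1 -= lr * dW1; b1 -= lr * db1
        W2 -= lr * dW2; b2 -= lr * db2

pred = np.argmax(np.maximum(X @ W1 + b1, 0.0) @ W2 + b2, axis=1)
print("training accuracy:", (pred == y).mean())

Each module only needs its output and its gradients w.r.t. inputs and parameters; the chain rule stitches the stack together.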
Recent success
• 2009: state-of-the-art acoustic models for speech recognition
• 2011: state-of-the-art road sign classification
• 2012: state-of-the-art object classification
• 2013/14: end-to-end speech recognition, object detection
• 2014/15: state-of-the-art machine translation, getting closer for Natural Language Understanding in general
Why now?
• More labeled data
• More compute power (optimized BLAS and GPUs)
• Improvements to algorithms
source: Alec Radford on RNNs
Deep Learning for Computer Vision
Deep Learning in the 90's
• Yann LeCun invented Convolutional Networks
• First NN successfully trained with many layers (see the sketch below)
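As an illustration of that kind of architecture, here is a minimal sketch, not from the slides, of a LeNet-style stack of convolution, pooling and fully-connected layers written with PyTorch; the layer sizes are only loosely modeled on LeNet-5.

# Minimal sketch of a LeNet-style ConvNet for 32x32 grayscale images.
import torch
import torch.nn as nn

class LeNetLike(nn.Module):
    def __init__(self, n_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5), nn.Tanh(), nn.AvgPool2d(2),   # 32x32 -> 28x28 -> 14x14
            nn.Conv2d(6, 16, kernel_size=5), nn.Tanh(), nn.AvgPool2d(2),  # 14x14 -> 10x10 -> 5x5
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(16 * 5 * 5, 120), nn.Tanh(),
            nn.Linear(120, 84), nn.Tanh(),
            nn.Linear(84, n_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = LeNetLike()
dummy = torch.zeros(1, 1, 32, 32)   # one fake 32x32 grayscale image
print(model(dummy).shape)           # torch.Size([1, 10])

The convolution layers share their weights across spatial positions, which keeps the parameter count small enough for such deep stacks to be trainable.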
Early success at OCR
Natural image classification until 2012
[Pipeline diagram, repeated over several build slides: a data-independent "Feature Extraction" stage feeds a "Supervised Learning" classifier that outputs labels such as "dog" or "cat"; a sketch of this classic recipe follows]
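A minimal sketch of that pre-2012 recipe, not from the slides: hand-crafted, data-independent feature extraction (here HOG descriptors) followed by a supervised linear classifier. It assumes scikit-learn and scikit-image are installed and uses the small digits dataset purely for illustration.

# Traditional pipeline: hand-crafted features + supervised classifier.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC
from skimage.feature import hog

digits = load_digits()
features = np.array([
    hog(img, pixels_per_cell=(4, 4), cells_per_block=(1, 1))  # data-independent extraction
    for img in digits.images                                  # 8x8 grayscale images
])
X_train, X_test, y_train, y_test = train_test_split(
    features, digits.target, random_state=0)

clf = LinearSVC().fit(X_train, y_train)   # only this stage is learned from labels
print("test accuracy:", clf.score(X_test, y_test))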
Image classification today
[Pipeline diagram, repeated over several build slides: a stack of "NN Layer" blocks trained with supervised learning maps the image directly to labels such as "dog" or "cat"; the feature extraction itself is learned rather than hand-crafted]
ImageNet Challenge 2012
• 1.2M images labeled with 1000 object categories
• AlexNet, from the deep learning team at U. of Toronto, wins with a 15% error rate vs 26% for the runner-up (a traditional CV pipeline)
ImageNet Challenge 2013
• Clarifai's ConvNet model wins at 11% error rate
• Many other participants used ConvNets
GoogLeNet vs Andrej
• Andrej Karpathy evaluated human performance (himself): ~5% error rate
• "It is clear that humans will soon only be able to outperform state of the art image classification models by use of significant effort, expertise, and time."
source: What I learned from competing against a ConvNet on ImageNet
ImageNet Challenge 2015
• Microsoft Research Asia wins with networks with depths ranging from 34 to 152 layers
• New record: 3.6% error rate (a usage sketch of such a pretrained network follows)
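Such very deep residual networks are now routinely reused off the shelf. A minimal sketch, assuming a recent torchvision with downloadable ImageNet weights; "cat.jpg" is a placeholder file name.

# Classify one image with an ImageNet-pretrained ResNet.
# Older torchvision versions used models.resnet50(pretrained=True)
# and hand-written preprocessing instead of the weights enum below.
import torch
from torchvision import models
from PIL import Image

weights = models.ResNet50_Weights.DEFAULT          # ImageNet-pretrained residual network
model = models.resnet50(weights=weights).eval()
preprocess = weights.transforms()                  # resize, crop and normalize as expected

img = Image.open("cat.jpg")                        # placeholder input image
batch = preprocess(img).unsqueeze(0)               # add the batch dimension

with torch.no_grad():
    probs = model(batch).softmax(dim=1)
top = probs.argmax(dim=1).item()
print(weights.meta["categories"][top], probs[0, top].item())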
Applications of RNNs
• Natural Language Processing
(e.g. Language Modeling, Sentiment Analysis)
• Machine Translation
(e.g. English to French)
• Speech recognition: audio to text
• Speech synthesis: text to audio
• Biological sequence modeling (DNA, Proteins)
Language modeling
source: The Unreasonable Effectiveness of RNNs
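A minimal sketch of character-level language modeling in the spirit of char-rnn, not from the slides; the toy corpus, GRU cell and hyper-parameters are arbitrary choices.

# Character-level RNN language model: predict the next character, then sample.
import torch
import torch.nn as nn

text = "to be or not to be that is the question "
chars = sorted(set(text))
stoi = {c: i for i, c in enumerate(chars)}
data = torch.tensor([stoi[c] for c in text])

class CharRNN(nn.Module):
    def __init__(self, vocab, hidden=64):
        super().__init__()
        self.embed = nn.Embedding(vocab, hidden)
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab)

    def forward(self, x, h=None):
        y, h = self.rnn(self.embed(x), h)
        return self.out(y), h

model = CharRNN(len(chars))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

# Train to predict the next character at every position of the corpus.
x, y = data[:-1].unsqueeze(0), data[1:].unsqueeze(0)
for step in range(300):
    logits, _ = model(x)
    loss = loss_fn(logits.reshape(-1, len(chars)), y.reshape(-1))
    opt.zero_grad(); loss.backward(); opt.step()

# Sample a few characters from the trained model.
idx, h, out = torch.tensor([[stoi["t"]]]), None, "t"
for _ in range(40):
    logits, h = model(idx, h)
    probs = torch.softmax(logits[0, -1], dim=0)
    idx = torch.multinomial(probs, 1).view(1, 1)
    out += chars[idx.item()]
print(out)

Trained on a large corpus instead of a single sentence, the same loop produces the Shakespeare-like and source-code-like samples shown on the next slides.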
Shakespeare
source: The Unreasonable Effectiveness of RNNs
Linux source code
Attentional architectures for Machine Translation
Neural MT
source: From language modeling to machine translation
Attentional Neural MT
source: From language modeling to machine translation
source: Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Attention == Alignment
source: Neural MT by Jointly Learning to Align and Translate
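A minimal sketch of the alignment computation behind such attentional decoders, not from the slides; this uses additive (Bahdanau-style) scoring with random toy parameters and dimensions.

# Attention step: score each encoder state against the decoder state,
# softmax the scores into alignment weights, take the weighted average.
import numpy as np

rng = np.random.default_rng(0)

T, d = 6, 8                              # source length, hidden size (toy values)
enc_states = rng.normal(size=(T, d))     # encoder hidden states h_1..h_T
dec_state = rng.normal(size=(d,))        # current decoder hidden state s_t

# Learned parameters of the scoring MLP (random here, for illustration).
W_enc = rng.normal(scale=0.1, size=(d, d))
W_dec = rng.normal(scale=0.1, size=(d, d))
v = rng.normal(scale=0.1, size=(d,))

scores = np.tanh(enc_states @ W_enc + dec_state @ W_dec) @ v   # one score per source position
weights = np.exp(scores - scores.max())
weights /= weights.sum()                                       # softmax: alignment over source words
context = weights @ enc_states                                 # weighted average of encoder states

print("alignment weights:", np.round(weights, 3))
print("context vector shape:", context.shape)

The weight vector is exactly the soft alignment visualized in the figure: for each target word, it says how much each source word contributes.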
source: Show, Attend and Tell
Learning to answer questions
Paraphrases from web news
source: Teaching Machines to Read and Comprehend
source: Teaching Machines to Read and Comprehend
Medical Imaging
Challenges for NeuroImaging
• DL needs many labeled images
• Few subjects per study (costly)
• Poor labels: low inter-rater agreement (e.g. autism)
• fMRI: low SNR of the input data itself
• 3D data: huge GPU memory requirements (see the rough estimate below)
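A back-of-the-envelope sketch of that last point; the volume size, number of feature maps and batch size are arbitrary assumptions, not from the slides.

# Activation memory for one 3D conv layer on a 64x64x64 volume,
# 32 feature maps, float32, batch of 8.
voxels = 64 * 64 * 64
feature_maps = 32
batch = 8
bytes_per_float = 4
activation_bytes = voxels * feature_maps * batch * bytes_per_float
print(f"{activation_bytes / 1024**3:.2f} GiB for a single layer's activations")
# ~0.25 GiB per layer; tens of layers (plus gradients) quickly exhaust GPU memory.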
Conclusion
• ML and DL progress is fast-paced
• Many applications are already in production (e.g. speech, image indexing, translation, face recognition)
• Machine Learning is now moving from pattern recognition to higher-level reasoning
• Lack of high-quality labeled data is still a limitation for some applications