
An Intro to Deep Learning

Presentation given at Neurospin (NeuroBreakfast)

Olivier Grisel

July 26, 2017

Transcript

  1. An Intro to
    Deep Learning
    Olivier Grisel - Neurospin 2017

  2. Outline
    • ML, DL & Artificial Intelligence
    • Deep Learning
    • Computer Vision
    • Natural Language Understanding and Machine
    Translation
    • Other possible applications

  3. Machine Learning,

    Deep Learning and
    Artificial Intelligence

  4. Artificial Intelligence
    Predictive Modeling
    (Data Analytics)

  5. Artificial Intelligence
    Predictive Modeling
    (Data Analytics)
    Self-driving cars
    IBM Watson
    Movie
    recommendations
    Predictive
    Maintenance

  6. Artificial Intelligence
    Hand-crafted
    symbolic
    reasoning
    systems
    Predictive Modeling
    (Data Analytics)

  7. Artificial Intelligence
    Hand-crafted
    symbolic
    reasoning
    systems
    Machine Learning
    Predictive Modeling
    (Data Analytics)

  8. Artificial Intelligence
    Hand-crafted
    symbolic
    reasoning
    systems
    Machine Learning
    Deep
    Learning
    Predictive Modeling
    (Data Analytics)

  9. Artificial Intelligence
    Hand-crafted
    symbolic
    reasoning
    systems
    Machine Learning
    Deep
    Learning
    Predictive Modeling
    (Data Analytics)

  10. Deep Learning
    • Neural Networks from the 90’s rebranded in 2006+
    • "Neuron" is only a loose inspiration (the biological analogy is not important)
    • Stacked architecture of modules that compute
    internal abstract representations from the data
    • Parameters are tuned from labeled examples

  11. Deep Learning in the 90’s
    sources: LeNet5 & Stanford Deep Learning Tutorial

  12. x = input vector
    h1 = f1(x, w1) = max(dot(x, w1), 0)   (hidden activations)
    h2 = f2(h1, w2) = max(dot(h1, w2), 0)  (hidden activations)
    y = f3(h2, w3) = softmax(dot(h2, w3))  (output vector)
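
    A minimal NumPy sketch of this forward pass (the layer sizes are illustrative assumptions, and biases are omitted as on the slide):

```python
import numpy as np

def relu(z):
    # max(., 0) element-wise, as in f1 and f2
    return np.maximum(z, 0)

def softmax(z):
    # subtract the max for numerical stability
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(0)
x = rng.normal(size=64)           # input vector
w1 = rng.normal(size=(64, 128))   # parameters of f1
w2 = rng.normal(size=(128, 128))  # parameters of f2
w3 = rng.normal(size=(128, 10))   # parameters of f3

h1 = relu(np.dot(x, w1))          # hidden activations
h2 = relu(np.dot(h1, w2))         # hidden activations
y = softmax(np.dot(h2, w3))       # output: class probabilities
assert np.isclose(y.sum(), 1.0)
```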

  13. • All modules are differentiable
    • w.r.t. module inputs
    • w.r.t. module parameters
    • Training by (Stochastic) Gradient Descent
    • Chain rule: backpropagation algorithm
    • Tune parameters to minimize classification loss
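
    In practice the chain rule is applied automatically by a framework. A hedged Keras sketch of training the architecture from slide 12 by SGD on a cross-entropy loss (the placeholder data and hyperparameters are assumptions, just to show the API):

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import SGD

# same stack as slide 12: two ReLU layers, then softmax
model = Sequential([
    Dense(128, activation='relu', input_shape=(64,)),  # f1
    Dense(128, activation='relu'),                     # f2
    Dense(10, activation='softmax'),                   # f3
])

# gradients w.r.t. every parameter come from backpropagation;
# SGD takes a small step against them for each mini-batch
model.compile(optimizer=SGD(lr=0.01),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# random placeholder data
X = np.random.randn(256, 64).astype('float32')
y = np.random.randint(0, 10, size=256)
model.fit(X, y, batch_size=32, epochs=2)
```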

  14. Recent successes
    • 2009: state of the art acoustic model for speech
    recognition
    • 2011: state of the art road sign classification
    • 2012: state of the art object classification
    • 2013/14: end-to-end speech recognition, object
    detection
    • 2014/15: state of the art machine translation, getting
    closer for Natural Language Understanding in general

  15. Why now?
    • More labeled data
    • More compute power (optimized BLAS and GPUs)
    • Improvements to algorithms

  16. source: Alec Radford on RNNs

  17. Deep Learning for
    Computer Vision

  18. Deep Learning in the 90’s
    • Yann LeCun invented Convolutional Networks
    • The first NNs successfully trained with many layers
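
    A LeNet-style convolutional network, sketched in Keras (the layer sizes are in the spirit of LeNet5, not the exact original):

```python
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

# small LeNet-style stack for 28x28 grayscale digit images
model = Sequential([
    Conv2D(6, (5, 5), activation='relu',
           input_shape=(28, 28, 1)),        # learn local filters
    MaxPooling2D((2, 2)),                   # spatial subsampling
    Conv2D(16, (5, 5), activation='relu'),  # higher-level features
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(120, activation='relu'),
    Dense(84, activation='relu'),
    Dense(10, activation='softmax'),        # 10 digit classes
])
model.compile(optimizer='sgd',
              loss='sparse_categorical_crossentropy')
```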

  19. Early success at OCR

  20. Natural image classification
    until 2012
    A pipeline of hand-crafted, data-independent
    feature extraction followed by supervised
    learning of the classifier ("dog", "cat")

  21.–22. (build-up of the same pipeline diagram)

  23. Image classification today
    A stack of NN layers trained end-to-end with
    supervised learning, mapping raw pixels
    directly to labels ("dog", "cat")

  24.–26. (build-up of the same diagram)

  27. ImageNet Challenge 2012
    • 1.2M images labeled with 1000 object categories
    • AlexNet from the deep learning team at the
    University of Toronto wins with a 15% error rate
    vs 26% for the runner-up (a traditional
    computer vision pipeline)

  28. (image slide)

  29. ImageNet Challenge 2013
    • Clarifai ConvNet model wins at 11% error rate
    • Many other participants used ConvNets

  30. (image slide)

  31. ImageNet Challenge 2014
    • Monster model: GoogLeNet at
    6.7% error rate

  32. GoogLeNet vs Andrej
    • Andrej Karpathy evaluated human performance
    (himself): ~5% error rate
    • "It is clear that humans will soon only be able to
    outperform state of the art image classification
    models by use of significant effort, expertise, and
    time.”
    source: What I learned from competing against a ConvNet on ImageNet

  33. ImageNet Challenge 2015
    • Microsoft Research Asia wins with ResNet
    models ranging from 34 to 152 layers deep
    • New record: 3.6% error rate
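
    The key ingredient behind such depths is the residual (skip) connection. A minimal sketch of one residual block in Keras (the filter count is an assumption, and the input is assumed to already have matching channels):

```python
from keras.models import Model
from keras.layers import Input, Conv2D, BatchNormalization, Activation, add

def residual_block(x, filters=64):
    # y = relu(F(x) + x): two convs plus an identity shortcut
    h = Conv2D(filters, (3, 3), padding='same')(x)
    h = BatchNormalization()(h)
    h = Activation('relu')(h)
    h = Conv2D(filters, (3, 3), padding='same')(h)
    h = BatchNormalization()(h)
    # the shortcut lets gradients flow through very deep stacks
    return Activation('relu')(add([h, x]))

inp = Input(shape=(32, 32, 64))  # input must already have 64 channels
model = Model(inp, residual_block(inp))
```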

  34. source: https://www.eff.org/files/AI-progress-metrics.html

  35. source: https://github.com/facebookresearch/deepmask

  36. source: https://github.com/Cadene/vqa.pytorch

  37. source: https://github.com/Cadene/vqa.pytorch

  38. Recurrent
    Neural Networks

  39. source: The Unreasonable Effectiveness of RNNs

  40. Applications of RNNs
    • Natural Language Processing
    (e.g. Language Modeling, Sentiment Analysis)
    • Machine Translation (e.g. English to French)
    • Speech recognition: audio to text
    • Speech synthesis: text to audio
    • Biological sequence modeling (DNA, proteins)

  41. Language modeling
    source: The Unreasonable Effectiveness of RNNs
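
    A minimal character-level language model in Keras, in the spirit of char-rnn (the vocabulary size and context length are assumptions):

```python
from keras.models import Sequential
from keras.layers import LSTM, Dense

vocab_size = 60  # number of distinct characters (assumed)
seq_len = 40     # characters of context per training example

# read a window of one-hot encoded characters, predict the next one
model = Sequential([
    LSTM(128, input_shape=(seq_len, vocab_size)),
    Dense(vocab_size, activation='softmax'),
])
model.compile(optimizer='adam', loss='categorical_crossentropy')
# after training on a corpus (Shakespeare, Linux source, ...),
# sampling from the softmax one character at a time generates text
```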

  42. Shakespeare
    source: The Unreasonable Effectiveness of RNNs

  43. Linux source code

  44. Attentional architectures
    for Machine Translation

  45. Neural MT
    source: From language modeling to machine translation

  46. Attentional Neural MT
    source: From language modeling to machine translation

  47. source: Google's Neural Machine Translation System: Bridging
    the Gap between Human and Machine Translation

  48. Attention == Alignment
    source: Neural MT by Jointly Learning to Align and Translate
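
    At each decoding step, the attention weights form a soft alignment over source positions. A NumPy sketch of the usual softmax attention, using dot-product scores as one common choice (shapes are illustrative):

```python
import numpy as np

def attend(query, source_states):
    # score how well the decoder state matches each source position
    scores = source_states @ query           # shape: (src_len,)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                 # soft alignment, sums to 1
    # context vector: weighted average of the source representations
    return weights @ source_states, weights

rng = np.random.default_rng(0)
source_states = rng.normal(size=(7, 16))  # one encoder state per source word
query = rng.normal(size=16)               # current decoder state
context, alignment = attend(query, source_states)
print(alignment.round(2))  # which source words the decoder attends to
```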

  49. source: Show, Attend and Tell

  50. Learning to answer
    questions

  51. Paraphrases from web news

  52. source: Teaching Machines to Read and Comprehend

  53. source: Teaching Machines to Read and Comprehend

  54. (image slide)

  55. Medical Imaging

  56. (image slide)

  57. (image slide)

  58. Challenges for
    NeuroImaging
    • DL needs many labeled images
    • Few subjects per study (acquisition is costly)
    • Poor labels: low inter-rater agreement (e.g. autism)
    • fMRI: low SNR of the input data itself
    • 3D data: huge GPU memory requirements

  59. Conclusion
    • ML and DL progress is fast-paced
    • Many applications already in production (e.g.
    speech, image indexing, translation, face
    recognition)
    • Machine Learning is now moving from pattern
    recognition to higher-level reasoning
    • Lack of high-quality labeled data is still a
    limitation for some applications

  60. Thank you!
    http://twitter.com/ogrisel
    http://speakerdeck.com/ogrisel
    Online DL class: http://www.fast.ai/
    Keras examples: https://keras.io/
    DL Book: http://www.deeplearningbook.org/
    UPS DL class: https://github.com/m2dsupsdlclass/lectures-labs
