An Intro to Deep Learning

Presentation given at Neurospin (NeuroBreakfast)

Olivier Grisel

July 26, 2017

Transcript

  1. An Intro to
    Deep Learning
    Olivier Grisel - Neurospin 2017

  2. Outline
    • ML, DL & Artificial Intelligence
    • Deep Learning
    • Computer Vision
    • Natural Language Understanding and Machine
    Translation
    • Other possible applications

  3. Machine Learning,
    Deep Learning and
    Artificial Intelligence

  4. [Diagram labels: Artificial Intelligence; Predictive Modeling
    (Data Analytics)]

  5. [Same diagram, with examples: self-driving cars, IBM Watson,
    movie recommendations, predictive maintenance]

  6. [Diagram labels: Artificial Intelligence; hand-crafted symbolic
    reasoning systems; Predictive Modeling (Data Analytics)]

  7. [Previous diagram, adding: Machine Learning]

  8. [Previous diagram, adding: Deep Learning]

  9. [Same diagram as slide 8 (animation step)]

  10. Deep Learning
    • Neural Networks from the 90’s rebranded in 2006+
    • « Neuron » is a loose inspiration (the biological analogy is not essential)
    • Stacked architecture of modules that compute
    internal abstract representations from the data
    (see the sketch below)
    • Parameters are tuned from labeled examples
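
    As a concrete illustration (not from the slides), such a stack of
    modules can be written in a few lines of Keras, the library linked
    on the last slide. A minimal sketch, where the layer sizes are
    arbitrary assumptions:

        # Hypothetical stacked architecture: each Dense layer is one
        # "module" computing an internal representation of the data.
        from keras.models import Sequential
        from keras.layers import Dense

        model = Sequential([
            Dense(128, activation='relu', input_shape=(784,)),  # module 1
            Dense(128, activation='relu'),                      # module 2
            Dense(10, activation='softmax'),                    # class probabilities
        ])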

  11. Deep Learning in the 90’s
    sources: LeNet5 & Stanford Deep Learning Tutorial

  12. [Diagram: a feed-forward network with two hidden layers]
    x  = input vector
    h1 = f1(x, w1)  = max(dot(x, w1), 0)    (hidden activations)
    h2 = f2(h1, w2) = max(dot(h1, w2), 0)   (hidden activations)
    y  = f3(h2, w3) = softmax(dot(h2, w3))  (output vector)
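
    The formulas above translate directly into NumPy. A minimal sketch,
    assuming illustrative input and layer sizes:

        import numpy as np

        def relu(z):
            return np.maximum(z, 0)

        def softmax(z):
            e = np.exp(z - z.max())  # subtract the max for numerical stability
            return e / e.sum()

        rng = np.random.RandomState(0)
        x = rng.randn(784)               # input vector (size is an assumption)
        w1 = rng.randn(784, 128) * 0.01  # parameters of f1
        w2 = rng.randn(128, 128) * 0.01  # parameters of f2
        w3 = rng.randn(128, 10) * 0.01   # parameters of f3

        h1 = relu(np.dot(x, w1))         # h1 = f1(x, w1)
        h2 = relu(np.dot(h1, w2))        # h2 = f2(h1, w2)
        y = softmax(np.dot(h2, w3))      # y = f3(h2, w3), sums to 1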

  13. • All modules are differentiable
    • w.r.t. module inputs
    • w.r.t. module parameters
    • Training by (Stochastic) Gradient Descent (sketched below)
    • Chain rule: backpropagation algorithm
    • Tune parameters to minimize classification loss
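
    Continuing the NumPy sketch from the previous slide, one stochastic
    gradient descent step on the output module shows the chain rule at
    work: for a softmax output trained with cross-entropy, the gradient
    with respect to the logits is simply the prediction minus the
    one-hot target. The learning rate and label below are assumptions:

        learning_rate = 0.01                 # illustrative value
        target = np.zeros(10)
        target[3] = 1.0                      # assumed one-hot label

        grad_logits = y - target             # dLoss/dlogits for softmax + xent
        grad_w3 = np.outer(h2, grad_logits)  # chain rule: dLoss/dw3
        w3 -= learning_rate * grad_w3        # SGD update of the parameters

    Frameworks apply the same chain rule automatically through every
    module; that is the backpropagation algorithm.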

  14. Recent success
    • 2009: state of the art acoustic model for speech
    recognition
    • 2011: state of the art road sign classification
    • 2012: state of the art object classification
    • 2013/14: end-to-end speech recognition, object
    detection
    • 2014/15: state of the art machine translation, getting
    closer for Natural Language Understanding in general

  15. Why now?
    • More labeled data
    • More compute power (optimized BLAS and GPUs)
    • Improvements to algorithms

  16. source: Alec Radford on RNNs

  17. Deep Learning for
    Computer Vision

  18. Deep Learning in the 90’s
    • Yann LeCun invented Convolutional Networks (sketched below)
    • First NN successfully trained with many layers
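
    A rough LeNet-flavoured convolutional network can be sketched in
    Keras. The layer sizes below are assumptions, not LeCun's exact
    architecture:

        from keras.models import Sequential
        from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

        model = Sequential([
            Conv2D(6, (5, 5), activation='relu',
                   input_shape=(32, 32, 1)),    # learned feature maps
            MaxPooling2D((2, 2)),               # spatial subsampling
            Conv2D(16, (5, 5), activation='relu'),
            MaxPooling2D((2, 2)),
            Flatten(),
            Dense(120, activation='relu'),
            Dense(10, activation='softmax'),    # e.g. 10 digit classes for OCR
        ])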

  19. Early success at OCR

  20. Natural image classification until 2012
    [Diagram: a data-independent feature extraction step feeds a
    supervised learning classifier, which outputs "dog"]

  21. Natural image classification until 2012
    [Same pipeline diagram, outputting "dog" and "cat"]

  22. Natural image classification until 2012
    [Same pipeline diagram, outputting "cat"]

  23. Image classification today
    [Diagram: a stack of NN layers, each trained with supervised
    learning, outputs "dog"]

  24. Image classification today
    [Same stacked-NN diagram, outputting "dog" and "cat"]

  25. Image classification today
    [Same diagram as slide 24 (animation step)]

  26. Image classification today
    [Same diagram as slide 24 (animation step)]

  27. ImageNet Challenge 2012
    • 1.2M images labeled with 1000 object categories
    • AlexNet, from the deep learning team at U. of
    Toronto, wins with a 15% error rate vs 26% for the
    runner-up (a traditional computer vision pipeline)

  28. ImageNet Challenge 2013
    • Clarifai's ConvNet model wins with an 11% error rate
    • Many other participants used ConvNets

  29. ImageNet Challenge 2014
    • Monster model: GoogLeNet at
    6.7% error rate

  30. GoogLeNet vs Andrej
    • Andrej Karpathy evaluated human performance
    (himself): ~5% error rate
    • "It is clear that humans will soon only be able to
    outperform state of the art image classification
    models by use of significant effort, expertise, and
    time.”
    source: What I learned from competing against a ConvNet on ImageNet

  31. ImageNet Challenge 2015
    • Microsoft Research Asia wins
    with networks ranging from 34
    to 152 layers deep
    • New record: 3.6% error rate

  32. source: https://www.eff.org/files/AI-progress-metrics.html

  33. source: https://github.com/facebookresearch/deepmask

  34. source: https://github.com/Cadene/vqa.pytorch

  35. source: https://github.com/Cadene/vqa.pytorch

  36. Recurrent
    Neural Networks

  37. source: The Unreasonable Effectiveness of RNNs

  38. Applications of RNNs
    • Natural Language Processing (e.g. language modeling, sketched
    below, and sentiment analysis)
    • Machine Translation (e.g. English to French)
    • Speech recognition: audio to text
    • Speech synthesis: text to audio
    • Biological sequence modeling (DNA, proteins)
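
    As noted in the first bullet, a character-level language model is
    one of the simplest RNN applications. A minimal Keras sketch, where
    the vocabulary size and context window are assumptions:

        from keras.models import Sequential
        from keras.layers import LSTM, Dense

        vocab_size = 64  # number of distinct characters (assumed)
        window = 40      # characters of context per training example

        model = Sequential([
            LSTM(128, input_shape=(window, vocab_size)),  # recurrent state
            Dense(vocab_size, activation='softmax'),      # next-char distribution
        ])
        model.compile(optimizer='adam', loss='categorical_crossentropy')

    Trained on one-hot encoded text and sampled one character at a
    time, such a model produces output like the Shakespeare and Linux
    examples on the following slides.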

  39. Language modeling
    source: The Unreasonable Effectiveness of RNNs

  40. Shakespeare
    source: The Unreasonable Effectiveness of RNNs

  41. Linux source code

  42. Attentional architectures
    for Machine Translation

  43. Neural MT
    source: From language modeling to machine translation

  44. Attentional Neural MT
    source: From language modeling to machine translation

  45. source: Google's Neural Machine Translation System: Bridging
    the Gap between Human and Machine Translation

  46. Attention == Alignment
    source: Neural MT by Jointly Learning to Align and Translate
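
    A minimal NumPy sketch of the idea: each decoder state scores every
    encoder state, and a softmax turns the scores into alignment
    weights. Dot-product scoring is used here for simplicity; the cited
    paper learns a small additive scoring network instead:

        import numpy as np

        def softmax(z):
            e = np.exp(z - z.max())
            return e / e.sum()

        rng = np.random.RandomState(0)
        encoder_states = rng.randn(6, 32)  # one vector per source word
        decoder_state = rng.randn(32)      # current target-side state

        scores = encoder_states.dot(decoder_state)  # one score per source word
        weights = softmax(scores)                   # alignment weights, sum to 1
        context = weights.dot(encoder_states)       # weighted summary of the source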

  47. source: Show, Attend and Tell

  48. Learning to answer
    questions

  49. Paraphrases from web news

  50. source: Teaching Machines to Read and Comprehend

  51. source: Teaching Machines to Read and Comprehend

  52. Medical Imaging

  53. Challenges for
    NeuroImaging
    • DL needs many labeled images
    • Few subjects per study (costly)
    • Poor labels: low inter-rater agreement (e.g. autism)
    • fMRI: low SNR in the input data itself
    • 3D data: huge GPU memory requirements

  54. Conclusion
    • ML and DL progress is fast paced
    • Many applications already in production (e.g.
    speech, image indexing, translation, face
    recognition)
    • Machine Learning is now moving from pattern
    recognition to higher level reasoning
    • Lack of high-quality labeled data is still a limitation
    for some applications

  55. Thank you!
    http://twitter.com/ogrisel
    http://speakerdeck.com/ogrisel
    Online DL class: http://www.fast.ai/
    Keras examples: https://keras.io/
    DL Book: http://www.deeplearningbook.org/
    UPS DL class: https://github.com/m2dsupsdlclass/lectures-labs
