An Intro to Deep Learning

Presentation given at Neurospin (NeuroBreakfast)

Olivier Grisel

July 26, 2017

Transcript

  1. An Intro to Deep Learning Olivier Grisel - Neurospin 2017

  2. Outline • ML, DL & Artificial Intelligence • Deep Learning

    • Computer Vision • Natural Language Understanding and Machine Translation • Other possible applications
  3. Machine Learning, Deep Learning and Artificial Intelligence

  4. Artificial Intelligence Predictive Modeling (Data Analytics)

  5. Artificial Intelligence Predictive Modeling (Data Analytics) Self-driving cars IBM Watson

    Movie recommendations Predictive Maintenance
  6. Artificial Intelligence Hand-crafted symbolic reasoning systems Predictive Modeling (Data Analytics)

  7. Artificial Intelligence Hand-crafted symbolic reasoning systems Machine Learning Predictive Modeling

    (Data Analytics)
  8. Artificial Intelligence Hand-crafted symbolic reasoning systems Machine Learning Deep Learning

    Predictive Modeling (Data Analytics)
  9. Artificial Intelligence Hand-crafted symbolic reasoning systems Machine Learning Deep Learning

    Predictive Modeling (Data Analytics)
  10. Deep Learning • Neural Networks from the 90’s rebranded in

    2006+ • « Neuron » is a loose inspiration (not important) • Stacked architecture of modules that compute internal abstract representations from the data • Parameters are tuned from labeled examples
  11. Deep Learning in the 90’s sources: LeNet5 & Stanford Deep

    Learning Tutorial
  12. x = Input Vector • h1, h2 = Hidden Activations • y = Output Vector

    f1(x, w1) = max(dot(x, w1), 0) • f2(h1, w2) = max(dot(h1, w2), 0) • f3(h2, w3) = softmax(dot(h2, w3))
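
These formulas translate almost line for line into NumPy. A minimal forward-pass sketch; the input size, layer widths and number of classes are illustrative assumptions, not taken from the slide:

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0)

def softmax(z):
    e = np.exp(z - z.max())          # subtract max for numerical stability
    return e / e.sum()

# Illustrative sizes: 784-dim input, two hidden layers of 128 units, 10 classes.
rng = np.random.RandomState(0)
w1 = rng.randn(784, 128) * 0.01
w2 = rng.randn(128, 128) * 0.01
w3 = rng.randn(128, 10) * 0.01

x = rng.randn(784)                   # input vector
h1 = relu(np.dot(x, w1))             # f1(x, w1)  = max(dot(x, w1), 0)
h2 = relu(np.dot(h1, w2))            # f2(h1, w2) = max(dot(h1, w2), 0)
y = softmax(np.dot(h2, w3))          # f3(h2, w3) = softmax(dot(h2, w3))
print(y.shape, y.sum())              # (10,) 1.0 -> class probabilities
```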
  13. • All modules are differentiable • w.r.t. module inputs •

    w.r.t. module parameters • Training by (Stochastic) Gradient Descent • Chain rule: backpropagation algorithm • Tune parameters to minimize classification loss
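
Continuing the forward-pass sketch above, a hand-written application of the chain rule (backpropagation) followed by one SGD update could look like this; the target class index and learning rate are illustrative assumptions:

```python
# Continues the forward-pass sketch above; `target` is the true class index.
target = 3
loss = -np.log(y[target])            # cross-entropy classification loss

# Backpropagation: chain rule applied module by module, from output to input.
d_logits = y.copy()
d_logits[target] -= 1.0              # d loss / d dot(h2, w3)
d_w3 = np.outer(h2, d_logits)        # gradient w.r.t. module parameters w3
d_h2 = np.dot(w3, d_logits)          # gradient w.r.t. module input h2
d_h2[h2 <= 0] = 0.0                  # back through the ReLU of f2
d_w2 = np.outer(h1, d_h2)
d_h1 = np.dot(w2, d_h2)
d_h1[h1 <= 0] = 0.0                  # back through the ReLU of f1
d_w1 = np.outer(x, d_h1)

# One (stochastic) gradient descent update to reduce the loss.
lr = 0.01
w1 -= lr * d_w1
w2 -= lr * d_w2
w3 -= lr * d_w3
```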
  14. Recent success • 2009: state of the art acoustic model

    for speech recognition • 2011: state of the art road sign classification • 2012: state of the art object classification • 2013/14: end-to-end speech recognition, object detection • 2014/15: state of the art machine translation, getting closer for Natural Language Understanding in general
  15. Why now? • More labeled data • More compute power

    (optimized BLAS and GPUs) • Improvements to algorithms
  16. source: Alec Radford on RNNs

  17. Deep Learning for Computer Vision

  18. Deep Learning in the 90’s • Yann LeCun invented Convolutional

    Networks • First NN successfully trained with many layers
  19. Early success at OCR

  20. Natural image classification until 2012: Feature Extraction (data-independent) → Classification (Supervised Learning) → "dog"

  21. Natural image classification until 2012: Feature Extraction (data-independent) → Classification (Supervised Learning) → "dog" / "cat"

  22. Natural image classification until 2012: Feature Extraction (data-independent) → Classification (Supervised Learning) → "cat"
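
A rough sketch of this pre-2012 style of pipeline: a fixed, data-independent feature extractor (HOG is used here purely as an example choice) followed by a supervised classifier. `train_images`, `train_labels` and `test_images` are assumed to be provided:

```python
import numpy as np
from skimage.feature import hog          # hand-crafted, data-independent features
from sklearn.svm import LinearSVC        # supervised classifier trained on top

def extract_features(images):
    # Same fixed transformation for every dataset: no parameters learned from data.
    return np.array([hog(img, orientations=9, pixels_per_cell=(8, 8),
                         cells_per_block=(2, 2)) for img in images])

# train_images: iterable of 2D grayscale arrays; train_labels: e.g. "dog" / "cat".
X_train = extract_features(train_images)
clf = LinearSVC().fit(X_train, train_labels)
predictions = clf.predict(extract_features(test_images))
```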
  23. Image classification today: NN Layer → NN Layer → NN Layer, every layer trained with Supervised Learning → "dog"

  24. Image classification today: NN Layer → NN Layer → NN Layer, every layer trained with Supervised Learning → "dog" / "cat"

  25. Image classification today: NN Layer → NN Layer → NN Layer, every layer trained with Supervised Learning → "dog" / "cat"

  26. Image classification today: NN Layer → NN Layer → NN Layer, every layer trained with Supervised Learning → "dog" / "cat"
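
The last slide of the deck points to Keras, so here is a minimal Keras sketch of such an end-to-end trained stack of layers; the layer sizes, input shape, number of classes and optimizer settings are illustrative assumptions:

```python
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

# Stack of NN layers; all parameters are tuned jointly by supervised learning.
model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(64, 64, 3)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(128, activation='relu'),
    Dense(2, activation='softmax'),     # e.g. "dog" vs "cat"
])
model.compile(optimizer='sgd', loss='categorical_crossentropy',
              metrics=['accuracy'])
# model.fit(X_train, y_train, epochs=10, batch_size=32)
# X_train: image tensors, y_train: one-hot encoded labels
```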
  27. ImageNet Challenge 2012 • 1.2M images labeled with 1000 object

    categories • AlexNet from the deep learning team of U. of Toronto wins with a 15% error rate vs 26% for the runner-up (a traditional computer vision pipeline)
  28. (image-only slide)
  29. ImageNet Challenge 2013 • Clarifai ConvNet model wins at 11%

    error rate • Many other participants used ConvNets
  30. (image-only slide)
  31. ImageNet Challenge 2014 • Monster model: GoogLeNet at 6.7% error

    rate
  32. GoogLeNet vs Andrej • Andrej Karpathy evaluated human performance (himself):

    ~5% error rate • "It is clear that humans will soon only be able to outperform state of the art image classification models by use of significant effort, expertise, and time.” source: What I learned from competing against a ConvNet on ImageNet
  33. ImageNet Challenge 2015 • Microsoft Research Asia wins with networks

    with depths ranging from 34 to 152 layers • New record: 3.6% error rate
  34. source: https://www.eff.org/files/AI-progress-metrics.html

  35. source: https://github.com/facebookresearch/deepmask

  36. source: https://github.com/Cadene/vqa.pytorch

  37. source: https://github.com/Cadene/vqa.pytorch

  38. Recurrent Neural Networks

  39. source: The Unreasonable Effectiveness of RNNs

  40. Applications of RNNs • Natural Language Processing (e.g. Language Modeling, Sentiment Analysis)

    • Machine Translation (e.g. English to French) • Speech recognition: audio to text • Speech synthesis: text to audio • Biological sequence modeling (DNA, Proteins)
  41. Language modeling source: The Unreasonable Effectiveness of RNNs

  42. Shakespeare source: The Unreasonable Effectiveness of RNNs

  43. Linux source code
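
The Shakespeare and Linux-kernel samples come from character-level language models: an RNN trained to predict the next character given the previous ones, then sampled repeatedly to generate text. A minimal Keras sketch of that idea; the window length, vocabulary size and layer width are illustrative assumptions:

```python
from keras.models import Sequential
from keras.layers import LSTM, Dense

seq_len, n_chars = 40, 96   # context window and character vocabulary size (assumed)

# Given a window of seq_len one-hot encoded characters, predict the next character.
model = Sequential([
    LSTM(128, input_shape=(seq_len, n_chars)),
    Dense(n_chars, activation='softmax'),
])
model.compile(optimizer='adam', loss='categorical_crossentropy')
# After training on a corpus (Shakespeare, Linux source, ...), text is generated by
# repeatedly sampling the next character from the softmax and sliding the window.
```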

  44. Attentional architectures for Machine Translation

  45. Neural MT source: From language modeling to machine translation

  46. Attentional Neural MT source: From language modeling to machine translation

  47. source: Google's Neural Machine Translation System: Bridging the Gap between

    Human and Machine Translation
  48. Attention == Alignment source: Neural MT by Jointly Learning to

    Align and Translate
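
The "attention == alignment" picture can be made concrete with a small NumPy sketch of content-based attention. Plain dot-product scoring is used here for brevity; the cited paper (Bahdanau et al.) actually learns a small scoring network, so treat this as an illustration of the idea rather than that exact model:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.RandomState(0)
encoder_states = rng.randn(7, 32)   # one vector per source word (7 words, 32 dims, assumed)
decoder_state = rng.randn(32)       # current decoder state (target side)

scores = encoder_states.dot(decoder_state)   # how well each source word matches
weights = softmax(scores)                    # attention weights == soft alignment
context = weights.dot(encoder_states)        # weighted summary of the source sentence

print(weights.round(2))   # these weights are what the alignment plots visualize
```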
  49. source: Show, Attend and Tell

  50. Learning to answer questions

  51. Paraphrases from web news

  52. source: Teaching Machines to Read and Comprehend

  53. source: Teaching Machines to Read and Comprehend

  54. (image-only slide)
  55. Medical Imaging

  56. (image-only slide)
  57. (image-only slide)
  58. Challenges for NeuroImaging • DL needs many labeled images •

    Few subjects per study (costly) • Poor labels: low inter-rater agreement (e.g. autism) • fMRI: low SNR of the input data itself • 3D data: huge GPU memory requirements
  59. Conclusion • ML and DL progress is fast-paced •

    Many applications are already in production (e.g. speech, image indexing, translation, face recognition) • Machine Learning is now moving from pattern recognition to higher-level reasoning • A lack of high-quality labeled data is still a limitation for some applications
  60. Thank you! http://twitter.com/ogrisel http://speakerdeck.com/ogrisel Online DL class: http://www.fast.ai/ Keras examples:

    https://keras.io/ DL Book: http://www.deeplearningbook.org/ UPS DL class: https://github.com/m2dsupsdlclass/lectures-labs