Recent Developments in Deep Learning

Olivier Grisel
Paris Datageeks meetup
September 15, 2015

Transcript

  1. Recent developments in Deep Learning Olivier Grisel - Paris Datageeks - Sept. 2015

  2. Outline
     • Deep Learning quick recap
     • Recurrent Neural Networks
     • Attention for Machine Translation
     • Attention and differentiable memory for reasoning

  3. Deep Learning
     • Neural Networks from the 90’s rebranded in 2006+
     • “Neuron” is a loose inspiration (not important)
     • Stacked layers of differentiable modules (matrix multiplication, convolution, pooling, element-wise non-linear operations…)
     • Can be trained via gradient descent on large sets of input/output example pairs
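
     A minimal sketch (not from the slides) of the training idea in the last bullet: a single differentiable module (a linear layer with softmax output) fitted to toy input/output pairs by plain gradient descent. Deeper networks just chain more such modules and backpropagate through the chain.

        import numpy as np

        rng = np.random.RandomState(0)
        X = rng.randn(100, 4)                  # 100 toy input vectors
        y = (X[:, 0] > 0).astype(int)          # toy binary labels
        W = np.zeros((4, 2))                   # parameters of the module

        def softmax(z):
            e = np.exp(z - z.max(axis=1, keepdims=True))
            return e / e.sum(axis=1, keepdims=True)

        for step in range(200):
            probs = softmax(X @ W)             # forward pass
            grad = probs.copy()                # d(cross-entropy)/d(logits) = probs - one_hot(y)
            grad[np.arange(len(y)), y] -= 1.0
            W -= 0.5 * (X.T @ grad) / len(y)   # gradient descent update
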
  4. Deep Learning in the 90’s sources: LeNet5 & Stanford Deep Learning Tutorial

  5. x = input vector; w1, w2, w3 = weights
     h1 = f1(x, w1) = max(conv(x, w1), 0) (hidden activations)
     h2 = f2(h1, w2) = max(dot(h1, w2), 0) (hidden activations)
     y = f3(h2, w3) = softmax(dot(h2, w3)) (output vector)
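
     A minimal NumPy sketch of the forward pass above; the shapes and the 1-D convolution are illustrative assumptions, not the actual LeNet5 architecture.

        import numpy as np

        def relu(z):
            return np.maximum(z, 0)

        def softmax(z):
            e = np.exp(z - z.max())
            return e / e.sum()

        rng = np.random.RandomState(0)
        x = rng.randn(32)        # input vector
        w1 = rng.randn(5)        # convolution filter
        w2 = rng.randn(28, 16)   # dense weights (32 - 5 + 1 = 28 conv outputs)
        w3 = rng.randn(16, 10)   # output weights for 10 classes

        h1 = relu(np.convolve(x, w1, mode="valid"))  # f1: convolution + rectification
        h2 = relu(h1 @ w2)                           # f2: dot product + rectification
        y = softmax(h2 @ w3)                         # f3: dot product + softmax
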
  6. Recent success
     • 2009: state of the art acoustic model for speech recognition
     • 2011: state of the art road sign classification
     • 2012: state of the art object classification
     • 2013/14: end-to-end speech recognition, object detection
     • 2014/15: state of the art machine translation, getting closer for Natural Language Understanding in general

  7. ImageNet Challenge ILSVRC2014
     • 1.2 million images, 1000 classes
     • Last winner: GoogLeNet, now at less than 5% error rate
     • Used in Google Photos for indexing

  8. Image captioning http://cs.stanford.edu/people/karpathy/deepimagesent/

  9. Why now?
     • More labeled data
     • More compute power (optimized BLAS and GPUs)
     • Improvements to algorithms

  10. source: Alec Radford on RNNs

  11. Recurrent Neural Networks

  12. source: The Unreasonable Effectiveness of RNNs

  13. Applications of RNNs
     • NLP (PoS, NER, Parsing, Sentiment Analysis)
     • Generative Probabilistic Language Models
     • Machine Translation (e.g. English to French)
     • Speech recognition / Speech synthesis (newer)
     • Biological sequence modeling (DNA, Proteins)
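
     A minimal sketch of the shared mechanism behind these applications: a vanilla recurrent cell that mixes each input with the previous hidden state. Weight names are assumptions; practical systems use LSTM/GRU variants.

        import numpy as np

        def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
            # the new hidden state summarizes the sequence seen so far
            return np.tanh(x_t @ W_xh + h_prev @ W_hh + b_h)

        rng = np.random.RandomState(0)
        W_xh, W_hh, b_h = rng.randn(8, 16), rng.randn(16, 16), np.zeros(16)
        h = np.zeros(16)
        for x_t in rng.randn(5, 8):   # a toy sequence of 5 input vectors
            h = rnn_step(x_t, h, W_xh, W_hh, b_h)
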
  14. Language modeling source: The Unreasonable Effectiveness of RNNs
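
     A minimal sketch of how the samples on the next slides are drawn from such a model: feed the last character back in and sample the next one from the softmax output. `step` is a hypothetical stand-in for a trained character-level RNN.

        import numpy as np

        def sample_text(step, h0, vocab, seed_idx, n_chars, temperature=1.0):
            # step: (char_id, hidden_state) -> (logits over vocab, new_hidden_state)
            rng = np.random.RandomState(0)
            out, idx, h = [], seed_idx, h0
            for _ in range(n_chars):
                z, h = step(idx, h)
                z = z / temperature            # lower temperature => more conservative text
                p = np.exp(z - z.max())
                p /= p.sum()
                idx = rng.choice(len(vocab), p=p)
                out.append(vocab[idx])
            return "".join(out)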

  15. Shakespeare

  16. Wikipedia markup

  17. Linux source code

  18. Attentional architectures for Machine Translation

  19. Neural MT source: From language modeling to machine translation

  20. Attentional Neural MT source: From language modeling to machine translation

  21. Attention == Alignment source: Neural MT by Jointly Learning to Align and Translate
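
     A minimal sketch of attention as soft alignment: score every encoder state against the current decoder state, softmax the scores, and take the weighted average. The dot-product scorer is a simplification; the paper computes scores with a small feed-forward network.

        import numpy as np

        def attend(decoder_state, encoder_states):
            scores = encoder_states @ decoder_state   # one score per source word
            weights = np.exp(scores - scores.max())
            weights /= weights.sum()                  # soft alignment over the source
            context = weights @ encoder_states        # weighted average of encoder states
            return context, weights

        rng = np.random.RandomState(0)
        H = rng.randn(7, 32)    # encoder states for a 7-word source sentence
        s = rng.randn(32)       # current decoder state
        context, alignment = attend(s, H)
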
  22. source: Neural MT by Jointly Learning to Align and Translate

  23. source: Show, Attend and Tell

  24. Differentiable memory for reasoning

  25. Neural Turing Machines
     • Google DeepMind, October 2014
     • Neural Network coupled to external memory (tape)
     • Analogous to a Turing Machine but differentiable
     • Can be used to learn simple programs from example input/output pairs: copy, repeat copy, associative recall, binary n-gram counts and sort

  26. NTM Architecture source: Neural Turing Machines
     • Turing Machine: controller == FSM
     • Neural Turing Machine: controller == RNN w/ LSTM
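
     A minimal sketch of what makes the memory differentiable: a read is a softmax-weighted sum over all memory rows (content-based addressing), so gradients flow back through the addressing itself. The key-strength parameter `beta` follows the NTM paper; everything else is a toy setup.

        import numpy as np

        def content_read(M, key, beta=5.0):
            # cosine similarity between the key and every memory row
            sims = M @ key / (np.linalg.norm(M, axis=1) * np.linalg.norm(key) + 1e-8)
            w = np.exp(beta * sims)
            w /= w.sum()        # soft attention over memory locations
            return w @ M        # differentiable read vector

        rng = np.random.RandomState(0)
        M = rng.randn(128, 20)  # memory: 128 slots of width 20
        r = content_read(M, rng.randn(20))
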
  27. Differentiable Stack source: Inferring algorithmic patterns w/ Stack RNN

  28. Stack RNN trained for binary addition source: Inferring algorithmic patterns w/ Stack RNN
  29. Continuous Stack source: Learning to Transduce with Unbounded Memory
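
     A minimal sketch in the spirit of these stack models: PUSH, POP and NO-OP are applied as a continuous mixture, so the stack update stays differentiable and the controller can be trained by gradient descent. The single-value cells and fixed action weights are toy assumptions.

        import numpy as np

        def stack_update(stack, new_top, a_push, a_pop, a_noop):
            # stack: 1-D array of cell values, index 0 is the top
            pushed = np.concatenate(([new_top], stack[:-1]))  # shift down, insert on top
            popped = np.concatenate((stack[1:], [0.0]))       # shift up, drop the top
            return a_push * pushed + a_pop * popped + a_noop * stack

        stack = np.zeros(8)
        # in the real models the action weights come from a softmax over the
        # controller's output; here they are fixed for illustration
        stack = stack_update(stack, new_top=1.0, a_push=0.9, a_pop=0.05, a_noop=0.05)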

  30. (image-only slide)
  31. Reasoning for QA

  32. bAbI tasks https://research.facebook.com/researchers/1543934539189348

  33. Memory Networks source: End to end memory networks

  34. source: End to end memory networks
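
     A minimal sketch of one "hop" of an end-to-end memory network: the embedded question attends over embedded sentences, and the retrieved summary is added back to the query for the next hop. The embedding matrices here are random stand-ins for learned ones.

        import numpy as np

        def memory_hop(u, M_in, M_out):
            scores = M_in @ u
            p = np.exp(scores - scores.max())
            p /= p.sum()         # attention over memory slots
            o = p @ M_out        # weighted sum of output embeddings
            return u + o         # updated query for the next hop

        rng = np.random.RandomState(0)
        u = rng.randn(20)                                    # embedded question
        M_in, M_out = rng.randn(10, 20), rng.randn(10, 20)   # 10 embedded sentences
        u2 = memory_hop(u, M_in, M_out)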

  35. source: Dynamic Memory Networks

  36. (image-only slide)
  37. Neural Reasoner

  38. source: Towards Neural Network-based Reasoning

  39. QA on real data

  40. Paraphrases from web news

  41. source: Teaching Machines to Read and Comprehend

  42. source: Teaching Machines to Read and Comprehend

  43. source: Teaching Machines to Read and Comprehend

  44. source: Teaching Machines to Read and Comprehend

  45. Conclusion
     • Deep Learning progress is fast paced
     • Many applications already in production (e.g. speech, image indexing, face recognition)
     • Machine Learning is now moving from pattern recognition to higher level reasoning
     • Generic AI is no longer a swear-word among machine learners

  46. Thank you! @ogrisel