Recent Developments in Deep Learning

Olivier Grisel
Paris Datageeks meetup, September 15, 2015

Transcript

  1. Recent developments
    in Deep Learning
    Olivier Grisel - Paris Datageeks - Sept. 2015

  2. Outline
    • Deep Learning quick recap
    • Recurrent Neural Networks
    • Attention for Machine Translation
    • Attention and differentiable memory for reasoning

  3. Deep Learning
• Neural Networks from the 90’s, rebranded in 2006+
    • “Neuron” is only a loose inspiration (the biological
    analogy is not important)
    • Stacked layers of differentiable modules (matrix
    multiplication, convolution, pooling, element-wise
    non-linear operations…)
    • Can be trained via gradient descent on large datasets
    of input/output example pairs

  4. Deep Learning in the 90’s
    sources: LeNet5 & Stanford Deep Learning Tutorial

5. x  = input vector
    h1 = f1(x, w1)  = max(conv(x, w1), 0)    (hidden activations)
    h2 = f2(h1, w2) = max(dot(h1, w2), 0)    (hidden activations)
    y  = f3(h2, w3) = softmax(dot(h2, w3))   (output vector)
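
    A minimal NumPy sketch of this forward pass (not from the slides): shapes and
    layer sizes are made up, and the convolution of f1 is replaced by a plain dot
    product for brevity.

        import numpy as np

        def relu(z):
            return np.maximum(z, 0)

        def softmax(z):
            e = np.exp(z - z.max())   # subtract the max for numerical stability
            return e / e.sum()

        rng = np.random.RandomState(0)
        x = rng.randn(64)               # input vector (e.g. a flattened image)
        w1 = rng.randn(64, 32) * 0.1    # first layer weights (conv simplified to dot)
        w2 = rng.randn(32, 16) * 0.1
        w3 = rng.randn(16, 10) * 0.1

        h1 = relu(np.dot(x, w1))        # f1: hidden activations
        h2 = relu(np.dot(h1, w2))       # f2: hidden activations
        y = softmax(np.dot(h2, w3))     # f3: output class probabilities

        # Training adjusts w1, w2, w3 by gradient descent on a loss (e.g.
        # cross-entropy) computed over many (input, output) example pairs.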

  6. Recent success
    • 2009: state of the art acoustic model for speech
    recognition
    • 2011: state of the art road sign classification
    • 2012: state of the art object classification
    • 2013/14: end-to-end speech recognition, object
    detection
• 2014/15: state of the art machine translation, getting
    closer to general Natural Language Understanding

  7. ImageNet Challenge
    ILSVRC2014
    • 1.2 million images
    • 1000 classes
• Last winner: GoogLeNet, now
    at less than 5% top-5 error rate
    • Used in Google Photos for
    indexing

  8. Image captioning
    http://cs.stanford.edu/people/karpathy/deepimagesent/

  9. Why now?
    • More labeled data
    • More compute power (optimized BLAS and GPUs)
    • Improvements to algorithms

  10. source: Alec Radford on RNNs

  11. Recurrent
    Neural Networks

  12. source: The Unreasonable Effectiveness of RNNs

  13. Applications of RNNs
    • NLP (PoS, NER, Parsing, Sentiment Analysis)
    • Generative Probabilistic Language Models
    • Machine Translation (e.g. English to French)
    • Speech recognition / Speech synthesis (newer)
    • Biological sequence modeling (DNA, Proteins)
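
    A minimal sketch (illustrative names and sizes, not from the slides) of the
    vanilla RNN recurrence behind these applications: the hidden state is updated
    from the previous state and the current input, and a softmax over the
    vocabulary predicts the next token, as in a character-level language model.

        import numpy as np

        def softmax(z):
            e = np.exp(z - z.max())
            return e / e.sum()

        rng = np.random.RandomState(0)
        vocab_size, hidden_size = 50, 32
        W_xh = rng.randn(hidden_size, vocab_size) * 0.01   # input -> hidden
        W_hh = rng.randn(hidden_size, hidden_size) * 0.01  # hidden -> hidden (recurrence)
        W_hy = rng.randn(vocab_size, hidden_size) * 0.01   # hidden -> output

        h = np.zeros(hidden_size)
        sequence = [3, 17, 8, 42]        # token (e.g. character) indices
        for token in sequence:
            x = np.zeros(vocab_size)
            x[token] = 1.0               # one-hot encoding of the current token
            h = np.tanh(np.dot(W_xh, x) + np.dot(W_hh, h))  # recurrent state update
            p_next = softmax(np.dot(W_hy, h))  # distribution over the next token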

  14. Language modeling
    source: The Unreasonable Effectiveness of RNNs

  15. Shakespeare

  16. Wikipedia markup

  17. Linux source code

  18. Attentional architectures
    for Machine Translation

  19. Neural MT
    source: From language modeling to machine translation

  20. Attentional Neural MT
    source: From language modeling to machine translation

  21. Attention == Alignment
    source: Neural MT by Jointly Learning to Align and Translate
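
    A rough sketch of the additive (Bahdanau-style) attention behind these
    alignments, assuming one encoder state per source word and the current
    decoder state; all names and dimensions are illustrative.

        import numpy as np

        def softmax(z):
            e = np.exp(z - z.max())
            return e / e.sum()

        rng = np.random.RandomState(0)
        src_len, enc_dim, dec_dim, att_dim = 7, 16, 16, 12
        H = rng.randn(src_len, enc_dim)       # encoder states, one per source word
        s = rng.randn(dec_dim)                # current decoder state
        W_a = rng.randn(att_dim, dec_dim) * 0.1
        U_a = rng.randn(att_dim, enc_dim) * 0.1
        v_a = rng.randn(att_dim) * 0.1

        # One score per source position, comparing the decoder state to each
        # encoder state; the softmax turns the scores into alignment weights.
        scores = np.array([np.dot(v_a, np.tanh(np.dot(W_a, s) + np.dot(U_a, h_j)))
                           for h_j in H])
        alpha = softmax(scores)               # soft alignment (sums to 1)
        context = np.dot(alpha, H)            # weighted sum of encoder states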

  22. source: Neural MT by Jointly Learning to Align and Translate

  23. source: Show, Attend and Tell

  24. Differentiable memory
    for reasoning

  25. Neural Turing Machines
    • Google DeepMind, October 2014
• Neural Network coupled to an external memory (tape);
    a sketch of the content-based read is shown below
    • Analogous to a Turing Machine, but differentiable
    • Can learn simple programs from example
    input/output pairs:
    • copy, repeat copy, associative recall,
    • binary n-gram counts and sorting
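
    A simplified sketch of a content-based read from the external memory, in the
    spirit of the NTM read head: the controller emits a key, similarities to each
    memory row are sharpened by a key strength and normalized with a softmax, and
    the read is a weighted sum over all rows. Sizes and names are illustrative.

        import numpy as np

        def softmax(z):
            e = np.exp(z - z.max())
            return e / e.sum()

        rng = np.random.RandomState(0)
        n_slots, slot_dim = 8, 16
        M = rng.randn(n_slots, slot_dim)   # external memory, one row per slot
        key = rng.randn(slot_dim)          # read key emitted by the controller
        beta = 2.0                         # key strength (sharpens the focus)

        # Content-based addressing: cosine similarity between the key and each slot.
        sims = np.dot(M, key) / (np.linalg.norm(M, axis=1) * np.linalg.norm(key) + 1e-8)
        w = softmax(beta * sims)           # differentiable read weights
        read_vector = np.dot(w, M)         # blended read over all memory slots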

  26. NTM Architecture
    source: Neural Turing Machines
• Turing Machine:
    controller == FSM (finite state machine)
    • Neural Turing Machine:
    controller == RNN w/ LSTM

  27. Differentiable Stack
    source: Inferring algorithmic patterns w Stack RNN
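
    A very rough sketch of one differentiable stack update in the spirit of the
    Stack RNN: the controller outputs soft probabilities for PUSH / POP / NO-OP
    and the new stack is a convex combination of the three discrete outcomes.
    The exact update rule here is a simplification for illustration.

        import numpy as np

        def stack_update(stack, actions, new_value):
            """Blend the three possible stack updates with soft action weights.

            stack:     (depth,) current stack contents, top at index 0
            actions:   (3,) probabilities for (PUSH, POP, NO-OP), summing to 1
            new_value: scalar value the controller would push
            """
            p_push, p_pop, p_noop = actions
            pushed = np.concatenate(([new_value], stack[:-1]))  # shift down, new top
            popped = np.concatenate((stack[1:], [0.0]))         # shift up, drop the top
            return p_push * pushed + p_pop * popped + p_noop * stack

        stack = np.zeros(5)
        stack = stack_update(stack, np.array([0.9, 0.05, 0.05]), new_value=1.0)
        stack = stack_update(stack, np.array([0.1, 0.8, 0.1]), new_value=0.5)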

  28. Stack RNN trained for
    binary addition
    source: Inferring algorithmic patterns w Stack RNN

  29. Continuous Stack
    source: Learning to Transduce with Unbounded Memory

  30. (figure-only slide, no text)

  31. Reasoning for QA

  32. bAbI tasks
https://research.facebook.com/researchers/1543934539189348

  33. Memory Networks
    source: End to end memory networks
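
    A condensed sketch of one memory "hop" of an end-to-end memory network: the
    embedded question attends over embedded supporting facts, and the retrieved
    summary is added back to the query before predicting an answer. Embeddings
    and sizes are illustrative placeholders.

        import numpy as np

        def softmax(z):
            e = np.exp(z - z.max())
            return e / e.sum()

        rng = np.random.RandomState(0)
        n_facts, emb_dim, vocab = 6, 20, 100
        m = rng.randn(n_facts, emb_dim)      # memories: embedded supporting facts
        c = rng.randn(n_facts, emb_dim)      # output embeddings of the same facts
        u = rng.randn(emb_dim)               # embedded question
        W = rng.randn(vocab, emb_dim) * 0.1  # final answer projection

        p = softmax(np.dot(m, u))            # attention of the question over memories
        o = np.dot(p, c)                     # retrieved memory summary
        answer = softmax(np.dot(W, u + o))   # distribution over candidate answers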

  34. source: End to end memory networks

  35. source: Dynamic Memory Networks

  36. (figure-only slide, no text)

  37. Neural Reasoner

  38. source: Towards Neural Network-based Reasoning

  39. QA on real data

  40. Paraphrases from web news

  41. source: Teaching Machines to Read and Comprehend

  42. source: Teaching Machines to Read and Comprehend

  43. source: Teaching Machines to Read and Comprehend

  44. source: Teaching Machines to Read and Comprehend

  45. Conclusion
    • Deep Learning progress is fast paced
    • Many applications already in production (e.g.
    speech, image indexing, face recognition)
• Machine Learning is now moving from pattern
    recognition to higher-level reasoning
    • Generic AI is no longer a swear word among
    machine learning practitioners

  46. Thank you!
    @ogrisel
