Recent Developments in Deep Learning

Paris Datageeks meetup, Sept 2015

Olivier Grisel

September 15, 2015

Transcript

1. Outline
   • Deep Learning quick recap
   • Recurrent Neural Networks
   • Attention for Machine Translation
   • Attention and differentiable memory for reasoning
2. Deep Learning
   • Neural Networks from the 90's, rebranded in 2006+
   • « Neuron » is a loose inspiration (not important)
   • Stacked layers of differentiable modules (matrix multiplication, convolution, pooling, element-wise non-linear operations…)
   • Can be trained via gradient descent on large sets of input/output example pairs
3. (layer diagram)
   x  = Input Vector
   h1 = f1(x, w1)  = max(conv(x, w1), 0)    (Hidden Activations)
   h2 = f2(h1, w2) = max(dot(h1, w2), 0)    (Hidden Activations)
   y  = f3(h2, w3) = softmax(dot(h2, w3))   (Output Vector)
   parameters: w1, w2, w3
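A minimal NumPy sketch of the forward pass on the slide above; the toy shapes, the 1-D convolution, and the random weights are assumptions chosen only for illustration (in practice w1, w2, w3 are learned by gradient descent, and the first layer would typically be a 2-D convolution over images):

```python
import numpy as np

def relu(z):
    # element-wise non-linearity: max(., 0)
    return np.maximum(z, 0)

def softmax(z):
    # numerically stable softmax
    e = np.exp(z - z.max())
    return e / e.sum()

def conv1d(x, filters):
    # tiny 1-D "valid" convolution: one output row per filter
    return np.array([np.convolve(x, f, mode="valid") for f in filters])

# toy dimensions (assumptions for illustration only)
rng = np.random.default_rng(0)
x = rng.normal(size=20)              # x  = input vector
w1 = rng.normal(size=(4, 5))         # 4 convolution filters of width 5
h1 = relu(conv1d(x, w1)).ravel()     # h1 = f1(x, w1)  = max(conv(x, w1), 0)
w2 = rng.normal(size=(h1.size, 8))
h2 = relu(np.dot(h1, w2))            # h2 = f2(h1, w2) = max(dot(h1, w2), 0)
w3 = rng.normal(size=(8, 3))
y = softmax(np.dot(h2, w3))          # y  = f3(h2, w3) = softmax(dot(h2, w3))
print(y, y.sum())                    # class probabilities summing to 1
```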
4. Recent success
   • 2009: state of the art acoustic model for speech recognition
   • 2011: state of the art road sign classification
   • 2012: state of the art object classification
   • 2013/14: end-to-end speech recognition, object detection
   • 2014/15: state of the art machine translation, getting closer for Natural Language Understanding in general
5. ImageNet Challenge ILSVRC2014
   • 1.2 million images, 1000 classes
   • Last winner: GoogLeNet, now at less than 5% error rate
   • Used in Google Photos for indexing
6. Why now?
   • More labeled data
   • More compute power (optimized BLAS and GPUs)
   • Improvements to algorithms
7. Applications of RNNs
   • NLP (PoS tagging, NER, parsing, sentiment analysis)
   • Generative probabilistic language models
   • Machine Translation (e.g. English to French)
   • Speech recognition / speech synthesis (newer)
   • Biological sequence modeling (DNA, proteins)
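All of these applications build on the same recurrence: the hidden state is updated from the previous state and the current input, one time step at a time. A minimal sketch of a vanilla RNN step, assuming tanh activations and toy dimensions (real systems use gated variants such as LSTMs):

```python
import numpy as np

def rnn_step(h_prev, x_t, W_hh, W_xh, b):
    # one recurrent update: the new state mixes the previous state and the input
    return np.tanh(np.dot(W_hh, h_prev) + np.dot(W_xh, x_t) + b)

# toy dimensions (assumptions for illustration only)
hidden, n_inputs = 16, 8
rng = np.random.default_rng(0)
W_hh = rng.normal(scale=0.1, size=(hidden, hidden))
W_xh = rng.normal(scale=0.1, size=(hidden, n_inputs))
b = np.zeros(hidden)

# scan a sequence of 5 input vectors (e.g. word embeddings) through the recurrence
h = np.zeros(hidden)
for x_t in rng.normal(size=(5, n_inputs)):
    h = rnn_step(h, x_t, W_hh, W_xh, b)
print(h.shape)  # final state summarizing the whole sequence
```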
8. Neural Turing Machines
   • Google DeepMind, October 2014
   • Neural Network coupled to external memory (tape)
   • Analogue of a Turing Machine, but differentiable
   • Can be used to learn simple programs from example input / output pairs: copy, repeat copy, associative recall, binary n-gram counting and sorting
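The key trick that keeps the whole system trainable by gradient descent is that memory access is "soft": instead of reading one slot, the controller emits a key, gets attention weights over all memory rows, and reads a weighted sum. Below is a rough illustrative fragment of such content-based addressing, not the full NTM architecture; the dimensions, the cosine similarity, and the sharpness parameter are assumptions in the spirit of the paper:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def content_read(memory, key, beta=5.0):
    # cosine similarity between the controller's key and every memory row
    sims = memory @ key / (np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + 1e-8)
    weights = softmax(beta * sims)    # sharp but differentiable addressing
    return weights @ memory, weights  # read vector = weighted sum of rows

# toy memory of 8 slots of width 4 (assumption for illustration)
rng = np.random.default_rng(0)
M = rng.normal(size=(8, 4))
key = M[3] + 0.05 * rng.normal(size=4)  # slightly noisy query for slot 3
read, w = content_read(M, key)
print(w.round(2))     # attention concentrates on slot 3
print(read.round(2))  # read vector close to the content of slot 3
```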
9. NTM Architecture (source: Neural Turing Machines)
   • Turing Machine: controller == FSM
   • Neural Turing Machine: controller == RNN w/ LSTM
10. Conclusion
   • Deep Learning progress is fast paced
   • Many applications already in production (e.g. speech, image indexing, face recognition)
   • Machine Learning is now moving from pattern recognition to higher level reasoning
   • Generic AI is no longer a swear-word among machine learners