
Empirical Analysis of LSTM Performance

Early share of this work in progress.
How does LSTM learning scale on smaller problems?

Robin Ranjit Singh Chauhan

October 16, 2019


Transcript

  1. Origins

I was at a vantech data science meetup at Harbourfront Center, Vancouver, circa 2017: “... I wonder how powerful an LSTM cell is…?”

• LSTMs = Long Short-Term Memory
  ◦ S. Hochreiter, J. Schmidhuber, “Long short-term memory”
• These are unit types in deep learning neural networks
• LSTMs are used for sequence-related problems
• Typically trained by gradient descent methods

Looking out at rainy Stanley Park, thinking about RNNs.
  2. Experiment

• The network accepts a sequence of random ints in the range 0 to feature_count: [ 1 8 9 5 3 … 8 0 ]
• The correct answer is simply to return the nth element (say n=3)
  ◦ Inputs and outputs are both one-hot encoded
  ◦ Example concept from “Long Short-Term Memory Networks With Python” by Jason Brownlee
• This work: experiments on variations of this simple task (a minimal sketch follows this list)
  ◦ Grid-search style visualizations of accuracy during training
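The deck itself shows no code, but a minimal sketch of this task in Keras might look like the following. Every setting here (feature_count, seq_length, n, the sample count, and the 32-unit LSTM) is a hypothetical value chosen for illustration, not the deck's actual configuration.

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# All values below are hypothetical -- the deck does not state its exact settings.
feature_count = 10   # ints drawn from 0..feature_count-1
seq_length = 8       # length of each input sequence
n = 3                # index of the element the network must return
samples = 5000

def generate_data(samples, seq_length, feature_count, n):
    """Random int sequences; the target is the nth element. Both one-hot encoded."""
    seqs = np.random.randint(0, feature_count, size=(samples, seq_length))
    X = np.eye(feature_count)[seqs]        # shape: (samples, seq_length, feature_count)
    y = np.eye(feature_count)[seqs[:, n]]  # shape: (samples, feature_count)
    return X, y

X, y = generate_data(samples, seq_length, feature_count, n)

model = Sequential([
    LSTM(32, input_shape=(seq_length, feature_count)),
    Dense(feature_count, activation="softmax"),
])
model.compile(loss="categorical_crossentropy", optimizer="adam",
              metrics=["accuracy"])
history = model.fit(X, y, epochs=20, batch_size=32, verbose=0)
print(history.history["accuracy"][-1])  # final training accuracy
```

Because the task is pure recall of one position, tracking accuracy over training as seq_length, n, or feature_count vary gives a clean signal of how hard the memory demand is for the cell.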
  3. Credits

Experiment, parameter sweep code, plots: Robin Ranjit Singh Chauhan
Initial task example: Jason Brownlee, “Long Short-Term Memory Networks With Python”
Rendering lib: Plotly 3dMesh
Deep learning lib: Keras / Tensorflow

Maybe next time: GRU, Bayesian Inference … ?
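The credits mention parameter sweep code rendered with Plotly 3dMesh. A sketch of such a grid sweep, plotted as a Mesh3d surface of final training accuracy, could look like the following; the swept axes (hidden units and sequence length), their ranges, and the training budget are all assumptions, not the sweep actually run.

```python
import numpy as np
import plotly.graph_objects as go
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

feature_count, n = 10, 3  # hypothetical task settings, as in the sketch above

def generate_data(samples, seq_length, feature_count, n):
    """Same toy task: target is the nth element; inputs/targets one-hot encoded."""
    seqs = np.random.randint(0, feature_count, size=(samples, seq_length))
    return np.eye(feature_count)[seqs], np.eye(feature_count)[seqs[:, n]]

# Hypothetical sweep axes -- the deck does not state the actual grid searched.
results = []
for units in [4, 8, 16, 32]:
    for seq_length in [5, 10, 20, 40]:
        X, y = generate_data(2000, seq_length, feature_count, n)
        model = Sequential([
            LSTM(units, input_shape=(seq_length, feature_count)),
            Dense(feature_count, activation="softmax"),
        ])
        model.compile(loss="categorical_crossentropy",
                      optimizer="adam", metrics=["accuracy"])
        hist = model.fit(X, y, epochs=10, batch_size=32, verbose=0)
        results.append((units, seq_length, hist.history["accuracy"][-1]))

# Render the sweep as a Plotly Mesh3d surface of final training accuracy.
us, ls, accs = zip(*results)
fig = go.Figure(data=[go.Mesh3d(x=us, y=ls, z=accs, intensity=list(accs))])
fig.update_layout(scene=dict(xaxis_title="hidden units",
                             yaxis_title="sequence length",
                             zaxis_title="final training accuracy"))
fig.show()
```

Mesh3d triangulates the scattered (units, length, accuracy) points directly, which is one way to produce the grid-search style visualization the deck describes.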