Empirical Analysis of LSTM Performance

Robin Ranjit Singh Chauhan
https://twitter.com/robinc

Origins

I was at a vantech data science meetup, Harbourfront Center Vancouver, circa 2017, looking out at rainy Stanley Park thinking about RNNs: “... I wonder how powerful an LSTM cell is…?”

• LSTMs = Long Short-Term Memory
  ◦ S. Hochreiter, J. Schmidhuber, “Long short-term memory”
• These are unit types in deep learning neural networks
• LSTMs are used for sequence-related problems
• Typically trained by gradient descent methods

Experiment

• Network accepts a sequence of random ints, in the range 0 to feature_count
  [ 1 8 9 5 3 … 8 0 ]
• Correct answer is simply to return the nth element (say n=3)
  ◦ Inputs and outputs are both one-hot encoded
  ◦ Example concept from “Long Short-Term Memory Networks With Python” by Jason Brownlee
• This work: experiments on variations of this simple task
  ◦ Grid-search style visualizations of accuracy during training
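The task setup above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code used for the experiments: the function names (`one_hot`, `make_example`, `make_dataset`) are hypothetical, the range is assumed to exclude `feature_count` itself, and `n` is assumed to be a 0-based index.

```python
import random


def one_hot(value, feature_count):
    """Encode an int in [0, feature_count) as a one-hot list."""
    vec = [0] * feature_count
    vec[value] = 1
    return vec


def make_example(seq_len, feature_count, n, rng):
    """Build one (input, target) pair for the 'return the nth element' task.

    Assumption: values are drawn from 0 .. feature_count - 1 and n is 0-based.
    """
    seq = [rng.randrange(feature_count) for _ in range(seq_len)]
    x = [one_hot(v, feature_count) for v in seq]  # shape: seq_len x feature_count
    y = one_hot(seq[n], feature_count)            # shape: feature_count
    return x, y


def make_dataset(size, seq_len, feature_count, n, seed=0):
    """Generate `size` examples; a fixed seed keeps runs repeatable."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(size):
        x, y = make_example(seq_len, feature_count, n, rng)
        xs.append(x)
        ys.append(y)
    return xs, ys


if __name__ == "__main__":
    xs, ys = make_dataset(size=4, seq_len=6, feature_count=10, n=3)
    # Decode the first example back to ints to check the target by eye.
    print([row.index(1) for row in xs[0]], "->", ys[0].index(1))
```

Each variation of the task (sequence length, feature count, data set size) then corresponds to changing one argument of `make_dataset` while holding the others fixed.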

LSTM : Sequence Length
[Plot: 2 views on same surface]

LSTM : Cell count in layer
[Plots: Easy mode and Hard mode, 2 different surfaces]

LSTM : Feature count
[Plot: 2 views on same surface]

LSTM : Data set size
[Plot: Logdss = log(Data set size)]

GRU : Data set size
[Plot: Logdss = log(Data set size)]
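For the LSTM and GRU data set size plots, the Logdss axis is the log of the data set size. A small sketch of how such a sweep could be laid out is below. The slides do not state the base of the log or the sizes used, so base 10, the exponent range, and the helper name `log_spaced_sizes` are all assumptions made for illustration.

```python
import math


def log_spaced_sizes(min_exp, max_exp, steps_per_decade=2):
    """Return data set sizes spaced evenly in log10 from 10**min_exp to 10**max_exp.

    Assumption: base-10 spacing; the original sweep may have used another base.
    """
    count = (max_exp - min_exp) * steps_per_decade + 1
    return [round(10 ** (min_exp + i / steps_per_decade)) for i in range(count)]


sizes = log_spaced_sizes(1, 4)            # hypothetical sweep from 10 to 10,000 examples
logdss = [math.log10(s) for s in sizes]   # value plotted on the Logdss axis

for size, value in zip(sizes, logdss):
    print(f"data set size = {size:>6}  logdss = {value:.2f}")
```

Spacing the sizes evenly in log space gives each order of magnitude the same number of points on the plotted surface.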

Credits

• Author contribution (Experiment, Parameter sweep code, Plots): Robin Ranjit Singh Chauhan
• Initial task example: Jason Brownlee, “Long Short-Term Memory Networks With Python”
• Rendering lib: Plotly 3dMesh
• Deep learning lib: Keras, Tensorflow
• GRU

Maybe next time : Bayesian Inference … ?
