Empirical Analysis of LSTM Performance

Early share of this work in progress.
How does LSTM learning scale on smaller problems?

Robin Ranjit Singh Chauhan

October 16, 2019

Transcript

  1. Empirical Analysis of
    LSTM Performance
    Robin Ranjit Singh Chauhan
    https://twitter.com/robinc

  2. Origins
    I was at a vantech data science meetup, Harbourfront
    Center Vancouver circa 2017:
    “... I wonder how powerful
    an LSTM cell is…?”
    ● LSTMs = Long Short-Term Memory
    ○ S. Hochreiter, J. Schmidhuber, “Long short-term
    memory”
    ● These are unit types in deep learning neural networks
    ● LSTMs are used for sequence-related problems
    ● Typically trained by gradient descent methods
    Looking out at rainy Stanley Park
    thinking about RNNs.

  3. Experiment
    ● Network accepts a sequence of random ints in the range 0 to feature_count
    [ 1 8 9 5 3 … 8 0 ]
    ● The correct answer is simply to return the nth element (say n=3)
    ○ Inputs and outputs are both one-hot encoded
    ○ Example concept from “Long Short-Term Memory Networks With Python” by Jason Brownlee
    ● This work: experiments on variations of this simple task
    ○ Grid-search style visualizations of accuracy during training
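
The task on this slide can be sketched in a few lines of plain Python. This is a minimal illustration, not the code used for the experiments: the helper names (`one_hot`, `make_example`, `make_dataset`) are hypothetical, ints are assumed to be drawn from 0 to feature_count - 1, and n is assumed to be a 0-based position, since the slide does not pin down either detail.

```python
import random


def one_hot(index, size):
    """Return a list of `size` zeros with a single 1 at position `index`."""
    vec = [0] * size
    vec[index] = 1
    return vec


def make_example(seq_length, feature_count, n, rng):
    """Build one example of the 'return the nth element' task.

    Assumptions (not stated on the slide): values are drawn uniformly
    from 0 .. feature_count - 1, and n is a 0-based sequence position.
    """
    seq = [rng.randrange(feature_count) for _ in range(seq_length)]
    # Input: one one-hot row per time step (seq_length x feature_count).
    inputs = [one_hot(value, feature_count) for value in seq]
    # Target: the one-hot encoding of the nth element.
    target = one_hot(seq[n], feature_count)
    return seq, inputs, target


def make_dataset(size, seq_length, feature_count, n, seed=0):
    """Generate `size` independent examples with a reproducible seed."""
    rng = random.Random(seed)
    return [make_example(seq_length, feature_count, n, rng) for _ in range(size)]
```

For instance, `make_dataset(size=1000, seq_length=8, feature_count=10, n=3)` would yield 1000 examples in which each target equals the fourth input row. The nested lists would still need converting to arrays before being passed to a deep learning library.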

  4. LSTM : Sequence Length
    2 views on same surface

  5. LSTM : Cell count in layer
    Easy mode
    Hard mode
    2 different surfaces

  6. LSTM : Feature count
    2 views on same surface

  7. LSTM : Data set size
    Logdss = log ( Data set size )
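
As a small worked example of the Logdss axis, the sketch below converts between data set size and its logarithm and builds a set of sizes evenly spaced on that axis. The function names are hypothetical, and base 10 is an assumption: the slide only says "log".

```python
import math


def logdss(data_set_size, base=10):
    """Log of the data set size (base 10 is an assumption, not from the slide)."""
    return math.log(data_set_size, base)


def sizes_from_logdss(start, stop, steps, base=10):
    """Data set sizes evenly spaced on the log axis, rounded to whole examples.

    `start` and `stop` are Logdss values; `steps` must be at least 2.
    """
    step = (stop - start) / (steps - 1)
    return [round(base ** (start + i * step)) for i in range(steps)]
```

Spacing the sweep evenly in Logdss rather than in raw size gives each order of magnitude the same number of grid points.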

  8. GRU : Data set size
    Logdss = log ( Data set size )

  9. Credits
    Author contribution:
    Experiment, parameter sweep code, plots: Robin Ranjit Singh Chauhan
    Initial task example: Jason Brownlee, “Long Short-Term Memory Networks With Python”
    Rendering lib: Plotly Mesh3d
    Deep learning lib: Keras, TensorFlow
    GRU
    Maybe next time: Bayesian Inference … ?
