
Let the AI Do the Talk: Adventures with Natural Language Generation

Presentation on Natural Language Generation given at the 1st PyData Cambridge meet-up:
https://www.meetup.com/PyData-Cambridge-Meetup/events/255299686/

Abstract:
Recent advances in Artificial Intelligence have shown how computers can compete with humans in a variety of mundane tasks, but what happens when creativity is required?

This talk introduces the concept of Natural Language Generation, the task of automatically generating text, for example articles on a particular topic, poems that follow a particular style, or speech transcripts that express some attitude. Specifically, we'll discuss the case for Recurrent Neural Networks, a family of algorithms that can be trained on sequential data, and how they improve on traditional language models.

The talk is for beginners: we'll focus more on the intuitions behind the algorithms and their practical implications, and less on the mathematical details. Practical examples in Python will showcase Keras, a library for quickly prototyping deep learning architectures.

Marco Bonzanini

October 31, 2018

Transcript

  1. Let the AI do the Talk
    Adventures with Natural Language Generation
    @MarcoBonzanini
    PyData Cambridge — 1st meet-up



  3. A non-profit that supports and promotes
    world-class, innovative, open-source
    scientific computing


  4. https://numfocus.org/sponsored-projects



  6. PyData London Conference
    12-14 July 2019
    @PyDataLondon


  7. NATURAL LANGUAGE
    GENERATION


  8. Natural Language Processing

  9. Natural Language Processing
    Natural Language Understanding
    Natural Language Generation


  10. Natural Language Generation

  11. Natural Language Generation
    The task of generating Natural Language
    from a machine representation


  12. Applications of NLG

  13. Applications of NLG
    Summary Generation

  14. Applications of NLG
    Weather Report Generation

  15. Applications of NLG
    Automatic Journalism

  16. Applications of NLG
    Virtual Assistants / Chatbots

  17. LANGUAGE MODELLING


  18. Language Model

  19. Language Model
    A model that gives you
    the probability of
    a sequence of words

  20. Language Model
    P(I’m going home)  >  P(Home I’m going)


  21. Language Model
    P(I’m going home)  >  P(I’m going house)


  22. Infinite Monkey Theorem
    https://en.wikipedia.org/wiki/Infinite_monkey_theorem

  23. Infinite Monkey Theorem
    from random import choice
    from string import printable

    def monkey_hits_keyboard(n):
        output = [choice(printable) for _ in range(n)]
        print("The monkey typed:")
        print(''.join(output))


  24. Infinite Monkey Theorem
    >>> monkey_hits_keyboard(30)
    The monkey typed:
    %
    a9AK^YKx OkVG)u3.cQ,31("!ac%
    >>> monkey_hits_keyboard(30)
    The monkey typed:
    fWE,ou)cxmV2IZ l}jSV'XxQ**9'|

  25. n-grams

  26. n-grams
    Sequence of N items
    from a given sample of text

  27. n-grams
    >>> from nltk import ngrams
    >>> list(ngrams("pizza", 3))

  28. n-grams
    >>> from nltk import ngrams
    >>> list(ngrams("pizza", 3))
    [('p', 'i', 'z'), ('i', 'z', 'z'),
    ('z', 'z', 'a')]

  29. n-grams
    >>> from nltk import ngrams
    >>> list(ngrams("pizza", 3))
    [('p', 'i', 'z'), ('i', 'z', 'z'),
    ('z', 'z', 'a')]
    character-based trigrams

  30. n-grams
    >>> s = "The quick brown fox".split()
    >>> list(ngrams(s, 2))

  31. n-grams
    >>> s = "The quick brown fox".split()
    >>> list(ngrams(s, 2))
    [('The', 'quick'), ('quick', 'brown'),
    ('brown', 'fox')]

  32. n-grams
    >>> s = "The quick brown fox".split()
    >>> list(ngrams(s, 2))
    [('The', 'quick'), ('quick', 'brown'),
    ('brown', 'fox')]
    word-based bigrams

  33. From n-grams to Language Model

  34. From n-grams to Language Model
    • Given a large dataset of text
    • Find all the n-grams
    • Compute probabilities, e.g. count bigrams:
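
    A minimal sketch of this counting step, assuming a toy corpus and collections.Counter
    (the corpus and variable names are illustrative, not from the talk):

    from collections import Counter
    from nltk import ngrams

    corpus = "the quick brown fox jumps over the lazy dog".split()
    unigram_counts = Counter(corpus)
    bigram_counts = Counter(ngrams(corpus, 2))

    # P(next_word | word) = count(word, next_word) / count(word)
    def bigram_probability(word, next_word):
        if unigram_counts[word] == 0:
            return 0.0
        return bigram_counts[(word, next_word)] / unigram_counts[word]

    print(bigram_probability("the", "quick"))  # 0.5: "the" is followed by "quick" once out of twice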

  35. Example: Predictive Text in Mobile

  36. Example: Predictive Text in Mobile

  37. Example: Predictive Text in Mobile
    most likely next word


  38. Example: Predictive Text in Mobile
    Marco is …


  39. Example: Predictive Text in Mobile
    Marco is a good time to
    get the latest flash player
    is required for video
    playback is unavailable
    right now because this
    video is not sure if you
    have a great day.

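    The drifting sentence on this slide can be mimicked with a greedy loop over bigram counts;
    this is a hypothetical sketch (toy corpus, made-up helper name), not the actual predictive-text algorithm:

    from collections import Counter
    from nltk import ngrams

    text = "I am going home . I am going out . I am not sure".split()
    bigram_counts = Counter(ngrams(text, 2))

    def generate_greedy(seed, length=8):
        # repeatedly append the most frequent word that follows the current one
        words = [seed]
        for _ in range(length):
            candidates = {nxt: c for (w, nxt), c in bigram_counts.items() if w == words[-1]}
            if not candidates:
                break
            words.append(max(candidates, key=candidates.get))
        return ' '.join(words)

    print(generate_greedy("I"))
    # each step only looks at the previous word, so the output soon loses the plot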

  40. Limitations of LM so far

  41. Limitations of LM so far
    • P(word | full history) is too expensive
    • P(word | previous few words) is feasible
    • … Local context only! Lack of global context
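
    To see why conditioning on the full history is too expensive, a back-of-the-envelope count
    of the possible histories (the vocabulary size is an illustrative assumption):

    vocabulary_size = 50_000
    history_length = 10

    # number of distinct 10-word histories we would need statistics for
    print(vocabulary_size ** history_length)   # ≈ 9.8e46 — far too many to ever observe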

  42. QUICK INTRO TO
    NEURAL NETWORKS


  43. Neural Networks

  44. Neural Networks
    [diagram: a small feed-forward network with inputs x1, x2, a hidden layer and output y1]


  45. Neural Networks
    [diagram: input layer (x1, x2), hidden layer(s) (h1, h2, h3), output layer (y1)]


  46. Neurone Example

  47. Neurone Example
    [diagram: inputs x1, x2 with weights w1, w2 feeding a single neurone, output marked “?”]


  48. Neurone Example
    [diagram: inputs x1, x2 with weights w1, w2 feeding a single neurone]
    F(w1x1 + w2x2)

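    The formula on this slide, F(w1x1 + w2x2), in a few lines of Python; the sigmoid
    activation and the input values are illustrative assumptions:

    import math

    def sigmoid(z):
        return 1.0 / (1.0 + math.exp(-z))

    def neurone(x1, x2, w1, w2):
        # weighted sum of the inputs, passed through the activation function F
        return sigmoid(w1 * x1 + w2 * x2)

    print(neurone(x1=0.5, x2=1.0, w1=0.8, w2=-0.3))  # a single activation value in (0, 1)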

  49. Training the Network

  50. Training the Network
    • Random weight init
    • Run input through the network
    • Compute error (loss function)
    • Use error to adjust weights
      (gradient descent + back-propagation)

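    The four bullets map onto a training loop. A toy example under heavy assumptions
    (a single linear neurone, squared error, made-up data and learning rate), just to show the shape of the loop:

    import random

    data = [(x, 2 * x) for x in [0.0, 1.0, 2.0, 3.0]]   # toy dataset: y = 2x

    w = random.uniform(-1, 1)     # random weight init
    lr = 0.05                     # learning rate

    for epoch in range(200):
        for x, target in data:
            y = w * x                     # run input through the "network"
            error = y - target            # compute error (squared loss = error**2)
            w -= lr * 2 * error * x       # adjust the weight: d(loss)/dw = 2 * error * x

    print(w)   # ends up close to 2.0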

  51. More on Training

  52. More on Training
    • Batch size
    • Iterations and Epochs
    • e.g. 1,000 data points, if batch size = 100
    we need 10 iterations to complete 1 epoch
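
    The arithmetic in the example above, spelled out:

    data_points = 1000
    batch_size = 100

    iterations_per_epoch = data_points // batch_size
    print(iterations_per_epoch)   # 10 iterations to complete 1 epoch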

  53. RECURRENT NEURAL NETWORKS


  54. Limitation of FFNN

  55. Limitation of FFNN
    Input and output
    of fixed size

  56. Recurrent Neural Networks

  57. Recurrent Neural Networks
    http://colah.github.io/posts/2015-08-Understanding-LSTMs/

  58. Recurrent Neural Networks
    http://colah.github.io/posts/2015-08-Understanding-LSTMs/

  59. Limitation of RNN

  60. Limitation of RNN
    “Vanishing gradient”
    Cannot “remember”
    what happened long ago
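
    A rough intuition for the vanishing gradient: back-propagating through many time steps
    multiplies many small factors together (the 0.5 per step is an illustrative number, not a property of any real network):

    gradient_per_step = 0.5
    steps = 50

    print(gradient_per_step ** steps)   # ~8.9e-16: the error signal from 50 steps ago is effectively gone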

  61. Long Short Term Memory

  62. Long Short Term Memory
    http://colah.github.io/posts/2015-08-Understanding-LSTMs/

  63. https://en.wikipedia.org/wiki/Long_short-term_memory


  64. A BIT OF
    PRACTICE


  65. Deep Learning in Python

  66. Deep Learning in Python
    • Some NN support in scikit-learn
    • Many low-level frameworks: Theano,
    PyTorch, TensorFlow
    • … Keras!
    • Probably more

  67. Keras

  68. Keras
    • Simple, high-level API
    • Uses TensorFlow, Theano or CNTK as backend
    • Runs seamlessly on GPU
    • Easier to start with

  69. LSTM Example

  70. LSTM Example
    Define the network

    from keras.models import Sequential
    from keras.layers import Dense, LSTM

    model = Sequential()
    model.add(
        LSTM(
            128,
            input_shape=(maxlen, len(chars))
        )
    )
    model.add(Dense(len(chars), activation='softmax'))

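    The deck doesn't show how x, y, maxlen and chars are built. A common preparation for a
    character-level model, along the lines of the Keras text-generation example (the corpus path
    and the step size below are assumptions):

    import numpy as np

    text = open('corpus.txt').read().lower()   # hypothetical path to the training text

    maxlen = 40    # length of each input sequence, in characters
    step = 3       # slide the window 3 characters at a time

    chars = sorted(set(text))
    char_indices = {c: i for i, c in enumerate(chars)}
    indices_char = {i: c for i, c in enumerate(chars)}

    sentences, next_chars = [], []
    for i in range(0, len(text) - maxlen, step):
        sentences.append(text[i: i + maxlen])
        next_chars.append(text[i + maxlen])

    # one-hot encoding: x is (samples, maxlen, len(chars)), y is (samples, len(chars))
    x = np.zeros((len(sentences), maxlen, len(chars)), dtype=bool)
    y = np.zeros((len(sentences), len(chars)), dtype=bool)
    for i, sentence in enumerate(sentences):
        for t, char in enumerate(sentence):
            x[i, t, char_indices[char]] = 1
        y[i, char_indices[next_chars[i]]] = 1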

  71. LSTM Example
    Configure the network

    from keras.optimizers import RMSprop

    optimizer = RMSprop(lr=0.01)
    model.compile(
        loss='categorical_crossentropy',
        optimizer=optimizer
    )


  72. LSTM Example
    Train the network

    model.fit(x, y,
              batch_size=128,
              epochs=60,
              callbacks=[print_callback])
    model.save('char_model.h5')


  73. LSTM Example
    Generate text

    for i in range(output_size):
        ...
        preds = model.predict(x_pred, verbose=0)[0]
        next_index = sample(preds, diversity)
        next_char = indices_char[next_index]
        generated += next_char


  74. LSTM Example
    Seed text

    for i in range(output_size):
        ...
        preds = model.predict(x_pred, verbose=0)[0]
        next_index = sample(preds, diversity)
        next_char = indices_char[next_index]
        generated += next_char

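    sample() is not defined in the deck; the version below is adapted from the Keras character-level
    text-generation example, where "diversity" acts as a temperature that reweights the predicted
    distribution before drawing the next character:

    import numpy as np

    def sample(preds, diversity=1.0):
        # reweight the predicted probabilities by the temperature, then draw one index
        preds = np.asarray(preds).astype('float64')
        preds = np.log(preds + 1e-8) / diversity   # low diversity -> safer, more repetitive text
        exp_preds = np.exp(preds)
        preds = exp_preds / np.sum(exp_preds)
        return np.argmax(np.random.multinomial(1, preds, 1))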

  75. Sample Output

  76. Sample Output
    are the glories it included.
    Now am I lrA to r ,d?ot praki ynhh
    kpHu ndst -h ahh
    umk,hrfheleuloluprffuamdaedospe
    aeooasak sh frxpaphrNumlpAryoaho (…)
    Seed text After 1 epoch

  77. Sample Output
    I go from thee:
    Bear me forthwitht wh, t
    che f uf ld,hhorfAs c c ff.h
    scfylhle, rigrya p s lee
    rmoy, tofhryg dd?ofr hl t y
    ftrhoodfe- r Py (…)
    After ~5 epochs

  78. Sample Output
    a wild-goose flies,
    Unclaim'd of any manwecddeelc uavekeMw
    gh whacelcwiiaeh xcacwiDac w
    fioarw ewoc h feicucra
    h,h, :ewh utiqitilweWy ha.h pc'hr,
    lagfh
    eIwislw ofiridete w
    laecheefb .ics,aicpaweteh fiw?egp t? (…)
    After 20+ epochs

  79. Tuning


  80. Tuning
    • More layers?
    • More hidden nodes? or fewer?
    • More data?
    • A combination?

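    One way to try "more layers / more hidden nodes" on the Keras model from earlier (maxlen and
    chars as defined before); the layer sizes here are arbitrary examples, not recommendations from the talk:

    from keras.models import Sequential
    from keras.layers import Dense, LSTM

    model = Sequential()
    model.add(LSTM(256, return_sequences=True,         # return_sequences feeds the full
                   input_shape=(maxlen, len(chars))))  # sequence to the next LSTM layer
    model.add(LSTM(256))
    model.add(Dense(len(chars), activation='softmax'))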

  81. Wyr feirm hat. meancucd kreukk?
    , foremee shiciarplle. My,
    Bnyivlaunef sough bus:
    Wad vomietlhas nteos thun. lore
    orain, Ty thee I Boe,
    I rue. niat
    Tuning
    After 1 epoch

  82. to Dover, where inshipp'd
    Commit them to plean me than stand and the
    woul came the wife marn to the groat pery me
    Which that the senvose in the sen in the poor
    The death is and the calperits the should
    Tuning
    Much later

  83. FINAL REMARKS


  84. A Couple of Tips


  85. A Couple of Tips
    • You’ll need a GPU
    • Develop locally on very small dataset,
      then run on cloud on real data
    • At least 1M characters in input,
      at least 20 epochs for training
    • model.save() !!!

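    The model.save() tip pairs with reloading the model later, so experiments don't start from
    scratch (the filename is just an example):

    from keras.models import load_model

    model.save('char_model.h5')            # persists architecture, weights and optimizer state
    model = load_model('char_model.h5')    # restore later and keep training or generating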

  86. Summary
    • Natural Language Generation is fun
    • Simple models vs. Neural Networks
    • Keras makes your life easier
    • A lot of trial-and-error!


  87. THANK YOU
    @MarcoBonzanini
    speakerdeck.com/marcobonzanini
    GitHub.com/bonzanini
    marcobonzanini.com


  88. Readings & Credits
    • Brandon Rohrer on "Recurrent Neural Networks (RNN) and Long Short-Term Memory (LSTM)":
      https://www.youtube.com/watch?v=WCUNPb-5EYI
    • Chris Olah on "Understanding LSTM Networks":
      http://colah.github.io/posts/2015-08-Understanding-LSTMs/
    • Andrej Karpathy on "The Unreasonable Effectiveness of Recurrent Neural Networks":
      http://karpathy.github.io/2015/05/21/rnn-effectiveness/
    Pics:
    • Weather forecast icon: https://commons.wikimedia.org/wiki/File:Newspaper_weather_forecast_-_today_and_tomorrow.svg
    • Stack of papers icon: https://commons.wikimedia.org/wiki/File:Stack_of_papers_tied.svg
    • Document icon: https://commons.wikimedia.org/wiki/File:Document_icon_(the_Noun_Project_27904).svg
    • News icon: https://commons.wikimedia.org/wiki/File:PICOL_icon_News.svg
    • Cortana icon: https://upload.wikimedia.org/wikipedia/commons/thumb/8/89/Microsoft_Cortana_light.svg/1024px-Microsoft_Cortana_light.svg.png
    • Siri icon: https://commons.wikimedia.org/wiki/File:Siri_icon.svg
    • Google Assistant icon: https://commons.wikimedia.org/wiki/File:Google_mic.svg
