
An Introduction to Natural Language Generation in Python

Marco Bonzanini
September 27, 2018

Presented at the London Python meet-up, September 2018:
https://www.meetup.com/LondonPython/events/254408773/

Title:
Let the AI Do the Talk: Adventures with Natural Language Generation

Abstract:
Recent advances in Artificial Intelligence have shown how computers can compete with humans in a variety of mundane tasks, but what happens when creativity is required?

This talk introduces the concept of Natural Language Generation, the task of automatically generating text: for example, articles on a particular topic, poems that follow a particular style, or speech transcripts that express some attitude. Specifically, we'll discuss the case for Recurrent Neural Networks, a family of algorithms that can be trained on sequential data, and how they improve on traditional language models.

The talk is for beginners: we'll focus more on the intuitions behind the algorithms and their practical implications, and less on the mathematical details. Practical examples with Python will showcase Keras, a library to quickly prototype deep learning architectures.

Transcript

  1. Let the AI Do the Talk: Adventures with Natural Language Generation
    @MarcoBonzanini, London Python meet-up // September 2018
  2. (no text on this slide)
  3. • Sept 2016: Intro to NLP
    • Sept 2017: Intro to Word Embeddings
    • Sept 2018: Intro to NLG
    • Sept 2019: ???
  4. NATURAL LANGUAGE GENERATION
  5. Natural Language Processing
  6. Natural Language Processing
    • Natural Language Understanding
    • Natural Language Generation
  7. Natural Language Generation
  8. Natural Language Generation
    The task of generating Natural Language from a machine representation
  9. Applications of NLG
  10. Applications of NLG: Summary Generation
  11. Applications of NLG: Weather Report Generation
  12. Applications of NLG: Automatic Journalism
  13. Applications of NLG: Virtual Assistants / Chatbots

  14. LANGUAGE MODELLING
  15. Language Model
  16. Language Model
    A model that gives you the probability of a sequence of words
  17. Language Model
    P(I’m going home) > P(Home I’m going)
  18. Language Model
    P(I’m going home) > P(I’m going house)

  19. Infinite Monkey Theorem
    https://en.wikipedia.org/wiki/Infinite_monkey_theorem
  20. Infinite Monkey Theorem
        from random import choice
        from string import printable

        def monkey_hits_keyboard(n):
            output = [choice(printable) for _ in range(n)]
            print("The monkey typed:")
            print(''.join(output))
  21. Infinite Monkey Theorem
        >>> monkey_hits_keyboard(30)
        The monkey typed:
        % a9AK^YKx OkVG)u3.cQ,31("!ac%
        >>> monkey_hits_keyboard(30)
        The monkey typed:
        fWE,ou)cxmV2IZ l}jSV'XxQ**9'|
  22. n-grams
  23. n-grams
    Sequence of N items from a given sample of text
  24. n-grams
        >>> from nltk import ngrams
        >>> list(ngrams("pizza", 3))
  25. n-grams
        >>> from nltk import ngrams
        >>> list(ngrams("pizza", 3))
        [('p', 'i', 'z'), ('i', 'z', 'z'), ('z', 'z', 'a')]
  26. n-grams (character-based trigrams)
        >>> from nltk import ngrams
        >>> list(ngrams("pizza", 3))
        [('p', 'i', 'z'), ('i', 'z', 'z'), ('z', 'z', 'a')]
  27. n-grams
        >>> s = "The quick brown fox".split()
        >>> list(ngrams(s, 2))
  28. n-grams
        >>> s = "The quick brown fox".split()
        >>> list(ngrams(s, 2))
        [('The', 'quick'), ('quick', 'brown'), ('brown', 'fox')]
  29. n-grams (word-based bigrams)
        >>> s = "The quick brown fox".split()
        >>> list(ngrams(s, 2))
        [('The', 'quick'), ('quick', 'brown'), ('brown', 'fox')]
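For readers without NLTK installed, the sliding-window idea behind these examples can be sketched in plain Python. The helper name `simple_ngrams` is hypothetical and not part of NLTK; this illustrates the technique only and is not NLTK's implementation.

```python
def simple_ngrams(sequence, n):
    """Return every run of n consecutive items in sequence, as tuples."""
    items = list(sequence)
    # One window per valid start position; empty when the sequence is shorter than n.
    return [tuple(items[i:i + n]) for i in range(len(items) - n + 1)]


print(simple_ngrams("pizza", 3))                         # character-based trigrams
print(simple_ngrams("The quick brown fox".split(), 2))   # word-based bigrams
```

The same function handles both cases because a string is a sequence of characters and a split sentence is a sequence of words.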
  30. From n-grams to Language Model
  31. From n-grams to Language Model
    • Given a large dataset of text
    • Find all the n-grams
    • Compute probabilities, e.g. count bigrams
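The three steps above can be sketched with the standard library. The slide's formula is not preserved in this transcript, so the sketch assumes the usual maximum-likelihood estimate, P(next | previous) = count(previous, next) / count(previous). The function name `bigram_probabilities` and the tiny corpus are invented for illustration.

```python
from collections import Counter, defaultdict


def bigram_probabilities(tokens):
    """Estimate P(next | previous) from bigram counts (maximum likelihood)."""
    bigram_counts = Counter(zip(tokens, tokens[1:]))
    # Count each word only where it starts a bigram, so each row sums to 1.
    previous_counts = Counter(tokens[:-1])
    probs = defaultdict(dict)
    for (previous, nxt), count in bigram_counts.items():
        probs[previous][nxt] = count / previous_counts[previous]
    return probs


# Toy corpus, invented for this example.
corpus = "i am going home . i am going out . i am here .".split()
model = bigram_probabilities(corpus)
print(model["am"])  # "going" follows "am" twice, "here" once
```

A real model would be trained on far more text and would typically smooth the counts so that unseen bigrams do not get probability zero.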
  32. Example: Predictive Text in Mobile
  33. Example: Predictive Text in Mobile
  34. Example: Predictive Text in Mobile
    most likely next word
  35. Example: Predictive Text in Mobile
    Marco is …
  36. Example: Predictive Text in Mobile
    Marco is a good time to get the latest flash player is required for video playback is unavailable right now because this video is not sure if you have a great day.
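The run-on sentence on the slide is what you get by repeatedly accepting the most likely next word. A minimal sketch of such a loop, assuming a simple bigram model (real keyboards use more elaborate models); the function names and the training text are hypothetical:

```python
from collections import Counter, defaultdict


def train_next_word_counts(tokens):
    """Map each word to a Counter of the words that follow it."""
    following = defaultdict(Counter)
    for previous, nxt in zip(tokens, tokens[1:]):
        following[previous][nxt] += 1
    return following


def autocomplete(following, seed, max_words=10):
    """Greedily extend seed by always taking the most frequent next word."""
    words = seed.split()
    for _ in range(max_words):
        candidates = following.get(words[-1])
        if not candidates:
            break  # the last word was never seen with a successor
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)


# Toy training text, invented for this example.
text = "marco is a good time to get the latest news . this is a good time"
following = train_next_word_counts(text.split())
print(autocomplete(following, "marco is", max_words=6))
```

Each step looks only at the previous word, so the output is locally plausible but has no overall direction.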
  37. Limitations of LM so far
  38. Limitations of LM so far
    • P(word | full history) is too expensive
    • P(word | previous few words) is feasible
    • … Local context only! Lack of global context
  39. QUICK INTRO TO NEURAL NETWORKS
  40. Neural Networks
  41. Neural Networks
    (diagram nodes: x1, x2; h1, h2, h2; y1)
  42. Neural Networks
    (diagram: input layer x1, x2; hidden layer(s) h1, h2, h3; output layer y1)
  43. Neurone Example
  44. Neurone Example
    (diagram: inputs x1, x2 with weights w1, w2; output ?)
  45. Neurone Example
    (diagram: inputs x1, x2 with weights w1, w2; output F(w1x1 + w2x2))
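The neurone's output, F(w1x1 + w2x2), is straightforward to compute by hand. The slide does not say which activation function F is, so the sketch below assumes a sigmoid; the function names and the input values are illustrative.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(x1, x2, w1, w2, activation=sigmoid):
    """Compute F(w1*x1 + w2*x2) for a two-input neurone (no bias term, as on the slide)."""
    return activation(w1 * x1 + w2 * x2)


# The weighted sum is 0.5*1.0 + (-0.25)*2.0 = 0.0, and sigmoid(0.0) is 0.5.
print(neuron(x1=1.0, x2=2.0, w1=0.5, w2=-0.25))
```

Any other activation can be swapped in through the `activation` argument.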
  46. Training the Network
  47. Training the Network
    • Random weight init
    • Run input through the network
    • Compute error (loss function)
    • Use error to adjust weights (gradient descent + back-propagation)
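The four steps can be seen end to end on the smallest possible "network": a single linear neurone trained with stochastic gradient descent on squared error. This is a sketch under simplifying assumptions (linear activation, no bias, noise-free toy data); with only one neurone, back-propagation reduces to a single application of the chain rule. All names and numbers are invented for illustration.

```python
import random


def train(data, learning_rate=0.05, epochs=200, seed=0):
    """Fit y = w1*x1 + w2*x2 by stochastic gradient descent on squared error."""
    rng = random.Random(seed)
    # 1. Random weight initialisation.
    w1, w2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
    for _ in range(epochs):
        for (x1, x2), target in data:
            # 2. Run the input through the "network" (one linear neurone).
            prediction = w1 * x1 + w2 * x2
            # 3. Compute the error; the loss is 0.5 * error ** 2.
            error = prediction - target
            # 4. Move each weight against the gradient of the loss.
            w1 -= learning_rate * error * x1
            w2 -= learning_rate * error * x2
    return w1, w2


# Toy data generated from y = 2*x1 - 3*x2.
data = [((x1, x2), 2 * x1 - 3 * x2) for x1 in (-1, 0, 1, 2) for x2 in (-1, 0, 1)]
w1, w2 = train(data)
print(w1, w2)  # should end up close to 2 and -3
```

Libraries such as Keras automate exactly this loop for networks with many layers and weights.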
  48. More on Training
  49. More on Training
    • Batch size
    • Iterations and Epochs
    • e.g. with 1,000 data points and batch size = 100, we need 10 iterations to complete 1 epoch
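The arithmetic generalises to batch sizes that do not divide the dataset evenly. A small sketch, assuming the final partial batch is still processed; the function name is hypothetical.

```python
import math


def iterations_per_epoch(num_samples, batch_size):
    """Number of batches needed to see every sample once."""
    # The last batch may be smaller than the others, hence the ceiling.
    return math.ceil(num_samples / batch_size)


print(iterations_per_epoch(1000, 100))  # 10, as in the slide's example
print(iterations_per_epoch(1000, 128))  # 8: seven full batches plus one of 104
```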
  50. RECURRENT NEURAL NETWORKS
  51. Limitation of FFNN
  52. Limitation of FFNN
    Input and output of fixed size
  53. Recurrent Neural Networks
  54. Recurrent Neural Networks
    http://colah.github.io/posts/2015-08-Understanding-LSTMs/
  55. Recurrent Neural Networks
    http://colah.github.io/posts/2015-08-Understanding-LSTMs/
  56. Limitation of RNN
  57. Limitation of RNN
    “Vanishing gradient”: cannot “remember” what happened long ago
  58. Long Short Term Memory
  59. Long Short Term Memory
    http://colah.github.io/posts/2015-08-Understanding-LSTMs/
  60. https://en.wikipedia.org/wiki/Long_short-term_memory

  61. A BIT OF PRACTICE
  62. Deep Learning in Python
  63. Deep Learning in Python
    • Some NN support in scikit-learn
    • Many low-level frameworks: Theano, PyTorch, TensorFlow
    • … Keras!
    • Probably more
  64. Keras
  65. Keras
    • Simple, high-level API
    • Uses TensorFlow, Theano or CNTK as backend
    • Runs seamlessly on GPU
    • Easier to start with
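  66. LSTM Example
  67. LSTM Example: Define the network
        # Imports are not shown on the slide; these are the standard Keras locations.
        from keras.models import Sequential
        from keras.layers import Dense, LSTM

        model = Sequential()
        model.add(
            LSTM(
                128,
                input_shape=(maxlen, len(chars))
            )
        )
        model.add(Dense(len(chars), activation='softmax'))
  68. LSTM Example: Configure the network
        from keras.optimizers import RMSprop

        # Newer Keras versions name this argument learning_rate instead of lr.
        optimizer = RMSprop(lr=0.01)
        model.compile(
            loss='categorical_crossentropy',
            optimizer=optimizer
        )
  69. LSTM Example: Train the network
        model.fit(x, y, batch_size=128, epochs=60, callbacks=[print_callback])
        model.save('char_model.h5')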
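The slides use `x`, `y`, `chars` and `maxlen` without showing how they are built. One plausible preparation for a character-level model is sketched below in plain Python: slide a window of `maxlen` characters over the text and pair each window with the character that follows it, one-hot encoding both. The function name, the `step` parameter and the sample text are assumptions made for illustration, not taken from the talk.

```python
def make_training_windows(text, maxlen=5, step=1):
    """Cut text into fixed-length windows, each paired with the character that follows."""
    chars = sorted(set(text))
    char_indices = {c: i for i, c in enumerate(chars)}

    def one_hot(char):
        vector = [0] * len(chars)
        vector[char_indices[char]] = 1
        return vector

    x, y = [], []
    for start in range(0, len(text) - maxlen, step):
        window = text[start:start + maxlen]
        x.append([one_hot(c) for c in window])   # shape: (maxlen, len(chars))
        y.append(one_hot(text[start + maxlen]))  # the character to predict
    return chars, x, y


chars, x, y = make_training_windows("hello world", maxlen=5)
print(len(x), len(x[0]), len(x[0][0]))  # samples, maxlen, len(chars)
```

To feed these into Keras, the nested lists would be converted to NumPy arrays, which gives `x` the `(samples, maxlen, len(chars))` shape implied by `input_shape` above.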
  70. LSTM Example: Generate text
        for i in range(output_size):
            ...
            preds = model.predict(x_pred, verbose=0)[0]
            next_index = sample(preds, diversity)
            next_char = indices_char[next_index]
            generated += next_char
  71. LSTM Example: Seed text
        for i in range(output_size):
            ...
            preds = model.predict(x_pred, verbose=0)[0]
            next_index = sample(preds, diversity)
            next_char = indices_char[next_index]
            generated += next_char
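The `sample(preds, diversity)` helper is not shown on the slides. A common way to implement this kind of function is temperature sampling: rescale the predicted probabilities by `diversity`, then draw an index at random. The version below is a hypothetical stand-in written with the standard library only, not the code used in the talk.

```python
import math
import random


def sample(preds, diversity=1.0, rng=random):
    """Draw an index from preds after rescaling it by a temperature (diversity)."""
    # Low diversity sharpens the distribution; high diversity flattens it.
    logits = [math.log(max(p, 1e-12)) / diversity for p in preds]
    highest = max(logits)
    weights = [math.exp(logit - highest) for logit in logits]  # subtract max for stability
    return rng.choices(range(len(preds)), weights=weights, k=1)[0]


preds = [0.7, 0.2, 0.1]
rng = random.Random(42)
draws = [sample(preds, diversity=0.2, rng=rng) for _ in range(1000)]
print(draws.count(0) / 1000)  # close to 1: low diversity almost always picks the top index
```

With `diversity=1.0` the draws follow `preds` unchanged; larger values make unlikely characters more frequent, which tends to give more surprising and more error-prone text.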
  72. Sample Output
  73. Sample Output (seed text; after 1 epoch)
    are the glories it included. Now am I lrA to r ,d?ot praki ynhh kpHu ndst -h ahh umk,hrfheleuloluprffuamdaedospe aeooasak sh frxpaphrNumlpAryoaho (…)
  74. Sample Output (after ~5 epochs)
    I go from thee: Bear me forthwitht wh, t che f uf ld,hhorfAs c c ff.h scfylhle, rigrya p s lee rmoy, tofhryg dd?ofr hl t y ftrhoodfe- r Py (…)
  75. Sample Output (after 20+ epochs)
    a wild-goose flies, Unclaim'd of any manwecddeelc uavekeMw gh whacelcwiiaeh xcacwiDac w fioarw ewoc h feicucra h,h, :ewh utiqitilweWy ha.h pc'hr, lagfh eIwislw ofiridete w laecheefb .ics,aicpaweteh fiw?egp t? (…)
  76. Tuning
  77. Tuning
    • More layers?
    • More hidden nodes? Or fewer?
    • More data?
    • A combination?
  78. Tuning (after 1 epoch)
    Wyr feirm hat. meancucd kreukk? , foremee shiciarplle. My, Bnyivlaunef sough bus: Wad vomietlhas nteos thun. lore orain, Ty thee I Boe, I rue. niat
  79. Tuning (much later)
    Second Lord: They would be ruled after this chamber, and my fair nues begun out of the fact, to be conveyed, Whose noble souls I'll have the heart of the wars. Clown: Come, sir, I will make did behold your worship.
    http://karpathy.github.io/2015/05/21/rnn-effectiveness/
  80. FINAL REMARKS
  81. A Couple of Tips
  82. A Couple of Tips
    • You’ll need a GPU
    • Develop locally on very small dataset, then run on cloud on real data
    • At least 1M characters in input, at least 20 epochs for training
    • model.save() !!!
  83. Summary
    • Natural Language Generation is fun
    • Simple models vs. Neural Networks
    • Keras makes your life easier
    • A lot of trial-and-error!
  84. THANK YOU
    @MarcoBonzanini
    speakerdeck.com/marcobonzanini
    GitHub.com/bonzanini
    marcobonzanini.com

  85. Readings & Credits
    • Brandon Rohrer on "Recurrent Neural Networks (RNN) and Long Short-Term Memory (LSTM)": https://www.youtube.com/watch?v=WCUNPb-5EYI
    • Chris Olah on "Understanding LSTM Networks": http://colah.github.io/posts/2015-08-Understanding-LSTMs/
    • Andrej Karpathy on "The Unreasonable Effectiveness of Recurrent Neural Networks": http://karpathy.github.io/2015/05/21/rnn-effectiveness/
    Pics:
    • Weather forecast icon: https://commons.wikimedia.org/wiki/File:Newspaper_weather_forecast_-_today_and_tomorrow.svg
    • Stack of papers icon: https://commons.wikimedia.org/wiki/File:Stack_of_papers_tied.svg
    • Document icon: https://commons.wikimedia.org/wiki/File:Document_icon_(the_Noun_Project_27904).svg
    • News icon: https://commons.wikimedia.org/wiki/File:PICOL_icon_News.svg
    • Cortana icon: https://upload.wikimedia.org/wikipedia/commons/thumb/8/89/Microsoft_Cortana_light.svg/1024px-Microsoft_Cortana_light.svg.png
    • Siri icon: https://commons.wikimedia.org/wiki/File:Siri_icon.svg
    • Google assistant icon: https://commons.wikimedia.org/wiki/File:Google_mic.svg