
English to Katakana with Sequence-to-Sequence Learning in Keras


A presentation at the TensorFlow Tokyo Meetup about building a Sequence-to-Sequence model in Keras to write Katakana.

This presentation is originally from a blog post:
https://wanasit.github.io/english-to-katakana-using-sequence-to-sequence-in-keras.html

Wanasit Tanakitrungruang

August 15, 2017


Transcript

1. About me • Search Quality @ Indeed • #1 Jobsearch Website • Information Extraction • e.g. Skills, Salary • Love working with Text/NLP • Lucene, RegEx, Text Search • Deep learning NLP
2. Indeed’s Tech Talk: “From Data to Deployment: Full Stack Data Science” • Wed, 16 Aug 2017, 19:00 - 21:00 (tomorrow) • Indeed Tokyo Tech Office, Ebisu Garden Place Tower (this building, 32F) • Register at Meetup/Doorkeeper, search “Indeed tech talk” • Free beer/pizza!!
3. Problems with TF’s tutorial • Issue #550: Translate.py - hard to detect convergence • Issue #600: Translate (Seq2Seq) Tutorial Expectations
4. DL Machine Translation • Requires large datasets • e.g. 20 GB of translation pairs • It’s unlikely you would find them in the real world • Requires long training time • Training Time >>> Writing Code • e.g. one-line change, then wait 12 hours • No fine-tuning • No pre-trained VGG, Inception, etc.
5. Why Katakana • A smaller machine translation problem • Requires a smaller dataset • Wikipedia title pairs (on my GitHub) • Requires a smaller model • 1-layer Seq2Seq without attention • a few hours on CPU
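Before any of the models below, each English/Katakana pair has to be turned into fixed-length sequences of character indices. A minimal preprocessing sketch under assumed names (the PAD/START indices, sequence lengths, and sample word lists are illustrative, not from the deck; the real lists would be read from the Wikipedia title-pair file on GitHub):

    import numpy as np

    PAD, START = 0, 1                         # assumed special indices: padding and start-of-sequence
    ENGLISH_LENGTH, KATAKANA_LENGTH = 20, 20  # assumed padded sequence lengths

    def build_dict(words):
        # Map every character to an integer index, reserving 0 and 1 for PAD / START
        chars = sorted({c for word in words for c in word})
        return {c: i + 2 for i, c in enumerate(chars)}

    def encode(word, char_dict, length):
        # Fixed-length, zero-padded sequence of character indices
        indices = [char_dict[c] for c in word[:length]]
        return indices + [PAD] * (length - len(indices))

    # Stand-in data; the real english_words / katakana_words come from the title pairs
    english_words = ['banana', 'coffee']
    katakana_words = ['バナナ', 'コーヒー']

    input_dict = build_dict(english_words)
    output_dict = build_dict(katakana_words)
    input_dict_size = len(input_dict) + 2     # + PAD and <START>
    output_dict_size = len(output_dict) + 2

    encoder_input_data = np.array([encode(w, input_dict, ENGLISH_LENGTH) for w in english_words])
    decoder_target_data = np.array([encode(w, output_dict, KATAKANA_LENGTH) for w in katakana_words])
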
6. Introduction to Sequence-to-Sequence • Recurrent Neural Network (RNN / LSTM), Encoder / Decoder • References ◦ Sequence-to-Sequence (paper) ◦ Understanding LSTM Networks ◦ The Unreasonable Effectiveness of Recurrent Neural Networks
7. RNN to summarize input: take the output after feeding in all of the input. (Diagram: the words “... not that terribly bad” are fed in one at a time, and the final output answers “Good?”)
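A minimal Keras sketch of this “summarize” pattern (sizes and variable names are illustrative, not from the deck): an LSTM reads the whole embedded sentence, only its final output is kept (return_sequences=False), and a Dense layer turns that summary into a “Good?” score.

    from keras.layers import Input, Embedding, LSTM, Dense
    from keras.models import Model

    MAX_WORDS, VOCAB_SIZE = 30, 10000                             # illustrative sizes

    review_input = Input(shape=(MAX_WORDS,))                      # integer-encoded words
    x = Embedding(VOCAB_SIZE, 64, mask_zero=True)(review_input)
    summary = LSTM(64, return_sequences=False)(x)                 # keep only the final output
    good_or_bad = Dense(1, activation='sigmoid')(summary)         # "Good?" probability

    classifier = Model(inputs=review_input, outputs=good_or_bad)
    classifier.compile(optimizer='adam', loss='binary_crossentropy')
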
8. RNN to generate output: keep sampling the next output from the RNN (language model). (Diagram: starting from <START>, the model writes “Hi , my name ...”)
9. RNN to generate output ++: feeding the previous output back in as input helps the RNN memorize what it has generated so far. (Diagram: <START>, “Hi”, “,”, “my”, ... are fed back in while generating “Hi , my name ...”)
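Slides 8-9 describe generation as a loop: sample the next token, append it to the sequence, and feed the growing sequence back in. A framework-agnostic sketch of that loop (predict_next stands in for whatever language model provides the next-token prediction):

    def generate(predict_next, start_token='<START>', end_token='<END>', max_len=20):
        # Keep sampling the next token and feed the previous outputs back in as input.
        sequence = [start_token]
        for _ in range(max_len):
            next_token = predict_next(sequence)   # e.g. "Hi", ",", "my", "name", ...
            if next_token == end_token:
                break
            sequence.append(next_token)           # previous output becomes part of the next input
        return sequence[1:]
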
10. Sequence-to-Sequence Model: use two RNNs to read the input and write the output, an Encoder and a Decoder. (Diagram: the Encoder reads English characters such as “B A N A N A”; the Decoder, starting from “<>”, writes Katakana characters such as “バ ナ ナ”.)
11. Sequence-to-Sequence Model in Keras • References ◦ Getting started with the Keras functional API ◦ Keras’s Recurrent Layers
12. Keras functional API: a 3-layer neural network in Keras

    from keras.layers import Input, Dense
    from keras.models import Model

    inputs = Input(shape=(784,))
    x = Dense(64, activation='relu')(inputs)
    x = Dense(64, activation='relu')(x)
    predictions = Dense(10, activation='softmax')(x)

    model = Model(inputs=inputs, outputs=predictions)
    model.compile(optimizer='sgd', loss='categorical_crossentropy')
    model.fit(data, labels)

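The snippet ends with model.fit(data, labels), but data and labels are not defined on the slide. A hypothetical usage example with random NumPy arrays, just to show the shapes this model expects:

    import numpy as np
    from keras.utils import to_categorical

    data = np.random.random((1000, 784))                                      # 1000 samples, 784 features each
    labels = to_categorical(np.random.randint(10, size=(1000,)), num_classes=10)  # one-hot class targets

    model.fit(data, labels, batch_size=32, epochs=5)
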
13. Sequence-to-Sequence Model (the same Encoder/Decoder diagram as before: the Encoder reads the English characters, the Decoder writes the Katakana characters)
14. Encoder in Keras

    encoder = Embedding(input_dict_size, 64,
                        input_length=ENGLISH_LENGTH,
                        mask_zero=True)(encoder_input)
    encoder = LSTM(units=64, return_sequences=False)(encoder)

(Diagram: encoder_input → Embedding → LSTM, character by character, ending in the encoder output)
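The slide does not show where encoder_input comes from; in the Keras functional API it would be an Input layer of integer-encoded English characters. A sketch, assuming the same names as above:

    from keras.layers import Input, Embedding, LSTM

    encoder_input = Input(shape=(ENGLISH_LENGTH,))   # padded sequence of English character indices
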
15. Decoder in Keras

    decoder = Embedding(output_dict_size, 64,
                        input_length=KATAKANA_LENGTH,
                        mask_zero=True)(decoder_input)
    decoder = LSTM(64, return_sequences=True)(decoder,
                                              initial_state=[encoder, encoder])
    decoder = TimeDistributed(
        Dense(output_dict_size, activation="softmax"))(decoder)

(Diagram: decoder_input → Embedding → LSTM initialized with the encoder output → a Dense softmax at every time step → decoder_output)
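Likewise, decoder_input would be an Input of integer-encoded Katakana characters, shifted one step so position t sees the character produced at t-1 (a sketch, same assumptions as above). Note that initial_state=[encoder, encoder] seeds both the decoder LSTM’s hidden state and its cell state with the encoder’s final output.

    from keras.layers import Input

    decoder_input = Input(shape=(KATAKANA_LENGTH,))  # <START> followed by the Katakana written so far
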
16. Combining them into a model

    encoder = Embedding(input_dict_size, 64,
                        input_length=ENGLISH_LENGTH, mask_zero=True)(encoder_input)
    encoder = LSTM(64, return_sequences=False)(encoder)

    decoder = Embedding(output_dict_size, 64,
                        input_length=KATAKANA_LENGTH, mask_zero=True)(decoder_input)
    decoder = LSTM(units=64, return_sequences=True)(decoder,
                                                    initial_state=[encoder, encoder])
    decoder = TimeDistributed(
        Dense(output_dict_size, activation="softmax"))(decoder)

    model = Model(inputs=[encoder_input, decoder_input], outputs=[decoder])
    model.compile(optimizer='adam', loss='binary_crossentropy')

    model.fit(x=[training_encoder_input, training_decoder_input],
              y=[training_decoder_output],
              batch_size=64, epochs=60)

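The trained model takes both the English input and the Katakana written so far, so generation is a loop: start the decoder input with only the start symbol, predict one character at a time, and copy each prediction back into the decoder input. A greedy-decoding sketch, assuming the training targets are the decoder input shifted one step left, with index 0 as padding and index 1 as the start symbol (assumptions, not shown in the deck):

    import numpy as np

    def to_katakana(english_indices):
        # english_indices: padded sequence of English character indices, length ENGLISH_LENGTH
        encoder_in = np.array([english_indices])                  # shape (1, ENGLISH_LENGTH)
        decoder_in = np.zeros((1, KATAKANA_LENGTH), dtype=int)
        decoder_in[0, 0] = 1                                      # <START>

        for i in range(1, KATAKANA_LENGTH):
            prediction = model.predict([encoder_in, decoder_in])  # (1, KATAKANA_LENGTH, output_dict_size)
            next_char = prediction[0, i - 1].argmax()             # most likely character at this step
            if next_char == 0:                                    # padding index: nothing left to write
                break
            decoder_in[0, i] = next_char                          # feed the prediction back in

        return decoder_in[0, 1:]                                  # generated Katakana character indices
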
17. Summary • Why Katakana? • Introduction to the Sequence-to-Sequence Model • Sequence-to-Sequence Model in Keras. Try training a machine to write Katakana!