A Multilayer Convolutional Encoder-Decoder Neural Network for Grammatical Error Correction

A Multilayer Convolutional Encoder-Decoder Neural Network for Grammatical Error Correction
Shamil Chollampatt and Hwee Tou Ng Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018 2018-11-13 0

• Convolutional Encoder-DecoderGEC • • RNN •
Pre-trained word embeddings A Multilayer Convolutional Encoder-Decoder Neural Network for Grammatical Error Correction 1

A Multilayer Convolutional Encoder-Decoder NN 2

• fastTextword embeddings üEnglish ü Pre-trained
word embeddings 3

# • Edit Operation (EO) • !$"$" !
• Language model (LM) • 5-gram LM • $" Rescore 4

• Training • Lang-8 + NUCLE (1.3M sentence pairs) •
Development • NUCLE (5.4K sentence pairs) • Pre-training word embeddings • Wikipedia (1.78B words) • Training language model • Common Crawl corpus (94B words) Dataset 5

Result 6

Result Pre-trained embeddings → 7

Result Ensemble → 8

Result +Rescore → 9

Result +SpellCheck → 10

Result → SoTA 11

RNN vs CNN 12

RNN vs CNN 13

RNN vs CNN 14 RNN*-'+)%$ -'"& → Precision
CNN,'# → (! → Recall

Embedding Initialization 15

• Convolutional Encoder-DecoderGEC • CNNRNN • SoTA
• Pre-trained word embeddings • Language modelEdit Operation • Conclusion 16

Model and Training Details • Source and target embeddings: 500
dimensions • Source and target vocabularies: 30K (BPE) • Pre-trained word embeddings • Using fastText • On the Wikipedia corpus • Using a skip-gram model with a window size of 5 • Character N-gram sequences of size between 3 and 6 • Encoder-decoder • 7 convolutional layers • With a convolution window width of 3 • Output of each encoder and decoder layer: 1024 dimensions • Dropout: 0.2 • Batch size: 32 • Learning rate: 0.25 with learning rate annealing factor of 0.1 • Momentum value: 0.99 • Beam width: 12 • Training a single model tales around 18 hours 18

19 Other Result

20 Analysis

21 https://github.com/nusnlp/mlconvgec2018

A Multilayer Convolutional Encoder-Decoder Neur...

A Multilayer Convolutional Encoder-Decoder Neural Network for Grammatical Error Correction

youichiro

More Decks by youichiro

Other Decks in Technology

Featured

Transcript