
A Multilayer Convolutional Encoder-Decoder Neural Network for Grammatical Error Correction

youichiro
November 12, 2018

Nagaoka University of Technology
Natural Language Processing Laboratory
Paper reading (2018-11-13)

Transcript

  1. A Multilayer Convolutional Encoder-Decoder Neural Network for Grammatical Error Correction

    Shamil Chollampatt and Hwee Tou Ng. Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18), 2018. Presented 2018-11-13.
  2. • Proposes a GEC approach based on a multilayer convolutional encoder-decoder network • Outperforms RNN-based encoder-decoder approaches

     • Initialized with pre-trained word embeddings  A Multilayer Convolutional Encoder-Decoder Neural Network for Grammatical Error Correction
  3. Rescoring features • Edit Operation (EO): counts of token substitutions, deletions, and insertions between the source and the hypothesis

     • Language model (LM): score from a 5-gram LM • The N-best outputs of beam search are rescored with these features (a rescoring sketch follows below)  Rescore
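A minimal sketch of this rescoring step, assuming a hypothetical `lm_logprob` function and hand-picked feature weights (in the paper the 5-gram LM is trained on Common Crawl and the feature weights are tuned on development data); it is an illustration, not the authors' implementation:

```python
# Rough sketch of N-best rescoring with edit-operation (EO) counts and a
# language-model (LM) score. `lm_logprob` and the weight values are
# hypothetical placeholders, not the paper's actual features or tuned weights.
from difflib import SequenceMatcher


def edit_operation_counts(source_tokens, hypothesis_tokens):
    """Count token-level substitutions, deletions and insertions."""
    subs = dels = ins = 0
    for op, i1, i2, j1, j2 in SequenceMatcher(
            a=source_tokens, b=hypothesis_tokens).get_opcodes():
        if op == "replace":
            subs += max(i2 - i1, j2 - j1)
        elif op == "delete":
            dels += i2 - i1
        elif op == "insert":
            ins += j2 - j1
    return subs, dels, ins


def rescore(source_tokens, nbest, lm_logprob, weights):
    """Pick the best hypothesis from (tokens, decoder_score) pairs."""
    best_hyp, best_score = None, float("-inf")
    for hyp_tokens, decoder_score in nbest:
        subs, dels, ins = edit_operation_counts(source_tokens, hyp_tokens)
        score = (decoder_score
                 + weights["sub"] * subs
                 + weights["del"] * dels
                 + weights["ins"] * ins
                 + weights["lm"] * lm_logprob(hyp_tokens))
        if score > best_score:
            best_hyp, best_score = hyp_tokens, score
    return best_hyp
```

In effect, the learned weights decide how strongly the reranker favors fluent (high LM score) hypotheses versus conservative ones with few edits.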
  4. • Training: Lang-8 + NUCLE (1.3M sentence pairs) • Development: NUCLE (5.4K sentence pairs)

     • Pre-training word embeddings: Wikipedia (1.78B words) • Training language model: Common Crawl corpus (94B words)  Dataset
  5. RNN vs CNN • RNN encodes the whole source sentence (global context) → tends toward higher Precision

     • CNN captures the local context around each word through its convolution windows → makes more corrections → higher Recall (see the F0.5 example below)
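The precision/recall trade-off on this slide feeds into the F0.5 metric used for CoNLL-2014 evaluation in the paper, which weights precision twice as heavily as recall. A small worked example with made-up numbers (not results from the paper):

```python
def f_beta(precision, recall, beta=0.5):
    """F_beta: beta < 1 emphasizes precision; beta = 0.5 is standard for GEC."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    return (1 + beta ** 2) * precision * recall / (beta ** 2 * precision + recall)


# Illustrative numbers only:
print(f_beta(0.60, 0.25))  # precision-leaning system -> ~0.469
print(f_beta(0.40, 0.45))  # recall-leaning system    -> ~0.409
```

Under F0.5 the precision-leaning system scores higher despite its much lower recall, which is why the precision/recall behaviour of the two decoder types matters for the final score.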
  6. • Proposed a GEC approach based on a convolutional encoder-decoder network • The CNN model outperforms RNN-based models and reaches state-of-the-art results

     • Pre-trained word embeddings and rescoring with language model and edit operation features further improve performance  Conclusion
  7.

  8. Model and Training Details • Source and target embeddings: 500 dimensions

     • Source and target vocabularies: 30K (BPE) • Pre-trained word embeddings • Using fastText • On the Wikipedia corpus • Using a skip-gram model with a window size of 5 • Character N-gram sequences of size between 3 and 6 • Encoder-decoder • 7 convolutional layers • With a convolution window width of 3 • Output of each encoder and decoder layer: 1024 dimensions • Dropout: 0.2 • Batch size: 32 • Learning rate: 0.25 with a learning rate annealing factor of 0.1 • Momentum value: 0.99 • Beam width: 12 • Training a single model takes around 18 hours (an architecture sketch with these settings follows below)
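A minimal PyTorch sketch of an encoder using the settings above (500-dimensional embeddings, 7 convolutional layers with window width 3, 1024-dimensional layer outputs, dropout 0.2), in the ConvS2S style (Gehring et al., 2017) that the paper builds on. This is an illustration under those assumptions rather than the authors' implementation; the decoder, attention, and position embeddings are omitted:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConvEncoder(nn.Module):
    """Multilayer convolutional encoder with GLU activations and residuals."""

    def __init__(self, vocab_size=30_000, embed_dim=500, hidden_dim=1024,
                 num_layers=7, kernel_width=3, dropout=0.2):
        super().__init__()
        self.dropout = dropout
        # In the paper the embeddings are initialized from fastText vectors
        # pre-trained on Wikipedia; here they are randomly initialized.
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.in_proj = nn.Linear(embed_dim, hidden_dim)
        # Each convolution outputs 2 * hidden_dim channels so the GLU
        # (gated linear unit) can halve them back to hidden_dim.
        self.convs = nn.ModuleList([
            nn.Conv1d(hidden_dim, 2 * hidden_dim, kernel_width,
                      padding=kernel_width // 2)
            for _ in range(num_layers)
        ])
        self.out_proj = nn.Linear(hidden_dim, embed_dim)

    def forward(self, src_tokens):                    # (batch, src_len)
        x = self.embed(src_tokens)                    # (batch, src_len, embed_dim)
        x = F.dropout(x, self.dropout, self.training)
        x = self.in_proj(x).transpose(1, 2)           # (batch, hidden_dim, src_len)
        for conv in self.convs:
            residual = x
            h = F.dropout(x, self.dropout, self.training)
            h = F.glu(conv(h), dim=1)                 # gate over the channel dim
            x = (h + residual) * (0.5 ** 0.5)         # scaled residual connection
        return self.out_proj(x.transpose(1, 2))       # states used by decoder attention


# Example usage on a dummy batch of 32 source sentences of length 20:
# encoder = ConvEncoder()
# states = encoder(torch.randint(0, 30_000, (32, 20)))  # -> (32, 20, 500)
```

The optimizer settings on the slide (learning rate 0.25 with annealing factor 0.1, momentum 0.99, batch size 32) and the beam width of 12 belong to the training and decoding loops, which are not shown here.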