Slide 15
Experimental setup
● Language models
○ 5-gram LM: KenLM (scoring sketch after this list)
○ Neural LM: Transformer decoder architecture
○ Dataset: One Billion Word Benchmark
● Seq2seq models
○ SMT: (Junczys-Dowmunt and Grundkiewicz 2016)
○ NMT: Transformer
○ Datasets: NUCLE, Lang-8
● NMT and the neural LM use byte pair encoding (BPE); see the subword sketch below
● Beam size: 12 (beam-search sketch below)
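A minimal scoring sketch for the 5-gram KenLM model, assuming a model already trained on the One Billion Word Benchmark; the file name `1b_benchmark.5gram.binary` is a hypothetical placeholder, not given on the slide:

```python
import kenlm  # pip install kenlm

# Hypothetical path to a 5-gram model trained on the One Billion Word Benchmark
lm = kenlm.Model('1b_benchmark.5gram.binary')

sentence = 'this is a sample hypothesis'
# Total log10 probability, with BOS/EOS markers included
print(lm.score(sentence, bos=True, eos=True))

# Per-word breakdown: (log10 prob, n-gram order used, is-OOV flag)
words = sentence.split() + ['</s>']
for word, (logprob, ngram_len, oov) in zip(words, lm.full_scores(sentence)):
    print(word, logprob, ngram_len, oov)
```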
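A sketch of applying BPE to the NMT and neural-LM inputs with the subword-nmt package; the codes file name and the 32k merge count are assumptions, as the slide does not specify them:

```python
from subword_nmt.apply_bpe import BPE  # pip install subword-nmt

# Hypothetical codes file, learned beforehand with e.g.:
#   subword-nmt learn-bpe -s 32000 < train.tok > codes.bpe
with open('codes.bpe', encoding='utf-8') as codes:
    bpe = BPE(codes)

line = 'untranslatable words get split into subword units'
# Rare words are segmented into subwords joined by the '@@' separator
print(bpe.process_line(line))
```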
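A generic beam-search sketch illustrating decoding with the slide's beam size of 12; `step_fn` is a hypothetical stand-in for one decoder step of the NMT model or neural LM, not an interface from the paper:

```python
import heapq

def beam_search(step_fn, bos_id, eos_id, beam_size=12, max_len=50):
    """step_fn(prefix) -> list of (token_id, log_prob) continuations."""
    beams = [(0.0, [bos_id])]      # (cumulative log-prob, token sequence)
    for _ in range(max_len):
        candidates = []
        for score, seq in beams:
            if seq[-1] == eos_id:  # finished hypothesis: carry it forward
                candidates.append((score, seq))
                continue
            for tok, logp in step_fn(seq):
                candidates.append((score + logp, seq + [tok]))
        # keep only the top-k hypotheses (k = 12 in the experiments)
        beams = heapq.nlargest(beam_size, candidates, key=lambda c: c[0])
        if all(seq[-1] == eos_id for _, seq in beams):
            break
    return max(beams, key=lambda c: c[0])
```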