Slide 1

Slide 1 text

Literature Review, Tuesday, May 23, 2017 Recurrent Neural Network based Language Model Nagaoka University of Technology, Natural Language Processing Laboratory Master's student (2nd year) NGUYEN VAN HAI

Slide 2

Slide 2 text

Information Tomas Mikolov, Martin Karafiat, Lukas Burget, Jan Cernocky, and Sanjeev Khudanpur. Recurrent neural network based language model. In 11th Annual Conference of the International Speech Communication Association (INTERSPEECH 2010), pp. 1045–1048, 2010

Slide 3

Slide 3 text

1. Introduction • Statistical language modeling: • Predict the next word in textual data • Special language domains: • Sentences must be described by parse trees • Morphology of words, syntax, and semantics • There has been significant progress in language modeling • Measured by the ability of models to better predict sequential data

Slide 4

Slide 4 text

2. Model Description • Simple Recurrent Neural Network • Optimization

Slide 5

Slide 5 text

2.1 Simple Recurrent Neural Network
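The slide presumably showed the network diagram and equations as an image; as a reconstruction in the paper's notation, where w(t) is the 1-of-N encoding of the current word, s(t) the hidden "context" state, and y(t) the output distribution:

x(t) = [ w(t)^T  s(t-1)^T ]^T
s_j(t) = f( \sum_i x_i(t) u_{ji} )
y_k(t) = g( \sum_j s_j(t) v_{kj} )

where f is the sigmoid activation f(z) = 1 / (1 + e^{-z}) and g is the softmax g(z_m) = e^{z_m} / \sum_k e^{z_k}.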

Slide 6

Slide 6 text

2.1 Simple Recurrent Neural Network • The network is trained over several epochs • Weights are initialized to small values • The network is trained with the standard backpropagation algorithm and stochastic gradient descent • Error vector: reconstructed below
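The error vector in the paper is computed with cross entropy against the 1-of-N encoding desired(t) of the correct next word:

error(t) = desired(t) - y(t)

Below is a minimal numpy sketch of one such training step. The hidden size 90 matches the WSJ setup later in the talk; the vocabulary size and the names U, W, train_step are illustrative assumptions, not from the paper.

import numpy as np

# Sizes: hidden layer 90 matches the WSJ setup; the 10K vocabulary
# is an illustrative assumption, not from the paper.
V_SIZE, H_SIZE = 10000, 90
rng = np.random.default_rng(0)
# weights initialized to small values, as stated on this slide
U = rng.uniform(-0.1, 0.1, (H_SIZE, V_SIZE + H_SIZE))  # (input + context) -> hidden
W = rng.uniform(-0.1, 0.1, (V_SIZE, H_SIZE))           # hidden -> output

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - z.max())  # subtract max for numerical stability
    return e / e.sum()

def one_hot(i, n):
    v = np.zeros(n)
    v[i] = 1.0
    return v

def train_step(word_idx, next_idx, s_prev, lr=0.1):
    """One SGD step: predict the next word, then backpropagate the error."""
    global U, W
    # forward pass: x(t) concatenates w(t) and s(t-1)
    x = np.concatenate([one_hot(word_idx, V_SIZE), s_prev])
    s = sigmoid(U @ x)   # hidden state s(t)
    y = softmax(W @ s)   # distribution over the next word, y(t)
    # error(t) = desired(t) - y(t)
    err_out = one_hot(next_idx, V_SIZE) - y
    # standard backpropagation (no unfolding in time), then SGD update
    err_hid = (W.T @ err_out) * s * (1.0 - s)
    W += lr * np.outer(err_out, s)
    U += lr * np.outer(err_hid, x)
    return s  # becomes s(t-1) for the next step

Per the paper, the initial state s(0) can be set to a vector of small values such as 0.1; each call returns the new state to feed into the next step.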

Slide 7

Slide 7 text

2.2 Optimization • Word probabilities: reconstructed below
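The word-probability assignment behind the rare-token optimization, reconstructed from the paper (all words occurring less often than a threshold are merged into a single rare token; C_rare is the number of such words):

P(w_i(t+1) | w(t), s(t-1)) = y_rare(t) / C_rare   if w_i(t+1) is rare
                             y_i(t)               otherwise

All rare words thus share the rare token's probability mass uniformly, which shrinks the output layer and speeds up both training and testing.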

Slide 8

Slide 8 text

3. Experiments • Wall Street Journal (WSJ) Experiments • NIST Rich Transcription Evaluation 2005 (RT05) Experiments

Slide 9

Slide 9 text

3.1 WSJ Experiments • Training corpus: • 37M words from the NYT section of English Gigaword • Training on 6.4M words (300K sentences) • Perplexity evaluated on 230K words • Kneser-Ney smoothed 5-gram model, denoted KN5 • RNN 90/2: • Hidden layer size is 90 • Threshold for merging words into the rare token is 2
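For reference, the perplexity reported in these experiments is the standard per-word measure (not restated on the slide):

PPL = exp( -(1/N) \sum_{i=1}^{N} ln P(w_i | w_1 ... w_{i-1}) )

so lower is better; N here is the 230K evaluation words.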

Slide 10

Slide 10 text

3.1 WSJ Experiments

Slide 11

Slide 11 text

3.1 WSJ Experiments

Slide 12

Slide 12 text

3.1 WSJ Experiments

Slide 13

Slide 13 text

3.2 NIST RT05 Experiments

Slide 14

Slide 14 text

Conclusion and future work • On WSJ, WER reduction: • Around 18% with the same training data • Around 12% when the backoff model is trained on 5 times more data than the RNN model • On NIST RT05, the RNN model can outperform big backoff models