Slide 12
Adaptation: Experimental Results
Adaptation method       EN-SC   EN-ROBOT   JP-ROBOT
(No adaptation)         34.70     8.99       6.24
Cross Entropy           10.78     6.44       4.06
sMBR [1]                10.83     6.65       3.14
KL-regularization [2]   10.77     6.23       3.28
ViterbiNet AM            9.65     6.09       2.93
ViterbiNet WFST         13.08     3.89       3.06
ViterbiNet E2E           9.26     3.54       2.66
Proposed methods
• ViterbiNet AM: retrain only the AM parameters, keeping the WFST parameters fixed
• ViterbiNet WFST: retrain only the WFST weights, keeping the AM parameters fixed
• ViterbiNet E2E: retrain both the WFST and AM parameters (see the sketch below)
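As a rough illustration only (not from the slide), the three variants differ solely in which parameter groups remain trainable during adaptation. A minimal PyTorch-style sketch, assuming a hypothetical model with separate `model.am` and `model.wfst` submodules and mode names "am", "wfst", and "e2e":

import torch

def configure_adaptation(model, mode):
    # Decide which parameter groups are retrained in each adaptation variant.
    # "am"   -> ViterbiNet AM   (WFST fixed, AM trainable)
    # "wfst" -> ViterbiNet WFST (AM fixed, WFST weights trainable)
    # "e2e"  -> ViterbiNet E2E  (both trainable)
    train_am = mode in ("am", "e2e")
    train_wfst = mode in ("wfst", "e2e")

    # Freeze or unfreeze each group by toggling requires_grad.
    for p in model.am.parameters():
        p.requires_grad = train_am
    for p in model.wfst.parameters():
        p.requires_grad = train_wfst

    # Hand only the trainable parameters to the optimizer.
    return torch.optim.Adam(
        (p for p in model.parameters() if p.requires_grad), lr=1e-4
    )

# Example: ViterbiNet WFST adaptation (AM frozen, WFST weights retrained)
# optimizer = configure_adaptation(model, "wfst")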
Sentence Error Rate (%)
[1] M. Gibson and T. Hain, “Hypothesis spaces for minimum Bayes risk training in large vocabulary speech recognition,” in 9th International Conference on Spoken Language Processing, 2006.
[2] D. Yu, K. Yao, H. Su, G. Li, and F. Seide, “KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition,” in Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2013, pp. 7893–7897.