Confidence Modeling for Neural Machine Translation
Taichi Aida, Kazuhide Yamamoto
Nagaoka University of Technology
IALP2019
Slide 2
Introduction
- Neural Machine Translation (NMT) systems are widely used
- The quality of NMT output varies widely
- Some outputs may be mistranslations
➢ Can we estimate the quality of a sentence while it is being generated?
Slide 3
Introduction
- Our goal is to output only high-quality translations
- Low-quality translations are NOT output
➔ number of input sentences > number of output sentences
- The translations that are output are reliable
- This helps translators
Slide 4
Methods
Figure 1. Overview of proposed method
Slide 7
Methods
- Indices
- Sentence log-likelihood
- Average variance
Slide 8
Sentence log-likelihood
- The sum of the log-probabilities of all output words in a sentence
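As a minimal sketch of this index, assume the decoder exposes the probability it assigned to each word it output. The function name `sentence_log_likelihood` and the sample probabilities are hypothetical and do not come from the slides.

```python
import math


def sentence_log_likelihood(token_probs):
    """Return the sum of log-probabilities of the output words.

    token_probs: probabilities (0 < p <= 1) that the model assigned to
    each word it actually emitted, in order.
    """
    if not token_probs:
        raise ValueError("token_probs must contain at least one word")
    return sum(math.log(p) for p in token_probs)


# Hypothetical per-word probabilities for two generated sentences.
confident = [0.9, 0.8, 0.95]
uncertain = [0.4, 0.3, 0.5]

print(sentence_log_likelihood(confident))
print(sentence_log_likelihood(uncertain))
```

Because the index is a plain sum, longer sentences tend to receive lower scores. The slides do not mention length normalization, so none is applied here.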
Slide 10
Average variance
- At each position in the sentence, the top-5 candidates are used to calculate the variance from the top candidate
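The slide does not give a formula, so the following is only one possible reading. It assumes that "variance from top" means the mean squared difference between the top candidate's probability and each of the top-5 probabilities, and that the per-position values are then averaged over the sentence. The function name and the sample distributions are hypothetical.

```python
def average_variance(step_probs, k=5):
    """Average, over output positions, of the spread of the top-k candidates.

    step_probs: one list of candidate probabilities per output position.
    Assumption: the spread at a position is the mean squared difference
    between the top probability and each of the top-k probabilities.
    """
    if not step_probs:
        raise ValueError("step_probs must contain at least one position")
    per_position = []
    for probs in step_probs:
        top = sorted(probs, reverse=True)[:k]
        best = top[0]
        per_position.append(sum((best - p) ** 2 for p in top) / len(top))
    return sum(per_position) / len(per_position)


# Hypothetical distributions: one peaked position and one flat position.
peaked = [[0.9, 0.025, 0.025, 0.025, 0.025]]
flat = [[0.2, 0.2, 0.2, 0.2, 0.2]]

print(average_variance(peaked))
print(average_variance(flat))
```

Under this reading, a large value means the top candidate stands well clear of the alternatives, and a value near zero means the model could not separate its candidates.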
Slide 12
Experiments
1. Appropriateness of the indices
- Correlation with BLEU
- (Pearson correlation coefficient)
2. Using a threshold
- Changing the threshold, we observe:
- the number of output sentences
- the average BLEU of the output sentences
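Both experiments can be sketched in a few lines. The sketch assumes that each test sentence has one confidence score and one sentence-level BLEU score, and that a sentence is output when its score is at or above the threshold. The helper names and the toy values are hypothetical and are not results from the slides.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


def filter_by_threshold(scores, bleus, threshold):
    """Keep sentences whose confidence score is >= threshold.

    Returns (number of output sentences, average BLEU of those sentences).
    The average is None when no sentence passes the threshold.
    """
    kept = [b for s, b in zip(scores, bleus) if s >= threshold]
    average = sum(kept) / len(kept) if kept else None
    return len(kept), average


# Hypothetical confidence scores and sentence-level BLEU values.
scores = [-2.0, -5.0, -9.0, -14.0]
bleus = [40.0, 30.0, 20.0, 10.0]

print(pearson(scores, bleus))
for threshold in (-15.0, -6.0, -3.0):
    print(threshold, filter_by_threshold(scores, bleus, threshold))
```

Raising the threshold outputs fewer sentences. If the index correlates with BLEU, the average BLEU of the remaining sentences should rise at the same time.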
Slide 13
Experiments
- Model
- Transformer (fairseq)
- Data
- ASPEC-JE (translating scientific papers)
Number of sentence pairs:
Train      Validation  Test
1,000,000  1,790       1,812
Slide 14
Results
1. Appropriateness of the indices
- Correlation with BLEU
Index                    ρ(index, BLEU)
Sentence log-likelihood  0.308
Average variance         0.268
Conclusion
➢ We proposed calculating translation confidence from NMT features
○ sentence log-likelihood
○ average variance
➢ This can limit low-quality translations during generation