Literature review: Bag-of-Words as Target for Neural Machine Translation

Bag-of-Words as Target for Neural Machine Translation
Literature review, 2019/1/22
Yumeto Inaoka, Natural Language Processing Laboratory, Nagaoka University of Technology

Literature
• Bag-of-Words as Target for Neural Machine Translation
• Shuming Ma, Xu Sun, Yizhong Wang, Junyang Lin
• Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 332-338, 2018.

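Abstract
• A source sentence can have more than one correct translation
• Existing NMT uses only one of them as the reference
  → the other correct translations are learned as errors
• Correct translations share a similar Bag-of-Words (BoW)
  → BoW can separate correct translations from the rest
• BoW is used to account for correct translations that are not in the training set
  → an advantage is confirmed on Chinese-English translation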

Introduction
• NMT can generate coherent and reasonable translations
• Most current Neural Machine Translation (NMT) systems are based on the Sequence-to-Sequence model (Seq2Seq)

Seq2Seq (Overview)
[Figure: input sentence 私は元気だ → Encoder → Decoder → output sentence <BOS> I am fine <EOS>]

Seq2Seq (Encoder)
[Figure: input sentence 私は元気だ → One-hot vector → Embedding layer → Recurrent layer]
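The encoder pipeline in the figure can be sketched in a few lines of NumPy. This is only an illustration: the tokenization of 私は元気だ, the layer sizes, the random weights, and the plain tanh recurrent cell are all assumptions made for the example, not details taken from the slides or the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy vocabulary and layer sizes, chosen only for illustration.
vocab = {"私": 0, "は": 1, "元気": 2, "だ": 3}
emb_dim, hid_dim = 8, 16

E = rng.normal(scale=0.1, size=(len(vocab), emb_dim))   # embedding matrix
W_x = rng.normal(scale=0.1, size=(emb_dim, hid_dim))    # input-to-hidden weights
W_h = rng.normal(scale=0.1, size=(hid_dim, hid_dim))    # hidden-to-hidden weights


def encode(tokens):
    """Return the hidden state produced after each input token."""
    h = np.zeros(hid_dim)
    states = []
    for tok in tokens:
        one_hot = np.zeros(len(vocab))
        one_hot[vocab[tok]] = 1.0
        x = one_hot @ E                  # embedding layer: selects one row of E
        h = np.tanh(x @ W_x + h @ W_h)   # recurrent layer: plain tanh RNN cell (assumed)
        states.append(h)
    return np.stack(states)


states = encode(["私", "は", "元気", "だ"])
print(states.shape)  # (4, 16)
```

Multiplying a one-hot vector by the embedding matrix is the same as looking up one row, which is why real implementations store token ids rather than one-hot vectors.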

Seq2Seq (Decoder)
[Figure: decoder inputs <BOS> I am fine → One-hot vector → Embedding layer → Recurrent layer → Output layer → One-hot vector → output sentence I am fine <EOS>]
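In the decoder figure, the input sequence is the output sequence shifted by one position: the decoder is fed <BOS> I am fine and is expected to produce I am fine <EOS>. A minimal sketch of building that pair follows; the helper name `make_decoder_pair` is hypothetical.

```python
BOS, EOS = "<BOS>", "<EOS>"


def make_decoder_pair(target_tokens):
    """Build the decoder input and the expected output for one target sentence."""
    decoder_input = [BOS] + target_tokens    # what the decoder is fed at each step
    decoder_target = target_tokens + [EOS]   # what it should predict at each step
    return decoder_input, decoder_target


dec_in, dec_out = make_decoder_pair(["I", "am", "fine"])
for fed, predicted in zip(dec_in, dec_out):
    print(fed, "->", predicted)
```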

Introduction
• NMT uses only a single reference translation for training
• Other correct translations are learned as wrong translations
  → this may hurt the model

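Introduction
• Correct translations share a similar BoW
  → correct and incorrect translations can be told apart by their BoW
• A method that uses both the sentence and the BoW as targets is proposed
  → T.1 is favored over T.2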
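The idea that BoW separates correct from incorrect translations can be illustrated with a toy comparison. The sentences below are invented for this example and are not the ones on the slide, and Jaccard overlap is used only as a simple stand-in for "similar BoW"; it is not the measure used by the proposed method.

```python
def bow(sentence):
    """Bag-of-words of a sentence as a set of lowercased tokens."""
    return set(sentence.lower().split())


def jaccard(a, b):
    """Overlap between two sets: |a & b| / |a | b|."""
    return len(a & b) / len(a | b)


# Hypothetical sentences, invented for illustration only.
reference = "he is very happy today"
candidate_a = "today he is very happy"   # different word order, same meaning
candidate_b = "he is very sad today"     # same word order, wrong meaning

score_a = jaccard(bow(reference), bow(candidate_a))
score_b = jaccard(bow(reference), bow(candidate_b))
print(score_a, score_b)
```

A position-by-position comparison penalizes the first candidate at almost every step even though it is a valid translation, while its BoW matches the reference exactly.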

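Bag-of-Words Generation
• The BoW is generated as in a multi-label classification problem
• The word-level score vectors output by the decoder are summed to obtain a sentence-level score vector
• The sentence-level score vector represents the probability that the corresponding word appears at any position in the sentence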

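Notation
• Number of samples in the dataset: $N$
• i-th sample: $(x^i, y^i)$ (x: source, y: target)
• $x^i = \{x^i_1, x^i_2, \dots, x^i_{L_i}\}$, $y^i = \{y^i_1, y^i_2, \dots, y^i_{M_i}\}$, $b^i = \{b^i_1, b^i_2, \dots, b^i_{K_i}\}$
  ($L_i$, $M_i$, $K_i$: number of elements in $x^i$, $y^i$, $b^i$)
• $b^i$ denotes the BoW of $y^i$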

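Bag-of-Words Generation
$$p_b = \mathrm{softmax}(s_b), \qquad s_b = \sum_{t=1}^{M} s_t$$
($s_t$: word-level score vector at decoder step $t$; $s_b$: sentence-level score vector; $M$: length of the target sentence)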
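The summation described on this slide can be sketched directly in NumPy: word-level score vectors from each decoder step are added up, and a softmax over the vocabulary turns the sum into sentence-level probabilities. The sizes and the random scores are placeholders; in a real model the scores would come from the decoder's output layer.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D score vector."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()


rng = np.random.default_rng(0)
M, V = 4, 6                          # hypothetical target length and vocabulary size
word_scores = rng.normal(size=(M, V))    # one score vector per decoder step

sentence_scores = word_scores.sum(axis=0)   # sum over the M positions
p_b = softmax(sentence_scores)              # sentence-level word probabilities

print(p_b.shape, p_b.sum())
```

Summing over positions removes all information about word order, which is what makes the result a prediction over a bag of words rather than over a sequence.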

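Targets and Loss Function
• Separate loss functions ($l_1$, $l_2$) are defined for sentence translation and for BoW generation
• The two losses are added with weight $\lambda_i$ ($i$: epoch; $k$, $\alpha$: fixed values; $\lambda$: upper limit of the weight)
$$l_1 = -\sum_{t=1}^{M} \log p_{w_t}(y_t)$$
$$l_2 = -\sum_{j=1}^{K} \log p_b(b_j)$$
$$l = l_1 + \lambda_i l_2$$
$$\lambda_i = \min(\lambda, k + \alpha i)$$
($p_{w_t}$: word-level distribution at decoder step $t$; $p_b$: sentence-level BoW distribution; $M$: target length; $K$: number of BoW entries)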
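The combined objective can be sketched as follows: a word-level negative log-likelihood, a BoW negative log-likelihood computed from the summed scores, and a weight that grows with the epoch until it reaches its upper limit. The default values for `lam`, `k`, and `alpha` and all function names are placeholders chosen for illustration, not settings taken from the slides.

```python
import numpy as np


def log_softmax(z):
    """Numerically stable log-softmax along the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))


def bow_weight(epoch, lam=1.0, k=0.1, alpha=0.1):
    """Weight of the BoW loss: min(lam, k + alpha * epoch). Defaults are placeholders."""
    return min(lam, k + alpha * epoch)


def total_loss(scores, target_ids, bow_ids, epoch):
    """scores: (M, V) word-level scores; target_ids: M token ids; bow_ids: K token ids."""
    word_logp = log_softmax(scores)                               # per-step distributions
    l1 = -word_logp[np.arange(len(target_ids)), target_ids].sum()  # translation loss
    bow_logp = log_softmax(scores.sum(axis=0))                    # sentence-level distribution
    l2 = -bow_logp[bow_ids].sum()                                 # BoW loss
    return l1 + bow_weight(epoch) * l2, l1, l2


scores = np.zeros((3, 4))            # uniform scores: 3 steps, vocabulary of 4
loss, l1, l2 = total_loss(scores, target_ids=[1, 2, 1], bow_ids=[1, 2], epoch=0)
print(loss, l1, l2)
```

With this schedule the BoW term contributes little in early epochs and its weight stops growing once it reaches `lam`.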

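Experiments
• Trained on the LDC corpus (1.25M sentence pairs), evaluated on the NIST translation tasks
• Vocabulary size set to 50,000 words each for English and Chinese
• Evaluated with BLEU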
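Limiting the vocabulary to a fixed size is usually done by keeping the most frequent words and mapping everything else to an unknown-word token. The sketch below shows one common way to do this; the special tokens, the function names, the toy corpus, and the choice not to count special tokens toward the limit are assumptions, since the slides give only the vocabulary size.

```python
from collections import Counter


def build_vocab(sentences, max_size, specials=("<unk>", "<BOS>", "<EOS>")):
    """Keep the max_size most frequent tokens; special tokens get the first ids."""
    counts = Counter(tok for sent in sentences for tok in sent)
    keep = [word for word, _ in counts.most_common(max_size)]
    return {word: idx for idx, word in enumerate(list(specials) + keep)}


def to_ids(sentence, vocab):
    """Map tokens to ids, sending out-of-vocabulary tokens to <unk>."""
    return [vocab.get(tok, vocab["<unk>"]) for tok in sentence]


# Toy corpus; the experiments in the slides use max_size = 50,000 per language.
corpus = [["i", "am", "fine"], ["i", "am", "happy"], ["you", "are", "fine"]]
vocab = build_vocab(corpus, max_size=3)
ids = to_ids(["you", "are", "fine"], vocab)
print(vocab, ids)
```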

Results
[Result tables] 4.55 BLEU points ↑

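Conclusions
• A method that considers both the reference translation and the BoW is proposed
• The proposed method outperforms a strong baseline
• How to apply it to morphologically rich languages* and low-resource languages is left as future work
* Languages in which grammatical relations are determined by word inflection rather than by relative position or particles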
