
Sequence-to-Dependency Neural Machine Translation

Mamoru Komachi
September 19, 2017


Wu et al. Sequence-to-Dependency Neural Machine Translation (ACL 2017).

Presented by Mamoru Komachi at Tokyo Metropolitan University.



Transcript

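  1. Sequence-to-Dependency Neural Machine Translation
    Shuangzhi Wu, Dongdong Zhang, Nan Yang, Mu Li, Ming Zhou. ACL 2017.
    Note: the figures and tables in these slides are taken from the paper.
    Mamoru Komachi <[email protected]>
    ACL 2017 reading group @ Tokyo Institute of Technology, Suzukakedai Campus, 2017/09/19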
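  2. Exploiting dependency structure with sequence-to-dependency NMT
    - The encoder is an ordinary bidirectional RNN with attention.
    - The decoder is a model that generates the target words and parses the target dependency tree at the same time.
      - One RNN generates words (this part is standard).
      - A second RNN performs dependency parsing with the arc-standard shift-reduce algorithm.
    - On Chinese-English and Japanese-English translation, the model gives statistically significant improvements over the NMT and SMT baselines.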
  3. Standard NMT as of 2017 (Sutskever et al., 2014; Bahdanau et al., 2015)
    - Input: X = x_1, ..., x_m
    - Output: Y = y_1, ..., y_n
    - Hidden vectors of the input: H = h_1, ..., h_m
    - Decoder hidden state: s_j
    - Context: c_j
      - Eq. (3)-(5) are the attention computation.
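The transcript does not reproduce Eq. (3)-(5), so the following is only a minimal sketch of an attention step, assuming the additive attention of Bahdanau et al. (2015): a score e_i = v · tanh(W s_{j-1} + U h_i) for each encoder state, a softmax over the scores, and a context vector c_j as the weighted sum of the h_i. The parameter names `W`, `U`, `v` and the function name are illustrative assumptions, not taken from the slides.

```python
import math


def additive_attention(s_prev, H, W, U, v):
    """One attention step (assumed Bahdanau-style additive attention).

    s_prev: previous decoder state s_{j-1}, list of floats (size d_s)
    H:      encoder states h_1..h_m, list of lists (each of size d_h)
    W:      d_a x d_s matrix, U: d_a x d_h matrix, v: vector of size d_a
    Returns (alphas, context): attention weights and context vector c_j.
    """
    def matvec(M, x):
        return [sum(m * xi for m, xi in zip(row, x)) for row in M]

    Ws = matvec(W, s_prev)
    scores = []
    for h in H:
        Uh = matvec(U, h)
        scores.append(sum(vk * math.tanh(a + b) for vk, a, b in zip(v, Ws, Uh)))

    # Numerically stable softmax over the source positions.
    top = max(scores)
    exps = [math.exp(e - top) for e in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]

    # Context vector: attention-weighted sum of the encoder states.
    context = [sum(a * h[k] for a, h in zip(alphas, H)) for k in range(len(H[0]))]
    return alphas, context
```

With all-zero parameters every source position gets the same score, so the weights are uniform and the context is the plain average of the encoder states, which is a quick sanity check for the implementation.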
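  4. Seq2dep NMT represents the dependency tree T as a sequence of shift-reduce actions
    - A = a_1, ..., a_l, where l = 2n (n is the length of the target sentence Y). Note: this a is unrelated to the a used for attention.
    - a_j ∈ {SH, RR(d), LR(d)}. Note: the dependency structure here is labeled, with d the dependency label.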
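To make the action-sequence encoding concrete, here is a small sketch that replays a sequence of SH / LR(d) / RR(d) actions and recovers the arcs of the tree. It assumes the usual arc-standard convention (LR(d) attaches the second-from-top stack item as a dependent of the top item, RR(d) attaches the top item as a dependent of the second), and it seeds the stack with a ROOT symbol so that n words take exactly 2n actions. Both choices, the function name, and the example sentence are assumptions for illustration; the slides do not spell them out.

```python
def actions_to_arcs(words, actions):
    """Replay arc-standard actions; return (arcs, stack, buffer).

    actions: "SH" or a tuple ("LR", label) / ("RR", label).
    arcs:    list of (head, label, dependent) triples.
    """
    stack = ["ROOT"]        # assumption: stack starts with an artificial root
    buffer = list(words)
    arcs = []
    for act in actions:
        if act == "SH":
            # Shift: move the next word onto the stack.
            stack.append(buffer.pop(0))
            continue
        kind, label = act
        top = stack.pop()
        second = stack.pop()
        if kind == "LR":
            # Left-reduce: the second item becomes a dependent of the top item.
            arcs.append((top, label, second))
            stack.append(top)
        else:
            # Right-reduce: the top item becomes a dependent of the second item.
            arcs.append((second, label, top))
            stack.append(second)
    return arcs, stack, buffer


words = ["she", "reads", "books"]
actions = ["SH", "SH", ("LR", "nsubj"), "SH", ("RR", "dobj"), ("RR", "root")]
arcs, stack, buffer = actions_to_arcs(words, actions)
```

For this three-word sentence the sequence has six actions (three shifts and three reduces), which matches l = 2n under the ROOT-seeded assumption.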
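  5. Decoding uses the word RNN and the dependency-parsing RNN together
    - A word is generated by the word RNN only when the action is shift.
    - Decoding ends when the word RNN has output EOS and every word on the stack has been reduced.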
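The control flow described on this slide can be sketched as a single loop. In the sketch below the two RNNs are replaced by hypothetical callables `next_action` and `next_word` (here driven by fixed scripts), the stack is seeded with ROOT, and an EOS produced on a shift is not pushed onto the stack. These are all assumptions made to keep the example runnable; the real model scores actions and words with neural networks and searches with a beam.

```python
def joint_decode(next_action, next_word, max_steps=100):
    """Interleave action prediction and word generation (illustrative only)."""
    stack = ["ROOT"]            # assumption: ROOT-seeded stack
    words, arcs = [], []
    saw_eos = False
    for _ in range(max_steps):
        # Stop once EOS has been generated and only ROOT is left on the stack.
        if saw_eos and len(stack) == 1:
            break
        action = next_action(stack, words)
        if action == "SH":
            # Only a shift triggers the word generator.
            word = next_word(words)
            if word == "<eos>":
                saw_eos = True
            else:
                words.append(word)
                stack.append(word)
        else:
            kind, label = action
            if len(stack) < 2:
                raise ValueError("a reduce needs two items on the stack")
            top = stack.pop()
            second = stack.pop()
            if kind == "LR":
                arcs.append((top, label, second))
                stack.append(top)
            else:
                arcs.append((second, label, top))
                stack.append(second)
    return words, arcs


# Scripted stand-ins for the action RNN and the word RNN.
action_script = iter(["SH", "SH", ("LR", "nsubj"), "SH", ("RR", "dobj"),
                      "SH", ("RR", "root")])
word_script = iter(["she", "reads", "books", "<eos>"])
words, arcs = joint_decode(lambda stack, words: next(action_script),
                           lambda words: next(word_script))
```

Note that the loop keeps running after EOS until the remaining reduce has emptied the stack down to ROOT, which mirrors the two-part stopping condition on the slide.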
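  6. Experiments on Chinese-English and Japanese-English translation
    - Data
      - Chinese-English: 2 million sentence pairs from LDC for training; NIST2003 for development; NIST2005, NIST2006, NIST2008, and NIST2012 for testing.
      - Japanese-English: 1 million sentence pairs from ASPEC for training; 1,790 sentence pairs for development; 1,812 sentence pairs for testing.
    - Tools
      - Dependency structures are produced by arc-eager dependency parsing (Zhang and Nivre, 2011).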
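  7. Experimental settings for SD-NMT
    - Vocabulary size: 30,000 words (both sides)
    - Unknown words: unk replacement and post-processing (Luong et al., 2015)
    - Word embeddings and action embeddings: 512 dimensions
    - RNN hidden state: 1024 dimensions
    - Initialization: normal distribution (Glorot and Bengio, 2010)
    - Optimization: SGD (learning rate = 1.0) and Adadelta
    - Batch size: 96
    - Beam size: 12 (for both word prediction and action prediction)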
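As a compact reference, the settings above can be collected into a configuration dictionary, together with a sketch of a Glorot-style normal initializer. The key names are hypothetical, and the standard deviation sqrt(2 / (fan_in + fan_out)) is the common "Glorot normal" formula; the slides only say "normal distribution (Glorot and Bengio, 2010)", so the exact variant used by the authors is an assumption here.

```python
import math
import random

# Hyperparameters as listed on the slide (key names are illustrative).
SD_NMT_CONFIG = {
    "vocab_size": 30000,      # both source and target
    "embedding_dim": 512,     # word and action embeddings
    "hidden_dim": 1024,       # RNN hidden state
    "batch_size": 96,
    "beam_size": 12,          # word and action prediction
}


def glorot_normal(fan_in, fan_out, rng):
    """Sample a fan_in x fan_out matrix from N(0, 2 / (fan_in + fan_out))."""
    std = math.sqrt(2.0 / (fan_in + fan_out))
    return [[rng.gauss(0.0, std) for _ in range(fan_out)] for _ in range(fan_in)]


rng = random.Random(0)
W_small = glorot_normal(4, 8, rng)   # tiny matrix, just to show the shape
```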
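  8. Both baselines are models implemented by the authors
    - SMT
      - Hierarchical phrase-based model (Chiang, 2005)
      - 4-gram language model (Kneser-Ney smoothing) trained on English Gigaword and the target side of the corpus
    - NMT
      - RNNsearch (Bahdanau et al., 2015)
      - Same parameters as SD-NMT
    - Evaluation
      - Chinese-English: BLEU-4 (significance tested with bootstrap resampling)
      - Japanese-English: BLEU and RIBES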
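The significance test mentioned under evaluation can be illustrated with a paired bootstrap over the test set. The sketch below is an assumption about the general procedure, not the authors' script: it resamples sentence indices with replacement and counts how often system A beats system B. For simplicity it sums hypothetical per-sentence scores, whereas a real BLEU comparison would recompute corpus-level BLEU from n-gram statistics on each resample.

```python
import random


def paired_bootstrap(scores_a, scores_b, n_samples=1000, seed=0):
    """Fraction of resampled test sets on which system A outscores system B.

    scores_a, scores_b: per-sentence scores for the same test sentences
    (a stand-in for the statistics needed to recompute a corpus metric).
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both systems must be scored on the same sentences")
    rng = random.Random(seed)
    n = len(scores_a)
    wins = 0
    for _ in range(n_samples):
        # Draw a test set of the same size, with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        if sum(scores_a[i] for i in idx) > sum(scores_b[i] for i in idx):
            wins += 1
    return wins / n_samples
```

If A wins on at least 95% of the resampled sets, the difference is usually reported as significant at p < 0.05.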
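  9. SD-NMT outperforms both the SMT and NMT baselines
    - SD-NMT\K is the model that does not take the target bigram dependency into account.
    - Bold figures are statistically significant compared with the baselines (p < 0.05).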
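  10. Target-side syntactic structure is hard to handle in machine translation
    - In SMT, string-to-tree models (Liu et al., 2006) and a language model that uses dependency structure on the target side (Shen et al., 2008) had been proposed.
      → In SMT, however, tree-to-string approaches generally performed better.
    - In NMT, the tree-to-sequence attentional NMT model (Eriguchi et al., 2016) had been proposed.
      → Syntactic information is harder to incorporate on the target side than on the source side.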
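  11. Summary and future work
    Summary
    - A string-to-dependency NMT model that performs word generation and arc-standard dependency parsing jointly.
    Future work
    - Integrate other kinds of prior knowledge (semantics) into NMT.
    - Apply the approach to other seq2seq tasks (document summarization).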
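  12. Impressions
    - The method is simple, but it is a nice string-to-dependency NMT model that takes dependency parsing into account.
    - Do the word-generation RNN and the dependency-parsing RNN really cooperate properly?
      - Whether the two can be integrated may depend on the characteristics of the dependency parser.
      - How to ensemble such models may not actually be obvious.
      - The output side does not look very robust.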
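  13. Q&A (1)
    - Q: What do the bigram embeddings correspond to?
      A: Despite the name, they are not bigrams in the usual word N-gram sense; they are bigrams taken from the dependency structure. In effect the model is looking at word co-occurrence.
      In the experimental results the model already improves over RNNsearch without the bigram embeddings, but the gain from the bigram embeddings is larger than the gain from SD-NMT itself.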
  14. References (1)
    - Wu et al. Sequence-to-Dependency Neural Machine Translation. ACL 2017.
    - Eriguchi et al. Tree-to-Sequence Attentional Neural Machine Translation. ACL 2016.
    - Joakim Nivre. Incrementality in Deterministic Dependency Parsing. Workshop on Incremental Parsing: Bringing Engineering and Cognition Together. 2004.
    - Shen et al. A New String-to-Dependency Machine Translation Algorithm with a Target Dependency Language Model. ACL 2008.
  15. References (2)
    - Luong et al. Addressing the Rare Word Problem in Neural Machine Translation. ACL 2015.
    - Zhang and Nivre. Transition-based Dependency Parsing with Rich Non-local Features. ACL 2011.
    - David Chiang. A Hierarchical Phrase-based Model for Statistical Machine Translation. ACL 2005.
    - Fabien Cromieres. Kyoto-NMT: A Neural Machine Translation Implementation in Chainer. COLING 2016.