Visualizing and Understanding Neural Machine Translation

Mamoru Komachi
September 15, 2017

Presentation slides for introducing the paper "Visualizing and Understanding Neural Machine Translation" by Ding et al., ACL 2017.

The slides were presented at https://sites.google.com/view/snlp-jp/home/2017 by Mamoru Komachi, Tokyo Metropolitan University

Transcript

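  1. Visualizing and Understanding Neural Machine Translation
    Yanzhuo Ding, Yang Liu, Huanbo Luan, Maosong Sun. ACL 2017.
    Note: the figures and tables in these slides are taken from the paper.
    Mamoru Komachi <[email protected]>
    The 9th SNLP (study group on state-of-the-art NLP) @ Recruit MTL Cafe, 2017/09/15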
  2. Neural machine translation has several problems of its own
    - Handling of unknown words
      → the largest UNK in the world
    - Under-translation and over-translation
      → in the history of the history of the history of the …
    - Output of completely unrelated words
    Sato et al., Japanese-English Machine Translation of Recipe Texts. WAT 2016.
  3. Analyzing errors specific to NMT is not easy
    Source: 小型 甲殻 類 で は , アミ 類 の アカイソアミ , ワレカラ 類 の ニッポンワレカラ と ツガルワレカラ は 茨城 県 で 初めて 確認 さ れ た 。
    NMT output: in small crustaceans , <unk> and <unk> of <unk> and <unk> were confirmed for the first time in Ibaraki Prefecture .
    Reference: among the small-type Crustacea , Paracanthomysis hispida of Mysidae , and Caprella japonica and C. tsugarensis of Caprellidae were confirmed for the first time in Ibaraki Prefecture .
    Matsumura and Komachi. Tokyo Metropolitan University Neural Machine Translation System for WAT 2017. WAT 2017.
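  4. This work in three lines
    - Proposes a method for visualizing and interpreting NMT using layer-wise relevance propagation (LRP; Bach et al., 2015)
    - Adapts LRP to the attention-based encoder-decoder framework (Bahdanau et al., 2015)
    - Runs a case study on Chinese-English translation and analyzes NMT translation errors (→ easier to interpret and debug than using attention alone)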
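  5. Standard NMT as of 2017
    $P(\mathbf{y} \mid \mathbf{x}; \boldsymbol{\theta}) = \prod_{j=1}^{J} P(y_j \mid \mathbf{x}, \mathbf{y}_{<j}; \boldsymbol{\theta})$
    - $P(y_j \mid \mathbf{x}, \mathbf{y}_{<j}; \boldsymbol{\theta}) = \rho(y_{j-1}, s_j, c_j)$
    - $s_j = g(s_{j-1}, y_j, c_j)$
    - $c_j = \sum_{i=1}^{I+1} \alpha_{j,i} h_i$
      - $h_i = [\overrightarrow{h}_i; \overleftarrow{h}_i]$
      - $\overrightarrow{h}_i = f(\overrightarrow{h}_{i-1}, x_i)$
      - $\overleftarrow{h}_i = f(\overleftarrow{h}_{i+1}, x_i)$
    x: input (I words); y: output (J words); f, g, ρ: nonlinear functions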
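The context-vector and factorization equations on this slide can be illustrated with a minimal sketch in plain Python. It is not the GroundHog implementation. The toy hidden states, the raw attention scores, and the function names are hypothetical, and the softmax normalization of the attention weights is assumed from Bahdanau et al. (2015) rather than stated on the slide.

```python
import math

def softmax(scores):
    """Normalize raw attention scores into weights that sum to 1 (assumed normalization)."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def context_vector(alphas, hidden_states):
    """c_j = sum_i alpha_{j,i} * h_i, computed one dimension at a time."""
    dim = len(hidden_states[0])
    return [sum(a * h[d] for a, h in zip(alphas, hidden_states)) for d in range(dim)]

def sentence_log_prob(step_probs):
    """log P(y|x) = sum_j log P(y_j | x, y_<j), the log of the product over target positions."""
    return sum(math.log(p) for p in step_probs)

# Toy values (hypothetical): three source hidden states h_i of dimension 2.
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
alphas = softmax([0.0, 0.0, 0.0])  # equal scores give uniform attention weights
c_j = context_vector(alphas, hidden_states)
log_p = sentence_log_prob([0.5, 0.5])  # two target words, each with probability 0.5
```

With uniform weights, `c_j` is simply the average of the hidden states, which makes the weighted-sum structure easy to check by hand.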
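  6. Problem setting for visualizing and interpreting neural networks
    - Compute how much each input-layer unit contributes to the final output layer (Bach et al., 2015; Li et al., 2016)
    - For the attention-based encoder-decoder, what we want to know is how much the source-language and target-language words contribute to each of the following states:
      1. $\overrightarrow{h}_i = f(\overrightarrow{h}_{i-1}, x_i)$: forward hidden state of the source language
      2. $\overleftarrow{h}_i = f(\overleftarrow{h}_{i+1}, x_i)$: backward hidden state of the source language
      3. $h_i = [\overrightarrow{h}_i; \overleftarrow{h}_i]$: hidden state of the source language
      4. $c_j = \sum_{i=1}^{I+1} \alpha_{j,i} h_i$: context vector of the source language
      5. $s_j = g(s_{j-1}, y_j, c_j)$: hidden state of the target language
      6. $y_j$: word embedding of the target language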
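  7. Computing relevance in a simple feed-forward network
    - How W is computed depends on the operation involved (matrix multiplication, element-wise multiplication, maximum, etc.) (Bach et al., 2015)
    - Computable in $O(|G| \times |V| \times O_{\max})$
      - $O_{\max}$ is the maximum degree of a neuron in the network
      - Relevance is computed for neurons across the whole network, so it is heavier than computing attention, but it can be sped up with parallel computation and caching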
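As a rough illustration of how relevance is passed back through one matrix-multiplication step, here is a minimal sketch in plain Python. It assumes the basic proportional redistribution rule of layer-wise relevance propagation (each input receives relevance in proportion to its contribution `W[i][k] * u[i]` to output `k`). It covers only the matrix-product case, not the element-wise product or maximum cases mentioned on the slide. The function names, the toy numbers, and the small stabilizer `eps` are assumptions for illustration and are not taken from the paper's code.

```python
def linear_forward(W, u):
    """v_k = sum_i W[i][k] * u[i] for a weight matrix W of shape (inputs, outputs)."""
    n_out = len(W[0])
    return [sum(W[i][k] * u[i] for i in range(len(u))) for k in range(n_out)]

def weight_ratios(W, u, eps=1e-9):
    """ratio[i][k]: share of output k's value contributed by input i.

    eps is a hypothetical stabilizer that keeps the denominator away from zero.
    """
    v = linear_forward(W, u)
    ratios = []
    for i in range(len(u)):
        row = []
        for k in range(len(v)):
            denom = v[k] + (eps if v[k] >= 0 else -eps)
            row.append(W[i][k] * u[i] / denom)
        ratios.append(row)
    return ratios

def propagate_relevance(W, u, relevance_out):
    """Redistribute the relevance of each output onto the inputs, one layer back."""
    ratios = weight_ratios(W, u)
    return [
        sum(ratios[i][k] * relevance_out[k] for k in range(len(relevance_out)))
        for i in range(len(u))
    ]

# Toy example (hypothetical numbers): 2 inputs, 2 outputs.
W = [[1.0, 2.0],
     [3.0, 1.0]]
u = [1.0, 2.0]
v = linear_forward(W, u)                      # [7.0, 4.0]
r_in = propagate_relevance(W, u, [0.7, 0.3])  # relevance assigned to each input
```

The total relevance is (approximately) conserved across the layer: the input relevances sum to the same value as the output relevances, up to the effect of `eps`.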
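  8. Visualization experiments on Chinese-English translation
    - Data
      - Training: parallel corpus of 1.25 million sentences
      - Development: NIST 2003 (model selection)
      - Test: NIST 2004 (visualization)
    - Tool
      - GroundHog (Bahdanau et al., 2015)
        → BLEU score on the development data: 32.73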
  9. From here on: error analysis of the problems specific to neural machine translation (repeated)
    - Handling of unknown words
      → the largest UNK in the world
    - Under-translation and over-translation
      → in the history of the history of the history of the …
    - Output of completely unrelated words
    Sato et al., Japanese-English Machine Translation of Recipe Texts. WAT 2016.
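  10. Research on visualizing deep learning has only just begun
    - In image recognition, there is active research on computing how much the information in the input layer relates to the output layer (Bach et al., 2015; Li et al., 2016; …)
      - Difference from Bach et al. (2015): the input to NMT is a word vector rather than a single pixel
        → compute vector-level relevance and weights
      - Difference from Li et al. (2016): relevance is used instead of partial derivatives
        → the activation function does not need to be differentiable or smooth
    - Attention only shows the relation between source and target words, whereas relevance can be used to compute the degree of relation between arbitrary hidden layers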
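  11. Summary and future work: visualizing and interpreting NMT
    - Proposes a method for visualizing and interpreting NMT using layer-wise relevance propagation
      - Can compute the relevance between arbitrary hidden layers and the context
      - Enables deeper analysis than the attention mechanism
    - Future work
      - Effectiveness on other NMT models and other language pairs
      - Models that explicitly take translation relevance into account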
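  12. Impressions
    - I had thought that looking at attention was about the only easy way to analyze NMT, so this looks useful.
      - Being able to form a hypothesis about the cause of NMT-specific errors is a big deal
      - The computation itself looks fairly heavy (the paper says caching helps, but perhaps it only runs because the model is a simple one-layer NMT?)
    - The target-language context is clearly important, but is the end-of-sentence token really a cause of translation errors? → Might other problems simply be surfacing at the end-of-sentence token?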
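  13. Q&A (1)
    - Q: I understand that visualization is possible, but does the paper propose a way to actually debug NMT?
    - A: It does not go as far as proposing a debugging method; it points out the importance of taking something like the context gates of Tu et al. (2017) into account. In the future I would like to use this method to identify the causes of errors and improve the systems.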
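  14. Q&A (2)
    - Q: I understand that there are examples where this kind of analysis works, but quantitatively, in how many cases is such analysis actually possible?
    - A: We examined the distribution of NMT and PBSMT errors in, for example, Sato et al. (2016), but I do not know how many of each kind this method can visualize cleanly. I would like to implement it and find out.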
  15. References
    - Ding et al. Visualizing and Understanding Neural Machine Translation. ACL 2017.
    - Bach et al. On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation. PLoS ONE 2015.
    - Li et al. Visualizing and understanding neural models in NLP. NAACL 2016.
    - Tu et al. Context gates for neural machine translation. ACL 2017.
  16. Related work (NMT at Tokyo Metropolitan University)
    - Matsumura et al. English-Japanese Neural Machine Translation with Encoder-Decoder-Reconstructor. arXiv 2017.
    - Sato et al. Japanese-English Machine Translation of Recipe Texts. WAT 2016.
    - Yamagishi et al. Improving Japanese-to-English Neural Machine Translation by Voice Prediction. IJCNLP 2017.
    - Matsumura and Komachi. Tokyo Metropolitan University Neural Machine Translation System for WAT 2017. WAT 2017.