20240226_AAMT-Japio

Hiroyuki Deguchi

February 26, 2024

Transcript

  2. (Lee+, ACL2021), (Fernandes+, NAACL2022)

    Lee+, ACL2021, "Discriminative Reranking for Neural Machine Translation". Fernandes+, NAACL2022, "Quality-Aware Decoding for Neural Machine Translation".
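  3. $k$-nearest-neighbor machine translation ($k$NN-MT; Khandelwal+, ICLR2021)

    $P = \frac{1-\lambda}{7}\left(p_{\mathrm{MT}_1} + \cdots + p_{\mathrm{MT}_7}\right) + \lambda\, p_{k\mathrm{NN}}$, with $k = 64$, $\lambda = 0.1$, $\tau = 100$. Khandelwal+, ICLR2021, "Nearest Neighbor Machine Translation".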
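The interpolation on this slide can be sketched in a few lines. This is a minimal illustration, assuming each model and the $k$NN component already yield a probability vector over the same vocabulary; the function name `interpolate` and the toy numbers are hypothetical, not taken from the deck.

```python
def interpolate(mt_probs, knn_probs, lam=0.1):
    """Mix an MT ensemble with a kNN distribution.

    Computes (1 - lam) * mean(mt_probs) + lam * knn_probs, entry by entry.
    mt_probs: list of per-model probability vectors (same vocabulary order).
    knn_probs: one probability vector from the kNN component.
    """
    n_models = len(mt_probs)
    mixed = []
    for v in range(len(knn_probs)):
        # Uniform average over the ensemble members (7 on the slide).
        ensemble = sum(p[v] for p in mt_probs) / n_models
        mixed.append((1.0 - lam) * ensemble + lam * knn_probs[v])
    return mixed


# Toy example (made-up numbers): 7 identical models, vocabulary of 3 tokens.
mt_probs = [[0.7, 0.2, 0.1]] * 7
knn_probs = [0.0, 1.0, 0.0]
mixed = interpolate(mt_probs, knn_probs, lam=0.1)
```

Because both inputs are probability distributions and the weights sum to one, the mixture is again a distribution.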
  4. $k$NN-MT (Khandelwal+, ICLR2021); (Deguchi+, ACL2023)

    Khandelwal+, ICLR2021, "Nearest Neighbor Machine Translation". Deguchi+, ACL2023, "Subset Retrieval Nearest Neighbor Machine Translation".
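  5. $k$NN-MT datastore: keys $\in \mathbb{R}^D$, values $\in \mathcal{V}_Y$

    Key $f(\boldsymbol{x}, \boldsymbol{y}_{<t}) \in \mathbb{R}^D$, value $y_t \in \mathcal{V}_Y$; datastore $\mathcal{M} \subseteq \mathbb{R}^D \times \mathcal{V}_Y$, for source $\boldsymbol{x}$ and target $\boldsymbol{y}$.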
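  6. $k$NN-MT: query $\boldsymbol{q} \in \mathbb{R}^D$; $k$ nearest neighbors of $\boldsymbol{q}$

    $p_{k\mathrm{NN}}(y_t \mid \boldsymbol{x}, \boldsymbol{y}_{<t}) \propto \sum_{i=1}^{k} \mathbb{1}_{y_t = v_i} \exp\left(\frac{-\lVert \boldsymbol{q} - \boldsymbol{k}_i \rVert_2^2}{\tau}\right)$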
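The $k$NN distribution above can be sketched as follows. This is a brute-force illustration under stated assumptions: the datastore is a small in-memory list of (key, value) pairs, every stored value belongs to `vocab`, and the names `knn_probs` and `datastore` are hypothetical. Practical systems would use an approximate nearest-neighbor index rather than a full scan.

```python
import math


def knn_probs(query, datastore, k, tau, vocab):
    """p_kNN(y) proportional to sum_i 1[y == v_i] * exp(-||q - k_i||^2 / tau)."""
    # Squared L2 distance from the query to every key (brute force).
    scored = []
    for key, value in datastore:
        d2 = sum((q - c) ** 2 for q, c in zip(query, key))
        scored.append((d2, value))
    scored.sort(key=lambda pair: pair[0])
    neighbors = scored[:k]  # the k nearest entries

    # Accumulate the kernel weight of each neighbor on its value token.
    weights = {tok: 0.0 for tok in vocab}
    for d2, value in neighbors:
        weights[value] += math.exp(-d2 / tau)

    total = sum(weights.values())
    return {tok: w / total for tok, w in weights.items()}


# Toy datastore (made-up): 2-dimensional keys, three target tokens.
datastore = [((0.0, 0.0), "a"), ((1.0, 0.0), "b"),
             ((0.0, 1.0), "a"), ((5.0, 5.0), "c")]
probs = knn_probs((0.0, 0.0), datastore, k=3, tau=1.0, vocab=["a", "b", "c"])
```

Tokens that do not appear among the $k$ neighbors receive probability zero, which is why the result is interpolated with the MT model's distribution rather than used alone.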
  7. (Lee+, ACL2021), (Fernandes+, NAACL2022)

    Lee+, ACL2021, "Discriminative Reranking for Neural Machine Translation". Fernandes+, NAACL2022, "Quality-Aware Decoding for Neural Machine Translation".
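  8. (Lee+, ACL2021) $\mathcal{L}(\theta) = -\sum_{j=1}^{n} p_T(u_j) \log p_M(u_j \mid x; \theta)$

    $\mu(\cdot,\cdot) \in [0, 1]$; $p_T(u_i) \propto \exp\left(\mu(u_i, r) / T\right)$; $p_M(u_i \mid x; \theta) \propto \exp\left(o_i(u_i \mid x; \theta)\right)$. Given $r$ and $u_i$, the score is $\mu(u_i, r)$; the loss fits $p_M$ to $p_T$. Lee+, ACL2021, "Discriminative Reranking for Neural Machine Translation".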
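The loss on this slide is a cross-entropy between two softmax distributions over the same $n$ candidates. A minimal sketch, assuming the metric scores $\mu(u_i, r)$ and the reranker outputs are already available as plain lists; the function names and toy scores are hypothetical, and the default temperature of 0.5 is the value given on the following slide.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def reranking_loss(metric_scores, model_scores, temperature=0.5):
    """L = -sum_j p_T(u_j) * log p_M(u_j | x).

    metric_scores: mu(u_j, r) in [0, 1] for each candidate u_j.
    model_scores:  unnormalized reranker outputs for the same candidates.
    """
    p_t = softmax([mu / temperature for mu in metric_scores])  # target
    p_m = softmax(model_scores)                                # model
    return -sum(t * math.log(m) for t, m in zip(p_t, p_m))


# Toy candidates (made-up scores): the loss is lower when the reranker
# orders the candidates the same way the metric does.
metric = [0.9, 0.5, 0.1]
aligned = reranking_loss(metric, [1.8, 1.0, 0.2])
misaligned = reranking_loss(metric, [0.2, 1.0, 1.8])
```

A lower temperature sharpens $p_T$ toward the best-scoring candidate, so the reranker is pushed harder to put its mass on that candidate.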
  9. (Lee+, ACL2021) $T = 0.5$

    $p_T(u_i) \propto \exp\left(\mu(u_i, r) / T\right)$; $\beta_1 = 0.9$, $\beta_2 = 0.98$. Lee+, ACL2021, "Discriminative Reranking for Neural Machine Translation".
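  10. (Fernandes+, NAACL2022) $y^*_{\mathrm{MAP}} = \operatorname{argmax}_{y \in \mathcal{Y}} \log p_\theta(y \mid x)$

    $y^*_{\mathrm{MBR}} = \operatorname{argmax}_{h \in \mathcal{H}} \mathbb{E}_{\hat{y} \sim P(y \mid x)}\left[u(h, \hat{y})\right] \approx \operatorname{argmax}_{h \in \mathcal{H}} \frac{1}{N} \sum_{i=1}^{N} u(h, \hat{y}_i)$, with utility function $u(\cdot,\cdot)$. Fernandes+, NAACL2022, "Quality-Aware Decoding for Neural Machine Translation".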
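  11. MBR decoding (Goel & Byrne, CS&L 2000; Kumar & Byrne, NAACL2004)

    $y^*_{\mathrm{MBR}} := \operatorname{argmax}_{h \in \mathcal{H}} \mathbb{E}_{\hat{y} \sim P(y \mid x)}\left[u(h, \hat{y})\right]$, where $\mathcal{H} \subset \mathcal{Y}$, $u: \mathcal{Y} \times \mathcal{Y} \to \mathbb{R}$, and $P(y \mid x)$ is the probability of $y$ given $x$. With a sample set $\hat{\mathcal{Y}} \subset \mathcal{Y}$: $y^*_{\mathrm{MBR}} \approx \operatorname{argmax}_{h \in \mathcal{H}} \mathbb{E}_{\hat{y} \in \hat{\mathcal{Y}}}\left[u(h, \hat{y})\right]$. Setting $\hat{\mathcal{Y}} := \mathcal{H}$ and $N := |\mathcal{H}|$ gives $\mathcal{O}(N^2)$ utility evaluations. Goel & Byrne, CS&L Vol. 14, 2000, "Minimum Bayes-risk automatic speech recognition". Kumar & Byrne, NAACL2004, "Minimum Bayes-Risk Decoding for Statistical Machine Translation".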
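The sample-based approximation with $\hat{\mathcal{Y}} := \mathcal{H}$ can be sketched as a double loop, which makes the $\mathcal{O}(N^2)$ cost visible. This is an illustration only: `unigram_f1` is a hypothetical stand-in utility chosen so the example runs without external libraries, and the toy hypotheses are made up.

```python
def unigram_f1(hyp, ref):
    """Toy utility u(h, y): F1 over the sets of whitespace tokens."""
    h, r = set(hyp.split()), set(ref.split())
    overlap = len(h & r)
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(h), overlap / len(r)
    return 2 * precision * recall / (precision + recall)


def mbr_decode(hypotheses, utility):
    """Return the hypothesis with the highest average utility.

    The hypothesis set doubles as the pseudo-reference set, so the
    utility is evaluated N * N times for N hypotheses.
    """
    best, best_score = None, float("-inf")
    for h in hypotheses:
        expected = sum(utility(h, y) for y in hypotheses) / len(hypotheses)
        if expected > best_score:
            best, best_score = h, expected
    return best, best_score


# Toy N-best list (made-up).
hypotheses = ["the cat sat", "the cat sat down", "the cat slept", "a dog ran"]
best, best_score = mbr_decode(hypotheses, unigram_f1)
```

The selected hypothesis is the one most similar to the rest of the list on average, which need not be the one the model scored highest.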
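  12. (Fernandes+, NAACL2022) $f: \mathcal{X} \cup \mathcal{Y} \to \mathbb{R}^D$

    $f$ maps $x \in \mathcal{X}$, $h \in \mathcal{Y}$, and $\hat{y} \in \mathcal{Y}$ to $D$-dimensional vectors; $s: \mathbb{R}^D \times \mathbb{R}^D \times \mathbb{R}^D \to \mathbb{R}$. $y^*_{\mathrm{COMET\_MBR}} = \operatorname{argmax}_{h \in \mathcal{H}} \mathbb{E}_{\hat{y} \in \hat{\mathcal{Y}}}\left[s(f(x), f(h), f(\hat{y}))\right]$ (Fernandes+, NAACL2022), which requires $\mathcal{O}(N^2)$ score computations. Fernandes+, NAACL2022, "Quality-Aware Decoding for Neural Machine Translation".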
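The structure of this embedding-based objective can be sketched as below: each sentence is encoded once with $f$, and the scorer $s$ is then applied to every (hypothesis, pseudo-reference) pair. Both `embed` and `score` are toy, hypothetical stand-ins so that the example runs on its own; they are not COMET and say nothing about its actual architecture or API.

```python
import math


def embed(text, dim=8):
    """Toy stand-in for f: hash character codes into a count vector."""
    vec = [0.0] * dim
    for ch in text:
        vec[ord(ch) % dim] += 1.0
    return vec


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def score(src_vec, hyp_vec, ref_vec):
    """Toy stand-in for s(f(x), f(h), f(y)); a learned metric goes here."""
    return 0.5 * cosine(hyp_vec, ref_vec) + 0.5 * cosine(hyp_vec, src_vec)


def embedding_mbr(source, hypotheses):
    """MBR over cached embeddings: N + 1 encoder calls, N * N scorer calls."""
    src_vec = embed(source)
    hyp_vecs = [embed(h) for h in hypotheses]  # each sentence encoded once
    n = len(hypotheses)
    expected = []
    for h_vec in hyp_vecs:
        total = sum(score(src_vec, h_vec, y_vec) for y_vec in hyp_vecs)
        expected.append(total / n)
    best = max(range(n), key=lambda i: expected[i])
    return hypotheses[best], expected


# Toy input (made-up sentences).
hyps = ["the cat sat", "the cat sat down", "a dog ran"]
choice, expected = embedding_mbr("the cat sat on the mat", hyps)
```

Caching the embeddings keeps the encoder cost linear in $N$, but the pairwise scoring loop remains quadratic.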
  13. (Lee+, ACL2021), (Fernandes+, NAACL2022)

    Lee+, ACL2021, "Discriminative Reranking for Neural Machine Translation". Fernandes+, NAACL2022, "Quality-Aware Decoding for Neural Machine Translation".
  14. +, NLP2024, `` ’’.

    Deguchi+, arXiv, "Centroid-Based Efficient Minimum Bayes Risk Decoding". https://arxiv.org/abs/2402.11197