Paper Reading: Sampling-Based Approximations to Minimum Bayes Risk Decoding for Neural Machine Translation

Hiroyuki Deguchi

February 15, 2023

Transcript

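  1. MAP decoding: $\mathbf{y}^{\mathrm{MAP}} = \operatorname{argmax}_{\mathbf{h} \in \mathcal{Y}} \log p(\mathbf{h} \mid \mathbf{x}, \theta)$, over the output space $\mathcal{Y}$.

    MBR decoding: $\mathbf{y}^{\mathrm{MBR}} = \operatorname{argmax}_{\mathbf{h} \in \mathcal{Y}} \mathbb{E}[u(\mathbf{y}^{*}, \mathbf{h}) \mid \mathbf{x}, \theta] = \operatorname{argmax}_{\mathbf{h} \in \mathcal{Y}} \mu_u(\mathbf{h}; \mathbf{x}, \theta)$, with utility $u$, $\mathbf{h} \in \mathcal{Y}$, and $\mathbf{y}^{*} \in \mathcal{Y}$.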
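For reference, the expectation in the MBR objective can be written out as a sum over the output space. This expansion follows from the definition of expectation under the model distribution. It is not recovered from the slide text.

```latex
\mu_u(\mathbf{h}; \mathbf{x}, \theta)
  = \mathbb{E}\left[ u(\mathbf{y}^{*}, \mathbf{h}) \mid \mathbf{x}, \theta \right]
  = \sum_{\mathbf{y} \in \mathcal{Y}} u(\mathbf{y}, \mathbf{h}) \, p(\mathbf{y} \mid \mathbf{x}, \theta)
```

The sum ranges over every element of $\mathcal{Y}$, so evaluating $\mu_u$ exactly requires enumerating the output space.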
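  2. (Eikema & Aziz, COLING 2020) $N$ samples: $\bar{\mathcal{H}}(\mathbf{x}) = \{\mathbf{y}^{(1)}, \ldots, \mathbf{y}^{(N)}\}$.

    Estimate of $\mu_u(\mathbf{h}; \mathbf{x}, \theta)$: $\hat{\mu}_u(\mathbf{h}; \mathbf{x}, N) := \frac{1}{N} \sum_{n=1}^{N} u(\mathbf{y}^{(n)}, \mathbf{h})$. Decision rule: $\mathbf{y}^{\mathrm{NbyN}} := \operatorname{argmax}_{\mathbf{h} \in \bar{\mathcal{H}}(\mathbf{x})} \hat{\mu}_u(\mathbf{h}; \mathbf{x}, N)$. This takes $N^2$ utility evaluations, i.e. $\mathcal{O}(N^2 \times U)$, where $U$ is the upper-bound cost of evaluating the utility function once. Reference: “Is MAP Decoding All You Need? The Inadequacy of the Mode in Neural Machine Translation”, Eikema & Aziz, COLING 2020.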
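The N-by-N rule above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the samples are plain strings, and `unigram_f1` is a toy stand-in for whatever utility the slides actually use.

```python
from typing import Callable, Sequence


def unigram_f1(reference: str, hypothesis: str) -> float:
    """Toy utility u(y, h): F1 over unique whitespace tokens (an assumption)."""
    ref, hyp = set(reference.split()), set(hypothesis.split())
    overlap = len(ref & hyp)
    if overlap == 0:
        return 0.0
    precision = overlap / len(hyp)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def mbr_n_by_n(samples: Sequence[str],
               utility: Callable[[str, str], float]) -> str:
    """Return the sample h maximising (1/N) * sum_n utility(y_n, h).

    Every sample serves both as a candidate and as a pseudo-reference,
    so the utility is evaluated N * N times.
    """
    n = len(samples)

    def expected_utility(h: str) -> float:
        return sum(utility(y, h) for y in samples) / n

    # max() returns the first maximiser if several candidates tie.
    return max(samples, key=expected_utility)


# Hypothetical samples drawn from a translation model for one source sentence.
samples = ["the cat sat", "the cat sat down", "a dog ran", "the cat sat"]
best = mbr_n_by_n(samples, unigram_f1)
print(best)
```

With these four samples the selected candidate is the one that agrees most with the rest of the sample set, rather than the outlier.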
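  3. Using $S < N$ samples for $\hat{\mu}_u$: $\mathcal{O}(N^2 \times U) \to \mathcal{O}(N \times S \times U)$.

    Keeping the top $T$ candidates under $\hat{\mu}_{u_{\mathrm{proxy}}}$: $\bar{\mathcal{H}}_T(\mathbf{x}) := \operatorname{top-}T_{\mathbf{h} \in \bar{\mathcal{H}}(\mathbf{x})} \hat{\mu}_{u_{\mathrm{proxy}}}(\mathbf{h}; \mathbf{x}, S)$, then $\mathbf{y}^{\mathrm{C2F}} := \operatorname{argmax}_{\mathbf{h} \in \bar{\mathcal{H}}_T(\mathbf{x})} \hat{\mu}_{u_{\mathrm{target}}}(\mathbf{h}; \mathbf{x}, L)$. Cost: $\mathcal{O}(N \times S \times U_{\mathrm{proxy}} + T \times L \times U_{\mathrm{target}})$. Values on the slide: $S = 5$, $S = 50$.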
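The two-stage C2F rule on this slide might be sketched as follows. Everything named here is an assumption made for illustration: the two toy utilities, the function names, and the choice to take the first $S$ and first $L$ samples as pseudo-references (the slide does not say how they are picked).

```python
from typing import Callable, List, Sequence, Tuple

Utility = Callable[[str, str], float]


def same_first_token(reference: str, hypothesis: str) -> float:
    """Hypothetical cheap proxy utility: 1.0 if the first tokens match."""
    ref, hyp = reference.split(), hypothesis.split()
    return 1.0 if ref and hyp and ref[0] == hyp[0] else 0.0


def jaccard(reference: str, hypothesis: str) -> float:
    """Hypothetical target utility: Jaccard overlap of the token sets."""
    ref, hyp = set(reference.split()), set(hypothesis.split())
    union = ref | hyp
    return len(ref & hyp) / len(union) if union else 0.0


def estimate(h: str, refs: Sequence[str], utility: Utility) -> float:
    """Monte Carlo estimate of the expected utility of h against refs."""
    return sum(utility(y, h) for y in refs) / len(refs)


def coarse_to_fine(samples: Sequence[str], proxy: Utility, target: Utility,
                   num_proxy_refs: int, top_t: int,
                   num_target_refs: int) -> Tuple[str, List[str]]:
    """Rank all candidates with the proxy, rescore the top T with the target."""
    # Coarse step: N candidates x S pseudo-references, proxy utility.
    proxy_refs = samples[:num_proxy_refs]  # assumption: first S samples
    ranked = sorted(samples,
                    key=lambda h: estimate(h, proxy_refs, proxy),
                    reverse=True)
    shortlist = list(ranked[:top_t])
    # Fine step: T candidates x L pseudo-references, target utility.
    target_refs = samples[:num_target_refs]  # assumption: first L samples
    best = max(shortlist, key=lambda h: estimate(h, target_refs, target))
    return best, shortlist


samples = ["the cat sat", "the cat sat down", "a dog ran",
           "the cat sat", "the dog sat"]
best, shortlist = coarse_to_fine(samples, same_first_token, jaccard,
                                 num_proxy_refs=2, top_t=3, num_target_refs=5)
print(best, shortlist)
```

The expensive target utility is only ever called on the shortlist, which is where the $T \times L \times U_{\mathrm{target}}$ term in the cost comes from.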
  4. (Stanojević & Sima’an, WMT 2014)

    “BEER: BEtter Evaluation as Ranking”, Stanojević & Sima’an, WMT 2014
  5. $\mathbf{y}^{\mathrm{NbyS}} := \operatorname{argmax}_{\mathbf{h} \in \{\mathbf{y}^{(k)}\}_{k=1}^{N}} \hat{\mu}_u(\mathbf{h}; \mathbf{x}, S)$
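A minimal sketch of the N-by-S rule, under stated assumptions: the names are hypothetical, `jaccard` is a toy utility, and the first $S$ candidates are used as pseudo-references (reasonable only if the candidates were drawn independently; the slide does not specify the selection). The function also counts utility calls so the $N \times S$ cost is visible.

```python
from typing import Callable, Sequence, Tuple


def jaccard(reference: str, hypothesis: str) -> float:
    """Toy utility (an assumption): Jaccard overlap of the token sets."""
    ref, hyp = set(reference.split()), set(hypothesis.split())
    union = ref | hyp
    return len(ref & hyp) / len(union) if union else 0.0


def mbr_n_by_s(candidates: Sequence[str],
               utility: Callable[[str, str], float],
               s: int) -> Tuple[str, int]:
    """Score all N candidates against S pseudo-references.

    Returns the best candidate and the number of utility evaluations.
    """
    if not 0 < s <= len(candidates):
        raise ValueError("s must satisfy 0 < s <= N")
    pseudo_refs = candidates[:s]  # assumption: first S samples
    calls = 0
    best, best_score = candidates[0], float("-inf")
    for h in candidates:
        total = 0.0
        for y in pseudo_refs:
            total += utility(y, h)
            calls += 1
        score = total / s
        if score > best_score:  # strict '>' keeps the earliest on ties
            best, best_score = h, score
    return best, calls


candidates = ["the cat sat", "the cat sat", "the cat sat down", "a dog ran"]
best, calls = mbr_n_by_s(candidates, jaccard, s=2)
print(best, calls)
```

Setting `s` equal to the number of candidates recovers the N-by-N estimate with $N^2$ calls.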
  6. $N = 405$, $S = 13$

    top-$T = 50$, $L = 100$, $N = 405$
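To give a sense of scale, the values on this slide can be plugged into the evaluation counts from the earlier cost formulas. Pairing the numbers this way is an assumption made for illustration, since the slide only lists the values. The helper names are hypothetical.

```python
def n_by_n_calls(n: int) -> int:
    """Utility evaluations for N-by-N: every candidate against every sample."""
    return n * n


def n_by_s_calls(n: int, s: int) -> int:
    """Utility evaluations for N-by-S: N candidates against S samples."""
    return n * s


def fine_step_calls(t: int, l: int) -> int:
    """Target-utility evaluations in the fine step: T candidates, L samples."""
    return t * l


N, S, T, L = 405, 13, 50, 100  # values listed on the slide

print(n_by_n_calls(N))        # 164025
print(n_by_s_calls(N, S))     # 5265
print(fine_step_calls(T, L))  # 5000
```

Under this reading, N-by-S with $S = 13$ needs roughly 3% of the utility evaluations of the full N-by-N estimate.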