20240226_AAMT-Japio
Hiroyuki Deguchi
February 26, 2024
Transcript
◼ Discriminative reranking (Lee+, ACL2021)
◼ Quality-aware decoding (Fernandes+, NAACL2022)
Lee+, ACL2021, ``Discriminative Reranking for Neural Machine Translation’’.
Fernandes+, NAACL2022, ``Quality-Aware Decoding for Neural Machine Translation’’.
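◼ $k$NN-MT (Khandelwal+, ICLR2021)
▶ Output probability: the uniform average of seven MT model distributions, interpolated with the $k$NN distribution:
$$P = \frac{1-\lambda}{7}\left(p_{\mathrm{MT}_1} + \cdots + p_{\mathrm{MT}_7}\right) + \lambda\, p_{k\mathrm{NN}}$$
▶ $k = 64$, $\lambda = 0.1$, $\tau = 100$
Khandelwal+, ICLR2021, ``Nearest Neighbor Machine Translation’’.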
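The interpolation above can be sketched in a few lines of Python. This is a minimal illustration, not the deck's implementation: the dict-based representation of a distribution and the names `interpolate`, `p_mt_list`, and `lam` are assumptions made for the example.

```python
def interpolate(p_mt_list, p_knn, lam=0.1):
    """Return (1 - lam) * mean(p_mt_list) + lam * p_knn over the union vocabulary.

    Each distribution is a dict mapping token -> probability. This
    representation is an assumption made for the sketch.
    """
    if not p_mt_list:
        raise ValueError("need at least one MT distribution")
    vocab = set(p_knn)
    for dist in p_mt_list:
        vocab |= set(dist)
    n_models = len(p_mt_list)
    mixed = {}
    for token in vocab:
        # Uniform average over the MT models (the 1/7 factor when there are seven).
        p_mt = sum(dist.get(token, 0.0) for dist in p_mt_list) / n_models
        mixed[token] = (1.0 - lam) * p_mt + lam * p_knn.get(token, 0.0)
    return mixed


if __name__ == "__main__":
    # Two toy models instead of seven, to keep the example short.
    models = [{"cat": 0.6, "dog": 0.4}, {"cat": 0.8, "dog": 0.2}]
    knn = {"dog": 1.0}
    print(interpolate(models, knn, lam=0.1))
```

Because every input is a probability distribution and the weights sum to one, the mixture is again a distribution.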
$k$NN-MT (Khandelwal+, ICLR2021)
◼ Subset retrieval $k$NN-MT (Deguchi+, ACL2023)
Deguchi+, ACL2023, ``Subset Retrieval Nearest Neighbor Machine Translation’’.
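$k$NN-MT: datastore
◼ Key-value pairs
▶ Key: $f(\boldsymbol{x}, \boldsymbol{y}_{<t}) \in \mathbb{R}^D$
▶ Value: $y_t \in \mathcal{V}_Y$
⚫ Datastore: $\mathcal{M} \subseteq \mathbb{R}^D \times \mathcal{V}_Y$, built from source sentences $\boldsymbol{x}$ and target sentences $\boldsymbol{y}$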
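As a rough sketch of how such a datastore could be built, the loop below stores one (key, value) pair per target position. The function `toy_hidden_state` is a hypothetical stand-in for the model's $D$-dimensional representation $f(\boldsymbol{x}, \boldsymbol{y}_{<t})$; a real system would take this vector from the decoder.

```python
def build_datastore(parallel_data, hidden_state):
    """Collect (key, value) pairs: key = f(x, y_<t), value = y_t."""
    datastore = []
    for source, target in parallel_data:
        for t, token in enumerate(target):
            key = hidden_state(source, target[:t])  # representation of the prefix
            datastore.append((key, token))          # the token that follows it
    return datastore


def toy_hidden_state(source, prefix):
    # Hypothetical placeholder for f(x, y_<t): a 2-dimensional vector
    # made of the two lengths. Not a meaningful representation.
    return (float(len(source)), float(len(prefix)))


if __name__ == "__main__":
    data = [(["ein", "Hund"], ["a", "dog"])]
    for key, value in build_datastore(data, toy_hidden_state):
        print(key, value)
```

The datastore therefore holds as many entries as there are target tokens in the parallel data.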
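$k$NN-MT: $k$-nearest-neighbor search
◼ Query: $\boldsymbol{q} \in \mathbb{R}^D$
◼ Retrieve the $k$ nearest neighbors of $\boldsymbol{q}$, giving key-value pairs $(\boldsymbol{k}_i, v_i)$
◼ $k$NN probability:
$$p_{k\mathrm{NN}}(y_t \mid \boldsymbol{x}, \boldsymbol{y}_{<t}) \propto \sum_{i=1}^{k} \mathbb{1}_{y_t = v_i} \exp\left(\frac{-\lVert \boldsymbol{q} - \boldsymbol{k}_i \rVert_2^2}{\tau}\right)$$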
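The $k$NN probability above can be computed as follows. This is a brute-force sketch under stated assumptions: the datastore is a plain list of (key, value) pairs and the search is exhaustive, whereas a practical system would use an approximate nearest-neighbor index. The function name `knn_distribution` is illustrative.

```python
import math


def knn_distribution(query, datastore, k=64, tau=100.0):
    """p_kNN(y_t) proportional to sum_i 1[y_t = v_i] * exp(-||q - k_i||^2 / tau)."""
    if not datastore:
        return {}
    scored = []
    for key, value in datastore:
        dist2 = sum((q - c) ** 2 for q, c in zip(query, key))  # squared L2 distance
        scored.append((dist2, value))
    scored.sort(key=lambda pair: pair[0])
    neighbors = scored[:k]
    # Subtracting the smallest distance avoids underflow and cancels
    # out in the normalization.
    d_min = neighbors[0][0]
    weights = {}
    for dist2, value in neighbors:
        weights[value] = weights.get(value, 0.0) + math.exp(-(dist2 - d_min) / tau)
    total = sum(weights.values())
    return {token: w / total for token, w in weights.items()}


if __name__ == "__main__":
    store = [((0.0, 0.0), "a"), ((1.0, 0.0), "b"), ((0.0, 1.0), "a")]
    print(knn_distribution((0.0, 0.0), store, k=2, tau=1.0))
```

Tokens that do not appear among the $k$ retrieved values receive zero probability, which is why the result is interpolated with the MT model rather than used alone.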
⚫ $p$ = 0.5 to 0.7
◼ Discriminative reranking (Lee+, ACL2021)
◼ Quality-aware decoding (Fernandes+, NAACL2022)
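Discriminative reranking (Lee+, ACL2021)
◼ Training objective:
$$\mathcal{L}(\theta) = -\sum_{j=1}^{n} p_T(u_j) \log p_M(u_j \mid x; \theta)$$
⚫ Target distribution from a metric $\mu(\cdot,\cdot) \in [0, 1]$: $p_T(u_i) \propto \exp\left(\mu(u_i, r) / T\right)$
⚫ Model distribution: $p_M(u_i \mid x; \theta) \propto \exp\left(o_i(u_i \mid x; \theta)\right)$
◼ Notation
⚫ $r$: reference translation
⚫ $u_i$: $i$-th hypothesis, with metric score $\mu(u_i, r)$
▶ $p_M$ is trained toward $p_T$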
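The objective above is a cross-entropy between a temperature-scaled softmax over metric scores and a softmax over the reranker's scores. A minimal sketch for one source sentence follows; the function names and the use of plain Python lists are assumptions for illustration, and no gradient computation is shown.

```python
import math


def log_softmax(scores):
    """Numerically stable log-softmax over a list of floats."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]


def reranking_loss(metric_scores, model_scores, temperature=0.5):
    """L = -sum_j p_T(u_j) * log p_M(u_j | x).

    metric_scores: mu(u_j, r) in [0, 1] for each of the n hypotheses.
    model_scores:  the reranker's unnormalized score for each hypothesis.
    """
    log_p_target = log_softmax([m / temperature for m in metric_scores])
    log_p_model = log_softmax(model_scores)
    return -sum(math.exp(lt) * lm for lt, lm in zip(log_p_target, log_p_model))


if __name__ == "__main__":
    metrics = [0.9, 0.5, 0.1]
    print(reranking_loss(metrics, [1.8, 1.0, 0.2]))  # model agrees with the metric
    print(reranking_loss(metrics, [0.2, 1.0, 1.8]))  # model ranks in reverse
```

The loss is smallest when the model distribution equals the target distribution, so a reranker that orders hypotheses like the metric is penalized less than one that reverses the order.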
Discriminative reranking (Lee+, ACL2021)
⚫ $T = 0.5$
▶ $p_T(u_i) \propto \exp\left(\mu(u_i, r) / T\right)$
⚫ $\beta_1 = 0.9$, $\beta_2 = 0.98$
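Quality-aware decoding (Fernandes+, NAACL2022)
⚫ MAP decoding:
$$y^*_{\mathrm{MAP}} = \operatorname*{argmax}_{y \in \mathcal{Y}} \log p_\theta(y \mid x)$$
⚫ MBR decoding:
$$y^*_{\mathrm{MBR}} = \operatorname*{argmax}_{h \in \mathcal{H}} \mathbb{E}_{\hat{y} \sim P(y \mid x)}\left[u(h, \hat{y})\right] \approx \operatorname*{argmax}_{h \in \mathcal{H}} \frac{1}{N}\sum_{i=1}^{N} u(h, \hat{y}_i)$$
▶ $u(\cdot,\cdot)$: utility function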
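MBR decoding (Goel & Byrne, CS&L 2000; Kumar & Byrne, NAACL2004)
◼ Definition:
$$y^*_{\mathrm{MBR}} := \operatorname*{argmax}_{h \in \mathcal{H}} \mathbb{E}_{\hat{y} \sim P(y \mid x)}\left[u(h, \hat{y})\right]$$
⚫ $\mathcal{H} \subset \mathcal{Y}$: hypothesis set; $u\colon \mathcal{Y} \times \mathcal{Y} \to \mathbb{R}$: utility function
⚫ $P(y \mid x)$: probability of output $y$ given input $x$
◼ Approximation with a finite set of pseudo-references $\hat{\mathcal{Y}}$ taken from $\mathcal{Y}$:
$$y^*_{\mathrm{MBR}} \approx \operatorname*{argmax}_{h \in \mathcal{H}} \mathbb{E}_{\hat{y} \in \hat{\mathcal{Y}}}\left[u(h, \hat{y})\right]$$
▶ $\hat{\mathcal{Y}} := \mathcal{H}$
⚫ With $N := |\mathcal{H}|$, this takes $\mathcal{O}(N^2)$ utility evaluations
Goel & Byrne, CS&L Vol. 14, 2000, ``Minimum Bayes-risk automatic speech recognition’’.
Kumar & Byrne, NAACL2004, ``Minimum Bayes-Risk Decoding for Statistical Machine Translation’’.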
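The approximate MBR rule above, with the hypotheses reused as pseudo-references, can be sketched as a double loop. The utility here is a token-level Jaccard overlap, chosen only so the example runs without external models; it is a stand-in and not the metric used in the deck. The names `mbr_decode` and `jaccard` are illustrative.

```python
def jaccard(hyp, ref):
    """Toy utility: token-set overlap in [0, 1]. A stand-in for a real metric."""
    a, b = set(hyp.split()), set(ref.split())
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def mbr_decode(hypotheses, utility, pseudo_refs=None):
    """Return the hypothesis with the highest average utility, and that average.

    When pseudo_refs is None the hypotheses are reused as pseudo-references,
    so the utility is evaluated N * N times for N hypotheses.
    """
    refs = list(hypotheses) if pseudo_refs is None else list(pseudo_refs)
    if not hypotheses or not refs:
        raise ValueError("need at least one hypothesis and one pseudo-reference")
    best_hyp, best_score = None, float("-inf")
    for hyp in hypotheses:
        expected = sum(utility(hyp, ref) for ref in refs) / len(refs)
        if expected > best_score:
            best_hyp, best_score = hyp, expected
    return best_hyp, best_score


if __name__ == "__main__":
    candidates = ["the cat sat", "the cat sat down", "the cat", "a dog ran"]
    print(mbr_decode(candidates, jaccard))
```

The selected output is the candidate that agrees most with the others on average, which need not be the one the model scores highest.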
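COMET-MBR (Fernandes+, NAACL2022)
⚫ Encoder $f\colon \mathcal{X} \cup \mathcal{Y} \to \mathbb{R}^D$ maps a sentence to a $D$-dimensional vector
▶ Source: $x \in \mathcal{X}$
▶ Hypothesis: $h \in \mathcal{Y}$
▶ Pseudo-reference: $\hat{y} \in \mathcal{Y}$
⚫ Scoring function $s\colon \mathbb{R}^D \times \mathbb{R}^D \times \mathbb{R}^D \to \mathbb{R}$
◼ Decision rule:
$$y^*_{\mathrm{COMET\_MBR}} = \operatorname*{argmax}_{h \in \mathcal{H}} \mathbb{E}_{\hat{y} \in \hat{\mathcal{Y}}}\left[s\left(f(x), f(h), f(\hat{y})\right)\right]$$
⚫ Cost: $\mathcal{O}(N^2)$ score computations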
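The structure of the rule above, where each sentence is encoded once and the scorer runs on every pair of vectors, can be sketched as follows. Both `toy_encode` and `toy_score` are hypothetical placeholders: they are not the COMET encoder or scorer, and `toy_score` ignores the source vector entirely. They exist only so the control flow is runnable.

```python
import math


def toy_encode(sentence, dim=8):
    """Hypothetical stand-in for f: bucket tokens into a dim-sized count vector."""
    vec = [0.0] * dim
    for token in sentence.split():
        vec[sum(map(ord, token)) % dim] += 1.0
    return vec


def toy_score(src_vec, hyp_vec, ref_vec):
    """Hypothetical stand-in for s: cosine of hypothesis and pseudo-reference."""
    dot = sum(a * b for a, b in zip(hyp_vec, ref_vec))
    norm = math.sqrt(sum(a * a for a in hyp_vec)) * math.sqrt(sum(b * b for b in ref_vec))
    return dot / norm if norm > 0 else 0.0


def embedding_mbr(source, hypotheses, encode, score):
    """argmax over h of the mean of s(f(x), f(h), f(y_hat)), hypotheses as pseudo-references."""
    if not hypotheses:
        raise ValueError("need at least one hypothesis")
    src_vec = encode(source)                     # 1 encoder call
    hyp_vecs = [encode(h) for h in hypotheses]   # N encoder calls

    def expected(i):
        # N score calls per hypothesis, N * N in total.
        return sum(score(src_vec, hyp_vecs[i], ref) for ref in hyp_vecs) / len(hyp_vecs)

    best = max(range(len(hypotheses)), key=expected)
    return hypotheses[best]


if __name__ == "__main__":
    cands = ["the cat sat", "the cat sat down", "a dog ran"]
    print(embedding_mbr("die Katze sass", cands, toy_encode, toy_score))
```

Under these assumptions the encoder is called $N + 1$ times while the scorer is called $N^2$ times, so the quadratic term comes from the pairwise scoring.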
+, NLP2024, `` ’’.
Deguchi+, arXiv, ``Centroid-Based Efficient Minimum Bayes Risk Decoding’’. https://arxiv.org/abs/2402.11197