Fast kNN Neural Machine Translation Using Subset Search (サブセット探索を用いた高速なkNNニューラル機械翻訳)
Hiroyuki Deguchi
March 22, 2024
8th AAMT Seminar (第8回AAMTセミナー)
AAMT Young Translation Researchers' Workshop (AAMT若手翻訳研究会)
Best Award (最優秀賞)
Transcript
Works cited on the opening slides:
⚫ Guiding Neural Machine Translation with Retrieved Translation Pieces (Zhang+, NAACL 2018)
⚫ Search Engine Guided Neural Machine Translation (Gu+, AAAI 2018)
⚫ Nearest Neighbor Machine Translation (Khandelwal+, ICLR 2021)
⚫ A framework for a mechanical translation between Japanese and English by analogy principle (Nagao, 1984)
𝑘NN-MT (Khandelwal+, ICLR 2021), with source sentence 𝒙 and target sentence 𝒚.
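𝑘NN-MT (Khandelwal+, ICLR 2021): with keys $\boldsymbol{k}_i \in \mathbb{R}^D$, their values $v_i$, and the query $f(\boldsymbol{x}, \boldsymbol{y}_{<t}) \in \mathbb{R}^D$, the 𝑘-nearest-neighbor probability is

$$p_{k\mathrm{NN}}(y_t \mid \boldsymbol{x}, \boldsymbol{y}_{<t}) \propto \sum_{i=1}^{k} \mathbb{1}_{y_t = v_i} \exp\left( \frac{-\lVert \boldsymbol{k}_i - f(\boldsymbol{x}, \boldsymbol{y}_{<t}) \rVert_2^2}{\tau} \right)$$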
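The 𝑘NN probability above can be sketched in a few lines of plain Python. This is a minimal illustration and not the speaker's implementation: the function names `knn_probs` and `interpolate`, the dictionary-based vocabulary, and the toy vectors are all assumptions made for the example. The interpolation step follows the formulation in Khandelwal+ (ICLR 2021), p = 𝜆·p_kNN + (1 − 𝜆)·p_MT; it is not spelled out in the surviving slide text.

```python
import math
from collections import defaultdict


def knn_probs(query, keys, values, tau):
    """Turn k retrieved (key, value) neighbors into a distribution over tokens.

    query  : f(x, y_<t), a list of D floats
    keys   : k retrieved key vectors k_i, each a list of D floats
    values : the k target tokens v_i paired with the keys
    tau    : softmax temperature
    """
    # Squared L2 distance between each key and the query.
    dists = [sum((a - b) ** 2 for a, b in zip(key, query)) for key in keys]
    # Shifting by the minimum distance avoids underflow and cancels out
    # after normalization, so the result is unchanged.
    d_min = min(dists)
    scores = defaultdict(float)
    for d, v in zip(dists, values):
        # Neighbors sharing the same token accumulate (the indicator term).
        scores[v] += math.exp(-(d - d_min) / tau)
    z = sum(scores.values())
    return {tok: s / z for tok, s in scores.items()}


def interpolate(p_knn, p_mt, lam):
    """Mix the kNN and MT distributions (assumed form from Khandelwal+ 2021)."""
    vocab = set(p_knn) | set(p_mt)
    return {
        tok: lam * p_knn.get(tok, 0.0) + (1.0 - lam) * p_mt.get(tok, 0.0)
        for tok in vocab
    }


# Toy usage with hypothetical 2-dimensional keys and three neighbors.
p = knn_probs(
    query=[0.0, 0.0],
    keys=[[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]],
    values=["cat", "dog", "cat"],
    tau=1.0,
)
mixed = interpolate(p, {"cat": 0.2, "dog": 0.3, "bird": 0.5}, lam=0.5)
```

In a real system the neighbors would come from an approximate nearest-neighbor index over a large datastore, and that retrieval step is where most of the decoding cost lies.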
Prior work cited:
⚫ Chunk-based Nearest Neighbor Machine Translation (Martins+, EMNLP 2022)
⚫ Fast Nearest Neighbor Machine Translation (Meng+, ACL Findings 2022)
Settings shown on the slide: 𝜆 = 0.5, 𝑘 = 16.
Work cited:
⚫ Reconfigurable Inverted Index (Matsui+, ACMMM 2018)
[Figure: three diagram steps labelled 𝑛 and 𝑘; contents not recovered]
Settings shown on the slide: 𝜆 = 0.5, 𝑘
= 16 𝑛 = 56
[Text of the remaining slides not recovered]