Lock in $30 Savings on PRO—Offer Ends Soon! ⏳
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
サブセット探索を用いた高速なkNNニューラル機械翻訳
Search
Hiroyuki Deguchi
March 22, 2024
Research
0
150
サブセット探索を用いた高速なkNNニューラル機械翻訳
第8回AAMTセミナー
AAMT若手翻訳研究会
最優秀賞
Hiroyuki Deguchi
March 22, 2024
Tweet
Share
More Decks by Hiroyuki Deguchi
See All by Hiroyuki Deguchi
20250226 NLP colloquium: "SoftMatcha: 10億単語規模コーパス検索のための柔らかくも高速なパターンマッチャー"
de9uch1
0
610
20240820: Minimum Bayes Risk Decoding for High-Quality Text Generation Beyond High-Probability Text
de9uch1
0
300
20240226_AAMT-Japio
de9uch1
0
170
Searching for Needles in a Haystack: On the Role of Incidental Bilingualism in PaLM’s Translation Capability
de9uch1
0
140
Paper Reading: Sampling-Based Approximations to Minimum Bayes Risk Decoding for Neural Machine Translation
de9uch1
0
190
My Research Environmental Setup
de9uch1
0
310
Nearest Neighbor Machine Translation
de9uch1
0
260
Paper Reading - Dynamic Programming Encoding for Subword Segmentation in Neural Machine Translation
de9uch1
0
290
paper reading - Tree Transformer
de9uch1
0
260
Other Decks in Research
See All in Research
cvpaper.challenge 10年の軌跡 / cvpaper.challenge a decade-long journey
gatheluck
3
1k
単施設でできる臨床研究の考え方
shuntaros
0
3.3k
言語モデルの地図:確率分布と情報幾何による類似性の可視化
shimosan
8
2.2k
Learning to (Learn at Test Time): RNNs with Expressive Hidden States
kurita
1
290
snlp2025_prevent_llm_spikes
takase
0
420
LLM-Assisted Semantic Guidance for Sparsely Annotated Remote Sensing Object Detection
satai
3
140
Remote sensing × Multi-modal meta survey
satai
4
630
長期・短期メモリを活用したエージェントの個別最適化
isidaitc
0
330
論文紹介: ReGenesis: LLMs can Grow into Reasoning Generalists via Self-Improvement
hisaokatsumi
0
140
VectorLLM: Human-like Extraction of Structured Building Contours via Multimodal LLMs
satai
4
500
ロボット学習における大規模検索技術の展開と応用
denkiwakame
1
170
Sat2City:3D City Generation from A Single Satellite Image with Cascaded Latent Diffusion
satai
4
330
Featured
See All Featured
It's Worth the Effort
3n
187
29k
10 Git Anti Patterns You Should be Aware of
lemiorhan
PRO
659
61k
Principles of Awesome APIs and How to Build Them.
keavy
127
17k
Into the Great Unknown - MozCon
thekraken
40
2.2k
Faster Mobile Websites
deanohume
310
31k
The Cost Of JavaScript in 2023
addyosmani
55
9.3k
How to Ace a Technical Interview
jacobian
280
24k
How to Create Impact in a Changing Tech Landscape [PerfNow 2023]
tammyeverts
55
3.1k
Agile that works and the tools we love
rasmusluckow
331
21k
[RailsConf 2023] Rails as a piece of cake
palkan
58
6.2k
Design and Strategy: How to Deal with People Who Don’t "Get" Design
morganepeng
132
19k
Leading Effective Engineering Teams in the AI Era
addyosmani
8
1.3k
Transcript
𝒌
◼ ⚫ ⚫ ◼ ⚫ (Zhang+, NAACL2018; Gu+, AAAI2018; Khandelwal+,
ICLR2021) ▶ (Nagao, 1984) ▶ ⚫ 𝑘 (Khandelwal+, ICLR2021) ▶ ▶ ▶ Guiding Neural Machine Translation with Retrieved Translation Pieces (Zhang+, NAACL2018) Search Engine Guided Neural Machine Translation (Gu+, AAAI2018) Nearest Neighbor Machine Translation (Khandelwal+, ICLR2021) A framework for a mechanical translation between Japanese and English by analogy principle (Nagao, 1984)
◼ ◼ ⚫ ⚫
𝒌 (Khandelwal+, ICLR2021) ◼ ⚫ ⚫ ⚫ ◼ ⚫ ▶
⚫ ▶ ≈ Nearest Neighbor Machine Translation (Khandelwal+, ICLR2021) 𝒙 𝒚
𝒌 (Khandelwal+, ICLR2021) 𝒌𝑖 ∈ ℝ𝐷 𝑓 𝒙, 𝒚<𝑡 ∈
ℝ𝐷 Nearest Neighbor Machine Translation (Khandelwal+, ICLR2021) ◼ 𝑘 ◼ ⚫ ⚫ 𝑝𝑘NN 𝑦𝑡 𝒙, 𝒚<𝑡 ∝ 𝑖=1 𝑘 𝟙𝑦𝑡=𝑣𝑖 exp − 𝒌𝑖 − 𝑓 𝒙, 𝒚<𝑡 2 2 𝜏 ◼ 𝑘
𝒌 ◼ (Martins+, EMNLP2022) ◼ (Meng+, ACLFindings2022) ⚫ 𝑘 𝑘
𝜆 = 0.5 𝑘 = 16 Chunk-based Nearest Neighbor Machine Translation (Martins+, EMNLP2022) Fast Nearest Neighbor Machine Translation (Meng+, ACL Findings2022)
𝒌 ◼ 𝑘 ◼ ⚫ 𝑘 (Matsui+, ACMMM2018) ⚫ 𝑘
𝑘 𝑘 Reconfigurable Inverted Index (Matsui+, ACMMM2018) 𝒌
◼ ⚫ 𝑘 ⚫ 𝑘 ◼ ◼ 𝑘
𝑛 𝑘 1 1 1 1 1 1 1 1
1
𝑛 𝑘 1 1 1 1 1 1 1 1
1
𝑛 𝑘 1 1 1 1 1 1 1 1
1
⚫ ⚫ ⚫ ⚫ ⚫ 𝑘 𝜆 = 0.5 𝑘
= 16 𝑛 = 56
𝑘 𝑘 ◼ 𝑘 ⚫ ▶ ⚫ ▶
◼ 𝑘 𝒌 𝒌
◼ ⚫ 𝑘
𝒌 𝒌 ◼ ⚫ ⚫ ◼ 𝑘 ⚫ ⚫ ◼
⚫
⚫ ⚫ ▶ ⚫ ▶