Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
サブセット探索を用いた高速なkNNニューラル機械翻訳
Search
Hiroyuki Deguchi
March 22, 2024
Research
0
86
サブセット探索を用いた高速なkNNニューラル機械翻訳
第8回AAMTセミナー
AAMT若手翻訳研究会
最優秀賞
Hiroyuki Deguchi
March 22, 2024
Tweet
Share
More Decks by Hiroyuki Deguchi
See All by Hiroyuki Deguchi
20250226 NLP colloquium: "SoftMatcha: 10億単語規模コーパス検索のための柔らかくも高速なパターンマッチャー"
de9uch1
0
30
20240820: Minimum Bayes Risk Decoding for High-Quality Text Generation Beyond High-Probability Text
de9uch1
0
190
20240226_AAMT-Japio
de9uch1
0
110
Searching for Needles in a Haystack: On the Role of Incidental Bilingualism in PaLM’s Translation Capability
de9uch1
0
110
Paper Reading: Sampling-Based Approximations to Minimum Bayes Risk Decoding for Neural Machine Translation
de9uch1
0
140
My Research Environmental Setup
de9uch1
0
250
Nearest Neighbor Machine Translation
de9uch1
0
210
Paper Reading - Dynamic Programming Encoding for Subword Segmentation in Neural Machine Translation
de9uch1
0
250
paper reading - Tree Transformer
de9uch1
0
210
Other Decks in Research
See All in Research
Weekly AI Agents News! 1月号 アーカイブ
masatoto
1
170
渋谷Well-beingアンケート調査結果
shibuyasmartcityassociation
0
400
Poster: Feasibility of Runtime-Neutral Wasm Instrumentation for Edge-Cloud Workload Handover
chikuwait
0
350
Weekly AI Agents News! 10月号 論文のアーカイブ
masatoto
1
510
地理空間情報と自然言語処理:「地球の歩き方旅行記データセット」の高付加価値化を通じて
hiroki13
1
190
Weekly AI Agents News! 12月号 プロダクト/ニュースのアーカイブ
masatoto
0
320
The many faces of AI and the role of mathematics
gpeyre
1
1.7k
Gemini と Looker で営業DX をドライブする / Driving Sales DX with Gemini and Looker
sansan_randd
0
120
資産間の相関関係を頑健に評価する指標を用いたファクターアローケーション戦略の構築
nomamist
0
130
打率7割を実現する、プロダクトディスカバリーの7つの極意(pmconf2024)
geshi0820
0
320
Whoisの闇
hirachan
3
300
DeepSeek を利用する上でのリスクと安全性の考え方
schroneko
3
740
Featured
See All Featured
Intergalactic Javascript Robots from Outer Space
tanoku
270
27k
Cheating the UX When There Is Nothing More to Optimize - PixelPioneers
stephaniewalter
280
13k
JavaScript: Past, Present, and Future - NDC Porto 2020
reverentgeek
47
5.2k
A Modern Web Designer's Workflow
chriscoyier
693
190k
Agile that works and the tools we love
rasmusluckow
328
21k
Dealing with People You Can't Stand - Big Design 2015
cassininazir
366
25k
Building Your Own Lightsaber
phodgson
104
6.2k
The Web Performance Landscape in 2024 [PerfNow 2024]
tammyeverts
4
420
Site-Speed That Sticks
csswizardry
4
400
GraphQLとの向き合い方2022年版
quramy
44
13k
Faster Mobile Websites
deanohume
306
31k
Helping Users Find Their Own Way: Creating Modern Search Experiences
danielanewman
29
2.4k
Transcript
𝒌
◼ ⚫ ⚫ ◼ ⚫ (Zhang+, NAACL2018; Gu+, AAAI2018; Khandelwal+,
ICLR2021) ▶ (Nagao, 1984) ▶ ⚫ 𝑘 (Khandelwal+, ICLR2021) ▶ ▶ ▶ Guiding Neural Machine Translation with Retrieved Translation Pieces (Zhang+, NAACL2018) Search Engine Guided Neural Machine Translation (Gu+, AAAI2018) Nearest Neighbor Machine Translation (Khandelwal+, ICLR2021) A framework for a mechanical translation between Japanese and English by analogy principle (Nagao, 1984)
◼ ◼ ⚫ ⚫
𝒌 (Khandelwal+, ICLR2021) ◼ ⚫ ⚫ ⚫ ◼ ⚫ ▶
⚫ ▶ ≈ Nearest Neighbor Machine Translation (Khandelwal+, ICLR2021) 𝒙 𝒚
𝒌 (Khandelwal+, ICLR2021) 𝒌𝑖 ∈ ℝ𝐷 𝑓 𝒙, 𝒚<𝑡 ∈
ℝ𝐷 Nearest Neighbor Machine Translation (Khandelwal+, ICLR2021) ◼ 𝑘 ◼ ⚫ ⚫ 𝑝𝑘NN 𝑦𝑡 𝒙, 𝒚<𝑡 ∝ 𝑖=1 𝑘 𝟙𝑦𝑡=𝑣𝑖 exp − 𝒌𝑖 − 𝑓 𝒙, 𝒚<𝑡 2 2 𝜏 ◼ 𝑘
𝒌 ◼ (Martins+, EMNLP2022) ◼ (Meng+, ACLFindings2022) ⚫ 𝑘 𝑘
𝜆 = 0.5 𝑘 = 16 Chunk-based Nearest Neighbor Machine Translation (Martins+, EMNLP2022) Fast Nearest Neighbor Machine Translation (Meng+, ACL Findings2022)
𝒌 ◼ 𝑘 ◼ ⚫ 𝑘 (Matsui+, ACMMM2018) ⚫ 𝑘
𝑘 𝑘 Reconfigurable Inverted Index (Matsui+, ACMMM2018) 𝒌
◼ ⚫ 𝑘 ⚫ 𝑘 ◼ ◼ 𝑘
𝑛 𝑘 1 1 1 1 1 1 1 1
1
𝑛 𝑘 1 1 1 1 1 1 1 1
1
𝑛 𝑘 1 1 1 1 1 1 1 1
1
⚫ ⚫ ⚫ ⚫ ⚫ 𝑘 𝜆 = 0.5 𝑘
= 16 𝑛 = 56
𝑘 𝑘 ◼ 𝑘 ⚫ ▶ ⚫ ▶
◼ 𝑘 𝒌 𝒌
◼ ⚫ 𝑘
𝒌 𝒌 ◼ ⚫ ⚫ ◼ 𝑘 ⚫ ⚫ ◼
⚫
⚫ ⚫ ▶ ⚫ ▶