Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
サブセット探索を用いた高速なkNNニューラル機械翻訳
Search
Hiroyuki Deguchi
March 22, 2024
Research
0
110
サブセット探索を用いた高速なkNNニューラル機械翻訳
第8回AAMTセミナー
AAMT若手翻訳研究会
最優秀賞
Hiroyuki Deguchi
March 22, 2024
Tweet
Share
More Decks by Hiroyuki Deguchi
See All by Hiroyuki Deguchi
20250226 NLP colloquium: "SoftMatcha: 10億単語規模コーパス検索のための柔らかくも高速なパターンマッチャー"
de9uch1
0
480
20240820: Minimum Bayes Risk Decoding for High-Quality Text Generation Beyond High-Probability Text
de9uch1
0
260
20240226_AAMT-Japio
de9uch1
0
150
Searching for Needles in a Haystack: On the Role of Incidental Bilingualism in PaLM’s Translation Capability
de9uch1
0
120
Paper Reading: Sampling-Based Approximations to Minimum Bayes Risk Decoding for Neural Machine Translation
de9uch1
0
170
My Research Environmental Setup
de9uch1
0
280
Nearest Neighbor Machine Translation
de9uch1
0
240
Paper Reading - Dynamic Programming Encoding for Subword Segmentation in Neural Machine Translation
de9uch1
0
280
paper reading - Tree Transformer
de9uch1
0
240
Other Decks in Research
See All in Research
3D Gaussian Splattingによる高効率な新規視点合成技術とその応用
muskie82
5
2.5k
公立高校入試等に対する受入保留アルゴリズム(DA)導入の提言
shunyanoda
0
5.7k
業界横断 副業・兼業者の実態調査
fkske
0
160
Google Agent Development Kit (ADK) 入門 🚀
mickey_kubo
2
1k
ストレス計測方法の確立に向けたマルチモーダルデータの活用
yurikomium
0
580
SSII2025 [SS2] 横浜DeNAベイスターズの躍進を支えたAIプロダクト
ssii
PRO
7
3.4k
SSII2025 [SS1] レンズレスカメラ
ssii
PRO
2
950
ウッドスタックチャン:木材を用いた小型エージェントロボットの開発と印象評価 / ec75-sato
yumulab
1
410
データサイエンティストの採用に関するアンケート
datascientistsociety
PRO
0
980
(NULLCON Goa 2025)Windows Keylogger Detection: Targeting Past and Present Keylogging Techniques
asuna_jp
1
520
EOGS: Gaussian Splatting for Efficient Satellite Image Photogrammetry
satai
4
250
電通総研の生成AI・エージェントの取り組みエンジニアリング業務向けAI活用事例紹介
isidaitc
1
230
Featured
See All Featured
Java REST API Framework Comparison - PWX 2021
mraible
31
8.6k
Stop Working from a Prison Cell
hatefulcrawdad
270
20k
Helping Users Find Their Own Way: Creating Modern Search Experiences
danielanewman
29
2.7k
Building Adaptive Systems
keathley
43
2.6k
個人開発の失敗を避けるイケてる考え方 / tips for indie hackers
panda_program
107
19k
Navigating Team Friction
lara
187
15k
Producing Creativity
orderedlist
PRO
346
40k
Design and Strategy: How to Deal with People Who Don’t "Get" Design
morganepeng
130
19k
Put a Button on it: Removing Barriers to Going Fast.
kastner
60
3.9k
Reflections from 52 weeks, 52 projects
jeffersonlam
351
20k
Understanding Cognitive Biases in Performance Measurement
bluesmoon
29
1.8k
Improving Core Web Vitals using Speculation Rules API
sergeychernyshev
17
940
Transcript
𝒌
◼ ⚫ ⚫ ◼ ⚫ (Zhang+, NAACL2018; Gu+, AAAI2018; Khandelwal+,
ICLR2021) ▶ (Nagao, 1984) ▶ ⚫ 𝑘 (Khandelwal+, ICLR2021) ▶ ▶ ▶ Guiding Neural Machine Translation with Retrieved Translation Pieces (Zhang+, NAACL2018) Search Engine Guided Neural Machine Translation (Gu+, AAAI2018) Nearest Neighbor Machine Translation (Khandelwal+, ICLR2021) A framework for a mechanical translation between Japanese and English by analogy principle (Nagao, 1984)
◼ ◼ ⚫ ⚫
𝒌 (Khandelwal+, ICLR2021) ◼ ⚫ ⚫ ⚫ ◼ ⚫ ▶
⚫ ▶ ≈ Nearest Neighbor Machine Translation (Khandelwal+, ICLR2021) 𝒙 𝒚
𝒌 (Khandelwal+, ICLR2021) 𝒌𝑖 ∈ ℝ𝐷 𝑓 𝒙, 𝒚<𝑡 ∈
ℝ𝐷 Nearest Neighbor Machine Translation (Khandelwal+, ICLR2021) ◼ 𝑘 ◼ ⚫ ⚫ 𝑝𝑘NN 𝑦𝑡 𝒙, 𝒚<𝑡 ∝ 𝑖=1 𝑘 𝟙𝑦𝑡=𝑣𝑖 exp − 𝒌𝑖 − 𝑓 𝒙, 𝒚<𝑡 2 2 𝜏 ◼ 𝑘
𝒌 ◼ (Martins+, EMNLP2022) ◼ (Meng+, ACLFindings2022) ⚫ 𝑘 𝑘
𝜆 = 0.5 𝑘 = 16 Chunk-based Nearest Neighbor Machine Translation (Martins+, EMNLP2022) Fast Nearest Neighbor Machine Translation (Meng+, ACL Findings2022)
𝒌 ◼ 𝑘 ◼ ⚫ 𝑘 (Matsui+, ACMMM2018) ⚫ 𝑘
𝑘 𝑘 Reconfigurable Inverted Index (Matsui+, ACMMM2018) 𝒌
◼ ⚫ 𝑘 ⚫ 𝑘 ◼ ◼ 𝑘
𝑛 𝑘 1 1 1 1 1 1 1 1
1
𝑛 𝑘 1 1 1 1 1 1 1 1
1
𝑛 𝑘 1 1 1 1 1 1 1 1
1
⚫ ⚫ ⚫ ⚫ ⚫ 𝑘 𝜆 = 0.5 𝑘
= 16 𝑛 = 56
𝑘 𝑘 ◼ 𝑘 ⚫ ▶ ⚫ ▶
◼ 𝑘 𝒌 𝒌
◼ ⚫ 𝑘
𝒌 𝒌 ◼ ⚫ ⚫ ◼ 𝑘 ⚫ ⚫ ◼
⚫
⚫ ⚫ ▶ ⚫ ▶