Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
EMNLP2015読み会:Effective Approaches to Attention-...
Search
tkng
October 24, 2015
Research
4.1k
2
Share
EMNLP2015読み会:Effective Approaches to Attention-based Neural Machine Translation
tkng
October 24, 2015
More Decks by tkng
See All by tkng
LSTMを用いた自然言語処理について
tkng
3
3.7k
自然言語処理と深層学習の最先端
tkng
16
7.7k
basis-of-optimization.pdf
tkng
1
1.4k
Other Decks in Research
See All in Research
老舗ものづくり企業でリサーチが変革を起こすまで - 三菱重工DXの実践
skydats
0
130
AIスーパーコンピュータにおけるLLM学習処理性能の計測と可観測性 / AI Supercomputer LLM Benchmarking and Observability
yuukit
1
820
それ、チームの改善になってますか?ー「チームとは?」から始めた組織の実験ー
hirakawa51
0
1.1k
SREはサイバネティクスの夢をみるか? / Do SREs Dream of Cybernetics?
yuukit
3
480
Ankylosing Spondylitis
ankh2054
0
160
生成的情報検索時代におけるAI利用と認知バイアス
trycycle
PRO
0
500
明日から使える!研究効率化ツール入門
matsui_528
11
6.1k
AIを叩き台として、 「検証」から「共創」へと進化するリサーチ
mela_dayo
0
240
言語モデルから言語について語る際に押さえておきたいこと
eumesy
PRO
5
2.1k
さくらインターネット研究所テックトーク2026春、研究開発Gr.25年度成果26年度方針
kikuzo
0
120
An Open and Reproducible Deep Research Agent for Long-Form Question Answering
ikuyamada
0
410
空間音響処理における物理法則に基づく機械学習
skoyamalab
0
270
Featured
See All Featured
Getting science done with accelerated Python computing platforms
jacobtomlinson
2
180
Lightning Talk: Beautiful Slides for Beginners
inesmontani
PRO
1
520
Collaborative Software Design: How to facilitate domain modelling decisions
baasie
0
200
End of SEO as We Know It (SMX Advanced Version)
ipullrank
3
4.1k
Design of three-dimensional binary manipulators for pick-and-place task avoiding obstacles (IECON2024)
konakalab
0
400
Leveraging LLMs for student feedback in introductory data science courses - posit::conf(2025)
minecr
1
230
Rebuilding a faster, lazier Slack
samanthasiow
85
9.5k
Scaling GitHub
holman
464
140k
Chrome DevTools: State of the Union 2024 - Debugging React & Beyond
addyosmani
10
1.1k
No one is an island. Learnings from fostering a developers community.
thoeni
21
3.7k
The browser strikes back
jonoalderson
0
970
HDC tutorial
michielstock
2
620
Transcript
Effective Approaches to Attention-based Neural Machine Translation Authors: Minh-Thang LuongɹHieu
PhamɹChristopher D. Manning ಡΉਓ: ಙӬ೭ ਤશͯ͜ͷจ͔ΒҾ༻ &./-1ಡΈձ
ࣗݾհɿಙӬ೭ • Twitter ID: @tkng • εϚʔτχϡʔεגࣜձࣾͰNLPͬͯ·͢
ࠓͷจʁ • Effective Approaches to Attention-based Neural Machine Translation •
ڈ͙Β͍͔ΒྲྀߦΓ࢝Ίͨseq2seqܥͷख ๏ͷ֦ு
Seq2seq modelͱʁ • Encoder/Decoder modelͱݴ͏ • ༁ݩͷจΛݻఆͷϕΫτϧʹΤϯίʔυ ͯ͠ɺ͔ͦ͜Β༁ޙͷจΛσίʔυ͢Δ • ՄมͷσʔλऔΓѻ͍͕͍͠ͷͰɺ
͑ͯݻఆʹͯ͠͠·͏ͱ͍͏ൃ
Ͳ͏ͬͯݻఆʹΤϯίʔυ ͢Δͷʁ • recurrent neural networkΛ͏ • http://colah.github.io/posts/2015-08-Understanding-LSTMs/ • http://kaishengtai.github.io/static/slides/treelstm-acl2015.pdf
• LSTM = recurrent neural networkͷҰछ
Seq2seqϞσϧͰͷ༁
Seq2seq·ͰͷಓͷΓ (1) • Recurrent Continuous Translation Models (EMNLP2013) • Learning
Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation (EMNLP2014)
Seq2seq·ͰͷಓͷΓ (2) • Sequence to Sequence Learning with Neural Networks
(NIPS2014) • ൺֱతγϯϓϧͳStacked LSTM͕ྑ͍ੑೳΛ ࣔ͢͜ͱ͕࣮ݧͰࣔ͞Εͨ • ϏʔϜαʔνɺٯॱͰͷೖྗɺΞϯαϯϒϧ ͷ3छྨͷ͕ೖ͍ͬͯΔ
Seq2seqϞσϧͷऑ • จʹऑ͍ • ݻ༗໊ࢺ͕ೖΕସΘΔ
AttentionʹΑΔվળ [Bahdanau+ 2015] • DecodeͷࡍͷContextʹEncodeͷࡍͷ֤࣌ࠁ ʹ͓͚ΔӅΕঢ়ଶͷॏΈ͖Λ༻͍Δ • ॏΈࣗମRNNͰܭࢉ͢Δ
ࠓճͷจͷߩݙ • ৽͍͠attention (local attention) ΛఏҊͨ͠ • ༁ݩจʹ͓͍ͯɺҐஔɹ͔ΒલޙD୯ޠ ͷӅΕঢ়ଶͷॏΈ͖ΛऔΔ •
ॏΈͷܭࢉglobal attentionͷ߹ͱಉ༷ • ɹ1ͭͣͭਐΊ͍ͯ͘߹ʢlocal-mʣ ͱɺ͜ΕࣗମRNNʹ͢Δ߹ʢlocal- pʣͷ2ͭΛ࣮ݧ͍ͯ͠Δ pt pt
local attention
local attentionͷҹ • ޠॱ͕ࣅ͍ͯΔݴޠؒͰͷ༁ͳΒɺ໌Β͔ ʹ͜ͷํ͕ྑͦ͞͏ • ӳΈ͍ͨʹޠॱ͕େ͖͘ҧ͏߹ɺ Ґஔɹͷਪఆࣗମ͕͍͠λεΫʹͳͬͪΌ ͍ͦ͏… pt
࣮ݧ݁ՌɿWMT'14
࣮ݧ݁ՌɿWMT'14 • Α͘ݟΔͱɺlocal attentionͰͷੑೳ্ +0.9ϙΠϯτ • ଞͷςΫχοΫͰՔ͍ͰΔϙΠϯτ͕ଟ͍
࣮ݧ݁ՌɿWMT'15
͍͔ͭ͘༁αϯϓϧ
·ͱΊ • Seq2seqϞσϧͷ֦ுͱͯ͠ɺlocal attention ΛఏҊͨ͠ • ఏҊख๏͍͔ͭ͘ͷ࣮ݧʹ͓͍ͯɺState of the artͷੑೳΛୡͨ͠
ײ • Local attentionΛඍ • ྨࣅ͢Δख๏ͱ۩ମతʹͲ͏ҧ͏͔͕໌շʹ ॻ͔Ε͓ͯΓɺಡΈ͔ͬͨ͢ • AttentionΛཧղͰ͖ͯΑ͔ͬͨʢখฒײʣ