自然言語処理と深層学習の最先端 (State of the Art in Natural Language Processing and Deep Learning)
tkng
January 15, 2016
Slides for the 4th JustTechTalk.
Transcript
State of the Art in NLP and Deep Learning — Hiroyuki Tokunaga, JustTechTalk
State of the Art in NLP and Deep Learning — I will introduce a part of it. Hiroyuki Tokunaga (@tkng), JustTechTalk
About me: Hiroyuki Tokunaga • Twitter ID: @tkng • I work on natural language processing and image processing at SmartNews, Inc.
What is natural language processing? • It deals with natural languages (≠ programming languages) • Machine translation • Question answering • Document classification • Syntactic parsing / dependency parsing • Morphological analysis / word segmentation
An example of machine translation • The Word Lens feature of Google Translate (http://googletranslate.blogspot.jp/…/hallo-hola-ola-to-new-more-powerful….html)
An example of question answering • IBM Watson • Beat human champions at Jeopardy! (http://www.nytimes.com/…/science/…jeopardy-watson.html)
What is deep learning? • ≒ neural networks • Its recent popularity is due to the following reasons: • Improved computer performance • More training data • Advances in optimization methods and other research
State of the Art in NLP and Deep Learning
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention (Xu+, 2015) • Generates descriptive captions for images (http://kelvinxu.github.io/projects/capgen.html)
Show, Attend and Tell — what kind of method is it? • A combination of the following three components: • Convolutional Neural Network • Long Short-Term Memory • Attention
Generating Images from Captions with Attention (Mansimov+, 2015) • Generates images from captions • If you inspect the details, the result does more or less look like an airplane
Effective Approaches to Attention-based Neural Machine Translation (Luong+, 2015) • Machine translation using deep learning • Proposes a new method called local attention • Achieves state of the art (the best published results) on several language pairs
Ask Me Anything: Dynamic Memory Networks for Natural Language Processing (Kumar+, 2015) • Proposes a new model, the Dynamic Memory Network • The model looks like a combination of recurrent neural networks • State of the art on question answering, part-of-speech tagging, coreference resolution, and sentiment analysis
The current state of deep learning in NLP • In terms of accuracy, it has not opened up a large gap over other methods • This differs from image processing and speech recognition • Machine translation and question answering have become solvable with simple methods • Sentence generation has become possible
What happens next? • Honestly, I don't really know… • Research combining text with images and video will likely increase
To keep up with the state of the art
I will focus on three topics • The basics of neural networks • Recurrent neural networks • Especially the Gated Recurrent Unit • Attention
A neural network = a function • A neural network can be viewed as a certain kind of function • Its inputs and outputs are vectors • It is differentiable
Start with a simple example: y = f(x) = Wx
Normalize the output to the range 0–1 • y = softmax(f(x))
Let's make it multi-layer • y = softmax(g(f(x)))
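The stacked form y = softmax(g(f(x))), with f and g linear, can be sketched in NumPy (the weight matrices W1, W2, their sizes, and the input x are illustrative assumptions, not values from the slides):

```python
import numpy as np

def softmax(z):
    # shift by the max for numerical stability; the result sums to 1
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(0)

# f(x) = W1 @ x and g(h) = W2 @ h: two linear maps stacked
W1 = rng.standard_normal((4, 3))  # 3-dim input -> 4-dim hidden
W2 = rng.standard_normal((2, 4))  # 4-dim hidden -> 2-dim output

x = np.array([1.0, -0.5, 2.0])
y = softmax(W2 @ (W1 @ x))        # y = softmax(g(f(x)))
print(y)                          # a probability vector over 2 outputs
```

In practice each layer also adds a bias and a nonlinearity between the linear maps; they are omitted here to keep the sketch as close as possible to the formula on the slide.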
Which part is a "layer"?
A bit of trivia • Be careful with the word "layer" • It is ambiguous which part it refers to (you can tell from context if you read carefully…) • In the major open-source frameworks it usually refers to a function (Caffe, Torch, Chainer, TensorFlow)
Recurrent Neural Network • The general term for networks that receive the elements of a sequence one at a time, updating an internal state at each step • Very popular recently (http://colah.github.io/posts/…Understanding-LSTMs/)
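The state-update loop that defines an RNN can be sketched in NumPy (the tanh update rule and the names W_h, W_x are one common choice, assumed here purely for illustration):

```python
import numpy as np

def rnn_step(h, x, W_h, W_x):
    # one update: the new state mixes the previous state and the current input
    return np.tanh(W_h @ h + W_x @ x)

rng = np.random.default_rng(0)
W_h = 0.1 * rng.standard_normal((5, 5))  # state -> state weights
W_x = 0.1 * rng.standard_normal((5, 3))  # input -> state weights

h = np.zeros(5)  # initial state
for x in [rng.standard_normal(3) for _ in range(4)]:  # a 4-element sequence
    h = rnn_step(h, x, W_h, W_x)
print(h)  # the final state summarizes everything seen so far
```

The same loop works for a sequence of any length, which is exactly why RNNs suit variable-length data.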
Why are RNNs so popular? • Variable-length data is hard to handle • It has become clear that seq2seq models built from RNNs (also called encoder/decoder models) can handle variable-length data well
What is the seq2seq model? • Encode the variable-length input into a fixed-length vector, then decode the translated output from that vector • Research has recently advanced on tasks where input and output lengths differ, such as machine translation and automatic summarization
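A minimal sketch of the encode-to-fixed-vector / decode idea, assuming untrained random weights and plain tanh RNN cells purely for illustration (a real model is trained and uses word embeddings and LSTM/GRU cells):

```python
import numpy as np

rng = np.random.default_rng(0)
H, D = 6, 4  # hidden state size, token vector size (illustrative)
W_h = 0.1 * rng.standard_normal((H, H))
W_x = 0.1 * rng.standard_normal((H, D))
W_y = 0.1 * rng.standard_normal((D, H))  # decoder output projection

def encode(xs):
    # fold a variable-length sequence into ONE fixed-length state vector
    h = np.zeros(H)
    for x in xs:
        h = np.tanh(W_h @ h + W_x @ x)
    return h

def decode(h, steps):
    # unroll from the fixed vector; feed each output back as the next input
    ys, y = [], np.zeros(D)
    for _ in range(steps):
        h = np.tanh(W_h @ h + W_x @ y)
        y = W_y @ h
        ys.append(y)
    return ys

src = [rng.standard_normal(D) for _ in range(7)]  # a 7-token "sentence"
h = encode(src)          # always H-dimensional, whatever the input length
out = decode(h, steps=5) # a 5-token "translation"
print(h.shape, len(out))
```

Note how the 7-token input and the 5-token output meet only through the single fixed-length vector h — the bottleneck that the following slides discuss.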
Translation with a seq2seq model (diagram): This / is / a / pen / EOS → これ / は / ペン / です / EOS
Translation with a seq2seq model (diagram): This / is / a / pen / EOS → これ / は / ペン / です / EOS — "This is a pen" is encoded into a fixed-length vector!
What if we use the seq2seq model for translation? • It is known to work quite well • But it has weaknesses like the following: • Weak on long sentences • Proper nouns get swapped • Attention, explained next, addresses these problems
What is attention? • A mechanism for referring back, at decoding time, to a little of the information from encoding time • "A little" = looking only at a selected part • There are global attention and local attention
Global attention • Takes a weighted sum of the candidate states as the attention • Historically this variant is slightly older (diagram: This / is / a / pen / EOS → これ)
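The weighted sum over encoder states that global attention computes can be sketched in NumPy (dot-product scoring is assumed here for simplicity; the papers also use other scoring functions, and all sizes are illustrative):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(0)
enc_states = rng.standard_normal((4, 5))  # one state per source token
dec_state = rng.standard_normal(5)        # current decoder state

# score every encoder state against the decoder state, then normalize
weights = softmax(enc_states @ dec_state)
# the context vector is a weighted sum over ALL encoder states
context = weights @ enc_states
print(weights, context.shape)
```

Local attention differs in that it would compute this sum over only a selected window of encoder states rather than all of them.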
Local attention • Selects and uses only a few of the encoder-side states (diagram: This / is / a / pen / EOS → これ)
The order of attention • The order in which attention is applied differs between global attention and local attention • The attention order can also be learned with an RNN • Simply applying attention in order from the front already improves performance
Experimental results: WMT'14 (table omitted)
Experimental results: WMT'15 (table omitted)
Examples of actual translations (examples omitted)
Summary so far • The basics of neural networks • Recurrent neural networks • Attention
Things I did not cover today • Training (back propagation, minibatches) • Loss functions (log loss, cross-entropy loss) • Regularization techniques • dropout, batch normalization • Optimization techniques • RMSProp, AdaGrad, Adam • The various activation functions • (Very) Leaky ReLU, Maxout
Recommendations for what to do next • Try some experiments yourself • A simple example is fine; get something running first • Once it runs, modify it yourself • What matters most is getting your hands dirty • Don't reach for something too difficult right away
Antennas for the latest information (1) • On Twitter, follow people who post about machine learning and related topics • Start with @hillbig • There are many other good people too • But beware: there are also some dubious people
Antennas for the latest information (2) • Read papers • Beware: some papers are a pure waste of reading time • At first, it is best to stick to papers accepted at well-known conferences (ACL, EMNLP, ICML, NIPS, KDD, etc.)
Antennas for the latest information (3) • Pay attention to paper authors • As you read papers, a few authors whose work you find interesting will emerge • Find some way to keep checking for their new papers
Take-home messages • While your enthusiasm is high, run various experiments with your own hands • For beginners, Chainer is recommended • The latest information can be obtained online • Beware of people whose ambition points in strange directions