The State of the Art in Natural Language Processing and Deep Learning
tkng
January 15, 2016
Technology
Slides presented at the 4th JustTechTalk
Transcript
The State of the Art in Natural Language Processing and Deep Learning Hiroyuki Tokunaga JustTechTalk
An introduction to part of the state of the art in natural language processing and deep learning Hiroyuki Tokunaga (@tkng) JustTechTalk
About me: Hiroyuki Tokunaga • Twitter ID: @tkng • I work on natural language processing and image processing at SmartNews, Inc.
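What is natural language processing? • Deals with natural language (≠ programming languages) • Machine translation • Question answering • Document classification • Syntactic parsing / dependency parsing • Morphological analysis / word segmentation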
Machine translation example • The Word Lens feature of Google Translate (source: googletranslate.blogspot.jp)
Question answering example • IBM Watson • Beat human contestants on Jeopardy! (source: www.nytimes.com)
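What is deep learning? • ≈ neural networks • Its recent popularity has the following causes • Faster computers • More training data • Progress in research on optimization methods and related techniques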
The state of the art in natural language processing and deep learning
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention (Xu+, 2015) • Generates a descriptive sentence for an image http://kelvinxu.github.io/projects/capgen.html
What kind of method is Show, Attend and Tell? • A combination of the following three components • Convolutional Neural Network • Long Short Term Memory • Attention
Generating Images from Captions with Attention (Mansimov+, 2015) • Generates an image from a caption • If you squint, the result could pass for an airplane
Effective Approaches to Attention-based Neural Machine Translation (Luong+, 2015) • Machine translation using deep learning • Proposes a new method called Local Attention • Achieves state-of-the-art (best reported) results on several language pairs
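Ask Me Anything: Dynamic Memory Networks for Natural Language Processing (Kumar+, 2015) • Proposes a new model (Dynamic Memory Networks) • The model is essentially a combination of Recurrent Neural Networks • State of the art on question answering, part-of-speech tagging, coreference resolution, and sentiment analysis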
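The current state of deep learning in NLP • In terms of accuracy, it has not pulled far ahead of other methods • This differs from image processing and speech recognition • Machine translation and question answering can now be tackled with simple methods • Sentence generation has become possible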
What happens next? • Honestly, it is hard to say... • Research that combines language with images and video will probably increase
Keeping up with the state of the art
I will focus on three topics • Neural network basics • Recurrent Neural Network • In particular, the Gated Recurrent Unit • Attention
Neural network = function • A neural network can be thought of as a kind of function • Its inputs and outputs are vectors • It is differentiable
Start with a simple example y = f(x) = W x
Normalize the output to the range 0 to 1 • y = softmax(f(x))
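
The two slides above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not code from the talk: the weight matrix W, the input x, and the helper names linear and softmax are all made up for the example.

```python
import math

def linear(W, x):
    """f(x) = W x, with W given as a list of rows and x as a list of numbers."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]

def softmax(z):
    """Map a vector of scores to positive values that sum to 1."""
    m = max(z)  # subtracting the max keeps exp() from overflowing
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical weights: 3 output classes, 2 input features.
W = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]
x = [2.0, -1.0]

y = softmax(linear(W, x))
print(y)  # three values in (0, 1) that sum to 1
```

Each component of y lies between 0 and 1 and the components sum to 1, which is why the result can be read as a probability distribution over classes.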
Make it multi-layer • y = softmax(g(f(x)))
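
A sketch of the composed form follows. The slide does not say what f and g contain, so this example assumes f is a linear map followed by a ReLU and g is a second linear map; all weights are hypothetical. Some nonlinearity between the two is needed in practice, because two purely linear maps collapse into one: W2 (W1 x) = (W2 W1) x.

```python
import math

def linear(W, x):
    """Matrix-vector product W x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def relu(z):
    """Elementwise max(0, v); the nonlinearity assumed for this sketch."""
    return [max(0.0, v) for v in z]

def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def two_layer(W1, W2, x):
    h = relu(linear(W1, x))        # f(x): first layer
    return softmax(linear(W2, h))  # softmax(g(f(x))): second layer, then normalize

# Hypothetical weights: 2 inputs -> 2 hidden units -> 3 classes.
W1 = [[1.0, -1.0],
      [-1.0, 1.0]]
W2 = [[1.0, 0.0],
      [0.0, 1.0],
      [1.0, 1.0]]

y = two_layer(W1, W2, [2.0, -1.0])
print(y)
```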
Which part is the layer?
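Trivia • Be careful with the word "layer" • It is ambiguous which of the two it refers to (careful reading usually resolves it, but...) • In the major OSS frameworks it mostly refers to the function (Caffe, Torch, Chainer, TensorFlow)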
Recurrent Neural Network • A general term for networks that receive the elements of a time series one at a time and keep updating a state • Very popular recently (figure source: colah.github.io, "Understanding LSTMs")
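
The "receive one element, update the state" loop can be sketched as below. This assumes the plain tanh update h_t = tanh(W_xh x_t + W_hh h_{t-1} + b), which is one common choice and not necessarily the one shown in the talk (which singles out the Gated Recurrent Unit). All weights and names are hypothetical.

```python
import math

def rnn_step(W_xh, W_hh, b, x, h_prev):
    """One state update: h_t = tanh(W_xh x_t + W_hh h_{t-1} + b)."""
    return [
        math.tanh(
            sum(w * v for w, v in zip(W_xh[i], x))
            + sum(w * v for w, v in zip(W_hh[i], h_prev))
            + b[i]
        )
        for i in range(len(b))
    ]

def run_rnn(W_xh, W_hh, b, xs):
    """Consume a sequence one element at a time and return the final state."""
    h = [0.0] * len(b)  # initial state
    for x in xs:
        h = rnn_step(W_xh, W_hh, b, x, h)
    return h

# Hypothetical weights: input size 2, state size 2.
W_xh = [[0.5, -0.3], [0.1, 0.8]]
W_hh = [[0.2, 0.0], [-0.4, 0.3]]
b = [0.0, 0.0]

forward = run_rnn(W_xh, W_hh, b, [[1.0, 0.0], [0.0, 1.0]])
backward = run_rnn(W_xh, W_hh, b, [[0.0, 1.0], [1.0, 0.0]])
print(forward, backward)  # the state depends on the order of the inputs
```

The same function handles sequences of any length, and the final state always has the same size.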
Why are RNNs popular? • Handling variable-length data is hard • It has become clear that seq2seq models built on RNNs (also called Encoder/Decoder models) handle variable-length data well
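What is a seq2seq model? • Encode variable-length input data into a fixed-length vector, then decode the translated data from that vector • Research has advanced in recent years on tasks where input and output lengths differ, such as machine translation and automatic summarization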
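Translation with a seq2seq model [Figure: the encoder reads "This is a pen EOS"; the decoder then outputs "これ は ペン です EOS", taking each previous output word as its next input]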
Translation with a seq2seq model [Figure: the same diagram as on the previous slide, with a callout] "This is a pen" is being encoded into a fixed-length vector!
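
The encode-then-decode flow can be sketched as follows. This is a toy with untrained random weights, so the decoded token ids are meaningless; the point is only that inputs of different lengths end up as a state of the same size, and that the decoder feeds each output back in until it produces EOS. The plain tanh RNN, the greedy decoding, the use of EOS as the start symbol, and every name and size here are assumptions made for the example.

```python
import math
import random

def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def rnn_step(W_x, W_h, x, h):
    """Plain tanh RNN update: h_t = tanh(W_x x_t + W_h h_{t-1})."""
    return [math.tanh(a + b) for a, b in zip(matvec(W_x, x), matvec(W_h, h))]

def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def make_params(src_vocab, tgt_vocab, emb, hidden, seed=0):
    """Untrained, randomly initialized parameters (illustration only)."""
    rng = random.Random(seed)
    return {
        "hidden": hidden,
        "src_emb": random_matrix(src_vocab, emb, rng),  # one row per source token
        "tgt_emb": random_matrix(tgt_vocab, emb, rng),  # one row per target token
        "enc_Wx": random_matrix(hidden, emb, rng),
        "enc_Wh": random_matrix(hidden, hidden, rng),
        "dec_Wx": random_matrix(hidden, emb, rng),
        "dec_Wh": random_matrix(hidden, hidden, rng),
        "W_out": random_matrix(tgt_vocab, hidden, rng),
    }

def encode(params, token_ids):
    """Fold a variable-length sequence into one fixed-length state vector."""
    h = [0.0] * params["hidden"]
    for t in token_ids:
        h = rnn_step(params["enc_Wx"], params["enc_Wh"], params["src_emb"][t], h)
    return h

def decode(params, h, eos_id, max_len=10):
    """Greedy decoding: feed each predicted token back in until EOS appears."""
    output = []
    prev = eos_id  # EOS doubles as the start symbol here (an assumption)
    for _ in range(max_len):
        h = rnn_step(params["dec_Wx"], params["dec_Wh"], params["tgt_emb"][prev], h)
        scores = matvec(params["W_out"], h)
        prev = max(range(len(scores)), key=lambda i: scores[i])
        if prev == eos_id:
            break
        output.append(prev)
    return output

EOS = 0
params = make_params(src_vocab=6, tgt_vocab=6, emb=3, hidden=4)
short = encode(params, [1, 2, EOS])
longer = encode(params, [1, 2, 3, 4, 5, EOS])
print(len(short), len(longer))  # both are 4: the encoding has a fixed length
print(decode(params, longer, EOS))
```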
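What happens when a seq2seq model is used for translation? • It is known to work quite well • However, it has the following weaknesses • It is weak on long sentences • Proper nouns get swapped • Attention, explained next, addresses these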
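What is attention? • A mechanism for referring to just a little of the encoder-side information at decoding time • "Just a little" = look only at the selected parts • There are two kinds: Global Attention and Local Attention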
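Global Attention • The attention is the weighted sum of the candidate states • Historically, this is the slightly older approach [Figure: encoder input "This is a pen EOS"; the decoder step for "これ" refers to all of the encoder states]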
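
The weighted sum can be sketched as below. The slide does not say how the weights are computed, so this example assumes a dot-product score between the current decoder state and each encoder state, normalized with softmax; the states and the function name global_attention are hypothetical.

```python
import math

def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def global_attention(decoder_state, encoder_states):
    """Return (weights, context), where context is the weighted sum of ALL encoder states."""
    # Assumed scoring function: dot product of decoder state and encoder state.
    scores = [sum(d * e for d, e in zip(decoder_state, h)) for h in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [
        sum(w * h[i] for w, h in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context

# Hypothetical states: three encoder positions, state size 2.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [2.0, 0.0]

weights, context = global_attention(decoder_state, encoder_states)
print(weights)  # one weight per encoder position, summing to 1
print(context)  # same size as a single encoder state
```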
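Local Attention • Select a few of the encoder states and use only those [Figure: encoder input "This is a pen EOS"; the decoder step for "これ" refers to a few selected encoder states]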
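
One way to "select a few states" is a fixed window around a chosen source position, sketched below. The window rule, the dot-product score, and the parameters center and half_width are assumptions for illustration; how the center is chosen is left to the caller.

```python
import math

def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def local_attention(decoder_state, encoder_states, center, half_width):
    """Attend only to encoder states in [center - half_width, center + half_width]."""
    lo = max(0, center - half_width)
    hi = min(len(encoder_states), center + half_width + 1)
    window = encoder_states[lo:hi]  # the selected subset of states
    scores = [sum(d * e for d, e in zip(decoder_state, h)) for h in window]
    weights = softmax(scores)
    dim = len(window[0])
    context = [sum(w * h[i] for w, h in zip(weights, window)) for i in range(dim)]
    return (lo, hi), weights, context

# Hypothetical states: five encoder positions, state size 2.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5], [0.0, 0.0]]
decoder_state = [1.0, 1.0]

span, weights, context = local_attention(decoder_state, encoder_states, center=2, half_width=1)
print(span, weights, context)  # only positions 1, 2 and 3 contribute
```

Compared with a sum over every encoder state, the cost per decoding step depends on the window size instead of the source length.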
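The order of attention • The order in which to attend is a hard problem for both Global Attention and Local Attention • The attention order is sometimes itself learned with an RNN • Simply attending in order from the front already improves performance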
Experimental results: WMT'14
Experimental results: WMT'15
Examples of actual translations
Summary so far • An explanation of basic neural networks • Recurrent Neural Network • Attention
What I did not cover today • Training (back propagation, minibatch) • Loss functions (log loss, cross entropy loss) • Regularization techniques • dropout, batch normalization • Optimization techniques • RMSProp, AdaGrad, Adam • Various activation functions • (Very) Leaky ReLU, Maxout
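Recommendations going forward • Try running an experiment yourself • A simple example is fine; get something running first • Once it runs, try modifying it yourself • The important thing is to get hands-on • Do not start with something that is too hard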
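Keeping up with the latest information (1) • Follow people who post about machine learning and related topics on Twitter • For a start, @hillbig • There are many other good people to follow • Some accounts are dubious, so be careful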
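Keeping up with the latest information (2) • Read papers • Be aware that some papers are a waste of time to read • At first, it is best to stick to papers accepted at well-known conferences (ACL, EMNLP, ICML, NIPS, KDD, etc.)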
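Keeping up with the latest information (3) • Pay attention to paper authors • As you read papers, you will find a few authors whose papers interest you • Find some way to keep track of their new papers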
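Take home messages • While your enthusiasm is high, run various experiments with your own hands • Chainer is recommended for beginners • The latest information is available online • Beware of people whose ambition points in odd directions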