Slide 1

Japanese Language Processing Using Deep Neural Networks
Mamoru Komachi, Faculty of System Design, Tokyo Metropolitan University
2016/06/30, JSAI 71st Artificial Intelligence Seminar: "How Deep Learning Works and Its Applications to Natural Language Processing"

Slide 2

Self-introduction: Mamoru Komachi (こまちまもる)
- 2005.03 B.A., University of Tokyo, College of Arts and Sciences, History and Philosophy of Science
- 2010.03 Ph.D. (Engineering), Nara Institute of Science and Technology (NAIST); specialty: natural language processing
- 2010.04–2013.03 Assistant Professor, NAIST (Yuji Matsumoto Lab)
- 2013.04– Associate Professor, Tokyo Metropolitan University (Natural Language Processing Lab)

Slide 3

NLP × deep learning lab map (alumni of the Tsujii Lab at the University of Tokyo and the Matsumoto Lab at NAIST):
- Tohoku University: Inui–Okazaki Lab
- University of Tokyo: Tsuruoka Lab
- Tokyo Metropolitan University: Komachi Lab
- Toyota Technological Institute: Computational Intelligence Lab
- NAIST: Matsumoto Lab

Slide 4

Deep learning @ TMU
- Machine translation (4 students)
- Dialogue (3 students)
- Sentiment analysis / document summarization (2 students)
- Morphological analysis (2 students)
- Distributed representations (1 student)
*The photo is for illustration purposes only.

Slide 5

Deep learning in natural language processing
- Representation learning: learn useful feature representations, with as little supervision as possible (word2vec, GloVe, ...) → not covered today
- Deep neural network architectures: (basically supervised) do things that conventional NLP could not → the topic of today's talk

Slide 6

"Vertically" deep learning
- Feedforward NNs, CNNs, ...
- Depth runs through the stack: input layer → hidden layers → output layer
(Running example sentence: 「深層学習マジやばい」, "deep learning is seriously cool" 😍)

Slide 7

Sentiment polarity classification with a CNN
- Word embedding table built with word2vec from 1-hot vectors
- Convolution layers using various window widths and features
- Pooling layer, then a fully connected output layer
- Yields a distributed representation of the sentence (k words, each an n-dimensional vector)
- Yoon Kim. Convolutional Neural Networks for Sentence Classification. EMNLP 2014.
😀 No feature-template engineering required; language-independent
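As a rough illustration of this architecture (a sketch, not the paper's implementation), the following NumPy fragment convolves a single window width over toy random embeddings, max-pools over time, and classifies; all sizes, weights, and names are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

k, n, h, f = 7, 4, 3, 5              # sentence length, embedding dim, window width, filters
X = rng.standard_normal((k, n))      # word embeddings for one sentence (k words, n dims)
W = rng.standard_normal((f, h * n))  # one convolution filter per row, spanning h words
b = rng.standard_normal(f)

# Convolution: slide a window of h consecutive words over the sentence.
windows = np.stack([X[i:i + h].ravel() for i in range(k - h + 1)])  # (k-h+1, h*n)
feature_maps = np.tanh(windows @ W.T + b)                           # (k-h+1, f)

# Max-over-time pooling gives a fixed-length sentence vector
# regardless of the sentence length k.
sentence_vec = feature_maps.max(axis=0)                             # (f,)

# Fully connected output layer for binary polarity classification.
Wo = rng.standard_normal((2, f))
logits = Wo @ sentence_vec
probs = np.exp(logits) / np.exp(logits).sum()                       # (pos, neg)
```

In the actual model, multiple window widths are used in parallel and their pooled features are concatenated before the output layer.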

Slide 8

"Horizontally" deep learning
- Recurrent NNs
- The network is unrolled along the word sequence (深層学習 / マジ / やばい), so depth runs along the time axis: input layer, hidden layer, and output layer at every step.

Slide 9

Language modeling with an RNN
- Mikolov et al. Recurrent Neural Network Based Language Model. Interspeech 2010 → work by the author of word2vec, predating w2v
- Widely used in speech recognition, machine translation, etc.
- No hard boundary on the context it uses; no explicit classes such as parts of speech
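A minimal sketch of the idea (toy random weights, not Mikolov et al.'s implementation): the hidden state folds in the entire left context with no fixed window, and a softmax over the vocabulary predicts the next word.

```python
import numpy as np

rng = np.random.default_rng(0)
V, d = 10, 8                             # vocabulary size, hidden size
E = rng.standard_normal((V, d)) * 0.1    # word embeddings
W = rng.standard_normal((d, d)) * 0.1    # input-to-hidden weights
U = rng.standard_normal((d, d)) * 0.1    # hidden-to-hidden weights (carry the context)
O = rng.standard_normal((V, d)) * 0.1    # hidden-to-output weights

def step(h, word_id):
    """One RNN step: fold the next word into the hidden state, predict the next word."""
    h = np.tanh(E[word_id] @ W + h @ U)
    logits = O @ h
    p = np.exp(logits - logits.max())
    return h, p / p.sum()                # distribution over the next word

h = np.zeros(d)
for w in [3, 1, 4]:                      # a toy word-id sequence
    h, p_next = step(h, w)
```

Note that `h` depends on every word seen so far, which is exactly why no explicit context boundary is needed.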

Slide 10

With deep learning, language can be generated from multimodal input
- Caption generation from an image alone
http://deeplearning.cs.toronto.edu/i2t
http://googleresearch.blogspot.jp/2014/11/a-picture-is-worth-thousand-coherent.html

Slide 11

"Diagonally" deep learning
- Recursive NNs
- The phrase vector for 「マジやばい」 ("seriously cool") is composed recursively from its word vectors
- The sentence vector 😍 is then built by composing phrase vectors
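The recursive composition can be sketched as follows; the composition matrix `W` and the toy word vectors are invented for illustration, and a trained model would learn `W` from data.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4                                    # all node vectors share one dimensionality
W = rng.standard_normal((d, 2 * d)) * 0.5
b = np.zeros(d)

def compose(left, right):
    """Combine two child vectors into one parent vector of the same size."""
    return np.tanh(W @ np.concatenate([left, right]) + b)

# Toy word vectors for 深層学習 / マジ / やばい.
dl, maji, yabai = (rng.standard_normal(d) for _ in range(3))

phrase = compose(maji, yabai)            # phrase vector for マジやばい
sentence = compose(dl, phrase)           # sentence vector, built bottom-up
```

Because parent and child vectors have the same size, the same `compose` can be applied at every node of the tree, which is what makes the network "diagonally" deep.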

Slide 12

Image recognition and syntactic parsing with recursive neural networks
- Socher et al. Parsing Natural Scenes and Natural Language with Recursive Neural Networks. ICML 2011.
- Recognizes structure recursively from adjacent image regions / words
- → integrated into the Stanford Parser (ACL 2013)

Slide 13

Deep neural network architectures in language processing
- Multilayer neural networks = feedforward NNs, CNNs, stacked autoencoders, ... (recognition)
- RNNs (in the broad sense):
  - Recurrent neural networks = recurrent NNs in the narrow sense, LSTM, GRU, ... (recognition + generation)
  - Recursive neural networks = recursive NNs, Tree-LSTM, ... (recognition)

Slide 14

Today's outline
- Deep neural network architectures
- Case studies in deep learning for Japanese language processing
  - Classification task: sentiment polarity classification
  - Generation task: machine translation
- A message for those who want to get started with deep learning

Slide 15

Classification task: sentiment polarity classification
- Classify whether a sentence is positive or negative
- Research question: which cues tell us whether a sentence is positive or negative?

Slide 16

The polarity of a sentence can be computed recursively from its phrases (subtrees)
- Nakagawa et al. Dependency Tree-based Sentiment Classification using CRFs with Hidden Variables. HLT-NAACL 2010.
- The same kind of structure we saw a moment ago!
😀 Can exploit syntactic information
😩 Requires designing feature templates

Slide 17

Sentiment polarity classification with recursive neural networks
- Socher et al. Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank. EMNLP 2013.
😀 No feature-template engineering required
😩 Requires data annotated with positive/negative tags at the phrase level

Slide 18

Japanese sentiment polarity classification with a stacked autoencoder
- Zhang and Komachi. Japanese Sentiment Classification with Stacked Denoising Auto-Encoder using Distributed Word Representation. PACLIC 2015.
- Fully connected layers throughout: input layer → three hidden layers → output layer
😀 No feature-template engineering required; language-independent
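A hedged sketch of one denoising layer of the kind stacked in such a model (training loop and stacking omitted; all names and sizes are illustrative, not the paper's code): corrupt the input, encode it, and score the reconstruction against the clean input.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid = 6, 3
W = rng.standard_normal((n_hid, n_in)) * 0.1   # encoder weights (decoder tied as W.T)
b, c = np.zeros(n_hid), np.zeros(n_in)

def denoise_forward(x, drop=0.3):
    """One denoising-autoencoder layer: corrupt, encode, reconstruct the clean input."""
    mask = rng.random(n_in) >= drop            # randomly zero out input dimensions
    x_tilde = x * mask
    h = np.tanh(W @ x_tilde + b)               # encoding (input to the next stacked layer)
    x_hat = W.T @ h + c                        # reconstruction of the uncorrupted x
    loss = ((x - x_hat) ** 2).mean()           # reconstruction error to be minimized
    return h, loss

x = rng.standard_normal(n_in)                  # e.g. a sentence vector from embeddings
h, loss = denoise_forward(x)
```

Stacking means training one such layer, then feeding its `h` as the input of the next layer, and finally adding a supervised output layer on top.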

Slide 19

For Japanese sentiment polarity classification, the stacked denoising autoencoder (SdA) is the most effective
Legend: MFS = most frequent tag (neg); Tree-CRF = Nakagawa et al. (2010); LogRes = logistic regression; BoF = sentence vectors from bag-of-features; w2v = sentence vectors from averaged word2vec embeddings

Slide 20

Open issues in classification
- Analysis: trained models are hard to interpret → visualization via attention
- Task design: the number of classes may be unknown (e.g. coreference resolution) → Vinyals et al. Pointer Networks. NIPS 2015.

Slide 21

Today's outline
- Deep neural network architectures
- Case studies in deep learning for Japanese language processing
  - Classification task: sentiment polarity classification
  - Generation task: machine translation
- A message for those who want to get started with deep learning

Slide 22

Generation task: machine translation
- Translate a sentence written in one language into another language
- Research questions: how do we solve an extremely-many-class classification problem? How should alignment be modeled?
https://twitter.com/haldaume3/status/732333418038562816

Slide 23

Statistical learning from a parallel corpus
- Noisy channel model (in practice, replaced by its log-linear generalization):

  ê = argmax_e P(e | f) = argmax_e P(f | e) P(e)

- Pipeline: from the parallel corpus (source language f, target language e), ① alignment and ② rule extraction yield the translation model P(f | e); a raw target-language corpus yields the language model P(e); ③ decoding computes the argmax; ④ tuning optimizes toward BLEU on a development corpus (reference translations).
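The decision rule above can be illustrated with hypothetical candidates and toy probabilities (the strings and numbers below are invented; a real decoder searches over an enormous hypothesis space rather than a fixed list):

```python
import math

# Hypothetical candidate translations e for one fixed source sentence f,
# with made-up translation-model and language-model probabilities.
candidates = {
    "DL is really cool": {"p_f_given_e": 0.30, "p_e": 0.20},
    "DL is real cool":   {"p_f_given_e": 0.35, "p_e": 0.05},
    "cool really is DL": {"p_f_given_e": 0.30, "p_e": 0.001},
}

# argmax_e P(f|e) P(e), computed in log space for numerical stability.
best = max(candidates,
           key=lambda e: math.log(candidates[e]["p_f_given_e"])
                       + math.log(candidates[e]["p_e"]))
print(best)  # → "DL is really cool"
```

Note how the language model P(e) overrules a slightly better translation-model score to pick the fluent candidate, which is the point of the noisy-channel decomposition.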

Slide 24

Language generation by sequence-to-sequence transduction with two combined RNNs
- Encoder-decoder approach
- The encoder builds a sentence vector from the source-language word vectors (深層学習 / マジ / やばい); the decoder then generates the target words ("DL is really cool").

Slide 25

Decoding while attending to the input-output correspondence
- Attention = a weighted sum of the encoder-side hidden states
- Rather than compressing the sentence into a single vector, the decoder "attends" to which part of the input to use at each step.
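This weighted sum can be sketched with dot-product scores over toy encoder states (shapes and values are illustrative; real systems also use learned scoring functions):

```python
import numpy as np

rng = np.random.default_rng(0)
d, src_len = 4, 3
H = rng.standard_normal((src_len, d))   # encoder hidden states, one per source word
s = rng.standard_normal(d)              # current decoder hidden state

# Dot-product scores, then softmax, give attention weights over source positions.
scores = H @ s
a = np.exp(scores - scores.max())
a /= a.sum()                            # a[i] = how much to "attend" to source word i

# The context vector fed to the decoder is the attention-weighted sum of encoder states.
context = a @ H
```

The weights `a` change at every decoding step, so each target word can draw on a different part of the source sentence.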

Slide 26

Neural MT is overtaking existing statistical machine translation
- Luong et al. Effective Approaches to Attention-based Neural Machine Translation. EMNLP 2015.

Slide 27

Attention works not only over sequences but also over tree structures
- Eriguchi et al. Tree-to-Sequence Attentional Neural Machine Translation. ACL 2016.
- Attends to the encoder's phrase structure, realizing syntax-aware neural English-to-Japanese translation

Slide 28

Open issues in generation
- Unknown words: generation is framed as classification, so unknown words cannot be produced → unknown-word handling, CopyNet, ...
- Evaluation: word-level cross-entropy overfits → optimize directly for the metric you ultimately want to maximize (e.g. BLEU), curriculum learning, ...

Slide 29

Today's outline
- Deep neural network architectures
- Case studies in deep learning for Japanese language processing
  - Classification task: sentiment polarity classification
  - Generation task: machine translation
- A message for those who want to get started with deep learning

Slide 30

Why does deep learning work? ① Brute-force large-scale data with compute
- Tasks where it works (or appears to) generally have large-scale data, on the order of millions of sentences in corpus terms → task settings where "overfitting" is acceptable
- Advances in optimization and regularization algorithms, plus progress in techniques for handling large-scale data → "overfit" without actually overfitting

Slide 31

Is a GPU necessary for deep learning?
- It is often several times faster than a CPU or more, so buying one card to start with is recommended (a gaming PC runs from about 200,000 yen)
- When the network structure is variable, as is common in language processing, the speedup can be smaller...
https://flic.kr/p/rNx69d

Slide 32

Why does deep learning work? ② From the discrete world to the continuous world
- Klein and Manning. Accurate Unlexicalized Parsing. ACL 2003.
- Node depth and width are variable; parent annotation (refining each label with its parent's category)

Slide 33

No feature engineering, but painful hyperparameter tuning?
- Some hyperparameters are sensitive to their settings and others are not → there is even work on learning the optimizer itself: Andrychowicz et al. Learning to Learn by Gradient Descent by Gradient Descent. arXiv 2016.
- There may be know-how about what works for which task... → it is not pure trial and error every time, but some initial trial and error is unavoidable

Slide 34

Take-home message for those starting deep learning
- Large-scale data and compute resources are necessary → get familiar with GPGPU → if you build your own data, whether you can crowdsource it (know-how included) is the dividing line
- Share information in groups → NLP-DL study group, Twitter, Slack, ...
- Red oceans everywhere → an implementation appearing the week after a paper hits arXiv is routine

Slide 35

Today's summary
- Three kinds of deep learning in natural language processing → stacking vertically, horizontally, and diagonally
- Case studies → in recognition tasks, high classification accuracy through reduced feature engineering and improved robustness → in generation tasks, RNNs model long-distance dependencies and attention deepens sentence meaning representations
- Deep learning is a breakthrough for NLP as well (especially generation with large-scale data)

Slide 36

References (2016)
- Research trends in NLP with deep learning: http://www.slideshare.net/stairlab/ss-61806151
- The state of attention in the recent deep learning scene: http://www.slideshare.net/yutakikuchi927/deep-learning-nlp-attention

Slide 37

References (2015)
- Deep learning of language and knowledge: http://www.slideshare.net/unnonouno/ss-52283060
- Machine translation based on neural networks: http://phontron.com/slides/neubig15kyoto-slides.pdf
- The development of deep learning in NLP: http://2boy.org/~yuta/publications/DL4NL-tsuboi-kashimalab.pdf