Slide 1

New Developments in Natural Language Processing
April 21, 2015
Mamoru Komachi, Faculty of System Design, Tokyo Metropolitan University

Slide 2

About me: Mamoru Komachi
• 2005.03: Graduated from the History and Philosophy of Science branch, Department of Basic Science, College of Arts and Sciences, the University of Tokyo
• 2010.03: Completed the doctoral program at NAIST; Ph.D. (Engineering). Specialty: natural language processing
• 2010.04–2013.03: Assistant Professor, NAIST (Yuji Matsumoto Laboratory)
• 2013.04–: Associate Professor, Tokyo Metropolitan University (Natural Language Processing Laboratory)

Slide 3

Today's outline
• The impact deep learning is having on natural language processing
• New developments in natural language processing

Slide 4

Deep learning
• A framework for learning complex models with multi-layer neural networks
• Has delivered large performance gains on a wide range of pattern recognition tasks; companies such as Google, Facebook, Microsoft, and Baidu are all racing to research it
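To make the phrase "multi-layer neural network" concrete, here is a minimal NumPy sketch of a forward pass through a small feed-forward network; the layer sizes, activations, and random weights are illustrative assumptions, not anything from the talk.

```python
# Minimal sketch: forward pass of a small multi-layer ("deep") network.
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0.0, x)

# Illustrative layer sizes: 100-dim input -> 50 -> 20 -> 5 class scores.
sizes = [100, 50, 20, 5]
weights = [rng.normal(0, 0.1, (m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

def forward(x):
    """Propagate an input vector through all layers."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = relu(h @ W + b)                    # hidden layers: affine + nonlinearity
    return h @ weights[-1] + biases[-1]        # output layer: class scores

print(forward(rng.normal(size=100)))           # 5 unnormalized class scores
```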

Slide 5

Lee et al., ICML 2009.

Slide 6

Strengths of deep learning
• No feature engineering required: effective combinations of features can be learned automatically from unlabeled data (hyperparameters still exist, though)
• Distributed representations can be learned globally from data (clustering, by contrast, is local representation learning)
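A hypothetical NumPy illustration of the contrast drawn above: a hard cluster assignment is a local (one-hot) representation, while a learned embedding is dense and distributed, so similarity becomes graded. All vectors and the toy clustering below are made up for illustration.

```python
# Hypothetical illustration of local vs. distributed representations.
import numpy as np

# Local representation: a hard cluster id per word, i.e. a one-hot vector.
cluster_of = {"dog": 0, "cat": 0, "car": 1}        # made-up clustering
def one_hot(word, n_clusters=2):
    v = np.zeros(n_clusters)
    v[cluster_of[word]] = 1.0
    return v

# Distributed representation: every dimension is shared by many words.
embedding = {                                       # made-up 4-dim embeddings
    "dog": np.array([0.8, 0.1, 0.3, 0.0]),
    "cat": np.array([0.7, 0.2, 0.4, 0.1]),
    "car": np.array([0.0, 0.9, 0.1, 0.8]),
}

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine(one_hot("dog"), one_hot("cat")))       # 1.0: same cluster, all-or-nothing
print(cosine(embedding["dog"], embedding["cat"]))   # high, but graded
print(cosine(embedding["dog"], embedding["car"]))   # low
```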

Slide 7

The neural network breakthrough
• Hinton et al., A Fast Learning Algorithm for Deep Belief Nets, Neural Computation, 2006.
• Neural networks had existed since the 1950s, but their representational power was so high that they overfit easily (relative to the amount of data available). → Training one layer at a time and then stacking multiple layers solved the overfitting problem!
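As a rough sketch of the "train one layer at a time, then stack" idea, here is greedy layer-wise pretraining with tied-weight autoencoders in NumPy. Note that Hinton et al. use restricted Boltzmann machines rather than autoencoders, and the layer sizes, learning rate, and toy data here are assumptions.

```python
# Rough sketch of greedy layer-wise pretraining with tied-weight autoencoders
# (Hinton et al. use RBMs; sizes, learning rate, and data are assumptions).
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def pretrain_layer(X, n_hidden, epochs=100, lr=0.5):
    """Fit one autoencoder layer on X and return its encoder parameters."""
    n, n_in = X.shape
    W = rng.normal(0, 0.01, (n_in, n_hidden))
    b = np.zeros(n_hidden)                      # encoder bias
    c = np.zeros(n_in)                          # decoder bias
    for _ in range(epochs):
        H = sigmoid(X @ W + b)                  # encode
        R = sigmoid(H @ W.T + c)                # decode with tied weights
        dR = (R - X) * R * (1 - R)              # gradient at decoder pre-activation
        dH = (dR @ W) * H * (1 - H)             # gradient at encoder pre-activation
        W -= lr * (dR.T @ H + X.T @ dH) / n
        b -= lr * dH.mean(axis=0)
        c -= lr * dR.mean(axis=0)
    return W, b

# Stack layers: each layer is pretrained on the hidden codes of the one below.
X = rng.random((200, 64))                       # toy unlabeled data
layer_sizes = [32, 16]                          # illustrative depth/widths
params, inputs = [], X
for n_hidden in layer_sizes:
    W, b = pretrain_layer(inputs, n_hidden)
    params.append((W, b))
    inputs = sigmoid(inputs @ W + b)            # feed codes to the next layer
# `params` would then initialize a deep network for supervised fine-tuning.
```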

Slide 8

Image recognition and syntactic parsing with recursive neural networks
• Parsing Natural Scenes and Natural Language with Recursive Neural Networks, Socher et al., ICML 2011.
• Recursively recognizes structure from adjacent image regions / adjacent words → integrated into the Stanford Parser (ACL 2013)
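A minimal sketch of the core operation in a recursive neural network: two child vectors are merged into a parent vector of the same size with one shared weight matrix, applied bottom-up over a tree. The dimensions, random weights, and the fixed right-branching tree below are assumptions, not Socher et al.'s trained model, and the tree-scoring step that chooses which nodes to merge is omitted.

```python
# Minimal sketch of recursive composition over a fixed binary tree.
import numpy as np

rng = np.random.default_rng(0)
d = 8                                           # embedding dimension (assumed)
W = rng.normal(0, 0.1, (d, 2 * d))              # shared composition matrix
b = np.zeros(d)

def compose(left, right):
    """Merge two child vectors into one parent vector of the same size."""
    return np.tanh(W @ np.concatenate([left, right]) + b)

# Toy word vectors; in the real model these would be learned.
words = {w: rng.normal(size=d) for w in ["the", "cat", "sat"]}

# Tree (the (cat sat)); normally the structure itself is predicted by
# scoring candidate merges, which is left out of this sketch.
phrase = compose(words["cat"], words["sat"])
sentence = compose(words["the"], phrase)
print(sentence.shape)                           # (8,): same size at every node
```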

Slide 9

Recursive neural networks also enable sentiment polarity classification of phrases
• Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank, Socher et al., EMNLP 2013.

Slide 10

Socher et al. (NIPS 2011): computing the meaning of a sentence recursively from word vectors

Slide 11

Recurrent neural networks can take context of unbounded length into account
• Recurrent Neural Network based Language Model, Mikolov et al., Interspeech 2010.
→ A model that predicts the current word by taking the past history into account
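A minimal sketch of the recurrence behind an RNN language model: the hidden state summarizes the entire history seen so far, and the next word is predicted from it with a softmax. The tiny vocabulary, dimensions, and random (untrained) weights are assumptions; training is not shown.

```python
# Minimal sketch of an RNN language model's forward pass (untrained).
import numpy as np

rng = np.random.default_rng(0)
vocab = ["<s>", "the", "cat", "sat", "</s>"]
V, H = len(vocab), 16                           # vocab size, hidden size (assumed)

E = rng.normal(0, 0.1, (V, H))                  # input word embeddings
W = rng.normal(0, 0.1, (H, H))                  # hidden-to-hidden recurrence
U = rng.normal(0, 0.1, (H, V))                  # hidden-to-output weights

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

h = np.zeros(H)                                 # hidden state = history so far
for w in ["<s>", "the", "cat"]:
    x = E[vocab.index(w)]
    h = np.tanh(x + h @ W)                      # fold the new word into the state
    p = softmax(h @ U)                          # distribution over the next word

print(dict(zip(vocab, p.round(3))))             # P(next word | "<s> the cat")
```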

Slide 12

Machine translation, too, can be handled with deep learning, as a model that generates a sequence from a sequence
• Sequence to Sequence Learning with Neural Networks, Sutskever et al., NIPS 2014.
→ Uses two LSTMs (Long Short-Term Memory): the input sequence is converted into a fixed-length vector, and the output sequence is generated from that vector
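A highly simplified sketch of the encoder-decoder idea: one recurrent network reads the input sequence into a fixed-length vector, and a second one generates the output sequence from that vector. For brevity this uses plain tanh RNN cells instead of the LSTMs of Sutskever et al., with made-up vocabularies and random, untrained weights.

```python
# Simplified encoder-decoder sketch (plain RNN cells instead of LSTMs;
# vocabularies and weights are illustrative and untrained).
import numpy as np

rng = np.random.default_rng(0)
src = ["ich", "bin", "hier"]                     # toy source sentence
tgt_vocab = ["<s>", "i", "am", "here", "</s>"]
H = 16                                           # hidden size (assumed)

src_emb = {w: rng.normal(0, 0.1, H) for w in src}
tgt_emb = {w: rng.normal(0, 0.1, H) for w in tgt_vocab}
W_enc = rng.normal(0, 0.1, (H, H))
W_dec = rng.normal(0, 0.1, (H, H))
W_out = rng.normal(0, 0.1, (H, len(tgt_vocab)))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Encoder: compress the whole input sequence into one fixed-length vector.
h = np.zeros(H)
for w in src:
    h = np.tanh(src_emb[w] + h @ W_enc)

# Decoder: generate output words one by one, starting from that vector.
out, prev = [], "<s>"
for _ in range(5):
    h = np.tanh(tgt_emb[prev] + h @ W_dec)
    prev = tgt_vocab[int(np.argmax(softmax(h @ W_out)))]
    if prev == "</s>":
        break
    out.append(prev)
print(out)                                       # gibberish until trained
```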

Slide 13

With deep learning, text classification and even programs can be handled from characters alone
• Text Understanding from Scratch, Zhang and LeCun, arXiv 2015.
→ Learns Chinese and English text classifiers from characters alone
• Learning to Execute, Zaremba and Sutskever, arXiv 2015.
→ "Learns" to execute Python programs using only RNNs with LSTM
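A small sketch of the character-level input encoding such "from scratch" models start from: each character is mapped to a one-hot vector over a fixed alphabet, so the model sees only characters, never words. The alphabet, maximum length, and example text are assumptions; the convolutional or recurrent network on top is omitted.

```python
# Sketch of the character-level one-hot encoding used as input to
# "from scratch" text classifiers (the network on top is omitted).
import numpy as np

alphabet = "abcdefghijklmnopqrstuvwxyz0123456789 .,;!?'"   # assumed alphabet
index = {ch: i for i, ch in enumerate(alphabet)}
max_len = 32                                                # assumed max length

def encode(text):
    """Return a (max_len, |alphabet|) one-hot matrix; unknown chars stay zero."""
    X = np.zeros((max_len, len(alphabet)))
    for pos, ch in enumerate(text.lower()[:max_len]):
        if ch in index:
            X[pos, index[ch]] = 1.0
    return X

X = encode("Deep learning from characters!")
print(X.shape, int(X.sum()))   # (32, len(alphabet)) and the count of recognized chars
```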

Slide 14

Deep learning integrates multimodal input and output naturally
• Generating captions from images alone
http://deeplearning.cs.toronto.edu/i2t
http://googleresearch.blogspot.jp/2014/11/a-picture-is-worth-thousand-coherent.html

Slide 15

Today's outline
• The impact deep learning is having on natural language processing
• New developments in natural language processing

Slide 16

Successes of natural language processing
• Discriminative models
  – Supervised learning with annotated (tagged) corpora
  – Morphological analysis, named entity recognition, syntactic parsing, etc.
• Optimization problems
  – Formulated as ranking or combinatorial optimization problems
  – Web search, machine translation, document summarization, etc.

Slide 17

Worldwide R&D on core technologies for multilingual processing
• Shared tasks of CoNLL (Conference on Computational Natural Language Learning), held every year
  – 2012: multilingual discourse analysis
  – 2009: multilingual syntactic and semantic analysis
  – 2006, 2007: multilingual syntactic parsing
• Apply the same algorithm to multiple languages, in search of language-independent analysis methods

Slide 18

Multilingual processing tools in Java (licenses for commercial use of the models must be negotiated)
• Stanford CoreNLP (Java)
  – Morphological analysis, named entity recognition, syntactic parsing, and discourse analysis for English, Spanish, and Chinese
• Apache OpenNLP (Java)
  – Supports Danish, German, English, Spanish, Dutch, Portuguese, and Swedish
• LingPipe (Java)
  – Models for English (POS tagging, named entity extraction) and Chinese (word segmentation)

Slide 19

Tag specifications and corpora for multilingual morphological analysis
• A Universal Part-of-Speech Tagset, Petrov et al., LREC 2012.
  – 22 languages: English, Chinese, Japanese, Korean, etc.
  – To support R&D on multilingual and cross-lingual parsing, the first step is to assign parts of speech consistently
  – For Japanese, word segmentation follows the short unit words of the Balanced Corpus of Contemporary Written Japanese (BCCWJ)
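A toy illustration of what a universal tagset mapping looks like in practice: fine-grained, language-specific tags are collapsed into a small shared inventory so the same downstream tools can run across languages. The handful of entries below is a made-up excerpt, not the official mapping files of Petrov et al.

```python
# Toy excerpt of a fine-grained-to-universal POS tag mapping
# (illustrative only; see Petrov et al. 2012 for the real mapping files).
to_universal = {
    # English Penn Treebank tags
    "NN": "NOUN", "NNS": "NOUN", "VB": "VERB", "VBD": "VERB", "JJ": "ADJ",
    # Japanese (BCCWJ-style) tags
    "名詞-普通名詞": "NOUN", "動詞-一般": "VERB", "形容詞-一般": "ADJ",
}

tagged = [("dogs", "NNS"), ("ran", "VBD"), ("犬", "名詞-普通名詞")]
print([(w, to_universal[t]) for w, t in tagged])
# [('dogs', 'NOUN'), ('ran', 'VERB'), ('犬', 'NOUN')]
```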

Slide 20

Tag specifications and corpora for multilingual dependency parsing
• Universal Dependency Annotation for Multilingual Parsing, McDonald et al., ACL 2013.
  – German, English, Swedish, Spanish, French, Korean, etc.
  – A draft of Japanese Universal Dependencies: Kanayama et al., Annual Meeting of the Association for Natural Language Processing, 2015.

Slide 21

Core NLP technologies have reached maturity

Component technology (in pipeline order) and accuracy:
• Morphological analysis (word segmentation): 99%
• Syntactic parsing (dependencies): 90%
• Semantic analysis (predicate-argument structure): 60%
• Discourse analysis (relations beyond the sentence): 30%

In terms of whole-sentence accuracy this comes to about 50%.
Accuracy gains from individual component technologies have plateaued:
(1) performance needs to be evaluated in terms of actual applications
(2) appeal on dimensions other than accuracy
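One possible back-of-envelope reading of the "about 50% at the sentence level" figure (my interpretation, not spelled out on the slide): if each dependency is correct with probability 0.9, errors were independent, and a sentence had around seven dependencies, then only about half of all sentences would be parsed entirely correctly.

```python
# Hypothetical back-of-envelope check of the ~50% whole-sentence figure.
per_dependency_accuracy = 0.90
dependencies_per_sentence = 7        # assumed average sentence length
print(per_dependency_accuracy ** dependencies_per_sentence)   # ~0.48
```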

Slide 22

English language analysis is also moving from newspaper articles to web text
• Workshop on Syntactic Analysis of Non-Canonical Language (SANCL 2012)
• Google English Web Treebank (2012)
  – Web text (blogs, newsgroups, email, reviews, Q&A) annotated with morphological and syntactic (dependency) information

Slide 23

Web text, too, is moving toward harder, user-generated text analysis
• Tweet NLP (English only) http://www.ark.cs.cmu.edu/TweetNLP/
  – Twokenizer: morphological analysis (tokenization)
  – TweeboParser: dependency parsing
  – Tweebank: a Twitter corpus
  – Twitter Word Clusters: word clusters

Slide 24

From grammatically correct text written by native speakers to text written by language learners
• Since around 2011, shared tasks on correcting grammatical errors in essays by learners of English have been held almost every year
  – Helping Our Own (HOO) 2011, 2012
  – CoNLL 2013, 2014
• Many English learner corpora have also been released
  – NUS Corpus of Learner English
  – Lang-8 Learner Corpora

Slide 25

From named entity recognition and word sense disambiguation to entity linking
• Named entity recognition
  – Identifies the spans of named entities
• Entity linking
  – Disambiguates which entity a named expression refers to
  – Wikify (Wikification)
Example: 安倍首相が事実誤認を認め、遺憾を表明した。("Prime Minister Abe acknowledged the factual error and expressed regret.")
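A toy data-structure sketch of the difference for the example sentence above: NER only marks spans and coarse types, while entity linking additionally resolves each span to a knowledge-base entry (the Wikipedia link below is illustrative).

```python
# Toy contrast between NER output and entity-linking output for the
# example sentence on the slide (the Wikipedia link is illustrative).
sentence = "安倍首相が事実誤認を認め、遺憾を表明した。"

# NER: just find the span and its coarse type.
ner = [{"span": "安倍首相", "start": 0, "end": 4, "type": "PERSON"}]

# Entity linking: additionally disambiguate what the span refers to.
linked = [{"span": "安倍首相", "start": 0, "end": 4,
           "entity": "https://ja.wikipedia.org/wiki/安倍晋三"}]

for n, l in zip(ner, linked):
    print(n["span"], n["type"], "->", l["entity"])
```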

Slide 26

Summary of today's talk
• The impact deep learning is having on language processing
  – End-to-end processing, from syntactic parsing to semantic analysis
  – Multimodal (image, speech, language) processing
  – Text generation looks set to spread explosively from here on
• New developments in natural language processing
  – Examining language-independent methods and analyzing the remaining problems
  – Searching for robust analysis methods
  – Old yet new problem settings brought about by the rise of the web