Frontiers of Natural Language Processing
Mamoru Komachi
April 23, 2015
Slides from my talk at Recruit Technologies Open Lab #01 (theme: natural language processing).
https://atnd.org/events/64383
Transcript
New Developments in Natural Language Processing
April 21, 2015
Tokyo Metropolitan University, Faculty of System Design
Mamoru Komachi
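Self-introduction: Mamoru Komachi
• 2005.03: Graduated from the University of Tokyo, College of Arts and Sciences, Department of Basic Science (History and Philosophy of Science)
• 2010.03: Completed the doctoral program at Nara Institute of Science and Technology (NAIST); Ph.D. in Engineering; specialty: natural language processing
• 2010.04 to 2013.03: Assistant Professor, NAIST (Yuji Matsumoto Laboratory)
• 2013.04 onward: Associate Professor, Tokyo Metropolitan University (Natural Language Processing Laboratory)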
Today's outline
• The impact of deep learning on natural language processing
• New developments in natural language processing
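Deep learning
• A framework for learning complex models with multi-layer neural networks
• It has achieved large performance gains on a variety of pattern recognition tasks, and companies such as Google, Facebook, Microsoft, and Baidu are racing to research it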
Lee et al., ICML 2009.
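Strengths of deep learning
• No feature engineering is required. Useful feature combinations can be learned automatically from unlabeled data.
  → Hyperparameters still exist
• Global representations (distributed representations) can be learned from data
  → Clustering is local representation learning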
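The neural network breakthrough
• Hinton et al., A Fast Learning Algorithm for Deep Belief Nets, Neural Computation, 2006.
• Neural networks have existed since the 1950s, but their representational capacity was too high (relative to the amount of data), so they tended to overfit.
  → Training one layer at a time and stacking multiple layers resolved the overfitting problem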
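Image recognition and syntactic parsing with recursive neural networks
• Parsing Natural Scenes and Natural Language with Recursive Neural Networks, Socher et al., ICML 2011.
• Structure is recognized recursively from adjacent image regions or adjacent words
  → Integrated into the Stanford Parser (ACL 2013)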
Recursive neural networks also enable phrase-level sentiment polarity classification
• Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank, Socher et al., EMNLP 2013.
Socher et al. (NIPS 2011): recursively computing the meaning of a sentence from word vectors
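The recursive composition named on this slide can be sketched as follows. This is a minimal illustration under stated assumptions, not the model from Socher et al.: the 2-dimensional word vectors, the weight matrix `W`, the bias `b`, and the helpers `compose` and `encode` are made-up names and values chosen for readability. Each parent vector is computed as tanh(W [left; right] + b), applied bottom-up over a binary tree.

```python
import math

# Hypothetical 2-dimensional word vectors (illustrative values, not trained).
WORD_VECTORS = {
    "not": [-0.5, 0.1],
    "very": [0.2, 0.3],
    "good": [0.9, 0.4],
}

# Hypothetical composition parameters: W maps a 4-dim concatenation to 2 dims.
W = [
    [0.5, -0.2, 0.3, 0.1],
    [0.1, 0.4, -0.3, 0.6],
]
b = [0.0, 0.1]


def compose(left, right):
    """Combine two child vectors into one parent vector of the same size."""
    children = left + right  # list concatenation, i.e. [left; right]
    return [
        math.tanh(sum(w * x for w, x in zip(row, children)) + bias)
        for row, bias in zip(W, b)
    ]


def encode(tree):
    """Encode a binary tree given as a word (str) or a (left, right) pair."""
    if isinstance(tree, str):
        return WORD_VECTORS[tree]
    left, right = tree
    return compose(encode(left), encode(right))


# "not (very good)": the inner phrase is composed first, then joined with "not".
sentence_vector = encode(("not", ("very", "good")))
print(sentence_vector)
```

Because the same `compose` function is reused at every node, a sentence of any length ends up as a vector with the same dimensionality as a single word.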
Recurrent neural networks can take context of unbounded length into account
• Recurrent Neural Network based Language Model, Mikolov et al., InterSpeech 2010.
  → A model that predicts the current word given the history of past words
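A minimal sketch of how such a model carries history forward: the hidden state is updated from the current word and the previous hidden state, and a softmax over the vocabulary gives the next-word distribution. The toy vocabulary, the hidden size, and the fixed weight values below are assumptions for illustration; a real model learns its weights from data.

```python
import math

VOCAB = ["<s>", "the", "cat", "sat"]  # toy vocabulary (assumption)
HIDDEN = 3                            # toy hidden size (assumption)

# Hypothetical fixed weights: U (input), W (recurrent), V (output).
U = [[0.1 * (i + j) for j in range(len(VOCAB))] for i in range(HIDDEN)]
W = [[0.05 * (i - j) for j in range(HIDDEN)] for i in range(HIDDEN)]
V = [[0.2 * (k - i) for i in range(HIDDEN)] for k in range(len(VOCAB))]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def step(word, hidden):
    """Consume one word; return the new hidden state and next-word distribution."""
    idx = VOCAB.index(word)  # a one-hot input selects one column of U
    new_hidden = [
        sigmoid(U[i][idx] + sum(W[i][j] * hidden[j] for j in range(HIDDEN)))
        for i in range(HIDDEN)
    ]
    scores = [
        sum(V[k][i] * new_hidden[i] for i in range(HIDDEN))
        for k in range(len(VOCAB))
    ]
    return new_hidden, softmax(scores)


hidden = [0.0] * HIDDEN
for word in ["<s>", "the", "cat"]:
    hidden, probs = step(word, hidden)
print(dict(zip(VOCAB, probs)))
```

The hidden state is the only memory: every earlier word influences the prediction through it, which is why no fixed context window is needed.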
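Machine translation can also be handled by deep learning, as a model that generates a sequence from a sequence
• Sequence to Sequence Learning with Neural Networks, Sutskever et al., NIPS 2014.
  → Uses two LSTMs (Long Short-Term Memory): one converts the input sequence into a fixed-length vector, and the other generates the output sequence from that vector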
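From characters alone, deep learning can handle text classification and even programs
• Text Understanding from Scratch, Zhang and LeCun, arXiv 2015.
  → Learns Chinese and English text classifiers from characters alone
• Learning to Execute, Zaremba and Sutskever, arXiv 2015.
  → "Learns" to execute Python programs using only an RNN with LSTM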
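To make "from characters alone" concrete, here is a sketch of the kind of input such a character-level classifier consumes: each character becomes a one-hot row over a fixed alphabet, and the text is padded or truncated to a fixed length. The alphabet, the length `MAX_LEN`, and the function name `quantize` are assumptions for illustration, and the network that would sit on top of this matrix is omitted.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789 "  # assumed alphabet
MAX_LEN = 16                                        # assumed fixed input length


def quantize(text):
    """Encode text as MAX_LEN one-hot rows over ALPHABET.

    Characters outside the alphabet and padding positions are all-zero rows.
    """
    rows = []
    for ch in text.lower()[:MAX_LEN]:
        row = [0] * len(ALPHABET)
        pos = ALPHABET.find(ch)
        if pos >= 0:
            row[pos] = 1
        rows.append(row)
    while len(rows) < MAX_LEN:
        rows.append([0] * len(ALPHABET))
    return rows


matrix = quantize("Deep NLP!")
print(len(matrix), len(matrix[0]))
```

No tokenizer, dictionary, or part-of-speech tagger appears anywhere in this input pipeline, which is the point the slide makes.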
Deep learning integrates multimodal input and output naturally
• Generating captions from images alone
http://deeplearning.cs.toronto.edu/i2t
http://googleresearch.blogspot.jp/2014/11/a-picture-is-worth-thousand-coherent.html
Today's outline
• The impact of deep learning on natural language processing
• New developments in natural language processing
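Successes of natural language processing
• Discriminative models
  - Prepare a tagged corpus and apply supervised learning
  - Morphological analysis, named entity recognition, syntactic parsing, etc.
• Optimization
  - Formulate the task as ranking or combinatorial optimization
  - Web search, machine translation, document summarization, etc.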
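Worldwide research and development of component technologies for multilingual processing
• Shared tasks of CoNLL, the Conference on Computational Natural Language Learning (held annually)
  - 2012: multilingual discourse analysis
  - 2009: multilingual syntactic and semantic analysis
  - 2006, 2007: multilingual syntactic parsing
• The same algorithm is applied to multiple languages in pursuit of language-independent analysis methods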
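Multilingual processing tools in Java (licenses for commercial use of the models must be negotiated)
• Stanford CoreNLP (Java)
  - Morphological analysis, named entity recognition, syntactic parsing, and discourse analysis tools for English, Spanish, and Chinese
• Apache OpenNLP (Java)
  - Supports Danish, German, English, Spanish, Dutch, Portuguese, and Swedish
• LingPipe (Java)
  - Models for English (part-of-speech tagging, named entity extraction) and Chinese (word segmentation)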
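Tag specifications and corpora for multilingual morphological analysis
• A Universal Part-of-Speech Tagset, Petrov et al., LREC 2012.
  - 22 languages: English, Chinese, Japanese, Korean, etc.
  - The goal is to assign parts of speech consistently first, as groundwork for multilingual and cross-lingual parsing research
  - For Japanese, word segmentation follows the units of the Balanced Corpus of Contemporary Written Japanese (BCCWJ)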
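The idea of a universal tagset can be illustrated by mapping a language-specific tagset onto the 12 coarse tags (NOUN, VERB, ADJ, ADV, PRON, DET, ADP, NUM, CONJ, PRT, ".", X). The table below is a partial Penn Treebank mapping written from memory for illustration; the names `PTB_TO_UNIVERSAL` and `to_universal` are hypothetical, and the published mapping files should be treated as the authority.

```python
# Partial, illustrative mapping from Penn Treebank tags to universal tags.
PTB_TO_UNIVERSAL = {
    "NN": "NOUN", "NNS": "NOUN", "NNP": "NOUN", "NNPS": "NOUN",
    "VB": "VERB", "VBD": "VERB", "VBG": "VERB", "VBN": "VERB",
    "VBP": "VERB", "VBZ": "VERB", "MD": "VERB",
    "JJ": "ADJ", "JJR": "ADJ", "JJS": "ADJ",
    "RB": "ADV", "RBR": "ADV", "RBS": "ADV",
    "PRP": "PRON", "PRP$": "PRON",
    "DT": "DET",
    "IN": "ADP",
    "CD": "NUM",
    "CC": "CONJ",
    "RP": "PRT", "TO": "PRT",
    ".": ".", ",": ".",
}


def to_universal(tagged_tokens):
    """Convert (word, PTB tag) pairs to (word, universal tag) pairs.

    Tags missing from the partial table fall back to "X" (other).
    """
    return [(word, PTB_TO_UNIVERSAL.get(tag, "X")) for word, tag in tagged_tokens]


print(to_universal([("The", "DT"), ("cat", "NN"), ("sat", "VBD"), (".", ".")]))
```

Once every corpus is projected onto the same coarse tags, a parser or tagger trained on one language can be evaluated on, or transferred to, another.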
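Tag specifications and corpora for multilingual dependency parsing
• Universal Dependency Annotation for Multilingual Parsing, McDonald et al., ACL 2013.
  - German, English, Swedish, Spanish, French, Korean, etc.
  - A Draft Proposal for Japanese Universal Dependencies, Kanayama et al., Annual Meeting of the Association for Natural Language Processing, 2015.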
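Component technologies of natural language processing have reached maturity
Component technology (in pipeline order): accuracy
• Morphological analysis (word segmentation): 99%
• Syntactic parsing (dependency): 90%
• Semantic analysis (predicate-argument structure): 60%
• Discourse analysis (relations beyond the sentence): 30%
Measured as whole-sentence accuracy, the figure is about 50%.
Accuracy gains from individual component technologies have plateaued.
(1) Performance needs to be evaluated in the context of applications.
(2) Components need to appeal on aspects other than accuracy.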
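One way to see why a high per-decision accuracy can still mean only about half of all sentences are fully correct: if each decision in a sentence is right with probability p and errors are treated as independent, a sentence with n decisions is entirely correct with probability p^n. The independence assumption and the sentence lengths below are illustrative choices, not figures from the slide.

```python
def sentence_accuracy(per_decision_accuracy, decisions_per_sentence):
    """Probability that every decision in a sentence is correct,
    assuming errors are independent (a simplifying assumption)."""
    return per_decision_accuracy ** decisions_per_sentence


for n in (5, 7, 10):
    print(n, round(sentence_accuracy(0.90, n), 3))
```

With 90% per-decision accuracy, a sentence needing around seven correct decisions comes out fully correct a little under half the time, which is in line with the roughly 50% sentence-level figure.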
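Linguistic analysis of English is moving from newspaper articles to web text
• Workshop on Syntactic Analysis of Non-Canonical Language (SANCL 2012)
• Google English Web Treebank (2012)
  - Web text (blogs, newsgroups, email, reviews, QA) annotated with morphological and syntactic (dependency) information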
Beyond web text in general: toward analysis of harder, user-generated text
• Tweet NLP (English only)
  http://www.ark.cs.cmu.edu/TweetNLP/
  - Twokenizer: morphological analysis
  - Tweeboparser: dependency parsing
  - Tweebank: Twitter corpus
  - Twitter Word Clusters: word clusters
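From grammatically correct text written by native speakers to text written by language learners
• Since around 2011, shared tasks on grammatical error correction in English learners' essays have been held almost every year
  - Helping Our Own (HOO) 2011, 2012
  - CoNLL 2013, 2014
• Many English learner corpora have also been released
  - NUS Corpus of Learner English
  - Lang-8 Learner Corpora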
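From named entity recognition and word sense disambiguation to entity linking
• Named entity recognition
  - Identify where named entities occur in the text
• Entity linking
  - Disambiguate what each named entity refers to
  - Wikify (Wikification)
Example sentence: "Prime Minister Abe admitted the factual error and expressed regret."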
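The difference between the two steps can be sketched with a toy linker: named entity recognition would only mark "Abe" as a person mention, while entity linking picks which entry that mention refers to. The knowledge base, its entry identifiers, the keyword sets, and the function `link` are all hypothetical; the word-overlap scoring is a deliberately simple stand-in for a real disambiguation model.

```python
# Hypothetical mini knowledge base: surface form -> candidate entries.
KNOWLEDGE_BASE = {
    "Abe": [
        {"id": "Shinzo_Abe", "keywords": {"prime", "minister", "japan", "government"}},
        {"id": "Kobo_Abe", "keywords": {"novelist", "writer", "novel", "literature"}},
    ],
}


def link(mention, context):
    """Return the candidate id whose keywords overlap most with the context.

    Returns None when the mention has no candidates in the knowledge base.
    """
    candidates = KNOWLEDGE_BASE.get(mention, [])
    if not candidates:
        return None
    tokens = set(context.lower().split())
    best = max(candidates, key=lambda c: len(c["keywords"] & tokens))
    return best["id"]


print(link("Abe", "Prime Minister Abe admitted the factual error"))
```

Wikification follows the same pattern with Wikipedia articles as the candidate entries.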
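Today's summary
• The impact of deep learning on language processing
  - End-to-end processing, from syntactic parsing through semantic analysis
  - Multimodal (image, speech, language) processing
  - Text generation looks set to spread explosively
• New developments in natural language processing
  - Examining language-independent methods and analyzing [...]
  - Searching for robust analysis methods
  - Old-yet-new problem settings brought about by the arrival of the web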