Frontiers of Natural Language Processing
Mamoru Komachi
April 23, 2015
These are the slides I used for my talk at Recruit Technologies Open Lab #01 (theme: natural language processing).
https://atnd.org/events/64383
Transcript
Frontiers of Natural Language Processing. April 21, 2015. Faculty of System Design, Tokyo Metropolitan University. Mamoru Komachi.
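Self-introduction: Mamoru Komachi
• 2005.03: Graduated from the University of Tokyo, College of Arts and Sciences, Department of Basic Science (History and Philosophy of Science)
• 2010.03: Completed the doctoral program at Nara Institute of Science and Technology (NAIST). Ph.D. (Engineering). Specialty: natural language processing
• 2010.04 to 2013.03: Assistant Professor, NAIST (Yuji Matsumoto Laboratory)
• 2013.04 to present: Associate Professor, Tokyo Metropolitan University (Natural Language Processing Laboratory)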
Today's outline
• The impact of deep learning on natural language processing
• New developments in natural language processing
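Deep learning
• A framework for learning complex models with multi-layer neural networks
• It has achieved large performance gains on a variety of pattern recognition tasks, and companies such as Google, Facebook, Microsoft, and Baidu are all racing to research it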
Lee et al., ICML 2009.
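Strengths of deep learning
• Feature engineering is unnecessary. Effective feature combinations can be learned automatically from unlabeled data.
  → Hyperparameters still exist
• Global representations (distributed representations) can be learned from data
  → Clustering, by contrast, learns local representations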
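A breakthrough for neural networks
• Hinton et al., A Fast Learning Algorithm for Deep Belief Nets, Neural Computation, 2006.
• Neural networks have existed since the 1950s, but their representational capacity was too high (relative to the amount of data), so they tended to overfit.
  → Training one layer at a time and then stacking multiple layers solved the overfitting problem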
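Image recognition and syntactic parsing with recursive neural networks
• Parsing Natural Scenes and Natural Language with Recursive Neural Networks, Socher et al., ICML 2011.
• Recursively recognizes structure from adjacent image regions and adjacent words
  → Integrated into the Stanford Parser (ACL 2013)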
Recursive neural networks also enable sentiment polarity classification of phrases
• Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank, Socher et al., EMNLP 2013.
Socher et al. (NIPS 2011): recursively compute the meaning of a sentence from word vectors
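The recursive computation of a sentence vector from word vectors can be sketched as follows. This is a minimal illustration only, not the model from Socher et al.: the word vectors, the composition matrix `WEIGHTS`, the bias, and the tree shape are all made-up values, and the function names `compose` and `encode` are hypothetical.

```python
import math


def compose(left, right, weights, bias):
    """Combine two child vectors into one parent vector of the same size."""
    children = left + right  # list concatenation of the two child vectors
    return [
        math.tanh(sum(w * c for w, c in zip(row, children)) + b)
        for row, b in zip(weights, bias)
    ]


def encode(tree, vectors, weights, bias):
    """Recursively compute a vector for a binary tree whose leaves are words."""
    if isinstance(tree, str):
        return vectors[tree]  # leaf: look up the word vector
    left, right = tree
    return compose(
        encode(left, vectors, weights, bias),
        encode(right, vectors, weights, bias),
        weights,
        bias,
    )


# Toy 2-dimensional word vectors and a 2x4 composition matrix (assumed values).
VECTORS = {"very": [0.1, 0.3], "good": [0.8, -0.2], "movie": [0.0, 0.5]}
WEIGHTS = [[0.5, -0.1, 0.3, 0.2], [0.0, 0.4, -0.3, 0.6]]
BIAS = [0.0, 0.1]

# ((very good) movie): the phrase vector is built first, then the sentence vector.
sentence_vector = encode((("very", "good"), "movie"), VECTORS, WEIGHTS, BIAS)
print(sentence_vector)
```

Because every parent has the same dimensionality as its children, the same `compose` function can be applied at every node of the tree, which is what makes the computation recursive.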
Recurrent neural networks can take context of unbounded length into account
• Recurrent Neural Network Based Language Model, Mikolov et al., INTERSPEECH 2010.
  → A model that predicts the current word given the past history
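A recurrent language model of this kind can be sketched in a few lines. The sketch below assumes a simple recurrent network with a sigmoid hidden layer and a softmax output; the vocabulary, the hidden size, and the random (untrained) weights are placeholders chosen for illustration, so the probabilities it prints carry no meaning beyond showing the data flow.

```python
import math
import random

VOCAB = ["<s>", "the", "cat", "sat", "</s>"]  # toy vocabulary (assumption)
HIDDEN = 4
rng = random.Random(0)


def rand_matrix(rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


U = rand_matrix(HIDDEN, len(VOCAB))  # input word -> hidden
W = rand_matrix(HIDDEN, HIDDEN)      # previous hidden state -> hidden
V = rand_matrix(len(VOCAB), HIDDEN)  # hidden -> output scores


def step(word, hidden):
    """Consume one word; return the new hidden state and a next-word distribution."""
    x = VOCAB.index(word)
    new_hidden = []
    for i in range(HIDDEN):
        total = U[i][x] + sum(W[i][j] * hidden[j] for j in range(HIDDEN))
        new_hidden.append(1.0 / (1.0 + math.exp(-total)))  # sigmoid
    scores = [sum(V[k][i] * new_hidden[i] for i in range(HIDDEN)) for k in range(len(VOCAB))]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]  # numerically stable softmax
    norm = sum(exps)
    return new_hidden, [e / norm for e in exps]


def next_word_distribution(history):
    """Feed the whole history through the network, one word at a time."""
    hidden = [0.0] * HIDDEN
    probs = [1.0 / len(VOCAB)] * len(VOCAB)
    for word in history:
        hidden, probs = step(word, hidden)
    return dict(zip(VOCAB, probs))


print(next_word_distribution(["<s>", "the", "cat"]))
```

The hidden state is carried from one step to the next, so the prediction depends on the entire history rather than on a fixed window of preceding words.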
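Machine translation can also be handled by deep learning, as a model that generates a sequence from a sequence
• Sequence to Sequence Learning with Neural Networks, Sutskever et al., NIPS 2014.
  → Uses two LSTMs (Long Short-Term Memory): one converts the input sequence into a fixed-length vector, and the other generates the output sequence from that vector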
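The encoder-decoder data flow (input sequence to a fixed-length vector, then vector to output sequence) can be sketched as below. To keep the example short, a plain tanh recurrent step is swapped in for the LSTM used in the paper, the weights are random and untrained, and the vocabularies are invented, so the decoded words are arbitrary. Only the shape of the computation is the point here.

```python
import math
import random

rng = random.Random(0)
SRC_VOCAB = ["je", "suis", "etudiant"]              # toy source vocabulary (assumption)
TGT_VOCAB = ["<eos>", "i", "am", "a", "student"]    # toy target vocabulary (assumption)
SIZE = 3  # length of the fixed-size sentence vector


def rand_matrix(rows, cols):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def rnn_step(embed, recur, token_id, state):
    """One tanh recurrent step (a simplification standing in for an LSTM cell)."""
    return [
        math.tanh(embed[i][token_id] + sum(recur[i][j] * state[j] for j in range(SIZE)))
        for i in range(SIZE)
    ]


ENC_EMBED, ENC_RECUR = rand_matrix(SIZE, len(SRC_VOCAB)), rand_matrix(SIZE, SIZE)
DEC_EMBED, DEC_RECUR = rand_matrix(SIZE, len(TGT_VOCAB)), rand_matrix(SIZE, SIZE)
DEC_OUT = rand_matrix(len(TGT_VOCAB), SIZE)


def encode(source_words):
    """First network: read the input and return its final state as the sentence vector."""
    state = [0.0] * SIZE
    for word in source_words:
        state = rnn_step(ENC_EMBED, ENC_RECUR, SRC_VOCAB.index(word), state)
    return state


def decode(sentence_vector, max_len=5):
    """Second network: greedily emit words, starting from the sentence vector."""
    state, prev, output = sentence_vector, 0, []  # id 0 (<eos>) doubles as start token
    for _ in range(max_len):
        state = rnn_step(DEC_EMBED, DEC_RECUR, prev, state)
        scores = [sum(DEC_OUT[k][i] * state[i] for i in range(SIZE)) for k in range(len(TGT_VOCAB))]
        prev = scores.index(max(scores))
        if TGT_VOCAB[prev] == "<eos>":
            break
        output.append(TGT_VOCAB[prev])
    return output


vector = encode(["je", "suis", "etudiant"])
print(vector, decode(vector))
```

Whatever the input length, `encode` returns a vector of length `SIZE`, and `decode` sees nothing of the source sentence except that vector.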
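From characters alone, deep learning can do text classification and even program execution
• Text Understanding from Scratch, Zhang and LeCun, arXiv 2015.
  → Learns Chinese and English text classifiers from characters alone
• Learning to Execute, Zaremba and Sutskever, arXiv 2015.
  → "Learns" and executes Python programs using only RNNs and LSTMs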
Deep learning naturally integrates multimodal inputs and outputs
• Generates captions from images alone
http://deeplearning.cs.toronto.edu/i2t
http://googleresearch.blogspot.jp/2014/11/a-picture-is-worth-thousand-coherent.html
Today's outline
• The impact of deep learning on natural language processing
• New developments in natural language processing
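Successes of natural language processing
• Discriminative models
  - Prepare a tagged corpus and apply supervised learning
  - Morphological analysis, named entity recognition, syntactic parsing, etc.
• Optimization
  - Formulate the task as ranking or combinatorial optimization
  - Web search, machine translation, document summarization, etc.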
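Worldwide research and development of component technologies for multilingual processing
• Shared tasks of CoNLL: Conference on Natural Language Learning (held every year)
  - 2012: multilingual discourse analysis
  - 2009: multilingual syntactic and semantic analysis
  - 2006, 2007: multilingual syntactic parsing
• Apply the same algorithm to multiple languages and pursue language-independent analysis methods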
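Multilingual processing tools in Java (licenses for commercial use of the models must be negotiated)
• Stanford CoreNLP (Java)
  - Morphological analysis, named entity recognition, syntactic parsing, and discourse analysis tools for English, Spanish, and Chinese
• Apache OpenNLP (Java)
  - Supports Danish, German, English, Spanish, Dutch, Portuguese, and Swedish
• LingPipe (Java)
  - Models for English (part-of-speech tagging, named entity extraction) and Chinese (word segmentation)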
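Tag specifications and corpora for multilingual morphological analysis
• A Universal Part-of-Speech Tagset, Petrov et al., LREC 2012.
  - 22 languages: English, Chinese, Japanese, Korean, etc.
  - To support research and development on multilingual and cross-lingual syntactic parsing, the first goal is to assign parts of speech consistently
  - For Japanese, word segmentation follows the word units of the Balanced Corpus of Contemporary Written Japanese (BCCWJ)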
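As a small illustration of what a universal tagset buys, the sketch below maps a handful of Penn Treebank tags to the 12 coarse categories of Petrov et al. (NOUN, VERB, ADJ, ADV, PRON, DET, ADP, NUM, CONJ, PRT, ".", X). The table is deliberately partial, and falling back to `X` for unlisted tags is a simplification made for this example, not part of the published mapping. The helper name `to_universal` is hypothetical.

```python
# Partial Penn Treebank -> universal tag table (only a few entries shown).
PTB_TO_UNIVERSAL = {
    "NN": "NOUN", "NNS": "NOUN", "NNP": "NOUN",
    "VB": "VERB", "VBD": "VERB", "VBZ": "VERB",
    "JJ": "ADJ", "RB": "ADV", "PRP": "PRON",
    "DT": "DET", "IN": "ADP", "CD": "NUM",
    "CC": "CONJ", "RP": "PRT", ".": ".",
}


def to_universal(tagged_words):
    """Replace language-specific tags with coarse universal tags."""
    # Unlisted tags fall back to "X" here; this fallback is an assumption.
    return [(word, PTB_TO_UNIVERSAL.get(tag, "X")) for word, tag in tagged_words]


print(to_universal([("The", "DT"), ("cat", "NN"), ("sat", "VBD"), (".", ".")]))
```

A similar table per language and treebank lets corpora with different native tagsets be compared and combined under one inventory.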
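Tag specifications and corpora for multilingual dependency parsing
• Universal Dependency Annotation for Multilingual Parsing, McDonald et al., ACL 2013.
  - German, English, Swedish, Spanish, French, Korean, etc.
  - Kanayama et al., "日本語 Universal Dependencies の試案" (A Proposal for Japanese Universal Dependencies), Annual Meeting of the Association for Natural Language Processing (Japan), 2015.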
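The component technologies of natural language processing have reached maturity
Component technologies and their accuracy, in the order of the analysis pipeline:
• Morphological analysis (word segmentation): 99%
• Syntactic parsing (dependency): 90%
• Semantic analysis (predicate-argument structure): 60%
• Discourse analysis (relations beyond the sentence): 30%
Note: about 50% when measured as whole-sentence accuracy.
Accuracy gains for individual component technologies have plateaued:
(1) performance needs to be evaluated in terms of the application;
(2) the technology needs to appeal on aspects other than accuracy.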
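One way to see how a high per-decision accuracy can shrink to roughly half at the sentence level is to multiply it out. The calculation below is a back-of-the-envelope sketch, not the slide's own derivation: it assumes that every decision in a sentence is independent and that a sentence contains a fixed number of decisions, both of which are simplifications.

```python
def sentence_accuracy(per_decision_accuracy, decisions_per_sentence):
    """Probability that every decision in a sentence is correct, assuming independence."""
    return per_decision_accuracy ** decisions_per_sentence


# 90% per-dependency accuracy at a few assumed sentence lengths.
for n in (5, 7, 10):
    print(n, round(sentence_accuracy(0.90, n), 3))
```

Under these assumptions, a sentence with about seven dependencies is parsed entirely correctly a little under half the time (0.9^7 ≈ 0.478), even though each individual dependency is correct 90% of the time.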
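English language analysis: from newspaper articles to web text
• Workshop on Syntactic Analysis of Non-Canonical Language (SANCL 2012)
• Google English Web Treebank (2012)
  - Web text (blogs, newsgroups, email, reviews, QA) annotated with morphological and syntactic (dependency) information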
Beyond web text: toward the analysis of harder, user-generated text
• Tweet NLP (English only) http://www.ark.cs.cmu.edu/TweetNLP/
  - Twokenizer: morphological analysis
  - Tweeboparser: dependency parsing
  - Tweebank: Twitter corpus
  - Twitter Word Clusters: word clusters
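From grammatically correct text written by native speakers to text written by language learners
• Since around 2011, shared tasks on grammatical error correction of English learners' essays have been held almost every year
  - Helping Our Own (HOO) 2011, 2012
  - CoNLL 2013, 2014
• Many English learner corpora have also been released
  - NUS Corpus of Learner English
  - Lang-8 Learner Corpora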
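From named entity recognition and word sense disambiguation to entity linking
• Named entity recognition
  - Identifies where named entities occur
• Entity linking
  - Disambiguates what each named entity refers to
  - Wikify (Wikification)
Example sentence: "Prime Minister Abe acknowledged the factual error and expressed regret."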
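The difference between the two steps can be made concrete with a toy example: recognition finds where a mention is, and linking decides which knowledge base entry it refers to. Everything below is invented for illustration, including the two-entry knowledge base, the context word sets, and the word-overlap scoring; real entity linkers use far richer candidate generation and ranking.

```python
# Hypothetical miniature knowledge base: one surface form, two candidate entries.
KNOWLEDGE_BASE = {
    "Washington": [
        {"id": "George_Washington", "context": {"president", "general", "army"}},
        {"id": "Washington,_D.C.", "context": {"capital", "city", "congress"}},
    ],
}


def recognize(tokens):
    """Named entity recognition step: return the positions of known mentions."""
    return [i for i, token in enumerate(tokens) if token in KNOWLEDGE_BASE]


def link(tokens, position):
    """Entity linking step: pick the candidate sharing the most context words."""
    words = {token.lower() for token in tokens}
    candidates = KNOWLEDGE_BASE[tokens[position]]
    return max(candidates, key=lambda c: len(c["context"] & words))["id"]


tokens = "Congress met in Washington , the capital".split()
for position in recognize(tokens):
    print(tokens[position], "->", link(tokens, position))
```

The same surface string resolves to a different entry when the surrounding words change, which is the ambiguity that entity linking is meant to resolve.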
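Today's summary
• The impact of deep learning on language processing
  - End-to-end, from syntactic parsing through semantic analysis
  - Multimodal (image, speech, language) processing
  - Text generation looks set to spread explosively
• New developments in natural language processing
  - Examining language-independent methods and analyzing [illegible]
  - Searching for robust analysis methods
  - Old yet new settings arising from the advent of the web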