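Japanese Morphological Analysis Based on the Acquisition of String Normalization Patterns and the Normalization of Non-standard Spellings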
Atsushi
April 25, 2018
Literature review, April 26, 2018
Transcript
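Japanese Morphological Analysis Based on the Acquisition of String Normalization Patterns and the Normalization of Non-standard Spellings
Natural Language Processing Laboratory, Nagaoka University of Technology
Presenter: Atsushi
Literature review
Paper: Journal of Natural Language Processing (自然言語処理); Itsumi Saito, Kugatsu Sadamitsu, Hisako Asano, Yoshihiro Matsuo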
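
Overview
- When analyzing non-standard Japanese text, many words are missing from the morphological analysis dictionary, so analysis errors increase.
- The dictionary is extended using string normalization patterns extracted statistically from annotated data, together with character-type normalization.
- The resulting analysis achieved both high recall and high precision.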
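
Scope
- Colloquial non-standard spellings
  - Example: 「すっごーい」
- Variant spellings (conversion to small kana, homophone variants, hiragana conversion, katakana conversion)
  - Example: 「ヤヴァイ」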
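
The character-type variants listed above (hiragana and katakana conversion) can be illustrated with a minimal sketch. It assumes the conversion is a plain code point shift between the Unicode hiragana and katakana blocks; the function names are hypothetical and the slides do not say how the authors implemented this step.

```python
# Offset between the Unicode katakana and hiragana blocks (ァ U+30A1, ぁ U+3041).
KANA_OFFSET = 0x60


def katakana_to_hiragana(text: str) -> str:
    """Shift katakana letters (U+30A1..U+30F6) to hiragana; leave the rest as-is."""
    out = []
    for ch in text:
        code = ord(ch)
        if 0x30A1 <= code <= 0x30F6:
            out.append(chr(code - KANA_OFFSET))
        else:
            out.append(ch)
    return "".join(out)


def hiragana_to_katakana(text: str) -> str:
    """Shift hiragana letters (U+3041..U+3096) to katakana; leave the rest as-is."""
    out = []
    for ch in text:
        code = ord(ch)
        if 0x3041 <= code <= 0x3096:
            out.append(chr(code + KANA_OFFSET))
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(katakana_to_hiragana("ヤバイ"))
    print(hiragana_to_katakana("すごい"))
```

The long-vowel mark 「ー」 lies outside both ranges, so it passes through unchanged.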

Overview of the proposed method
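
String alignment between non-standard and normalized spellings
- Normalized spellings are assigned by hand to non-standard spellings.
- String-level normalization patterns are extracted statistically from the paired data.
- The string alignment is obtained with the EM algorithm (Bisani and Ney; Kubo, Kawanami, Saruwatari, and Shikano).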
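
To make the idea of extracting patterns from paired data concrete, here is a rough sketch. It does not implement the EM-based alignment cited on the slide. As a stand-in, it aligns each pair with Python's `difflib.SequenceMatcher` and counts the differing spans; the function name and the example pairs are assumptions made for illustration.

```python
from collections import Counter
from difflib import SequenceMatcher


def extract_patterns(pairs):
    """Count (operation, normal_substring, noisy_substring) triples.

    pairs: iterable of (noisy_spelling, normalized_spelling) strings.
    The alignment is a simple longest-match diff, not an EM alignment.
    """
    counts = Counter()
    for noisy, normal in pairs:
        matcher = SequenceMatcher(None, normal, noisy, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag != "equal":
                counts[(tag, normal[i1:i2], noisy[j1:j2])] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical annotated pairs: (non-standard spelling, normalized spelling).
    pairs = [("すっごーい", "すごい"), ("やばーい", "やばい"), ("すげえ", "すごい")]
    for pattern, freq in extract_patterns(pairs).most_common():
        print(pattern, freq)
```

The three operation types reported by the diff (`insert`, `delete`, `replace`) correspond loosely to the insertion, deletion, and substitution patterns discussed in the slides.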
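
Extending the base dictionary with string normalization patterns and character-type conversion
- String normalization patterns
  - Deletion and substitution patterns on normalized strings: used to build a normalization dictionary
  - Insertion patterns: can be inserted at any position in a normalized word, so expanding them in advance is inefficient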
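
The dictionary expansion step can be sketched as follows, assuming a base dictionary that maps a surface form to a part-of-speech tag and a list of deletion or substitution patterns given as (normal substring, non-standard substring) pairs. The data layout, the function name, and the sample entries are all hypothetical; insertion patterns are deliberately skipped because the slide says they are not expanded in advance.

```python
def expand_dictionary(base_entries, patterns):
    """Build a normalization dictionary of non-standard variants.

    base_entries: dict mapping surface form -> part-of-speech tag.
    patterns: list of (normal_sub, noisy_sub); an empty noisy_sub is a deletion.
    Returns a dict mapping variant -> set of (normalized surface, tag).
    """
    expanded = {}
    for surface, pos in base_entries.items():
        for normal_sub, noisy_sub in patterns:
            if not normal_sub:
                continue  # insertion patterns are not expanded here
            start = surface.find(normal_sub)
            while start != -1:
                variant = surface[:start] + noisy_sub + surface[start + len(normal_sub):]
                if variant and variant not in base_entries:
                    expanded.setdefault(variant, set()).add((surface, pos))
                start = surface.find(normal_sub, start + 1)
    return expanded


if __name__ == "__main__":
    base = {"すごい": "adjective", "おいしい": "adjective"}
    patterns = [("ごい", "げえ"), ("い", "")]
    for variant, sources in sorted(expand_dictionary(base, patterns).items()):
        print(variant, sources)
```

Applying every pattern at every position over-generates variants, which is one reason the lattice needs to be reweighted afterwards.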
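
Morpheme lattice generation
- The morpheme lattice is generated from dynamic matching of string normalization patterns, the normalization dictionary, and the base dictionary.
- Runs of 「っ」 and 「ー」 are shortened to a bounded number of characters.
- Runs of vowels are shortened to a bounded number of characters.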
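
Shortening repeated characters before lattice lookup might look like the sketch below. The limits `max_marks` and `max_vowels` are placeholder parameters chosen for illustration, not the values used in the paper, and the function name is hypothetical.

```python
import re

VOWELS = "あいうえおぁぃぅぇぉアイウエオァィゥェォ"


def collapse_repeats(text, max_marks=1, max_vowels=2):
    """Shorten runs of the same character to an upper bound.

    max_marks bounds runs of 「っ」 or 「ー」; max_vowels bounds runs of one vowel.
    Both defaults are assumptions made for this example.
    """
    marks = re.compile(r"([っー])\1{%d,}" % max_marks)
    vowels = re.compile(r"([%s])\1{%d,}" % (VOWELS, max_vowels))
    text = marks.sub(lambda m: m.group(1) * max_marks, text)
    text = vowels.sub(lambda m: m.group(1) * max_vowels, text)
    return text


if __name__ == "__main__":
    print(collapse_repeats("すっっっごーーーい"))
    print(collapse_repeats("やばいいいい"))
```

Bounding the run length keeps the number of insertion positions that must be matched dynamically small.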
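
Formulating spelling normalization and morphological analysis with a discriminative model
- If the existing morpheme costs and part-of-speech connection costs are used as-is when generating the morpheme lattice, many unnecessary candidates are produced.
- The generated lattice therefore needs appropriate weighting.
- The model is expressed as a structured perceptron, which can flexibly take diverse features into account (Collins).
- Parameters are estimated with averaged perceptron training.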
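
The averaged perceptron update can be sketched in a few lines. This is a simplification: a real analyzer would search the lattice (for example with Viterbi decoding), whereas this example assumes the candidate analyses are already enumerated as feature dictionaries. The feature names and the training example are hypothetical.

```python
from collections import defaultdict


def score(weights, features):
    """Dot product between a weight mapping and a sparse feature dict."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def train_averaged_perceptron(examples, epochs=5):
    """examples: list of (candidates, gold_index); each candidate is a feature dict.

    Returns the weights averaged over all training steps.
    """
    weights = defaultdict(float)
    totals = defaultdict(float)
    steps = 0
    for _ in range(epochs):
        for candidates, gold in examples:
            predicted = max(range(len(candidates)),
                            key=lambda k: score(weights, candidates[k]))
            if predicted != gold:
                # Reward the gold analysis and penalize the wrong prediction.
                for name, value in candidates[gold].items():
                    weights[name] += value
                for name, value in candidates[predicted].items():
                    weights[name] -= value
            for name, value in weights.items():
                totals[name] += value
            steps += 1
    return {name: total / steps for name, total in totals.items()}


if __name__ == "__main__":
    # Two hypothetical analyses of one input: unknown-word split vs. normalized word.
    split_unknown = {"unknown_word": 3.0}
    normalized = {"string_normalized": 1.0, "known_word": 1.0}
    averaged = train_averaged_perceptron([([split_unknown, normalized], 1)], epochs=3)
    print(averaged)
```

Averaging the weights over all steps, rather than keeping only the final weights, is what distinguishes the averaged variant from the plain perceptron.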
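
Formulating spelling normalization and morphological analysis with a discriminative model (notation)
- The i-th word
- The i-th part of speech
- The string after string normalization
- The string before string normalization
- The j-th character of the target string
- Indicator of whether node i was generated by string normalization
- Indicator of whether node i was generated by hiragana conversion
- Indicator of whether node i was generated by katakana conversion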
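
Experimental data and character alignment
- Data used to estimate string normalization patterns
  - Blog text (sentences, pairs)
  - Twitter text (sentences, pairs)
- String normalization patterns obtained (number of types)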
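
Experimental data
- Building the normalized-word bigram model
  - Blog and newspaper text without gold morpheme labels, analyzed automatically with MeCab
- Gold data for morphemes and normalized words
  - Randomly sampled Twitter sentences
  - Test data (sentences)
  - Training data (sentences)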

Compared methods
- MeCab (IPA dictionary)
- Conventional string normalization method (…, Kurohashi, Okumura)
- Proposed method

Experimental results
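
Experimental results
- Precision (overall): the proportion of cases in which the surface form, part of speech, and normalized spelling all match
- The "degraded" and "other" categories follow the classification of … et al. and others.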
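
The strict matching criterion above can be sketched as a small scoring function. It assumes each morpheme is represented as a (start offset, surface form, part of speech, normalized spelling) tuple; the start offset is an addition made here so that repeated words stay distinct, and the function name and sample data are hypothetical.

```python
def precision_recall(predicted, gold):
    """Score morphemes that match on every field of the tuple.

    predicted, gold: lists of (start, surface, pos, normalized) tuples.
    A predicted morpheme is correct only if all four fields agree with gold.
    """
    predicted_set, gold_set = set(predicted), set(gold)
    correct = len(predicted_set & gold_set)
    precision = correct / len(predicted_set) if predicted_set else 0.0
    recall = correct / len(gold_set) if gold_set else 0.0
    return precision, recall


if __name__ == "__main__":
    gold = [(0, "すっごーい", "adjective", "すごい"), (5, "ね", "particle", "ね")]
    predicted = [(0, "すっごーい", "adjective", "すごい"), (5, "ね", "noun", "ね")]
    print(precision_recall(predicted, gold))
```

In the usage example the second morpheme has the right surface form but the wrong part of speech, so it counts as an error under this criterion.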
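
Experimental results
- Segmentation improved, but the normalized spelling is wrong.
- The correct candidate was not generated (insufficient normalization patterns).
- The correct sequence was enumerated but was not selected.
- The output was degraded.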
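
Summary
- String-level normalization patterns were learned from pairs of non-standard and normalized spellings collected from the Web.
- Introducing the extracted patterns into morphological analysis improved analysis accuracy on non-standard Japanese text.
- The experiments confirmed that the approach is effective for improving both analysis precision and recall.
- Combining character-type normalization was confirmed to contribute substantially to the improvement in recall.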