Atsushi
January 30, 2019
Literature review: Correction of OCR Word Segmentation Errors in Articles from the ACL Collection through Neural Machine Translation Methods
2019/1/30 Literature review
Nagaoka University of Technology
Natural Language Processing Laboratory
Transcript
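Correction of OCR Word Segmentation Errors in Articles from the ACL Collection through Neural Machine Translation Methods
Nagaoka University of Technology, Natural Language Processing Laboratory
Atsushi
Literature review, 2019/1/30
Vivi Nastase, Julian Hitschler
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

Abstract
• Correcting OCR errors in papers
• A Seq2Seq model is used to fix space-insertion errors
• Precision and recall of at least […] were obtained on the test data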
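Introduction
• The ACL collection contains a large number of scientific papers
• Organizing and systematizing the papers makes related work easier to find
• Keyphrase extraction is used to systematize the papers
• Keyphrase extraction gives better results the more accurate the text is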
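Introduction
• The ACL collection includes papers submitted before electronic submission became standard
• These papers were processed with OCR and therefore contain many errors
• Correcting the errors makes correct keyphrase extraction possible
• Effect of the errors on keyphrase extraction: shown by inspecting the keyphrases extracted by the SAFFRON system (Bordea et al.)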
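The ACL Collection
• Target ACL papers: […] papers ([…] to […])
• Electronic submission became standard in […]; papers from before then were processed with OCR
• Among the error types, space-insertion errors are especially frequent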
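[Figure: sample OCR output annotated with substitution errors, space-insertion errors, and hyphens inserted because of the page layout]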
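Machine translation based correction model
• Errors targeted for correction: space-insertion errors
• Model used: the Nematus system (Sennrich et al.), which achieved SotA results on machine translation tasks
• Input: text without word segmentation
• Output: a segmented word sequence
• Training uses stochastic gradient descent to minimize cross-entropy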
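Data formatting
• Clean up the papers obtained through OCR
  • If a line ends with a hyphen, remove the hyphen
  • If a line does not end with a period, question mark, or colon, remove the line break
• The result is text with one paragraph per line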
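The two line-joining rules on this slide can be sketched roughly as follows. This is only an illustration, not the authors' code: the function name `join_ocr_lines` is hypothetical, and skipping blank lines and inserting a single space when a line break is removed are assumptions the slide does not state.

```python
def join_ocr_lines(lines):
    """Merge raw OCR lines so that each returned string is one paragraph."""
    paragraphs = []
    current = ""
    for raw in lines:
        line = raw.strip()
        if not line:
            # Assumption: blank lines carry no text and are skipped.
            continue
        if line.endswith("-"):
            # Line-final hyphen: drop it and glue the next line on directly.
            current += line[:-1]
        elif line.endswith((".", "?", ":")):
            # Period, question mark, or colon: keep the break (paragraph ends).
            current += line
            paragraphs.append(current)
            current = ""
        else:
            # Any other ending: remove the line break.
            # Assumption: a single space separates the joined lines.
            current += line + " "
    if current.strip():
        paragraphs.append(current.strip())
    return paragraphs


if __name__ == "__main__":
    sample = ["Word segmen-", "tation errors are", "common.", "Next paragraph:"]
    for paragraph in join_ocr_lines(sample):
        print(paragraph)
```

A hyphen that legitimately belongs to a compound word at a line end would also be removed by this rule, which is a known limitation of such a simple heuristic.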
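Data split
• Data from […] onward is assumed to have correct word segmentation
• The data is split into two parts, and the dataset is built from A
  • Data from before […]: B
  • Data from […] onward: A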
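Building the dataset
The text, now one paragraph per line, is split further
• Split at the following sentence-final characters: […]
• If a sentence is longer than […] characters, split it at […]
• If it is still longer than […] characters, split it into pieces of […] characters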
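The three-stage splitting described on this slide might look like the sketch below. The actual delimiters and length limit did not survive in the transcript, so everything concrete here is an assumption: `max_len` is left as a required parameter, and the default delimiter sets (`".?!"` for sentence ends, `","` as the fallback) as well as the function names are placeholders chosen only for illustration.

```python
import re


def split_keep_delim(text, delims):
    """Split text after any character in delims, keeping the delimiter."""
    cls = re.escape(delims)
    pattern = "[^" + cls + "]*[" + cls + "]?"
    return [p.strip() for p in re.findall(pattern, text) if p.strip()]


def split_paragraph(paragraph, max_len, sentence_end=".?!", fallback=","):
    """Split a paragraph into fragments of at most max_len characters.

    sentence_end and fallback are assumed delimiter sets, not taken
    from the slides.
    """
    fragments = []
    for sentence in split_keep_delim(paragraph, sentence_end):
        # Stage 2: only over-long sentences are split at the fallback delimiter.
        if len(sentence) <= max_len:
            pieces = [sentence]
        else:
            pieces = split_keep_delim(sentence, fallback)
        for piece in pieces:
            # Stage 3: anything still too long is cut into fixed-length chunks.
            for start in range(0, len(piece), max_len):
                fragments.append(piece[start:start + max_len])
    return fragments


if __name__ == "__main__":
    print(split_paragraph("aaa bbb, ccc ddd. ee?", max_len=8))
```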
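Building the dataset
Convert the text into the input format for the Seq2Seq model
• Remove the spaces from the string
• Insert a space after each character
Create the reference data
• Replace the spaces in the reference data with a special character ([…])
• Insert a space after each character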
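The conversion on this slide can be illustrated with a short sketch. The special character used on the slide is not legible in the transcript, so `"#"` below is a hypothetical stand-in, and `make_pair` is an illustrative name. The sketch puts a space between characters rather than after the last one, which is an assumption about what the Seq2Seq toolkit expects.

```python
SPACE_MARK = "#"  # hypothetical placeholder for the slide's special character


def make_pair(sentence, space_mark=SPACE_MARK):
    """Return (source, target) character sequences for one training example."""
    words = sentence.split()
    # Source: all spaces removed, then every character separated by a space.
    source = " ".join("".join(words))
    # Target: word boundaries become the special mark, then the same spacing.
    target = " ".join(space_mark.join(words))
    return source, target


if __name__ == "__main__":
    src, tgt = make_pair("word errors")
    print(src)
    print(tgt)
```

With this encoding the model sees one token per character and only has to learn where to emit the boundary mark, so the output can be turned back into words by deleting the spaces and replacing the mark with a space.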
Dataset example
[Figure: example source and reference sequences from the dataset]
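Experimental setup
• Dataset
  • Data created: […] sentences
  • Training data: […] sentences
  • Test data: […] sentences
• Evaluation
  • Evaluation of the correction on the test data
  • Evaluation of the correction on papers from before […]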
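Evaluation of the correction on the test data
• Evaluation method
  • Count the words that are segmented correctly
  • Precision = (correctly segmented words) / (words in the system output)
  • Recall = (correctly segmented words) / (words in the correct sentence)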
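As a rough illustration of these two ratios, the sketch below counts a system word as correct when it covers exactly the same character span as a reference word. That span-matching criterion, and the names `word_spans` and `segmentation_pr`, are assumptions made for this example; the slides only give the two formulas.

```python
def word_spans(tokens):
    """Map a token list to the set of (start, end) character offsets."""
    spans = set()
    start = 0
    for token in tokens:
        spans.add((start, start + len(token)))
        start += len(token)
    return spans


def segmentation_pr(system, reference):
    """Return (precision, recall) for one segmented sentence."""
    sys_tokens = system.split()
    ref_tokens = reference.split()
    if "".join(sys_tokens) != "".join(ref_tokens):
        raise ValueError("system and reference differ in more than spacing")
    # A word is correct if the same span appears in both segmentations.
    correct = len(word_spans(sys_tokens) & word_spans(ref_tokens))
    precision = correct / len(sys_tokens) if sys_tokens else 0.0
    recall = correct / len(ref_tokens) if ref_tokens else 0.0
    return precision, recall


if __name__ == "__main__":
    p, r = segmentation_pr("the seg mentation works", "the segmentation works")
    print(f"precision={p:.2f} recall={r:.2f}")
```

In the example call, two of the four system words match reference spans, so precision is 2/4 and recall is 2/3.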
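Evaluation of the correction on the test data
[Table: Precision and Recall for two settings: (1) all space errors, (2) space errors inside formulas excluded. The numeric values are not preserved in this transcript.]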
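Evaluation of the correction on the test data
• The goal is to apply the model trained on dataset A to the data in B
• A and B share the same domain, but their vocabularies may differ
• Scores are therefore also evaluated on words that do not appear in the training and development data
  • Precision: […]
  • Recall: […]

Evaluation of the correction on papers from before […]
• Correct sentences cannot be extracted, so the evaluation used for A is not possible
• The evaluation was carried out on the keywords obtained with the SAFFRON system (Bordea et al.)
• Evaluation method
  • […] papers were chosen at random from B, and the word segmentation of each keyword was judged manually
  • The papers were compared before and after correction

Evaluation of the correction on papers from before […]
[Figure or table: results of the keyword evaluation before and after correction]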
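Conclusion
• Space-insertion errors in the OCR-processed portion of the ACL papers were corrected with a Seq2Seq model
• The high scores on the test data show that most space errors can be corrected using the ACL dataset
• The method produces a corrected version of the ACL papers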