Scene Text Detection and Recognition: The Deep Learning Era

Yustoris
March 29, 2019


Transcript

 1. Scene Text Detection and Recognition: The Deep Learning Era
    Shangbang Long, Xin He, Cong Yao
    @yustoris on arXivTimes
 2. Overview
    - A survey of deep-learning-based methods for scene text recognition.
    - Looks back at the history of the field and comprehensively covers everything from method trends to datasets.

 3. 1. Introduction +
 2. Methodology Before the Deep Learning Era

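 4. Difficulties of Scene Text Recognition
    - Diversity: language, form (typeface, glyph shape, script style), orientation, color, and aspect ratio all vary widely.
    - Presence of backgrounds: when background shapes closely resemble characters, accuracy suffers badly.
    - Image quality: in low-quality images, characters become heavily crushed or blurred, which also hurts accuracy badly.
    (Terminology for character form excerpted from https://www.morisawa.co.jp/culture/dictionary/)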
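 5. Scene Text Recognition Before Deep Learning
    - Feature extraction → character-level extraction → text line detection → transcription
    - A pipeline that combines a variety of models (figure from the paper)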

 6. 3. Methodology in the Deep Learning Era

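 7. Trends in Methods
    - Stepwise: a detection stage followed by a recognition stage
      - Detection: extracting text regions
      - Recognition: transcribing the content of the extracted text regions
    - End-to-end: detection and recognition are performed in a single integrated pass
    (figure from the paper)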
 8. Trends in Methods (by category) (figure from the paper)

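 9. Trends in Methods (by category) (figure from the paper)
    Detection builds on general object detection methods and extends them to handle characteristics typical of text regions (e.g., diverse orientations and aspect ratios).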

 10. Trends in Methods (by category) (figure from the paper)
    Recognition is dominated by two approaches: Connectionist Temporal Classification (CTC) and Attention.

 11. Trends in Methods (by category) (figure from the paper)
    End-to-End combines the Detection and Recognition models.

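 12. Trends in Methods (by category) (figure from the paper)
    The main Auxiliary Technologies are:
    - Synthetic data generation
    - Semi-supervised learning for annotating character and word regions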

 13. 3.1 Detection

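 14. Overview
    - The basic approach is to extend models built for general object detection. They fall broadly into Anchor-based and Region proposal methods.
    - Detection granularity follows one of these broad patterns:
      - Detect the whole text with a Bounding Box (BB)
      - Detect finer units (words, for example) and merge them afterward
    (Image excerpted and partly modified from the SegLink [Shi] paper)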
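 15. Basic Strategies for Region Detection
    - Anchor-based
      - Split the input image into a fixed Grid and estimate several BBs (Anchors) centered on a point in each Grid cell. BB candidates use fixed aspect ratios.
      - Base models include YOLO [Redmon] and SSD [Liu].
    - Region proposal
      - Estimate candidate text regions (Region proposals) from features of the input image, then decide for each candidate whether it is a text region.
      - Base models include R-CNN [Girshick].
    - In both cases: region detection with a neural network → post-processing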
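 16. Anchor-based
    - Figure labels: Grid; Bounding Box (BB), a fixed number per grid cell in this example.
    - Later layers use fewer Grid divisions and larger Anchors, which adjusts the BB detection granularity.
    - BB information is obtained through pooling.
    - The loss combines the localization error between predicted and ground-truth BBs with the difference in class confidence.
    - Example: TextBoxes [Liao], based on SSD.
    (Image excerpted and partly modified from the YOLO paper)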
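
The anchor layout described on this slide can be sketched roughly as follows. This is a minimal illustration, not the TextBoxes or SSD implementation: the function name `make_anchors`, the grid sizes, the scales, and the aspect ratios are all hypothetical values chosen to show that coarser grids carry larger anchors and that text favors wide boxes.

```python
from itertools import product


def make_anchors(grid_size, scale, aspect_ratios):
    """Return (cx, cy, w, h) anchors in normalized image coordinates."""
    anchors = []
    for row, col in product(range(grid_size), repeat=2):
        # Each anchor is centered on the middle of its grid cell.
        cx = (col + 0.5) / grid_size
        cy = (row + 0.5) / grid_size
        for ratio in aspect_ratios:
            # ratio = width / height; the area stays scale**2 for every ratio.
            w = scale * ratio ** 0.5
            h = scale / ratio ** 0.5
            anchors.append((cx, cy, w, h))
    return anchors


# Hypothetical feature pyramid: later layers have fewer cells and larger anchors.
pyramid = [(8, 0.1), (4, 0.3), (2, 0.6)]  # (grid_size, scale)
wide_ratios = [1, 2, 5, 10]               # assumed ratios; text tends to be wide

all_anchors = [
    anchor
    for grid_size, scale in pyramid
    for anchor in make_anchors(grid_size, scale, wide_ratios)
]
print(len(all_anchors))  # (64 + 16 + 4) cells * 4 ratios = 336
```

A detector built this way would predict, for every anchor, an offset to the nearest ground-truth box plus a text/background confidence, which is where the two loss terms on the slide come from.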
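 17. Region Proposal
    - Example: [Ma], based on Faster R-CNN.
    - On top of Faster R-CNN, it takes the rotation of each Region into account when extracting Region proposals.
    - Figure labels: Region proposal extraction; classification as background or text region.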
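
One way to picture a rotation-aware proposal is as a box with an extra angle parameter. The sketch below assumes a `(cx, cy, w, h, angle)` parameterization and converts it to corner points; the function name and the parameterization are illustrative assumptions, not details taken from the slide or from [Ma].

```python
import math


def rotated_box_corners(cx, cy, w, h, angle_deg):
    """Corner points of a w-by-h box centered at (cx, cy), rotated by angle_deg."""
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    corners = []
    # Walk the four corners of the unrotated box, relative to its center.
    for dx, dy in [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]:
        # Standard 2D rotation, then translate back to the center.
        x = cx + dx * cos_t - dy * sin_t
        y = cy + dx * sin_t + dy * cos_t
        corners.append((x, y))
    return corners


# A wide 4x2 box rotated by 90 degrees becomes a tall 2x4 box.
for x, y in rotated_box_corners(0.0, 0.0, 4.0, 2.0, 90.0):
    print(round(x, 6), round(y, 6))
```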

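 18. Text-specific Methods
    - Instead of detecting the whole text in one shot, detect small units and merge them afterward.
    - Text regions vary more than general objects in orientation and other properties, so detecting a single BB directly is sometimes inappropriate.
    - The units are either small parts of a text region (Components) or pixels (Pixel).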
 19. Components-Level
    - Example: SegLink (figure from the SegLink paper)
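
The detect-then-merge idea can be sketched with a union-find over segment boxes. This is a simplified illustration under stated assumptions: segments are axis-aligned `(x1, y1, x2, y2)` boxes and `links` is a list of index pairs that a model has already predicted to belong together. SegLink itself works with oriented segments, so treat `merge_segments` as a hypothetical helper rather than the paper's procedure.

```python
def merge_segments(boxes, links):
    """Group linked segment boxes and return one enclosing box per group."""
    parent = list(range(len(boxes)))

    def find(i):
        # Follow parent pointers to the group representative (with path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in links:
        parent[find(a)] = find(b)

    groups = {}
    for index, box in enumerate(boxes):
        groups.setdefault(find(index), []).append(box)

    merged = []
    for members in groups.values():
        xs1, ys1, xs2, ys2 = zip(*members)
        merged.append((min(xs1), min(ys1), max(xs2), max(ys2)))
    return sorted(merged)


segments = [(0, 0, 10, 10), (10, 0, 20, 10), (20, 1, 30, 11), (50, 40, 60, 50)]
links = [(0, 1), (1, 2)]  # the last segment is linked to nothing
words = merge_segments(segments, links)
print(words)  # [(0, 0, 30, 11), (50, 40, 60, 50)]
```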

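 20. Pixel-Level
    - Example: PixelLink [Deng] (figures from the PixelLink paper)
    - For each pixel, predict whether each of its neighboring pixels belongs to the same text region.
    - No prior BB estimation is needed, and text regions that lie close together are easier to separate.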
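
The pixel-linking step can be sketched as a flood fill that only crosses a boundary when the link prediction allows it. The sketch assumes 8-connectivity, as in the PixelLink paper, and stands in for the network output with two hypothetical inputs: a binary `text_mask` and an `is_linked(p, q)` callback.

```python
from collections import deque

# Offsets of the 8 neighbors around a pixel (assumed connectivity).
NEIGHBORS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def group_pixels(text_mask, is_linked):
    """Label text pixels; two pixels share a label only if a chain of links joins them."""
    height, width = len(text_mask), len(text_mask[0])
    labels = [[0] * width for _ in range(height)]
    next_label = 0
    for r in range(height):
        for c in range(width):
            if not text_mask[r][c] or labels[r][c]:
                continue
            # Start a new region and grow it breadth-first along positive links.
            next_label += 1
            labels[r][c] = next_label
            queue = deque([(r, c)])
            while queue:
                pr, pc = queue.popleft()
                for dr, dc in NEIGHBORS:
                    qr, qc = pr + dr, pc + dc
                    if (0 <= qr < height and 0 <= qc < width
                            and text_mask[qr][qc] and not labels[qr][qc]
                            and is_linked((pr, pc), (qr, qc))):
                        labels[qr][qc] = next_label
                        queue.append((qr, qc))
    return labels


# Two touching words: every pixel is text, but no link crosses columns 1|2.
mask = [[1, 1, 1, 1],
        [1, 1, 1, 1]]
same_word = lambda p, q: (p[1] < 2) == (q[1] < 2)  # toy stand-in for link scores
print(group_pixels(mask, same_word))  # [[1, 1, 2, 2], [1, 1, 2, 2]]
```

The toy input shows why nearby text regions are easier to separate this way: the mask alone would fuse both words into one blob, while the link predictions keep them apart.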

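 21. Specific Targets
    - The main focus is handling extreme aspect ratios, distortion, curvature, and unusual fonts, all common on signboards and similar surfaces.
    - For curved text, for example, TextSnake [Long] attempts region extraction based on circles rather than BBs.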

 22. 3.2 Recognition

 23. Overview
    - Transcribe the text regions extracted by Detection.
    - Most methods are RNN-based, and among them CTC (Connectionist Temporal Classification) and Attention are the most widely used.

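 24. CTC [Graves]
    - A loss function for computing per-character generation probabilities, including the blank symbol _.
    - Because input and output are aligned at the same time, differences between input length and output length do not have to be handled separately.
    - Example: HHHH_eell_lloo_ (input length) → Hello (output length)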
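
The mapping from a frame-level path to the final string on this slide follows the usual CTC collapsing rule: merge consecutive repeats, then drop blanks. A minimal sketch, assuming `_` as the blank symbol as on the slide (the function name `ctc_collapse` is just for illustration):

```python
from itertools import groupby

BLANK = "_"


def ctc_collapse(path, blank=BLANK):
    """Collapse a frame-level CTC path: merge repeated symbols, then remove blanks."""
    merged = (symbol for symbol, _ in groupby(path))
    return "".join(symbol for symbol in merged if symbol != blank)


print(ctc_collapse("HHHH_eell_lloo_"))  # Hello
```

The blank between the two runs of `l` is what keeps the double letter: without it, `ll` and `ll` would merge into a single `l`. The CTC loss itself sums the probabilities of every path that collapses to the target string, which this sketch does not compute.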
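 25. CRNN [Shi]
    - Transcribes text with a bi-LSTM + CTC that takes feature vectors as input.
    - The name is easy to confuse with R-CNN...
    - Figure labels: uses CTC; feature vectors are extracted in the stage before the LSTM.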
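 26. Attention
    - Borrows Attention from machine translation [Bahdanau, Luong].
    - An earlier stage, such as convolution, extracts from the input image the feature vectors that become the Decoder's input.
      [Arbitrarily oriented text recognition, Cheng]
    - Some methods add refinements such as giving the Decoder character-level BBs as an auxiliary input.
      [Focusing attention: Towards accurate text recognition in natural images, Cheng]
    - Figure label: input feature vectors (figure from the paper)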
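
A single decoding step of this kind of attention can be sketched as below. The sketch assumes dot-product scoring, which is one of several scoring functions used in the machine translation literature; the slide does not say which one the cited recognizers use, and `attention_step` is a hypothetical name.

```python
import math


def attention_step(decoder_state, features):
    """Return (weights, context) for one decoder step over a list of feature vectors."""
    # Score each feature vector against the decoder state (dot product, assumed).
    scores = [sum(d * f for d, f in zip(decoder_state, feat)) for feat in features]
    # Softmax, shifted by the maximum score for numerical stability.
    peak = max(scores)
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    weights = [value / total for value in exps]
    # Context vector: attention-weighted sum of the feature vectors.
    dim = len(features[0])
    context = [sum(w * feat[k] for w, feat in zip(weights, features)) for k in range(dim)]
    return weights, context


features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy image features, one per column
weights, context = attention_step([2.0, 0.0], features)
print([round(w, 3) for w in weights])
```

At each output character the decoder recomputes these weights, so it can focus on a different horizontal slice of the feature map as it moves through the word.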
 27. 3.3 End-to-end System

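 28. Overview
    - Join the Detection and Recognition models directly: the text regions found by the Detection model become the input to the Recognition model. (figure from the paper)
    - Pass only the feature maps to Recognition, as in SEE [Bartz] and others. (figure from the paper)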
 29. 3.4 Auxiliary Technologies

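 30. Auxiliary Technologies
    - Synthetic data generation (Synthetic Data)
      - Most manually annotated datasets contain only a few thousand samples.
      - The goal is to overlay text regions on background images as naturally as possible.
    - Bootstrapping
      - Reduces annotation cost through semi-supervised learning.
      - Extract regions with a model trained on a small amount of annotation → cut off by score → retrain using the extracted regions as supervision → ... and repeat.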
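
The bootstrapping loop on this slide can be sketched generically. Everything below is an assumption made for illustration: `fit` and `predict` stand in for training and running a real region extractor, and the toy one-dimensional centroid classifier exists only so the loop can run end to end.

```python
def bootstrap(fit, predict, labeled, unlabeled, threshold, rounds):
    """Pseudo-labeling loop: predict, keep confident outputs, retrain, repeat."""
    labeled = list(labeled)
    pool = list(unlabeled)
    model = fit(labeled)
    for _ in range(rounds):
        kept, rest = [], []
        for x in pool:
            label, score = predict(model, x)
            # Score cutoff: only confident predictions become new supervision.
            (kept if score >= threshold else rest).append((x, label))
        if not kept:
            break  # nothing confident enough is left, so stop early
        labeled += kept
        pool = [x for x, _ in rest]
        model = fit(labeled)
    return model, labeled, pool


# Toy stand-ins (hypothetical): a 1-D nearest-centroid classifier.
def fit_centroids(samples):
    sums = {}
    for x, label in samples:
        total, count = sums.get(label, (0.0, 0))
        sums[label] = (total + x, count + 1)
    return {label: total / count for label, (total, count) in sums.items()}


def predict_centroid(model, x):
    label = min(model, key=lambda name: abs(model[name] - x))
    return label, 1.0 / (1.0 + abs(model[label] - x))


model, labeled, leftover = bootstrap(
    fit_centroids, predict_centroid,
    labeled=[(0.0, "background"), (10.0, "text")],
    unlabeled=[1.0, 9.0, 4.0],
    threshold=0.4, rounds=5,
)
print(len(labeled), leftover)  # 4 [4.0]
```

The ambiguous sample stays unlabeled because its score never clears the cutoff, which is the point of the score threshold: low-confidence outputs are kept out of the training set instead of reinforcing the model's own mistakes.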
 31. 4.1 Benchmark Datasets

 32. Benchmark Dataset
    - Synthetic Data
    - Bootstrapping

 33. Benchmark Dataset
    - Synthetic Data
    - Bootstrapping

 34. Performance on Dataset (Detection)

 35. Performance on Dataset (Recognition)
    - According to the Errata, the values are apparently … and ….

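 36. Performance on Dataset (End-to-End)
    - Word Spotting: transcription performance on the target vocabulary
    - End-to-End: transcription performance on the full text, beyond the target vocabulary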

 37. 6. Conclusion

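 38. Status Quo and Future Trends
    - Robustness to the diversity of datasets and models
      - Few datasets include special cases such as curved text.
      - Many models are evaluated only after being optimized for a particular dataset.
    - Multilingual support
      - Neither models nor datasets are designed to handle several languages at once.
    - Speed
      - Humans recognize text at a glance, but current methods are still far slower.
      - Current methods still hit an upper limit in terms of FPS.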