Scene Text Detection and Recognition: The Deep Learning Era

Yustoris
March 29, 2019


Transcript

  1. Scene Text Detection and Recognition:
 The Deep Learning Era
 Shangbang Long, Xin He, Cong Yao
 @yustoris on arXivTimes
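  2. Overview
 • A survey of deep-learning-based methods for scene text recognition (Scene Text Recognition).
 • It looks back at the history of the field and covers it comprehensively, from method trends through to datasets.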

  3. 1. Introduction +
 2. Methodology Before the Deep Learning Era

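  4. Why scene text recognition is hard
 • Diversity: language, shape (typeface, glyph form, writing style), orientation, color, and aspect ratio all vary widely.
 • Presence of background: when background shapes closely resemble characters, accuracy suffers badly.
 • Image quality: in poor-quality images, characters are crushed or blurred, which also hurts accuracy badly.
 (Example image excerpted from https://www.morisawa.co.jp/culture/dictionary/)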
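  5. Scene text recognition before deep learning
 • Feature extraction → character-level extraction → text-line detection → transcription
 • A pipeline that combines many different models (figure from the paper)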

  6. 3. Methodology in the Deep Learning Era

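  7. Method trends
 • Step-wise: a detection (Detection) stage followed by a recognition (Recognition) stage
   • Detection: extract the text regions
   • Recognition: transcribe the content of each extracted text region (Transcription)
 • End-to-end: perform Detection and Recognition in one continuous pass (figure from the paper)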
  8. Method trends by category (figure from the paper)

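  9. Method trends by category (figure from the paper)
 Detection builds on general object detection methods and extends them to fit traits typical of text regions (e.g., diverse orientations and aspect ratios).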

  10. Method trends by category (figure from the paper)
 Recognition is dominated by Connectionist Temporal Classification (CTC) and Attention.

  11. Method trends by category (figure from the paper)
 End-to-End joins the Detection and Recognition models.

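  12. Method trends by category (figure from the paper)
 The main auxiliary technologies (Auxiliary Technologies) are:
 • Synthetic data generation
 • Semi-supervised learning for annotating character and word regions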

  13. 3.1 Detection

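  14. Overview
 • The basic approach is to extend a model built for general object detection. Methods fall broadly into Anchor-based and Region-proposal.
 • Detection granularity falls broadly into the following patterns:
   • Detect the whole text with a Bounding Box (BB)
   • Detect at a finer granularity (e.g., words) and merge afterwards
 (Image excerpted and partially edited from the SegLink [Shi] paper)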
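  15. Basic approaches to region detection
 • Anchor-based
   • Split the input image into a fixed grid and, for each grid cell, estimate several BBs (Anchors) centered on a point in that cell (BB candidates use fixed aspect ratios).
   • Base models include YOLO [Redmon] and SSD [Liu].
 • Region proposal
   • From the input image, estimate candidate text regions (Region proposals) using features and the like, then decide for each candidate whether it is a text region.
   • Base models include R-CNN [Girshick].
 In both cases: region detection with a neural network → post-processing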
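 The anchor-based idea on this slide can be sketched roughly as follows. This is a minimal illustration, not the YOLO or SSD implementation: the function name `make_anchors`, the grid size, the scale, and the aspect ratios are all assumptions chosen for the example.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (cx, cy, w, h) in pixels


def make_anchors(img_w: int, img_h: int, grid: int,
                 scale: float, ratios: List[float]) -> List[Box]:
    """Place one anchor per aspect ratio at the center of every grid cell.

    Each ratio is width / height; every anchor keeps the area scale * scale.
    """
    cell_w, cell_h = img_w / grid, img_h / grid
    anchors: List[Box] = []
    for row in range(grid):
        for col in range(grid):
            cx = (col + 0.5) * cell_w
            cy = (row + 0.5) * cell_h
            for r in ratios:
                w = scale * r ** 0.5
                h = scale / r ** 0.5
                anchors.append((cx, cy, w, h))
    return anchors


# The wide 4:1 ratio is a hypothetical choice aimed at horizontal text.
anchors = make_anchors(img_w=320, img_h=320, grid=4, scale=64.0,
                       ratios=[1.0, 4.0])
print(len(anchors))   # 4 * 4 cells * 2 ratios
print(anchors[:2])
```

 A detector would then predict, for every anchor, an offset to the true box and a confidence score; a coarser grid with a larger scale gives the larger anchors used in later stages.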
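  16. Anchor-based
 (Figure labels: Grid, Bounding Box (BB))
 • Later stages use fewer grid divisions and larger Anchors, which adjusts the BB detection granularity.
 • BB information is obtained through pooling.
 • The loss combines the localization error between the predicted BB and the ground-truth BB with the difference in class confidence.
 • Example: TextBoxes [Liao], based on SSD.
 (Image excerpted and partially edited from the YOLO paper)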
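 The two-part loss described on this slide (localization error plus class-confidence error) might look like the sketch below for a single anchor. The choice of smooth L1 for the box term, binary cross-entropy for the confidence term, and the weight `alpha` are assumptions for illustration; the slides do not say which exact terms TextBoxes uses.

```python
import math
from typing import Sequence


def smooth_l1(x: float) -> float:
    """Smooth L1 penalty: quadratic near zero, linear for large errors."""
    ax = abs(x)
    return 0.5 * x * x if ax < 1.0 else ax - 0.5


def detection_loss(pred_box: Sequence[float], true_box: Sequence[float],
                   pred_conf: float, is_text: bool,
                   alpha: float = 1.0) -> float:
    """Confidence error plus (for text anchors only) localization error.

    pred_conf is the predicted probability that the anchor contains text.
    alpha is an assumed weight balancing the two terms.
    """
    eps = 1e-12  # avoids log(0)
    target = 1.0 if is_text else 0.0
    conf = -(target * math.log(pred_conf + eps)
             + (1.0 - target) * math.log(1.0 - pred_conf + eps))
    loc = 0.0
    if is_text:  # background anchors have no ground-truth box to regress to
        loc = sum(smooth_l1(p - t) for p, t in zip(pred_box, true_box))
    return conf + alpha * loc


# A confident prediction whose width is off by one unit.
print(detection_loss([0, 0, 2, 1], [0, 0, 1, 1], pred_conf=0.9, is_text=True))
```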
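  17. Region Proposal
 • Example: [Ma], based on Faster R-CNN. On top of Faster R-CNN, it takes the rotation of each Region into account when extracting Region proposals.
 • (Figure labels: extract Region proposals; classify each as background or text region)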
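 To make "taking rotation into account" concrete, the sketch below turns a rotated box into its four corner points. The `(cx, cy, w, h, angle)` parameterization and the function name are assumptions for illustration and are not necessarily the representation used in [Ma].

```python
import math
from typing import List, Tuple


def rotated_box_corners(cx: float, cy: float, w: float, h: float,
                        angle_deg: float) -> List[Tuple[float, float]]:
    """Return the 4 corners of a box rotated by angle_deg around its center."""
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    half_w, half_h = w / 2.0, h / 2.0
    corners = []
    for dx, dy in [(-half_w, -half_h), (half_w, -half_h),
                   (half_w, half_h), (-half_w, half_h)]:
        # Standard 2-D rotation of the offset, then shift to the center.
        x = cx + dx * cos_t - dy * sin_t
        y = cy + dx * sin_t + dy * cos_t
        corners.append((x, y))
    return corners


print(rotated_box_corners(0.0, 0.0, 4.0, 2.0, 0.0))
print(rotated_box_corners(0.0, 0.0, 4.0, 2.0, 90.0))
```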

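  18. Text-specific Methods
 • Rather than detecting a whole text instance at once, detect small units and then merge them.
 • Text regions vary more than general objects in orientation and other properties, so detecting a single BB directly is sometimes inappropriate.
 • The units are either small parts of a text region (Components) or pixels (Pixel).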
  19. Components-Level
 Example: SegLink (figure from the SegLink paper)

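  20. Pixel-Level
 Example: PixelLink [Deng] (figures from the PixelLink paper)
 • For each pixel, decide whether each of its neighboring pixels belongs to the same text region.
 • No prior BB estimation is needed, and text regions that lie close together are easier to pick out.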
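 The grouping step behind this pixel-level idea can be sketched with a union-find over text pixels. This is a simplified illustration, not the PixelLink implementation: the 8-neighbour layout follows my reading of the PixelLink paper, and the `linked` callback is a hypothetical stand-in for the network's link predictions.

```python
from typing import Callable, Dict, List, Tuple

Pixel = Tuple[int, int]
# Offsets of the 8 neighbours of a pixel (assumed neighbourhood).
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
              (0, 1), (1, -1), (1, 0), (1, 1)]


def group_pixels(text_mask: List[List[bool]],
                 linked: Callable[[Pixel, Pixel], bool]) -> List[List[Pixel]]:
    """Merge text pixels joined by positive links into text instances."""
    parent: Dict[Pixel, Pixel] = {}

    def find(p: Pixel) -> Pixel:
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving
            p = parent[p]
        return p

    for r, row in enumerate(text_mask):
        for c, is_text in enumerate(row):
            if is_text:
                parent[(r, c)] = (r, c)

    for (r, c) in list(parent):
        for dr, dc in NEIGHBOURS:
            q = (r + dr, c + dc)
            if q in parent and linked((r, c), q):
                parent[find((r, c))] = find(q)  # union the two sets

    groups: Dict[Pixel, List[Pixel]] = {}
    for p in parent:
        groups.setdefault(find(p), []).append(p)
    return sorted(sorted(g) for g in groups.values())


def same_side(a: Pixel, b: Pixel) -> bool:
    """Hypothetical link rule: linked only on the same side of column 2."""
    return (a[1] < 2) == (b[1] < 2)


mask = [[True, True, True, True]]
print(group_pixels(mask, same_side))
```

 Although all four pixels touch, the negative link between columns 1 and 2 keeps them as two instances, which is how adjacent text regions stay separate without a BB estimate.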

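  21. Specific Targets
 • The main concern is handling extreme aspect ratios, distortion, curvature, and unusual fonts, all of which are common on signboards and similar objects.
 • For curved text, for example, TextSnake [Long] attempts region extraction based on circles rather than BBs.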

  22. 3.2 Recognition

  23. Overview
 • Transcribe the text regions extracted by Detection.
 • Most methods are RNN-based; among them, CTC (Connectionist Temporal Classification) and Attention are the most widely used.

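  24. CTC [Graves]
 • A loss function for obtaining per-character generation probabilities over an alphabet that includes _ (blank).
 • Because it also aligns input and output, there is no need to worry about the input and output lengths differing.
 Example: HHHH_eell_lloo_ (input length 15) → Hello (output length 5)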
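 The mapping in the example above (merge repeated symbols, then drop blanks) can be written in a few lines. Note that this sketch covers only the collapse of a frame-level path to an output string; the CTC loss itself sums probabilities over all paths that collapse to the target, which is not shown here. The function name `ctc_collapse` is an assumption.

```python
from itertools import groupby


def ctc_collapse(path: str, blank: str = "_") -> str:
    """Map a frame-level CTC path to its output string.

    Step 1: merge each run of repeated symbols into one symbol.
    Step 2: remove the blank symbol.
    """
    merged = [symbol for symbol, _ in groupby(path)]
    return "".join(s for s in merged if s != blank)


print(ctc_collapse("HHHH_eell_lloo_"))  # prints "Hello"
```

 The blank between the two runs of "l" is what lets the double letter survive: without it, "llll" would merge into a single "l".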
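  25. CRNN [Shi]
 • Transcribes with a biLSTM + CTC that takes feature vectors as input.
 • The name is easy to confuse with R-CNN.
 (Figure labels: CTC is used; the feature vectors are extracted in the stage before the LSTM)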
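  26. Attention
 • Borrows Attention from machine translation [Bahdanau; Luong].
 • An earlier stage (convolutions, for example) extracts from the input image the feature vectors that become the input to the Decoder. [Arbitrarily oriented text recognition, Cheng]
 • In some cases extra measures are taken, such as supplying character-level BBs as an auxiliary input to the Decoder. [Focusing attention: Towards accurate text recognition in natural images, Cheng]
 (Figure label: input feature vectors; figure from the paper)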
  27. 3.3 End-to-end System

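  28. Overview
 • Join the Detection and Recognition models as they are: the text regions detected by the Detection model become the input to the Recognition model. (figure from the paper)
 • Pass only the feature maps to Recognition: SEE [Bartz] and others. (figure from the paper)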
  29. 3.4 Auxiliary Technologies

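  30. Auxiliary Technologies
 • Synthetic data generation (Synthetic Data)
   • Most human-annotated datasets contain only a few thousand samples.
   • The goal is to overlay text regions on background images as naturally as possible.
 • Bootstrapping
   • Reduces annotation cost through semi-supervised learning.
   • Repeat: extract regions with a model trained on a small amount of annotation → cut off by score → retrain using the extracted regions as labels → …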
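 The bootstrapping loop on this slide can be sketched as generic self-training. Everything concrete here is hypothetical: samples are plain floats standing in for images, labels are integers standing in for region annotations, and `train_toy` is a toy trainer used only to make the loop runnable.

```python
from typing import Callable, List, Sequence, Tuple

Sample = float                                   # stand-in for an image
Label = int                                      # stand-in for an annotation
Model = Callable[[Sample], Tuple[Label, float]]  # returns (label, score)
Dataset = List[Tuple[Sample, Label]]


def bootstrap(train: Callable[[Dataset], Model], labeled: Dataset,
              unlabeled: Sequence[Sample], threshold: float,
              rounds: int) -> Model:
    """Train, pseudo-label, keep confident predictions, retrain; repeat."""
    model = train(labeled)
    for _ in range(rounds):
        pseudo: Dataset = []
        for x in unlabeled:
            label, score = model(x)
            if score >= threshold:        # cut off by score
                pseudo.append((x, label))
        model = train(labeled + pseudo)   # retrain with pseudo labels
    return model


def train_toy(data: Dataset) -> Model:
    """Toy trainer: split a 1-D axis at the midpoint of the class means."""
    zeros = [x for x, y in data if y == 0]
    ones = [x for x, y in data if y == 1]
    mid = (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2

    def model(x: Sample) -> Tuple[Label, float]:
        return (1 if x > mid else 0), abs(x - mid)  # distance as score
    return model


model = bootstrap(train_toy, labeled=[(0.0, 0), (10.0, 1)],
                  unlabeled=[1.0, 2.0, 8.0, 9.0, 5.2],
                  threshold=2.0, rounds=2)
print(model(1.0), model(9.0))
```

 The sample at 5.2 sits too close to the decision boundary to pass the score cut-off, so it is never used as a pseudo label; that filter is what keeps low-confidence extractions from polluting the retraining set.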
  31. 4.1 Benchmark Datasets

  32. Benchmark Dataset
 • Synthetic Data
 • Bootstrapping

  33. Benchmark Dataset
 • Synthetic Data
 • Bootstrapping

  34. Performance on Dataset (Detection)

  35. Performance on Dataset (Recognition)
 (According to the Errata, some of the values here apparently differ.)

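  36. Performance on Dataset (End-to-End)
 • Word Spotting: transcription performance on the target vocabulary
 • End-to-End: transcription performance on the full text, including words outside the target vocabulary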

  37. 6. Conclusion

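  38. Status Quo and Future Trends
 • Robustness to the diversity of datasets and models
   • Few datasets include special cases such as curved text.
   • Many models are evaluated after being optimized for one specific dataset only.
 • Multilingual support
   • Neither models nor datasets are designed to handle several languages at once.
 • Speed
   • Humans recognize text at a glance, whereas current methods are still far slower; in FPS terms they reach only a limited upper bound.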