Rapid Improvements with Deep Learning
Clova OCR: detects small, vertical, and multilingual text
Examples: vertical Japanese, very small, rotated
Agenda
1. Things to Know When Applying Recent OCR Approaches to Japanese
2. Clova OCR Text Detector (CVPR 2019)
3. Clova OCR Text Recognizer (ICCV 2019)
4. Full Pipelines for Clova OCR
How to Deal with Long Sentences?
Word/line level: existing papers are based on word- or line-unit detection.
Character level: we detect text character by character and then combine the characters!
Introduction | Why is it So Difficult?
Extreme aspect ratios, a variety of character sizes, and shape distortion → caused by line/word detection!
How about character-level detection?
CRAFT | Definition of Model Outputs
CRAFT: Character Region Awareness for Text detection
Input: image (h×w×3)
Region score (h/2×w/2×1): the probability that the given pixel is the center of a character. Used to find individual character areas.
Affinity score (h/2×w/2×1): the probability that the given pixel is the center of the space between adjacent characters. Used to locate line/word-level areas.
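One way to picture how the two maps work together is the sketch below. It assumes a simple recipe that the slide does not spell out: threshold both maps, then treat each connected group of pixels as one word. The function name `group_text_regions` and both threshold values are hypothetical, chosen only for illustration.

```python
from collections import deque

import numpy as np


def group_text_regions(region, affinity, region_thr=0.4, affinity_thr=0.4):
    """Label 4-connected groups of pixels whose region OR affinity score
    exceeds its (assumed) threshold. Returns (label map, group count)."""
    mask = (region > region_thr) | (affinity > affinity_thr)
    labels = np.zeros(mask.shape, dtype=np.int32)
    h, w = mask.shape
    count = 0
    for y in range(h):
        for x in range(w):
            if not mask[y, x] or labels[y, x]:
                continue
            # Start a new group and flood-fill it breadth-first.
            count += 1
            labels[y, x] = count
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] and not labels[ny, nx]:
                        labels[ny, nx] = count
                        queue.append((ny, nx))
    return labels, count


# Toy maps: three character blobs; the first two are linked by affinity.
region = np.zeros((8, 16))
region[2:5, 1:4] = 0.9
region[2:5, 6:9] = 0.9
region[2:5, 12:15] = 0.9
affinity = np.zeros((8, 16))
affinity[3, 4:6] = 0.8

_, n_chars = group_text_regions(region, np.zeros_like(region))  # region score only
_, n_words = group_text_regions(region, affinity)               # region + affinity
print(n_chars, n_words)
```

With the region score alone the three blobs stay separate characters; adding the affinity score merges the first two into a single word-level group.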
CRAFT | Definition of Model Outputs
Ground Truth Label Generation (when character-level annotations are provided)
Create a 2D Gaussian distribution for each rectangular shape: character boxes give the region score GT, and affinity boxes give the affinity score GT.
Affinity box generation: generate an affinity box from two adjacent character boxes.
Figure legend: center of a character box, center of a triangle, character box, affinity box.
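The label generation above can be sketched roughly as follows. This is a simplified illustration, not the authors' code: it assumes axis-aligned boxes given as (x0, y0, x1, y1), builds the affinity box from the centers of the upper and lower triangles formed by each character box's diagonals, and renders the affinity Gaussian into the bounding rectangle of that box. The helper names and the `sigma_ratio` value are hypothetical.

```python
import numpy as np


def gaussian_patch(height, width, sigma_ratio=0.25):
    """2D Gaussian that peaks at the center of a height x width box.
    sigma_ratio is an assumed spread, not a value from the slides."""
    ys = np.arange(height) - (height - 1) / 2.0
    xs = np.arange(width) - (width - 1) / 2.0
    sy = max(height * sigma_ratio, 1e-6)
    sx = max(width * sigma_ratio, 1e-6)
    return np.exp(-0.5 * ((ys[:, None] / sy) ** 2 + (xs[None, :] / sx) ** 2))


def render_score_map(shape, boxes):
    """Paste one Gaussian per box (x0, y0, x1, y1); keep the max on overlap."""
    score = np.zeros(shape, dtype=np.float32)
    for x0, y0, x1, y1 in boxes:
        patch = gaussian_patch(y1 - y0, x1 - x0)
        score[y0:y1, x0:x1] = np.maximum(score[y0:y1, x0:x1], patch)
    return score


def affinity_box(box_a, box_b):
    """Quadrilateral joining the upper/lower triangle centers of two boxes."""
    def triangle_centers(box):
        x0, y0, x1, y1 = box
        corners = np.array([[x0, y0], [x1, y0], [x1, y1], [x0, y1]], dtype=float)
        center = corners.mean(axis=0)                      # center of the character box
        upper = (corners[0] + corners[1] + center) / 3.0   # center of the upper triangle
        lower = (corners[2] + corners[3] + center) / 3.0   # center of the lower triangle
        return upper, lower

    upper_a, lower_a = triangle_centers(box_a)
    upper_b, lower_b = triangle_centers(box_b)
    return np.array([upper_a, upper_b, lower_b, lower_a])


char_boxes = [(2, 2, 10, 14), (12, 2, 20, 14)]   # two adjacent characters
region_gt = render_score_map((16, 24), char_boxes)

quad = affinity_box(*char_boxes)
ax0, ay0 = quad.min(axis=0).astype(int)
ax1, ay1 = quad.max(axis=0).astype(int)
affinity_gt = render_score_map((16, 24), [(ax0, ay0, ax1, ay1)])
print(quad)
```

For rotated or perspective-distorted characters the Gaussian would have to be warped into each quadrilateral; the axis-aligned paste here is only the simplest case.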
CRAFT | Training
Weakly Supervised Learning
Train with synthetic images: character-level annotations are available, so the loss is computed against the synthetic GT.
Train with real images: only word/line-level annotations are available, so a pseudo-GT is generated. Each annotated word is cropped, region scores are predicted on the crop, and the characters are split from those scores. The loss is then computed against the pseudo-GT.
Confidence map: each word's pseudo-GT gets a confidence value; the examples shown are (6/6), (5/7), and (5/6).
Objective function:
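The confidence values and the objective on this slide can be sketched as below, following the formulation in the CRAFT paper (CVPR 2019) as I understand it: a word of length l whose pseudo-GT yields l_est characters gets confidence (l - min(l, |l - l_est|)) / l, and the loss is the per-pixel squared error of both score maps weighted by that confidence, with confidence 1 outside real-image word boxes. The function names and the toy inputs are illustrative assumptions.

```python
import numpy as np


def word_confidence(word_length, estimated_length):
    """Confidence of a word's pseudo-GT: (l - min(l, |l - l_est|)) / l."""
    error = min(word_length, abs(word_length - estimated_length))
    return (word_length - error) / word_length


def confidence_map(shape, word_boxes, confidences):
    """Pixel-wise confidence: each word box (x0, y0, x1, y1) gets its word's
    confidence; every other pixel keeps 1.0."""
    conf = np.ones(shape, dtype=np.float32)
    for (x0, y0, x1, y1), c in zip(word_boxes, confidences):
        conf[y0:y1, x0:x1] = c
    return conf


def weighted_loss(region_pred, affinity_pred, region_gt, affinity_gt, conf):
    """Sum over pixels of conf * (region error^2 + affinity error^2)."""
    per_pixel = (region_pred - region_gt) ** 2 + (affinity_pred - affinity_gt) ** 2
    return float(np.sum(conf * per_pixel))


# (word length, number of characters found by splitting); toy values that
# reproduce the fractions on the slide.
examples = [(6, 6), (7, 5), (6, 5)]
confidences = [word_confidence(l, l_est) for l, l_est in examples]
print(confidences)  # 6/6, 5/7, 5/6

# Tiny loss example: a 2x4 map where the left 2x2 block is a low-confidence word.
conf = confidence_map((2, 4), [(0, 0, 2, 2)], [0.5])
zeros = np.zeros((2, 4))
loss = weighted_loss(zeros, zeros, np.ones((2, 4)), zeros, conf)
print(loss)
```

Down-weighting by confidence keeps poorly split words from dominating training while the model's character predictions are still unreliable.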
Full Pipeline
Text Detection → boxes and angles → compensate rotation for each box → Text Recognition
One model for horizontal/vertical text in JPN/KOR/ENG
Example output: "Welcome to JAPAN"
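The rotation-compensation step can be illustrated with plain geometry. The sketch below only rotates a box's corner points back by the detected angle around the box center; a real pipeline would also resample the image pixels, which is omitted here. The function name `upright_corners` and the sign convention (positive angle = counterclockwise in x-right, y-up coordinates) are assumptions for illustration.

```python
import numpy as np


def upright_corners(corners, angle_deg):
    """Rotate box corners by -angle_deg around the box center.
    Assumes angle_deg is the box's counterclockwise rotation (y-up axes)."""
    corners = np.asarray(corners, dtype=float)
    center = corners.mean(axis=0)
    theta = np.deg2rad(-angle_deg)
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta), np.cos(theta)]])
    return (corners - center) @ rot.T + center


# A 4x2 box that was detected rotated by 90 degrees.
detected = [(1, -2), (1, 2), (-1, 2), (-1, -2)]
restored = upright_corners(detected, 90)
print(np.round(restored, 6))
```

In image coordinates, where y points down, the same code applies with the angle's sign flipped.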