Intro to using Google Mobile Vision API for OCR

Yanina
December 20, 2016

Transcript

  1. Agenda
     •  1) Definition
     •  2) Steps of implementation
     •  3) Customizations?
     •  4) Advantages and drawbacks
     •  5) Alternatives
  2. What is Mobile Vision API

     The Mobile Vision API provides a framework for finding objects in photos and video. The framework includes detectors, which locate and describe visual objects in images or video frames, and an event-driven API that tracks the position of those objects in video.
     https://developers.google.com/vision/introduction
  3. Capabilities of Mobile Vision API
     •  The vision package includes a framework of common base functionality, and subpackages for specific detector implementations:
     •  Common functionality: com.google.android.gms.vision
     •  Face detector: com.google.android.gms.vision.face
     •  Barcode detector: com.google.android.gms.vision.barcode
     •  Text detector: com.google.android.gms.vision.text
  4. OCR with Google Mobile Vision API
     •  Optical Character Recognition (OCR) gives a computer the ability to read text that appears in an image, letting applications make sense of signs, articles, flyers, pages of text, menus, or any other place that text appears as part of an image.
     •  https://codelabs.developers.google.com/codelabs/mobile-vision-ocr/index.html?index=..%2F..%2Fio2016#0
  5. What it takes
     •  1) Specify the dependency
     •  2) Check Play services
     •  3) Init TextRecognizer
     •  4) Check storage space. If storage is low, the native library will not be downloaded, so detection will not become operational.
     •  5) Check for Camera permission
     •  6) Set the camera source
     •  7) Initialize OcrDetectorProcessor
     •  8) Result?
  6. Add Google Play Services dependencies

     dependencies {
         …
         compile 'com.google.android.gms:play-services-vision:9.0.0+'
     }
  7. Check that Play services is present

     int code = GoogleApiAvailability.getInstance()
             .isGooglePlayServicesAvailable(getApplicationContext());
     if (code != ConnectionResult.SUCCESS) { … }
  8. Check or request camera permission

     int rc = ActivityCompat.checkSelfPermission(this, Manifest.permission.CAMERA);
     if (rc != PackageManager.PERMISSION_GRANTED) {
         requestCameraPermission();
     }
  9. Set Camera Source

     CameraSource cameraSource = new CameraSource.Builder(context, ocrDetector)
             .setAutoFocusEnabled(true)
             .setRequestedFps(REQUESTED_FPS)
             .setRequestedPreviewSize(PREVIEW_WIDTH, PREVIEW_HEIGHT)
             .build();
     cam
  10. Initialize OcrDetectorProcessor

      public class OcrDetectorProcessor implements Detector.Processor<TextBlock> {
          @Override
          public void receiveDetections(Detector.Detections<TextBlock> detections) {
              SparseArray<TextBlock> items = detections.getDetectedItems();
              …
          }

          @Override
          public void release() {..}
      }

      textRecognizer.setProcessor(new OcrDetectorProcessor());
  11. How do you configure it for a certain language or regex?
      •  https://developers.google.com/vision/text-overview
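
The slide leaves this question open and only points to the text overview page. For the regex half, one possible workaround is to filter the recognized text after detection instead of configuring the recognizer. The sketch below assumes that approach and is plain Java: `RecognizedTextFilter` and `extractMatches` are hypothetical names, not part of the Mobile Vision API, and each input string stands in for the value of one detected block (for example, what `TextBlock.getValue()` returns inside `receiveDetections`).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Hypothetical helper (not part of the Mobile Vision API):
 * post-filters recognized text with a regular expression.
 */
final class RecognizedTextFilter {

    private final Pattern pattern;

    RecognizedTextFilter(String regex) {
        // Compile once so the same pattern can be reused for every frame.
        this.pattern = Pattern.compile(regex);
    }

    /**
     * Returns every substring of the recognized values that matches the pattern,
     * in input order. Null entries are skipped.
     */
    List<String> extractMatches(List<String> recognizedValues) {
        List<String> matches = new ArrayList<>();
        for (String value : recognizedValues) {
            if (value == null) {
                continue;
            }
            Matcher matcher = pattern.matcher(value);
            while (matcher.find()) {
                matches.add(matcher.group());
            }
        }
        return matches;
    }
}
```

In a processor, the strings collected from the detected items could be passed to `extractMatches`, for instance with the pattern `\d{2}/\d{2}/\d{4}` to keep only date-like substrings. This filters results after the fact; it does not change what the recognizer detects.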
  12. Advantages and drawbacks
      •  Advantages:
      •  1) Easy to implement
      •  2) Good quality
      •  Drawbacks:
      •  1) Google once shut it down
      •  2) Space
      •  4) You have to have Play services on your device