Seeing is Believing: Mobile Vision API Deep Dive

Did you know that your mobile device can recognize objects in photos and videos? It can detect faces, barcodes and text in an image, and even detect whether those faces are smiling or winking! All this is possible because of the Mobile Vision API, which finds objects in photos and videos on mobile devices using real-time on-device vision technology. This API enables you to edit photos and videos, automate text data entry, embed barcode scanners into your applications and even build your own version of a Stories application with lots of filters. In this talk, you’ll learn how to use the Mobile Vision API to get more out of your photos and videos. You’ll get a jumpstart on implementing Machine Learning in your apps with the power of this API!

Moyinoluwa Adeyemi

September 25, 2017

Transcript

  1. Seeing is Believing: Mobile Vision API Deep Dive

  2. Moyinoluwa Adeyemi • Off Grid Electric • @moyheen

  8. Mobile Vision API • Finds objects in photos and videos • Includes face, barcode and text detectors • Easy to set up • Works locally • Available offline • Free
  10. Face Detection

  11. Detects human faces • Sorry dawg, humans only.

  12. Understands faces positioned at different angles • Euler X - up/down • Euler Y - left/right • Euler Z - rotated/slanted • https://developers.google.com/vision/face-detection-concepts
  13. Detects landmarks • Left eye - 4 • Right eye - 10 • Nose base - 6 • Left cheek - 1 • Right cheek - 7 • Left, Right and Bottom Mouth - 5, 11, 0 • https://pixabay.com/en/woman-stylish-fashion-view-101542/
  16. Understands facial expressions • isSmilingProbability: 0.006698033 • isLeftEyeOpenProbability: 0.98714304 • isRightEyeOpenProbability: 0.69178355 • https://pixabay.com/en/woman-stylish-fashion-view-101542/

  17. Works on all skin colors

  18. Barcode Detection

  19. Detects barcodes in images and videos

  20. Works on both 1D and 2D barcodes • https://www.adazonusa.com/blog/wp-content/uploads/2016/03/1D-barcode-vs-2D-barcodes.jpg

  21. Detects multiple barcodes in one image

  22. Even when they are upside down

  23. Text Detection

  24. Detects text in images and videos #TIL

  25. Segments text into blocks, lines and words • https://developers.google.com/vision/images/text-structure.png

  26. Works only on Latin-based languages: Spanish, English, Hungarian, Norwegian, German, Dutch, French, Catalan, Portuguese, Romanian, Polish, Danish, Finnish, Italian, Swedish, Turkish
  27. Has a wide range of applications

  28. Getting Started

  32. Update the gradle file
    dependencies {
        compile 'com.google.android.gms:play-services-vision:11.2.2'
    }

  35. Update the manifest file
    <uses-feature android:name="android.hardware.camera" />
    <uses-permission android:name="android.permission.CAMERA" />
    <meta-data
        android:name="com.google.android.gms.vision.DEPENDENCIES"
        android:value="face, barcode, text" />
    <meta-data
        android:name="com.google.android.gms.version"
        android:value="@integer/google_play_services_version"/>
  36. CAUTION: DO NOT RUN ON UI THREAD • Run only on a background thread
  37. Release resources at the end
    faceDetector.release()
    barcodeDetector.release()
    textRecognizer.release()

  38. GIVEN any photo WHEN any face is detected THEN overlay image and graphics
  40. Convert photo to a mutable bitmap
    val bitmapOptions = BitmapFactory.Options().apply { inMutable = true }
    val bitmap = BitmapFactory.decodeResource(resources, R.drawable.image, bitmapOptions)
  43. Prepare the canvas
    val tempBitmap = Bitmap.createBitmap(bitmap.width, bitmap.height, Bitmap.Config.RGB_565)
    val paint = Paint().apply {
        strokeWidth = 5f
        color = Color.MAGENTA
        style = Paint.Style.STROKE
    }
    val canvas = Canvas(tempBitmap)
    canvas.drawBitmap(bitmap, 0f, 0f, paint)
  44. Set up the face detector
    val faceDetector = FaceDetector.Builder(this)
        .setClassificationType(FaceDetector.ALL_CLASSIFICATIONS)
        .setTrackingEnabled(false)
        .setProminentFaceOnly(false)
        .setLandmarkType(FaceDetector.ALL_LANDMARKS)
        .setMode(FaceDetector.ACCURATE_MODE)
        .setMinFaceSize(0.2f)
        .build()
  51. Confirm the operationality
    if (!faceDetector.isOperational) {
        toast("Face Detector could not be set up on your device :(")
        return
    }
  52. Detect faces
    val frame = Frame.Builder().setBitmap(bitmap).build()
    val faceArray = faceDetector.detect(frame)

  53. Calculate coordinates
    for (i in 0 until faceArray.size()) {
        val face = faceArray.valueAt(i)
        val left = face.position.x
        val top = face.position.y
        val right = left + face.width
        val bottom = top + face.height
    }
    (x, y): left, top • (x + width, y + height): right, bottom
    https://pixabay.com/en/woman-stylish-fashion-view-101542/
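
The coordinate arithmetic on slide 53 can be tried without a device. The sketch below is a minimal plain-Kotlin illustration, not code from the talk: `Point`, `DetectedFace`, `Box` and `boundsOf` are hypothetical stand-ins invented here, not classes from the Mobile Vision API. It assumes, as the slide's labels suggest, that `position` is the top-left corner of the face.

```kotlin
// Hypothetical stand-ins for the vision types (not the real API classes).
data class Point(val x: Float, val y: Float)
data class DetectedFace(val position: Point, val width: Float, val height: Float)
data class Box(val left: Float, val top: Float, val right: Float, val bottom: Float)

// position is treated as the top-left corner; right and bottom follow
// from adding the face's width and height.
fun boundsOf(face: DetectedFace): Box {
    val left = face.position.x
    val top = face.position.y
    return Box(left, top, left + face.width, top + face.height)
}

fun main() {
    val face = DetectedFace(Point(120f, 80f), width = 200f, height = 260f)
    println(boundsOf(face)) // prints Box(left=120.0, top=80.0, right=320.0, bottom=340.0)
}
```

Keeping the arithmetic in a small pure function like this makes it easy to unit test separately from the drawing code.
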
  54. Draw graphics on face
    for (i in 0 until faceArray.size()) {
        val face = faceArray.valueAt(i)
        ...
        val bound = RectF(left, top, right, bottom)
        canvas.drawRoundRect(bound, cornerRadius, cornerRadius, paint)
    }
    https://pixabay.com/en/woman-stylish-fashion-view-101542/
  55. Draw on all landmarks
    for (landmark in face.landmarks) {
        val x = landmark.position.x
        val y = landmark.position.y
        val landmarkType = landmark.type.toString()
        canvas.drawText(landmarkType, x, y, paint)
    }
    https://pixabay.com/en/woman-stylish-fashion-view-101542/
  56. Draw on a specific landmark
    for (landmark in face.landmarks) {
        val x = landmark.position.x
        val y = landmark.position.y
        when (landmark.type) {
            1, 7 -> canvas.drawCircle(x, y, radius, paint)
        }
    }
    https://pixabay.com/en/woman-stylish-fashion-view-101542/
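
The bare numbers 1 and 7 on slide 56 are the left and right cheek types listed on slide 13. As a rough sketch, the lookup below turns those numbers back into readable names. The map only contains the values shown on slide 13, and `landmarkNames`, `describeLandmark` and `isCheek` are hypothetical helpers written for this illustration. In real code the library's own named landmark constants would likely be a better fit than a hand-written table.

```kotlin
// Landmark type numbers exactly as listed on the "Detects landmarks" slide.
val landmarkNames: Map<Int, String> = mapOf(
    0 to "Bottom mouth",
    1 to "Left cheek",
    4 to "Left eye",
    5 to "Left mouth",
    6 to "Nose base",
    7 to "Right cheek",
    10 to "Right eye",
    11 to "Right mouth"
)

// Falls back to a placeholder for types that the slide does not list.
fun describeLandmark(type: Int): String = landmarkNames[type] ?: "Unknown ($type)"

// Mirrors the `1, 7 ->` branch used to pick out the cheeks.
fun isCheek(type: Int): Boolean = type == 1 || type == 7

fun main() {
    for (type in listOf(1, 7, 4, 3)) {
        println("$type -> ${describeLandmark(type)}, cheek=${isCheek(type)}")
    }
}
```
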
  57. Detect classification
    for (i in 0 until faceArray.size()) {
        val face = faceArray.valueAt(i)
        // face.isSmilingProbability
        // face.isLeftEyeOpenProbability
        // face.isRightEyeOpenProbability
        when (face.isSmilingProbability) {
            in 0.5f..1f -> payForFriedChicken()
            else -> { /* TODO */ }
        }
        ...
    https://pixabay.com/en/woman-stylish-fashion-view-101542/
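
A smile check such as the one on slide 57 can be reduced to a single threshold so that every probability lands in exactly one bucket. The following is only a sketch under stated assumptions: the `Smile` enum and `classifySmile` are hypothetical names, the 0.5 threshold is an illustrative choice, and values outside 0..1 are treated as "not computed", which is an assumption about how a detector might signal a missing classification.

```kotlin
enum class Smile { UNKNOWN, NOT_SMILING, SMILING }

// Buckets a probability with one cut-off so no value falls between ranges.
// threshold = 0.5f is an illustrative default, not a value from the talk.
fun classifySmile(probability: Float, threshold: Float = 0.5f): Smile = when {
    // Assumption: anything outside 0..1 (or NaN) means "no classification available".
    probability.isNaN() || probability < 0f || probability > 1f -> Smile.UNKNOWN
    probability >= threshold -> Smile.SMILING
    else -> Smile.NOT_SMILING
}

fun main() {
    // 0.006698033 is the isSmilingProbability value shown on the slides.
    for (p in listOf(0.006698033f, 0.495f, 0.5f, 0.98f, -1f)) {
        println("$p -> ${classifySmile(p)}")
    }
}
```

Returning an enum rather than acting inside the `when` keeps the decision testable and leaves the reaction (for example, calling `payForFriedChicken()`) to the caller.
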
  58. Detect position
    for (i in 0 until faceArray.size()) {
        val face = faceArray.valueAt(i)
        // face.eulerY
        // face.eulerZ
        Log.i(TAG, face.eulerY.toString())
        Log.i(TAG, face.eulerZ.toString())
    }
    https://pixabay.com/en/woman-stylish-fashion-view-101542/
  59. Update the image and release resources
    imageView.setImageDrawable(BitmapDrawable(resources, tempBitmap))
    faceDetector.release()

  60. GIVEN any photo WHEN any barcode or text is detected THEN retrieve information
  61. Create an immutable bitmap (since we don’t intend to draw on it)
    val bitmap = BitmapFactory.decodeResource(resources, R.drawable.image)
  64. Set up the detectors
    val barcodeDetector = BarcodeDetector.Builder(this)
        .setBarcodeFormats(Barcode.ALL_FORMATS)
        .build()
    val textRecognizer = TextRecognizer.Builder(this).build()
  65. Confirm the operationality
    if (!barcodeDetector.isOperational || !textRecognizer.isOperational) {
        return
    }

  66. Detect barcodes and text
    val frame = Frame.Builder().setBitmap(bitmap).build()
    val barcodeArray = barcodeDetector.detect(frame)
    val textArray = textRecognizer.detect(frame)
  67. Retrieve info - barcode
    for (i in 0 until barcodeArray.size()) {
        val barcodeData = barcodeArray.valueAt(i)
    }
  68. Retrieve info - barcode
    for (i in 0 until barcodeArray.size()) {
        val barcodeData = barcodeArray.valueAt(i).wifi
    }
  69. Retrieve info - text
    for (i in 0 until barcodeArray.size()) {
        val barcodeData = barcodeArray.valueAt(i).wifi
    }
    (0 until textArray.size())
        .map { textArray.valueAt(it) }
        .filterNot { it.value.isNullOrBlank() }
        .forEach { println(it.value) }
  73. Release resources
    barcodeDetector.release()
    textRecognizer.release()
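
The index-based loop over detection results on slides 69 to 72 can be exercised off-device with a small stand-in. This is a hedged sketch only: `ResultArray` is a hypothetical class that imitates just the `size()` and `valueAt(index)` calls used on the slides, and `Block` and `nonBlankValues` are invented for the example. Unlike the slide, it returns the values as a list instead of printing them, so the result can be checked in a test.

```kotlin
// Hypothetical stand-in exposing only the two calls used on the slides.
class ResultArray<T>(private val items: List<T>) {
    fun size(): Int = items.size
    fun valueAt(index: Int): T = items[index]
}

// Stand-in for a detected text block whose value may be missing.
data class Block(val value: String?)

// Walks the array by index, then drops null and blank values.
fun nonBlankValues(blocks: ResultArray<Block>): List<String> =
    (0 until blocks.size())
        .map { blocks.valueAt(it) }
        .mapNotNull { it.value }
        .filter { it.isNotBlank() }

fun main() {
    val detected = ResultArray(
        listOf(Block("Hello"), Block(null), Block("   "), Block("World"))
    )
    nonBlankValues(detected).forEach { println(it) }
}
```
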

  74. GIVEN any back camera WHEN any face is detected THEN overlay graphics and images
  77. Update the UI • Include a CameraPreview • Include an overlay class
  78. Set up a face tracker
    class FaceTracker internal constructor(
        private val overlay: GraphicOverlay
    ) : Tracker<Face>() {
        private val faceGraphic: FaceGraphic = FaceGraphic(overlay)
        ...
    }
  79. Set up a face tracker
    override fun onNewItem
    override fun onUpdate
    override fun onMissing
    override fun onDone
  80. Set up a face tracker factory
    class FaceTrackerFactory : MultiProcessor.Factory<Face> {
        override fun create(face: Face): Tracker<Face> = FaceTracker(overlay)
    }
  81. Set up a face detector
    val detector = FaceDetector.Builder(this)
        .setTrackingEnabled(true)...
    detector.setProcessor(
        MultiProcessor.Builder(faceTrackerFactory).build())
    if (!detector.isOperational) {
        //...
    }
  82. Create the camera source
    cameraSource = CameraSource.Builder(this, detector)
        .setRequestedPreviewSize(640, 480)
        .setFacing(CameraSource.CAMERA_FACING_BACK)
        .setRequestedFps(30.0f)
        .build()
  83. GIVEN any back camera WHEN any object is detected THEN overlay graphics and images
  85. Update the UI • Include a CameraPreview • Include the overlay classes - FaceGraphic.kt, BarcodeGraphic.kt, TextGraphic.kt
  86. Set up the respective object trackers
    class FaceTracker internal constructor(
        private val overlay: GraphicOverlay
    ) : Tracker<Face>() {
        private val faceGraphic: FaceGraphic = FaceGraphic(overlay)
        ...
    }
  87. Set up the respective object trackers
    class BarcodeTracker internal constructor(
        private val overlay: GraphicOverlay
    ) : Tracker<Barcode>() {
        private val barcodeGraphic: BarcodeGraphic = BarcodeGraphic(overlay)
        ...
    }
  88. As well as the tracker factories
    class FaceTrackerFactory : MultiProcessor.Factory<Face> {
        override fun create(face: Face): Tracker<Face> = FaceTracker(overlay)
    }
    class BarcodeTrackerFactory : MultiProcessor.Factory<Barcode> {
        override fun create(barcode: Barcode): Tracker<Barcode> = BarcodeTracker(overlay)
    }
  90. Set up the detectors
    val faceDetector = FaceDetector.Builder(this).build()
    val barcodeDetector = BarcodeDetector.Builder(this).build()
    faceDetector.setProcessor(
        MultiProcessor.Builder(faceTrackerFactory).build())
    barcodeDetector.setProcessor(
        MultiProcessor.Builder(barcodeTrackerFactory).build())
  91. Wrap the detectors with a MultiDetector
    val multiDetector = MultiDetector.Builder()
        .add(faceDetector)
        .add(barcodeDetector)
        .build()
    if (!multiDetector.isOperational) {
        //...
    }
  92. Create the camera source
    cameraSource = CameraSource.Builder(this, multiDetector)
        .setRequestedPreviewSize(640, 480)
        .setFacing(CameraSource.CAMERA_FACING_BACK)
        .setRequestedFps(30.0f)
        .build()
  93. References
    https://developers.google.com/vision/
    https://codelabs.developers.google.com/codelabs/face-detection/index.html
    https://github.com/googlesamples/android-vision/tree/master/visionSamples
    Illustrations: The extremely talented Virginia Poltrack - @VPoltrack
  94. Thank you! • Moyinoluwa Adeyemi • Off Grid Electric • @moyheen