Seeing is Believing: Mobile Vision API Deep Dive

Did you know that your mobile device can recognize objects in photos and videos? It can detect faces, barcodes and text in an image and even detect whether those faces are smiling or winking! All this is possible because of the Mobile Vision API, which finds objects in photos and videos on mobile devices using real-time on-device vision technology. This API will enable you to edit photos and videos, automate text data entry, embed barcode scanners into your applications and even build your own version of a Stories application with lots of filters. In this talk, you’ll learn how to use the Mobile Vision API to get more out of your photos and videos. You’ll get a jumpstart on implementing Machine Learning in your apps with the power of this API!

Moyinoluwa Adeyemi

September 25, 2017

Transcript

  1. Seeing is Believing
    Mobile Vision API Deep Dive

  2. Moyinoluwa Adeyemi
    Off Grid Electric
    @moyheen

  8. Mobile Vision API
    ● Finds objects in photos and videos
    ● Includes face, barcode and text detectors
    ● Easy to set up
    ● Works locally
    ● Available offline
    ● Free

  10. Face Detection

  11. Detects human faces
    Sorry dawg, humans only.

  12. Understands faces positioned at different angles
    Euler X - up/down
    Euler Y - left/right
    Euler Z - rotated/slanted
    https://developers.google.com/vision/face-detection-concepts

  13. Detects landmarks
    Left eye - 4
    Right eye - 10
    Nose base - 6
    Left cheek - 1
    Right cheek - 7
    Left, Right and Bottom Mouth - 5, 11, 0
    https://pixabay.com/en/woman-stylish-fashion-view-101542/

  16. Understands facial expressions
    isSmilingProbability: 0.006698033
    isLeftEyeOpenProbability: 0.98714304
    isRightEyeOpenProbability: 0.69178355
    https://pixabay.com/en/woman-stylish-fashion-view-101542/

  17. Works on all skin colors

  18. Barcode Detection

  19. Detects barcodes in images and videos

  20. Works on both 1D and 2D barcodes
    https://www.adazonusa.com/blog/wp-content/uploads/2016/03/1D-barcode-vs-2D-barcodes.jpg

  21. Detects multiple barcodes in one image

  22. Even when they are upside down

  23. Text Detection

  24. Detects text in images and videos
    #TIL

  25. Segments text into blocks, lines and words
    https://developers.google.com/vision/images/text-structure.png

  26. Works only on Latin-based languages
    Spanish
    English
    Hungarian
    Norwegian
    German
    Dutch
    French
    Catalan
    Portuguese
    Romanian
    Polish
    Danish
    Finnish
    Italian
    Swedish
    Turkish

  27. Has a wide range of applications

  28. Getting Started

  32. Update the gradle file
    dependencies {
        compile 'com.google.android.gms:play-services-vision:11.2.2'
    }

  35. Update the manifest file
    <meta-data
        android:name="com.google.android.gms.vision.DEPENDENCIES"
        android:value="face, barcode, text" />
    <meta-data
        android:name="com.google.android.gms.version"
        android:value="@integer/google_play_services_version"/>

  36. Run only on a background thread
    CAUTION: DO NOT RUN ON UI THREAD
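
One possible way to keep detection off the UI thread is a plain Java executor. The sketch below uses no Android classes so it can run anywhere; `runInBackground`, `work` and `onResult` are hypothetical names, not part of the Mobile Vision API. In a real app the `work` lambda would wrap the detector call, and the result would still have to be posted back to the main thread before touching any views.

```kotlin
import java.util.concurrent.ExecutorService
import java.util.concurrent.Future

// Hypothetical helper: runs `work` on the given executor and passes its
// result to `onResult` on that same background thread.
fun <T> runInBackground(
    executor: ExecutorService,
    work: () -> T,
    onResult: (T) -> Unit
): Future<*> = executor.submit(Runnable { onResult(work()) })
```

The returned `Future` lets the caller wait for completion or cancel the work, for example when the screen is closed before detection finishes.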

  37. Release resources at the end
    faceDetector.release()
    barcodeDetector.release()
    textRecognizer.release()
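
To make sure `release()` runs even when detection throws, the call can sit in a `finally` block. This is a minimal sketch under stated assumptions: `Releasable` and `useAndRelease` are hypothetical names invented for illustration, and the real detectors do not implement this interface. In an app, the same `try`/`finally` shape would be written directly around the detector.

```kotlin
// Hypothetical stand-in for a detector: anything that exposes release().
interface Releasable {
    fun release()
}

// Runs `block` with the receiver and always releases it afterwards,
// whether `block` returns normally or throws.
fun <D : Releasable, R> D.useAndRelease(block: (D) -> R): R {
    try {
        return block(this)
    } finally {
        release()
    }
}
```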

  38. GIVEN any photo
    WHEN any face is detected
    THEN overlay image and graphics

  40. Convert photo to a mutable bitmap
    val bitmapOptions = BitmapFactory.Options().apply {
        inMutable = true
    }
    val bitmap = BitmapFactory.decodeResource(resources,
        R.drawable.image,
        bitmapOptions)

  43. Prepare the canvas
    val tempBitmap = Bitmap.createBitmap(bitmap.width, bitmap.height,
        Bitmap.Config.RGB_565)
    val paint = Paint().apply {
        strokeWidth = 5f
        color = Color.MAGENTA
        style = Paint.Style.STROKE
    }
    val canvas = Canvas(tempBitmap)
    canvas.drawBitmap(bitmap, 0f, 0f, paint)

  44. Set up the face detector
    val faceDetector = FaceDetector.Builder(this)
        .setClassificationType(FaceDetector.ALL_CLASSIFICATIONS)
        .setTrackingEnabled(false)
        .setProminentFaceOnly(false)
        .setLandmarkType(FaceDetector.ALL_LANDMARKS)
        .setMode(FaceDetector.ACCURATE_MODE)
        .setMinFaceSize(0.2f)
        .build()

  51. Confirm the detector is operational
    if (!faceDetector.isOperational) {
        toast("Face Detector could not be set up on your device :(")
        return
    }

  52. Detect faces
    val frame = Frame.Builder().setBitmap(bitmap).build()
    val faceArray = faceDetector.detect(frame)

  53. Calculate coordinates
    for (i in 0 until faceArray.size()) {
        val face = faceArray.valueAt(i)
        val left = face.position.x
        val top = face.position.y
        val right = left + face.width
        val bottom = top + face.height
    }
    (x, y) = left, top
    (x + width, y + height) = right, bottom
    https://pixabay.com/en/woman-stylish-fashion-view-101542/
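
The coordinate arithmetic above can be isolated into a small pure function. This is only a sketch: `Box` is a hypothetical plain-Kotlin stand-in for `RectF` so the example runs without Android, and `faceBounds` is an illustrative name.

```kotlin
// Plain-Kotlin stand-in for android.graphics.RectF (assumption for this sketch).
data class Box(val left: Float, val top: Float, val right: Float, val bottom: Float)

// (x, y) is the top-left corner of the face; width and height extend
// right and down from it, matching the slide's left/top/right/bottom.
fun faceBounds(x: Float, y: Float, width: Float, height: Float): Box =
    Box(left = x, top = y, right = x + width, bottom = y + height)
```

For example, a face at (10, 20) that is 100 wide and 50 tall gives a box from (10, 20) to (110, 70).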

  54. Draw graphics on face
    for (i in 0 until faceArray.size()) {
        val face = faceArray.valueAt(i)
        ...
        val bound = RectF(left, top, right, bottom)
        canvas.drawRoundRect(bound, cornerRadius, cornerRadius, paint)
    }
    https://pixabay.com/en/woman-stylish-fashion-view-101542/

  55. Draw on all landmarks
    for (landmark in face.landmarks) {
        val x = landmark.position.x
        val y = landmark.position.y
        val landmarkType = landmark.type.toString()
        canvas.drawText(landmarkType, x, y, paint)
    }
    https://pixabay.com/en/woman-stylish-fashion-view-101542/

  56. Draw on a specific landmark
    for (landmark in face.landmarks) {
        val x = landmark.position.x
        val y = landmark.position.y
        when (landmark.type) {
            // 1 = left cheek, 7 = right cheek
            1, 7 -> canvas.drawCircle(x, y, radius, paint)
        }
    }
    https://pixabay.com/en/woman-stylish-fashion-view-101542/
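
Since the landmark types are bare integers, a lookup table can make the drawing code easier to read. The sketch below uses only the codes listed on the "Detects landmarks" slide; `landmarkNames`, `landmarkLabel` and `isCheek` are hypothetical helper names, and any code not on that slide is reported as unknown rather than guessed.

```kotlin
// Landmark type codes as listed on the "Detects landmarks" slide.
val landmarkNames: Map<Int, String> = mapOf(
    0 to "Bottom mouth",
    1 to "Left cheek",
    4 to "Left eye",
    5 to "Left mouth",
    6 to "Nose base",
    7 to "Right cheek",
    10 to "Right eye",
    11 to "Right mouth"
)

// Returns a readable label, or a fallback for codes not in the table.
fun landmarkLabel(type: Int): String = landmarkNames[type] ?: "Unknown ($type)"

// True for the two cheek landmarks that the slide draws circles on.
fun isCheek(type: Int): Boolean = type == 1 || type == 7
```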

  57. Detect classification
    for (i in 0 until faceArray.size()) {
        val face = faceArray.valueAt(i)
        // face.isSmilingProbability
        // face.isLeftEyeOpenProbability
        // face.isRightEyeOpenProbability
        when (face.isSmilingProbability) {
            in 0.5f..1f -> payForFriedChicken()
            else -> { /* TODO */ }
        }
        ...
    }
    https://pixabay.com/en/woman-stylish-fashion-view-101542/
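
The 0.5 cut-off used on this slide can be sketched as a pure function that covers every possible input. `Smile` and `classifySmile` are hypothetical names. Treating values outside 0..1 (such as a negative "not computed" marker) as `UNKNOWN` is an assumption of this sketch, not documented behavior taken from the deck.

```kotlin
enum class Smile { SMILING, NOT_SMILING, UNKNOWN }

// Maps a smile probability to a label. The 0.5 threshold follows the slide;
// NaN and values outside 0..1 are reported as UNKNOWN (assumption).
fun classifySmile(probability: Float): Smile = when {
    probability.isNaN() || probability < 0f || probability > 1f -> Smile.UNKNOWN
    probability >= 0.5f -> Smile.SMILING
    else -> Smile.NOT_SMILING
}
```

Comparing against a single threshold avoids the gap that two separate ranges such as 0.0..0.49 and 0.5..1 would leave for values like 0.495.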

  58. Detect position
    for (i in 0 until faceArray.size()) {
        val face = faceArray.valueAt(i)
        // face.eulerY
        // face.eulerZ
        Log.i(TAG, face.eulerY.toString())
        Log.i(TAG, face.eulerZ.toString())
    }
    https://pixabay.com/en/woman-stylish-fashion-view-101542/

  59. Update the image and release resources
    imageView.setImageDrawable(BitmapDrawable(resources, tempBitmap))
    faceDetector.release()

  60. GIVEN any photo
    WHEN any barcode or text is detected
    THEN retrieve information

  61. Create an immutable bitmap (since we don’t intend to draw on it)
    val bitmap = BitmapFactory.decodeResource(resources, R.drawable.image)

  64. Set up the detectors
    val barcodeDetector = BarcodeDetector.Builder(this)
        .setBarcodeFormats(Barcode.ALL_FORMATS)
        .build()
    val textRecognizer = TextRecognizer.Builder(this).build()

  65. Confirm the detectors are operational
    if (!barcodeDetector.isOperational || !textRecognizer.isOperational) {
        return
    }

  66. Detect barcodes and text
    val frame = Frame.Builder().setBitmap(bitmap).build()
    val barcodeArray = barcodeDetector.detect(frame)
    val textArray = textRecognizer.detect(frame)

  67. Retrieve info - barcode
    for (i in 0 until barcodeArray.size()) {
        val barcodeData = barcodeArray.valueAt(i)
    }

  68. Retrieve info - barcode
    for (i in 0 until barcodeArray.size()) {
        val barcodeData = barcodeArray.valueAt(i).wifi
    }

  69. Retrieve info - text
    for (i in 0 until barcodeArray.size()) {
        val barcodeData = barcodeArray.valueAt(i).wifi
    }
    (0 until textArray.size())
        .map { textArray.valueAt(it) }
        .filterNot { it.value.isNullOrBlank() }
        .forEach { println(it.value) }
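
The index-range pipeline above (walk the indices, look up each value, drop blanks) can be tried without Android. In this sketch `IndexedBlocks` is a hypothetical stand-in for the array of detected text blocks, exposing only `size()` and `valueAt()`, and each block is reduced to its nullable string value.

```kotlin
// Hypothetical stand-in for the array of detected text blocks:
// indexed access only, and values may be null or blank.
class IndexedBlocks(private val items: List<String?>) {
    fun size(): Int = items.size
    fun valueAt(index: Int): String? = items[index]
}

// Mirrors the slide's chain: iterate indices, fetch each value,
// then discard null or blank entries.
fun nonBlankValues(blocks: IndexedBlocks): List<String> =
    (0 until blocks.size())
        .map { blocks.valueAt(it) }
        .filterNot { it.isNullOrBlank() }
        .filterNotNull()
```

Returning a list instead of printing inside `forEach` keeps the filtering step easy to reuse, for instance to fill a text field.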

  73. Release resources
    barcodeDetector.release()
    textRecognizer.release()

  74. GIVEN any back camera
    WHEN any face is detected
    THEN overlay graphics and images

  77. Update the UI
    ● Include a CameraPreview
    ● Include an overlay class

  78. Set up a face tracker
    class FaceTracker internal constructor(
        private val overlay: GraphicOverlay
    ) : Tracker<Face>() {
        private val faceGraphic: FaceGraphic = FaceGraphic(overlay)
        ...
    }

  79. Set up a face tracker
    override fun onNewItem
    override fun onUpdate
    override fun onMissing
    override fun onDone

  80. Set up a face tracker factory
    class FaceTrackerFactory : MultiProcessor.Factory<Face> {
        override fun create(face: Face): Tracker<Face> =
            FaceTracker(overlay)
    }

  81. Set up a face detector
    val detector = FaceDetector.Builder(this)
        .setTrackingEnabled(true)...
    detector.setProcessor(
        MultiProcessor.Builder(faceTrackerFactory).build())
    if (!detector.isOperational) {
        //...
    }

  82. Create the camera source
    cameraSource = CameraSource.Builder(this, detector)
        .setRequestedPreviewSize(640, 480)
        .setFacing(CameraSource.CAMERA_FACING_BACK)
        .setRequestedFps(30.0f)
        .build()

  83. GIVEN any back camera
    WHEN any object is detected
    THEN overlay graphics and images

  85. Update the UI
    ● Include a CameraPreview
    ● Include the overlay classes - FaceGraphic.kt, BarcodeGraphic.kt, TextGraphic.kt

  86. Set up the respective object trackers
    class FaceTracker internal constructor(
        private val overlay: GraphicOverlay
    ) : Tracker<Face>() {
        private val faceGraphic: FaceGraphic = FaceGraphic(overlay)
        ...
    }

  87. Set up the respective object trackers
    class BarcodeTracker internal constructor(
        private val overlay: GraphicOverlay
    ) : Tracker<Barcode>() {
        private val barcodeGraphic: BarcodeGraphic =
            BarcodeGraphic(overlay)
        ...
    }

  88. As well as the tracker factories
    class FaceTrackerFactory : MultiProcessor.Factory<Face> {
        override fun create(face: Face): Tracker<Face> =
            FaceTracker(overlay)
    }
    class BarcodeTrackerFactory : MultiProcessor.Factory<Barcode> {
        override fun create(barcode: Barcode): Tracker<Barcode> =
            BarcodeTracker(overlay)
    }

  90. Set up the detectors
    val faceDetector = FaceDetector.Builder(this).build()
    val barcodeDetector = BarcodeDetector.Builder(this).build()
    faceDetector.setProcessor(
        MultiProcessor.Builder(faceTrackerFactory).build())
    barcodeDetector.setProcessor(
        MultiProcessor.Builder(barcodeTrackerFactory).build())

  91. Wrap the detectors with a MultiDetector
    val multiDetector = MultiDetector.Builder()
        .add(faceDetector)
        .add(barcodeDetector)
        .build()
    if (!multiDetector.isOperational) {
        //...
    }

  92. Create the camera source
    cameraSource = CameraSource.Builder(this, multiDetector)
        .setRequestedPreviewSize(640, 480)
        .setFacing(CameraSource.CAMERA_FACING_BACK)
        .setRequestedFps(30.0f)
        .build()

  93. References
    https://developers.google.com/vision/
    https://codelabs.developers.google.com/codelabs/face-detection/index.html
    https://github.com/googlesamples/android-vision/tree/master/visionSamples
    Illustrations
    The extremely talented Virginia Poltrack - @VPoltrack

  94. Moyinoluwa Adeyemi
    Off Grid Electric
    @moyheen
    Thank you!
