Building Smarter Apps: ML Kit for Android

Every day we see our devices empowered to help with more tasks and decisions. There is a constant need to make our apps and products smarter so that they deliver more value to our users.

Google, in a bid to make Android developers’ lives easier, recently released ML Kit to give them a soft landing into machine learning.

In my talk, I will demonstrate how you can build on ML Kit to solve everyday problems such as face detection, barcode scanning, object detection and tracking, and text recognition, and walk through its full capabilities.

Omolara Adejuwon

October 08, 2020

Transcript

  1. Building Smarter Apps: ML Kit for Android

  2. Omolara Adejuwon
    Lead Android Engineer, Kudi
    @_larikraun

  3. About ML Kit: Introduction
    Easy-to-use Vision and Natural Language APIs to solve everyday
    challenges in your apps or create brand-new user experiences.

  4. Why ML Kit?
    On-device processing
    Optimized for mobile
    Works offline
    Builds on Google’s existing ML models
    Actively being worked on
    Some APIs already support custom models

  5. About ML Kit: Vision APIs
    Face Detection
    Barcode Scanning
    Text Recognition
    Image Labeling
    Object Detection and Tracking
    Entity Extraction (still in Early Access Program)
    Pose Detection

  6. Natural Language APIs
    Language Identification
    Text Translation
    Smart Replies

  7. Face Detection
    Landmarks: eyes, nose, mouth, ears
    Contours: facial features, angles, corners
    Classifications: eyes open or closed, smiling or not

  8. Face Orientation
    Euler X: positive when the face is facing up, negative when it is facing down
    Euler Y: positive when the face is facing right, negative when it is facing left
    Euler Z: positive when the face is rotated counter-clockwise
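
The sign conventions above can be turned into a small helper. This is only a sketch in plain Kotlin: `describeOrientation` and the 12-degree threshold are hypothetical names and values chosen for illustration, not part of ML Kit.

```kotlin
// Hypothetical threshold (in degrees): smaller angles are treated as "straight".
const val STRAIGHT_THRESHOLD_DEGREES = 12f

/** Describes a head pose using the sign conventions listed on the slide. */
fun describeOrientation(eulerX: Float, eulerY: Float, eulerZ: Float): String {
    // Euler X: positive means facing up, negative means facing down.
    val vertical = when {
        eulerX > STRAIGHT_THRESHOLD_DEGREES -> "up"
        eulerX < -STRAIGHT_THRESHOLD_DEGREES -> "down"
        else -> "level"
    }
    // Euler Y: positive means facing right, negative means facing left.
    val horizontal = when {
        eulerY > STRAIGHT_THRESHOLD_DEGREES -> "right"
        eulerY < -STRAIGHT_THRESHOLD_DEGREES -> "left"
        else -> "centre"
    }
    // Euler Z: positive means rotated counter-clockwise.
    val roll = when {
        eulerZ > STRAIGHT_THRESHOLD_DEGREES -> "counter-clockwise"
        eulerZ < -STRAIGHT_THRESHOLD_DEGREES -> "clockwise"
        else -> "upright"
    }
    return "vertical=$vertical, horizontal=$horizontal, roll=$roll"
}

fun main() {
    println(describeOrientation(20f, -15f, 0f))
}
```

In an app, the three arguments would come from `face.headEulerAngleX`, `face.headEulerAngleY` and `face.headEulerAngleZ`.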

  9. Single face
    boundingBox=Rect(250, 107 - 469, 326)
    rightEyeOpenProbability=0.93697304
    leftEyeOpenProbability=0.98428077
    smileProbability=0.9918384
    eulerX=5.657005
    eulerY=-11.135679
    eulerZ=-11.309123

  10. Single face with glasses and facial hair
    boundingBox=Rect(243, 74 - 506, 337)
    rightEyeOpenProbability=0.989656
    leftEyeOpenProbability=0.9913695
    smileProbability=0.0145743145
    eulerX=-0.452067
    eulerY=-8.546794
    eulerZ=-0.58841664

  11. ...and even with makeup
    boundingBox=Rect(-59, 73 - 556, 689)
    rightEyeOpenProbability=0.9847822
    leftEyeOpenProbability=0.60155135
    smileProbability=0.996382
    eulerX=-6.9748883
    eulerY=3.8424606
    eulerZ=-3.00418

  12. Multiple faces

  13. Barcode Scanning
    Automatically detects the barcode format
    Reads many formats
    • Linear, 2D
    Automatically extracts structured data of many types
    • URLs, email addresses, phone numbers, SMS message prompts, ISBNs,
    Wi-Fi information, etc.
    Works even when the barcode is upside down
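
Reading the structured data usually means branching on the barcode's value type. The sketch below assumes the `Barcode` class from the ML Kit barcode-scanning artifact (its package differs between library versions, as noted in the comment); `describeBarcode` is a hypothetical helper name.

```kotlin
// Assumption: in 16.x releases Barcode lives in this package; newer releases
// moved it to com.google.mlkit.vision.barcode.common.Barcode.
import com.google.mlkit.vision.barcode.Barcode

/** Returns a short, human-readable description of a scanned barcode. */
fun describeBarcode(barcode: Barcode): String = when (barcode.valueType) {
    Barcode.TYPE_URL -> "URL: ${barcode.url?.url}"
    Barcode.TYPE_EMAIL -> "Email: ${barcode.email?.address}"
    Barcode.TYPE_PHONE -> "Phone: ${barcode.phone?.number}"
    Barcode.TYPE_WIFI -> "Wi-Fi network: ${barcode.wifi?.ssid}"
    // Fall back to the undecoded payload for every other type.
    else -> "Raw value: ${barcode.rawValue}"
}
```

The helper would typically be called for each item of the list delivered to the scanner's success listener.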

  14. Text Recognition
    Works on Latin-based languages
    Breaks large texts down into units
    • Blocks contain lines
    • Lines contain elements
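
The block, line and element hierarchy can be walked with three nested loops. This is a minimal sketch that assumes a `Text` result has already been obtained from the text recognizer; `collectElements` is a hypothetical helper name.

```kotlin
import com.google.mlkit.vision.text.Text

/** Flattens a recognition result into the text of its individual elements. */
fun collectElements(result: Text): List<String> {
    val elements = mutableListOf<String>()
    for (block in result.textBlocks) {        // blocks contain lines
        for (line in block.lines) {           // lines contain elements
            for (element in line.elements) {  // an element is roughly a word
                elements += element.text
            }
        }
    }
    return elements
}
```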

  15. Language Recognition
    Works on Latin-based languages
    Identifies over 100 languages

  16. Smart Replies
    Only English is supported at the moment
    Intended for casual conversations
    Generates suggestions based on your conversation

  17. Use Case: Initial Problems
    • Unusual objects
    • Impersonation
    • Still images

  18. Use Case: Goal
    • Reduce fraud on financial transactions.
    • Validate our users by comparing their government-issued
    identification with live images captured from them.

  19. Use Case: Solution
    • Detect blinking and smiling (liveness check)
    • Use Euler X, Y, Z values for face positions
    • Compare government IDs with the current picture
    • Depending on the score, allow certain actions
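
The blink-and-smile part of the solution can be sketched in plain Kotlin. Everything here is an assumption made for illustration: `FrameSample`, the three thresholds and the function names are hypothetical, and the talk does not describe Kudi's actual implementation. The idea is that a still image can never show the eyes going from open to closed and back to open.

```kotlin
/** Per-frame classification values, mirroring the probabilities ML Kit reports for a face. */
data class FrameSample(
    val leftEyeOpenProbability: Float,
    val rightEyeOpenProbability: Float,
    val smilingProbability: Float
)

// Hypothetical thresholds; real values would need tuning on real footage.
const val EYES_OPEN_THRESHOLD = 0.8f
const val EYES_CLOSED_THRESHOLD = 0.2f
const val SMILE_THRESHOLD = 0.8f

/** True if both eyes go open, then closed, then open again within the sequence. */
fun hasBlink(frames: List<FrameSample>): Boolean {
    var state = 0 // 0: waiting for open eyes, 1: waiting for closed eyes, 2: waiting for reopened eyes
    for (frame in frames) {
        val bothOpen =
            minOf(frame.leftEyeOpenProbability, frame.rightEyeOpenProbability) > EYES_OPEN_THRESHOLD
        val bothClosed =
            maxOf(frame.leftEyeOpenProbability, frame.rightEyeOpenProbability) < EYES_CLOSED_THRESHOLD
        when (state) {
            0 -> if (bothOpen) state = 1
            1 -> if (bothClosed) state = 2
            2 -> if (bothOpen) return true
        }
    }
    return false
}

/** True if at least one frame shows a confident smile. */
fun hasSmile(frames: List<FrameSample>): Boolean =
    frames.any { it.smilingProbability > SMILE_THRESHOLD }

/** A sequence passes only when it contains both a blink and a smile. */
fun passesLivenessCheck(frames: List<FrameSample>): Boolean =
    hasBlink(frames) && hasSmile(frames)
```

ML Kit reports these probabilities as nullable values, so a real pipeline would skip frames where classification is unavailable before building the sequence.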

  20. Success Report
    [Chart: August vs. September, vertical axis from 0 to 50]

  21. SHOW US THE CODE
    (Face Detection as Case Study)

  22. app/build.gradle
    dependencies {
        // ...
        implementation 'com.google.mlkit:face-detection:16.0.2'
    }

  23. Set options
    val options = FaceDetectorOptions.Builder()
        // PERFORMANCE_MODE_ACCURATE or PERFORMANCE_MODE_FAST
        .setPerformanceMode(FaceDetectorOptions.PERFORMANCE_MODE_ACCURATE)

  24. Set options
    val options = FaceDetectorOptions.Builder()
        // PERFORMANCE_MODE_ACCURATE or PERFORMANCE_MODE_FAST
        .setPerformanceMode(FaceDetectorOptions.PERFORMANCE_MODE_ACCURATE)
        // LANDMARK_MODE_NONE or LANDMARK_MODE_ALL
        .setLandmarkMode(FaceDetectorOptions.LANDMARK_MODE_ALL)

  25. Set options
    val options = FaceDetectorOptions.Builder()
        // PERFORMANCE_MODE_ACCURATE or PERFORMANCE_MODE_FAST
        .setPerformanceMode(FaceDetectorOptions.PERFORMANCE_MODE_ACCURATE)
        // LANDMARK_MODE_NONE or LANDMARK_MODE_ALL
        .setLandmarkMode(FaceDetectorOptions.LANDMARK_MODE_ALL)
        // CLASSIFICATION_MODE_NONE or CLASSIFICATION_MODE_ALL
        .setClassificationMode(FaceDetectorOptions.CLASSIFICATION_MODE_ALL)

  26. Set options
    val options = FaceDetectorOptions.Builder()
        // PERFORMANCE_MODE_ACCURATE or PERFORMANCE_MODE_FAST
        .setPerformanceMode(FaceDetectorOptions.PERFORMANCE_MODE_ACCURATE)
        // LANDMARK_MODE_NONE or LANDMARK_MODE_ALL
        .setLandmarkMode(FaceDetectorOptions.LANDMARK_MODE_ALL)
        // CLASSIFICATION_MODE_NONE or CLASSIFICATION_MODE_ALL
        .setClassificationMode(FaceDetectorOptions.CLASSIFICATION_MODE_ALL)
        // CONTOUR_MODE_ALL or CONTOUR_MODE_NONE
        .setContourMode(FaceDetectorOptions.CONTOUR_MODE_ALL)

  27. Set options
    val options = FaceDetectorOptions.Builder()
        // PERFORMANCE_MODE_ACCURATE or PERFORMANCE_MODE_FAST
        .setPerformanceMode(FaceDetectorOptions.PERFORMANCE_MODE_ACCURATE)
        // LANDMARK_MODE_NONE or LANDMARK_MODE_ALL
        .setLandmarkMode(FaceDetectorOptions.LANDMARK_MODE_ALL)
        // CLASSIFICATION_MODE_NONE or CLASSIFICATION_MODE_ALL
        .setClassificationMode(FaceDetectorOptions.CLASSIFICATION_MODE_ALL)
        // CONTOUR_MODE_ALL or CONTOUR_MODE_NONE
        .setContourMode(FaceDetectorOptions.CONTOUR_MODE_ALL)
        .build()

  28. Process the image
    val faceDetector = FaceDetection.getClient(options)

  29. Process the image
    val faceDetector = FaceDetection.getClient(options)
    // Create the InputImage from ONE of the sources below.
    // 1. A bitmap, e.g. from storage
    val image = InputImage.fromBitmap(bitmap, 0)
    // 2. A Camera1 frame
    // val image = InputImage.fromByteBuffer(byteBuffer, width, height, rotation, imageFormat)
    // 3. A CameraX frame (imageProxy is the ImageProxy handed to the analyzer)
    // val image = InputImage.fromMediaImage(imageProxy.image!!, imageProxy.imageInfo.rotationDegrees)

  30. Process the image
    val faceDetector = FaceDetection.getClient(options)
    ...
    val result = faceDetector.process(image)
        .addOnSuccessListener { faces ->
            // Task completed successfully
            // ...
        }
        .addOnFailureListener { e ->
            // Task failed with an exception
            // ...
        }

  31. Face properties
    //Bounding Box
    val bounds = face.boundingBox

  32. Face properties
    //Bounding Box
    val bounds = face.boundingBox
    //Euler Values
    val rotY = face.headEulerAngleY
    val rotZ = face.headEulerAngleZ

  33. Face properties
    //Bounding Box
    val bounds = face.boundingBox
    //Euler Values
    val rotY = face.headEulerAngleY
    val rotZ = face.headEulerAngleZ
    //Landmarks
    val leftEar = face.getLandmark(FaceLandmark.LEFT_EAR)
    val mouthBottom = face.getLandmark(FaceLandmark.MOUTH_BOTTOM)
    val rightEye = face.getLandmark(FaceLandmark.RIGHT_EYE)
    val noseBase = face.getLandmark(FaceLandmark.NOSE_BASE)

  34. Face properties
    //Contours
    val leftEyeContour = face.getContour(FaceContour.LEFT_EYE)?.points
    val upperLipBottomContour = face.getContour(FaceContour.UPPER_LIP_BOTTOM)?.points

  35. Face properties
    //Contours
    val leftEyeContour = face.getContour(FaceContour.LEFT_EYE)?.points
    val upperLipBottomContour = face.getContour(FaceContour.UPPER_LIP_BOTTOM)?.points
    //Classifications
    val smileProb = face.smilingProbability
    val rightEyeOpenProb = face.rightEyeOpenProbability

  36. Face properties
    //Contours
    val leftEyeContour = face.getContour(FaceContour.LEFT_EYE)?.points
    val upperLipBottomContour = face.getContour(FaceContour.UPPER_LIP_BOTTOM)?.points
    //Classifications
    val smileProb = face.smilingProbability
    val rightEyeOpenProb = face.rightEyeOpenProbability
    //Tracking Id
    val id = face.trackingId
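
Several of these properties are nullable, which matters when they are consumed. As a minimal sketch (the helper name `summarizeFace` is hypothetical), the classification values are only populated when classification is enabled in the options, and the tracking id only when tracking has been switched on with `enableTracking()` on the options builder:

```kotlin
import com.google.mlkit.vision.face.Face

/** Builds a one-line summary of a detected face, tolerating missing values. */
fun summarizeFace(face: Face): String {
    // Null unless CLASSIFICATION_MODE_ALL was set on the detector options.
    val smile = face.smilingProbability?.toString() ?: "n/a"
    val rightEyeOpen = face.rightEyeOpenProbability?.toString() ?: "n/a"
    // Null unless enableTracking() was called on the options builder.
    val trackingId = face.trackingId?.toString() ?: "n/a"
    return "bounds=${face.boundingBox}, rotY=${face.headEulerAngleY}, " +
        "rotZ=${face.headEulerAngleZ}, smile=$smile, " +
        "rightEyeOpen=$rightEyeOpen, id=$trackingId"
}
```

Inside the success listener this could be used as `faces.forEach { Log.d("Faces", summarizeFace(it)) }`.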

  37. Using other APIs
    // Object Detection (dependency goes in app/build.gradle)
    implementation 'com.google.mlkit:object-detection:16.2.1'
    // Takes in InputImage
    val tasks = ObjectDetection.getClient(options).process(inputImage)
    // Returns Task<List<DetectedObject>>

  38. Using other APIs
    // Image Labeling
    // Takes in InputImage
    val tasks = ImageLabeling.getClient(options).process(inputImage)
    // Returns Task<List<ImageLabel>>

  39. Using other APIs
    // Barcode Scanning
    // Takes in InputImage
    val tasks = BarcodeScanning.getClient(options).process(inputImage)
    // Returns Task<List<Barcode>>

  40. Using other APIs
    // Language Recognition
    // Takes in text (a String)
    val tasks = LanguageIdentification.getClient().identifyLanguage(text)
    // Returns Task<String>
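
The returned string is a language code, or `"und"` when no language could be determined. A minimal sketch of handling both cases follows; the wrapper `identifyLanguageOrNull` and its callback shape are assumptions made for illustration.

```kotlin
import com.google.mlkit.nl.languageid.LanguageIdentification

/**
 * Hypothetical wrapper: reports the detected language code,
 * or null when the language is undetermined or the task fails.
 */
fun identifyLanguageOrNull(text: String, onResult: (String?) -> Unit) {
    val identifier = LanguageIdentification.getClient()
    identifier.identifyLanguage(text)
        .addOnSuccessListener { languageCode ->
            // "und" is the code ML Kit returns for an undetermined language.
            onResult(if (languageCode == "und") null else languageCode)
        }
        .addOnFailureListener {
            // The model could not be loaded or another error occurred.
            onResult(null)
        }
}
```

A long-lived screen would normally keep one identifier instance and close it when it is no longer needed, rather than creating one per call as this sketch does.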

  41. Using other APIs
    // Smart Replies
    // Takes in a List<TextMessage>
    val tasks = SmartReply.getClient().suggestReplies(conversation)
    // Returns Task<SmartReplySuggestionResult>
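
The conversation is an ordered list of messages from the local and remote users. The sketch below shows one way it might be built and consumed; the message texts, the `"friend-1"` user id and the `suggestReplies` wrapper are made up for illustration.

```kotlin
import com.google.mlkit.nl.smartreply.SmartReply
import com.google.mlkit.nl.smartreply.SmartReplySuggestionResult
import com.google.mlkit.nl.smartreply.TextMessage

/** Hypothetical wrapper: delivers suggested reply texts, or an empty list. */
fun suggestReplies(onSuggestions: (List<String>) -> Unit) {
    val now = System.currentTimeMillis()
    // Messages must be in chronological order; all values are illustrative.
    val conversation = listOf(
        TextMessage.createForRemoteUser("Are you coming to the meetup?", now - 60_000, "friend-1"),
        TextMessage.createForLocalUser("Yes, just leaving now.", now - 30_000),
        TextMessage.createForRemoteUser("Great, see you soon!", now, "friend-1")
    )

    SmartReply.getClient().suggestReplies(conversation)
        .addOnSuccessListener { result ->
            if (result.status == SmartReplySuggestionResult.STATUS_SUCCESS) {
                onSuggestions(result.suggestions.map { it.text })
            } else {
                // For example, the conversation language is not supported.
                onSuggestions(emptyList())
            }
        }
        .addOnFailureListener { onSuggestions(emptyList()) }
}
```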

  42. Extra Notes
    Processing must not be done on the main thread
    Resources (e.g. the camera) must be released when the activity is destroyed
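
One way to follow both notes is to keep the detector and a background executor together and release them in one place. This is a sketch under assumptions: `FaceDetectionResources` is a hypothetical class, and the camera itself (not shown) still has to be released separately.

```kotlin
import com.google.mlkit.vision.face.FaceDetection
import com.google.mlkit.vision.face.FaceDetector
import java.io.Closeable
import java.util.concurrent.ExecutorService
import java.util.concurrent.Executors

/** Hypothetical holder for everything that must be released with the activity. */
class FaceDetectionResources : Closeable {
    // Frames are prepared on this thread instead of the main thread.
    val analysisExecutor: ExecutorService = Executors.newSingleThreadExecutor()

    // Detector with default options; pass FaceDetectorOptions to customize it.
    val detector: FaceDetector = FaceDetection.getClient()

    override fun close() {
        detector.close()            // frees the detector's model resources
        analysisExecutor.shutdown() // stops accepting new frame work
    }
}
```

An activity would create one instance in `onCreate()` and call `close()` on it from `onDestroy()`.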

  43. Resources
    Official Documentation
    Code Samples

  44. QUESTIONS?
    www.kudi.com
    Hiring @ [email protected]
