
Building Smarter Apps: ML Kit for Android

Every day, our devices take on more of our tasks and decisions, and there is constant pressure to make apps and products smarter so they deliver more value to users.

Google recently released ML Kit to make Android developers' lives easier and to give them a soft landing into machine learning.

In my talk, I will demonstrate how you can build on ML Kit to solve everyday problems such as face detection, barcode scanning, object detection and tracking, and text recognition, and I will walk through its full capabilities.

Omolara Adejuwon

October 08, 2020

Transcript

  1. Building Smarter Apps: ML Kit for Android

  2. Omolara Adejuwon
    Lead Android Engineer, Kudi
    @_larikraun

  3. About ML Kit: Introduction
    Easy-to-use Vision and Natural Language APIs to solve everyday
    challenges in your apps or create brand-new user experiences.

  4. Why ML Kit?
    On-device processing
    Optimized for mobile
    Works offline
    Built on Google's existing ML models
    Actively developed
    Some APIs already support custom models

  5. About ML Kit: Vision APIs
    Face Detection
    Barcode Scanning
    Text Recognition
    Image Labeling
    Object Detection and Tracking
    Entity Extraction (still in Early Access Program)
    Pose Detection

  6. Natural Language APIs
    Language Identification
    Text Translation
    Smart Replies

  7. Face Detection
    Landmarks: eyes, nose, mouth, ears
    Contours: facial features, angles, corners
    Classifications: eyes open or closed, smiling or not

  8. Face Orientation
    Euler X: positive is facing up, negative is facing down
    Euler Y: positive is facing right, negative is facing left
    Euler Z: positive is rotated counter-clockwise
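
The sign conventions on this slide can be turned into a coarse head-pose label. The following is only a sketch: `HeadPose`, `classifyHeadPose`, and the 12-degree threshold are illustrative assumptions, not part of ML Kit or of the talk.

```kotlin
import kotlin.math.abs

// Hypothetical labels for a coarse head pose (not an ML Kit type).
enum class HeadPose { FRONTAL, UP, DOWN, LEFT, RIGHT, TILTED }

// Maps the three Euler angles (in degrees) to a label using the sign
// conventions from the slide. The threshold is an assumed value to tune.
fun classifyHeadPose(
    eulerX: Float,
    eulerY: Float,
    eulerZ: Float,
    thresholdDegrees: Float = 12f
): HeadPose = when {
    eulerX > thresholdDegrees -> HeadPose.UP      // positive X: facing up
    eulerX < -thresholdDegrees -> HeadPose.DOWN   // negative X: facing down
    eulerY > thresholdDegrees -> HeadPose.RIGHT   // positive Y: facing right
    eulerY < -thresholdDegrees -> HeadPose.LEFT   // negative Y: facing left
    abs(eulerZ) > thresholdDegrees -> HeadPose.TILTED // in-plane rotation
    else -> HeadPose.FRONTAL
}
```

With the angles reported for the single-face example in this deck (5.657005, -11.135679, -11.309123), every value is within 12 degrees of zero, so the sketch returns `FRONTAL`.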

  9. Single face
    boundingBox=Rect(250, 107 - 469, 326)
    rightEyeOpenProbability=0.93697304
    leftEyeOpenProbability=0.98428077
    smileProbability=0.9918384,
    eulerX=5.657005
    eulerY=-11.135679
    eulerZ=-11.309123

  10. Single face with glasses and facial hair
    boundingBox=Rect(243, 74 - 506, 337),
    rightEyeOpenProbability=0.989656,
    leftEyeOpenProbability=0.9913695,
    smileProbability=0.0145743145,
    eulerX=-0.452067,
    eulerY=-8.546794,
    eulerZ=-0.58841664

  11. ..and even with makeup
    boundingBox=Rect(-59, 73 - 556, 689)
    rightEyeOpenProbability=0.9847822
    leftEyeOpenProbability=0.60155135
    smileProbability=0.996382
    eulerX=-6.9748883
    eulerY=3.8424606
    eulerZ=-3.00418

  12. Multiple faces

  13. Barcode Scanning
    Automatically detects the format
    Reads many formats
    • Linear, 2D
    Automatically extracts structured data from the barcode
    • URLs, email addresses, phone numbers, SMS message prompts,
    ISBNs, WiFi information, etc.
    Works even when the barcode is upside down

  14. Text Recognition
    Works on Latin-based languages
    Breaks large texts down into units
    • Blocks contain lines
    • Lines contain elements
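
The block, line, and element hierarchy can be walked with three nested loops. This is a minimal sketch that assumes a recognition result of type `Text` has already been obtained from a text recognizer; `flattenToElements` is a hypothetical helper name.

```kotlin
import com.google.mlkit.vision.text.Text

// Hypothetical helper: collects the text of every element,
// visiting blocks, then lines, then elements in order.
fun flattenToElements(result: Text): List<String> {
    val elements = mutableListOf<String>()
    for (block in result.textBlocks) {       // a block contains lines
        for (line in block.lines) {          // a line contains elements
            for (element in line.elements) { // an element is roughly a word
                elements += element.text
            }
        }
    }
    return elements
}
```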

  15. Language Recognition
    Works on Latin-based languages
    Identifies over 100 languages

  16. Smart Replies
    Only English is supported at the moment
    Intended for casual conversations
    Generates suggestions based on your conversation

  17. Use case: Initial Problems
    • Unusual Objects
    • Impersonation
    • Still Images

  18. Use case: Goal
    • Reduce fraud on financial transactions.
    • Validate our users by comparing their government-issued
    identification with live images captured from them.

  19. Use case: Solution
    • Detect blinking and smiling (liveness check)
    • Euler X, Y, Z values for face positions
    • Compare government IDs + current picture
    • Depending on score, allow certain actions.
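
One way to approach the blink part of a liveness check is a small state machine fed with the per-frame eye-open probabilities. This sketch is an assumption about how such a check could look, not the implementation used at Kudi: `BlinkDetector` and both thresholds are hypothetical.

```kotlin
// Hypothetical helper, not part of ML Kit. Thresholds are assumed values.
class BlinkDetector(
    private val closedBelow: Float = 0.3f,
    private val openAbove: Float = 0.7f
) {
    private var sawOpen = false
    private var sawClosedAfterOpen = false

    // True once an open -> closed -> open sequence has been observed.
    var blinkDetected = false
        private set

    // Feed the probabilities from one frame; returns the current verdict.
    fun onFrame(leftEyeOpenProb: Float?, rightEyeOpenProb: Float?): Boolean {
        // The probabilities are null when they were not computed,
        // for example when classification is disabled.
        if (leftEyeOpenProb == null || rightEyeOpenProb == null) return blinkDetected

        val bothOpen = leftEyeOpenProb > openAbove && rightEyeOpenProb > openAbove
        val bothClosed = leftEyeOpenProb < closedBelow && rightEyeOpenProb < closedBelow

        when {
            bothOpen && sawClosedAfterOpen -> blinkDetected = true
            bothOpen -> sawOpen = true
            bothClosed && sawOpen -> sawClosedAfterOpen = true
        }
        return blinkDetected
    }
}
```

A still photograph never produces the closed frame in the middle of the sequence, which is why a check of this kind helps against the "Still Images" problem listed earlier.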

  20. View Slide

  21. Success Report
    (Chart: August vs. September, y-axis from 0 to 50)

  22. SHOW US THE CODE
    (Face Detection as Case Study)

  23. app/build.gradle
    dependencies {
    // ...
    implementation 'com.google.mlkit:face-detection:16.0.2'
    }

  24. Set options
    val options = FaceDetectorOptions.Builder()
    // PERFORMANCE_MODE_ACCURATE or PERFORMANCE_MODE_FAST
    .setPerformanceMode(FaceDetectorOptions.PERFORMANCE_MODE_ACCURATE)

  25. Set options
    val options = FaceDetectorOptions.Builder()
    // PERFORMANCE_MODE_ACCURATE or PERFORMANCE_MODE_FAST
    .setPerformanceMode(FaceDetectorOptions.PERFORMANCE_MODE_ACCURATE)
    // LANDMARK_MODE_NONE or LANDMARK_MODE_ALL
    .setLandmarkMode(FaceDetectorOptions.LANDMARK_MODE_ALL)

  26. Set options
    val options = FaceDetectorOptions.Builder()
    // PERFORMANCE_MODE_ACCURATE or PERFORMANCE_MODE_FAST
    .setPerformanceMode(FaceDetectorOptions.PERFORMANCE_MODE_ACCURATE)
    // LANDMARK_MODE_NONE or LANDMARK_MODE_ALL
    .setLandmarkMode(FaceDetectorOptions.LANDMARK_MODE_ALL)
    // CLASSIFICATION_MODE_NONE or CLASSIFICATION_MODE_ALL
    .setClassificationMode(FaceDetectorOptions.CLASSIFICATION_MODE_ALL)

  27. Set options
    val options = FaceDetectorOptions.Builder()
    // PERFORMANCE_MODE_ACCURATE or PERFORMANCE_MODE_FAST
    .setPerformanceMode(FaceDetectorOptions.PERFORMANCE_MODE_ACCURATE)
    // LANDMARK_MODE_NONE or LANDMARK_MODE_ALL
    .setLandmarkMode(FaceDetectorOptions.LANDMARK_MODE_ALL)
    // CLASSIFICATION_MODE_NONE or CLASSIFICATION_MODE_ALL
    .setClassificationMode(FaceDetectorOptions.CLASSIFICATION_MODE_ALL)
    // CONTOUR_MODE_ALL or CONTOUR_MODE_NONE
    .setContourMode(FaceDetectorOptions.CONTOUR_MODE_ALL)

  28. Set options
    val options = FaceDetectorOptions.Builder()
    // PERFORMANCE_MODE_ACCURATE or PERFORMANCE_MODE_FAST
    .setPerformanceMode(FaceDetectorOptions.PERFORMANCE_MODE_ACCURATE)
    // LANDMARK_MODE_NONE or LANDMARK_MODE_ALL
    .setLandmarkMode(FaceDetectorOptions.LANDMARK_MODE_ALL)
    // CLASSIFICATION_MODE_NONE or CLASSIFICATION_MODE_ALL
    .setClassificationMode(FaceDetectorOptions.CLASSIFICATION_MODE_ALL)
    // CONTOUR_MODE_ALL or CONTOUR_MODE_NONE
    .setContourMode(FaceDetectorOptions.CONTOUR_MODE_ALL)
    .build()
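
The options above favor accuracy, which suits still images. For a live camera feed, a lighter configuration may be preferable. The sketch below is one possible variant, not a recommendation from the talk: the names `liveOptions` and `liveDetector` and the 0.2 minimum face size are assumptions. It also calls `enableTracking()`, without which `face.trackingId` is not populated.

```kotlin
import com.google.mlkit.vision.face.FaceDetection
import com.google.mlkit.vision.face.FaceDetectorOptions

// Assumed configuration for a live preview: speed over accuracy.
val liveOptions = FaceDetectorOptions.Builder()
    .setPerformanceMode(FaceDetectorOptions.PERFORMANCE_MODE_FAST)
    // Classification gives the smile and eye-open probabilities.
    .setClassificationMode(FaceDetectorOptions.CLASSIFICATION_MODE_ALL)
    // Ignore faces narrower than 20% of the image width (assumed value).
    .setMinFaceSize(0.2f)
    // Assign a stable ID to each face across frames.
    .enableTracking()
    .build()

val liveDetector = FaceDetection.getClient(liveOptions)
```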

  29. Process the image
    val faceDetector = FaceDetection.getClient(options)

  30. Process the image
    val faceDetector = FaceDetection.getClient(options)
    // Build the InputImage from ONE of the sources below.
    // From a Bitmap, e.g. loaded from storage (0 = rotation in degrees)
    val image = InputImage.fromBitmap(bitmap, 0)
    // From a Camera1 preview buffer
    // val image = InputImage.fromByteBuffer(byteBuffer, width, height, rotation, imageFormat)
    // From a CameraX frame (imageProxy is the ImageProxy passed to the analyzer)
    // val image = InputImage.fromMediaImage(imageProxy.image!!, imageProxy.imageInfo.rotationDegrees)

  31. Process the image
    val faceDetector = FaceDetection.getClient(options)
    ...
    val result = faceDetector.process(image)
    .addOnSuccessListener { faces ->
    // Task completed successfully
    // ...
    }
    .addOnFailureListener { e ->
    // Task failed with an exception
    // ...
    }
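
When the frames come from CameraX, the call above usually lives inside an `ImageAnalysis.Analyzer`. The following is a sketch under assumptions: `FaceAnalyzer` and the `onFaces` callback are hypothetical names, and the opt-in annotation shown is the one used by recent CameraX releases (older releases flagged `imageProxy.image` differently).

```kotlin
import android.util.Log
import androidx.annotation.OptIn
import androidx.camera.core.ExperimentalGetImage
import androidx.camera.core.ImageAnalysis
import androidx.camera.core.ImageProxy
import com.google.mlkit.vision.common.InputImage
import com.google.mlkit.vision.face.Face
import com.google.mlkit.vision.face.FaceDetector

// Hypothetical analyzer: forwards detected faces to a callback.
class FaceAnalyzer(
    private val detector: FaceDetector,
    private val onFaces: (List<Face>) -> Unit
) : ImageAnalysis.Analyzer {

    @OptIn(ExperimentalGetImage::class)
    override fun analyze(imageProxy: ImageProxy) {
        val mediaImage = imageProxy.image
        if (mediaImage == null) {
            // Nothing to process; release the frame right away.
            imageProxy.close()
            return
        }
        val input = InputImage.fromMediaImage(
            mediaImage,
            imageProxy.imageInfo.rotationDegrees
        )
        detector.process(input)
            .addOnSuccessListener { faces -> onFaces(faces) }
            .addOnFailureListener { e ->
                Log.w("FaceAnalyzer", "Face detection failed", e)
            }
            // Close the frame in every case so CameraX can deliver the next one.
            .addOnCompleteListener { imageProxy.close() }
    }
}
```

Closing the `ImageProxy` only after the task completes avoids handing the detector a buffer that CameraX has already recycled.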

  32. Face properties
    //Bounding Box
    val bounds = face.boundingBox

  33. Face properties
    //Bounding Box
    val bounds = face.boundingBox
    //Euler Values
    val rotY = face.headEulerAngleY
    val rotZ = face.headEulerAngleZ

  34. Face properties
    //Bounding Box
    val bounds = face.boundingBox
    //Euler Values
    val rotY = face.headEulerAngleY
    val rotZ = face.headEulerAngleZ
    //Landmarks
    val leftEar = face.getLandmark(FaceLandmark.LEFT_EAR)
    val mouthBottom = face.getLandmark(FaceLandmark.MOUTH_BOTTOM)
    val rightEye = face.getLandmark(FaceLandmark.RIGHT_EYE)
    val noseBase = face.getLandmark(FaceLandmark.NOSE_BASE)

  35. Face properties
    //Contours
    val leftEyeContour = face.getContour(FaceContour.LEFT_EYE)?.points
    val upperLipBottomContour = face.getContour(FaceContour.UPPER_LIP_BOTTOM)?.points

  36. Face properties
    //Contours
    val leftEyeContour = face.getContour(FaceContour.LEFT_EYE)?.points
    val upperLipBottomContour = face.getContour(FaceContour.UPPER_LIP_BOTTOM)?.points
    //Classifications
    val smileProb = face.smilingProbability
    val rightEyeOpenProb = face.rightEyeOpenProbability

  37. Face properties
    //Contours
    val leftEyeContour = face.getContour(FaceContour.LEFT_EYE)?.points
    val upperLipBottomContour = face.getContour(FaceContour.UPPER_LIP_BOTTOM)?.points
    //Classifications
    val smileProb = face.smilingProbability
    val rightEyeOpenProb = face.rightEyeOpenProbability
    //Tracking Id
    val id = face.trackingId
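
Several of these properties are nullable: the probabilities are only computed when classification is enabled, and the tracking ID only when tracking is enabled. A small sketch of handling that for the smile probability follows; `isSmiling` and the 0.8 threshold are illustrative assumptions.

```kotlin
// Hypothetical helper: returns null when the probability was not computed,
// otherwise whether it reaches the (assumed) threshold.
fun isSmiling(smilingProbability: Float?, threshold: Float = 0.8f): Boolean? =
    smilingProbability?.let { it >= threshold }
```

With the values shown earlier in this deck, 0.9918384 would count as smiling and 0.0145743145 would not; a null input stays null, so the caller can tell "not smiling" apart from "unknown".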

  38. Using other APIs
    // Object Detection
    implementation 'com.google.mlkit:object-detection:16.2.1'
    // Takes in InputImage
    val tasks = ObjectDetection.getClient(options).process(inputImage)
    // Returns Task<List<DetectedObject>>

  39. Using other APIs
    // Image Labeling
    // Takes in InputImage
    val tasks = ImageLabeling.getClient(options).process(inputImage)
    // Returns Task<List<ImageLabel>>

  40. Using other APIs
    //Barcode Scanning
    // Takes in InputImage
    val tasks = BarcodeScanning.getClient(options).process(inputImage)
    // Returns Task<List<Barcode>>

  41. Using other APIs
    // Language Recognition
    // Takes in Text
    val tasks = LanguageIdentification.getClient().identifyLanguage(text)
    // Returns Task<String> (a BCP-47 language code, or "und" if undetermined)

  42. Using other APIs
    // Smart Replies
    // Takes in a List<TextMessage>
    val tasks = SmartReply.getClient().suggestReplies(conversation)
    // Returns Task<SmartReplySuggestionResult>
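
The `conversation` argument is a chronologically ordered list of messages. The sketch below shows one way to build it and read the result; the function name `suggestRepliesFor`, the callback, the sample messages, and the user ID `"friend-1"` are all made up for illustration.

```kotlin
import com.google.mlkit.nl.smartreply.SmartReply
import com.google.mlkit.nl.smartreply.SmartReplySuggestionResult
import com.google.mlkit.nl.smartreply.TextMessage

// Hypothetical wrapper: passes suggested replies (possibly none) to a callback.
fun suggestRepliesFor(onSuggestions: (List<String>) -> Unit) {
    val now = System.currentTimeMillis()
    // Oldest message first; the last message comes from the remote user.
    val conversation = listOf(
        TextMessage.createForRemoteUser("Are you coming to the meetup?", now - 60_000, "friend-1"),
        TextMessage.createForLocalUser("Still deciding, what time is it?", now - 30_000),
        TextMessage.createForRemoteUser("It starts at 6pm.", now, "friend-1")
    )

    SmartReply.getClient().suggestReplies(conversation)
        .addOnSuccessListener { result ->
            if (result.status == SmartReplySuggestionResult.STATUS_SUCCESS) {
                onSuggestions(result.suggestions.map { it.text })
            } else {
                // For example: unsupported language or no reply available.
                onSuggestions(emptyList())
            }
        }
        .addOnFailureListener { onSuggestions(emptyList()) }
}
```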

  43. Extra Notes
    Processing must not be done on the main thread
    Resources (e.g. Camera) must be released when the activity is destroyed

  44. Resources
    Official Documentation
    Code Samples

  45. QUESTIONS?
    www.kudi.com
    Hiring @
    [email protected]
