Building Smarter Apps: ML Kit for Android

Every day we see our devices take on more tasks and decisions on our behalf. There is a constant need to make our apps and products smarter so that they deliver more value to our users.

Google, in a bid to make Android developers’ lives easier, recently released ML Kit to give developers a soft landing into machine learning.

In my talk, I will demonstrate how you can build on ML Kit to solve everyday problems such as face detection, barcode scanning, object detection and tracking, and text recognition, and I will walk through the rest of its capabilities.

Omolara Adejuwon

October 08, 2020

Transcript

  1. Introduction: About ML Kit

    Easy-to-use Vision and Natural Language APIs to solve everyday challenges in your apps or create brand-new user experiences.
  2. Why ML Kit?

    • On-device processing
    • Optimized for mobile
    • Works offline
    • Rests on Google’s existing ML models
    • Actively being worked on
    • Some APIs already support custom models
  3. About ML Kit: Vision APIs

    • Face Detection
    • Barcode Scanning
    • Text Recognition
    • Image Labeling
    • Object Detection and Tracking
    • Entity Extraction (still in Early Access Program)
    • Pose Detection
  4. Face Detection

    • Landmarks: eyes, nose, mouth, ears
    • Contours: facial features, angles, corners
    • Classifications: eyes open or closed, smiling or not
  5. Face Orientation

    • Euler X: positive when facing up, negative when facing down
    • Euler Y: positive when facing right, negative when facing left
    • Euler Z: positive when rotated counter-clockwise
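The sign conventions above can be sketched as a small pure-Kotlin helper. `describeHeadPose` and its 10-degree tolerance are hypothetical choices made for illustration, not part of ML Kit; the angles are assumed to be in degrees, as ML Kit reports them.

```kotlin
// Hypothetical helper: turns the three Euler angles (in degrees) into a short description.
// The default tolerance of 10 degrees is an arbitrary assumption, not an ML Kit default.
fun describeHeadPose(eulerX: Float, eulerY: Float, eulerZ: Float, tolerance: Float = 10f): String {
    // Euler X: positive means the face is tilted up, negative means down.
    val vertical = when {
        eulerX > tolerance -> "up"
        eulerX < -tolerance -> "down"
        else -> "level"
    }
    // Euler Y: positive means the face is turned right, negative means left.
    val horizontal = when {
        eulerY > tolerance -> "right"
        eulerY < -tolerance -> "left"
        else -> "centre"
    }
    // Euler Z: positive means the face is rotated counter-clockwise.
    val roll = when {
        eulerZ > tolerance -> "counter-clockwise"
        eulerZ < -tolerance -> "clockwise"
        else -> "upright"
    }
    return "vertical=$vertical, horizontal=$horizontal, roll=$roll"
}

fun main() {
    println(describeHeadPose(20f, -15f, 0f))
}
```

A check like this could be used to ask the user to look straight at the camera before a picture is accepted.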
  6. Single faces with glasses and facial hair

    boundingBox=Rect(243, 74 - 506, 337)
    rightEyeOpenProbability=0.989656
    leftEyeOpenProbability=0.9913695
    smileProbability=0.0145743145
    eulerX=-0.452067
    eulerY=-8.546794
    eulerZ=-0.58841664
  7. ...and even with makeup

    boundingBox=Rect(-59, 73 - 556, 689)
    rightEyeOpenProbability=0.9847822
    leftEyeOpenProbability=0.60155135
    smileProbability=0.996382
    eulerX=-6.9748883
    eulerY=3.8424606
    eulerZ=-3.00418
  8. Barcode Scanning

    • Automatically detects the format
    • Reads many formats, both linear and 2D
    • Automatically extracts a wide range of structured data: URLs, email addresses, phone numbers, SMS message prompts, ISBNs, WiFi information, etc.
    • Works even when the barcode is upside down
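As a rough sketch of how the extracted structured data might be consumed, the helper below branches on a scanned barcode's value type. `summarize` is a hypothetical name. The import path is the one used by the ML Kit barcode-scanning releases current at the time of the talk and may differ in later versions. The snippet needs the ML Kit barcode-scanning dependency and only compiles inside an Android project.

```kotlin
// Assumption: package name as of the 16.x barcode-scanning releases.
import com.google.mlkit.vision.barcode.Barcode

// Hypothetical helper: returns a human-readable summary of one scanned barcode.
fun summarize(barcode: Barcode): String = when (barcode.valueType) {
    Barcode.TYPE_URL -> "URL: ${barcode.url?.url}"
    Barcode.TYPE_EMAIL -> "Email: ${barcode.email?.address}"
    Barcode.TYPE_PHONE -> "Phone: ${barcode.phone?.number}"
    Barcode.TYPE_WIFI -> "WiFi network: ${barcode.wifi?.ssid}"
    Barcode.TYPE_ISBN -> "ISBN: ${barcode.rawValue}"
    // Anything else falls back to the undecoded payload.
    else -> "Raw value: ${barcode.rawValue}"
}
```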
  9. Text Recognition

    • Works on Latin-based languages
    • Breaks large texts down into units
      • Blocks contain lines
      • Lines contain elements
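The block, line and element hierarchy can be illustrated with plain Kotlin. `Block`, `Line` and `Element` below are hypothetical stand-ins so that the sketch runs without Android; in ML Kit the recognized `Text` result exposes the same nesting through its `textBlocks`, `lines` and `elements` properties.

```kotlin
// Hypothetical stand-ins for the recognizer's result types.
data class Element(val text: String)
data class Line(val elements: List<Element>)
data class Block(val lines: List<Line>)

// Walks blocks -> lines -> elements and returns every element's text in reading order.
fun allElements(blocks: List<Block>): List<String> =
    blocks.flatMap { block ->
        block.lines.flatMap { line -> line.elements.map { it.text } }
    }

fun main() {
    val blocks = listOf(
        Block(listOf(Line(listOf(Element("Hello"), Element("ML"), Element("Kit"))))),
        Block(listOf(Line(listOf(Element("Works"), Element("offline")))))
    )
    println(allElements(blocks))
}
```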
  10. Smart Replies

    • Only English is supported at the moment
    • Intended for casual conversations
    • Generates suggestions based on your conversation
  11. Use case: Goal

    • Reduce fraud on financial transactions.
    • Validate our users by comparing their government-issued identification with live images captured from them.
  12. Use case: Solution

    • Detect blinking and smiling as a liveness check
    • Use the Euler X, Y and Z values for face positions
    • Compare the government ID with the current picture
    • Depending on the score, allow certain actions.
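One possible shape for the blink-and-smile check is sketched below in plain Kotlin. It is not the implementation used in the talk. `FrameSample`, the function names and all three thresholds are assumptions made for illustration. The probabilities are modelled as nullable because ML Kit reports them only when classification is enabled.

```kotlin
// One analysed camera frame; values would come from the face's eye-open and smile probabilities.
data class FrameSample(val leftEyeOpen: Float?, val rightEyeOpen: Float?, val smile: Float?)

// Assumed thresholds; tune them against real data.
val EYES_CLOSED_BELOW = 0.3f
val EYES_OPEN_ABOVE = 0.7f
val SMILING_ABOVE = 0.8f

// A blink is an open -> closed -> open transition of both eyes across the frame sequence.
fun hasBlink(frames: List<FrameSample>): Boolean {
    var state = 0 // 0 = waiting for open eyes, 1 = saw open, 2 = saw closed after open
    for (frame in frames) {
        // Frames without classification data are skipped.
        val left = frame.leftEyeOpen ?: continue
        val right = frame.rightEyeOpen ?: continue
        val open = left > EYES_OPEN_ABOVE && right > EYES_OPEN_ABOVE
        val closed = left < EYES_CLOSED_BELOW && right < EYES_CLOSED_BELOW
        when (state) {
            0 -> if (open) state = 1
            1 -> if (closed) state = 2
            2 -> if (open) return true
        }
    }
    return false
}

// A smile is any frame whose smile probability exceeds the threshold.
fun hasSmile(frames: List<FrameSample>): Boolean =
    frames.any { (it.smile ?: 0f) > SMILING_ABOVE }

fun passesLivenessCheck(frames: List<FrameSample>): Boolean =
    hasBlink(frames) && hasSmile(frames)
```

A still photograph held up to the camera would produce frames with constant probabilities, so it would fail `hasBlink` under these assumptions.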
  15. Set options

    val options = FaceDetectorOptions.Builder()
        // PERFORMANCE_MODE_ACCURATE or PERFORMANCE_MODE_FAST
        .setPerformanceMode(FaceDetectorOptions.PERFORMANCE_MODE_ACCURATE)
        // LANDMARK_MODE_NONE or LANDMARK_MODE_ALL
        .setLandmarkMode(FaceDetectorOptions.LANDMARK_MODE_ALL)
        // CLASSIFICATION_MODE_NONE or CLASSIFICATION_MODE_ALL
        .setClassificationMode(FaceDetectorOptions.CLASSIFICATION_MODE_ALL)
        // CONTOUR_MODE_NONE or CONTOUR_MODE_ALL
        .setContourMode(FaceDetectorOptions.CONTOUR_MODE_ALL)
        .build()
  16. Process the image

    val faceDetector = FaceDetection.getClient(options)

    // Create the InputImage from whichever source you have (pick one):

    // Using bitmap images, e.g. from storage
    val image = InputImage.fromBitmap(bitmap, 0)

    // Using Camera1
    val image = InputImage.fromByteBuffer(byteBuffer, width, height, rotation, imageFormat)

    // Using the output from CameraX (imageProxy is the ImageProxy passed to the analyzer)
    val image = InputImage.fromMediaImage(imageProxy.image!!, imageProxy.imageInfo.rotationDegrees)
  17. Process the image

    val faceDetector = FaceDetection.getClient(options)
    ...
    val result = faceDetector.process(image)
        .addOnSuccessListener { faces ->
            // Task completed successfully
            // ...
        }
        .addOnFailureListener { e ->
            // Task failed with an exception
            // ...
        }
  19. Face properties

    // Bounding box
    val bounds = face.boundingBox

    // Euler values
    val rotY = face.headEulerAngleY
    val rotZ = face.headEulerAngleZ

    // Landmarks
    val leftEar = face.getLandmark(FaceLandmark.LEFT_EAR)
    val mouthBottom = face.getLandmark(FaceLandmark.MOUTH_BOTTOM)
    val rightEye = face.getLandmark(FaceLandmark.RIGHT_EYE)
    val noseBase = face.getLandmark(FaceLandmark.NOSE_BASE)
  21. Face properties

    // Contours
    val leftEyeContour = face.getContour(FaceContour.LEFT_EYE)?.points
    val upperLipBottomContour = face.getContour(FaceContour.UPPER_LIP_BOTTOM)?.points

    // Classifications
    val smileProb = face.smilingProbability
    val rightEyeOpenProb = face.rightEyeOpenProbability

    // Tracking ID
    val id = face.trackingId
  22. Using other APIs

    // Object Detection
    implementation 'com.google.mlkit:object-detection:16.2.1'

    // Takes in an InputImage
    val tasks = ObjectDetection.getClient(options).process(inputImage)
    // Returns Task<List<DetectedObject>>
  23. Using other APIs

    // Image Labeling
    // Takes in an InputImage
    val tasks = ImageLabeling.getClient(options).process(inputImage)
    // Returns Task<List<ImageLabel>>
  24. Using other APIs

    // Barcode Scanning
    // Takes in an InputImage
    val tasks = BarcodeScanning.getClient(options).process(inputImage)
    // Returns Task<List<Barcode>>
  25. Using other APIs

    // Language Identification
    // Takes in text (a String)
    val tasks = LanguageIdentification.getClient().identifyLanguage(text)
    // Returns Task<String> (the language code)
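A minimal sketch of consuming the language identification result follows. `identifyLanguage` resolves to a language code string, and ML Kit uses the code `"und"` when no language could be determined. The function name `logLanguage` and the log tag are hypothetical. The snippet needs the ML Kit language-id dependency and an Android project.

```kotlin
import android.util.Log
import com.google.mlkit.nl.languageid.LanguageIdentification

// Hypothetical helper: identifies the language of `text` and logs the outcome.
fun logLanguage(text: String) {
    val identifier = LanguageIdentification.getClient()
    identifier.identifyLanguage(text)
        .addOnSuccessListener { languageCode ->
            if (languageCode == "und") {
                // "und" (undetermined) means no language met the confidence threshold.
                Log.d("LangId", "Language could not be identified")
            } else {
                Log.d("LangId", "Detected language: $languageCode")
            }
        }
        .addOnFailureListener { e ->
            Log.e("LangId", "Identification failed", e)
        }
}
```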
  26. Using other APIs

    // Smart Replies
    // Takes in a List<TextMessage>
    val tasks = SmartReply.getClient().suggestReplies(conversation)
    // Returns Task<SmartReplySuggestionResult>
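To show where the `conversation` list might come from, here is a sketch that builds a short chat history and handles the result. The message texts, the user ID `"user-42"`, the function name and the log tag are all made up for illustration. The snippet needs the ML Kit smart-reply dependency and an Android project.

```kotlin
import android.util.Log
import com.google.mlkit.nl.smartreply.SmartReply
import com.google.mlkit.nl.smartreply.SmartReplySuggestionResult
import com.google.mlkit.nl.smartreply.TextMessage

fun suggestForSampleChat() {
    val now = System.currentTimeMillis()
    // Messages are listed oldest first; the last one comes from the remote user.
    val conversation = listOf(
        TextMessage.createForRemoteUser("Are you coming to the meetup?", now - 60_000, "user-42"),
        TextMessage.createForLocalUser("Yes, leaving now", now - 30_000),
        TextMessage.createForRemoteUser("Great, see you soon!", now, "user-42")
    )

    SmartReply.getClient().suggestReplies(conversation)
        .addOnSuccessListener { result ->
            when (result.status) {
                SmartReplySuggestionResult.STATUS_SUCCESS ->
                    result.suggestions.forEach { Log.d("SmartReply", it.text) }
                SmartReplySuggestionResult.STATUS_NOT_SUPPORTED_LANGUAGE ->
                    Log.d("SmartReply", "Conversation language is not supported")
                else ->
                    Log.d("SmartReply", "No suggestions available")
            }
        }
        .addOnFailureListener { e ->
            Log.e("SmartReply", "Suggestion failed", e)
        }
}
```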
  27. Extra Notes

    • Processing must not be done on the main thread
    • Resources (e.g. the camera) must be released when the activity is destroyed
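Both notes can be sketched in plain Kotlin with a background executor and a closeable detector. `FakeDetector` and `analyseFrames` are hypothetical stand-ins so that the example runs without Android; ML Kit detectors are likewise `Closeable`. In an Activity, the cleanup shown in the `finally` block would typically live in `onDestroy()`.

```kotlin
import java.io.Closeable
import java.util.concurrent.Callable
import java.util.concurrent.Executors

// Hypothetical stand-in for a detector that holds resources until closed.
class FakeDetector : Closeable {
    var closed = false
        private set

    // Reports which thread did the work, to show it is not the caller's thread.
    fun process(frame: Int): String = "frame $frame on ${Thread.currentThread().name}"

    override fun close() {
        closed = true
    }
}

// Runs every frame through the detector on a dedicated "analysis" thread,
// then releases the detector and the executor whether or not processing succeeded.
fun analyseFrames(frames: List<Int>, detector: FakeDetector): List<String> {
    val executor = Executors.newSingleThreadExecutor { task -> Thread(task, "analysis") }
    try {
        val pending = frames.map { frame ->
            executor.submit(Callable { detector.process(frame) })
        }
        // Blocking on get() is only for this demo; on Android, deliver results via a callback.
        return pending.map { it.get() }
    } finally {
        detector.close()
        executor.shutdown()
    }
}

fun main() {
    val detector = FakeDetector()
    println(analyseFrames(listOf(1, 2), detector))
    println("detector closed: ${detector.closed}")
}
```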