Building Smarter Apps: ML Kit for Android

Every day we see our little machines being empowered to help with tasks and decision-making. There is a constant need to make our apps and products smarter so that they deliver more value to our users.

Google, in a bid to make Android developers’ lives easier, recently released ML Kit to give developers a soft landing into machine learning.

In my talk, I will demonstrate how you can build on ML Kit to solve everyday problems such as face detection, barcode scanning, object detection and tracking, and text recognition, and I will walk through its full capabilities.

Omolara Adejuwon

October 08, 2020

Transcript

  1. Building Smarter Apps: ML Kit for Android

  2. Omolara Adejuwon, Lead Android Engineer, Kudi (@_larikraun)

  3. Introduction: About ML Kit
    Easy-to-use Vision and Natural Language APIs to solve everyday challenges in your apps or create brand-new user experiences.
  4. Why ML Kit?
    • On-device processing
    • Optimized for mobile
    • Works offline
    • Rests on Google’s existing ML models
    • Actively being worked on
    • Some APIs already support custom models
  5. About ML Kit: Vision APIs
    • Face Detection
    • Barcode Scanning
    • Text Recognition
    • Image Labeling
    • Object Detection and Tracking
    • Entity Extraction (still in Early Access Program)
    • Pose Detection
  6. Natural Language APIs
    • Language Identification
    • Text Translation
    • Smart Replies

  7. Face Detection
    • Landmarks: eyes, nose, mouth, ears
    • Contours: facial features, angles, corners
    • Classifications: eyes open or closed, smiling or not
  8. Face Orientation
    • Euler X: positive is facing up, negative is facing down
    • Euler Y: positive is facing right, negative is facing left
    • Euler Z: positive is rotated counter-clockwise
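
The sign conventions on this slide can be turned into a small helper. The following is a minimal sketch in plain Kotlin, not ML Kit code: `describeOrientation` and the 10-degree tolerance are illustrative assumptions, and treating a negative Euler Z as a clockwise rotation is inferred from the slide's positive case. In an app, the three angles would come from the detected face.

```kotlin
// Hypothetical helper: maps Euler angles (in degrees) to a readable description,
// following the sign conventions listed on the slide.
// The tolerance is an assumed dead zone, not an ML Kit value.
fun describeOrientation(
    eulerX: Float,
    eulerY: Float,
    eulerZ: Float,
    toleranceDegrees: Float = 10f
): List<String> {
    val description = mutableListOf<String>()
    // Euler X: positive is facing up, negative is facing down.
    if (eulerX > toleranceDegrees) description += "facing up"
    if (eulerX < -toleranceDegrees) description += "facing down"
    // Euler Y: positive is facing right, negative is facing left.
    if (eulerY > toleranceDegrees) description += "facing right"
    if (eulerY < -toleranceDegrees) description += "facing left"
    // Euler Z: positive is counter-clockwise; negative is assumed to be clockwise.
    if (eulerZ > toleranceDegrees) description += "rotated counter-clockwise"
    if (eulerZ < -toleranceDegrees) description += "rotated clockwise"
    // Nothing exceeded the tolerance, so the face is roughly frontal.
    if (description.isEmpty()) description += "facing the camera"
    return description
}

fun main() {
    // Angles taken from the "Single face" sample output in this transcript.
    println(describeOrientation(5.657005f, -11.135679f, -11.309123f))
    // prints [facing left, rotated clockwise]
}
```

A dead zone of this kind avoids flagging small head movements; the right value depends on the use case and would need tuning.
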
  9. Single face
    boundingBox=Rect(250, 107 - 469, 326), rightEyeOpenProbability=0.93697304, leftEyeOpenProbability=0.98428077, smileProbability=0.9918384, eulerX=5.657005, eulerY=-11.135679, eulerZ=-11.309123
  10. Single face with glasses and facial hair
    boundingBox=Rect(243, 74 - 506, 337), rightEyeOpenProbability=0.989656, leftEyeOpenProbability=0.9913695, smileProbability=0.0145743145, eulerX=-0.452067, eulerY=-8.546794, eulerZ=-0.58841664
  11. ...and even with makeup
    boundingBox=Rect(-59, 73 - 556, 689), rightEyeOpenProbability=0.9847822, leftEyeOpenProbability=0.60155135, smileProbability=0.996382, eulerX=-6.9748883, eulerY=3.8424606, eulerZ=-3.00418
  12. Multiple faces

  13. Barcode Scanning
    Automatically detects the format
    Reads many formats
    • Linear, 2D
    Automatically extracts a wide range of structured data types
    • URLs, email addresses, phone numbers, SMS message prompts, ISBNs, WiFi information, etc.
    Works even when the barcode is upside down
  14. Text Recognition
    Works on Latin-based languages
    Breaks large texts down into units
    • Blocks contain lines
    • Lines contain elements
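
The block, line, and element nesting described on this slide can be illustrated with a small model. This is a minimal sketch in plain Kotlin: `Block`, `Line`, `Element`, and `allElements` are hypothetical stand-ins written for this example, not the recognizer's own result types.

```kotlin
// Hypothetical stand-ins for the recognizer's result hierarchy.
data class Element(val text: String)

data class Line(val elements: List<Element>) {
    // A line's text is its elements joined by spaces.
    val text: String get() = elements.joinToString(" ") { it.text }
}

data class Block(val lines: List<Line>) {
    // A block's text is its lines joined by newlines.
    val text: String get() = lines.joinToString("\n") { it.text }
}

// Walks blocks -> lines -> elements and collects every element's text in order.
fun allElements(blocks: List<Block>): List<String> =
    blocks.flatMap { block -> block.lines }
        .flatMap { line -> line.elements }
        .map { element -> element.text }

fun main() {
    val blocks = listOf(
        Block(
            listOf(
                Line(listOf(Element("Building"), Element("Smarter"), Element("Apps"))),
                Line(listOf(Element("ML"), Element("Kit")))
            )
        )
    )
    println(blocks.first().text)
    println(allElements(blocks))
}
```

Working at the element level is useful when individual words matter (for example, picking out an ID number), while block text is enough for paragraphs.
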
  15. Language Recognition
    • Works on Latin-based languages
    • Identifies over 100 languages
  16. Smart Replies
    • Only English is supported at the moment
    • Intended for casual conversations
    • Generates suggestions based on your conversation
  17. Use case: Initial Problems
    • Unusual objects
    • Impersonation
    • Still images
  18. Use case: Goal
    • Reduce fraud on financial transactions.
    • Validate our users by comparing their government-issued identification with live images captured from them.
  19. Use case: Solution
    • Detect blinking and smiling (liveness check)
    • Use Euler X, Y, Z values for face positions
    • Compare government IDs with the current picture
    • Depending on the score, allow certain actions.
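
The blink-and-smile liveness check from this slide can be sketched as follows. This is a minimal illustration in plain Kotlin and not the implementation used in the talk: `FaceFrame`, the function names, and every threshold are assumptions chosen for the example. The idea is that a still image cannot produce an open, closed, open sequence of eye states across frames.

```kotlin
// Hypothetical per-frame snapshot of the classification values for one face.
// Values are nullable because a probability may be unavailable for a frame.
data class FaceFrame(
    val leftEyeOpenProbability: Float?,
    val rightEyeOpenProbability: Float?,
    val smilingProbability: Float?
)

// True when both eye probabilities are present and at or above the (assumed) threshold.
fun eyesOpen(frame: FaceFrame, threshold: Float = 0.8f): Boolean {
    val left = frame.leftEyeOpenProbability ?: return false
    val right = frame.rightEyeOpenProbability ?: return false
    return left >= threshold && right >= threshold
}

// True when both eye probabilities are present and at or below the (assumed) threshold.
fun eyesClosed(frame: FaceFrame, threshold: Float = 0.2f): Boolean {
    val left = frame.leftEyeOpenProbability ?: return false
    val right = frame.rightEyeOpenProbability ?: return false
    return left <= threshold && right <= threshold
}

// A blink is an open -> closed -> open sequence somewhere in the frames.
fun hasBlink(frames: List<FaceFrame>): Boolean {
    var stage = 0 // 0: expect open eyes, 1: expect closed eyes, 2: expect reopened eyes
    for (frame in frames) {
        when (stage) {
            0 -> if (eyesOpen(frame)) stage = 1
            1 -> if (eyesClosed(frame)) stage = 2
            2 -> if (eyesOpen(frame)) return true
        }
    }
    return false
}

// A smile is any frame whose smiling probability reaches the (assumed) threshold.
fun hasSmile(frames: List<FaceFrame>, threshold: Float = 0.8f): Boolean =
    frames.any { (it.smilingProbability ?: 0f) >= threshold }

fun passesLivenessCheck(frames: List<FaceFrame>): Boolean =
    hasBlink(frames) && hasSmile(frames)

fun main() {
    val liveSequence = listOf(
        FaceFrame(0.95f, 0.94f, 0.10f), // eyes open
        FaceFrame(0.05f, 0.08f, 0.15f), // eyes closed
        FaceFrame(0.93f, 0.96f, 0.92f)  // eyes open again, smiling
    )
    // The same frame repeated stands in for a photo held up to the camera.
    val stillImage = List(3) { FaceFrame(0.95f, 0.94f, 0.92f) }
    println(passesLivenessCheck(liveSequence)) // true
    println(passesLivenessCheck(stillImage))   // false
}
```
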
  20. None
  21. Success Report (chart: August and September; axis from 0 to 50)

  22. SHOW US THE CODE (Face Detection as Case Study)

  23. app/build.gradle
    dependencies {
        // ...
        implementation 'com.google.mlkit:face-detection:16.0.2'
    }

  24-28. Set options
    val options = FaceDetectorOptions.Builder()
        // PERFORMANCE_MODE_ACCURATE or PERFORMANCE_MODE_FAST
        .setPerformanceMode(FaceDetectorOptions.PERFORMANCE_MODE_ACCURATE)
        // LANDMARK_MODE_NONE or LANDMARK_MODE_ALL
        .setLandmarkMode(FaceDetectorOptions.LANDMARK_MODE_ALL)
        // CLASSIFICATION_MODE_NONE or CLASSIFICATION_MODE_ALL
        .setClassificationMode(FaceDetectorOptions.CLASSIFICATION_MODE_ALL)
        // CONTOUR_MODE_NONE or CONTOUR_MODE_ALL
        .setContourMode(FaceDetectorOptions.CONTOUR_MODE_ALL)
        .build()
  29-30. Process the image
    val faceDetector = FaceDetection.getClient(options)
    // Create the InputImage from one of the following sources.
    // Using a bitmap image, e.g. from storage
    val image = InputImage.fromBitmap(bitmap, 0)
    // Using Camera1
    val image = InputImage.fromByteBuffer(byteBuffer, width, height, rotation, imageFormat)
    // Using the output from CameraX (imageProxy is assumed to be the ImageProxy handed to the analyzer)
    val image = InputImage.fromMediaImage(imageProxy.image!!, imageProxy.imageInfo.rotationDegrees)
  31. Process the image
    val faceDetector = FaceDetection.getClient(options)
    ...
    val result = faceDetector.process(image)
        .addOnSuccessListener { faces ->
            // Task completed successfully
            // ...
        }
        .addOnFailureListener { e ->
            // Task failed with an exception
            // ...
        }
  32-34. Face properties
    // Bounding box
    val bounds = face.boundingBox
    // Euler values
    val rotY = face.headEulerAngleY
    val rotZ = face.headEulerAngleZ
    // Landmarks
    val leftEar = face.getLandmark(FaceLandmark.LEFT_EAR)
    val mouthBottom = face.getLandmark(FaceLandmark.MOUTH_BOTTOM)
    val rightEye = face.getLandmark(FaceLandmark.RIGHT_EYE)
    val noseBase = face.getLandmark(FaceLandmark.NOSE_BASE)
  35-37. Face properties
    // Contours
    val leftEyeContour = face.getContour(FaceContour.LEFT_EYE)?.points
    val upperLipBottomContour = face.getContour(FaceContour.UPPER_LIP_BOTTOM)?.points
    // Classifications
    val smileProb = face.smilingProbability
    val rightEyeOpenProb = face.rightEyeOpenProbability
    // Tracking ID
    val id = face.trackingId
  38. Using other APIs
    // Object Detection
    implementation 'com.google.mlkit:object-detection:16.2.1'
    // Takes in an InputImage
    val tasks = ObjectDetection.getClient(options).process(inputImage)
    // Returns Task<List<DetectedObject>>
  39. Using other APIs
    // Image Labeling
    // Takes in an InputImage
    val tasks = ImageLabeling.getClient(options).process(inputImage)
    // Returns Task<List<ImageLabel>>
  40. Using other APIs
    // Barcode Scanning
    // Takes in an InputImage
    val tasks = BarcodeScanning.getClient(options).process(inputImage)
    // Returns Task<List<Barcode>>
  41. Using other APIs
    // Language Recognition
    // Takes in text
    val tasks = LanguageIdentification.getClient().identifyLanguage(text)
    // Returns Task<String> (the identified language code)
  42. Using other APIs
    // Smart Replies
    // Takes in a List<TextMessage>
    val tasks = SmartReply.getClient().suggestReplies(conversation)
    // Returns Task<SmartReplySuggestionResult>
  43. Extra Notes
    • Processing must not be done on the main thread
    • Resources (e.g. Camera) must be released when the activity is destroyed
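
Both notes can be illustrated outside Android with a small wrapper. This is a minimal sketch in plain Kotlin on the JVM: `BackgroundProcessor` is a hypothetical class written for this example. It runs work on a background thread and frees that thread in `close()`, which in an Activity would be called when the activity is destroyed.

```kotlin
import java.io.Closeable
import java.util.concurrent.Callable
import java.util.concurrent.ExecutorService
import java.util.concurrent.Executors
import java.util.concurrent.Future
import java.util.concurrent.TimeUnit

// Hypothetical wrapper: keeps heavy work off the calling thread and
// releases its background thread when closed.
class BackgroundProcessor : Closeable {
    private val executor: ExecutorService = Executors.newSingleThreadExecutor()

    // True once close() has been called.
    val isClosed: Boolean
        get() = executor.isShutdown

    // Queues the work on the background thread and returns a Future for its result.
    fun <T> submit(work: () -> T): Future<T> = executor.submit(Callable<T> { work() })

    // Stops accepting new work; the thread exits after queued work finishes.
    override fun close() {
        executor.shutdown()
    }
}

fun main() {
    val callerThread = Thread.currentThread()
    val processor = BackgroundProcessor()
    // `use` guarantees close() runs even if the work throws.
    processor.use {
        val workerThread = it.submit { Thread.currentThread() }.get(5, TimeUnit.SECONDS)
        println(workerThread !== callerThread) // true: the work ran on another thread
    }
    println(processor.isClosed) // true: the background thread has been released
}
```
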
  44. Resources: Official Documentation, Code Samples

  45. QUESTIONS? www.kudi.com Hiring @ careers@kudi.com