Building Smarter Apps: ML Kit for Android


Omolara Adejuwon (@_larikraun)
Lead Android Engineer, Kudi

Introduction: About ML Kit
Easy-to-use Vision and Natural Language APIs to solve everyday challenges in your apps or create brand-new user experiences.

Why ML Kit?
• On-device processing
• Optimized for mobile
• Works offline
• Rests on Google’s existing ML models
• Actively being worked on
• Some APIs already support custom models

About ML Kit: Vision APIs
• Face Detection
• Barcode Scanning
• Text Recognition
• Image Labeling
• Object Detection and Tracking
• Entity Extraction (still in Early Access Program)
• Pose Detection

Natural Language APIs
• Language Identification
• Text Translation
• Smart Replies

Face Detection
• Landmarks: eyes, nose, mouth, ears
• Contours: facial features, angles, corners
• Classifications: eyes open or closed, smiling or not

Face Orientation
• Euler X: positive is facing up, negative is facing down
• Euler Y: positive is facing right, negative is facing left
• Euler Z: positive is rotated counter-clockwise
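The sign conventions on this slide can be turned into a small helper. The sketch below is plain Kotlin and does not touch ML Kit: `HeadPose`, `describePose` and the 10-degree tolerance are illustrative assumptions, not part of the library.

```kotlin
/** Plain-Kotlin holder for the three head rotation angles, in degrees (hypothetical type). */
data class HeadPose(val eulerX: Float, val eulerY: Float, val eulerZ: Float)

/**
 * Describes a head pose using the sign conventions from the slide.
 * toleranceDeg is an assumed dead zone, not an ML Kit constant.
 */
fun describePose(pose: HeadPose, toleranceDeg: Float = 10f): String {
    // Euler X: positive is facing up, negative is facing down
    val vertical = when {
        pose.eulerX > toleranceDeg -> "up"
        pose.eulerX < -toleranceDeg -> "down"
        else -> "level"
    }
    // Euler Y: positive is facing right, negative is facing left
    val horizontal = when {
        pose.eulerY > toleranceDeg -> "right"
        pose.eulerY < -toleranceDeg -> "left"
        else -> "centre"
    }
    // Euler Z: positive is rotated counter-clockwise
    val roll = when {
        pose.eulerZ > toleranceDeg -> "counter-clockwise"
        pose.eulerZ < -toleranceDeg -> "clockwise"
        else -> "upright"
    }
    return "vertical=$vertical, horizontal=$horizontal, roll=$roll"
}
```

In an app, the three values would come from the detected face's Euler angles; the tolerance would need tuning for the camera setup in use.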

Single face
boundingBox=Rect(250, 107 - 469, 326)
rightEyeOpenProbability=0.93697304
leftEyeOpenProbability=0.98428077
smileProbability=0.9918384
eulerX=5.657005
eulerY=-11.135679
eulerZ=-11.309123

Single face with glasses and facial hair
boundingBox=Rect(243, 74 - 506, 337)
rightEyeOpenProbability=0.989656
leftEyeOpenProbability=0.9913695
smileProbability=0.0145743145
eulerX=-0.452067
eulerY=-8.546794
eulerZ=-0.58841664

...and even with makeup
boundingBox=Rect(-59, 73 - 556, 689)
rightEyeOpenProbability=0.9847822
leftEyeOpenProbability=0.60155135
smileProbability=0.996382
eulerX=-6.9748883
eulerY=3.8424606
eulerZ=-3.00418

Multiple faces

Barcode Scanning
• Automatically detects the format
• Reads many formats, both linear and 2D
• Automatically extracts a wide range of structured data types: URLs, email addresses, phone numbers, SMS message prompts, ISBNs, WiFi information, etc.
• Works even when the barcode is upside down

Text Recognition
• Works on Latin-based languages
• Breaks large texts down into units
  • Blocks contain lines
  • Lines contain elements
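The block, line and element nesting can be pictured with a few stand-in types. This is a minimal sketch in plain Kotlin: `Block`, `Line`, `Element` and `allWords` are hypothetical names used only to illustrate the hierarchy, not ML Kit's own result classes.

```kotlin
/** Smallest unit, roughly a word (hypothetical stand-in type). */
data class Element(val text: String)

/** A line is an ordered list of elements. */
data class Line(val elements: List<Element>) {
    // Rebuild the line's text by joining its elements with spaces
    val text: String get() = elements.joinToString(" ") { it.text }
}

/** A block is an ordered list of lines, roughly a paragraph. */
data class Block(val lines: List<Line>)

/** Walks blocks -> lines -> elements and returns every word in reading order. */
fun allWords(blocks: List<Block>): List<String> =
    blocks.flatMap { it.lines }
        .flatMap { it.elements }
        .map { it.text }
```

Walking the hierarchy top-down like this is useful when only part of the recognised text matters, for example a single line on an ID card.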

Language Recognition
• Works on Latin-based languages
• Identifies over 100 languages

Smart Replies
• Only English is supported at the moment
• Intended for casual conversations
• Generates suggestions based on your conversation

Use Case: Initial Problems
• Unusual objects
• Impersonation
• Still images

Use Case: Goal
• Reduce fraud on financial transactions.
• Validate our users by comparing their government-issued identification with live images captured from them.

Use Case: Solution
• Detect blinking and smiling (liveness check)
• Use Euler X, Y, Z values for face positions
• Compare the government ID with the current picture
• Depending on the score, allow certain actions
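One way to picture the blink-and-smile liveness check is as a pass over per-frame classification values. The sketch below is plain Kotlin under stated assumptions: `FrameSample`, the three thresholds and the function names are hypothetical, and it is not the implementation used in the talk.

```kotlin
/** One frame's classification output; null means no value was reported (hypothetical type). */
data class FrameSample(
    val leftEyeOpenProb: Float?,
    val rightEyeOpenProb: Float?,
    val smileProb: Float?
)

// Assumed thresholds; they would need tuning against real captures.
const val EYE_OPEN = 0.8f
const val EYE_CLOSED = 0.2f
const val SMILING = 0.8f

/** True if the frames show both eyes going open -> closed -> open. */
fun hasBlink(frames: List<FrameSample>): Boolean {
    var stage = 0 // 0 = waiting for open, 1 = waiting for closed, 2 = waiting for reopen
    for (frame in frames) {
        val left = frame.leftEyeOpenProb ?: continue   // skip frames without a value
        val right = frame.rightEyeOpenProb ?: continue
        val open = left > EYE_OPEN && right > EYE_OPEN
        val closed = left < EYE_CLOSED && right < EYE_CLOSED
        when (stage) {
            0 -> if (open) stage = 1
            1 -> if (closed) stage = 2
            2 -> if (open) return true
        }
    }
    return false
}

/** True if any frame shows a confident smile. */
fun hasSmile(frames: List<FrameSample>): Boolean =
    frames.any { (it.smileProb ?: 0f) > SMILING }

/** A still image cannot blink, so it fails even if it smiles. */
fun passesLivenessCheck(frames: List<FrameSample>): Boolean =
    hasBlink(frames) && hasSmile(frames)
```

Requiring a change over time is what separates a live face from a still image, which is one of the initial problems listed for this use case.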

Success Report
[Chart comparing August and September; vertical axis from 0 to 50]

SHOW US THE CODE (Face Detection as Case Study)

app/build.gradle
dependencies {
    // ...
    implementation 'com.google.mlkit:face-detection:16.0.2'
}

Set options
import com.google.mlkit.vision.face.FaceDetectorOptions

val options = FaceDetectorOptions.Builder()
    // PERFORMANCE_MODE_ACCURATE or PERFORMANCE_MODE_FAST
    .setPerformanceMode(FaceDetectorOptions.PERFORMANCE_MODE_ACCURATE)
    // LANDMARK_MODE_NONE or LANDMARK_MODE_ALL
    .setLandmarkMode(FaceDetectorOptions.LANDMARK_MODE_ALL)
    // CLASSIFICATION_MODE_NONE or CLASSIFICATION_MODE_ALL
    .setClassificationMode(FaceDetectorOptions.CLASSIFICATION_MODE_ALL)
    // CONTOUR_MODE_NONE or CONTOUR_MODE_ALL
    .setContourMode(FaceDetectorOptions.CONTOUR_MODE_ALL)
    .build()

Process the image
import com.google.mlkit.vision.common.InputImage
import com.google.mlkit.vision.face.FaceDetection

val faceDetector = FaceDetection.getClient(options)

// Build an InputImage from whichever source you have (pick one).
// From a bitmap, e.g. loaded from storage:
val image = InputImage.fromBitmap(bitmap, 0)
// From a Camera1 byte buffer:
// val image = InputImage.fromByteBuffer(byteBuffer, width, height, rotation, imageFormat)
// From CameraX output (imageProxy is the ImageProxy handed to the analyzer):
// val image = InputImage.fromMediaImage(imageProxy.image!!, imageProxy.imageInfo.rotationDegrees)

val result = faceDetector.process(image)
    .addOnSuccessListener { faces ->
        // Task completed successfully
        // ...
    }
    .addOnFailureListener { e ->
        // Task failed with an exception
        // ...
    }

Face properties
import com.google.mlkit.vision.face.FaceContour
import com.google.mlkit.vision.face.FaceLandmark

// `face` is one entry of the `faces` list passed to the success listener

// Bounding box
val bounds = face.boundingBox

// Euler values
val rotY = face.headEulerAngleY
val rotZ = face.headEulerAngleZ

// Landmarks (null if the landmark was not detected)
val leftEar = face.getLandmark(FaceLandmark.LEFT_EAR)
val mouthBottom = face.getLandmark(FaceLandmark.MOUTH_BOTTOM)
val rightEye = face.getLandmark(FaceLandmark.RIGHT_EYE)
val noseBase = face.getLandmark(FaceLandmark.NOSE_BASE)

// Contours
val leftEyeContour = face.getContour(FaceContour.LEFT_EYE)?.points
val upperLipBottomContour = face.getContour(FaceContour.UPPER_LIP_BOTTOM)?.points

// Classifications (null if classification was not enabled)
val smileProb = face.smilingProbability
val rightEyeOpenProb = face.rightEyeOpenProbability

// Tracking ID (null unless enableTracking() was set on the options)
val id = face.trackingId

Using other APIs
// Object Detection
// In app/build.gradle:
implementation 'com.google.mlkit:object-detection:16.2.1'

// Takes in an InputImage
val tasks = ObjectDetection.getClient(options).process(inputImage)
// Returns Task<List<DetectedObject>>

Using other APIs
// Image Labeling
// Takes in an InputImage
val tasks = ImageLabeling.getClient(options).process(inputImage)
// Returns Task<List<ImageLabel>>

Using other APIs
// Barcode Scanning
// Takes in an InputImage
val tasks = BarcodeScanning.getClient(options).process(inputImage)
// Returns Task<List<Barcode>>

Using other APIs
// Language Recognition
// Takes in text (a String)
val tasks = LanguageIdentification.getClient().identifyLanguage(text)
// Returns Task<String> holding the language code of the text
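Once a language has been identified, the app usually needs something readable to show. The sketch below assumes the result arrives as a BCP-47 language tag string such as "fr", with "und" meaning the language could not be determined; `displayLanguage` is a hypothetical helper built on `java.util.Locale`, not an ML Kit call.

```kotlin
import java.util.Locale

/** Assumed marker for "language could not be determined". */
const val UNDETERMINED = "und"

/**
 * Turns a BCP-47 language tag into an English display name,
 * or "Unknown" when the language was not determined.
 */
fun displayLanguage(tag: String): String =
    if (tag == UNDETERMINED) "Unknown"
    else Locale.forLanguageTag(tag).getDisplayLanguage(Locale.ENGLISH)
```

A call such as `displayLanguage(code)` would sit inside the task's success listener, where `code` is the identified tag.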

Using other APIs
// Smart Replies
// Takes in a List<TextMessage>
val tasks = SmartReply.getClient().suggestReplies(conversation)
// Returns Task<SmartReplySuggestionResult>

Extra Notes
• Processing must not be done on the main thread
• Resources (e.g. the camera) must be released when the activity is destroyed
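Both notes can be illustrated without any Android classes. The sketch below is a minimal, assumption-laden example: `BackgroundProcessor` is a hypothetical wrapper that runs work on a single background thread and releases a closeable resource (standing in for the camera) when it is closed.

```kotlin
import java.util.concurrent.Callable
import java.util.concurrent.ExecutorService
import java.util.concurrent.Executors
import java.util.concurrent.Future

/** Hypothetical wrapper owning a worker thread and a resource that must be released. */
class BackgroundProcessor(private val resource: AutoCloseable) : AutoCloseable {
    private val executor: ExecutorService = Executors.newSingleThreadExecutor()

    /** Runs [work] on the worker thread instead of the caller's thread. */
    fun <T> process(work: () -> T): Future<T> = executor.submit(Callable { work() })

    /** Stops accepting work and releases the resource. */
    override fun close() {
        executor.shutdown()
        resource.close()
    }
}
```

In an activity, `close()` would typically be called from `onDestroy()`, so the worker thread and the camera do not outlive the screen that uses them.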

Resources
• Official documentation
• Code samples

QUESTIONS? www.kudi.com Hiring @ [email protected]
