Building Smarter Apps: ML Kit for Android

Slide 1

Slide 1 text

Building Smarter Apps: ML Kit for Android

Slide 2

Slide 2 text

Omolara Adejuwon (@_larikraun)
Lead Android Engineer, Kudi

Slide 3

Slide 3 text

Introduction: About ML Kit

Easy-to-use Vision and Natural Language APIs to solve everyday challenges in your apps or create brand-new user experiences.

Slide 4

Slide 4 text

Why ML Kit?
• On-device processing
• Optimized for mobile
• Works offline
• Rests on Google’s existing ML models
• Actively being worked on
• Some APIs already support custom models

Slide 5

Slide 5 text

About ML Kit: Vision APIs
• Face Detection
• Barcode Scanning
• Text Recognition
• Image Labeling
• Object Detection and Tracking
• Entity Extraction (still in Early Access Program)
• Pose Detection

Slide 6

Slide 6 text

Natural Language APIs
• Language Identification
• Text Translation
• Smart Replies

Slide 7

Slide 7 text

Face Detection
• Landmarks: eyes, nose, mouth, ears
• Contours: facial features, angles, corners
• Classifications: eyes open or closed, smiling or not

Slide 8

Slide 8 text

Face Orientation
• Euler X: positive when the face is facing up, negative when it is facing down
• Euler Y: positive when the face is facing right, negative when it is facing left
• Euler Z: positive when the face is rotated counterclockwise
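The sign conventions above can be turned into a small helper. What follows is a minimal sketch, not part of ML Kit: `HeadPose`, `describePose` and the 10-degree `tolerance` are hypothetical names and values chosen for illustration, and treating a negative Euler Z as a clockwise rotation is inferred from the counterclockwise rule.

```kotlin
// Hypothetical container for the three Euler angles, in degrees.
data class HeadPose(val eulerX: Float, val eulerY: Float, val eulerZ: Float)

// Describes a pose using the sign conventions from the slide.
// `tolerance` is an assumed dead zone in degrees; tune it for your use case.
fun describePose(pose: HeadPose, tolerance: Float = 10f): List<String> {
    val parts = mutableListOf<String>()
    if (pose.eulerX > tolerance) parts += "facing up"
    if (pose.eulerX < -tolerance) parts += "facing down"
    if (pose.eulerY > tolerance) parts += "facing right"
    if (pose.eulerY < -tolerance) parts += "facing left"
    if (pose.eulerZ > tolerance) parts += "rotated counterclockwise"
    if (pose.eulerZ < -tolerance) parts += "rotated clockwise"
    return if (parts.isEmpty()) listOf("roughly frontal") else parts
}

fun main() {
    // Angles of a face turned slightly left and tilted clockwise.
    val sample = HeadPose(eulerX = 5.657005f, eulerY = -11.135679f, eulerZ = -11.309123f)
    println(describePose(sample)) // [facing left, rotated clockwise]
}
```

A dead zone keeps small, noisy angles from being reported as a turned head; the right width depends on how strict the app needs to be.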

Slide 9

Slide 9 text

Single face
boundingBox=Rect(250, 107 - 469, 326)
rightEyeOpenProbability=0.93697304
leftEyeOpenProbability=0.98428077
smileProbability=0.9918384
eulerX=5.657005
eulerY=-11.135679
eulerZ=-11.309123

Slide 10

Slide 10 text

Single face with glasses and facial hair
boundingBox=Rect(243, 74 - 506, 337)
rightEyeOpenProbability=0.989656
leftEyeOpenProbability=0.9913695
smileProbability=0.0145743145
eulerX=-0.452067
eulerY=-8.546794
eulerZ=-0.58841664

Slide 11

Slide 11 text

...and even with makeup
boundingBox=Rect(-59, 73 - 556, 689)
rightEyeOpenProbability=0.9847822
leftEyeOpenProbability=0.60155135
smileProbability=0.996382
eulerX=-6.9748883
eulerY=3.8424606
eulerZ=-3.00418

Slide 12

Slide 12 text

Multiple faces

Slide 13

Slide 13 text

Barcode Scanning
Automatically detects the format
Reads many formats
  • Linear, 2D
Automatically extracts a wide range of structured data types
  • URLs, email addresses, phone numbers, SMS message prompts, ISBNs, Wi-Fi information, etc.
Works even when the barcode is upside down

Slide 14

Slide 14 text

Text Recognition
Works on Latin-based languages
Breaks large texts down into units
  • Blocks contain lines
  • Lines contain elements
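The block, line and element nesting can be pictured with a few plain data classes. This is only a rough sketch: `Block`, `Line`, `Element` and `toBlocks` are hypothetical stand-ins rather than ML Kit types, and splitting on blank lines and whitespace is an assumption made to keep the example runnable without a camera or an image; the real recognizer groups text by its layout in the picture.

```kotlin
// Hypothetical stand-ins for the three levels of recognized text.
data class Element(val text: String)
data class Line(val elements: List<Element>)
data class Block(val lines: List<Line>)

// Simplified segmentation: blank lines separate blocks, newlines separate
// lines, and whitespace separates elements (roughly, words).
fun toBlocks(raw: String): List<Block> =
    raw.trim()
        .split(Regex("\\n\\s*\\n"))
        .filter { it.isNotBlank() }
        .map { blockText ->
            Block(
                blockText.lines()
                    .filter { it.isNotBlank() }
                    .map { lineText -> Line(lineText.trim().split(Regex("\\s+")).map(::Element)) }
            )
        }

fun main() {
    val blocks = toBlocks("Hello ML Kit\nsecond line\n\nA new block")
    for (block in blocks) {
        for (line in block.lines) {
            println(line.elements.joinToString(" | ") { it.text })
        }
        println("-- end of block --")
    }
}
```

Walking the result in nested loops, as `main` does, is the same shape of traversal an app would use to read each level of a recognition result.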

Slide 15

Slide 15 text

Language Recognition
• Works on Latin-based languages
• Identifies over 100 languages

Slide 16

Slide 16 text

Smart Replies
• Only English is supported at the moment
• Intended for casual conversations
• Generates suggestions based on your conversation

Slide 17

Slide 17 text

Use Case: Initial Problems
• Unusual objects
• Impersonation
• Still images

Slide 18

Slide 18 text

Use Case: Goal
• Reduce fraud on financial transactions.
• Validate our users by comparing their government-issued identification with live images captured from them.

Slide 19

Slide 19 text

Use Case: Solution
• Detect blinking and smiling (liveness check)
• Use the Euler X, Y and Z values to check face position
• Compare the government ID with the current picture
• Depending on the score, allow certain actions
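One way to picture the blink-and-smile liveness check is as a small state machine over the classification probabilities of consecutive frames. The sketch below is an illustration under stated assumptions, not the implementation used in the app: `FrameSample`, the three thresholds and the function names are all hypothetical, and the probabilities are nullable because a frame may come back without a classification.

```kotlin
// Hypothetical per-frame sample of the classification probabilities.
data class FrameSample(
    val leftEyeOpenProb: Float?,
    val rightEyeOpenProb: Float?,
    val smileProb: Float?
)

// Assumed thresholds, chosen for illustration only.
const val EYE_OPEN = 0.8f
const val EYE_CLOSED = 0.2f
const val SMILING = 0.8f

// A blink is an open -> closed -> open transition of both eyes.
fun hasBlink(frames: List<FrameSample>): Boolean {
    var state = 0 // 0: waiting for open eyes, 1: seen open, 2: seen closed after open
    for (frame in frames) {
        val left = frame.leftEyeOpenProb ?: continue
        val right = frame.rightEyeOpenProb ?: continue
        val open = left > EYE_OPEN && right > EYE_OPEN
        val closed = left < EYE_CLOSED && right < EYE_CLOSED
        when (state) {
            0 -> if (open) state = 1
            1 -> if (closed) state = 2
            2 -> if (open) return true
        }
    }
    return false
}

// A smile is any frame whose smile probability clears the threshold.
fun hasSmile(frames: List<FrameSample>): Boolean =
    frames.any { (it.smileProb ?: 0f) > SMILING }

fun passesLivenessCheck(frames: List<FrameSample>): Boolean =
    hasBlink(frames) && hasSmile(frames)

fun main() {
    val live = listOf(
        FrameSample(0.95f, 0.95f, 0.10f), // eyes open
        FrameSample(0.05f, 0.10f, 0.10f), // eyes closed
        FrameSample(0.90f, 0.92f, 0.95f)  // eyes open again, smiling
    )
    val stillImage = List(3) { FrameSample(0.95f, 0.95f, 0.99f) }
    println(passesLivenessCheck(live))       // true
    println(passesLivenessCheck(stillImage)) // false
}
```

Requiring a change over time is what defeats a still image: a printed photo can show a smile, but its eye-open probabilities never go through the open, closed, open sequence.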

Slide 21

Slide 21 text

Success Report
[Chart comparing August and September; vertical axis from 0 to 50]

Slide 22

Slide 22 text

SHOW US THE CODE (Face Detection as Case Study)

Slide 23

Slide 23 text

app/build.gradle

    dependencies {
        // ...
        implementation 'com.google.mlkit:face-detection:16.0.2'
    }

Slide 26

Slide 26 text

Set options

    import com.google.mlkit.vision.face.FaceDetectorOptions

    val options = FaceDetectorOptions.Builder()
        // PERFORMANCE_MODE_ACCURATE or PERFORMANCE_MODE_FAST
        .setPerformanceMode(FaceDetectorOptions.PERFORMANCE_MODE_ACCURATE)
        // LANDMARK_MODE_NONE or LANDMARK_MODE_ALL
        .setLandmarkMode(FaceDetectorOptions.LANDMARK_MODE_ALL)
        // CLASSIFICATION_MODE_NONE or CLASSIFICATION_MODE_ALL
        .setClassificationMode(FaceDetectorOptions.CLASSIFICATION_MODE_ALL)
        .build()

Slide 30

Slide 30 text

Process the image

    val faceDetector = FaceDetection.getClient(options)

    // Create the InputImage in one of the following ways, depending on the source.

    // Using a bitmap, e.g. from storage (the second argument is the rotation in degrees)
    val image = InputImage.fromBitmap(bitmap, 0)

    // Using Camera1
    val image = InputImage.fromByteBuffer(byteBuffer, width, height, rotation, imageFormat)

    // Using the ImageProxy delivered by CameraX
    val image = InputImage.fromMediaImage(imageProxy.image!!, imageProxy.imageInfo.rotationDegrees)

Slide 31

Slide 31 text

Process the image

    val faceDetector = FaceDetection.getClient(options)
    ...
    val result = faceDetector.process(image)
        .addOnSuccessListener { faces ->
            // Task completed successfully
            // ...
        }
        .addOnFailureListener { e ->
            // Task failed with an exception
            // ...
        }

Slide 34

Slide 34 text

Face properties

    // Bounding box
    val bounds = face.boundingBox

    // Euler values
    val rotY = face.headEulerAngleY
    val rotZ = face.headEulerAngleZ

    // Landmarks (null when the landmark was not detected)
    val leftEar = face.getLandmark(FaceLandmark.LEFT_EAR)
    val mouthBottom = face.getLandmark(FaceLandmark.MOUTH_BOTTOM)
    val rightEye = face.getLandmark(FaceLandmark.RIGHT_EYE)
    val noseBase = face.getLandmark(FaceLandmark.NOSE_BASE)

Slide 36

Slide 36 text

Face properties

    // Contours
    val leftEyeContour = face.getContour(FaceContour.LEFT_EYE)?.points
    val upperLipBottomContour = face.getContour(FaceContour.UPPER_LIP_BOTTOM)?.points

    // Classifications (null when classification was not computed)
    val smileProb = face.smilingProbability
    val rightEyeOpenProb = face.rightEyeOpenProbability

Slide 38

Slide 38 text

Using other APIs

    // Object Detection (Gradle dependency)
    implementation 'com.google.mlkit:object-detection:16.2.1'

    // Takes in an InputImage
    val tasks = ObjectDetection.getClient(options).process(inputImage)
    // Returns Task<List<DetectedObject>>

Slide 39

Slide 39 text

Using other APIs

    // Image Labeling
    // Takes in an InputImage
    val tasks = ImageLabeling.getClient(options).process(inputImage)
    // Returns Task<List<ImageLabel>>

Slide 40

Slide 40 text

Using other APIs

    // Barcode Scanning
    // Takes in an InputImage
    val tasks = BarcodeScanning.getClient(options).process(inputImage)
    // Returns Task<List<Barcode>>

Slide 41

Slide 41 text

Using other APIs

    // Language Recognition
    // Takes in text
    val tasks = LanguageIdentification.getClient().identifyLanguage(text)
    // Returns Task<String>

Slide 42

Slide 42 text

Using other APIs

    // Smart Replies
    // Takes in a List<TextMessage>
    val tasks = SmartReply.getClient().suggestReplies(conversation)
    // Returns Task<SmartReplySuggestionResult>

Slide 43

Slide 43 text

Extra Notes
• Processing must not be done on the main thread
• Resources (e.g. the camera) must be released when the activity is destroyed
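Both notes can be illustrated without any Android code. The sketch below is a plain-Kotlin analogy under stated assumptions: `FrameProcessor` is a hypothetical class standing in for whatever owns the camera and detector, a single-thread executor stands in for the background thread, and `close()` plays the role of the cleanup an activity would do in `onDestroy`.

```kotlin
import java.util.concurrent.ExecutorService
import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit

// Hypothetical owner of the background work; not an ML Kit or Android class.
class FrameProcessor : AutoCloseable {
    // One background thread keeps frames in order and off the caller's thread.
    private val executor: ExecutorService = Executors.newSingleThreadExecutor()

    // Queues the (simulated) processing of a frame and reports the result.
    fun submit(frameId: Int, onResult: (String) -> Unit) {
        executor.execute {
            onResult("frame $frameId processed on ${Thread.currentThread().name}")
        }
    }

    // Releases the thread; queued work gets up to one second to finish.
    override fun close() {
        executor.shutdown()
        executor.awaitTermination(1, TimeUnit.SECONDS)
    }
}

fun main() {
    // `use` guarantees close() runs, much like cleanup tied to a lifecycle callback.
    FrameProcessor().use { processor ->
        processor.submit(1) { println(it) }
        processor.submit(2) { println(it) }
    }
}
```

In an Android app the same executor could be handed to the camera's frame analyzer and to the detector's result listeners, with the shutdown moved into the activity's teardown.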