Understanding ML Kit offerings in Android & iOS

Slide 1

Slide 1 text

Understanding ML Kit offerings in Android/iOS
Gaurav Bhatnagar
@bhatnagar_g

Slide 2

Slide 2 text

Beautiful Crete !!!!

Slide 3

Slide 3 text

Agenda
- Background
- ML Kit Stack and Ingredients
- Base APIs
- Custom Models & AutoML
- Recap

Slide 4

Slide 4 text

Significance of Machine Learning

Slide 5

Slide 5 text

Climbing the ML Mountain (from achievable to hard)
- On-device ML challenges
- On-cloud ML drawbacks
- Inflexibility in using custom TF Lite models
- Underlying mathematics and data science

Slide 6

Slide 6 text

Introducing Firebase ML Kit SDK
Making machine learning normal
- Optimized for mobile
- Ready-to-use APIs
- Making Google ML expertise easily accessible
- On-device ML
- On-cloud API
- Custom models & AutoML

Slide 7

Slide 7 text

Features of ML Kit SDK
- Available for both Android and iOS devices.
- Base APIs (out-of-the-box solutions), now in Vision and NLP.
- Included in the Firebase suite.
- Flexibility to use custom TF Lite models (via dynamic downloads) or to build our own models with AutoML Vision Edge.

Slide 8

Slide 8 text

ML Kit Stack
- TensorFlow Lite
- Mobile Vision API
- Google Cloud Vision API
- Neural Networks
- iOS Metal API
- NLP API

Slide 9

Slide 9 text

ML Kit Ingredients
Base ML APIs using Mobile Vision and Google Cloud Vision API
- Image Labelling
- Landmark Recognition
- Text Recognition
- Barcode Scanning
- Face Detection
- Object Detection & Tracking

Slide 10

Slide 10 text

More Base APIs
Base ML APIs using Natural Language Processing
- Language Identification
- Smart Reply
- Translation

Slide 11

Slide 11 text

Ease of switching for predefined ML capabilities (On-Device / On-Cloud)
- Text Recognition
- Face Detection
- Barcode Scanning
- Image Labelling
- Landmark Recognition
- Smart Replies
- Language Identification
- Translation
- ODT (Object Detection & Tracking)

Slide 12

Slide 12 text

Basic Workflow of ML Kit App
1. Create a project in Firebase.
2. Download google-services.json to the current project.
3. Enable the Cloud API.
4. Provide input data.
5. Apply ML via base/custom models.

Slide 13

Slide 13 text

Mechanisms to process the input in Vision
Inputs that can be turned into a FirebaseVisionImage:
- Bitmap
- Image URI/File
- Media.Image
- ByteArray
- ByteBuffer

Slide 14

Slide 14 text

Text Recognition Workflow
1. Capture image via camera
2. Convert image to Bitmap
3. Create FirebaseVisionImage
4. FirebaseVisionTextRecognizer processes the FirebaseVisionImage
5. FirebaseVisionText: lines & blocks
Result hierarchy:
- FirebaseVisionText: textBlocks -> line -> element -> text
- FirebaseVisionCloudText: page -> block -> paragraph -> word -> symbol
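The on-device hierarchy above (textBlocks -> line -> element -> text) can be walked with nested loops. The following is a minimal sketch in Python, not Android code: Element, Line and TextBlock are hypothetical stand-ins invented for illustration and are not the FirebaseVisionText classes.

```python
from dataclasses import dataclass, field
from typing import List


# Hypothetical stand-ins for the on-device result hierarchy;
# these are NOT the Firebase classes, only an illustration of the nesting.
@dataclass
class Element:
    text: str


@dataclass
class Line:
    elements: List[Element] = field(default_factory=list)


@dataclass
class TextBlock:
    lines: List[Line] = field(default_factory=list)


def flatten_words(blocks: List[TextBlock]) -> List[str]:
    """Walk blocks -> lines -> elements and collect each element's text."""
    words = []
    for block in blocks:
        for line in block.lines:
            for element in line.elements:
                words.append(element.text)
    return words


# Made-up sample data for illustration.
blocks = [
    TextBlock([Line([Element("Beautiful"), Element("Crete")])]),
    TextBlock([Line([Element("ML")]), Line([Element("Kit")])]),
]
print(flatten_words(blocks))
```

The cloud hierarchy (page -> block -> paragraph -> word -> symbol) would need two more levels of nesting, but the traversal idea is the same.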

Slide 15

Slide 15 text

Text Recognition Demo
- Using Cloud Vision
- Using on-device API
Google Codelabs: https://goo.gl/hyEb3r

Slide 16

Slide 16 text

Image Labelling Workflow
1. Capture image via camera
2. Convert image to Bitmap
3. Create FirebaseVisionImage
4. FirebaseVisionLabelDetector (configured with VisionLabelDetectorOptions) processes the FirebaseVisionImage
5. FirebaseVisionLabels: name, confidence & entity ID
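Each label in the result carries a name, a confidence and an entity ID, so a common follow-up step is to keep only the confident labels. A minimal sketch in Python, assuming a hypothetical Label record, a made-up 0.7 threshold and invented sample values; none of these names come from the ML Kit SDK.

```python
from typing import List, NamedTuple


class Label(NamedTuple):
    # Hypothetical record mirroring the three fields named on the slide.
    name: str
    confidence: float
    entity_id: str


def confident_labels(labels: List[Label], threshold: float = 0.7) -> List[Label]:
    """Keep labels at or above the threshold, most confident first."""
    kept = [label for label in labels if label.confidence >= threshold]
    return sorted(kept, key=lambda label: label.confidence, reverse=True)


# Made-up sample values; the entity IDs are placeholders, not real IDs.
sample = [
    Label("Sky", 0.82, "entity-1"),
    Label("Beach", 0.95, "entity-2"),
    Label("Dog", 0.40, "entity-3"),
]
for label in confident_labels(sample):
    print(label.name, label.confidence, label.entity_id)
```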

Slide 17

Slide 17 text

Features of On-Device and On-Cloud API
- Pricing: on-device is free; cloud is free for the first 1000 requests every month.
- Label coverage: on-device offers 400+ labels for each category; cloud offers 10000+ labels in many categories.
- Knowledge Graph entity ID support: available on-device; available on cloud.

Slide 18

Slide 18 text

Image Labelling Recorded Demo
GitHub repo: https://bit.ly/2RITNsn

Slide 19

Slide 19 text

Image Labelling Demo
- Original pic
- Using Cloud API
- Using on-device API

Slide 20

Slide 20 text

Types of Barcode Supported
Image courtesy: https://goo.gl/8Yg5LX

Slide 21

Slide 21 text

Barcode Scanning Workflow
FirebaseVisionBarcode value types and the matching result classes:
- TYPE_PHONE: FirebaseVisionBarcode.Phone
- TYPE_SMS: FirebaseVisionBarcode.SMS
- TYPE_URL: FirebaseVisionBarcode.URL
- TYPE_WIFI: FirebaseVisionBarcode.WIFI
- TYPE_CALENDAR_EVENT: FirebaseVisionBarcode.CalendarEvent
- TYPE_CONTACT_INFO: FirebaseVisionBarcode.ContactInfo
- TYPE_DRIVER_LICENSE: FirebaseVisionBarcode.DriverLicense
- TYPE_EMAIL: FirebaseVisionBarcode.Email
- TYPE_GEO: FirebaseVisionBarcode.Geo
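Because every scanned barcode reports a value type, an app typically branches on that type to decide what to do with the payload. A minimal sketch in Python of that dispatch, assuming hypothetical string constants and dictionary payloads; the real SDK exposes integer constants on FirebaseVisionBarcode and typed result objects instead.

```python
# Hypothetical string constants named after the value types on the slide;
# the real SDK exposes integer constants on FirebaseVisionBarcode.
TYPE_URL = "TYPE_URL"
TYPE_WIFI = "TYPE_WIFI"
TYPE_PHONE = "TYPE_PHONE"


def describe_barcode(value_type: str, payload: dict) -> str:
    """Pick a handler based on the barcode's value type."""
    handlers = {
        TYPE_URL: lambda p: "Open link: " + p["url"],
        TYPE_WIFI: lambda p: "Join network: " + p["ssid"],
        TYPE_PHONE: lambda p: "Call: " + p["number"],
    }
    handler = handlers.get(value_type)
    if handler is None:
        # Unknown or unhandled type: fall back to the raw value.
        return "Raw value: " + str(payload.get("raw", ""))
    return handler(payload)


# Made-up payloads for illustration.
print(describe_barcode(TYPE_URL, {"url": "https://example.com"}))
print(describe_barcode("TYPE_GEO", {"raw": "geo:0,0"}))
```

Keeping a fallback branch for unhandled types means a new or unexpected barcode format still yields something usable.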

Slide 22

Slide 22 text

Landmark Detection Workflow
1. Input (ByteArray, ByteBuffer, Image URI/File, Bitmap or Media.Image) -> FirebaseVisionImage
2. FirebaseVisionCloudLandmarkDetector, configured with FirebaseVisionCloudDetectorOptions
3. FirebaseVisionCloudLandmark instances, each exposing:
- getBoundingBox()
- getConfidence()
- getEntityId()
- getLandmark()
- getLocations()

Slide 23

Slide 23 text

Landmark Detection Demo
- Original pic
- Processed pic

Slide 24

Slide 24 text

Landmark Detection Recorded Demo
GitHub repo: https://bit.ly/2RITNsn

Slide 25

Slide 25 text

Face Detection Components
- Face Detection
- Face Tracking
- Landmarks
- Classification

Slide 26

Slide 26 text

Face Detection Features
- FirebaseVisionFace.boundingBox
- FirebaseVisionFace.leftEyeOpenProbability
- FirebaseVisionFace.rightEyeOpenProbability
- FirebaseVisionFace.smilingProbability
- FirebaseVisionFaceLandmark.NOSE_BASE
- FirebaseVisionFaceLandmark.LEFT_MOUTH
- FirebaseVisionFaceLandmark.RIGHT_MOUTH
- FirebaseVisionFaceLandmark.BOTTOM_MOUTH
Note: the picture does not show all face detection features.

Slide 27

Slide 27 text

Face Detection Demo
GitHub repo: https://bit.ly/2RITNsn

Slide 28

Slide 28 text

Object Detection and Tracking
On-device (ML Kit ODT):
- Detect objects
- Track objects
- Coarse classification
Cloud (Cloud visual search):
- Cloud Vision Product Search (or any solution)
Google Codelabs (ODT): http://bit.ly/2kQrHji

Slide 29

Slide 29 text

Block Diagram for On-Device Component (ML Kit ODT)
- Components: Localizer, Classifier, Tracker
- Latencies shown on the slide: < 10 ms, 50 ms, 50 ms
- Output: FirebaseVisionObject

Slide 30

Slide 30 text

Object Detection & Tracking Details
VisionObjectDetectorOptions:
- Detector mode: single image / stream mode
- Single/multiple objects
- Classification
FirebaseVisionObject (the object returned after processing):
- Bounding box
- Classification category
- Tracking ID (in stream mode)

Slide 31

Slide 31 text

Language Identification (Base ML APIs)
- Recognizes text in 110 different languages.
- Fast: responds within 1-2 ms on Android or iOS.
- identifyLanguage() returns the BCP-47 language code for the input text.
- A list of possible languages and their confidence levels is also returned.
Google Codelabs: http://bit.ly/2koCGjI
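Given a list of possible languages with confidence levels, the caller still has to decide which code to trust. A minimal sketch in Python of that decision, assuming a hypothetical pick_language helper and an illustrative 0.5 threshold; it models the idea only and does not call the ML Kit API. The code "und" is the BCP-47 tag for an undetermined language.

```python
from typing import List, Tuple

UNDETERMINED = "und"  # BCP-47 code for an undetermined language


def pick_language(candidates: List[Tuple[str, float]], threshold: float = 0.5) -> str:
    """Return the most confident BCP-47 code, or "und" if none clears the threshold."""
    if not candidates:
        return UNDETERMINED
    code, confidence = max(candidates, key=lambda pair: pair[1])
    return code if confidence >= threshold else UNDETERMINED


# Made-up candidate lists for illustration.
print(pick_language([("el", 0.9), ("en", 0.05)]))
print(pick_language([("el", 0.3), ("en", 0.2)]))
```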

Slide 32

Slide 32 text

Smart Replies (Base APIs)
- Provides suggestions based on the last 10 messages in a conversation (although it also works with a single previous message).
- Stateless API: runs fully on device.
- No message history is kept in memory or on a server.
- Developed in close collaboration with textPlus to ensure the API is ready for production.
Google Codelabs: http://bit.ly/2kR9GkW
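Since the API is stateless and looks at the last 10 messages, the app is the one that holds the conversation and passes in the recent window. A minimal sketch in Python of that windowing, assuming a hypothetical recent_context helper and made-up messages; it does not call the Smart Reply API.

```python
from collections import deque
from typing import Deque, List

MAX_CONTEXT = 10  # the slide states suggestions use the last 10 messages


def recent_context(conversation: List[str], limit: int = MAX_CONTEXT) -> List[str]:
    """Return at most the last `limit` messages, oldest first."""
    window: Deque[str] = deque(conversation, maxlen=limit)
    return list(window)


messages = ["message %d" % i for i in range(1, 13)]  # 12 made-up messages
context = recent_context(messages)
print(len(context), context[0], context[-1])
```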

Slide 33

Slide 33 text

Translation API (On-Device 1/2)
- Translations available between 59 languages.
- Uses the same NMT models as Google Translate in offline mode.
- Offered at no additional cost.
- Functions fully on device.
Google Codelabs: http://bit.ly/2mjdyvf

Slide 34

Slide 34 text

Translation API (On-Device 2/2)
- Phrase-based MT -> Neural Machine Translation
- Discrete local decisions -> continuous global decisions
- Language packs are downloaded dynamically.
- To reduce the number of language pairs, English is used as an intermediate language.
Example (Greek -> English -> Japanese):
Πότε σκοπεύετε να πάτε στην παραλία; -> When are you planning to go to the beach? -> “いつビーチに行く予定ですか？” (Itsu bīchi ni iku yoteidesu ka?)
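The saving from using English as an intermediate language can be shown with simple counting. A minimal sketch in Python, assuming directed source -> target pairs and a hypothetical caller-supplied translate function; how ML Kit actually packages its models is not modelled here.

```python
def direct_pairs(languages: int) -> int:
    """Directed source -> target pairs if every pair were modelled directly."""
    return languages * (languages - 1)


def pivot_pairs(languages: int) -> int:
    """Directed pairs needed when every translation goes through one pivot language."""
    return 2 * (languages - 1)


def translate_via_pivot(text, source, target, translate, pivot="en"):
    """Chain source -> pivot -> target using a caller-supplied translate function."""
    if source == target:
        return text
    intermediate = text if source == pivot else translate(text, source, pivot)
    return intermediate if target == pivot else translate(intermediate, pivot, target)


print(direct_pairs(59), pivot_pairs(59))  # 3422 116
```

With the 59 languages from the previous slide, that is 3422 directed pairs without a pivot versus 116 with one, at the cost of two translation hops when neither side is English.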

Slide 35

Slide 35 text

Translation Workflow
1. TranslatorOptions (source & destination language)
2. FirebaseTranslator
3. Download model if needed (FirebaseModelDownloadConditions)
4. Text in source language -> text in destination language

Slide 36

Slide 36 text

ML Kit + Material Design
- Object Detection & Tracking
- Barcode Scanning
GitHub repo: http://bit.ly/MLkitmaterial

Slide 37

Slide 37 text

Custom Models
ML Kit acts as an API layer for your custom model.
Implementation path:
1. Build & train custom TF model
2. Convert to TF Lite model
3. Host with Firebase
4. Use TF Lite model for inference
Main benefits:
- Automatic model fallback
- Automatic model updates
- A/B testing and specializing custom models using Remote Config (dynamic selection)
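The idea behind automatic model fallback can be sketched as "prefer the hosted model, otherwise use the bundled one". The Python below is an illustration under stated assumptions: choose_model, the loader callables and the model names are hypothetical, and ML Kit performs this selection internally rather than through code like this.

```python
from typing import Callable, Optional


def choose_model(load_remote: Callable[[], Optional[str]], local_model: str) -> str:
    """Prefer the hosted model; fall back to the bundled one if it is unavailable."""
    try:
        remote = load_remote()
    except OSError:
        # For example, no network: treat the hosted model as unavailable.
        remote = None
    return remote if remote is not None else local_model


def offline() -> Optional[str]:
    """Hypothetical loader that simulates a failed download."""
    raise OSError("no network")


print(choose_model(lambda: "hosted-v2", "bundled-v1"))
print(choose_model(offline, "bundled-v1"))
```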

Slide 38

Slide 38 text

Custom Models Demo
Google Codelabs: https://goo.gl/92n5dY

Slide 39

Slide 39 text

AutoML Vision Edge
- Create your own data set
- AutoML Vision Edge: training the model
- On-demand image classification models (TensorFlow Lite models)
- ML Kit

Slide 40

Slide 40 text

Evaluating the Model

Slide 41

Slide 41 text

AutoML Vision Edge
- With the combination of ML Kit and AutoML, we can train, refine, evaluate and deploy the model in the mobile app to achieve our objective.
- Almost 1.8x faster than handcrafted models.
- Custom Image Classifier: http://bit.ly/2knilLB
Google Codelabs: http://bit.ly/2mikN6F

Slide 42

Slide 42 text

Recap
- The main purpose of the ML Kit SDK is to shorten the path from ideation to final delivery of a specific use case.
- More base (ready-to-use) APIs will continue to be added.
- More scenarios in the Vision, Speech & Text space will continue to be covered.
- The collaboration of ML and Material Design also becomes an important reference point.
- AutoML and the custom image classifier simplify the use of mobile-optimized custom models.

Slide 43

Slide 43 text

References
- https://goo.gl/fovxvH - ML Kit Introduction
- https://goo.gl/CHfMhU - Google I/O 2018 ML Kit Video
- https://goo.gl/v9iFWF - ML Kit series by Joe Birch
- https://goo.gl/8pVYqj - ML series by Britt Barak
- https://bit.ly/2OflaMF - ML series by Mark Allision
- https://bit.ly/2ycT5LW - ML series by Ray Wenderlich
- https://bit.ly/2Eeu3By - ML series by Harshit Dwivedi
- http://bit.ly/fbasamples - GitHub repo for Android Samples
- http://bit.ly/2m2lbGf - Google I/O 2019 ML Kit Video
- https://bit.ly/2RITNsn - GitHub Sample repo

Slide 44

Slide 44 text

“AI is probably the most important thing humanity has ever worked on. I think of it as something more profound than electricity or fire.” - Sundar Pichai
“Artificial Intelligence is the New Electricity” - Andrew Ng
@bhatnagar_g