Slide 1

Slide 1 text

Bring Machine Learning to Android with ML Kit Julien Salvi - Android GDE | Android @ Aircall plDroid 2023 @JulienSalvi ML on Android made simple

Slide 2

Slide 2 text

Julien Salvi Lead Android Engineer @ Aircall Android GDE PAUG, Punk & IPAs! @JulienSalvi Bonjour !

Slide 3

Slide 3 text

“You don't need to be an expert in AI or ML to bring great features to your Android app.” Me - in my abstract (but it’s good to know some stuff 😅)

Slide 4

Slide 4 text

Quick introduction to Machine Learning A little bit of context… Get ready to ML-earn some things!

Slide 5

Slide 5 text

“It is the use and development of computer systems that are able to learn and adapt without following explicit instructions, by using algorithms and statistical models to analyse and draw inferences from patterns in data” OK Google What’s Machine Learning? Oxford Languages

Slide 6

Slide 6 text

“Machine learning is the use of mathematical models and statistical techniques to enable computers to automatically learn from data and improve over time without being explicitly programmed.” ChatGPT: What’s Machine Learning? ChatGPT

Slide 7

Slide 7 text

Gotcha! But could you explain a bit more?

Slide 8

Slide 8 text

OK… I am not an ML expert… But let’s see how it goes!

Slide 9

Slide 9 text

An ML overview Introduction
● It’s a type of Artificial Intelligence that allows machines to learn from data and make predictions or decisions based on that data.
● Machine learning is becoming increasingly important in various industries, such as healthcare, gaming, retail and more.
● With machine learning, developers can create intelligent systems that can solve complex problems and improve efficiency.
[Diagram: Artificial Intelligence, Machine Learning, Deep Learning]

Slide 10

Slide 10 text

Introduction Types of Machine Learning
● Supervised Learning: A model learns from labeled training data to make predictions on new, unlabeled data.
● Unsupervised Learning: A model learns from unlabeled data to find patterns and structures in the data.
● Reinforcement Learning: A model learns by interacting with an environment and receiving rewards or penalties for its actions.

Slide 11

Slide 11 text

Introduction Machine Learning process
● Define the problem and gather data
● Preprocess and clean the data
● Select a model and train it on the data
● Evaluate the model's performance on a test set
● Deploy the model to make predictions on new data
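The steps above can be sketched end to end in plain Kotlin on toy data. Everything here is hypothetical and for illustration only: the `Sample` type, the one-feature threshold "model" and the function names are assumptions, not part of ML Kit or of the talk.

```kotlin
// Hypothetical toy example: one numeric feature, one boolean label.
data class Sample(val feature: Double, val label: Boolean)

// Select and train a model: a threshold at the midpoint between the class means.
fun trainThreshold(train: List<Sample>): Double {
    val positives = train.filter { it.label }.map { it.feature }
    val negatives = train.filter { !it.label }.map { it.feature }
    require(positives.isNotEmpty() && negatives.isNotEmpty()) { "Need both classes" }
    return (positives.average() + negatives.average()) / 2.0
}

// Deploy: predict the label of a new, unseen value.
fun predict(threshold: Double, feature: Double): Boolean = feature > threshold

// Evaluate: share of correct predictions on held-out samples.
fun accuracy(threshold: Double, test: List<Sample>): Double =
    test.count { predict(threshold, it.feature) == it.label }.toDouble() / test.size

fun runPipeline(): Double {
    // Gather and clean the data (here: a tiny synthetic set, true for values above 5).
    val data = listOf(1.0, 2.0, 3.0, 4.0, 6.0, 7.0, 8.0, 9.0)
        .map { Sample(it, label = it > 5.0) }
    // Hold out every fourth sample as the test set, train on the rest.
    val test = data.filterIndexed { index, _ -> index % 4 == 3 }
    val train = data.filterIndexed { index, _ -> index % 4 != 3 }
    val threshold = trainThreshold(train)
    return accuracy(threshold, test)
}
```

The point of the held-out test set is that the accuracy is measured on samples the model never saw during training.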

Slide 12

Slide 12 text

Introduction Common Machine Learning algorithms
● Linear Regression: Predict a continuous output based on one or more input features.
● Logistic Regression: Predict a binary or categorical output based on one or more input features.
● Decision Trees: Make decisions by recursively splitting data into subsets based on input features.
[Diagram: nodes A, B, C, D, E]
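As an illustration of the first algorithm, here is a minimal sketch of linear regression with a single input feature, fitted by ordinary least squares. The function name `fitLine` and the sample values are made up for this example.

```kotlin
import kotlin.math.abs

// Fit y = slope * x + intercept by ordinary least squares (one input feature).
// Returns the pair (slope, intercept).
fun fitLine(xs: List<Double>, ys: List<Double>): Pair<Double, Double> {
    require(xs.size == ys.size && xs.size >= 2) { "Need at least two (x, y) pairs" }
    val meanX = xs.average()
    val meanY = ys.average()
    var covariance = 0.0
    var varianceX = 0.0
    for (i in xs.indices) {
        covariance += (xs[i] - meanX) * (ys[i] - meanY)
        varianceX += (xs[i] - meanX) * (xs[i] - meanX)
    }
    require(varianceX > 0.0) { "All x values are identical" }
    val slope = covariance / varianceX
    return slope to (meanY - slope * meanX)
}

// Predict a continuous output for a new input.
fun predictLine(model: Pair<Double, Double>, x: Double): Double =
    model.first * x + model.second

// Helper for comparing floating-point results.
fun closeTo(a: Double, b: Double, eps: Double = 1e-9): Boolean = abs(a - b) < eps
```

Fitting the points (1, 3), (2, 5), (3, 7), (4, 9) recovers the line y = 2x + 1, which can then be used to predict the output for an unseen x.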

Slide 13

Slide 13 text

Introduction Common Machine Learning algorithms
● Random Forest: A collection of decision trees that each make a prediction and are combined to make a final prediction.
● Neural Networks: A model inspired by the structure of the human brain, consisting of layers of interconnected nodes that learn to recognize patterns in data.
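The "combined to make a final prediction" step of a random forest is often a majority vote. The sketch below shows only that voting step, under simplifying assumptions: each "tree" is a hypothetical one-split stump built by `stump`, and no training or random sampling is shown.

```kotlin
// A "tree" is modelled here as any function from one feature to a class label.
typealias Tree = (Double) -> String

// Hypothetical one-split tree: values above the threshold are labelled "spam".
fun stump(threshold: Double): Tree = { x -> if (x > threshold) "spam" else "ham" }

// Each tree votes; the most frequent label wins.
fun forestPredict(trees: List<Tree>, input: Double): String {
    require(trees.isNotEmpty()) { "Need at least one tree" }
    return trees.map { tree -> tree(input) }
        .groupingBy { it }
        .eachCount()
        .maxByOrNull { it.value }!!
        .key
}
```

With stumps at thresholds 3, 5 and 7, an input of 6 gets two "spam" votes against one "ham" vote, so the forest answers "spam" even though one tree disagrees.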

Slide 14

Slide 14 text

Introduction Applications of Machine Learning
● Natural Language Processing (NLP): Text classification, sentiment analysis, language translation, chatbots...
● Computer Vision: Object detection, image segmentation, facial recognition, barcode scanning...
● Fraud Detection: Detecting fraudulent transactions or behavior.
● Recommender Systems: Suggesting products, services, or content based on user preferences.

Slide 15

Slide 15 text

And kids… That’s how I used a machine to explain machine learning. Thanks ChatGPT

Slide 16

Slide 16 text

Exploring ML Kit on Android Overview of ML Kit features

Slide 17

Slide 17 text

In a nutshell ML Kit on Android
● ML Kit brings powerful and easy-to-use ML features, optimized for Android and iOS, with minimal coding and resources.
● It provides pre-built and customizable models for common use cases such as image and text recognition, face detection, barcode scanning…
● ML Kit also allows developers to use custom TensorFlow Lite models trained on their own data.

Slide 18

Slide 18 text

Model installation ML Kit on Android
● Models in ML Kit APIs can be installed in one of 3 ways:
○ Unbundled: Models are downloaded and managed via Google Play Services.
○ Bundled: Models are statically linked to your app at build time.
○ Dynamically downloaded: Models are downloaded on demand.
● By using ML Kit you will increase your app size (2 to 10 MB per model).

Slide 19

Slide 19 text

Global performance tips ML Kit on Android
● Prefer using the camera2 or CameraX libraries:
○ With camera2, drop frames while ML Kit is processing.
○ With CameraX, take advantage of the backpressure strategy (ImageAnalysis.STRATEGY_KEEP_ONLY_LATEST).
● Consider processing images at a lower resolution to improve performance, but keep the image requirements of each API in mind.
● Wait for the ML Kit results before rendering the image.
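The "drop frames while ML Kit is processing" tip can be sketched in plain Kotlin. `FrameGate` is a hypothetical helper written for this example; it is not part of ML Kit or of the camera libraries. The idea is simply to let one frame through at a time and discard any frame that arrives while the previous one is still being analysed.

```kotlin
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical helper: lets at most one frame be analysed at a time.
class FrameGate {
    private val busy = AtomicBoolean(false)

    // Returns true if the caller may process this frame,
    // false if a previous frame is still in flight and this one should be dropped.
    fun tryAcquire(): Boolean = busy.compareAndSet(false, true)

    // Call once the analysis of the current frame has finished (success or failure).
    fun release() {
        busy.set(false)
    }
}

// Intended use inside a camera frame callback (pseudo-flow, shown as comments):
//   if (!gate.tryAcquire()) return          // detector busy: drop this frame
//   detector.process(image)
//       .addOnCompleteListener { gate.release() }
```

With CameraX this gate should not be needed, since STRATEGY_KEEP_ONLY_LATEST already delivers only the most recent frame once the analyzer is free.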

Slide 20

Slide 20 text

ML Kit Vision Video and Image analysis

Slide 21

Slide 21 text

Text Recognition v2 (beta) Vision
● Text Recognition v2 allows us to extract text from images (camera or static images).
● Trained to recognize text in over 100 languages, including Latin-based scripts and non-Latin scripts such as Japanese or Chinese.
● The Text Recognizer segments text into blocks, lines, elements and symbols.
Bundled model: +4 MB per architecture

Slide 22

Slide 22 text

Text Recognition v2 (beta) Vision
[Figure: sample image annotated with Block, Line and Element regions]

Slide 23

Slide 23 text

dependencies {
    // To recognize Latin script
    implementation 'com.google.mlkit:text-recognition:16.0.0-beta6'
    // To recognize Chinese script
    implementation 'com.google.mlkit:text-recognition-chinese:16.0.0-beta6'
    // To recognize Devanagari script
    implementation 'com.google.mlkit:text-recognition-devanagari:16.0.0-beta6'
    // To recognize Japanese script
    implementation 'com.google.mlkit:text-recognition-japanese:16.0.0-beta6'
    // To recognize Korean script
    implementation 'com.google.mlkit:text-recognition-korean:16.0.0-beta6'
}

Slide 24

Slide 24 text

import com.google.mlkit.vision.common.InputImage
import com.google.mlkit.vision.text.TextRecognition
import com.google.mlkit.vision.text.latin.TextRecognizerOptions

// Init the TextRecognition client (here for Latin-script languages)
val recognizer = TextRecognition.getClient(TextRecognizerOptions.DEFAULT_OPTIONS)

// Load a bitmap image, for instance (0 = rotation in degrees)
val image = InputImage.fromBitmap(bitmap, 0)

// Use the client to process the image
val result = recognizer.process(image)
    .addOnSuccessListener { visionText ->
        // Get the text from the image and where it is located
        val allText = visionText.text
        val blocks = visionText.textBlocks
        // ...
    }
    .addOnFailureListener { e ->
        // Task failed with an exception
    }

Slide 25

Slide 25 text

Face detection Vision
● Detect faces in images and video streams.
● Get the contours of detected faces and their eyes, eyebrows, lips and nose.
● Determine facial expressions, such as whether someone is smiling or has their eyes closed.
● Key concepts: face tracking, contour, landmark and classification.
Bundled model: 6.9 MB
Unbundled model: 800 KB

Slide 26

Slide 26 text

dependencies {
    // Bundled library
    implementation 'com.google.mlkit:face-detection:16.1.5'
    // Unbundled library (Google Play Services)
    implementation 'com.google.android.gms:play-services-mlkit-face-detection:17.1.0'
}
// If you chose the Play Services ...

Slide 27

Slide 27 text

import com.google.mlkit.vision.face.FaceDetection
import com.google.mlkit.vision.face.FaceDetectorOptions

// High-accuracy landmark detection and face classification
val highAccuracyOpts = FaceDetectorOptions.Builder()
    .setPerformanceMode(FaceDetectorOptions.PERFORMANCE_MODE_ACCURATE)
    .setLandmarkMode(FaceDetectorOptions.LANDMARK_MODE_ALL)
    .setClassificationMode(FaceDetectorOptions.CLASSIFICATION_MODE_ALL)
    .build()

// Real-time contour detection
val realTimeOpts = FaceDetectorOptions.Builder()
    .setContourMode(FaceDetectorOptions.CONTOUR_MODE_ALL)
    .build()

// Setup the face detection client
val detector = FaceDetection.getClient(highAccuracyOpts) // or realTimeOpts here

Slide 28

Slide 28 text

// Setup the face detection client (options: e.g. highAccuracyOpts or realTimeOpts)
val detector = FaceDetection.getClient(options)

// Process the image previously computed
val result = detector.process(image)
    .addOnSuccessListener { faces ->
        // Do something!
        faces.forEach { face ->
            // If contour detection was enabled:
            val leftEyeContour = face.getContour(FaceContour.LEFT_EYE)?.points
            val upperLipBottomContour = face.getContour(FaceContour.UPPER_LIP_BOTTOM)?.points

            // If classification was enabled:
            if (face.smilingProbability != null) {
                val smileProb = face.smilingProbability
            }
        }
    }
    .addOnFailureListener { e ->
        // Something went wrong
    }

Slide 30

Slide 30 text

Face mesh detection (beta) Vision
● Generate a high-accuracy 3D mesh of your face.
● Get the bounding box for detected faces in a selfie-like picture, or get the 468-point 3D mesh, for AR purposes for example.
● You can use static images or real-time video frames to generate the mesh.
Bundled model: 6.4 MB

Slide 31

Slide 31 text

dependencies {
    // Face mesh detection
    implementation 'com.google.mlkit:face-mesh-detection:16.0.0-beta1'
}

Slide 32

Slide 32 text

// Default face mesh detection
val defaultDetector = FaceMeshDetection.getClient(
    FaceMeshDetectorOptions.DEFAULT_OPTIONS)

// Bounding box only
val boundingBoxDetector = FaceMeshDetection.getClient(
    FaceMeshDetectorOptions.Builder()
        .setUseCase(UseCase.BOUNDING_BOX_ONLY)
        .build()
)

Slide 33

Slide 33 text

// Default face mesh detection
val detector = FaceMeshDetection.getClient(
    FaceMeshDetectorOptions.DEFAULT_OPTIONS)

val result = detector.process(image)
    .addOnSuccessListener { faceMeshes ->
        // Mesh detected!
        faceMeshes.forEach { faceMesh ->
            val bounds: Rect = faceMesh.boundingBox
            // Gets all points
            val faceMeshPoints = faceMesh.allPoints
            faceMeshPoints.forEach { faceMeshPoint ->
                val index: Int = faceMeshPoint.index
                val position = faceMeshPoint.position
            }
        }
    }
    .addOnFailureListener { e ->
        // Something went wrong
    }

Slide 35

Slide 35 text

Barcode scanning Vision
● Read and recognize the most popular barcode formats, such as Codabar or QR Code.
● Automatic format detection.
● Runs on-device, so no internet connection is needed to perform the scans.
Bundled model: 2.4 MB
Unbundled model: 200 KB

Slide 36

Slide 36 text

dependencies {
    // Bundled library
    implementation 'com.google.mlkit:barcode-scanning:17.1.0'
    // Unbundled library (Google Play Services)
    implementation 'com.google.android.gms:play-services-mlkit-barcode-scanning:18.2.0'
}
// If you chose the Play Services ...

Slide 37

Slide 37 text

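import com.google.mlkit.vision.barcode.BarcodeScannerOptions
import com.google.mlkit.vision.barcode.BarcodeScanning
import com.google.mlkit.vision.barcode.common.Barcode

val options = BarcodeScannerOptions.Builder()
    .setBarcodeFormats(
        Barcode.FORMAT_QR_CODE,
        Barcode.FORMAT_CODABAR
        // any format you want to support
    )
    .enableAllPotentialBarcodes() // Optional. Starting from 17.1.0
    .build()

val scanner = BarcodeScanning.getClient(options)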

Slide 38

Slide 38 text

val scanner = BarcodeScanning.getClient(options)

// Use the image previously computed to perform the detection
val result = scanner.process(image)
    .addOnSuccessListener { barcodes ->
        barcodes.forEach { barcode ->
            val valueType = barcode.valueType
            when (valueType) {
                Barcode.TYPE_WIFI -> {
                    val ssid = barcode.wifi!!.ssid
                    val password = barcode.wifi!!.password
                    val type = barcode.wifi!!.encryptionType
                }
                Barcode.TYPE_URL -> {
                    val title = barcode.url!!.title
                    val url = barcode.url!!.url
                }
                else -> barcode.rawValue
            }
        }
    }
    .addOnFailureListener {
        // Error
    }

Slide 39

Slide 39 text

Digital ink recognition Vision
● Recognize handwritten text or drawn emojis and convert them into Unicode text.
● Supports 300+ languages and 25+ writing systems.
● Dynamically download the language assets you want to use.
Bundled model: 4.5 MB

Slide 40

Slide 40 text

dependencies {
    // Digital ink recognition
    implementation 'com.google.mlkit:digital-ink-recognition:18.1.0'
}

Slide 41

Slide 41 text

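import com.google.mlkit.common.MlKitException
import com.google.mlkit.vision.digitalink.DigitalInkRecognition
import com.google.mlkit.vision.digitalink.DigitalInkRecognitionModel
import com.google.mlkit.vision.digitalink.DigitalInkRecognitionModelIdentifier
import com.google.mlkit.vision.digitalink.DigitalInkRecognizer
import com.google.mlkit.vision.digitalink.DigitalInkRecognizerOptions

// Specify the recognition model for a language
val modelIdentifier: DigitalInkRecognitionModelIdentifier? = try {
    DigitalInkRecognitionModelIdentifier.fromLanguageTag("en-US")
} catch (e: MlKitException) {
    null // language tag failed to parse, handle error.
}

// fromLanguageTag can also return null when no model matches the tag
val model: DigitalInkRecognitionModel =
    DigitalInkRecognitionModel.builder(checkNotNull(modelIdentifier)).build()

// Get a recognizer for the language
val recognizer: DigitalInkRecognizer = DigitalInkRecognition.getClient(
    DigitalInkRecognizerOptions.builder(model).build())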

Slide 42

Slide 42 text

// Populate the ink builder with data collected while writing the text
// You can use onTouchEvent to achieve that
val inkBuilder = Ink.builder()
...
val ink = inkBuilder.build()

recognizer.recognize(ink)
    .addOnSuccessListener { result: RecognitionResult ->
        // `result` contains the recognizer's answers as a RecognitionResult.
        result.candidates.forEach { candidate ->
            Log.i(TAG, "Text candidate: ${candidate.text}")
        }
    }
    .addOnFailureListener { e: Exception ->
        Log.e(TAG, "Error during recognition: $e")
    }

Slide 43

Slide 43 text

ML Kit Natural Language Natural Language Processing (NLP)

Slide 44

Slide 44 text

Language Identification Natural Language
● Identifies the language of a text.
● Can recognize over 100 languages, including Latin-based scripts and non-Latin scripts such as Japanese, Greek or Chinese.
● The LanguageIdentifier will return the best languages according to a given confidence threshold.
Bundled model: 900 KB
Unbundled model: 200 KB

Slide 45

Slide 45 text

dependencies {
    // Bundled library
    implementation 'com.google.mlkit:language-id:17.0.4'
    // Unbundled library (Google Play Services)
    implementation 'com.google.android.gms:play-services-mlkit-language-id:17.0.0'
}
// If you chose the Play Services ...

Slide 46

Slide 46 text

import com.google.mlkit.nl.languageid.LanguageIdentification
import com.google.mlkit.nl.languageid.LanguageIdentificationOptions

val languageIdentifier = LanguageIdentification
    .getClient(LanguageIdentificationOptions.Builder()
        .setConfidenceThreshold(0.40f)
        .build())

languageIdentifier.identifyLanguage(text)
    .addOnSuccessListener { languageCode ->
        if (languageCode == "und") {
            // Can't identify language.
        } else {
            // Language identified!
        }
    }
    .addOnFailureListener {
        // Error
    }

Slide 47

Slide 47 text

val languageIdentifier = LanguageIdentification
    .getClient(LanguageIdentificationOptions.Builder()
        .setConfidenceThreshold(0.40f)
        .build())

languageIdentifier.identifyPossibleLanguages(text)
    .addOnSuccessListener { identifiedLanguages ->
        for (identifiedLanguage in identifiedLanguages) {
            val language = identifiedLanguage.languageTag
            val confidence = identifiedLanguage.confidence
            // Languages identified, do something :)
        }
    }
    .addOnFailureListener {
        // Error
    }

Slide 48

Slide 48 text

Translation Natural Language
● Offline translation powered by the same models used by Google Translate.
● Can translate between more than 50 languages.
● Dynamically download the translation model you want on your device.
● On-device translations aren’t the most accurate; for better quality, use the Cloud Translation API.

Slide 49

Slide 49 text

dependencies {
    // ML Kit translate
    implementation 'com.google.mlkit:translate:17.0.1'
}

Slide 50

Slide 50 text

import com.google.mlkit.common.model.DownloadConditions
import com.google.mlkit.nl.translate.TranslateLanguage
import com.google.mlkit.nl.translate.Translation
import com.google.mlkit.nl.translate.TranslatorOptions

val options = TranslatorOptions.Builder()
    .setSourceLanguage(TranslateLanguage.ENGLISH)
    .setTargetLanguage(TranslateLanguage.FRENCH)
    .build()
val translator = Translation.getClient(options)

// Download the translation model if needed
val conditions = DownloadConditions.Builder()
    .requireWifi()
    .build()

translator.downloadModelIfNeeded(conditions)
    .addOnSuccessListener {
        // Model downloaded successfully. Let's translate!
    }
    .addOnFailureListener { exception ->
        // Error
    }

Slide 51

Slide 51 text

val options = TranslatorOptions.Builder()
    .setSourceLanguage(TranslateLanguage.ENGLISH)
    .setTargetLanguage(TranslateLanguage.FRENCH)
    .build()
val translator = Translation.getClient(options)

translator.translate(text)
    .addOnSuccessListener { translatedText ->
        // Translation successful. Use it!
        println("This is my translated text: $translatedText")
    }
    .addOnFailureListener { exception ->
        // Error
    }

Slide 52

Slide 52 text

Smart Reply Natural Language
● Generates short replies or emojis from a list of at most 10 messages. It generally gives you 3 suggestions according to the input.
● Works only with the English language ☹
● Runs on-device, so there is no need to send the messages to a server.
● Smart Reply is intended for casual conversation, so it does not suit every type of conversation (work, business…).
Bundled model: 5.7 MB
Unbundled model: 200 KB

Slide 53

Slide 53 text

dependencies {
    // Bundled library
    implementation 'com.google.mlkit:smart-reply:17.0.2'
    // Unbundled library (Google Play Services)
    implementation 'com.google.android.gms:play-services-mlkit-smart-reply:16.0.0-beta1'
}
// If you chose the Play Services ...

Slide 54

Slide 54 text

import com.google.mlkit.nl.smartreply.SmartReply
import com.google.mlkit.nl.smartreply.SmartReplySuggestionResult
import com.google.mlkit.nl.smartreply.TextMessage

// conversation
val conversation = mutableListOf<TextMessage>()
conversation.add(TextMessage.createForLocalUser(
    "Hey what's up! Ready for plDroid?", System.currentTimeMillis())
)
conversation.add(TextMessage.createForRemoteUser(
    "Yeah! Are you coming to the conference?", System.currentTimeMillis(), userId)
)

// Generate replies
val smartReplyGenerator = SmartReply.getClient()
smartReplyGenerator.suggestReplies(conversation)
    .addOnSuccessListener { result ->
        if (result.status == SmartReplySuggestionResult.STATUS_NOT_SUPPORTED_LANGUAGE) {
            // The conversation's language isn't supported, no suggestions.
        } else if (result.status == SmartReplySuggestionResult.STATUS_SUCCESS) {
            result.suggestions.forEach { suggestion ->
                // Do something
                val replyText = suggestion.text
            }
        }
    }
    .addOnFailureListener {
        // Error do something :)
    }

Slide 56

Slide 56 text

Entity extraction (beta) Natural Language
● Recognizes specific entities in a text. Entities can be an email, address, phone number, date, flight number…
● Can recognize entities in 15 different languages like English, French, Polish or Arabic.
● Dynamically download the entity extraction model you want on your device.
Bundled model: 5.6 MB

Slide 57

Slide 57 text

dependencies {
    // ML Kit entity extraction
    implementation 'com.google.mlkit:entity-extraction:16.0.0-beta4'
}

Slide 58

Slide 58 text

private fun initExtractor() {
    entityExtractor = EntityExtraction.getClient(
        EntityExtractorOptions.Builder(EntityExtractorOptions.ENGLISH).build()
    )
    entityExtractor
        .downloadModelIfNeeded()
        .addOnSuccessListener { _ ->
            /* Model downloading succeeded, you can call extraction API here. */
        }
        .addOnFailureListener { _ ->
            /* Model downloading failed. */
        }
}

Slide 59

Slide 59 text

val params = EntityExtractionParams.Builder(message).build()
entityExtractor.annotate(params)
    .addOnSuccessListener { annotations ->
        // Annotation process was successful, you can parse the EntityAnnotations list here.
        val sb = StringBuilder()
        // Entity is assumed here to be an app-defined (start, end) holder
        val entities = mutableListOf<Entity>()
        for (annotation in annotations) {
            entities.add(Entity(annotation.start, annotation.end))
            sb.append(annotation.annotatedText + "=")
            for (entity in annotation.entities) {
                sb.appendLine(entity.toString())
            }
        }
        entitiesViewState.postValue(sb.toString())
        val newMessages = messagesViewState.value?.map { state ->
            if (state is MessageViewState && state.id == id) {
                state.copy(entities = entities)
            } else {
                state
            }
        } ?: emptyList()
        messagesViewState.postValue(newMessages)
    }
    .addOnFailureListener {
        // Check failure message here.
    }

Slide 60

Slide 60 text

ML Kit features in action! Live from the studio 🚀

Slide 61

Slide 61 text

Let’s explore some real-life examples 🚀

Slide 62

Slide 62 text

Learn more about ML
Journal of Machine Learning Research https://www.jmlr.org/
Machine Learning Crash Course by Google https://developers.google.com/machine-learning/crash-course
Machine Learning lessons https://www.coursera.org/browse/data-science/machine-learning

Slide 63

Slide 63 text

To deep dive into ML Kit…
ML Kit documentation https://developers.google.com/ml-kit
ML/AI Codelabs https://codelabs.developers.google.com/?category=aiandmachinelearning&product=android
ML Kit samples (Android/iOS) https://github.com/googlesamples/mlkit/

Slide 64

Slide 64 text

Julien Salvi plDroid 2023 @JulienSalvi Dziękuję! Have fun with ML Kit!