
Bring machine learning to Android with ML Kit


What if I told you that you don't need to be an expert in Artificial Intelligence or Machine Learning to bring some of these great features to your Android application?

In this session, we'll explore what ML Kit has to offer to easily bring machine learning features optimized for mobile. Through fun and practical code samples, we'll explore the main Vision and Natural Language APIs to potentially unlock new ideas and innovative features in your apps.

Join us as we explore the power of ML Kit and discover how it can help you take your Android app to the next level!

Julien Salvi

May 30, 2023


Transcript

  1. Bring Machine Learning to Android with ML Kit Julien Salvi

    - Android GDE | Android @ Aircall plDroid 2023 @JulienSalvi ML on Android made simple
  2. “You don't need to be an expert in AI or

    ML to bring great features to your Android app.” Me - in my abstract (but it’s good to know some stuff 😅)
  3. Quick introduction to Machine Learning A little bit of context…

    Get ready to ML-earn some things!
  4. “It is the use and development of computer systems that

    are able to learn and adapt without following explicit instructions, by using algorithms and statistical models to analyse and draw inferences from patterns in data” OK Google What’s Machine Learning? Oxford Languages
  5. “Machine learning is the use of mathematical models and statistical

    techniques to enable computers to automatically learn from data and improve over time without being explicitly programmed.” ChatGPT: What’s Machine Learning? ChatGPT
  6. A ML overview Introduction • It’s a type of Artificial

    Intelligence that allows machines to learn from data and make predictions or decisions based on that data. • Machine learning is becoming increasingly important in various industries, such as healthcare, gaming, retail and more. • With machine learning, developers can create intelligent systems that can solve complex problems and improve efficiency. Artificial Intelligence Machine Learning Deep Learning
  7. Introduction Types of Machine Learning • Supervised Learning: A model

    learns from labeled training data to make predictions on new, unlabeled data. • Unsupervised Learning: A model learns from unlabeled data to find patterns and structures in the data. • Reinforcement Learning: A model learns by interacting with an environment and receiving rewards or penalties for its actions.
  8. Introduction Machine Learning process • Define the problem and gather

    data • Preprocess and clean the data • Select a model and train it on the data • Evaluate the model's performance on a test set • Deploy the model to make predictions on new data
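The train/evaluate steps of that process can be sketched with a toy model — here a nearest-class-mean classifier. All names are illustrative and have nothing to do with ML Kit; it only shows the shape of "train on data, evaluate on a held-out set":

```kotlin
// Toy "model": classify a value by which class mean it is closest to.
data class NearestMeanModel(val means: Map<String, Double>) {
    fun predict(x: Double): String =
        means.minByOrNull { (_, mean) -> kotlin.math.abs(x - mean) }!!.key
}

// Train: compute the mean of each labeled class.
fun train(data: List<Pair<Double, String>>): NearestMeanModel =
    NearestMeanModel(
        data.groupBy({ it.second }, { it.first })
            .mapValues { (_, xs) -> xs.average() }
    )

// Evaluate: accuracy on a held-out test set.
fun accuracy(model: NearestMeanModel, test: List<Pair<Double, String>>): Double =
    test.count { (x, label) -> model.predict(x) == label }.toDouble() / test.size
```

Deployment is then just shipping the learned parameters (here, the two means) and calling `predict` on new data.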
  9. Introduction Common Machine Learning algorithms • Linear Regression: Predict a

    continuous output based on one or more input features. • Logistic Regression: Predict a binary or categorical output based on one or more input features. • Decision Trees: Make decisions by recursively splitting data into subsets based on input features. A B E D C
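As a concrete example, one-feature linear regression has a closed-form least-squares fit — a self-contained sketch, not an ML Kit API:

```kotlin
// Fit y ≈ slope * x + intercept by ordinary least squares:
// slope = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)², intercept = ȳ - slope * x̄
fun fitLine(xs: DoubleArray, ys: DoubleArray): Pair<Double, Double> {
    require(xs.size == ys.size && xs.isNotEmpty())
    val mx = xs.average()
    val my = ys.average()
    val slope = xs.indices.sumOf { (xs[it] - mx) * (ys[it] - my) } /
                xs.indices.sumOf { (xs[it] - mx) * (xs[it] - mx) }
    return slope to (my - slope * mx)
}
```

Fitting points that lie exactly on y = 2x + 1 recovers slope 2 and intercept 1.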
  10. Introduction Common Machine Learning algorithms • Random Forest: A collection

    of decision trees that each make a prediction and are combined to make a final prediction. • Neural Networks: A model inspired by the structure of the human brain, consisting of layers of interconnected nodes that learn to recognize patterns in data.
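The "combine the trees" step of a random forest is, for classification, just a majority vote over the per-tree predictions — a minimal illustration:

```kotlin
// Combine many weak predictions into one final answer by majority vote,
// the same aggregation a random forest applies to its decision trees.
fun majorityVote(predictions: List<String>): String =
    predictions.groupingBy { it }.eachCount().maxByOrNull { it.value }!!.key
```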
  11. Introduction Application of Machine Learning • Natural Language Processing (NLP):

    Text classification, sentiment analysis, language translation, chatbots... • Computer Vision: Object detection, image segmentation, facial recognition, barcode scanning... • Fraud Detection: Detecting fraudulent transactions or behavior. • Recommender Systems: Suggesting products, services, or content based on user preferences.
  12. Exploring ML Kit on Android A little bit of context…

    Overview of ML Kit features
  13. In a nutshell ML Kit on Android • ML Kit

    brings powerful and easy-to-use ML features, optimized for Android and iOS, with minimal code and resources. • It provides pre-built and customizable models for common use cases such as image and text recognition, face detection, barcode scanning… • ML Kit also allows developers to train custom models using their own data.
  14. Model installation ML Kit on Android • Models in ML

    Kit APIs can be installed in one of 3 ways: ◦ Unbundled: Models are downloaded and managed via Google Play Services. ◦ Bundled: Models are statically linked to your app at build time. ◦ Dynamically downloaded: Models are downloaded on demand. • By using ML Kit you will increase your app size (2 to 10 MB per model).
  15. Global performance tips ML Kit on Android • Prefer using

    the camera2 or CameraX libraries: ◦ Drop frames while ML Kit is processing. ◦ Take advantage of the backpressure strategy (ImageAnalysis.STRATEGY_KEEP_ONLY_LATEST) • Consider processing images at a lower resolution to improve performance, but keep the API requirements in mind. • Wait for the ML Kit results before rendering the image.
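These tips fit together in a CameraX analyzer roughly like this — a sketch, assuming CameraX is already set up in an Android app and `recognizer` is any ML Kit client (here a `TextRecognizer`); it is not runnable outside Android:

```kotlin
import android.content.Context
import android.util.Size
import androidx.camera.core.ExperimentalGetImage
import androidx.camera.core.ImageAnalysis
import androidx.core.content.ContextCompat
import com.google.mlkit.vision.common.InputImage
import com.google.mlkit.vision.text.TextRecognizer

@ExperimentalGetImage
fun bindAnalyzer(context: Context, recognizer: TextRecognizer): ImageAnalysis {
    val analysis = ImageAnalysis.Builder()
        // Drop frames while ML Kit is busy: only the latest frame is kept.
        .setBackpressureStrategy(ImageAnalysis.STRATEGY_KEEP_ONLY_LATEST)
        // Lower resolution speeds processing up; check each API's minimum requirements.
        .setTargetResolution(Size(1280, 720))
        .build()
    analysis.setAnalyzer(ContextCompat.getMainExecutor(context)) { imageProxy ->
        val mediaImage = imageProxy.image
        if (mediaImage == null) {
            imageProxy.close()
            return@setAnalyzer
        }
        val input = InputImage.fromMediaImage(mediaImage, imageProxy.imageInfo.rotationDegrees)
        recognizer.process(input)
            // Close the frame only once ML Kit is done (success or failure),
            // so the next frame can be delivered.
            .addOnCompleteListener { imageProxy.close() }
    }
    return analysis
}
```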
  16. Text Recognition v2 (beta) Vision • Text Recognition v2 allows

    us to extract text from images (camera or static images) • Trained to recognize text in over 100 languages, including Latin-based scripts and non-Latin scripts such as Japanese or Chinese. • The Text Recognizer segments text into blocks, lines, elements and symbols. Bundled model: +4 MB per architecture
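The block/line/element hierarchy above can be walked like this — a sketch using the ML Kit `Text` result type, where `visionText` is the object returned by `recognizer.process()` and the `Log` tag is illustrative:

```kotlin
import android.util.Log
import com.google.mlkit.vision.text.Text

// Walk the recognized Text hierarchy: blocks -> lines -> elements,
// each carrying its own text and bounding box.
fun dumpText(visionText: Text) {
    visionText.textBlocks.forEach { block ->
        Log.d("OCR", "Block '${block.text}' at ${block.boundingBox}")
        block.lines.forEach { line ->
            Log.d("OCR", "  Line '${line.text}'")
            line.elements.forEach { element ->
                Log.d("OCR", "    Element '${element.text}'")
            }
        }
    }
}
```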
  18. dependencies {
          // To recognize Latin script
          implementation 'com.google.mlkit:text-recognition:16.0.0-beta6'
          // To recognize Chinese script
          implementation 'com.google.mlkit:text-recognition-chinese:16.0.0-beta6'
          // To recognize Devanagari script
          implementation 'com.google.mlkit:text-recognition-devanagari:16.0.0-beta6'
          // To recognize Japanese script
          implementation 'com.google.mlkit:text-recognition-japanese:16.0.0-beta6'
          // To recognize Korean script
          implementation 'com.google.mlkit:text-recognition-korean:16.0.0-beta6'
      }
  19. // Init the TextRecognition client (here Latin languages)
      val recognizer = TextRecognition.getClient(TextRecognizerOptions.DEFAULT_OPTIONS)
      // Load a bitmap image, for instance
      val image = InputImage.fromBitmap(bitmap, 0)
      // Use the client to process the image
      val result = recognizer.process(image)
          .addOnSuccessListener { visionText ->
              // Get the text from the image and info about where it is located
              val allText = visionText.text
              val blocks = visionText.textBlocks
              // ...
          }
          .addOnFailureListener { e ->
              // Task failed with an exception
          }
  20. Face detection Vision • Face detection to detect faces in

    images and video streams. • Get the contours of detected faces and of their eyes, eyebrows, lips and nose. • Determine facial expressions, like whether someone is smiling or closing their eyes. • Key concepts: face tracking, contour, landmark or classification. Bundled model: 6.9 MB Unbundled model: 800 KB
  21. dependencies {
          // Bundled library
          implementation 'com.google.mlkit:face-detection:16.1.5'
          // Unbundled library (via Google Play Services)
          implementation 'com.google.android.gms:play-services-mlkit-face-detection:17.1.0'
      }
      // If you chose the Play Services model
      <application ...>
          ...
          <meta-data
              android:name="com.google.mlkit.vision.DEPENDENCIES"
              android:value="face" />
          <!-- To use multiple models: android:value="face,model2,model3" -->
      </application>
  22. // High-accuracy landmark detection and face classification
      val highAccuracyOpts = FaceDetectorOptions.Builder()
          .setPerformanceMode(FaceDetectorOptions.PERFORMANCE_MODE_ACCURATE)
          .setLandmarkMode(FaceDetectorOptions.LANDMARK_MODE_ALL)
          .setClassificationMode(FaceDetectorOptions.CLASSIFICATION_MODE_ALL)
          .build()
      // Real-time contour detection
      val realTimeOpts = FaceDetectorOptions.Builder()
          .setContourMode(FaceDetectorOptions.CONTOUR_MODE_ALL)
          .build()
      // Set up the face detection client
      val detector = FaceDetection.getClient(highAccuracyOpts) // or realTimeOpts here
  23. // Set up the face detection client
      val detector = FaceDetection.getClient(options)
      // Process the image previously computed
      val result = detector.process(image)
          .addOnSuccessListener { faces ->
              // Do something!
              faces.forEach { face ->
                  // If contour detection was enabled:
                  val leftEyeContour = face.getContour(FaceContour.LEFT_EYE)?.points
                  val upperLipBottomContour = face.getContour(FaceContour.UPPER_LIP_BOTTOM)?.points
                  // If classification was enabled:
                  if (face.smilingProbability != null) {
                      val smileProb = face.smilingProbability
                  }
              }
          }
          .addOnFailureListener { e ->
              // Something wrong happened
          }
  25. Face mesh detection (beta) Vision • Generate a high accuracy

    3D mesh of your face. • Get the bounding box for detected faces in a selfie-like picture, or get the 468-point 3D mesh for AR purposes, for example. • You can use static images or real-time video frames to generate the mesh. Bundled model: 6.4 MB
  26. // Default face mesh detection
      val defaultDetector = FaceMeshDetection.getClient(
          FaceMeshDetectorOptions.DEFAULT_OPTIONS)
      // Bounding box only
      val boundingBoxDetector = FaceMeshDetection.getClient(
          FaceMeshDetectorOptions.Builder()
              .setUseCase(UseCase.BOUNDING_BOX_ONLY)
              .build()
      )
  27. // Default face mesh detection
      val detector = FaceMeshDetection.getClient(
          FaceMeshDetectorOptions.DEFAULT_OPTIONS)
      val result = detector.process(image)
          .addOnSuccessListener { faceMeshes ->
              // Meshes detected!
              faceMeshes.forEach { faceMesh ->
                  val bounds: Rect = faceMesh.boundingBox
                  // Get all the points
                  val faceMeshPoints = faceMesh.allPoints
                  faceMeshPoints.forEach { faceMeshPoint ->
                      val index: Int = faceMeshPoint.index
                      val position = faceMeshPoint.position
                  }
              }
          }
          .addOnFailureListener { e ->
              // Something wrong happened
          }
  29. Barcode scanning Vision • Read and recognize most popular barcodes

    like Codabar or QR code. • Automatic format detection. • Runs on-device, so no internet connection is needed to perform the scans. Bundled model: 2.4 MB Unbundled model: 200 KB
  30. dependencies {
          // Bundled library
          implementation 'com.google.mlkit:barcode-scanning:17.1.0'
          // Unbundled library (via Google Play Services)
          implementation 'com.google.android.gms:play-services-mlkit-barcode-scanning:18.2.0'
      }
      // If you chose the Play Services model
      <application ...>
          ...
          <meta-data
              android:name="com.google.mlkit.vision.DEPENDENCIES"
              android:value="barcode" />
          <!-- To use multiple models: android:value="barcode,model2,model3" -->
      </application>
  31. val options = BarcodeScannerOptions.Builder()
          .setBarcodeFormats(
              Barcode.FORMAT_QR_CODE,
              Barcode.FORMAT_CODABAR // any format you want to support
          )
          .enableAllPotentialBarcodes() // Optional. Starting from 17.1.0
          .build()
      val scanner = BarcodeScanning.getClient(options)
  32. val scanner = BarcodeScanning.getClient(options)
      // Use the image previously computed to perform the detection
      val result = scanner.process(image)
          .addOnSuccessListener { barcodes ->
              barcodes.forEach { barcode ->
                  val valueType = barcode.valueType
                  when (valueType) {
                      Barcode.TYPE_WIFI -> {
                          val ssid = barcode.wifi!!.ssid
                          val password = barcode.wifi!!.password
                          val type = barcode.wifi!!.encryptionType
                      }
                      Barcode.TYPE_URL -> {
                          val title = barcode.url!!.title
                          val url = barcode.url!!.url
                      }
                      else -> barcode.rawValue
                  }
              }
          }
          .addOnFailureListener {
              // Error
          }
  33. Digital ink recognition Vision • Recognize handwritten text or drawn

    emojis and convert them into Unicode text. • Supports 300+ languages and 25+ writing systems. • Dynamically download the language assets you want to use. Bundled model: 4.5 MB
  34. // Specify the recognition model for a language
      var modelIdentifier: DigitalInkRecognitionModelIdentifier
      try {
          modelIdentifier = DigitalInkRecognitionModelIdentifier.fromLanguageTag("en-US")
      } catch (e: MlKitException) {
          // Language tag failed to parse, handle the error.
      }
      var model: DigitalInkRecognitionModel =
          DigitalInkRecognitionModel.builder(modelIdentifier).build()
      // Get a recognizer for the language
      var recognizer: DigitalInkRecognizer = DigitalInkRecognition.getClient(
          DigitalInkRecognizerOptions.builder(model).build())
  35. // Populate the ink builder with data collected when writing the text
      // You can use onTouchEvent to achieve that
      var inkBuilder = Ink.builder()
      ...
      recognizer.recognize(ink)
          .addOnSuccessListener { result: RecognitionResult ->
              // `result` contains the recognizer's answers as a RecognitionResult.
              result.candidates.forEach { text ->
                  Log.i(TAG, "Text candidates: $text")
              }
          }
          .addOnFailureListener { e: Exception ->
              Log.e(TAG, "Error during recognition: $e")
          }
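The elided ink-building step could look roughly like this — a sketch using the ML Kit digital-ink `Ink` builder API; wiring `onTouch` into a View's `onTouchEvent` is left out:

```kotlin
import android.view.MotionEvent
import com.google.mlkit.vision.digitalink.Ink

val inkBuilder = Ink.builder()
var strokeBuilder = Ink.Stroke.builder()

// Collect one stroke per touch-down/up gesture.
fun onTouch(event: MotionEvent) {
    val point = Ink.Point.create(event.x, event.y, System.currentTimeMillis())
    when (event.actionMasked) {
        MotionEvent.ACTION_DOWN -> {
            strokeBuilder = Ink.Stroke.builder()
            strokeBuilder.addPoint(point)
        }
        MotionEvent.ACTION_MOVE -> strokeBuilder.addPoint(point)
        MotionEvent.ACTION_UP -> {
            strokeBuilder.addPoint(point)
            inkBuilder.addStroke(strokeBuilder.build())
        }
    }
}
// Later: recognizer.recognize(inkBuilder.build())
```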
  36. Language Identification Natural Language • Identifies the language of a

    text. • Can recognize over 100 languages, including Latin-based scripts and non-Latin scripts such as Japanese, Greek or Chinese. • The LanguageIdentifier will return the best languages according to a given confidence threshold. Bundled model: 900 KB Unbundled model: 200 KB
  37. dependencies {
          // Bundled library
          implementation 'com.google.mlkit:language-id:17.0.4'
          // Unbundled library (via Google Play Services)
          implementation 'com.google.android.gms:play-services-mlkit-language-id:17.0.0'
      }
      // If you chose the Play Services model
      <application ...>
          ...
          <meta-data
              android:name="com.google.mlkit.vision.DEPENDENCIES"
              android:value="langid" />
          <!-- To use multiple models: android:value="langid,model2,model3" -->
      </application>
  38. val languageIdentifier = LanguageIdentification.getClient(
          LanguageIdentificationOptions.Builder()
              .setConfidenceThreshold(0.40f)
              .build())
      languageIdentifier.identifyLanguage(text)
          .addOnSuccessListener { languageCode ->
              if (languageCode == "und") {
                  // Can't identify the language.
              } else {
                  // Language identified!
              }
          }
          .addOnFailureListener {
              // Error
          }
  39. val languageIdentifier = LanguageIdentification.getClient(
          LanguageIdentificationOptions.Builder()
              .setConfidenceThreshold(0.40f)
              .build())
      languageIdentifier.identifyPossibleLanguages(text)
          .addOnSuccessListener { identifiedLanguages ->
              for (identifiedLanguage in identifiedLanguages) {
                  val language = identifiedLanguage.languageTag
                  val confidence = identifiedLanguage.confidence
                  // Languages identified, do something :)
              }
          }
          .addOnFailureListener {
              // Error
          }
  40. Translation Natural Language • Offline translation powered by the same

    models used by Google Translate. • Can translate more than 50 languages. • Dynamically download the translation models you want on your device. • On-device translations aren't the most accurate; for better results, use the Cloud Translation API.
  41. val options = TranslatorOptions.Builder()
          .setSourceLanguage(TranslateLanguage.ENGLISH)
          .setTargetLanguage(TranslateLanguage.FRENCH)
          .build()
      val translator = Translation.getClient(options)
      // Download the translation model if needed
      var conditions = DownloadConditions.Builder()
          .requireWifi()
          .build()
      translator.downloadModelIfNeeded(conditions)
          .addOnSuccessListener {
              // Model downloaded successfully. Let's translate!
          }
          .addOnFailureListener { exception ->
              // Error
          }
  42. val options = TranslatorOptions.Builder()
          .setSourceLanguage(TranslateLanguage.ENGLISH)
          .setTargetLanguage(TranslateLanguage.FRENCH)
          .build()
      val translator = Translation.getClient(options)
      translator.translate(text)
          .addOnSuccessListener { translatedText ->
              // Translation successful. Use it!
              println("This is my translated text: $translatedText")
          }
          .addOnFailureListener { exception ->
              // Error
          }
  43. Smart Reply Natural Language • Generates short replies or emojis

    from a list of up to 10 messages. It generally gives you 3 suggestions based on the input. • Works only with the English language ☹ • Runs on-device, so no need to send the messages to a server. • Smart Reply is aimed at casual conversation, so it does not suit all types of conversation (work, business…). Bundled model: 5.7 MB Unbundled model: 200 KB
  44. dependencies {
          // Bundled library
          implementation 'com.google.mlkit:smart-reply:17.0.2'
          // Unbundled library (via Google Play Services)
          implementation 'com.google.android.gms:play-services-mlkit-smart-reply:16.0.0-beta1'
      }
      // If you chose the Play Services model
      <application ...>
          ...
          <meta-data
              android:name="com.google.mlkit.vision.DEPENDENCIES"
              android:value="smart_reply" />
          <!-- To use multiple models: android:value="smart_reply,model2,model3" -->
      </application>
  45. // The conversation
      val conversation = mutableListOf<TextMessage>()
      conversation.add(TextMessage.createForLocalUser(
          "Hey what's up! Ready for plDroid?", System.currentTimeMillis()))
      conversation.add(TextMessage.createForRemoteUser(
          "Yeah! Are you coming to the conference?", System.currentTimeMillis(), userId))
      // Generate replies
      val smartReplyGenerator = SmartReply.getClient()
      smartReplyGenerator.suggestReplies(conversation)
          .addOnSuccessListener { result ->
              if (result.status == SmartReplySuggestionResult.STATUS_NOT_SUPPORTED_LANGUAGE) {
                  // The conversation's language isn't supported, no suggestions.
              } else if (result.status == SmartReplySuggestionResult.STATUS_SUCCESS) {
                  result.suggestions.forEach { suggestion ->
                      // Do something
                      val replyText = suggestion.text
                  }
              }
          }
          .addOnFailureListener {
              // Error, do something :)
          }
  47. Entity extraction (beta) Natural Language • Recognizes specific entities in

    a text. Entities can be an email, an address, a phone number, a date, a flight number… • Can recognize entities in 15 different languages, like English, French, Polish or Arabic. • Dynamically download the entity extraction model you want on your device. Bundled model: 5.6 MB
  48. private fun initExtractor() {
          entityExtractor = EntityExtraction.getClient(
              EntityExtractorOptions.Builder(EntityExtractorOptions.ENGLISH).build()
          )
          entityExtractor.downloadModelIfNeeded()
              .addOnSuccessListener { _ ->
                  // Model downloading succeeded, you can call the extraction API here.
              }
              .addOnFailureListener { _ ->
                  // Model downloading failed.
              }
      }
  49. val params = EntityExtractionParams.Builder(message).build()
      entityExtractor.annotate(params)
          .addOnSuccessListener { annotations ->
              // Annotation succeeded, you can parse the EntityAnnotations list here.
              val sb = StringBuilder()
              val entities = mutableListOf<Entity>()
              for (annotation in annotations) {
                  entities.add(Entity(annotation.start, annotation.end))
                  sb.append(annotation.annotatedText + "=")
                  for (entity in annotation.entities) {
                      sb.appendLine(entity.toString())
                  }
              }
              entitiesViewState.postValue(sb.toString())
              val newMessages = messagesViewState.value?.map { state ->
                  if (state is MessageViewState && state.id == id) {
                      state.copy(entities = entities)
                  } else {
                      state
                  }
              } ?: emptyList()
              messagesViewState.postValue(newMessages)
          }
          .addOnFailureListener {
              // Check the failure message here.
          }
  50. ML Kit features in action! A little bit of context…

    Live from the studio 🚀
  51. Learn more about ML Journal of Machine Learning Research https://www.jmlr.org/

    Machine Learning Crash Course by Google https://developers.google.com/machine-learning/crash-course Machine Learning lessons https://www.coursera.org/browse/data-science/machine-learning
  52. To deep dive into ML Kit… ML Kit documentation https://developers.google.com/ml-kit

    ML/AI Codelabs https://codelabs.developers.google.com/?category=aiandmachinelearning&product=android ML Kit samples (Android/iOS) https://github.com/googlesamples/mlkit/