ML Kit in Action

jinqian
June 20, 2018


Slides for MobileThings meetup: When Machine Learning meets Augmented Reality (ML Kit / Core ML + ARKit)

Demo repo can be found here: https://github.com/jinqian/MLKit-in-actions


Transcript

  1. ML Kit in Action (Android) Mobile Things S02E03 When machine

    learning meets augmented reality Qian JIN | @bonbonking | qjin@xebia.fr Image Credit: https://becominghuman.ai/part-1-migrate-deep-learning-training-onto-mobile-devices-c28029ffeb30
  2. ML Kit in Action • The building blocks of ML

    Kit • Vision APIs: text recognition, face detection, barcode scanning, image labeling, landmark recognition • Custom Models • Custom TensorFlow build • General feedback
  3. The building blocks of ML Kit

  4. Mobile Vision API + Google Cloud Vision API = ML Kit

    Vision APIs; TensorFlow Lite + Neural Network API = ML Kit Custom Models / TF Lite Build
  5. None
  6. None
  7. None
  8. Vision APIs

  9. Vision: You talking to me?

  10. FirebaseVisionImage • fromBitmap • fromByteArray • fromByteBuffer • fromFilePath •

    fromMediaImage
  11. None
  12. Text Recognition, On-device vs. Cloud • Pricing: free on-device; free

    for the first 1000 uses of this feature per month in the Cloud • Ideal use cases: real-time processing on-device; high-accuracy text recognition and document scanning in the Cloud • Language support: Latin characters on-device; a broad range of languages and special characters in the Cloud
  13. FirebaseVisionImage (INPUT) -> FirebaseVisionTextDetector -> FirebaseVisionText (OUTPUT)

  14. for (FirebaseVisionText.Block block : firebaseVisionText.getBlocks()) {

      Rect boundingBox = block.getBoundingBox();
      Point[] cornerPoints = block.getCornerPoints();
      String text = block.getText();
      for (FirebaseVisionText.Line line : block.getLines()) {
        // ...
        for (FirebaseVisionText.Element element : line.getElements()) {
          // ...
        }
      }
    }
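The hierarchy the loop above walks (blocks contain lines, lines contain elements) can be sketched with plain stand-in classes. These are illustrative only, not the real FirebaseVisionText types, which also carry bounding boxes and corner points:

```java
import java.util.List;
import java.util.stream.Collectors;

// Stand-ins for the Block -> Line -> Element structure; an Element is
// reduced to its text for this sketch.
class TextBlock {
    final List<TextLine> lines;
    TextBlock(List<TextLine> lines) { this.lines = lines; }
}

class TextLine {
    final List<String> elements;
    TextLine(List<String> elements) { this.elements = elements; }
}

class TextResult {
    // Rebuild the recognised text: elements joined by spaces, one output
    // line per TextLine.
    static String fullText(List<TextBlock> blocks) {
        return blocks.stream()
            .flatMap(b -> b.lines.stream())
            .map(line -> String.join(" ", line.elements))
            .collect(Collectors.joining("\n"));
    }
}
```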
  15. None
  16. None
  17. Face Detection: Key Capabilities • Recognise and locate facial features

    • Recognise facial expressions • Track faces across video frames • Process video frames in real time
  18. Face tracking Landmark Classification Face Orientation

  19. Face Orientation • Euler X • Euler Y • Euler

    Z
  20. Landmarks • A landmark is a point of interest within

    a face. The left eye, right eye, and nose base are all examples of landmarks
  21. Classification • 2 classifications are supported: Eye open (left &

    right eye) & Smiling • Inspiration: Android Things photo booth
  22. None
  23. None
  24. Face Detection Options

    FirebaseVisionFaceDetectorOptions options =
        new FirebaseVisionFaceDetectorOptions.Builder()
            .setModeType(FirebaseVisionFaceDetectorOptions.ACCURATE_MODE)
            .setLandmarkType(FirebaseVisionFaceDetectorOptions.ALL_LANDMARKS)
            .setClassificationType(FirebaseVisionFaceDetectorOptions.ALL_CLASSIFICATIONS)
            .setMinFaceSize(0.15f)
            .setTrackingEnabled(true)
            .build();
  25. FirebaseVisionImage (INPUT) -> FirebaseVisionFaceDetector -> List<FirebaseVisionFace> (OUTPUT)

  26. ➡ boundingBox: Rect ➡ trackingId: Int ➡ headEulerAngleY: Float ➡

    headEulerAngleZ: Float ➡ smilingProbability: Float ➡ leftEyeOpenProbability: Float ➡ rightEyeOpenProbability: Float
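A minimal sketch of how the probability fields above might be consumed. The 0.5 threshold and the helper itself are assumptions for illustration, not ML Kit constants:

```java
// Turn the per-face probabilities into a human-readable description.
// THRESHOLD is an arbitrary illustrative cut-off, not an ML Kit value.
class FaceDescriber {
    static final float THRESHOLD = 0.5f;

    static String describe(float smilingProbability,
                           float leftEyeOpenProbability,
                           float rightEyeOpenProbability) {
        boolean smiling = smilingProbability >= THRESHOLD;
        boolean eyesOpen = leftEyeOpenProbability >= THRESHOLD
                && rightEyeOpenProbability >= THRESHOLD;
        if (smiling && eyesOpen) return "smiling, eyes open";
        if (smiling) return "smiling";
        if (eyesOpen) return "neutral, eyes open";
        return "neutral";
    }
}
```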
  27. Feedback • Real-time applications: pay attention to the image size

    /fr.xebia.mlkitinactions E/pittpatt: online_face_detector.cc:236] inconsistent image dimensions
    detector.cc:220] inconsistent image dimensions
    /fr.xebia.mlkitinactions E/NativeFaceDetectorImpl: Native face detection failed
    java.lang.RuntimeException: Error detecting faces.
        at com.google.android.gms.vision.face.NativeFaceDetectorImpl.detectFacesJni(Native Method)
  28. None
  29. None
  30. None
  31. Image Labeling, On-device vs. Cloud • Pricing: free on-device; free

    for the first 1000 uses of this feature per month in the Cloud • Label coverage: 400+ labels covering the most commonly found concepts in photos on-device; 10,000+ labels in many categories in the Cloud (try the Cloud Vision API demo to see what labels can be found for an image you provide) • Knowledge Graph entity ID support
  32. FirebaseVisionImage (INPUT) -> FirebaseVisionLabelDetector -> List<FirebaseVisionLabel> (OUTPUT)

  33. ➡ label: String ➡ confidence: Float ➡ entityId: String
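A typical way to consume these fields is to keep only labels whose confidence clears a threshold, best first. The Label class below only mirrors the three fields on this slide, and the threshold in the test is arbitrary:

```java
import java.util.List;
import java.util.stream.Collectors;

// Illustrative stand-in for FirebaseVisionLabel's three fields.
class Label {
    final String label;
    final float confidence;
    final String entityId;
    Label(String label, float confidence, String entityId) {
        this.label = label;
        this.confidence = confidence;
        this.entityId = entityId;
    }
}

class Labels {
    // Keep labels at or above minConfidence, highest confidence first.
    static List<String> topLabels(List<Label> labels, float minConfidence) {
        return labels.stream()
            .filter(l -> l.confidence >= minConfidence)
            .sorted((a, b) -> Float.compare(b.confidence, a.confidence))
            .map(l -> l.label)
            .collect(Collectors.toList());
    }
}
```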

  34. None
  35. None
  36. Landmark Recognition • Still in preview, using Cloud Vision API

    instead • Recognizes well-known landmarks • Get Google Knowledge Graph entity IDs • Low-volume use free (first 1000 images)
  37. Custom Model

  38. One night towards the end of 2016

  39. None
  40. None

  41. Classifier pipeline: Camera Preview -> Image (Bitmap) -> input_tensor ->

    Classifier Implementation (Android SDK (Java) -> TensorFlow JNI wrapper -> Android NDK (C++), with the Trained Model) -> top_results -> Classifications + Confidence -> Overlay Display Ref: https://jalammar.github.io/Supercharging-android-apps-using-tensorflow/
  42. Magritte: Ceci n’est pas une pomme. (“This is not an apple.”)

  43. None
  44. Android Makers Paris, April 2017

  45. None
  46. Model Size All weights are stored as they are (32-bit

    floats) => 80MB
  47. Weights Quantization ~80MB -> ~20MB 6.372638493746383 => 6.4 Source:

    https://www.tensorflow.org/performance/quantization
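The idea behind weight quantization can be sketched as a linear mapping from 32-bit floats in a known [min, max] range onto 8-bit integers. The scheme below is illustrative, not TensorFlow's exact implementation:

```java
// Linear 8-bit quantization: encode a float into one byte's worth of
// levels (0..255), then decode it back with some precision loss.
class Quantizer {
    static int quantize(float x, float min, float max) {
        return Math.round((x - min) / (max - min) * 255f);
    }

    static float dequantize(int q, float min, float max) {
        return min + q / 255f * (max - min);
    }
}
```

A weight like 6.372638 stored in an assumed [0, 10] range comes back as roughly 6.39: a small precision loss in exchange for 1 byte instead of 4 per weight, which is what shrinks the model from ~80MB to ~20MB.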
  48. Model Inception V3 Optimized & Quantized

  49. Google I/O, May 2017

  50. None
  51. Google AI Blog, June 2017

  52. MobileNet Mobile-first computer vision models for TensorFlow Image

    credit: https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet_v1.md
  53. None
  54. ~80MB => ~20MB => ~1-5MB Source: https://research.googleblog.com/2017/06/mobilenets-open-source-models-for.html

  55. DevFest Nantes, October 2017

  56. None
  57. Model MobileNets_0.25_224

  58. Google I/O, May 2018

  59. Custom Model: Key capabilities • TensorFlow Lite model hosting •

    On-device ML inference • Automatic model fallback • Automatic model updates
  60. Convert model to TF Lite (model.lite) Host your TF Lite

    model on Firebase Use the TF Lite model for inference Train your TF model (model.pb) TOCO (TensorFlow Lite Optimizing Converter)
  61. How to train your dragon model?

  62. Train your model Source: https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/

    python tensorflow/examples/image_retraining/retrain.py \
      --bottleneck_dir=tf_files/bottlenecks \
      --how_many_training_steps=500 \
      --model_dir=tf_files/models/ \
      --summaries_dir=tf_files/training_summaries/ \
      --output_graph=tf_files/retrained_graph.pb \
      --output_labels=tf_files/retrained_labels.txt \
      --architecture=mobilenet_0.50_224 \
      --image_dir=tf_files/fruit_photos
  63. TF Lite conversion for retrained quantized models is currently unavailable,

    while the Firebase ML Kit quickstart sample only targets quantized models.
  64. Convert to tflite format Source: https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/

    bazel run --config=opt \
      //tensorflow/contrib/lite/toco:toco -- \
      --input_file=/tmp/magritte_retrained_graph.pb \
      --output_file=/tmp/magritte_graph.tflite \
      --inference_type=FLOAT \
      --input_shape=1,224,224,3 \
      --input_array=input \
      --output_array=final_result \
      --mean_value=128 \
      --std_value=128 \
      --default_ranges_min=0 \
      --default_ranges_max=6
  65. None
  66. Do you need custom models?

  67. Use custom models if • Your specific needs CANNOT be

    met by the general-purpose APIs • You need high matching precision • You are an experienced ML developer (or you know Yoann Benoit) Let me train your model!
  68. FirebaseModelInputs (INPUT) -> FirebaseModelInterpreter -> FirebaseModelOutputs (OUTPUT)

  69. // input & output options for non-quantized model

    val inputDims = intArrayOf(DIM_BATCH_SIZE, DIM_IMG_SIZE_X, DIM_IMG_SIZE_Y, DIM_PIXEL_SIZE)
    val outputDims = intArrayOf(1, labelList.size)
    inputOutputOptions = FirebaseModelInputOutputOptions.Builder()
        .setInputFormat(0, FirebaseModelDataType.FLOAT32, inputDims)
        .setOutputFormat(0, FirebaseModelDataType.FLOAT32, outputDims)
        .build()
  70. // input & output options for non-quantized model

    val inputDims = intArrayOf(DIM_BATCH_SIZE, DIM_IMG_SIZE_X, DIM_IMG_SIZE_Y, DIM_PIXEL_SIZE)
    val outputDims = intArrayOf(1, labelList.size)
    inputOutputOptions = FirebaseModelInputOutputOptions.Builder()
        .setInputFormat(0, FirebaseModelDataType.FLOAT32, inputDims)
        .setOutputFormat(0, FirebaseModelDataType.FLOAT32, outputDims)
        .build()
  71. ByteBuffer FirebaseModelInputs INPUT
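Filling that input ByteBuffer for a non-quantized FLOAT32 model typically looks like the sketch below. The (v - 128) / 128 normalisation mirrors the mean/std values passed to toco on slide 64; the helper itself is hypothetical, not part of the ML Kit API:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Pack ARGB pixels into a FLOAT32 input buffer: 3 channels per pixel,
// each byte value mapped from [0, 255] into roughly [-1, 1).
class InputPacker {
    static ByteBuffer toInputBuffer(int[] pixels, int width, int height) {
        ByteBuffer buffer = ByteBuffer
            .allocateDirect(4 * width * height * 3)  // 4 bytes per float
            .order(ByteOrder.nativeOrder());
        for (int pixel : pixels) {
            int r = (pixel >> 16) & 0xFF;
            int g = (pixel >> 8) & 0xFF;
            int b = pixel & 0xFF;
            buffer.putFloat((r - 128f) / 128f);
            buffer.putFloat((g - 128f) / 128f);
            buffer.putFloat((b - 128f) / 128f);
        }
        buffer.rewind();
        return buffer;
    }
}
```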

  72. FirebaseModelOutputs OUTPUT

  73. Performance Benchmarks

  74. Model MobileNets_1.0_224

  75. Model MobileNets_1.0_224

  76. None
  77. • No callback or other feedback for model downloading •

    Model downloading seems to be blocking => do not use it on the main thread • Lack of documentation at this point (e.g. how do you stop the interpreter?) • Slight performance loss compared to plain TensorFlow Lite • A/B test your machine learning model!
  78. HowTo: Face Recognition Model • Trained with Keras + FaceNet

    • Converted to TensorFlow • Then converted to TensorFlow Lite • Then we got stuck…
  79. Custom TensorFlow Lite build

  80. Custom TF Lite build • ML Kit uses a pre-built

    TensorFlow Lite library • Build your own AAR with bazel • For example, to add custom ops
  81. Takeaway

  82. ML Kit: State of the art • Lack of

    high-quality demos (e.g. the Firebase ML Kit quickstart has bugs, a deprecated camera API, a deformed camera preview) • Lack of high-level guidelines / best practices • Performance issues on old devices
  83. The best is yet to come • Face contours: 100

    data points • Smart Reply: conversation model • Online model compression
  84. References

  85. References • Talk Magritte for DroidCon London • Medium article:

    Android meets Machine Learning • GitHub repo for the demo • Joe Birch: Exploring Firebase MLKit on Android: Introducing MLKit (Part One) • Joe Birch: Exploring Firebase MLKit on Android: Face Detection (Part Two) • Thanks, Sandra ;)
  86. Questions?