On-device ML with TFLite - AI Nextcon

Talk at AI Nextcon.

Margaret Maynard-Reid

February 12, 2020

Transcript

  1. @margaretmz | #ML | #GDE Topics
    • Why on-device ML?
    • On-device ML options
    • E2E tf.Keras to TFLite to Android
      ◦ train a model from scratch
      ◦ convert to TFLite
      ◦ deploy to mobile and IoT
    • TFLite on microcontroller & Coral Edge TPU
  2. Intro: Why on-device ML?
    • Access to more data
    • Faster user interaction
    • Preserve privacy
    Unique constraints:
    • Less compute power
    • Limited memory
    • Battery consumption
  3. TensorFlow for mobile & edge devices
    • 2015: TF open sourced
    • 2016: TF Mobile
    • 2017: TF Lite developer preview
    • 2018: ML Kit
    • 2019:
      ◦ New ML Kit features
      ◦ TF Mobile deprecated
      ◦ New TFLite features!!!
  4. TensorFlow Lite
    • Converter - converts a model to the TFLite file format
    • Interpreter - executes inference, optimized for small devices
    • Ops/Kernel - limited ops
    • Interface to hardware acceleration
      ◦ NN API
      ◦ Edge TPU
  5. Optimization
    1. Reduce model size (TFLite model optimization toolkit)
      • Quantization - convert 32-bit floating point to fixed point (e.g. 8-bit int)
        ◦ Post-training quantization
        ◦ Quantization-aware training
      • Pruning - eliminating unnecessary values in the weight tensor
    2. Speed up inference (on Android)
      • GPU delegate
      • Android NNAPI
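To make the quantization bullet concrete, here is a minimal sketch of affine 8-bit quantization in plain Python. It illustrates the general idea only and is not the TFLite toolkit's implementation; the function names and the sample weights are made up for this example.

```python
def quantize_params(values, qmin=-128, qmax=127):
    """Derive a scale and zero point that map the value range onto [qmin, qmax]."""
    lo = min(min(values), 0.0)  # include 0 so that 0.0 is exactly representable
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # fall back to 1.0 for all-zero input
    zero_point = int(round(qmin - lo / scale))
    return scale, zero_point


def quantize(values, scale, zero_point, qmin=-128, qmax=127):
    """Map floats to 8-bit integers, clamping to the representable range."""
    return [max(qmin, min(qmax, int(round(v / scale)) + zero_point)) for v in values]


def dequantize(q_values, scale, zero_point):
    """Recover approximate floats from the 8-bit integers."""
    return [scale * (q - zero_point) for q in q_values]


# Hypothetical weights, for illustration only.
weights = [-1.0, -0.25, 0.0, 0.5, 1.5]
scale, zero_point = quantize_params(weights)
q_weights = quantize(weights, scale, zero_point)
restored = dequantize(q_weights, scale, zero_point)
```

Each weight now fits in one byte instead of four, at the cost of a rounding error of at most about one quantization step (`scale`) per value.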
  6. On-device ML Options (what / how, who, where)
    • Native Android (iOS) apps - Android (or iOS) developers
      ◦ Direct deploy to Android
      ◦ With ML Kit
      ◦ With MediaPipe
      ◦ Fritz.ai
    • React Native - Web developers
    • TFLite / TF micro - Embedded
      ◦ Microcontrollers
      ◦ Edge TPUs
  7. React Native Support
    • Use TF.js ML directly inside React Native with WebGL acceleration
    • Load models from the web, or compile into your application
  8. Google ML Kit
    Base APIs (out of the box):
    • Image labelling
    • OCR
    • Face detection
    • Barcode scanning
    • Landmark detection
    • Smart reply
    • Object detection & tracking
    • Translation (56 languages)
    • AutoML
    Custom models:
    • Dynamic model downloads
    • A/B testing (via Firebase Remote Config)
    • Model conversion (from TensorFlow to TFLite)
    Learn more about ML Kit: g.co/mlkit
  9. Why use ML Kit?
    [Diagram of a processing pipeline with stages including: image validation, conversion (byte array, ByteBuffer/bitmap, grayscale, RGB), resize/rotate, calibration, frame scheduler (image timestamp), frame selection, detector (TF Lite model), classifier (TF Lite model), tracker, object manager, pipeline config, and output results. Source: ML Kit team]
  10. Google ML Kit - AutoML
    • Firebase console
    • AutoML - train model
    • Download TFLite
    • Mobile & edge
    https://firebase.google.com/docs/ml-kit/automl-image-labeling
  11. MediaPipe
    A cross-platform AI pipeline framework by Google Research:
    • TensorFlow & TFLite
    • Desktop, web, mobile, Coral Edge TPUs
    • Fast & realtime
    • GPU
    • WebGL
    Source: MediaPipe GitHub
  12. Two talks on MediaPipe
    • @AI Nextcon, 2/13 1PM
    • @Google Seattle, 2/13 5PM: Google MediaPipe @Seattle by Ming Yong
  13. Fritz.ai
    Mobile ML made easy...
    • Supports Android & iOS
    • Features: image labelling & segmentation, object detection, style transfer, pose estimation…
    • Analytics, custom model hosting, perf monitoring…
    • Free up to a certain level of usage
    Source: Embrace your new look with Fritz Hair Segmentation
  14. End to end: model training to inference with TensorFlow 2.0
    Datasets → Train model → (Convert to TFLite) → Deploy for inference
  15. End to end: model training to inference in TF 2.0
    • Data
    • Model (training)
      ◦ tf.Keras (TensorFlow)
      ◦ Python libraries: NumPy, Matplotlib etc.
    • SavedModel or Keras model
    • Serving (inference)
      ◦ Cloud
      ◦ Web
      ◦ Mobile
      ◦ IoT
      ◦ Microcontrollers
      ◦ Edge TPU
  16. Data
    • Existing datasets
      ◦ Part of the deep learning framework:
        ▪ MNIST, CIFAR10, FASHION_MNIST, IMDB movie reviews etc.
      ◦ Open datasets:
        ▪ MNIST, MS-COCO, ImageNet, CelebA etc.
      ◦ Kaggle datasets: https://www.kaggle.com/datasets
      ◦ Google Dataset search tool: https://toolbox.google.com/datasetsearch
      ◦ TF 2.0: TFDS
    • Collect your own data
  17. Models
    Options for getting a model:
    • Download a pre-trained model: Inception-v3, MobileNet etc.
    • Transfer learning with a pre-trained model
      ◦ Feature extraction or fine-tuning on a pre-trained model
      ◦ TensorFlow Hub (https://www.tensorflow.org/hub/)
    • Train your own model from scratch (example in this talk)
  18. Model saving, conversion, deployment
    • Model saving - SavedModel or Keras model
    • Model conversion
      ◦ Convert the model to tflite format
      ◦ Validate the converted model before deploying
    • Deploy TFLite for inference
  19. MNIST dataset
    • 60,000 training images and 10,000 test images
    • 28x28x1 grayscale images
    • 10 classes: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
    • Popular for computer vision
      ◦ “hello world” tutorial or
      ◦ benchmarking ML algorithms
  20. Training the model in Colab
    Launch sample code on Colab → mnist_tfkeras_to_tflite.ipynb
    1. Import data
    2. Define model architecture
    3. Train the model
    4. Model saving & conversion
      ◦ Save a Keras model
      ◦ Convert to tflite format
  21. A typical CNN model architecture
    MNIST example:
    • Convolutional layer
    • Pooling layer
    • Dense (fully-connected) layer
    input → conv → pool → conv → pool → conv → pool → Dense → 0 1 2 3 4 5 6 7 8 9
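As a rough illustration of how a 28x28x1 MNIST image shrinks through the conv/pool stack above, the sketch below traces tensor shapes in plain Python. The 3x3 kernels without padding, the 2x2 pooling windows and the filter counts are assumptions for this example; the slide does not specify them.

```python
def conv_output(size, kernel=3, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_output(size, window=2):
    """Spatial output size of non-overlapping pooling (stride equals the window)."""
    return size // window


def trace_shapes(height, width, channels, conv_filters):
    """Follow an input through repeated conv -> pool blocks and return each shape."""
    shapes = [("input", (height, width, channels))]
    for filters in conv_filters:
        height, width = conv_output(height), conv_output(width)
        channels = filters
        shapes.append(("conv", (height, width, channels)))
        height, width = pool_output(height), pool_output(width)
        shapes.append(("pool", (height, width, channels)))
    # The Dense layer consumes the flattened feature map.
    shapes.append(("flatten", (height * width * channels,)))
    return shapes


# Hypothetical filter counts; the slide does not give them.
shapes = trace_shapes(28, 28, 1, conv_filters=[32, 64, 128])
for name, shape in shapes:
    print(name, shape)
```

Under these assumptions the spatial size goes 28 → 26 → 13 → 11 → 5 → 3 → 1, so the final Dense layer sees a 128-value vector and maps it to the 10 digit classes.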
  22. Inspect the model - in Python code
    In Python code, after defining the model architecture, use model.summary() to show the model architecture
  23. Visualize model
    Use a visualization tool:
    • TensorBoard
    • Netron (https://github.com/lutzroeder/Netron)
    Drop the .tflite model into Netron and see the model visually
    Note: model metadata, a new TFLite tool (to be launched), will allow you to inspect the model & modify the metadata
  24. Model saving
    When to save as SavedModel or a Keras model?
    • SavedModel
      ◦ Share pre-trained models and model pieces on TensorFlow Hub
      ◦ When you don’t know the deploy target
    • Keras model
      ◦ Train with tf.Keras and you know your deploy target
    Note: In TensorFlow 2.0, tf.keras.Model.save() and tf.keras.models.save_model() default to the SavedModel format (not HDF5).
  25. Model conversion (with TFLite converter)
    Command line:
      SavedModel
        tflite_convert \
          --saved_model_dir=/tmp/my_saved_model \
          --output_file=/tmp/my_model.tflite
      Keras model
        tflite_convert \
          --keras_model_file=/tmp/my_keras_model.h5 \
          --output_file=/tmp/my_model.tflite
    Python code (recommended):
        import tensorflow as tf

        # Create a converter from the in-memory tf.keras model (TF 2.x API;
        # replaces tf.contrib.lite.TFLiteConverter.from_keras_model_file)
        converter = tf.lite.TFLiteConverter.from_keras_model(model)
        # Enable post-training quantization (optional; replaces post_training_quantize=True)
        converter.optimizations = [tf.lite.Optimize.DEFAULT]
        # Convert the model
        tflite_model = converter.convert()
        # Create the tflite model file
        tflite_model_name = "my_model.tflite"
        with open(tflite_model_name, "wb") as f:
            f.write(tflite_model)
  26. Validate TFLite model after conversion
    Protip: validate the tflite model in Python after conversion (TensorFlow result vs. TFLite result, then compare)
        import numpy as np
        import tensorflow as tf

        # Load TFLite model and allocate tensors.
        interpreter = tf.lite.Interpreter(model_path="converted_model.tflite")
        interpreter.allocate_tensors()

        # Get input and output tensors.
        input_details = interpreter.get_input_details()
        output_details = interpreter.get_output_details()

        # Generate random input data matching the model's input shape.
        input_shape = input_details[0]['shape']
        input_data = np.array(np.random.random_sample(input_shape), dtype=np.float32)

        # TensorFlow result: test the TensorFlow model on the random input data.
        tf_results = model(tf.constant(input_data))

        # TFLite result: test the TFLite model on the same input data.
        interpreter.set_tensor(input_details[0]['index'], input_data)
        interpreter.invoke()
        tflite_results = interpreter.get_tensor(output_details[0]['index'])

        # Compare the results.
        for tf_result, tflite_result in zip(tf_results, tflite_results):
            np.testing.assert_almost_equal(tf_result, tflite_result, decimal=5)
  27. TFLite on Android
    Android sample code DigitRecognizer, step by step:
    • Place the tflite model under the assets folder
    • Update build.gradle dependencies
    • Input image - custom view, gallery or camera
    • Data preprocessing
    • Classify with the model
    • Post processing
    • Display result in UI
  28. Dependencies
    Update build.gradle to include TensorFlow Lite:
        android {
            // Make sure the model doesn't get compressed when the app is compiled
            aaptOptions {
                noCompress "tflite"
            }
        }
        dependencies {
            // …
            // Add dependency for TensorFlow Lite
            // ("implementation" replaces the deprecated "compile" configuration)
            implementation 'org.tensorflow:tensorflow-lite:[version-number]'
        }
    Place the mnist.tflite model file under the /assets folder
  29. Input - image data
    Input to the classifier is an image. Your options:
    • Draw on canvas from a custom View
    • Get an image from the Gallery or a 3rd party camera
    • Live frames from the Camera2 API
    Make sure the image dimensions (shape) match what your classifier expects:
    • 28x28x1 - MNIST or FASHION_MNIST grayscale image
    • 299x299x3 - Inception V3
    • 256x256x3 - MobileNet
  30. Image preprocessing
    • Convert Bitmap to ByteBuffer
    • Normalize pixel values to be within a certain range
    • Convert from color to grayscale, if needed
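The preprocessing steps above can be sketched in plain Python as follows. This is an illustration only, not the Android sample's code: the function names are made up, the [0, 1] target range and little-endian float32 layout are assumptions, and on Android the same steps would operate on a Bitmap and a ByteBuffer.

```python
import struct


def to_grayscale(rgb_pixels):
    """Convert (r, g, b) tuples in 0-255 to single luminance values in 0-255."""
    # ITU-R BT.601 luma weights, a common choice for RGB-to-gray conversion.
    return [0.299 * r + 0.587 * g + 0.114 * b for r, g, b in rgb_pixels]


def normalize(pixels, max_value=255.0):
    """Scale pixel values to the range [0.0, 1.0]."""
    return [p / max_value for p in pixels]


def to_float32_bytes(values):
    """Pack floats as little-endian float32, one 4-byte value per pixel."""
    return struct.pack("<%df" % len(values), *values)


# A tiny 2x2 "image" standing in for a 28x28 bitmap.
image = [(0, 0, 0), (255, 255, 255), (255, 0, 0), (0, 0, 255)]
buffer = to_float32_bytes(normalize(to_grayscale(image)))
```

The resulting buffer holds 4 bytes per pixel, so a 28x28x1 MNIST input would need 28 * 28 * 4 = 3136 bytes.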
  31. Run inference
    • Load the model file located under the assets folder
    • Use the TensorFlow Lite interpreter to run inference on the input image
  32. Post processing
    • The output is an array of probabilities, each corresponding to a category
    • Find the category with the highest probability and output the result to the UI
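A minimal sketch of this post-processing step in plain Python, assuming the model output is already available as a list of ten probabilities. The function name and the output values are made up for illustration.

```python
def top_prediction(probabilities, labels):
    """Return the (label, probability) pair with the highest probability."""
    best_index = max(range(len(probabilities)), key=probabilities.__getitem__)
    return labels[best_index], probabilities[best_index]


digit_labels = [str(d) for d in range(10)]
# Made-up output values for illustration.
output = [0.01, 0.02, 0.05, 0.80, 0.02, 0.03, 0.02, 0.02, 0.02, 0.01]
label, confidence = top_prediction(output, digit_labels)
print(f"Predicted digit: {label} ({confidence:.0%})")
```

In an app, the label and confidence would then be shown in the UI, possibly only when the confidence is above some threshold.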
  33. Summary
    • Training with tf.Keras is easy
    • Model conversion to TFLite is easier
    • Android implementation is getting better:
      ◦ Validate the tflite model before deploying to Android
      ◦ Image pre-processing
      ◦ Input tensor shape?
      ◦ Color or grayscale?
      ◦ Post processing
    My blog post: E2E tf.Keras to TFLite to Android
  34. New TFLite features
    Announced at TensorFlow World:
    1. New TFLite support library
    2. Model metadata (not yet launched)
    3. Model repository pre-converted to tflite format (models with examples, and hosted models)
    4. Transfer learning made easy - model customization API
    5. Ready to use end-to-end tutorials and full example apps
    6. TFLite course on Udacity
  35. TFLite classification demo app
    Check out the classification demo app in the TensorFlow repo
  36. Inference with GPU
    • Face contour detection
    • Blog post: TensorFlow Lite Now Faster with Mobile GPUs
  37. PoseNet example
    • PoseNet model on Android
    • Camera live frames
    • Display key body parts in real time
    • Blog post: Track human poses in real-time on Android with TensorFlow Lite
  38. On-device ML training is finally here!
    • Train with ~20 images
    • Use transfer learning
    • Quantized MobileNetV2
    • Android device (5.0+)
  39. TFLite on microcontroller
    • Tiny models on tiny computers
    • Consumes much less power than CPUs - days on a coin battery
    • Tiny RAM and Flash available
    • Opens up voice interface to devs
    More info:
    • Doc - https://www.tensorflow.org/lite/guide/microcontroller
    • Code lab - https://g.co/codelabs/sparkfunTF
    • Purchase - https://www.sparkfun.com/products/15170
  40. Coral Edge TPU (beta) - hardware for on-device ML acceleration
    Link to codelab: https://codelabs.developers.google.com/codelabs/edgetpu-classifier/index.html#0
    • Dev board (+ camera module)
    • USB Accelerator (+ camera module + Raspberry Pi)
  41. Coral Edge TPU
    MobileNet SSD model running on TPU
    • Inference time: < ~20 ms
    • > ~60 fps
  42. Coral Edge TPU demo
    MobileNet SSD model running on CPU
    • Inference time: > ~390 ms
    • ~3 fps
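The relationship between the per-frame inference times and the frame rates quoted for the two Coral runs can be checked with a little arithmetic. This sketch assumes inference is the only per-frame cost (no camera, preprocessing or drawing overhead), so the numbers are upper bounds.

```python
def frames_per_second(inference_ms):
    """Upper bound on throughput when each frame takes inference_ms to process."""
    return 1000.0 / inference_ms


edge_tpu_fps = frames_per_second(20)   # frame rate at the quoted ~20 ms bound
cpu_fps = frames_per_second(390)       # frame rate at the quoted ~390 ms
speedup = 390 / 20                     # ratio of the two inference times
```

At exactly 20 ms the bound is 50 fps; reaching 60 fps requires about 16.7 ms per frame, which fits the slide's "< ~20 ms". At 390 ms the bound is roughly 2.6 fps, consistent with the "~3 fps" shown for the CPU run.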
  43. On-device ML trends
    • Why the future of machine learning is tiny - Pete Warden
    • Deploying to mobile and IoT will get much easier
    • TFLite will have many more features
    • Federated learning
    • On-device training
  44. Thank you!
    Follow me on Twitter, Medium or GitHub to learn more about deep learning, TensorFlow and on-device ML:
    • Twitter: @margaretmz
    • Medium: @margaretmz
    • GitHub: margaretmz