
On-Device Model Deployment-CV Study Jam

I gave a talk on how to deploy ML models to mobile and edge devices at a Computer Vision Study Jam organized by Seattle Data/Analytics/ML.

Margaret Maynard-Reid

October 12, 2019

Transcript

  1. @margaretmz Model deployment & production

    Computer Vision ML Study Jam @ Seattle DAML
    Margaret Maynard-Reid, 10/12/2019
  2. @margaretmz | #MachineLearning #GDE Topics

    • Intro to TF 2.0 & tf.Keras
    • On-device ML options
    • E2E tf.Keras to TFLite to Android
      ◦ train a model from scratch
      ◦ convert to TFLite
      ◦ deploy to mobile and IoT
    • TFLite models on microcontroller & Coral Edge TPU
  3. Examples of computer vision

    • Generative Adversarial Networks (GANs): generating new images
    • Image classification: is this a cat?
    • Object detection: drawing bounding boxes around the objects
    • Dance Like @ I/O: segmentation, pose, GPU on-device
    • Other examples: photo enhancement, style transfer, OCR, face keypoints
  4. TensorFlow model building APIs

    TensorFlow is a deep learning framework for both research & production.
    • Write TensorFlow code in C++, Python, Java, R, Go, Swift, JavaScript
    • Deploy to CPU, GPU, TPU, Mobile, Android Things, Raspberry Pi
    API stack: tf.* · tf.layers · tf.keras · Custom Estimator · Premade Estimator
    Annotations: ← Low level ← Mid level (moving to tf.keras in TF 2.0) ← High level ← Model in a box ← Distributed execution, tf serving
    TensorFlow 2.0 Beta just got announced! | My Notes on TensorFlow 2.0
  5. tf.Keras vs Keras

    No 1:1 mapping between tf.Keras and Keras. Transition to tf.Keras.
    • tf.keras: part of the TensorFlow core APIs
        import tensorflow as tf        # import TensorFlow
        from tensorflow import keras   # import tf.keras
    • Multi-backend Keras: 2.3 was just released and will be the last major release
      ◦ Backends: TensorFlow, Theano, CNTK...
    Protip: use tf.keras instead of Keras with TF as the backend.
  6. tf.Keras model building APIs

    • Sequential - the easiest way
    • Functional - more flexibility
    • Model subclassing - extend a Model class
    Learn more in Josh Gordon’s blog: What are Symbolic and Imperative APIs in TensorFlow 2.0?
  7. TensorFlow and ML learning resources

    • Tensorflow.org
    • Blog.tensorflow.org
    • Deep learning with Python by Francois Chollet
    • TensorFlow on YouTube
    • TensorFlow on Twitter: #AskTensorFlow #TensorFlowMeets
    • Collection of interactive ML examples (blogpost | website)
    • TensorFlow Dev Summit 2019
    • By Aurélien Géron
    Interested in learning about TensorFlow 2.0 and trying it out? Read My Notes on TensorFlow 2.0.
  8. Anaconda, TensorFlow & Keras

    Why use a virtual environment? Ease of upgrade/downgrade of TensorFlow.
    • Download Anaconda here
    • Create a new virtual environment
        $ conda create -n [my-env-name]
    • Activate the virtual environment you created
        $ conda activate [my-env-name]
    • Install TensorFlow beta
        $ pip install tensorflow==2.0.0-beta1
    My blog post: Anaconda, Jupyter Notebook, TensorFlow, Keras
  9. Google Colab

    What is Google Colab?
    • Jupyter Notebook
      ◦ stored on Google Drive
      ◦ running on Google’s VM in the cloud
    • Free GPU and TPU!
    • TensorFlow is already installed
    • Save and share from your Drive
    • Save directly to GitHub
    Launch Colab from colab.research.google.com/
    Check out these learning resources:
    • My blog on Colab
    • TF team’s blog on Colab
    • Laurence’s video: Build a deep neural network in 4 mins with TensorFlow in Colab
    • Paige’s video: How to take advantage of GPUs & TPUs for your ML project
    • Sam’s blog: Keras on TPUs in Colab
  10. TensorFlow for edge devices

    • 2015: TF open sourced
    • 2016: TF Mobile
    • 2017: TF Lite developer preview
    • 2018: ML Kit
    • 2019: TF Mobile deprecated, ML Kit improves, TF Lite exits dev preview
    More than just mobile apps:
    • Microcontrollers
    • Edge TPUs
  11. TensorFlow Lite

    • For deploying to edge devices
    • Works with Inception & MobileNet
    • May not support all operations
    • Supports
      ◦ Mobile: Android & iOS
      ◦ Android Things
      ◦ Raspberry Pi
      ◦ Microcontroller
      ◦ Edge TPU
  12. Optimization

    TFLite model optimization toolkit
    • Quantization - convert 32-bit floating point to fixed point (e.g. 8-bit int)
      ◦ Post-training quantization
      ◦ Quantization-aware training
    • Pruning - eliminating unnecessary values in the weight tensor
    Android:
    • GPU delegate
    • Android NNAPI
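To make the two techniques on this slide concrete, here is a minimal pure-Python sketch of the arithmetic behind them. It is not the TFLite toolkit's implementation: the symmetric per-tensor scheme, the magnitude-based pruning rule, and every name below (`quantize_symmetric`, `dequantize`, `prune_by_magnitude`, `WEIGHTS`) are assumptions made for illustration.

```python
def quantize_symmetric(values, num_bits=8):
    """Map floats to signed integers with one shared scale (symmetric scheme, an assumption)."""
    qmax = 2 ** (num_bits - 1) - 1            # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the representable range.
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the integers."""
    return [q * scale for q in quantized]


def prune_by_magnitude(values, sparsity):
    """Zero out the given fraction of weights with the smallest magnitude."""
    num_pruned = int(len(values) * sparsity)
    order = sorted(range(len(values)), key=lambda i: abs(values[i]))
    pruned_indices = set(order[:num_pruned])
    return [0.0 if i in pruned_indices else v for i, v in enumerate(values)]


# Hypothetical weight tensor, flattened to a list.
WEIGHTS = [-1.0, -0.25, 0.0, 0.1, 0.5, 1.0]

QUANTIZED, SCALE = quantize_symmetric(WEIGHTS)
RESTORED = dequantize(QUANTIZED, SCALE)
MAX_ERROR = max(abs(a - b) for a, b in zip(WEIGHTS, RESTORED))
PRUNED = prune_by_magnitude(WEIGHTS, sparsity=0.5)

print("int8 values:", QUANTIZED)
print("max round-trip error:", MAX_ERROR)
print("pruned weights:", PRUNED)
```

Each weight shrinks from 4 bytes to 1, at the cost of a round-trip error of at most half a quantization step; pruning instead keeps full precision but turns the smallest weights into zeros that compress well.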
  13. ML Kit

    Brings Google’s ML expertise to mobile developers in a powerful and easy-to-use package. Powered by TF Lite and hosted on Firebase.
    Base APIs:
    • Image labelling
    • OCR
    • Face detection
    • Barcode scanning
    • Landmark detection
    • Smart reply (coming soon)
    • Object detection & tracking
    • Translation (56 languages)
    • AutoML
    Custom models:
    • Dynamic model downloads
    • A/B testing (via Firebase Remote Config)
    • Model compression & conversion (from TensorFlow to TF Lite)
  14. Android ML with TensorFlow

    Your options:
    • With ML Kit
      ◦ (Out of the box) Base APIs
      ◦ Custom model
    • Direct deploy to Android
      ◦ Custom model
    Custom models:
    • Download pre-trained models
    • Retrain a model
    • Train your own from scratch
      ◦ data
      ◦ train
      ◦ convert
      ◦ inference
    Note: you can use AutoML to train, but there was no easy implementation on mobile until recently.
  15. End to end: model training to inference

    Data → Model (training) → SavedModel or Keras model → Serving (inference)
    Model:
    • tf.Keras (TensorFlow)
    • Python libraries: NumPy, Matplotlib etc.
    Serving:
    • Cloud
    • Web
    • Mobile
    • IoT
    • Microcontrollers
    • Edge TPU
  16. Data

    • Existing datasets
      ◦ Part of the deep learning framework:
        ▪ MNIST, CIFAR10, FASHION_MNIST, IMDB movie reviews etc.
      ◦ Open datasets:
        ▪ MNIST, MS-COCO, ImageNet, CelebA etc.
      ◦ Kaggle datasets: https://www.kaggle.com/datasets
      ◦ Google Dataset search tool: https://toolbox.google.com/datasetsearch
      ◦ TF 2.0: TFDS
    • Collect your own data
  17. Models

    Options for getting a model:
    • Download a pre-trained model (here): Inception-v3, MobileNet etc.
    • Transfer learning with a pre-trained model
      ◦ Feature extraction or fine-tuning on a pre-trained model
      ◦ TensorFlow Hub (https://www.tensorflow.org/hub/)
    • Train your own model from scratch (example in this talk)
  18. End to End tf.Keras to TFLite to Android

    Train a model from scratch
    Datasets → Train model → Convert to TFLite → Deploy for inference
  19. MNIST dataset

    • 60,000 training images and 10,000 test images
    • 28x28x1 grayscale images
    • 10 classes: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
    • Popular for computer vision
      ◦ “hello world” tutorial or
      ◦ benchmarking ML algorithms
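The shapes on this slide determine what a model's input and target look like. The sketch below shows one common way to prepare a single MNIST-style example in plain Python: scaling pixels to [0, 1], reshaping to 28x28x1, and one-hot encoding the label. The scaling choice, the one-hot target, and all names (`normalize_image`, `one_hot`, `FAKE_PIXELS`) are assumptions for illustration; the talk's notebook may prepare data differently, and the pixel values here are synthetic rather than real MNIST data.

```python
IMAGE_SIZE = 28    # MNIST images are 28x28
NUM_CLASSES = 10   # digits 0-9


def normalize_image(pixels):
    """Turn 784 flat pixel values (0-255) into a 28x28x1 nested list of floats in [0, 1]."""
    if len(pixels) != IMAGE_SIZE * IMAGE_SIZE:
        raise ValueError("expected 784 pixel values")
    return [
        [[pixels[row * IMAGE_SIZE + col] / 255.0] for col in range(IMAGE_SIZE)]
        for row in range(IMAGE_SIZE)
    ]


def one_hot(label):
    """Encode a digit label as a 10-element vector with a single 1.0."""
    if not 0 <= label < NUM_CLASSES:
        raise ValueError("label must be between 0 and 9")
    return [1.0 if index == label else 0.0 for index in range(NUM_CLASSES)]


# Synthetic stand-in for one image; real data would come from the dataset.
FAKE_PIXELS = [i % 256 for i in range(IMAGE_SIZE * IMAGE_SIZE)]

IMAGE = normalize_image(FAKE_PIXELS)
TARGET = one_hot(3)

print("image shape:", (len(IMAGE), len(IMAGE[0]), len(IMAGE[0][0])))
print("target:", TARGET)
```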
  20. Training the model in Colab

    Launch sample code on Colab → mnist_tfkeras_to_tflite.ipynb
    1. Import data
    2. Define model architecture
    3. Train the model
    4. Model saving & conversion
      ◦ Save a Keras model
      ◦ Convert to tflite format
  21. A typical CNN model architecture

    MNIST example:
    • Convolutional layer (definition)
    • Pooling layer (definition)
    • Dense (fully-connected) layer (definition)
    input → conv → pool → conv → pool → conv → pool → Dense → 0 1 2 3 4 5 6 7 8 9
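To see how a 28x28x1 image flows through the conv/pool stack drawn on this slide, the sketch below traces tensor shapes layer by layer in plain Python. The slide does not give layer hyperparameters, so the 3x3 kernels, "same" padding, 2x2 pooling, and the filter counts 32/64/128 are all assumptions, as are the function names.

```python
import math


def conv_output_size(size, kernel, stride=1, padding="same"):
    """Spatial output size of a convolution along one dimension."""
    if padding == "same":
        return math.ceil(size / stride)
    if padding == "valid":
        return (size - kernel) // stride + 1
    raise ValueError("padding must be 'same' or 'valid'")


def pool_output_size(size, pool=2):
    """Spatial output size of non-overlapping pooling along one dimension."""
    return size // pool


def trace_shapes(input_shape, conv_filters, kernel=3, pool=2, num_classes=10):
    """List (layer name, output shape) pairs for a conv/pool stack ending in a dense layer."""
    height, width, channels = input_shape
    shapes = [("input", (height, width, channels))]
    for filters in conv_filters:
        height = conv_output_size(height, kernel)
        width = conv_output_size(width, kernel)
        channels = filters
        shapes.append(("conv", (height, width, channels)))
        height = pool_output_size(height, pool)
        width = pool_output_size(width, pool)
        shapes.append(("pool", (height, width, channels)))
    shapes.append(("flatten", (height * width * channels,)))
    shapes.append(("dense", (num_classes,)))
    return shapes


# Three conv/pool pairs, as in the slide's diagram; filter counts are hypothetical.
SHAPES = trace_shapes((28, 28, 1), conv_filters=[32, 64, 128])
for name, shape in SHAPES:
    print(f"{name:8s}{shape}")
```

Under these assumptions the spatial size halves at each pooling step (28, 14, 7, 3) while the channel count grows, and the final dense layer maps the flattened features to the 10 digit classes.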
  22. Inspect the model - in Python code

    After defining the model architecture, use model.summary() to show it.
  23. Visualize model

    Use a visualization tool:
    • TensorBoard
    • Netron (https://github.com/lutzroeder/Netron)
    Drop the .tflite model into Netron and see the model visually.
  24. TensorFlow Lite Converter

    Convert a Keras model → a tflite model with the tflite converter. There are two options:
    1. Command line
    2. Python API
    Note:
    • You can convert from SavedModel as well.
    • GraphDef and tf.Session are no longer supported in 2.0 for TFLite conversion.
    Read details on the tflite converter in the TF documentation here.
  25. Tflite convert through command line

    To convert a tf.keras model to a tflite model:
        $ tflite_convert \
            --output_file=mymodel.tflite \
            --keras_model_file=mymodel.h5
  26. Tflite convert through Python code

    Note: the converter API differs between TF 1.13, 1.14, 2.0 Alpha & nightly.
        import tensorflow as tf

        # Create a converter; keras_model is the path to the saved .h5 file.
        # This is the TF 1.x form: tf.contrib was removed in TF 2.0, where the
        # equivalent is tf.lite.TFLiteConverter.from_keras_model(model).
        converter = tf.contrib.lite.TFLiteConverter.from_keras_model_file(keras_model)
        # Enable post-training quantization
        converter.post_training_quantize = True
        # Convert the model
        tflite_model = converter.convert()
        # Write the tflite model file
        tflite_model_name = "mymodel.tflite"
        with open(tflite_model_name, "wb") as f:
            f.write(tflite_model)
  27. Validate the tflite model

    Protip: validate the tflite model in Python after conversion by comparing the TensorFlow result with the TFLite result.
        import numpy as np
        import tensorflow as tf

        # Load TFLite model and allocate tensors.
        interpreter = tf.lite.Interpreter(model_path="converted_model.tflite")
        interpreter.allocate_tensors()
        # Get input and output tensors.
        input_details = interpreter.get_input_details()
        output_details = interpreter.get_output_details()
        # Test the TFLite model on random input data.
        input_shape = input_details[0]['shape']
        input_data = np.array(np.random.random_sample(input_shape), dtype=np.float32)
        interpreter.set_tensor(input_details[0]['index'], input_data)
        interpreter.invoke()
        tflite_results = interpreter.get_tensor(output_details[0]['index'])
        # Test the TensorFlow model (model is the trained Keras model) on the same input data.
        tf_results = model(tf.constant(input_data))
        # Compare the results.
        for tf_result, tflite_result in zip(tf_results, tflite_results):
            np.testing.assert_almost_equal(tf_result, tflite_result, decimal=5)
  28. Tflite on Android

    Android sample code DigitRecognizer, step by step:
    • Place the .tflite model under the assets folder
    • Update build.gradle dependencies
    • Input image - custom view, gallery or camera
    • Data preprocessing
    • Classify with the model
    • Post processing
    • Display result in UI
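The data preprocessing step is where the Android side most often diverges from training. The sketch below shows the logic in Python rather than Android code, to stay in the language used elsewhere in this deck. It assumes pixels arrive as packed 0xAARRGGBB integers (the layout Android bitmaps use), that the model expects a 28x28x1 float input in [0, 1], and that grayscale is a plain average of the three channels. The function names are hypothetical, and this is not the DigitRecognizer app's actual code.

```python
def argb_to_gray(pixel):
    """Unpack a 0xAARRGGBB pixel and return its gray level as a float in [0, 1]."""
    red = (pixel >> 16) & 0xFF
    green = (pixel >> 8) & 0xFF
    blue = pixel & 0xFF
    # Plain channel average (an assumption; weighted luminance is another option).
    return (red + green + blue) / 3.0 / 255.0


def preprocess(pixels, width=28, height=28):
    """Convert a flat list of packed ARGB pixels into a height x width x 1 nested list."""
    if len(pixels) != width * height:
        raise ValueError("pixel count does not match the expected input size")
    return [
        [[argb_to_gray(pixels[row * width + col])] for col in range(width)]
        for row in range(height)
    ]


WHITE = 0xFFFFFFFF
BLACK = 0xFF000000

# Synthetic image: top half white, bottom half black.
MODEL_INPUT = preprocess([WHITE] * 392 + [BLACK] * 392)

print("input shape:", (len(MODEL_INPUT), len(MODEL_INPUT[0]), len(MODEL_INPUT[0][0])))
print("top-left:", MODEL_INPUT[0][0], "bottom-right:", MODEL_INPUT[27][27])
```

Whatever the real app does, the point is that the shape, channel count, and value range fed to the interpreter must match what the model saw during training.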
  29. Post processing

    The output is an array of probabilities, each corresponding to a category. Find the category with the highest probability and output the result to the UI.
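The post-processing step amounts to an argmax over the output array. A minimal sketch, with hypothetical names (`top_category`, `DIGIT_LABELS`) and a made-up probability vector standing in for a real model output:

```python
def top_category(probabilities, labels):
    """Return the label with the highest probability, together with that probability."""
    if len(probabilities) != len(labels):
        raise ValueError("probabilities and labels must have the same length")
    best_index = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return labels[best_index], probabilities[best_index]


DIGIT_LABELS = [str(digit) for digit in range(10)]

# Made-up output for illustration; a real one comes from the interpreter.
EXAMPLE_OUTPUT = [0.01, 0.02, 0.05, 0.70, 0.02, 0.05, 0.05, 0.04, 0.03, 0.03]

LABEL, CONFIDENCE = top_category(EXAMPLE_OUTPUT, DIGIT_LABELS)
print(f"Predicted digit: {LABEL} ({CONFIDENCE:.0%})")  # Predicted digit: 3 (70%)
```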
  30. Summary

    • Training with tf.Keras is easy
    • Model conversion to TFLite is easier
    • Android implementation is still challenging & error-prone (hopefully this gets improved in the future!):
      ◦ Validate the tflite model before deploying to Android
      ◦ Image pre-processing
      ◦ Input tensor shape?
      ◦ Color or grayscale?
      ◦ Post processing
    My blog post: E2E tf.Keras to TFLite to Android
  31. TFLite demo app

    Check out the demo app in the TensorFlow repo.
    Clone the tensorflow project from GitHub:
        git clone https://www.github.com/tensorflow/tensorflow
    Then open the tflite Android demo from Android Studio:
        /tensorflow/lite/java/demo
  32. Inference with GPU

    • Face contour detection
    • Link to blog post: TensorFlow Lite Now Faster with Mobile GPUs (Developer Preview)
  33. PoseNet example

    • PoseNet model on Android
    • Camera live frames
    • Display key body parts in real time
    • Link to blog post: Track human poses in real-time on Android with TensorFlow Lite
  34. TFLite on microcontroller

    • Tiny models on tiny computers
    • Consumes much less power than CPUs - days on a coin battery
    • Tiny RAM and Flash available
    • Opens up voice interface to devs
    More info here:
    • Doc - https://www.tensorflow.org/lite/guide/microcontroller
    • Code lab - https://g.co/codelabs/sparkfunTF
    • Purchase - https://www.sparkfun.com/products/15170
  35. Coral Edge TPU (beta) - hardware for on-device ML acceleration

    • Dev board (+ camera module)
    • USB Accelerator (+ camera module + Raspberry Pi)
    Link to codelab: https://codelabs.developers.google.com/codelabs/edgetpu-classifier/index.html#0
  36. Coral Edge TPU

    MobileNet SSD model running on TPU
    • Inference time: < ~20 ms
    • Frame rate: > ~60 fps
  37. Coral Edge TPU demo

    MobileNet SSD model running on CPU
    • Inference time: > ~390 ms
    • Frame rate: ~3 fps
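The inference times and frame rates on the two Coral slides are related by simple arithmetic: if inference dominates each frame, fps is roughly 1000 divided by the latency in milliseconds. The sketch below works that out. The slides give bounds rather than exact measurements, so the 16 ms TPU figure is an assumed value under the ~20 ms bound and 390 ms is the CPU bound taken as-is; the names are hypothetical.

```python
def frames_per_second(latency_ms):
    """Upper bound on frame rate when each frame costs latency_ms of inference."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


TPU_LATENCY_MS = 16.0    # assumption: a value under the slide's ~20 ms bound
CPU_LATENCY_MS = 390.0   # the slide's CPU bound, taken as an estimate

TPU_FPS = frames_per_second(TPU_LATENCY_MS)
CPU_FPS = frames_per_second(CPU_LATENCY_MS)
SPEEDUP = CPU_LATENCY_MS / TPU_LATENCY_MS

print(f"TPU: {TPU_FPS:.1f} fps, CPU: {CPU_FPS:.1f} fps, speedup: {SPEEDUP:.1f}x")
```

Real frame rates also depend on camera capture, preprocessing, and rendering, so measured fps can sit below this bound.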
  38. Upcoming

    • Why the future of machine learning is tiny - Pete Warden
    • Deploying to mobile and IoT will get much easier
    • TFLite will have many more features
    • Federated learning
    • On-device training
  39. Thank you!

    Follow me on Twitter, Medium or GitHub to learn more about deep learning, TensorFlow and on-device ML: @margaretmz (Twitter), @margaretmz (Medium), margaretmz (GitHub)