On-device ML with TFLite - AI Nextcon

On-device ML with TensorFlow Lite
Margaret Maynard-Reid, 2/12/2020
@margaretmz

@margaretmz | #ML | #GDE
Topics
• Why on-device ML?
• On-device ML options
• E2E tf.Keras to TFLite to Android
  ◦ train a model from scratch
  ◦ convert to TFLite
  ◦ deploy to mobile and IoT
• TFLite on microcontroller & Coral Edge TPU

Intro: Why on-device ML?
• Access to more data
• Faster user interaction
• Preserve privacy
Unique constraints:
• Less compute power
• Limited memory
• Battery consumption

TensorFlow for mobile & edge devices
2015: TF open sourced
2016: TF Mobile
2017: TF Lite developer preview
2018: ML Kit
2019:
  - New ML Kit features
  - TF Mobile deprecated
  - New TFLite features!!!

TFLite on 3B+ devices!
Source: TensorFlow Lite team

Dance Like @ I/O 2019
Segmentation, pose, GPU on-device

TensorFlow Lite
• Converter - converts a model to the TFLite file format
• Interpreter - executes inference; optimized for small devices
• Ops/Kernel - limited set of ops
• Interface to hardware acceleration
  ◦ NN API
  ◦ Edge TPU

Optimization
1. Reduce model size
TFLite model optimization toolkit
• Quantization - convert 32-bit floating point to fixed point (e.g. 8-bit int)
  ◦ Post-training quantization
  ◦ Quantization-aware training
• Pruning - eliminating unnecessary values in the weight tensor
2. Speed up inference
On Android:
• GPU delegate
• Android NNAPI
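To make the quantization bullet concrete, here is a minimal pure-Python sketch of affine 8-bit quantization. It only illustrates the arithmetic; the helper names `quantize` and `dequantize` and the sample weights are made up for this example and are not the TFLite converter's API.

```python
# Hand-rolled illustration of affine (asymmetric) quantization to 8-bit ints.
# Assumption: this mirrors the general idea only, not TFLite's internal code.

def quantize(values, num_bits=8):
    """Map floats to ints in [0, 2**num_bits - 1]; return (ints, scale, zero_point)."""
    qmin, qmax = 0, 2 ** num_bits - 1
    # The represented range must contain 0.0 so that zero stays exact.
    lo, hi = min(min(values), 0.0), max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # fall back to 1.0 for all-zero input
    zero_point = round(qmin - lo / scale)
    quantized = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return quantized, scale, zero_point

def dequantize(quantized, scale, zero_point):
    """Recover approximate float values from the quantized ints."""
    return [scale * (q - zero_point) for q in quantized]

weights = [-1.0, -0.25, 0.0, 0.5, 1.0]  # hypothetical weight values
q, scale, zero_point = quantize(weights)
restored = dequantize(q, scale, zero_point)
```

Each weight now fits in one byte instead of four, at the cost of a small rounding error (bounded by roughly one quantization step, `scale`).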

On-device ML
What are your options?
MediaPipe

On-device ML Options
What / how: Native Android (iOS) apps
• Direct deploy to Android
• With ML Kit
• With MediaPipe
• Fritz.ai
Who: Android (or iOS) developers
What / how: React Native
Who: Web developers
What / how: TFLite / TF micro
Who: Embedded
Where: Microcontrollers, Edge TPUs

React Native Support
• Use TF.js ML directly inside React Native with WebGL acceleration
• Load models from the web, or compile them into your application

Google ML Kit
Base APIs (out of the box)
• Image labelling
• OCR
• Face detection
• Barcode scanning
• Landmark detection
• Smart reply
• Object detection & tracking
• Translation (56 languages)
• AutoML
Custom models
• Dynamic model downloads
• A/B testing (via Firebase Remote Config)
• Model conversion (from TensorFlow to TFLite)
Learn more about ML Kit: g.co/mlkit

Why use ML Kit?
[Diagram: on-device vision pipeline, including image validation, conversions (byte array, ByteBuffer/bitmap, RGB, grayscale), resize/rotate, calibration, frame scheduler (image timestamp), frame selection, detector (TF Lite model), classifier (TF Lite model), tracker, object manager, pipeline config, and output results]
Source: ML Kit team

Google ML Kit - AutoML
• Firebase console
• AutoML - train model
• Download TFLite
• Mobile & edge
https://firebase.google.com/docs/ml-kit/automl-image-labeling

MediaPipe
A cross-platform AI pipeline framework by Google Research:
• TensorFlow & TFLite
• Desktop, web, mobile, Coral Edge TPUs
• Fast & realtime
• GPU
• WebGL
Source: MediaPipe GitHub

Two talks on MediaPipe
@AI Nextcon 2/13 1PM
@Google Seattle 2/13 5PM
• Google MediaPipe @Seattle by Ming Yong

Fritz.ai
Mobile ML made easy...
• Supports Android & iOS
• Features: image labelling & segmentation, object detection, style transfer, pose estimation…
• Analytics, custom model hosting, perf monitoring…
• Free up to a certain usage level
Source: Embrace your new look with Fritz Hair Segmentation

End to End: Model training to inference with TensorFlow 2.0
Datasets → Train model → (Convert to TFLite) → Deploy for inference

End to end: model training to inference in TF 2.0
Data → Training → SavedModel or Keras model → Inference
Training (Model)
• tf.Keras (TensorFlow)
• Python libraries: NumPy, Matplotlib etc.
Inference (Serving)
• Cloud
• Web
• Mobile
• IoT
• Microcontrollers
• Edge TPU

Data
• Existing datasets
  ◦ Part of the deep learning framework:
    ▪ MNIST, CIFAR10, FASHION_MNIST, IMDB movie reviews etc.
  ◦ Open datasets:
    ▪ MNIST, MS-COCO, ImageNet, CelebA etc.
  ◦ Kaggle datasets: https://www.kaggle.com/datasets
  ◦ Google Dataset Search tool: https://toolbox.google.com/datasetsearch
  ◦ TF 2.0: TFDS (TensorFlow Datasets)
• Collect your own data

Models
Options for getting a model:
• Download a pre-trained model: Inception-v3, MobileNet etc.
• Transfer learning with a pre-trained model
  ◦ Feature extraction or fine-tuning on a pre-trained model
  ◦ TensorFlow Hub (https://www.tensorflow.org/hub/)
• Train your own model from scratch (example in this talk)

Model saving, conversion, deployment
• Model saving - SavedModel or Keras model
• Model conversion
  ◦ Convert the model to TFLite format
  ◦ Validate the converted model before deploying
• Deploy TFLite for inference

End to End: tf.Keras to TFLite to Android

MNIST dataset
• 60,000 training images and 10,000 test images
• 28x28x1 grayscale images
• 10 classes: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
• Popular for computer vision
  ◦ “hello world” tutorial or
  ◦ benchmarking ML algorithms

Training the model in Colab
Launch sample code on Colab → mnist_tfkeras_to_tflite.ipynb
1. Import data
2. Define model architecture
3. Train the model
4. Model saving & conversion
  ◦ Save a Keras model
  ◦ Convert to TFLite format

A typical CNN model architecture
MNIST example:
• Convolutional layer
• Pooling layer
• Dense (fully-connected) layer
[Diagram: input → conv → pool → conv → pool → conv → pool → Dense → output classes 0 1 2 3 4 5 6 7 8 9]
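As a rough illustration of how the spatial size shrinks through a conv/pool stack like the one in the diagram, the sketch below traces a 28x28 MNIST input through three conv + pool stages. The 3x3 kernel, stride 1, no padding, and 2x2 pooling are assumptions made for this example; the slide does not state the actual layer settings.

```python
# Shape arithmetic for a conv -> pool stack (all layer settings are assumed).

def conv_out(size, kernel=3, stride=1, padding=0):
    """Output width/height of a convolution along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1

def pool_out(size, window=2):
    """Output width/height of non-overlapping max pooling."""
    return size // window

size = 28  # MNIST images are 28x28
trace = [("input", size)]
for _ in range(3):  # three conv + pool stages, as in the diagram
    size = conv_out(size)
    trace.append(("conv", size))
    size = pool_out(size)
    trace.append(("pool", size))

for name, s in trace:
    print(f"{name:>5}: {s}x{s}")
```

With these assumed settings the feature map ends up 1x1 before the Dense layer, which then maps the remaining channels to the 10 digit classes.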

Inspect the model - in Python code
After defining the model architecture, call model.summary() to print the layers, output shapes, and parameter counts.

Visualize the model
Use a visualization tool:
• TensorBoard
• Netron (https://github.com/lutzroeder/Netron)
Drop the .tflite model into Netron to see the model visually.
Note: model metadata, a new TFLite tool (to be launched), will allow you to inspect the model and modify its metadata.

Model saving
When to save as SavedModel or a Keras model?
SavedModel:
• Share pre-trained models and model pieces on TensorFlow Hub
• When you don’t know the deploy target
Keras model:
• Train with tf.Keras and you know your deploy target
Note: In TensorFlow 2.0, tf.keras.Model.save() and tf.keras.models.save_model() default to the SavedModel format (not HDF5).

Model conversion (with TFLite converter)
Command line, SavedModel:
```shell
tflite_convert \
  --saved_model_dir=/tmp/my_saved_model \
  --output_file=/tmp/my_model.tflite
```
Command line, Keras model:
```shell
tflite_convert \
  --keras_model_file=/tmp/my_keras_model.h5 \
  --output_file=/tmp/my_model.tflite
```
Python code (recommended):
```python
import tensorflow as tf

# Create a converter from the trained tf.keras model (`model`).
# tf.contrib was removed in TF 2.0, so use tf.lite.TFLiteConverter.from_keras_model.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
# Enable post-training quantization (optional);
# this replaces the deprecated post_training_quantize flag.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
# Convert the model
tflite_model = converter.convert()
# Create the tflite model file
tflite_model_name = "my_model.tflite"
with open(tflite_model_name, "wb") as f:
    f.write(tflite_model)
```

Validate TFLite model after conversion
Protip: validate the TFLite model in Python after conversion. Run the same input through both models, then compare the TensorFlow result with the TFLite result.
```python
import numpy as np
import tensorflow as tf

# Load TFLite model and allocate tensors.
interpreter = tf.lite.Interpreter(model_path="converted_model.tflite")
interpreter.allocate_tensors()

# Get input and output tensors.
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Test the TFLite model on random input data.
input_shape = input_details[0]['shape']
input_data = np.array(np.random.random_sample(input_shape), dtype=np.float32)
interpreter.set_tensor(input_details[0]['index'], input_data)
interpreter.invoke()
tflite_results = interpreter.get_tensor(output_details[0]['index'])

# Test the TensorFlow model (`model`, the trained tf.keras model) on the same input.
tf_results = model(tf.constant(input_data))

# Compare the results.
for tf_result, tflite_result in zip(tf_results, tflite_results):
    np.testing.assert_almost_equal(tf_result, tflite_result, decimal=5)
```

TFLite on Android
Android sample code DigitRecognizer, step by step:
• Place the .tflite model under the assets folder
• Update build.gradle dependencies
• Input image - custom view, gallery or camera
• Data preprocessing
• Classify with the model
• Post processing
• Display result in UI

Dependencies
Update build.gradle to include TensorFlow Lite:
```groovy
android {
    // Make sure model doesn't get compressed when app is compiled
    aaptOptions {
        noCompress "tflite"
    }
}
dependencies {
    // ...
    // Add dependency for TensorFlow Lite
    // (`implementation` replaces the deprecated `compile` configuration)
    implementation 'org.tensorflow:tensorflow-lite:[version-number]'
}
```
Place the mnist.tflite model file under the /assets folder.

Input - image data
Input to the classifier is an image. Your options:
• Draw on canvas from a custom View
• Get an image from Gallery or a 3rd party camera
• Live frames from Camera2 API
Make sure the image dimensions (shape) match what your classifier expects:
• 28x28x1 - MNIST or FASHION_MNIST grayscale image
• 299x299x3 - Inception V3
• 224x224x3 - MobileNet

Image preprocessing
• Convert Bitmap to ByteBuffer
• Normalize pixel values to be within a certain range
• Convert from color to grayscale, if needed
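The preprocessing steps above can be sketched in plain Python. This is only an analogy to the Android code, not the sample app's implementation: the function names, the BT.601 grayscale weights, the [0, 1] normalization range, and the little-endian float32 layout are all assumptions for illustration (on Android the ByteBuffer would typically use the device's native byte order).

```python
import struct

def to_grayscale(rgb_pixels):
    """Luminance-weighted grayscale (BT.601 weights) for a list of (r, g, b) tuples."""
    return [0.299 * r + 0.587 * g + 0.114 * b for r, g, b in rgb_pixels]

def normalize(gray_pixels):
    """Scale 0..255 pixel values into the 0.0..1.0 range."""
    return [p / 255.0 for p in gray_pixels]

def to_float32_bytes(values):
    """Pack floats as little-endian float32, similar in spirit to filling a ByteBuffer."""
    return struct.pack("<%df" % len(values), *values)

WIDTH = HEIGHT = 28  # MNIST input size
image = [(255, 255, 255)] * (WIDTH * HEIGHT)  # hypothetical all-white image
buffer = to_float32_bytes(normalize(to_grayscale(image)))
print(len(buffer))  # 4 bytes per pixel
```

The normalization range has to match whatever the model saw during training, so treat the [0, 1] choice here as a placeholder.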

Run inference
• Load the model file located under the assets folder
• Use the TensorFlow Lite interpreter to run inference on the input image

Post processing
The output is an array of probabilities, each corresponding to a category.
Find the category with the highest probability and output the result to the UI.
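A minimal sketch of this post-processing step, written in Python for brevity rather than as the Android implementation. The helper name `top_prediction` and the probability values are made up for illustration.

```python
def top_prediction(probabilities, labels):
    """Return the (label, probability) pair with the highest probability."""
    best_index = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return labels[best_index], probabilities[best_index]

labels = [str(digit) for digit in range(10)]  # MNIST classes 0-9
# Hypothetical model output: one probability per digit class.
probabilities = [0.01, 0.02, 0.05, 0.70, 0.02, 0.05, 0.05, 0.04, 0.03, 0.03]

label, confidence = top_prediction(probabilities, labels)
print(f"Predicted digit: {label} (confidence {confidence:.2f})")
```

In an app it is often worth also checking the confidence against a threshold before showing a result, so that low-confidence guesses are not presented as certain.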

Summary
• Training with tf.Keras is easy
• Model conversion to TFLite is easier
• Android implementation is getting better:
  ◦ Validate the TFLite model before deploying to Android
  ◦ Image pre-processing
  ◦ Input tensor shape?
  ◦ Color or grayscale?
  ◦ Post processing
My blog post: E2E tf.Keras to TFLite to Android

New TFLite features
Announced at TensorFlow World:
1. New TFLite support library
2. Model metadata (not yet launched)
3. Model repository pre-converted to TFLite format (models with examples, hosted models)
4. Transfer learning made easy - model customization API
5. Ready-to-use end-to-end tutorials and full example apps
6. TFLite course on Udacity

TFLite classification demo app
Check out the classification demo app in the TensorFlow repo.

Inference with GPU
• Face contour detection
• Blog post: TensorFlow Lite Now Faster with Mobile GPUs

PoseNet example
• PoseNet model on Android
• Camera live frames
• Display key body parts in real time
• Blog post: Track human poses in real-time on Android with TensorFlow Lite

More TFLite examples

On-device ML training is finally here!
• Train with ~20 images
• Use transfer learning
• Quantized MobileNetV2
• Android device (5.0+)

TFLite on microcontroller
• Tiny models on tiny computers
• Consumes much less power than CPUs - days on a coin battery
• Tiny RAM and Flash available
• Opens up voice interface to devs
More info:
• Doc - https://www.tensorflow.org/lite/guide/microcontroller
• Code lab - https://g.co/codelabs/sparkfunTF
• Purchase - https://www.sparkfun.com/products/15170

Coral Edge TPU (beta) - hardware for on-device ML acceleration
Link to codelab: https://codelabs.developers.google.com/codelabs/edgetpu-classifier/index.html#0
• Dev board (+ camera module)
• USB Accelerator (+ camera module + Raspberry Pi)

Coral Edge TPU
MobileNet SSD model running on TPU
Inference time: < ~20 ms (> ~60 fps)

Coral Edge TPU demo
MobileNet SSD model running on CPU
Inference time: > ~390 ms (~3 fps)

On-device ML trends
• “Why the Future of Machine Learning Is Tiny” - Pete Warden
• Deploying to mobile and IoT will get much easier
• TFLite will have many more features
• Federated learning
• On-device training

Awesome TFLite
bit.ly/awesome-tflite - please star ⭐ the repo if you find it useful!

Thank you!
Follow me on Twitter, Medium or GitHub to learn more about deep learning, TensorFlow and on-device ML.
Twitter: @margaretmz
Medium: @margaretmz
GitHub: margaretmz
