On-device ML with TFLite - AI Nextcon

Slide 1

Slide 1 text

On-device ML with TFLite. Margaret Maynard-Reid, 2/12/2020, @margaretmz

Slide 2

Slide 2 text

@margaretmz | #ML | #GDE Topics ● Why on-device ML? ● On-device ML options ● E2E tf.Keras to TFLite to Android ○ train a model from scratch ○ convert to TFLite ○ deploy to mobile and IoT ● TFLite on microcontroller & Coral Edge TPU

Slide 3

Slide 3 text

Intro: Why on-device ML? ● Access to more data ● Faster user interaction ● Preserve privacy Unique constraints: ● Less compute power ● Limited memory ● Battery consumption

Slide 4

Slide 4 text

TensorFlow for mobile & edge devices ● 2015: TF open sourced ● 2016: TF Mobile ● 2017: TF Lite developer preview ● 2018: ML Kit ● 2019: new ML Kit features, TF Mobile deprecated, new TFLite features

Slide 5

Slide 5 text

TFLite on 3B+ devices! Source: TensorFlow Lite team

Slide 6

Slide 6 text

Dance Like @ I/O 2019: segmentation, pose, GPU on-device

Slide 7

Slide 7 text

TensorFlow Lite ● Converter - converts to the TFLite file format ● Interpreter - executes inference, optimized for small devices ● Ops/Kernel - limited ops ● Interface to hardware acceleration ○ NN API ○ Edge TPU

Slide 8

Slide 8 text

Optimization 1. Reduce model size with the TFLite model optimization toolkit ● Quantization - convert 32-bit floating point to fixed point (e.g. 8-bit int) ○ Post-training quantization ○ Quantization-aware training ● Pruning - eliminating unnecessary values in the weight tensor 2. Speed up inference on Android: ● GPU delegate ● Android NNAPI
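To make the quantization bullet concrete, here is a minimal pure-Python sketch of affine 8-bit quantization, where a real value is approximated as scale * (q - zero_point). It only illustrates the idea and is not the TFLite converter's implementation; the function names and the example weights are made up for this sketch.

```python
def quantize_params(values, num_bits=8):
    """Return (scale, zero_point) mapping the float range of `values` to signed ints."""
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    # Include 0.0 in the range so that zero is exactly representable.
    lo, hi = min(min(values), 0.0), max(max(values), 0.0)
    if hi == lo:
        return 1.0, 0
    scale = (hi - lo) / (qmax - qmin)
    zero_point = int(round(qmin - lo / scale))
    return scale, zero_point


def quantize(values, scale, zero_point, num_bits=8):
    """Map floats to integers, clamping to the representable range."""
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    return [max(qmin, min(qmax, int(round(v / scale)) + zero_point)) for v in values]


def dequantize(q_values, scale, zero_point):
    """Recover approximate floats: real = scale * (q - zero_point)."""
    return [scale * (q - zero_point) for q in q_values]


weights = [-1.0, -0.25, 0.0, 0.5, 1.5]  # made-up example weights
scale, zero_point = quantize_params(weights)
q = quantize(weights, scale, zero_point)
restored = dequantize(q, scale, zero_point)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, round(max_error, 5))
```

Each weight now needs 1 byte instead of 4, at the cost of a rounding error of at most about half a quantization step (scale / 2) per value.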

Slide 9

Slide 9 text

On-device ML: What are your options? MediaPipe

Slide 10

Slide 10 text

On-device ML Options What / how Who Where Native Android (iOS) apps ● Direct deploy to Android ● With ML Kit ● With MediaPipe ● Fritz.ai Android (or iOS) developers React Native Web developers TFLite / TF micro Embedded Microcontrollers Edge TPUs

Slide 11

Slide 11 text

React Native Support ● Use TF.js ML directly inside React Native with WebGL acceleration ● Load models from the web, or compile into your application

Slide 12

Slide 12 text

Google ML Kit Base APIs (out of the box): ● Image labelling ● OCR ● Face detection ● Barcode scanning ● Landmark detection ● Smart reply ● Object detection & tracking ● Translation (56 languages) ● AutoML Custom models: ● Dynamic model downloads ● A/B testing (via Firebase Remote Config) ● Model conversion (from TensorFlow to TFLite) Learn more about ML Kit: g.co/mlkit

Slide 13

Slide 13 text

Why use ML Kit? [Diagram of an on-device vision pipeline: image validation, format conversion (byte array, ByteBuffer/bitmap, grayscale, RGB), resize/rotate, frame scheduler (image timestamp), frame selection, calibration, detector (TF Lite model), classifier (TF Lite model), tracker, object manager, pipeline config, output results; split across Java and native code.] Source: ML Kit team

Slide 14

Slide 14 text

Google ML Kit - AutoML ● Firebase console ● AutoML - train model ● Download TFLite ● Mobile & edge https://firebase.google.com/docs/ml-kit/automl-image-labeling

Slide 15

Slide 15 text

MediaPipe: a cross-platform AI pipeline framework by Google Research ● TensorFlow & TFLite ● Desktop, web, mobile, Coral Edge TPUs ● Fast & realtime ● GPU ● WebGL Source: MediaPipe GitHub

Slide 16

Slide 16 text

Two talks on MediaPipe: @AI Nextcon 2/13 1PM, @Google Seattle 2/13 5PM ● Google MediaPipe @Seattle by Ming Yong

Slide 17

Slide 17 text

Fritz.ai: Mobile ML made easy... ● Supports Android & iOS ● Features: image labelling & segmentation, object detection, style transfer, pose estimation… ● Analytics, custom model hosting, perf monitoring… ● Free up to a certain usage level Source: Embrace your new look with Fritz Hair Segmentation

Slide 18

Slide 18 text

End to end: model training to inference with TensorFlow 2.0. Datasets → Train model → (Convert to TFLite) → Deploy for inference

Slide 19

Slide 19 text

End to end: model training to inference in TF 2.0 ● Data ● Training: model built with tf.Keras (TensorFlow) and Python libraries (NumPy, Matplotlib etc.) ● SavedModel or Keras model ● Inference / serving: cloud, web, mobile, IoT, microcontrollers, Edge TPU

Slide 20

Slide 20 text

Data ● Existing datasets ○ Part of the deep learning framework: ■ MNIST, CIFAR10, FASHION_MNIST, IMDB movie reviews etc. ○ Open datasets: ■ MNIST, MS-COCO, ImageNet, CelebA etc. ○ Kaggle datasets: https://www.kaggle.com/datasets ○ Google Dataset Search tool: https://toolbox.google.com/datasetsearch ○ TF 2.0: TFDS ● Collect your own data

Slide 21

Slide 21 text

Models. Options for getting a model: ● Download a pre-trained model: Inception-v3, MobileNet etc. ● Transfer learning with a pre-trained model ○ Feature extraction or fine tuning on a pre-trained model ○ TensorFlow Hub (https://www.tensorflow.org/hub/) ● Train your own model from scratch (example in this talk)

Slide 22

Slide 22 text

Model saving, conversion, deployment ● Model saving - SavedModel or Keras model ● Model conversion ○ Convert the model to .tflite format ○ Validate the converted model before deploying ● Deploy TFLite for inference

Slide 23

Slide 23 text

End to End: tf.Keras to TFLite to Android

Slide 24

Slide 24 text

MNIST dataset ● 60,000 training images and 10,000 test images ● 28x28x1 grayscale images ● 10 classes: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 ● Popular for computer vision ○ “hello world” tutorial or ○ benchmarking ML algorithms

Slide 25

Slide 25 text

Training the model in Colab. Launch sample code on Colab → mnist_tfkeras_to_tflite.ipynb 1. Import data 2. Define model architecture 3. Train the model 4. Model saving & conversion ○ Save a Keras model ○ Convert to tflite format

Slide 26

Slide 26 text

A typical CNN model architecture. MNIST example: ● Convolutional layer ● Pooling layer ● Dense (fully-connected) layer [Diagram: input → conv → pool → conv → pool → conv → pool → Dense → digit classes 0 to 9]
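As a rough illustration of how shapes flow through such a conv/pool/dense stack, the sketch below traces a 28x28x1 MNIST image through three conv + pool pairs and a 10-way dense layer, counting parameters along the way. The filter counts (32, 64, 64), the 3x3 kernels without padding and the 2x2 pooling are assumptions chosen for this example, not the architecture used in the talk's notebook.

```python
def conv2d(shape, filters, kernel=3):
    """Unpadded ('valid') convolution with stride 1: (output_shape, param_count)."""
    h, w, c = shape
    out = (h - kernel + 1, w - kernel + 1, filters)
    params = kernel * kernel * c * filters + filters  # weights + biases
    return out, params


def max_pool(shape, size=2):
    """Non-overlapping max pooling: halves height and width, has no parameters."""
    h, w, c = shape
    return (h // size, w // size, c), 0


def dense(shape, units):
    """Flatten the input, then apply a fully-connected layer."""
    inputs = 1
    for dim in shape:
        inputs *= dim
    return (units,), inputs * units + units  # weights + biases


layers = [
    ("conv 32", lambda s: conv2d(s, 32)),
    ("pool", max_pool),
    ("conv 64", lambda s: conv2d(s, 64)),
    ("pool", max_pool),
    ("conv 64", lambda s: conv2d(s, 64)),
    ("pool", max_pool),
    ("dense 10", lambda s: dense(s, 10)),
]

shape = (28, 28, 1)  # one MNIST image
total_params = 0
for name, layer in layers:
    shape, params = layer(shape)
    total_params += params
    print(f"{name:10s} -> {shape}, params: {params}")

print("total params:", total_params)
print("approx. float32 size in bytes:", total_params * 4)
```

A Keras model summary reports the same kind of per-layer output shapes and parameter counts, so a hand calculation like this is a quick sanity check on a model definition.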

Slide 27

Slide 27 text

Inspect the model in Python code: after defining the model architecture, call model.summary() to print the layers, output shapes and parameter counts

Slide 28

Slide 28 text

Visualize model. Use a visualization tool: ● TensorBoard ● Netron (https://github.com/lutzroeder/Netron) Drop the .tflite model into Netron and see the model visually. Note on model metadata: a new TFLite tool (to be launched) will allow you to inspect the model & modify the metadata

Slide 29

Slide 29 text

Model saving. When to save as SavedModel or as a Keras model? ● SavedModel: to share pre-trained models and model pieces on TensorFlow Hub, or when you don’t know the deploy target ● Keras model: when you train with tf.Keras and you know your deploy target Note: In TensorFlow 2.0, tf.keras.Model.save() and tf.keras.models.save_model() default to the SavedModel format (not HDF5).

Slide 30

Slide 30 text

Model conversion (with TFLite converter)

Command line, from a SavedModel:

    tflite_convert \
      --saved_model_dir=/tmp/my_saved_model \
      --output_file=/tmp/my_model.tflite

Command line, from a Keras model:

    tflite_convert \
      --keras_model_file=/tmp/my_keras_model.h5 \
      --output_file=/tmp/my_model.tflite

Python code (recommended):

    import tensorflow as tf

    # Create a converter from an in-memory tf.keras model (TF 2.x).
    # In TF 1.x, use tf.lite.TFLiteConverter.from_keras_model_file(path_to_h5) instead.
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    # Enable post-training quantization (optional). This replaces the
    # deprecated converter.post_training_quantize = True.
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    # Convert the model
    tflite_model = converter.convert()
    # Write the TFLite model file
    tflite_model_name = "my_model.tflite"
    with open(tflite_model_name, "wb") as f:
        f.write(tflite_model)

Slide 31

Slide 31 text

Validate TFLite model after conversion
Protip: validate the tflite model in Python after conversion by running the same input through the TensorFlow model and the TFLite model, then comparing the results.

    import numpy as np
    import tensorflow as tf

    # Load TFLite model and allocate tensors.
    interpreter = tf.lite.Interpreter(model_path="converted_model.tflite")
    interpreter.allocate_tensors()
    # Get input and output tensors.
    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()
    # Test the TFLite model on random input data.
    input_shape = input_details[0]['shape']
    input_data = np.array(np.random.random_sample(input_shape), dtype=np.float32)
    interpreter.set_tensor(input_details[0]['index'], input_data)
    interpreter.invoke()
    tflite_results = interpreter.get_tensor(output_details[0]['index'])
    # Test the TensorFlow model (the trained tf.keras model) on the same input data.
    tf_results = model(tf.constant(input_data))
    # Compare the results.
    for tf_result, tflite_result in zip(tf_results, tflite_results):
        np.testing.assert_almost_equal(tf_result, tflite_result, decimal=5)
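A note on the comparison step: with decimal=5, NumPy's assert_almost_equal requires the absolute difference to be below 1.5e-5, which a float TFLite model can usually meet but a quantized one generally cannot. A minimal sketch of a looser check, assuming made-up output vectors and hypothetical helper names, is to also verify that both models agree on the top class:

```python
def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two output vectors."""
    return max(abs(x - y) for x, y in zip(a, b))


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


def same_top_class(a, b):
    """True when both outputs pick the same class."""
    return argmax(a) == argmax(b)


TOLERANCE = 1.5e-5  # threshold used by assert_almost_equal with decimal=5

# Made-up outputs for illustration only.
tf_out = [0.10, 0.70, 0.20]
tflite_float_out = [0.100001, 0.699999, 0.200000]
tflite_quant_out = [0.11, 0.68, 0.21]

for name, out in [("float", tflite_float_out), ("quantized", tflite_quant_out)]:
    diff = max_abs_diff(tf_out, out)
    print(name, "within tolerance:", diff < TOLERANCE,
          "| same top class:", same_top_class(tf_out, out))
```

Which tolerance is acceptable depends on the application; for a classifier, agreement on the predicted class over a real test set is often the more meaningful check.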

Slide 32

Slide 32 text

TFLite on Android. Android sample code DigitRecognizer, step by step: ● Place the .tflite model under the assets folder ● Update build.gradle dependencies ● Input image - custom view, gallery or camera ● Data preprocessing ● Classify with the model ● Post processing ● Display result in UI

Slide 33

Slide 33 text

Dependencies
Update build.gradle to include TensorFlow Lite:

    android {
        // Make sure the model doesn't get compressed when the app is built
        aaptOptions {
            noCompress "tflite"
        }
    }
    dependencies {
        ….
        // Add dependency for TensorFlow Lite
        // ("implementation" replaces the deprecated "compile" configuration)
        implementation 'org.tensorflow:tensorflow-lite:[version-number]'
    }

Place the mnist.tflite model file under the /assets folder

Slide 34

Slide 34 text

Input - image data. Input to the classifier is an image; your options: ● Draw on canvas from a custom View ● Get an image from Gallery or a 3rd party camera ● Live frames from the Camera2 API Make sure the image dimensions (shape) match what your classifier expects ● 28x28x1 - MNIST or FASHION_MNIST grayscale image ● 299x299x3 - Inception V3 ● 224x224x3 - MobileNet (default input size)

Slide 35

Slide 35 text

Image preprocessing ● Convert Bitmap to ByteBuffer ● Normalize pixel values to a certain range (e.g. 0 to 1) ● Convert from color to grayscale, if needed
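The preprocessing steps can be sketched in Python to show the data layout the model ends up receiving; on Android the same logic would be written against Bitmap and ByteBuffer. This is a minimal sketch under stated assumptions: the helper names are hypothetical, the grayscale conversion uses the common BT.601 luminance weights, pixels are scaled to 0..1, and the buffer is little-endian float32.

```python
import struct


def to_grayscale(rgb_pixels):
    """Convert (r, g, b) tuples in 0..255 to luminance values in 0..255."""
    return [0.299 * r + 0.587 * g + 0.114 * b for r, g, b in rgb_pixels]


def normalize(pixels, max_value=255.0):
    """Scale pixel values from 0..max_value to 0..1."""
    return [p / max_value for p in pixels]


def to_float32_buffer(values):
    """Pack floats as little-endian float32 bytes (4 bytes per value)."""
    return struct.pack("<%df" % len(values), *values)


WIDTH = HEIGHT = 28  # MNIST input size

# Made-up image: all white except one black pixel in the top-left corner.
image = [(255, 255, 255)] * (WIDTH * HEIGHT)
image[0] = (0, 0, 0)

gray = to_grayscale(image)
buffer = to_float32_buffer(normalize(gray))
print(len(buffer), "bytes for a", WIDTH, "x", HEIGHT, "x 1 float32 input")
```

The buffer size (28 * 28 * 1 * 4 bytes) has to match the model's input tensor exactly, and the normalization range must be the same one used when the model was trained.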

Slide 36

Slide 36 text

Run inference ● Load the model file located under the assets folder ● Use the TensorFlow Lite interpreter to run inference on the input image

Slide 37

Slide 37 text

Post processing ● The output is an array of probabilities, each corresponding to a category ● Find the category with the highest probability and output the result to the UI
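Finding the most likely category is an argmax over the output array. A minimal Python sketch, using a made-up probability vector for the ten MNIST digits and a hypothetical helper name:

```python
def top_prediction(probabilities, labels):
    """Return (label, probability) for the most likely category."""
    if len(probabilities) != len(labels):
        raise ValueError("expected one probability per label")
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return labels[best], probabilities[best]


labels = [str(digit) for digit in range(10)]
# Made-up model output for illustration; values sum to 1.
output = [0.01, 0.02, 0.05, 0.70, 0.02, 0.05, 0.05, 0.04, 0.03, 0.03]

label, confidence = top_prediction(output, labels)
print(f"Predicted digit: {label} ({confidence:.0%})")
```

An app may also want to show nothing, or a "not sure" message, when the highest probability falls below some threshold.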

Slide 38

Slide 38 text

Summary ● Training with tf.Keras is easy ● Model conversion to TFLite is easier ● Android implementation is getting better: ○ Validate the tflite model before deploying to Android ○ Image pre-processing ○ Input tensor shape? ○ Color or grayscale? ○ Post processing My blog post: E2E tf.Keras to TFLite to Android

Slide 39

Slide 39 text

New TFLite features announced at TensorFlow World: 1. New TFLite support library 2. Model metadata (not yet launched) 3. Model repository pre-converted to tflite format (models with examples, hosted models) 4. Transfer learning made easy - model customization API 5. Ready to use end-to-end tutorials and full example apps 6. TFLite course on Udacity

Slide 40

Slide 40 text

TFLite classification demo app. Check out the classification demo app in the TensorFlow repo

Slide 41

Slide 41 text

Inference with GPU ● Face contour detection ● Blog post: TensorFlow Lite Now Faster with Mobile GPUs

Slide 42

Slide 42 text

PoseNet example ● PoseNet model on Android ● Camera live frames ● Display key body parts in real time ● Blog post: Track human poses in real-time on Android with TensorFlow Lite

Slide 43

Slide 43 text

More TFLite examples

Slide 44

Slide 44 text

On-device ML training is finally here! ● Train with ~20 images ● Use transfer learning ● Quantized MobileNetV2 ● Android device (5.0+)

Slide 45

Slide 45 text

TFLite on microcontroller ● Tiny models on tiny computers ● Consumes much less power than CPUs - days on a coin battery ● Tiny RAM and Flash available ● Opens up voice interface to devs More info: ● Doc - https://www.tensorflow.org/lite/guide/microcontroller ● Codelab - https://g.co/codelabs/sparkfunTF ● Purchase - https://www.sparkfun.com/products/15170

Slide 46

Slide 46 text

Coral Edge TPU (beta) - hardware for on-device ML acceleration ● Dev board (+ camera module) ● USB Accelerator (+ camera module + Raspberry Pi) Link to codelab: https://codelabs.developers.google.com/codelabs/edgetpu-classifier/index.html#0

Slide 47

Slide 47 text

Coral Edge TPU demo: MobileNet SSD model running on TPU. Inference time: < ~20 ms, > ~60 fps

Slide 48

Slide 48 text

Coral Edge TPU demo: MobileNet SSD model running on CPU. Inference time: > ~390 ms, ~3 fps
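The two demo slides relate inference latency to frame rate. As a rough sketch, assuming inference is the only per-frame cost (camera capture, preprocessing and drawing are ignored), frame rate and latency are reciprocals: fps = 1000 / latency_ms. The inputs below are the approximate figures quoted on the slides, not new measurements.

```python
def fps_from_latency(latency_ms):
    """Frames per second if each frame takes latency_ms of inference."""
    return 1000.0 / latency_ms


def latency_from_fps(fps):
    """Per-frame time budget in milliseconds for a target frame rate."""
    return 1000.0 / fps


cpu_latency_ms = 390.0  # approximate CPU figure from the slide
tpu_latency_ms = 20.0   # approximate upper bound for the Edge TPU from the slide

cpu_fps = fps_from_latency(cpu_latency_ms)
tpu_fps = fps_from_latency(tpu_latency_ms)
budget_60fps_ms = latency_from_fps(60)
speedup = cpu_latency_ms / tpu_latency_ms

print(f"CPU: {cpu_fps:.1f} fps, TPU at 20 ms: {tpu_fps:.0f} fps")
print(f"60 fps needs about {budget_60fps_ms:.1f} ms per frame")
print(f"Speedup at these figures: {speedup:.1f}x")
```

Read this way, a 20 ms inference corresponds to 50 fps, and reaching 60 fps requires staying under roughly 16.7 ms per frame, which is consistent with the slide quoting the TPU latency only as an upper bound.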

Slide 49

Slide 49 text

On-device ML trends ● "Why the Future of Machine Learning is Tiny" - Pete Warden ● Deploying to mobile and IoT will get much easier ● TFLite will have many more features ● Federated learning ● On-device training

Slide 50

Slide 50 text

Awesome TFLite: bit.ly/awesome-tflite - please star ⭐ the repo if you find it useful!

Slide 51

Slide 51 text

Thank you! Follow me on Twitter, Medium or GitHub to learn more about deep learning, TensorFlow and on-device ML. Twitter: @margaretmz, Medium: @margaretmz, GitHub: margaretmz