On-device ML with TFLite - DevFest Vancouver

Slide 1

Slide 1 text

On-device ML with TensorFlow Lite
DevFest Vancouver
Margaret Maynard-Reid, 9/7/2019
@margaretmz

Slide 2

Slide 2 text

@margaretmz | #MachineLearning #GDE
Slides
Slides for this talk are posted on Speaker Deck: bit.ly/on-device-ml-tflite-devfest-vancouver
Click "Download PDF" to access the links.

Slide 3

Slide 3 text

Topics
● Intro to TF 2.0 & tf.Keras
● On-device ML options
● E2E tf.Keras to TFLite to Android
  ○ train a model from scratch
  ○ convert to TFLite
  ○ deploy to mobile and IoT
● TFLite models on microcontroller & Coral Edge TPU

Slide 4

Slide 4 text

Intro
AI, ML, Deep Learning
Computer Vision
TensorFlow, Keras

Slide 5

Slide 5 text

AI vs. ML vs. Deep Learning
Artificial Intelligence
Machine Learning
Deep Learning:
- Computer Vision
- NLP
- ...

Slide 6

Slide 6 text

Examples of computer vision
● Generative Adversarial Networks (GANs): Generating new images
● Image classification: Is this a cat?
● Object detection: Drawing bounding boxes around the objects
● Dance Like @I/O: Segmentation, pose, GPU on-device
Other examples:
- Photo enhancement
- Style transfer
- OCR
- Face keypoints

Slide 7

Slide 7 text

Deep Learning - getting started
● Deep learning frameworks:
  ○ TensorFlow (>129k stars on GitHub) ← most popular!
  ○ PyTorch
  ○ Caffe (1 & 2)
  ○ Theano…
● Languages: Python, Swift, JavaScript etc.
● IDE - Colab
● Popular neural networks:
  ○ CNN (Convolutional Neural Networks)
  ○ RNN (Recurrent Neural Networks)
  ○ Generative Models (Autoencoder, GANs)
  ○ ...

Slide 8

Slide 8 text

TensorFlow model building APIs
TensorFlow is a deep learning framework for both research & production
Write TensorFlow code in C++, Python, Java, R, Go, Swift, JavaScript
Deploy to CPU, GPU, TPU, Mobile, Android Things, Raspberry Pi
API stack (diagram): tf.*, tf.layers, tf.keras, Custom Estimator, Premade Estimator
Diagram labels, in slide order: ← Low level; ← Mid level (moving to tf.keras in TF 2.0); ← High level; ← Model in a box; ← Distributed execution, tf serving
TensorFlow 2.0 Beta just got announced! | My Notes on TensorFlow 2.0

Slide 9

Slide 9 text

tf.Keras vs Keras
No 1:1 mapping between tf.Keras and Keras
tf.keras - part of the TensorFlow core APIs
    import tensorflow as tf       # import TensorFlow
    from tensorflow import keras  # import Keras
Keras remains an independent open-source project, with backends:
● TensorFlow (Protip: use tf.keras, instead of Keras + TF as backend)
● Theano
● CNTK...

Slide 10

Slide 10 text

tf.Keras model building APIs
● Sequential - the easiest way
● Functional - more flexibility
● Model subclassing - extend a Model class
Learn more in Josh Gordon’s blog: What are Symbolic and Imperative APIs in TensorFlow 2.0?

Slide 11

Slide 11 text

TensorFlow and ML learning resources
● Tensorflow.org
● Blog.tensorflow.org
● Deep Learning with Python by Francois Chollet
● TensorFlow on YouTube
● TensorFlow on Twitter #AskTensorFlow #TensorFlowMeets
● Collection of interactive ML examples (blogpost | website)
● TensorFlow Dev Summit 2019
● By Aurélien Géron
Interested in learning about TensorFlow 2.0 and trying it out? Read My Notes on TensorFlow 2.0

Slide 12

Slide 12 text

Anaconda, TensorFlow & Keras
Why use a virtual environment? Ease of upgrade/downgrade of tensorflow
● Download Anaconda here
● Create a new virtual environment
    $ conda create -n [my-env-name]
● Activate the virtual environment you created
    $ conda activate [my-env-name]
● Install TensorFlow beta
    $ pip install tensorflow==2.0.0-beta1
My blog post: Anaconda, Jupyter Notebook, TensorFlow, Keras

Slide 13

Slide 13 text

Google Colab
What is Google Colab?
● Jupyter Notebook
  ○ stored on Google Drive
  ○ running on Google’s VM in the cloud
● Free GPU and TPU!
● TensorFlow is already installed
● Save and share from your Drive
● Save directly to GitHub
Launch Colab from colab.research.google.com/
Check out these learning resources
● My blog on Colab
● TF team’s blog on Colab
● Laurence’s video: Build a deep neural network in 4 mins with TensorFlow in Colab
● Paige’s video: How to take advantage of GPUs & TPUs for your ML project
● Sam’s blog: Keras on TPUs in Colab

Slide 14

Slide 14 text

TensorBoard in Colab
TensorBoard now integrated in Colab!
● Debug
● Monitor
● Visualize
Lab - https://www.tensorflow.org/tensorboard/r2/tensorboard_in_notebooks

Slide 15

Slide 15 text

ML Pipeline

Slide 16

Slide 16 text

On-device ML
What are your options?

Slide 17

Slide 17 text

TensorFlow for edge devices
● 2015: TF open sourced
● 2016: TF Mobile
● 2017: TF Lite developer preview
● 2018: ML Kit
● 2019: TF Mobile deprecated, ML Kit improves, TF Lite exits dev preview
More than just mobile apps:
● Microcontrollers
● Edge TPUs

Slide 18

Slide 18 text

TensorFlow Lite
● For deploying to edge devices
● Works with Inception & MobileNet
● May not support all operations
● Supports
  ○ Mobile: Android & iOS
  ○ Android Things
  ○ Raspberry Pi
  ○ Microcontroller
  ○ Edge TPU

Slide 19

Slide 19 text

Optimization
TFLite model optimization toolkit
● Quantization - convert 32-bit floating point to fixed point (e.g. 8-bit int)
  ○ Post-training quantization
  ○ Quantization-aware training
● Pruning - eliminating unnecessary values in the weight tensor
Android:
● GPU delegate
● Android NNAPI
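To make the quantization bullet concrete, here is a minimal NumPy sketch of affine 8-bit quantization, the general idea behind mapping 32-bit floats to 8-bit integers. The helper names `quantize` and `dequantize` are made up for illustration, and this is not the TFLite toolkit's actual implementation, which also handles calibration and per-axis scales.

```python
import numpy as np

def quantize(weights):
    """Map float32 values to uint8 with an affine (scale, zero_point) scheme."""
    lo, hi = float(weights.min()), float(weights.max())
    # Guard against a constant tensor, where hi == lo.
    scale = (hi - lo) / 255.0 or 1.0
    zero_point = int(round(-lo / scale))
    q = np.clip(np.round(weights / scale) + zero_point, 0, 255).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float32 values from the 8-bit representation."""
    return (q.astype(np.float32) - zero_point) * scale

# Illustrative weights only; a real model has one tensor like this per layer.
weights = np.linspace(-1.0, 1.0, 11, dtype=np.float32)
q, scale, zero_point = quantize(weights)
restored = dequantize(q, scale, zero_point)
max_error = float(np.abs(weights - restored).max())
print(q.dtype, scale, zero_point, max_error)
```

Each weight now takes 1 byte instead of 4, at the cost of a rounding error that stays within about one quantization step (`scale`).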

Slide 20

Slide 20 text

ML Kit
Brings Google’s ML expertise to mobile developers in a powerful and easy-to-use package. Powered by TF Lite and hosted on Firebase
Base APIs:
● Image labelling
● OCR
● Face detection
● Barcode scanning
● Landmark detection
● Smart reply (coming soon)
● Object detection & Tracking
● Translation (56 languages)
● AutoML
Custom models:
● Dynamic model downloads
● A/B testing (via Firebase Remote Config)
● Model compression & conversion (from TensorFlow to TF Lite)

Slide 21

Slide 21 text

Android ML with TensorFlow
Your options:
● With ML Kit
  ○ (Out of the box) Base APIs
  ○ Custom model
● Direct deploy to Android
  ○ Custom model
Custom models
● Download pre-trained models
● Retrain model
● Train your own from scratch
  ○ data
  ○ train
  ○ convert
  ○ inference
Note: you can use AutoML to train, but there was no easy implementation on mobile until recently

Slide 22

Slide 22 text

End to end: model training to inference
Data → Training → SavedModel or Keras model → Inference
Training (Model):
● tf.Keras (TensorFlow)
● Python libraries: NumPy, Matplotlib etc.
Inference (Serving):
● Cloud
● Web
● Mobile
● IoT
● Microcontrollers
● Edge TPU

Slide 23

Slide 23 text

Data
● Existing datasets
  ○ Part of the deep learning framework:
    ■ MNIST, CIFAR10, FASHION_MNIST, IMDB movie reviews etc.
  ○ Open datasets:
    ■ MNIST, MS-COCO, ImageNet, CelebA etc.
  ○ Kaggle datasets: https://www.kaggle.com/datasets
  ○ Google Dataset search tool: https://toolbox.google.com/datasetsearch
  ○ TF 2.0: TFDS
● Collect your own data

Slide 24

Slide 24 text

Models
Options for getting a model:
● Download a pre-trained model (here): Inception-v3, MobileNet etc.
● Transfer learning with a pre-trained model
  ○ Feature extraction or fine-tuning on a pre-trained model
  ○ TensorFlow Hub (https://www.tensorflow.org/hub/)
● Train your own model from scratch (example in this talk)

Slide 25

Slide 25 text

Model saving, conversion, deployment
● Model saving - SavedModel or Keras model
● Model conversion
  ○ Convert the model to tflite format
  ○ Validate the converted model before deploying
● Deploy TFLite for inference

Slide 26

Slide 26 text

End to End tf.Keras to TFLite to Android
Train a model from scratch
Datasets → Train model → Convert to TFLite → Deploy for inference

Slide 27

Slide 27 text

MNIST dataset
● 60,000 training images and 10,000 test images
● 28x28x1 grayscale images
● 10 classes: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
● Popular for computer vision
  ○ “hello world” tutorial or
  ○ benchmarking ML algorithms

Slide 28

Slide 28 text

Training the model in Colab
Launch sample code on Colab → mnist_tfkeras_to_tflite.ipynb
1. Import data
2. Define model architecture
3. Train the model
4. Model saving & conversion
  ○ Save a Keras model
  ○ Convert to tflite format

Slide 29

Slide 29 text

A typical CNN model architecture
MNIST example:
● Convolutional layer (definition)
● Pooling layer (definition)
● Dense (fully-connected) layer (definition)
Diagram: input → conv → pool → conv → pool → conv → pool → Dense → 0 1 2 3 4 5 6 7 8 9
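As a rough check on the architecture in the diagram, the sketch below traces how the spatial size of a 28x28 MNIST image shrinks through three conv + pool stages. The 3x3 unpadded ("valid") convolutions and 2x2 pooling windows are assumptions for illustration; the slide does not state the kernel sizes or padding used in the notebook.

```python
def conv_out(size, kernel=3, stride=1):
    """Output width/height of a 'valid' (no padding) convolution."""
    return (size - kernel) // stride + 1

def pool_out(size, window=2):
    """Output width/height of non-overlapping pooling."""
    return size // window

size = 28  # MNIST images are 28x28
trace = [size]
for _ in range(3):  # three conv + pool stages, as in the diagram
    size = conv_out(size)
    trace.append(size)
    size = pool_out(size)
    trace.append(size)

print(trace)  # [28, 26, 13, 11, 5, 3, 1]
```

Under these assumed sizes the last feature map is 1x1 per filter, which is then flattened and fed to the Dense layer with 10 outputs, one per digit.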

Slide 30

Slide 30 text

Inspect the model - in Python code
After defining the model architecture, call model.summary() to print each layer with its output shape and parameter count
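The parameter counts that model.summary() reports can be reproduced by hand, which is a useful sanity check. The sketch below uses the standard formulas for convolutional and dense layers; the layer sizes are hypothetical and are not taken from the talk's notebook.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters

def dense_params(inputs, units):
    """One weight per input-unit pair plus one bias per unit."""
    return (inputs + 1) * units

# Hypothetical layer sizes, chosen only for illustration.
layers = {
    "conv1 (32 filters, 3x3, 1 input channel)": conv2d_params(3, 3, 1, 32),
    "conv2 (64 filters, 3x3, 32 input channels)": conv2d_params(3, 3, 32, 64),
    "dense (64 inputs, 10 units)": dense_params(64, 10),
}
total = sum(layers.values())
for name, count in layers.items():
    print(f"{name}: {count}")
print("total:", total)  # total: 19466
```

Pooling and flatten layers have no trainable parameters, so they contribute 0 to the total.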

Slide 31

Slide 31 text

Visualize model
Use a visualization tool:
● TensorBoard
● Netron (https://github.com/lutzroeder/Netron)
Drop the .tflite model into Netron and see the model visually

Slide 32

Slide 32 text

TensorFlow Lite Converter
Convert a Keras model → a tflite model with the tflite converter
There are two options:
1. Command line
2. Python API
Note:
● you can convert from SavedModel as well
● GraphDef and tf.Session are no longer supported in 2.0 for TFLite conversion
Read details on the tflite converter in the TF documentation here

Slide 33

Slide 33 text

Tflite convert through command line
To convert a tf.keras model to a tflite model:
    $ tflite_convert \
        --output_file=mymodel.tflite \
        --keras_model_file=mymodel.h5

Slide 34

Slide 34 text

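Tflite convert through Python code
Note: the converter API differs between TF 1.13, 1.14, 2.0 Alpha & nightly. The code below uses the TF 2.x API.
    import tensorflow as tf

    # Load the saved Keras model (TF 2.x converts from an in-memory model;
    # TF 1.x used TFLiteConverter.from_keras_model_file(path), and earlier
    # 1.x releases exposed it under tf.contrib.lite)
    model = tf.keras.models.load_model("mymodel.h5")
    # Create a converter
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    # Enable post-training quantization
    # (replaces converter.post_training_quantize = True from TF 1.x)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    # Convert the model
    tflite_model = converter.convert()
    # Create the tflite model file
    tflite_model_name = "mymodel.tflite"
    with open(tflite_model_name, "wb") as f:
        f.write(tflite_model)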

Slide 35

Slide 35 text

Validate the tflite model
Protip: validate the tflite model in Python after conversion
    import numpy as np
    import tensorflow as tf

    # Load TFLite model and allocate tensors.
    interpreter = tf.lite.Interpreter(model_path="converted_model.tflite")
    interpreter.allocate_tensors()
    # Get input and output tensors.
    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()
    # Test model on random input data.
    input_shape = input_details[0]['shape']
    input_data = np.array(np.random.random_sample(input_shape), dtype=np.float32)
    interpreter.set_tensor(input_details[0]['index'], input_data)
    interpreter.invoke()
    tflite_results = interpreter.get_tensor(output_details[0]['index'])
    # Test the TensorFlow model (the trained tf.keras `model`) on the same input.
    tf_results = model(tf.constant(input_data))
    # Compare the result.
    for tf_result, tflite_result in zip(tf_results, tflite_results):
        np.testing.assert_almost_equal(tf_result, tflite_result, decimal=5)

Slide 36

Slide 36 text

Validate the tflite model
Protip: validate the tflite model in Python after conversion
    import numpy as np
    import tensorflow as tf

    # --- TFLite result ---
    # Load TFLite model and allocate tensors.
    interpreter = tf.lite.Interpreter(model_path="converted_model.tflite")
    interpreter.allocate_tensors()
    # Get input and output tensors.
    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()
    # Test model on random input data.
    input_shape = input_details[0]['shape']
    input_data = np.array(np.random.random_sample(input_shape), dtype=np.float32)
    interpreter.set_tensor(input_details[0]['index'], input_data)
    interpreter.invoke()
    tflite_results = interpreter.get_tensor(output_details[0]['index'])

    # --- TensorFlow result ---
    # Test the TensorFlow model (the trained tf.keras `model`) on the same input.
    tf_results = model(tf.constant(input_data))

    # --- Compare results ---
    for tf_result, tflite_result in zip(tf_results, tflite_results):
        np.testing.assert_almost_equal(tf_result, tflite_result, decimal=5)

Slide 37

Slide 37 text

Tflite on Android
Android sample code DigitRecognizer, step by step:
● Place the .tflite model under the assets folder
● Update build.gradle dependencies
● Input image - custom view, gallery or camera
● Data preprocessing
● Classify with the model
● Post processing
● Display result in UI

Slide 38

Slide 38 text

Dependencies
Update build.gradle to include TensorFlow Lite
    android {
        // Make sure model doesn't get compressed when app is compiled
        aaptOptions {
            noCompress "tflite"
        }
    }
    dependencies {
        // ...
        // Add dependency for TensorFlow Lite
        // ("implementation" replaces the deprecated "compile" configuration)
        implementation 'org.tensorflow:tensorflow-lite:[version-number]'
    }
Place the mnist.tflite model file under the /assets folder

Slide 39

Slide 39 text

Input - image data
Input to the classifier is an image, your options:
● Draw on canvas from custom View
● Get image from Gallery or a 3rd party camera
● Live frames from Camera2 API
Make sure the image dimensions (shape) match what your classifier expects
● 28x28x1 - MNIST or FASHION_MNIST grayscale image
● 299x299x3 - Inception V3
● 224x224x3 - MobileNet (default input size)
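A shape mismatch is easy to catch before calling the interpreter. The sketch below shows one way to do that check in Python while prototyping; `EXPECTED_SHAPES` and `check_input_shape` are hypothetical names, and only two of the shapes listed on the slide are included.

```python
# Expected input shapes (height, width, channels), taken from the slide.
EXPECTED_SHAPES = {
    "mnist": (28, 28, 1),          # also FASHION_MNIST
    "inception_v3": (299, 299, 3),
}

def check_input_shape(model_name, image_shape):
    """Raise ValueError if image_shape does not match the model's input."""
    expected = EXPECTED_SHAPES[model_name]
    if tuple(image_shape) != expected:
        raise ValueError(
            f"{model_name} expects {expected}, got {tuple(image_shape)}"
        )
    return True

check_input_shape("mnist", (28, 28, 1))  # passes silently
```

On Android the same idea applies: compare the Bitmap's width, height and channel count against the model's input tensor shape before filling the ByteBuffer.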

Slide 40

Slide 40 text

Image preprocessing
● Convert Bitmap to ByteBuffer
● Normalize pixel values to a certain range
● Convert from color to grayscale, if needed
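These preprocessing steps can be prototyped in Python before porting them to Android. This is a sketch under assumptions: the input is an RGB array with values 0-255, the model expects a float32 tensor of shape (1, 28, 28, 1) scaled to [0, 1], and `preprocess` is a hypothetical helper name. The grayscale weights are the common ITU-R BT.601 luma coefficients, which may differ from what the sample app uses.

```python
import numpy as np

def preprocess(rgb_image):
    """Convert an HxWx3 uint8 image to a (1, H, W, 1) float32 tensor in [0, 1]."""
    rgb = rgb_image.astype(np.float32)
    # Color to grayscale using BT.601 luma weights.
    gray = rgb @ np.array([0.299, 0.587, 0.114], dtype=np.float32)
    # Normalize pixel values from [0, 255] to [0, 1].
    gray /= 255.0
    # Add the batch and channel dimensions the model expects.
    return gray[np.newaxis, :, :, np.newaxis]

# Random stand-in for a 28x28 camera or canvas image.
image = np.random.randint(0, 256, size=(28, 28, 3), dtype=np.uint8)
input_tensor = preprocess(image)

# On Android the same values go into a ByteBuffer of this size:
# 28 * 28 * 1 floats * 4 bytes each.
buffer_size = input_tensor.nbytes
print(input_tensor.shape, input_tensor.dtype, buffer_size)
```

The normalization range has to match whatever was used during training; [0, 1] is only one common choice.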

Slide 41

Slide 41 text

Run inference
Load the model file located under the assets folder
Use the TensorFlow Lite interpreter to run inference on the input image

Slide 42

Slide 42 text

Post processing
The output is an array of probabilities, each corresponding to a category
Find the category with the highest probability and output the result to the UI
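Finding the category with the highest probability is an argmax over the output array. Here is a minimal Python sketch of that step; `top_category` is a hypothetical helper and the probabilities are made-up values, not real model output.

```python
def top_category(probabilities, labels):
    """Return the (label, probability) pair with the highest probability."""
    best_index = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return labels[best_index], probabilities[best_index]

# Hypothetical model output for one MNIST image: 10 probabilities.
output = [0.01, 0.02, 0.05, 0.02, 0.01, 0.03, 0.01, 0.80, 0.03, 0.02]
labels = [str(digit) for digit in range(10)]

label, confidence = top_category(output, labels)
print(label, confidence)  # 7 0.8
```

In an app it can also be worth showing the confidence next to the label, so that low-confidence predictions are visible to the user.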

Slide 43

Slide 43 text

Summary
● Training with tf.Keras is easy
● Model conversion to TFLite is easier
● Android implementation is still challenging & error-prone (hopefully this gets improved in the future!):
  ○ Validate the tflite model before deploying to Android
  ○ Image pre-processing
  ○ Input tensor shape?
  ○ Color or grayscale?
  ○ Post processing
My blog post: E2E tf.Keras to TFLite to Android

Slide 44

Slide 44 text

TFLite demo app
Check out the demo app in the TensorFlow repo
Clone the tensorflow project from GitHub:
    git clone https://www.github.com/tensorflow/tensorflow
Then open the tflite Android demo in Android Studio:
    /tensorflow/lite/java/demo

Slide 45

Slide 45 text

More TFLite examples
More TensorFlow examples →

Slide 46

Slide 46 text

Inference with GPU
● Face contour detection
● Link to blog post: TensorFlow Lite Now Faster with Mobile GPUs (Developer Preview)

Slide 47

Slide 47 text

PoseNet example
● PoseNet model on Android
● Camera live frames
● Display key body parts in real time
● Link to blog post: Track human poses in real-time on Android with TensorFlow Lite

Slide 48

Slide 48 text

TFLite on microcontroller
● Tiny models on tiny computers
● Consumes much less power than CPUs - days on a coin battery
● Tiny RAM and Flash available
● Opens up voice interface to devs
More info here:
● Doc - https://www.tensorflow.org/lite/guide/microcontroller
● Code lab - https://g.co/codelabs/sparkfunTF
● Purchase - https://www.sparkfun.com/products/15170

Slide 49

Slide 49 text

Coral Edge TPU
Coral Edge TPU (beta) - hardware for on-device ML acceleration
● Dev board (+ camera module)
● USB Accelerator (+ camera module + Raspberry Pi)
Link to codelab: https://codelabs.developers.google.com/codelabs/edgetpu-classifier/index.html#0

Slide 50

Slide 50 text

Coral Edge TPU
MobileNet SSD model running on TPU
Inference time: < ~20 ms, > ~60 fps

Slide 51

Slide 51 text

Coral Edge TPU demo
MobileNet SSD model running on CPU
Inference time: > ~390 ms, ~3 fps
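Inference time and frame rate are two views of the same number. The sketch below converts the approximate latencies quoted on the two demo slides into frames per second, assuming inference is the only per-frame cost (camera capture and drawing are ignored). `fps_from_latency` is a hypothetical helper name.

```python
def fps_from_latency(latency_ms):
    """Frames per second if each frame takes latency_ms to process."""
    return 1000.0 / latency_ms

edge_tpu_fps = fps_from_latency(20)   # 50.0 fps at the 20 ms upper bound
cpu_fps = fps_from_latency(390)       # about 2.56 fps
speedup = 390 / 20                    # 19.5x

print(edge_tpu_fps, round(cpu_fps, 2), speedup)
```

Since the Edge TPU figure is an upper bound on latency, the real frame rate is higher than 50 fps; the ~60 fps on the slide corresponds to roughly 16.7 ms per frame.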

Slide 52

Slide 52 text

Upcoming
● Why the Future of Machine Learning is Tiny - Pete Warden
● Deploying to mobile and IoT will get much easier
● TFLite will have many more features
● Federated learning
● On-device training

Slide 53

Slide 53 text

Thank you!
Follow me on Twitter, Medium or GitHub to learn more about deep learning, TensorFlow and on-device ML
Twitter: @margaretmz | Medium: @margaretmz | GitHub: margaretmz