allowing us to build new types of applications and products:
• Access to large amounts of data
• Very fast, low-latency interactions
• Stronger privacy and security
• Works offline
quantized and float, tuned for mobile platforms.
• A new FlatBuffers-based model file format
• On-device interpreter with kernels optimized for faster execution on mobile
• TensorFlow converter to convert TF-trained models to the .tflite format
• Small footprint: less than 300 KB
• Numerous pre-tested models
• Java and C++ API support
in Pixel 3 Portrait mode, GPU inference accelerates:
• the foreground-background segmentation model by over 4x
• the depth estimation model by over 10x
[Source: TensorFlow Lite Now Faster with Mobile GPUs]
freeze_graph --input_graph=/tmp/mobilenet_v1_224.pb \
  --input_checkpoint=/tmp/checkpoints/mobilenet-10202.ckpt \
  --input_binary=true \
  --output_graph=/tmp/frozen_mobilenet_v1_224.pb \
  --output_node_names=MobilenetV1/Predictions/Reshape_1

Freezing is the process of merging the checkpoint values with the graph structure into a single file.
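Conceptually, freezing replaces each Variable node in the graph with a Const node holding that variable's checkpoint value, so the graph no longer depends on a separate checkpoint file. The sketch below is a hypothetical illustration of that idea using plain dictionaries; it is not TensorFlow's actual GraphDef handling.

```python
# Hypothetical sketch of "freezing": merge checkpoint values into the graph
# by turning Variable nodes into Const nodes. Names and structure here are
# illustrative only, not TensorFlow's real protobuf format.

def freeze(graph_nodes, checkpoint):
    """Return a new node list where every Variable becomes a Const."""
    frozen = []
    for node in graph_nodes:
        if node["op"] == "Variable":
            # Bake the trained value from the checkpoint into the graph.
            frozen.append({"name": node["name"],
                           "op": "Const",
                           "value": checkpoint[node["name"]]})
        else:
            frozen.append(dict(node))
    return frozen

graph = [
    {"name": "weights", "op": "Variable"},
    {"name": "matmul", "op": "MatMul", "inputs": ["input", "weights"]},
]
ckpt = {"weights": [[0.1, 0.2], [0.3, 0.4]]}

frozen = freeze(graph, ckpt)
```

After this step, `frozen` is self-contained: the MatMul node still references `weights`, but `weights` is now a constant carrying the trained values.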
  --input_arrays=input \
  --output_arrays=MobilenetV1/Predictions/Reshape_1

.tflite: a serialized FlatBuffer that contains TensorFlow Lite operators and tensors for the TensorFlow Lite interpreter.
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
converter.optimizations = [tf.lite.Optimize.OPTIMIZE_FOR_SIZE]
tflite_quant_model = converter.convert()

Quantization lowers the precision of parameters from their training-time 32-bit floating-point representation to much smaller and more efficient 8-bit integers.
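The idea behind this size optimization can be shown with a minimal affine quantization sketch: map floats in [min, max] onto integers 0..255 via a scale and zero point. This is a simplification for illustration; TensorFlow Lite's actual scheme adds details such as per-axis scales and symmetric int8 weight quantization.

```python
# Minimal sketch of affine 8-bit quantization (illustrative, not TFLite's
# exact scheme): real = (quantized - zero_point) * scale.

def quantize(values):
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 255.0 or 1.0          # guard against a zero range
    zero_point = round(-lo / scale)            # integer that represents 0.0
    q = [min(255, max(0, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    return [(qi - zero_point) * scale for qi in q]

weights = [-1.0, -0.5, 0.0, 0.5, 1.0]
q, scale, zp = quantize(weights)
recovered = dequantize(q, scale, zp)
```

Each weight now occupies 1 byte instead of 4, and the round-trip error is bounded by the scale (the width of one quantization step).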
• Import the TensorFlow AAR file
• Build the source code with Bazel
• Implement the Interpreter
• Run the APK
Resource: Using TensorFlow on Android

[Architecture diagram: a trained TF model passes through the TFLite Converter to produce a .tflite model file, which the Interpreter loads. An Android app calls the Interpreter through the Java or C++ API, optionally delegating to the Android NN API; an iOS app calls it through the C++ API. The Interpreter dispatches to optimized Kernels.]