GIDS18_SupriyaSrivatsa.pdf

TensorFlow for Mobile Machine Learning
Supriya Srivatsa, Software Engineer, Xome

Overview
• AI and Mobile – the Convergence
• Inference – Today and Tomorrow
• TensorFlow Primer
• TensorFlow in your Pocket
  – TensorFlow Mobile
  – TensorFlow Lite
• PokéDemo
• Applications and Case Studies
• Q & A

AI AND MOBILE – THE CONVERGENCE

INFERENCE - TODAY AND TOMORROW

The “Transfer to Infer” Approach

Why On Device Prediction
• Data Privacy
• Poor Internet Connection
• Questionable User Experience

Why On Device Prediction Case Study: Portrait Mode

TENSORFLOW PRIMER

TensorFlow – Deferred Execution Model (Building the Computational Graph)
    import tensorflow as tf
    num1 = tf.constant(5)
    num2 = tf.constant(10)
    sum = num1 + num2
    print(sum)  # O/P: Tensor("add:0", shape=(), dtype=int32)

TensorFlow – Deferred Execution Model (Running the Computational Graph)
    import tensorflow as tf
    num1 = tf.constant(5)
    num2 = tf.constant(10)
    sum = num1 + num2
    with tf.Session() as sess:
        print(sess.run(sum))  # O/P: 15
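The split between building and running a graph can be illustrated without TensorFlow at all. The following is a minimal pure-Python sketch, not TensorFlow's implementation: `Node`, `constant` and `run` are hypothetical names invented for this illustration. Adding two nodes only records the operation; the arithmetic happens when `run` is called.

```python
class Node:
    """A graph node that records an operation instead of computing it."""

    def __init__(self, op, inputs=(), value=None):
        self.op = op
        self.inputs = inputs
        self.value = value

    def __add__(self, other):
        # Graph building: return a new "add" node, no arithmetic yet.
        return Node("add", (self, other))

    def __repr__(self):
        return f"Node(op={self.op!r})"


def constant(value):
    """Create a leaf node holding a fixed value (hypothetical helper)."""
    return Node("const", value=value)


def run(node):
    """Evaluate a node by recursively evaluating its inputs."""
    if node.op == "const":
        return node.value
    if node.op == "add":
        return sum(run(i) for i in node.inputs)
    raise ValueError(f"unknown op: {node.op}")


num1 = constant(5)
num2 = constant(10)
total = num1 + num2

print(total)       # Node(op='add')
print(run(total))  # 15
```

Printing `total` shows a description of the node, much as printing the tensor on the slide shows `Tensor("add:0", ...)` rather than 15.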

TENSORFLOW IN YOUR POCKET

Pick Your Weapon
• Choose a pre-trained TF Model
  – Inception V3 Model
  – MNIST
  – Smart Reply
  – Deep Speech
• Build a TF Model

Sharpen your Sword
• Retrain the model as required.

Neural Network and Transfer Learning

TENSORFLOW MOBILE VS TENSORFLOW LITE

TensorFlow Lite
• Smaller binary size, better performance.
• Ability to leverage hardware acceleration.
• Only supports a limited set of operators.

TensorFlow Mobile and TensorFlow Lite

Optimization
• optimize_for_inference
• Quantization

Quantization
• Round it up
  – Transform: round_weights
  – Compression rates: ~8% => ~70%
• Shrink down node names
  – Transform: obfuscate_names
• Eight bit calculations
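Why rounding helps compression can be shown with a small experiment. This is a pure-Python sketch under stated assumptions, not the Graph Transform Tool itself: the `round_weights` function below is a stand-in written for illustration, the weights are random numbers rather than a real model, and the exact ratios will differ from the figures on the slide. The idea is that snapping every weight to one of 256 levels leaves far fewer distinct byte patterns, so a general-purpose compressor such as zlib does much better.

```python
import random
import struct
import zlib


def round_weights(weights, num_steps=256):
    """Snap each float to one of num_steps evenly spaced levels in [min, max]."""
    lo, hi = min(weights), max(weights)
    if hi == lo:
        return list(weights)
    step = (hi - lo) / (num_steps - 1)
    return [lo + round((w - lo) / step) * step for w in weights]


def compressed_ratio(weights):
    """Compressed size divided by raw size, with weights packed as float32."""
    raw = struct.pack(f"{len(weights)}f", *weights)
    return len(zlib.compress(raw)) / len(raw)


random.seed(0)  # fixed seed so the run is repeatable
weights = [random.gauss(0.0, 1.0) for _ in range(10_000)]  # stand-in for model weights
rounded = round_weights(weights)

ratio_before = compressed_ratio(weights)
ratio_after = compressed_ratio(rounded)
print(f"compressed/raw before rounding: {ratio_before:.2f}")
print(f"compressed/raw after rounding:  {ratio_after:.2f}")
```

The weights are still stored as 32-bit floats after rounding, so the uncompressed file does not shrink; the gain appears only once the file is compressed, for example inside an APK.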

Quantization – Eight Bit Calculations
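One common way to get to eight-bit values is linear (min/max) quantization: each float is mapped to an integer code from 0 to 255, and the minimum and step size are stored alongside so the value can be approximately recovered. The sketch below assumes that scheme; `quantize` and `dequantize` are hypothetical helpers written for this illustration and the weights are made-up numbers.

```python
def quantize(values):
    """Map floats in [min, max] linearly onto integer codes 0..255."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 255
    if scale == 0:
        scale = 1.0  # constant input: any step size works, avoid dividing by zero
    codes = [round((v - lo) / scale) for v in values]
    return codes, lo, scale


def dequantize(codes, lo, scale):
    """Recover approximate floats from the 8-bit codes."""
    return [lo + c * scale for c in codes]


weights = [-1.0, -0.25, 0.0, 0.5, 1.0]  # made-up example values
codes, lo, scale = quantize(weights)
restored = dequantize(codes, lo, scale)

print(codes)
print(restored)
```

Each restored value differs from the original by at most half a step (`scale / 2`), which is the precision given up in exchange for storing one byte per weight instead of four.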

Optimization – Before and After

TensorFlow Mobile and TensorFlow Lite

TensorFlow Lite
• TOCO – TensorFlow Lite Optimizing Converter
  – Pruning unused nodes.
  – Performance Improvements.
  – Convert to tflite format. (Generate FlatBuffer file.)
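The pruning step can be pictured as a reachability walk: starting from the output nodes, follow input edges backwards and drop everything that is never visited. The following is a minimal sketch of that idea only, not TOCO's actual code; `prune_unused` and the node names in the example graph are hypothetical.

```python
def prune_unused(graph, outputs):
    """Keep only nodes reachable from the outputs by following input edges.

    graph maps a node name to the list of node names it takes as input.
    """
    keep = set()
    stack = list(outputs)
    while stack:
        name = stack.pop()
        if name in keep:
            continue
        keep.add(name)
        stack.extend(graph[name])
    return {name: inputs for name, inputs in graph.items() if name in keep}


# Hypothetical graph: the training-only nodes do not feed "softmax".
graph = {
    "input": [],
    "weights": [],
    "matmul": ["input", "weights"],
    "softmax": ["matmul"],
    "labels": [],
    "loss": ["softmax", "labels"],
    "save_checkpoint": ["weights"],
}

pruned = prune_unused(graph, outputs=["softmax"])
print(sorted(pruned))  # ['input', 'matmul', 'softmax', 'weights']
```

Nodes needed only for training, such as the loss and checkpointing nodes in this example, are not reachable from the inference output and are removed.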

✓ Frozen
✓ Optimized, Quantized
✓ .tflite / FlatBuffer

How does it work?

Packaging App and Model

CODE AWAY :)

Code Away – Gradle Files

Code Away :)
    tflite = new Interpreter(<loadmodelfile>)
    tflite.run(giveInput, outputObject)
• Create Interpreter
• Run model with input, fetch output.

POKÉDEMO!

PokéDemo

APPLICATIONS AND CASE STUDIES

Coca-Cola

Google Assistant

Smart Reply

Thank you
