[DroidCon London] Heat the neurons of your smartphone with Deep Learning

Qian Jin
Twitter: @bonbonking | Email: [email protected]

About Me

About This Talk

Sylvain Lequeux, Data Engineer @ Xebia
Yoann Benoit, Data Scientist @ Xebia

On-Device Intelligence

Android Wear 2.0 Smart Reply: Learned Projection Model
Source: https://research.googleblog.com/2017/02/on-device-machine-intelligence.html

https://en.wikipedia.org/wiki/Moore%27s_law

Source: https://www.qualcomm.com/news/snapdragon/2017/01/09/tensorflow-machine-learning-now-optimized-snapdragon-835-and-hexagon-682

Source: https://9to5google.com/2017/01/10/qualcomm-snapdragon-835-machine-learning-tensorflow/

The ultimate goal of on-device intelligence is to improve mobile devices’ ability to understand the world.
Image inspiration credit: Moodstocks

Chat history of the #datamobile Slack channel

Magritte: Ceci n’est pas une pomme. (This is not an apple.)

René Magritte (1898-1967)

Build TensorFlow Android Example With Bazel

Android Developer Deep Learning Noob

NEURONS NEURONS EVERYWHERE

WE CAN RECOGNIZE ALL THE THINGS!

I THOUGHT THERE WERE MODELS FOR EVERYTHING...

Neural Networks in a Nutshell

Here’s a Neural Network

Prediction on an image (inference): Apple: 0.98, Banana: 0.02
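The "Apple: 0.98, Banana: 0.02" output can be illustrated with a softmax, the function that classification networks commonly use to turn raw scores into probabilities. This is a minimal sketch, not code from the talk: the class name `SoftmaxSketch` and the raw scores are hypothetical, chosen only so that the result lands near the values on the slide.

```java
final class SoftmaxSketch {
    /** Converts raw scores (logits) into probabilities that sum to 1. */
    static float[] softmax(float[] logits) {
        // Subtract the maximum before exponentiating for numerical stability.
        float max = Float.NEGATIVE_INFINITY;
        for (float v : logits) {
            max = Math.max(max, v);
        }
        double[] exps = new double[logits.length];
        double sum = 0.0;
        for (int i = 0; i < logits.length; i++) {
            exps[i] = Math.exp(logits[i] - max);
            sum += exps[i];
        }
        float[] probs = new float[logits.length];
        for (int i = 0; i < logits.length; i++) {
            probs[i] = (float) (exps[i] / sum);
        }
        return probs;
    }

    public static void main(String[] args) {
        String[] labels = {"Apple", "Banana"};
        // Hypothetical raw scores produced by the last layer of the network.
        float[] probs = softmax(new float[] {4.0f, 0.1f});
        for (int i = 0; i < labels.length; i++) {
            System.out.println(labels[i] + ": " + probs[i]);
        }
    }
}
```

Whatever the raw scores are, the outputs always add up to 1, which is why they can be read as confidence values per label.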

How to train a model?

Back Propagation
• Apple: 0.34, Banana: 0.66 => Prediction Error
• Apple: 0.87, Banana: 0.13
• Banana: 0.93, Apple: 0.07
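The idea of correcting weights from the prediction error can be sketched in its simplest case: a single sigmoid neuron, where back propagation reduces to one application of the chain rule. This is an illustrative sketch under stated assumptions, not the training code behind the slides: the class `BackPropSketch`, the two feature vectors, and the learning rate are all hypothetical.

```java
final class BackPropSketch {
    // Hypothetical inputs: {bias term, "redness", "elongation"}.
    static final double[] APPLE = {1.0, 0.9, 0.1};
    static final double[] BANANA = {1.0, 0.2, 0.9};

    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    /** Forward pass: probability that the input is an apple. */
    static double predict(double[] w, double[] x) {
        double z = 0.0;
        for (int i = 0; i < w.length; i++) {
            z += w[i] * x[i];
        }
        return sigmoid(z);
    }

    /** One training step: forward pass, prediction error, weight update. */
    static void trainStep(double[] w, double[] x, double target, double learningRate) {
        // For a sigmoid output with cross-entropy loss, dLoss/dz = prediction - target.
        double error = predict(w, x) - target;
        for (int i = 0; i < w.length; i++) {
            // Chain rule: dLoss/dw_i = error * x_i.
            w[i] -= learningRate * error * x[i];
        }
    }

    /** Trains on the two hypothetical samples and returns the learned weights. */
    static double[] train(int epochs) {
        double[] w = new double[3];
        for (int e = 0; e < epochs; e++) {
            trainStep(w, APPLE, 1.0, 0.5);  // target 1.0 means "apple"
            trainStep(w, BANANA, 0.0, 0.5); // target 0.0 means "banana"
        }
        return w;
    }

    public static void main(String[] args) {
        double[] w = train(200);
        System.out.println("P(apple | apple image)  = " + predict(w, APPLE));
        System.out.println("P(apple | banana image) = " + predict(w, BANANA));
    }
}
```

A deep network repeats the same update layer by layer, propagating the error backwards from the output towards the input.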

Deep Convolutional Neural Network & Inception Architecture
(Please don’t ask me any questions about this part, because I have no idea what they are talking about)
Credit: http://nicolovaligi.com/history-inception-deep-learning-architecture.html

Network In Network, Lin et al. (2014)
Credit: https://arxiv.org/pdf/1312.4400v3.pdf

Going Deeper with Convolutions, Szegedy et al. (2014)
Credit: https://arxiv.org/pdf/1409.4842v1.pdf

Rethinking the Inception Architecture for Computer Vision, Szegedy et al. (2015)
Credit: https://arxiv.org/pdf/1512.00567v3.pdf

Deep Convolutional Neural Network
Visualisation of Inception v3 model architecture: Edges, Shapes, High-Level Features, Classifiers
Image credit: https://github.com/tensorflow/models/tree/master/research/inception

Source: CS231n Convolutional Neural Networks for Visual Recognition http://cs231n.stanford.edu/

Source: https://code.facebook.com/posts/1687861518126048/facebook-to-open-source-ai-hardware-design/

Transfer Learning

• Use a pre-trained Deep Neural Network
• Keep all operations but the last one
• Re-train only the last operation to specialize your network to your classes
• Keep all weights identical except those of the last operation

Gather Training Data

Retrain a Model

    python tensorflow/examples/image_retraining/retrain.py \
      --bottleneck_dir=tf_files/bottlenecks \
      --how_many_training_steps=500 \
      --model_dir=tf_files/models/ \
      --summaries_dir=tf_files/training_summaries/ \
      --output_graph=tf_files/retrained_graph.pb \
      --output_labels=tf_files/retrained_labels.txt \
      --image_dir=tf_files/fruit_photos

Source: https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/

Overfitting

Obtain the Retrained Model
• 2 outputs:
  • Model as protobuf file (model.pb): contains a version of the selected network with a final layer retrained on your categories
  • Labels as text file (label.txt)

    public class ClassifierActivity extends CameraActivity implements OnImageAvailableListener {
      private static final int INPUT_SIZE = 224;
      private static final int IMAGE_MEAN = 117;
      private static final float IMAGE_STD = 1;
      private static final String INPUT_NAME = "input";
      private static final String OUTPUT_NAME = "output";
      private static final String MODEL_FILE = "file:///android_asset/tensorflow_inception_graph.pb";
      private static final String LABEL_FILE = "file:///android_asset/imagenet_comp_graph_label_strings.txt";
    }

Source: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/android/src/org/tensorflow/demo/ClassifierActivity.java

java.lang.UnsupportedOperationException: Op BatchNormWithGlobalNormalization is not available in GraphDef version 21.

Unsupported Operations
• Only keep the operations dedicated to the inference step
• Remove decoding, training, loss and evaluation operations

Optimize for Inference

    python -m tensorflow.python.tools.optimize_for_inference \
      --input=tf_files/retrained_graph.pb \
      --output=tf_files/optimized_graph.pb \
      --input_names="input" \
      --output_names="final_result"

Source: https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/

Build Standalone App

Pre-Google I/O 2017
• Use nightly build
• Library .so
• Java API jar

    android {
      //…
      sourceSets {
        main {
          jniLibs.srcDirs = ['libs']
        }
      }
    }

Post-Google I/O 2017
Currently: 1.4.0-rc0
Source: Android Meets TensorFlow: How to Accelerate Your App with AI (Google I/O '17) https://www.youtube.com/watch?v=25ISTLhz0ys

App size ~80MB

Reducing Model Size

WHO CARES? MODEL SIZE

Model Size
All weights are stored as they are (32-bit floats) => 80MB

Weights Quantization
~80MB -> ~20MB
6.372638493746383 => 6.4
Source: https://www.tensorflow.org/performance/quantization
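The "6.372638493746383 => 6.4" idea can be sketched as snapping every weight to the nearest of a small number of evenly spaced values. This is a rough sketch, assuming linear rounding to 256 levels between the smallest and largest weight; it is not TensorFlow's implementation, and `WeightRoundingSketch` and its method names are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

final class WeightRoundingSketch {
    /** Snaps each weight to the nearest of `levels` evenly spaced values in [min, max]. */
    static float[] roundWeights(float[] weights, int levels) {
        float min = Float.POSITIVE_INFINITY;
        float max = Float.NEGATIVE_INFINITY;
        for (float w : weights) {
            min = Math.min(min, w);
            max = Math.max(max, w);
        }
        float[] rounded = new float[weights.length];
        if (max == min) {
            // All weights are identical, so there is nothing to round.
            System.arraycopy(weights, 0, rounded, 0, weights.length);
            return rounded;
        }
        float step = (max - min) / (levels - 1);
        for (int i = 0; i < weights.length; i++) {
            int bucket = Math.round((weights[i] - min) / step);
            rounded[i] = min + bucket * step;
        }
        return rounded;
    }

    /** Counts how many different values the array contains. */
    static int countDistinct(float[] values) {
        Set<Float> seen = new HashSet<>();
        for (float v : values) {
            seen.add(v);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        float[] weights = {6.372638f, -1.254f, 0.0031f, 2.7182f, -3.1415f};
        float[] rounded = roundWeights(weights, 256);
        for (int i = 0; i < weights.length; i++) {
            System.out.println(weights[i] + " => " + rounded[i]);
        }
    }
}
```

The rounded values are still floats, so the gain in this sketch comes from repetition: a file with at most 256 distinct values per tensor compresses far better, and each value could also be stored as a single byte index.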

Quantize Graph

    python tensorflow/tools/quantization/quantize_graph.py \
      --input=tf_files/optimized_graph.pb \
      --output=tf_files/rounded_graph.pb \
      --output_node_names=final_result \
      --mode=weights_rounded

Source: https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/

MobileNet
Mobile-first computer vision models for TensorFlow
Image credit: https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet_v1.md

Inception V3 vs. MobileNet

    Model                               Accuracy*   Size
    Inception V3                        78%         85MB
    MobileNet (largest configuration)   70.5%       19MB

*: accuracy on ImageNet images

Optimize for Mobile

    IMAGE_SIZE=224
    ARCHITECTURE="mobilenet_0.50_${IMAGE_SIZE}"

Optimize for Mobile

    python tensorflow/examples/image_retraining/retrain.py \
      --bottleneck_dir=tf_files/bottlenecks \
      --how_many_training_steps=500 \
      --model_dir=tf_files/models/ \
      --summaries_dir=tf_files/training_summaries/"${ARCHITECTURE}" \
      --output_graph=tf_files/retrained_graph.pb \
      --output_labels=tf_files/retrained_labels.txt \
      --architecture="${ARCHITECTURE}" \
      --image_dir=tf_files/fruit_photos

Source: https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/

~80MB => ~20MB => ~1-5MB
Source: https://research.googleblog.com/2017/06/mobilenets-open-source-models-for.html

Underneath the Android App

Architecture diagram
• Layers: Android SDK (Java), Android NDK (C++)
• Components: Classifier Implementation, TensorFlow JNI wrapper, Trained Model
• Data flow: Camera Preview, (1) Image (Bitmap), (2) input_tensor, (3) top_results, (4) Classifications + Confidence, Overlay Display
Ref: https://jalammar.github.io/Supercharging-android-apps-using-tensorflow/

Image Sampling
• Get image from camera preview
• Crop the center square
• Resize
• Sample image
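The crop and resize steps above can be sketched on a plain pixel array. This is a simplified sketch, not the demo app's code: it assumes pixels are stored row by row as packed ints (as `Bitmap.getPixels` returns them) and uses nearest-neighbour resizing; `ImageSamplingSketch` and its methods are hypothetical names.

```java
final class ImageSamplingSketch {
    /** Crops the centered square out of a width x height image stored row by row. */
    static int[] cropCenterSquare(int[] pixels, int width, int height) {
        int side = Math.min(width, height);
        int left = (width - side) / 2;
        int top = (height - side) / 2;
        int[] out = new int[side * side];
        for (int row = 0; row < side; row++) {
            // Copy one row of the square at a time.
            System.arraycopy(pixels, (top + row) * width + left, out, row * side, side);
        }
        return out;
    }

    /** Nearest-neighbour resize of a square image from srcSide to dstSide pixels. */
    static int[] resizeSquare(int[] src, int srcSide, int dstSide) {
        int[] out = new int[dstSide * dstSide];
        for (int y = 0; y < dstSide; y++) {
            int srcY = y * srcSide / dstSide;
            for (int x = 0; x < dstSide; x++) {
                int srcX = x * srcSide / dstSide;
                out[y * dstSide + x] = src[srcY * srcSide + srcX];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int width = 640;
        int height = 480;
        int[] preview = new int[width * height]; // stand-in for a camera preview frame
        int[] square = cropCenterSquare(preview, width, height);
        int[] sample = resizeSquare(square, Math.min(width, height), 224);
        System.out.println("Sample image pixels: " + sample.length);
    }
}
```

The target side of 224 matches the `INPUT_SIZE` constant shown earlier, since the network expects a square input of a fixed size.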

Converts YUV420 (NV21) to ARGB8888

    public static native void convertYUV420ToARGB8888(
        byte[] y,
        byte[] u,
        byte[] v,
        int[] output,
        int width,
        int height,
        int yRowStride,
        int uvRowStride,
        int uvPixelStride,
        boolean halfSize);
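The native method's body is not shown on the slide, so the following is only a per-pixel sketch of what such a conversion computes. It assumes the common BT.601 video-range coefficients and ignores row and pixel strides; the real native code may use different constants or fixed-point arithmetic. `YuvSketch` is a hypothetical name.

```java
final class YuvSketch {
    static int clamp(int value) {
        return Math.max(0, Math.min(255, value));
    }

    /** Converts one YUV pixel (assumed BT.601, video range) to a packed ARGB8888 int. */
    static int yuvToArgb(int y, int u, int v) {
        double luma = 1.164 * (Math.max(y, 16) - 16);
        double cb = u - 128;
        double cr = v - 128;
        int r = clamp((int) Math.round(luma + 1.596 * cr));
        int g = clamp((int) Math.round(luma - 0.813 * cr - 0.391 * cb));
        int b = clamp((int) Math.round(luma + 2.018 * cb));
        // Alpha is fully opaque; red, green and blue follow in that order.
        return 0xFF000000 | (r << 16) | (g << 8) | b;
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(yuvToArgb(16, 128, 128)));
        System.out.println(Integer.toHexString(yuvToArgb(235, 128, 128)));
    }
}
```

In YUV420 the U and V planes are subsampled, so one U/V pair is shared by a 2x2 block of Y values; the stride parameters of the native method describe that layout.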

    /**
     * Initializes a native TensorFlow session for classifying images.
     *
     * @param assetManager The asset manager to be used to load assets.
     * @param modelFilename The filepath of the model GraphDef protocol buffer.
     * @param labels The list of labels
     * @param inputSize The input size. A square image of inputSize x inputSize is assumed.
     * @param imageMean The assumed mean of the image values.
     * @param imageStd The assumed std of the image values.
     * @param inputName The label of the image input node.
     * @param outputName The label of the output node.
     * @throws IOException
     */
    public static Classifier create(
        AssetManager assetManager,
        String modelFilename,
        List<String> labels,
        int inputSize,
        int imageMean,
        float imageStd,
        String inputName,
        String outputName) {
    }

    @Override
    public List<Recognition> recognizeImage(final Bitmap bitmap) {
      // Step 1: Preprocess bitmap / create tensor.
      bitmap.getPixels(intValues, 0, bitmap.getWidth(), 0, 0, bitmap.getWidth(), bitmap.getHeight());
      for (int i = 0; i < intValues.length; ++i) {
        final int val = intValues[i];
        floatValues[i * 3 + 0] = (((val >> 16) & 0xFF) - imageMean) / imageStd;
        floatValues[i * 3 + 1] = (((val >> 8) & 0xFF) - imageMean) / imageStd;
        floatValues[i * 3 + 2] = ((val & 0xFF) - imageMean) / imageStd;
      }

      // Step 2: Feed input data to TensorFlow (copy the input data into TensorFlow).
      inferenceInterface.feed(inputName, floatValues, 1, inputSize, inputSize, 3);

      // Step 3: Run the inference call.
      inferenceInterface.run(outputNames, logStats);

      // Step 4: Fetch the output tensor (copy it back into the output array).
      inferenceInterface.fetch(outputName, outputs);

      // (continued below)

Find the Best Classification

      // (continued)
      // Find the best classifications.
      PriorityQueue<Recognition> pq =
          new PriorityQueue<>(
              3,
              (lhs, rhs) -> {
                // Intentionally reversed to put high confidence at the head of the queue.
                return Float.compare(rhs.getConfidence(), lhs.getConfidence());
              });
      for (int i = 0; i < outputs.length; ++i) {
        if (outputs[i] > THRESHOLD) {
          pq.add(
              new Recognition(
                  "" + i, labels.size() > i ? labels.get(i) : "unknown", outputs[i], null));
        }
      }
      //...
      return recognitions;
    }

Adding New Models

Adding a New Model
2 * 20 MB = 40 MB

Model Stacking
• Start from previous model to keep all specific operations in the graph
• Specify all operations to keep when optimizing for inference

    graph_util.convert_variables_to_constants(
        sess, graph.as_graph_def(), ["final_result_fruits", "final_result_vegetables"])

Android Makers Paris 2017

Source: https://www.youtube.com/watch?v=EnFyneRScQ8

Continuous Training Pipeline
Source: https://www.tensorflow.org/serving/

TensorFlow Serving
Hosts the model and provides remote access to it
Source: https://www.tensorflow.org/serving/

Continuous Training Pipeline: model.pb, label.txt

Dispensing Model: model.pb, label.txt

Currently with Project Magritte…
Training
• Model debug done by overheating a laptop
• Model built on personal GPU
• Files uploaded manually
Model dispensing
• API available
• Deployment on AWS, currently migrating to Google Cloud

Android App Evolves

Android FilesDir: model.pb, labels (label.txt)

    public TensorFlowInferenceInterface(AssetManager assetManager, String model) {
      prepareNativeRuntime();

      this.modelName = model;
      this.g = new Graph();
      this.sess = new Session(g);
      this.runner = sess.runner();

      final boolean hasAssetPrefix = model.startsWith(ASSET_FILE_PREFIX);
      InputStream is = null;
      try {
        String aname = hasAssetPrefix ? model.split(ASSET_FILE_PREFIX)[1] : model;
        is = assetManager.open(aname);
      } catch (IOException e) {
        if (hasAssetPrefix) {
          throw new RuntimeException("Failed to load model from '" + model + "'", e);
        }
        // Perhaps the model file is not an asset but is on disk.
        try {
          is = new FileInputStream(model);
        } catch (IOException e2) {
          throw new RuntimeException("Failed to load model from '" + model + "'", e);
        }
      }
    }

Source: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/android/java/org/tensorflow/contrib/android/TensorFlowInferenceInterface.java
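Because the constructor above accepts either an asset URI or a path on disk, the app can prefer a downloaded model and fall back to the bundled one. The following is a hedged sketch of that choice, not code from Project Magritte: `ModelLocatorSketch` and `resolveModelPath` are hypothetical names, and the prefix value is assumed to match `ASSET_FILE_PREFIX` in the TensorFlow source.

```java
import java.io.File;

final class ModelLocatorSketch {
    // Assumed to match ASSET_FILE_PREFIX in TensorFlowInferenceInterface.
    static final String ASSET_FILE_PREFIX = "file:///android_asset/";

    /**
     * Returns the absolute path of a downloaded model if one exists in filesDir,
     * otherwise the asset URI of the model bundled with the app.
     */
    static String resolveModelPath(File filesDir, String modelFileName) {
        File downloaded = new File(filesDir, modelFileName);
        if (downloaded.isFile() && downloaded.length() > 0) {
            return downloaded.getAbsolutePath();
        }
        return ASSET_FILE_PREFIX + modelFileName;
    }

    public static void main(String[] args) {
        // On Android, filesDir would come from Context.getFilesDir().
        File filesDir = new File(System.getProperty("java.io.tmpdir"));
        System.out.println(resolveModelPath(filesDir, "model.pb"));
    }
}
```

The returned string can be passed as the `model` argument of the constructor, which opens it through the asset manager or from disk depending on the prefix.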

Model Inception V3 Optimized & Quantized

Model MobileNets_1.0_224

Demo Time

What’s next?

Source: Android Meets TensorFlow: How to Accelerate Your App with
AI (Google I/O '17) https://www.youtube.com/watch?v=25ISTLhz0ys

Federated Learning
Collaborative Machine Learning without Centralized Training Data
Source: https://research.googleblog.com/2017/04/federated-learning-collaborative.html

Resources
• Artificial neural network: https://en.wikipedia.org/wiki/Artificial_neural_network
• Deep Learning: https://en.wikipedia.org/wiki/Deep_learning
• Convolutional Neural Network: https://en.wikipedia.org/wiki/Convolutional_neural_network
• TensorFlow for Poets: https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/
• TensorFlow for Poets 2: Optimize for Mobile: https://codelabs.developers.google.com/codelabs/tensorflow-for-poets-2/
• TensorFlow Glossary: https://www.tensorflow.org/versions/r0.12/resources/glossary
• Magritte project blog: http://blog.xebia.fr/2017/07/24/on-device-intelligence-integrez-du-deep-learning-sur-vos-smartphones/

Thank you!
Github: https://github.com/xebia-france/magritte
Twitter: @bonbonking
Email: [email protected]
