Warm up your smartphone's neurons with on-device deep learning

Qian Jin, aka "SpeedRabbit" (@bonbonking)
Yoann Benoit, aka "TensorMan" (@YoannBENOIT)
Sylvain Lequeux, aka "TeddyBière" (@slequeux)

On-Device (Machine) Intelligence

Learned Projection Model
Android Wear 2.0 Smart Reply
Source: https://research.googleblog.com/2017/02/on-device-machine-intelligence.html

Source: Qualcomm

The ultimate goal of on-device intelligence is to improve mobile devices’ ability to understand the world.

#datamobile Chat History of the Slack channel

Magritte: "Ceci n’est pas une pomme" ("This is not an apple")

Build the TensorFlow Android example with Bazel

Android developer, deep learning noob

NEURONS NEURONS EVERYWHERE

WE CAN RECOGNIZE ALL THE THINGS!

I THOUGHT THERE WERE MODELS FOR EVERYTHING...

Neural networks in a nutshell

Apple: 0.98 Banana: 0.02

Training a model

Apple: 0.34 Banana: 0.66 Prediction error
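The two score slides above can be read as class probabilities, and the "prediction error" as a loss that training tries to reduce. Here is a minimal sketch, assuming a softmax output and a cross-entropy loss (a common choice, not confirmed by the slides) and assuming the image on the training slide is really an apple. The function names are hypothetical.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, true_index):
    """Prediction error: negative log-probability given to the true class."""
    return -math.log(probs[true_index])

labels = ["Apple", "Banana"]
good = [0.98, 0.02]  # probabilities from the first slide
bad = [0.34, 0.66]   # probabilities from the training slide (true class assumed to be Apple)

loss_good = cross_entropy(good, labels.index("Apple"))
loss_bad = cross_entropy(bad, labels.index("Apple"))
```

The loss is close to zero when the true class receives most of the probability and grows as the model leans toward the wrong class, which is the signal used to adjust the weights.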

Transfer learning

• Use a pre-trained Deep Neural Network
• Keep all operations but the last one
• Keep all weights identical except these ones
• Re-train only the last operation to specialize your network to your classes

Creating the final step in transfer learning

    import tensorflow as tf

    # bottleneck_tensor, bottleneck_size, keep_prob and class_count are not defined on this slide.
    with tf.name_scope("input"):
        bottleneck_input = tf.placeholder_with_default(
            bottleneck_tensor, shape=[None, bottleneck_size])

    # Intermediate hidden layer (500 units) with dropout.
    with tf.name_scope("intermediate_training_ops"):
        layer_weights = tf.Variable(
            tf.truncated_normal([bottleneck_size, 500], stddev=0.001))
        layer_biases = tf.Variable(tf.zeros([500]))
        hidden = tf.nn.relu(tf.matmul(bottleneck_input, layer_weights) + layer_biases)
        dropout = tf.nn.dropout(hidden, keep_prob=keep_prob)

    # New final layer: one output per class, followed by a softmax.
    with tf.name_scope("final_training_ops"):
        layer_weights = tf.Variable(
            tf.truncated_normal([500, class_count], stddev=0.001))
        layer_biases = tf.Variable(tf.zeros([class_count]))
        logits = tf.matmul(dropout, layer_weights) + layer_biases
        final_tensor = tf.nn.softmax(logits)

Two things to save
• Execution graph
• Weights for each operation

Two outputs
• Model as protobuf file (model.pb)
• Labels in text files (label.txt)

    with gfile.FastGFile(FLAGS.output_graph_path, 'wb') as f:
        f.write(output_graph_def.SerializeToString())

java.lang.UnsupportedOperationException: Op BatchNormWithGlobalNormalization is not available in GraphDef version 21.

Unsupported operation
• Only keep the operations dedicated to the inference step
• Remove decoding, training, loss and evaluation operations
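Keeping only the inference operations amounts to walking the graph backwards from the output and dropping everything that is not reached. The following is a minimal sketch on a toy graph, not the TensorFlow tooling itself. The node names and the `keep_inference_nodes` helper are hypothetical.

```python
def keep_inference_nodes(graph, output_names):
    """Return the sub-graph reachable backwards from the outputs.

    `graph` maps a node name to the list of its input node names.
    """
    kept = {}
    stack = list(output_names)
    while stack:
        name = stack.pop()
        if name in kept:
            continue  # already visited
        kept[name] = graph[name]
        stack.extend(graph[name])
    return kept

# Hypothetical toy graph: node name -> input node names.
toy_graph = {
    "input": [],
    "decode_jpeg": [],
    "bottleneck": ["input"],
    "final_result": ["bottleneck"],
    "labels": [],
    "cross_entropy": ["final_result", "labels"],
    "train_step": ["cross_entropy"],
    "accuracy": ["final_result", "labels"],
}

inference_graph = keep_inference_nodes(toy_graph, ["final_result"])
```

In this toy example only `input`, `bottleneck` and `final_result` survive. The decoding, loss, training and evaluation nodes are never visited, so they are left out of the exported graph.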

Data scientist, Android noob

CLICK 7 TIMES ON BUILD NUMBER

Build standalone app

Pre-Google I/O: use nightly build
• Library.so
• Java API Jar

    android {
        //…
        sourceSets {
            main {
                jniLibs.srcDirs = ['libs']
            }
        }
    }

Post-Google I/O: 1.4.0-rc0

App size: ~80 MB

Reducing model size

WHO CARES? MODEL SIZE

All weights are stored as they are (32-bit floats) => 80 MB

Weights quantization
6.372638493746383 => 6.4
80 MB => 20 MB
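The slide illustrates quantization by rounding a weight to fewer digits. One common way to do this, sketched below, is to store each weight as a one-byte index into the range between the smallest and largest weight. This is only an illustration under that assumption and not necessarily what the TensorFlow tool does internally. The `quantize` and `dequantize` names are hypothetical.

```python
def quantize(weights, levels=256):
    """Map each float to one of `levels` evenly spaced buckets (256 fits in one byte)."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / (levels - 1) or 1.0  # fall back to 1.0 if all weights are equal
    codes = [round((w - lo) / scale) for w in weights]
    return codes, lo, scale

def dequantize(codes, lo, scale):
    """Recover approximate float weights from their bucket indices."""
    return [lo + c * scale for c in codes]

weights = [6.372638493746383, -1.25, 0.0, 3.5]
codes, lo, scale = quantize(weights)
restored = dequantize(codes, lo, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Each restored weight differs from the original by at most half a bucket width (`scale / 2`), which is the precision traded for the smaller file.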

Mobilenets

MobileNets: Mobile-First Computer Vision Models for TensorFlow

80 MB => 20 MB => 1~5 MB
Image credit: Google Research Blog

Image credit: https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet_v1.md

Magritte Android app architecture

[Diagram, steps 1 to 4: Camera Preview -> Image (Bitmap) -> Classifier Implementation (Android SDK, Java) -> input_tensor -> TensorFlow JNI wrapper (Android NDK, C++) with Trained Model -> top_results -> Classifications + Confidence -> Overlay Display]
Ref: https://jalammar.github.io/Supercharging-android-apps-using-tensorflow/

Image sampling on Android device
• Get Image from Camera Preview
• Crop the center square
• Resize
• Sample Image
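The crop and resize steps above can be sketched on a plain 2D grid of pixel values. This is a minimal illustration, assuming nearest-neighbor resizing, which the slides do not specify. The helper names and the tiny 4x6 frame are hypothetical.

```python
def center_crop_square(image):
    """Crop the largest centered square from a row-major 2D pixel grid."""
    height, width = len(image), len(image[0])
    side = min(height, width)
    top = (height - side) // 2
    left = (width - side) // 2
    return [row[left:left + side] for row in image[top:top + side]]

def resize_nearest(image, size):
    """Resize a square grid to size x size using nearest-neighbor sampling."""
    side = len(image)
    return [
        [image[y * side // size][x * side // size] for x in range(size)]
        for y in range(size)
    ]

# Hypothetical 4x6 preview frame whose pixel values encode their position.
frame = [[10 * y + x for x in range(6)] for y in range(4)]
square = center_crop_square(frame)  # 4x4 square taken from columns 1 to 4
sample = resize_nearest(square, 2)  # 2x2 grid standing in for the model input size
```

On a real device the same two steps turn the rectangular camera preview into the fixed square input the model expects.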

Converts YUV420 to ARGB8888

    public static native void convertYUV420ToARGB8888(
        byte[] y,
        byte[] u,
        byte[] v,
        int[] output,
        int width,
        int height,
        int yRowStride,
        int uvRowStride,
        int uvPixelStride,
        boolean halfSize);
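The native method above is only declared on the slide. As a rough sketch of what such a conversion does per pixel, the block below applies the standard BT.601 video-range formula and packs the result as ARGB8888. The actual native implementation may use different (for example integer) arithmetic, and the helper names are hypothetical.

```python
def clamp(value):
    """Round and clamp a channel value to the 0..255 range."""
    return max(0, min(255, int(round(value))))

def yuv_to_argb(y, u, v):
    """Convert one YUV pixel (each component 0..255) to a packed ARGB8888 integer."""
    c, d, e = y - 16, u - 128, v - 128
    r = clamp(1.164 * c + 1.596 * e)
    g = clamp(1.164 * c - 0.392 * d - 0.813 * e)
    b = clamp(1.164 * c + 2.017 * d)
    return (0xFF << 24) | (r << 16) | (g << 8) | b

def chroma_index(row, col, uv_row_stride, uv_pixel_stride):
    """In YUV420, one U/V sample is shared by each 2x2 block of pixels."""
    return (row // 2) * uv_row_stride + (col // 2) * uv_pixel_stride

black = yuv_to_argb(16, 128, 128)
white = yuv_to_argb(235, 128, 128)
```

The stride parameters in the Java signature exist because the U and V planes are subsampled and may be interleaved, so each pixel has to look up its chroma values at a computed offset rather than at its own index.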

Steps of Recognizing Image

    @Override
    public List<Recognition> recognizeImage(final Bitmap bitmap) {
        // Preprocess bitmap
        bitmap.getPixels(intValues, 0, bitmap.getWidth(), 0, 0, bitmap.getWidth(), bitmap.getHeight());
        for (int i = 0; i < intValues.length; ++i) {
            final int val = intValues[i];
            floatValues[i * 3 + 0] = (((val >> 16) & 0xFF) - imageMean) / imageStd;
            floatValues[i * 3 + 1] = (((val >> 8) & 0xFF) - imageMean) / imageStd;
            floatValues[i * 3 + 2] = ((val & 0xFF) - imageMean) / imageStd;
        }

        // Copy the input data into TensorFlow.
        inferenceInterface.feed(inputName, floatValues, 1, inputSize, inputSize, 3);

        // Run the inference call.
        inferenceInterface.run(outputNames, logStats);

        // Copy the output Tensor back into the output array.
        inferenceInterface.fetch(outputName, outputs);

        // Find the best classifications.
        PriorityQueue<Recognition> pq =
            new PriorityQueue<>(
                3,
                (lhs, rhs) -> {
                    // Intentionally reversed to put high confidence at the head of the queue.
                    return Float.compare(rhs.getConfidence(), lhs.getConfidence());
                });
        for (int i = 0; i < outputs.length; ++i) {
            if (outputs[i] > THRESHOLD) {
                pq.add(
                    new Recognition(
                        "" + i, labels.size() > i ? labels.get(i) : "unknown", outputs[i], null));
            }
        }
        //...
        return recognitions;
    }
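The preprocessing and ranking steps of `recognizeImage` can be tried outside Android. The sketch below mirrors them in plain Python: unpack an ARGB pixel into normalized channels, then keep the best-scoring labels above a threshold. The mean, standard deviation, threshold and label values are assumptions chosen for illustration, and `heapq.nlargest` stands in for the Java priority queue.

```python
import heapq

def normalize_pixel(argb, image_mean, image_std):
    """Unpack one ARGB int into three normalized float channels (R, G, B)."""
    channels = ((argb >> 16) & 0xFF, (argb >> 8) & 0xFF, argb & 0xFF)
    return [(c - image_mean) / image_std for c in channels]

def top_results(outputs, labels, threshold, max_results=3):
    """Return up to max_results (label, confidence) pairs above threshold, best first."""
    candidates = [
        (labels[i] if i < len(labels) else "unknown", score)
        for i, score in enumerate(outputs)
        if score > threshold
    ]
    return heapq.nlargest(max_results, candidates, key=lambda pair: pair[1])

# Assumed values: mean and std of 128, a 0.1 threshold, three made-up labels.
pixel = normalize_pixel(0xFF80FF00, image_mean=128, image_std=128)
best = top_results([0.05, 0.7, 0.25], ["apple", "banana", "cherry"], threshold=0.1)
```

With these inputs the pixel becomes three floats between -1 and 1, and the ranking keeps banana and cherry while dropping the apple score that falls below the threshold.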

Adding new models

Adding a new model
2 x 20 MB = 40 MB

DRY: who’s a craftsman?

Model stacking
• Start from the previous model to keep all specific operations in the graph
• Specify all operations to keep when optimizing for inference

    graph_util.convert_variables_to_constants(
        sess,
        graph.as_graph_def(),
        ["final_result_fruits", "final_result_vegetables"])

Android Makers Paris 2017

To cloud and beyond - Training
model.pb, label.txt

To cloud and beyond - Serving
model.pb, label.txt

As of today...

Serving
• API available
• Deployment on AWS, currently migrating to Google Cloud

Training
• Model debug done by overheating a laptop
• Model built on personal GPU
• Files uploaded manually

Android App evolves!

[Diagram: model.pb, label.txt -> Android FilesDir (model.pb, Labels)]

    public TensorFlowInferenceInterface(AssetManager var1, String var2) {
        //...
        this.modelName = var2;
        this.g = new Graph();
        this.sess = new Session(this.g);
        this.runner = this.sess.runner();
        // A "file:///android_asset/" prefix marks a model bundled in the APK assets.
        boolean var3 = var2.startsWith("file:///android_asset/");
        Object var4 = null;

        try {
            String var5 = var3 ? var2.split("file:///android_asset/")[1] : var2;
            var4 = var1.open(var5);
        } catch (IOException var11) {
            if (var3) {
                throw new RuntimeException("Failed to load model from '" + var2 + "'", var11);
            }

            // Without the asset prefix, fall back to a plain file on disk.
            try {
                var4 = new FileInputStream(var2);
            } catch (IOException var8) {
                throw new RuntimeException("Failed to load model from '" + var2 + "'", var11);
            }
        }
    }
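The constructor above is what lets the app load a model from its files directory instead of the bundled assets: it tries the assets first and falls back to a plain file path. The lookup order can be sketched as follows, with a dict standing in for the `AssetManager` and a callable standing in for `FileInputStream`. All names and paths here are hypothetical.

```python
ASSET_PREFIX = "file:///android_asset/"

def open_model(path, assets, read_file):
    """Try the bundled assets first, then fall back to a plain file path."""
    has_prefix = path.startswith(ASSET_PREFIX)
    name = path[len(ASSET_PREFIX):] if has_prefix else path
    if name in assets:
        return assets[name]
    if has_prefix:
        # An explicit asset path that is missing is an error, with no fallback.
        raise RuntimeError("Failed to load model from '%s'" % path)
    try:
        return read_file(path)
    except OSError as error:
        raise RuntimeError("Failed to load model from '%s'" % path) from error

# Hypothetical stand-ins for the APK assets and the device file system.
assets = {"model.pb": b"bundled"}
files = {"/data/files/model.pb": b"downloaded"}

def read_file(path):
    if path not in files:
        raise FileNotFoundError(path)
    return files[path]

bundled = open_model("file:///android_asset/model.pb", assets, read_file)
downloaded = open_model("/data/files/model.pb", assets, read_file)
```

A path without the asset prefix that is not found among the assets is read from disk, which is the branch a model stored in the files directory would take.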

Model: Inception V3 (optimized)

Model: MobileNets 1.0

Demo time

Next Steps

Federated Learning: Collaborative Machine Learning without Centralized Training Data

Thanks! Merci! 谢谢! xebia-france/magritte
