Upgrade to Pro — share decks privately, control downloads, hide ads and more …

[DroidCon London] Heat the neurons of your smar...

jinqian
October 27, 2017

[DroidCon London] Heat the neurons of your smartphone with Deep Learning

jinqian

October 27, 2017
Tweet

More Decks by jinqian

Other Decks in Technology

Transcript

  1. Heat the Neurons of Your Smartphone with Deep Learning Qian

    Jin Twitter: @bonbonking | Email: qjin@xebia.fr
  2. The ultimate goal of the on-device intelligence is to improve

    mobile devices’ ability to understand the world. 10 Image inspiration credit: Moodstocks
  3. 16

  4. 18

  5. 29

  6. Deep Convolutional Neural Network & Inception Architecture (Please don’t ask

    me any question about this part because I’ve no idea what are they talking about) Credit: http://nicolovaligi.com/history-inception-deep-learning-architecture.html
  7. Rethinking the inception architecture for computer vision, Szegedy et al.

    (2015) Credit: https://arxiv.org/pdf/1512.00567v3.pdf 40
  8. Transfer Learning • Use a pre-trained Deep Neural Network •

    Keep all operations but the last one • Re-train only the last operation to specialize your network to your classes Keep all weights identical except these ones 45
  9. 47 Retrain a Model Source: https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/ python -m tensorflow/examples/image_retraining/retrain.py \

    --bottleneck_dir=tf_files/bottlenecks \ --how_many_training_steps=500 \ --model_dir=tf_files/models/ \ --summaries_dir=tf_files/training_summaries/ \ --output_graph=tf_files/retrained_graph.pb \ --output_labels=tf_files/retrained_labels.txt \ --image_dir=tf_files/fruit_photos
  10. Obtain the Retrained Model •2 outputs: • Model as protobuf

    file: contains a version of the selected network with a final layer retrained on your categories • Labels as text file 50 model.pb label.txt
  11. 51 public class ClassifierActivity extends CameraActivity implements OnImageAvailableListener { private

    static final int INPUT_SIZE = 224; private static final int IMAGE_MEAN = 117; private static final float IMAGE_STD = 1; private static final String INPUT_NAME = "input"; private static final String OUTPUT_NAME = "output"; private static final String MODEL_FILE = "file:///android_asset/ tensorflow_inception_graph.pb"; private static final String LABEL_FILE = "file:///android_asset/ imagenet_comp_graph_label_strings.txt"; } Source: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/android/src/org/tensorflow/demo/ClassifierActivity.java
  12. Unsupported Operations • Only keep the operations dedicated to the

    inference step • Remove decoding, training, loss and evaluation operations 53
  13. 54 Optimize for Inference Source: https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/ python -m tensorflow/python/tools/optimize_for_inference \

    --input=tf_files/retrained_graph.pb \ --output=tf_files/optimized_graph.pb \ --input_names="input" \ --output_names="final_result"
  14. Pre-Google I/O 2017 • Use nightly build • Library .so

    • Java API jar android { //… sourceSets { main { jniLibs.srcDirs = ['libs'] } } } 56
  15. Post-Google I/O 2017 Source: Android Meets TensorFlow: How to Accelerate

    Your App with AI (Google I/O '17) https://www.youtube.com/watch?v=25ISTLhz0ys 57 Currently: 1.4.0-rc0
  16. ~80MB -> ~20MB 63 Weights Quantization 6.372638493746383 => 6.4 Source:

    https://www.tensorflow.org/performance/quantization
  17. MobileNet Mobile-first computer vision models for TensorFlow 65 Image credit

    : https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet_v1.md
  18. Inception V3 v.s. MobileNet 66 Inception V3 78% Accuracy* 85MB

    MobileNet (Largest configuration) 70.5% Accuracy* 19MB *: accuracy on ImageNet images
  19. 69 Optimize for Mobile Source: https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/ python -m tensorflow/examples/image_retraining/retrain.py \

    --bottleneck_dir=tf_files/bottlenecks \ --how_many_training_steps=500 \ --model_dir=tf_files/models/ \ --summaries_dir=tf_files/training_summaries/"${ARCHITECTURE}" \ --output_graph=tf_files/retrained_graph.pb \ --output_labels=tf_files/retrained_labels.txt \ --architecture="${ARCHITECTURE}" \ —image_dir=tf_files/fruit_photos
  20. Android SDK (Java) Android NDK (C++) Classifier Implementation TensorFlow JNI

    wrapper Image (Bitmap) Trained Model top_results Classifications + Confidence input_tensor 1 2 3 4 Camera Preview Ref: https://jalammar.github.io/Supercharging-android-apps-using-tensorflow/ Overlay Display
  21. Converts YUV420 (NV21) to ARGB8888 75 public static native void

    convertYUV420ToARGB8888( byte[] y, byte[] u, byte[] v, int[] output, int width, int height, int yRowStride, int uvRowStride, int uvPixelStride, boolean halfSize );
  22. 76 /** * Initializes a native TensorFlow session for classifying

    images. * * @param assetManager The asset manager to be used to load assets. * @param modelFilename The filepath of the model GraphDef protocol buffer. * @param labels The list of labels * @param inputSize The input size. A square image of inputSize x inputSize is assumed. * @param imageMean The assumed mean of the image values. * @param imageStd The assumed std of the image values. * @param inputName The label of the image input node. * @param outputName The label of the output node. * @throws IOException */ public static Classifier create( AssetManager assetManager, String modelFilename, List<String> labels, int inputSize, int imageMean, float imageStd, String inputName, String outputName) { }
  23. 77 @Override public List<Recognition> recognizeImage(final Bitmap bitmap) { // Preprocess

    bitmap bitmap.getPixels(intValues, 0, bitmap.getWidth(), 0, 0, bitmap.getWidth(), bitmap.getHeight()); for (int i = 0; i < intValues.length; ++i) { final int val = intValues[i]; floatValues[i * 3 + 0] = (((val >> 16) & 0xFF) - imageMean) / imageStd; floatValues[i * 3 + 1] = (((val >> 8) & 0xFF) - imageMean) / imageStd; floatValues[i * 3 + 2] = ((val & 0xFF) - imageMean) / imageStd; } // Copy the input data into TensorFlow. inferenceInterface.feed(inputName, floatValues, 1, inputSize, inputSize, 3); // Run the inference call. inferenceInterface.run(outputNames, logStats); // Copy the output Tensor back into the output array. inferenceInterface.fetch(outputName, outputs); (continue..) Preprocess Bitmap / Create Tensor
  24. 78 @Override public List<Recognition> recognizeImage(final Bitmap bitmap) { // Preprocess

    bitmap bitmap.getPixels(intValues, 0, bitmap.getWidth(), 0, 0, bitmap.getWidth(), bitmap.getHeight()); for (int i = 0; i < intValues.length; ++i) { final int val = intValues[i]; floatValues[i * 3 + 0] = (((val >> 16) & 0xFF) - imageMean) / imageStd; floatValues[i * 3 + 1] = (((val >> 8) & 0xFF) - imageMean) / imageStd; floatValues[i * 3 + 2] = ((val & 0xFF) - imageMean) / imageStd; } // Copy the input data into TensorFlow. inferenceInterface.feed(inputName, floatValues, 1, inputSize, inputSize, 3); // Run the inference call. inferenceInterface.run(outputNames, logStats); // Copy the output Tensor back into the output array. inferenceInterface.fetch(outputName, outputs); (continue..) Feed Input Data to TensorFlow
  25. 79 @Override public List<Recognition> recognizeImage(final Bitmap bitmap) { // Preprocess

    bitmap bitmap.getPixels(intValues, 0, bitmap.getWidth(), 0, 0, bitmap.getWidth(), bitmap.getHeight()); for (int i = 0; i < intValues.length; ++i) { final int val = intValues[i]; floatValues[i * 3 + 0] = (((val >> 16) & 0xFF) - imageMean) / imageStd; floatValues[i * 3 + 1] = (((val >> 8) & 0xFF) - imageMean) / imageStd; floatValues[i * 3 + 2] = ((val & 0xFF) - imageMean) / imageStd; } // Copy the input data into TensorFlow. inferenceInterface.feed(inputName, floatValues, 1, inputSize, inputSize, 3); // Run the inference call. inferenceInterface.run(outputNames, logStats); // Copy the output Tensor back into the output array. inferenceInterface.fetch(outputName, outputs); (continue..) Run the Inference Call
  26. 80 @Override public List<Recognition> recognizeImage(final Bitmap bitmap) { // Preprocess

    bitmap bitmap.getPixels(intValues, 0, bitmap.getWidth(), 0, 0, bitmap.getWidth(), bitmap.getHeight()); for (int i = 0; i < intValues.length; ++i) { final int val = intValues[i]; floatValues[i * 3 + 0] = (((val >> 16) & 0xFF) - imageMean) / imageStd; floatValues[i * 3 + 1] = (((val >> 8) & 0xFF) - imageMean) / imageStd; floatValues[i * 3 + 2] = ((val & 0xFF) - imageMean) / imageStd; } // Copy the input data into TensorFlow. inferenceInterface.feed(inputName, floatValues, 1, inputSize, inputSize, 3); // Run the inference call. inferenceInterface.run(outputNames, logStats); // Copy the output Tensor back into the output array. inferenceInterface.fetch(outputName, outputs); (continue..) Fetch the Output Tensor
  27. 81 (continue..) // Find the best classifications. PriorityQueue<Recognition> pq =

    new PriorityQueue<>( 3, (lhs, rhs) -> { // Intentionally reversed to put high confidence at the head of the queue. return Float.compare(rhs.getConfidence(), lhs.getConfidence()); }); for (int i = 0; i < outputs.length; ++i) { if (outputs[i] > THRESHOLD) { pq.add( new Recognition( "" + i, labels.size() > i ? labels.get(i) : "unknown", outputs[i], null)); } } //... return recognitions; } Find the Best Classification
  28. Model Stacking • Start from previous model to keep all

    specific operations in the graph • Specify all operations to keep when optimizing for inference graph_util.convert_variables_to_constants(sess, graph.as_graph_def(), [“final_result_fruits”, “final_result_vegetables”]
  29. TensorFlow Serving Hosts the model provides remote access to it

    90 Source: https://www.tensorflow.org/serving/
  30. Currently with Project Magritte… Training • Model debug done by

    overheating a laptop • Model built on personal GPU • Files uploaded manually Model dispensing • API available • Deployment on AWS, currently migrating on Google Cloud 93
  31. public TensorFlowInferenceInterface(AssetManager assetManager, String model) { prepareNativeRuntime(); this.modelName = model;

    this.g = new Graph(); this.sess = new Session(g); this.runner = sess.runner(); final boolean hasAssetPrefix = model.startsWith(ASSET_FILE_PREFIX); InputStream is = null; try { String aname = hasAssetPrefix ? model.split(ASSET_FILE_PREFIX)[1] : model; is = assetManager.open(aname); } catch (IOException e) { if (hasAssetPrefix) { throw new RuntimeException("Failed to load model from '" + model + "'", e); } // Perhaps the model file is not an asset but is on disk. try { is = new FileInputStream(model); } catch (IOException e2) { throw new RuntimeException("Failed to load model from '" + model + "'", e); } } } Source: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/android/java/org/tensorflow/contrib/android/TensorFlowInferenceInterface.java
  32. public TensorFlowInferenceInterface(AssetManager assetManager, String model) { prepareNativeRuntime(); this.modelName = model;

    this.g = new Graph(); this.sess = new Session(g); this.runner = sess.runner(); final boolean hasAssetPrefix = model.startsWith(ASSET_FILE_PREFIX); InputStream is = null; try { String aname = hasAssetPrefix ? model.split(ASSET_FILE_PREFIX)[1] : model; is = assetManager.open(aname); } catch (IOException e) { if (hasAssetPrefix) { throw new RuntimeException("Failed to load model from '" + model + "'", e); } // Perhaps the model file is not an asset but is on disk. try { is = new FileInputStream(model); } catch (IOException e2) { throw new RuntimeException("Failed to load model from '" + model + "'", e); } } } Source: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/android/java/org/tensorflow/contrib/android/TensorFlowInferenceInterface.java
  33. public TensorFlowInferenceInterface(AssetManager assetManager, String model) { prepareNativeRuntime(); this.modelName = model;

    this.g = new Graph(); this.sess = new Session(g); this.runner = sess.runner(); final boolean hasAssetPrefix = model.startsWith(ASSET_FILE_PREFIX); InputStream is = null; try { String aname = hasAssetPrefix ? model.split(ASSET_FILE_PREFIX)[1] : model; is = assetManager.open(aname); } catch (IOException e) { if (hasAssetPrefix) { throw new RuntimeException("Failed to load model from '" + model + "'", e); } // Perhaps the model file is not an asset but is on disk. try { is = new FileInputStream(model); } catch (IOException e2) { throw new RuntimeException("Failed to load model from '" + model + "'", e); } } } Source: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/android/java/org/tensorflow/contrib/android/TensorFlowInferenceInterface.java
  34. Source: Android Meets TensorFlow: How to Accelerate Your App with

    AI (Google I/O '17) https://www.youtube.com/watch?v=25ISTLhz0ys
  35. Source: Android Meets TensorFlow: How to Accelerate Your App with

    AI (Google I/O '17) https://www.youtube.com/watch?v=25ISTLhz0ys
  36. Federate Learning Collaborative Machine Learning without Centralized Training Data 110

    Source: https://research.googleblog.com/2017/04/federated-learning-collaborative.html
  37. Resources • Artificial neural network: https://en.wikipedia.org/wiki/Artificial_neural_network • Deep Learning: https://en.wikipedia.org/wiki/Deep_learning

    • Convolutional Neural Network: https://en.wikipedia.org/wiki/Convolutional_neural_network • TensorFlow for Poets: https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/ • TensorFlow for Poets 2: Optimize for Mobile: https://codelabs.developers.google.com/ codelabs/tensorflow-for-poets-2/ • TensorFlow Glossary: https://www.tensorflow.org/versions/r0.12/resources/glossary • Magritte project blog: http://blog.xebia.fr/2017/07/24/on-device-intelligence-integrez-du- deep-learning-sur-vos-smartphones/ 112