[DroidCon London] Heat the neurons of your smartphone with Deep Learning

7b5a07956eb0b62be7214d043821a987?s=47 jinqian
October 27, 2017

[DroidCon London] Heat the neurons of your smartphone with Deep Learning

7b5a07956eb0b62be7214d043821a987?s=128

jinqian

October 27, 2017
Tweet

Transcript

  1. 1.

    Heat the Neurons of Your Smartphone with Deep Learning Qian

    Jin Twitter: @bonbonking | Email: qjin@xebia.fr
  2. 5.
  3. 10.

    The ultimate goal of the on-device intelligence is to improve

    mobile devices’ ability to understand the world. 10 Image inspiration credit: Moodstocks
  4. 12.
  5. 13.
  6. 16.

    16

  7. 18.

    18

  8. 29.

    29

  9. 37.

    Deep Convolutional Neural Network & Inception Architecture (Please don’t ask

    me any question about this part because I’ve no idea what are they talking about) Credit: http://nicolovaligi.com/history-inception-deep-learning-architecture.html
  10. 40.

    Rethinking the inception architecture for computer vision, Szegedy et al.

    (2015) Credit: https://arxiv.org/pdf/1512.00567v3.pdf 40
  11. 45.

    Transfer Learning • Use a pre-trained Deep Neural Network •

    Keep all operations but the last one • Re-train only the last operation to specialize your network to your classes Keep all weights identical except these ones 45
  12. 47.

    47 Retrain a Model Source: https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/ python -m tensorflow/examples/image_retraining/retrain.py \

    --bottleneck_dir=tf_files/bottlenecks \ --how_many_training_steps=500 \ --model_dir=tf_files/models/ \ --summaries_dir=tf_files/training_summaries/ \ --output_graph=tf_files/retrained_graph.pb \ --output_labels=tf_files/retrained_labels.txt \ --image_dir=tf_files/fruit_photos
  13. 49.
  14. 50.

    Obtain the Retrained Model •2 outputs: • Model as protobuf

    file: contains a version of the selected network with a final layer retrained on your categories • Labels as text file 50 model.pb label.txt
  15. 51.

    51 public class ClassifierActivity extends CameraActivity implements OnImageAvailableListener { private

    static final int INPUT_SIZE = 224; private static final int IMAGE_MEAN = 117; private static final float IMAGE_STD = 1; private static final String INPUT_NAME = "input"; private static final String OUTPUT_NAME = "output"; private static final String MODEL_FILE = "file:///android_asset/ tensorflow_inception_graph.pb"; private static final String LABEL_FILE = "file:///android_asset/ imagenet_comp_graph_label_strings.txt"; } Source: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/android/src/org/tensorflow/demo/ClassifierActivity.java
  16. 53.

    Unsupported Operations • Only keep the operations dedicated to the

    inference step • Remove decoding, training, loss and evaluation operations 53
  17. 54.

    54 Optimize for Inference Source: https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/ python -m tensorflow/python/tools/optimize_for_inference \

    --input=tf_files/retrained_graph.pb \ --output=tf_files/optimized_graph.pb \ --input_names="input" \ --output_names="final_result"
  18. 56.

    Pre-Google I/O 2017 • Use nightly build • Library .so

    • Java API jar android { //… sourceSets { main { jniLibs.srcDirs = ['libs'] } } } 56
  19. 57.

    Post-Google I/O 2017 Source: Android Meets TensorFlow: How to Accelerate

    Your App with AI (Google I/O '17) https://www.youtube.com/watch?v=25ISTLhz0ys 57 Currently: 1.4.0-rc0
  20. 58.
  21. 63.

    ~80MB -> ~20MB 63 Weights Quantization 6.372638493746383 => 6.4 Source:

    https://www.tensorflow.org/performance/quantization
  22. 65.

    MobileNet Mobile-first computer vision models for TensorFlow 65 Image credit

    : https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet_v1.md
  23. 66.

    Inception V3 v.s. MobileNet 66 Inception V3 78% Accuracy* 85MB

    MobileNet (Largest configuration) 70.5% Accuracy* 19MB *: accuracy on ImageNet images
  24. 67.
  25. 69.

    69 Optimize for Mobile Source: https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/ python -m tensorflow/examples/image_retraining/retrain.py \

    --bottleneck_dir=tf_files/bottlenecks \ --how_many_training_steps=500 \ --model_dir=tf_files/models/ \ --summaries_dir=tf_files/training_summaries/"${ARCHITECTURE}" \ --output_graph=tf_files/retrained_graph.pb \ --output_labels=tf_files/retrained_labels.txt \ --architecture="${ARCHITECTURE}" \ —image_dir=tf_files/fruit_photos
  26. 71.
  27. 73.

    Android SDK (Java) Android NDK (C++) Classifier Implementation TensorFlow JNI

    wrapper Image (Bitmap) Trained Model top_results Classifications + Confidence input_tensor 1 2 3 4 Camera Preview Ref: https://jalammar.github.io/Supercharging-android-apps-using-tensorflow/ Overlay Display
  28. 75.

    Converts YUV420 (NV21) to ARGB8888 75 public static native void

    convertYUV420ToARGB8888( byte[] y, byte[] u, byte[] v, int[] output, int width, int height, int yRowStride, int uvRowStride, int uvPixelStride, boolean halfSize );
  29. 76.

    76 /** * Initializes a native TensorFlow session for classifying

    images. * * @param assetManager The asset manager to be used to load assets. * @param modelFilename The filepath of the model GraphDef protocol buffer. * @param labels The list of labels * @param inputSize The input size. A square image of inputSize x inputSize is assumed. * @param imageMean The assumed mean of the image values. * @param imageStd The assumed std of the image values. * @param inputName The label of the image input node. * @param outputName The label of the output node. * @throws IOException */ public static Classifier create( AssetManager assetManager, String modelFilename, List<String> labels, int inputSize, int imageMean, float imageStd, String inputName, String outputName) { }
  30. 77.

    77 @Override public List<Recognition> recognizeImage(final Bitmap bitmap) { // Preprocess

    bitmap bitmap.getPixels(intValues, 0, bitmap.getWidth(), 0, 0, bitmap.getWidth(), bitmap.getHeight()); for (int i = 0; i < intValues.length; ++i) { final int val = intValues[i]; floatValues[i * 3 + 0] = (((val >> 16) & 0xFF) - imageMean) / imageStd; floatValues[i * 3 + 1] = (((val >> 8) & 0xFF) - imageMean) / imageStd; floatValues[i * 3 + 2] = ((val & 0xFF) - imageMean) / imageStd; } // Copy the input data into TensorFlow. inferenceInterface.feed(inputName, floatValues, 1, inputSize, inputSize, 3); // Run the inference call. inferenceInterface.run(outputNames, logStats); // Copy the output Tensor back into the output array. inferenceInterface.fetch(outputName, outputs); (continue..) Preprocess Bitmap / Create Tensor
  31. 78.

    78 @Override public List<Recognition> recognizeImage(final Bitmap bitmap) { // Preprocess

    bitmap bitmap.getPixels(intValues, 0, bitmap.getWidth(), 0, 0, bitmap.getWidth(), bitmap.getHeight()); for (int i = 0; i < intValues.length; ++i) { final int val = intValues[i]; floatValues[i * 3 + 0] = (((val >> 16) & 0xFF) - imageMean) / imageStd; floatValues[i * 3 + 1] = (((val >> 8) & 0xFF) - imageMean) / imageStd; floatValues[i * 3 + 2] = ((val & 0xFF) - imageMean) / imageStd; } // Copy the input data into TensorFlow. inferenceInterface.feed(inputName, floatValues, 1, inputSize, inputSize, 3); // Run the inference call. inferenceInterface.run(outputNames, logStats); // Copy the output Tensor back into the output array. inferenceInterface.fetch(outputName, outputs); (continue..) Feed Input Data to TensorFlow
  32. 79.

    79 @Override public List<Recognition> recognizeImage(final Bitmap bitmap) { // Preprocess

    bitmap bitmap.getPixels(intValues, 0, bitmap.getWidth(), 0, 0, bitmap.getWidth(), bitmap.getHeight()); for (int i = 0; i < intValues.length; ++i) { final int val = intValues[i]; floatValues[i * 3 + 0] = (((val >> 16) & 0xFF) - imageMean) / imageStd; floatValues[i * 3 + 1] = (((val >> 8) & 0xFF) - imageMean) / imageStd; floatValues[i * 3 + 2] = ((val & 0xFF) - imageMean) / imageStd; } // Copy the input data into TensorFlow. inferenceInterface.feed(inputName, floatValues, 1, inputSize, inputSize, 3); // Run the inference call. inferenceInterface.run(outputNames, logStats); // Copy the output Tensor back into the output array. inferenceInterface.fetch(outputName, outputs); (continue..) Run the Inference Call
  33. 80.

    80 @Override public List<Recognition> recognizeImage(final Bitmap bitmap) { // Preprocess

    bitmap bitmap.getPixels(intValues, 0, bitmap.getWidth(), 0, 0, bitmap.getWidth(), bitmap.getHeight()); for (int i = 0; i < intValues.length; ++i) { final int val = intValues[i]; floatValues[i * 3 + 0] = (((val >> 16) & 0xFF) - imageMean) / imageStd; floatValues[i * 3 + 1] = (((val >> 8) & 0xFF) - imageMean) / imageStd; floatValues[i * 3 + 2] = ((val & 0xFF) - imageMean) / imageStd; } // Copy the input data into TensorFlow. inferenceInterface.feed(inputName, floatValues, 1, inputSize, inputSize, 3); // Run the inference call. inferenceInterface.run(outputNames, logStats); // Copy the output Tensor back into the output array. inferenceInterface.fetch(outputName, outputs); (continue..) Fetch the Output Tensor
  34. 81.

    81 (continue..) // Find the best classifications. PriorityQueue<Recognition> pq =

    new PriorityQueue<>( 3, (lhs, rhs) -> { // Intentionally reversed to put high confidence at the head of the queue. return Float.compare(rhs.getConfidence(), lhs.getConfidence()); }); for (int i = 0; i < outputs.length; ++i) { if (outputs[i] > THRESHOLD) { pq.add( new Recognition( "" + i, labels.size() > i ? labels.get(i) : "unknown", outputs[i], null)); } } //... return recognitions; } Find the Best Classification
  35. 83.
  36. 85.

    Model Stacking • Start from previous model to keep all

    specific operations in the graph • Specify all operations to keep when optimizing for inference graph_util.convert_variables_to_constants(sess, graph.as_graph_def(), [“final_result_fruits”, “final_result_vegetables”]
  37. 88.
  38. 90.

    TensorFlow Serving Hosts the model provides remote access to it

    90 Source: https://www.tensorflow.org/serving/
  39. 93.

    Currently with Project Magritte… Training • Model debug done by

    overheating a laptop • Model built on personal GPU • Files uploaded manually Model dispensing • API available • Deployment on AWS, currently migrating on Google Cloud 93
  40. 95.
  41. 97.

    public TensorFlowInferenceInterface(AssetManager assetManager, String model) { prepareNativeRuntime(); this.modelName = model;

    this.g = new Graph(); this.sess = new Session(g); this.runner = sess.runner(); final boolean hasAssetPrefix = model.startsWith(ASSET_FILE_PREFIX); InputStream is = null; try { String aname = hasAssetPrefix ? model.split(ASSET_FILE_PREFIX)[1] : model; is = assetManager.open(aname); } catch (IOException e) { if (hasAssetPrefix) { throw new RuntimeException("Failed to load model from '" + model + "'", e); } // Perhaps the model file is not an asset but is on disk. try { is = new FileInputStream(model); } catch (IOException e2) { throw new RuntimeException("Failed to load model from '" + model + "'", e); } } } Source: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/android/java/org/tensorflow/contrib/android/TensorFlowInferenceInterface.java
  42. 98.

    public TensorFlowInferenceInterface(AssetManager assetManager, String model) { prepareNativeRuntime(); this.modelName = model;

    this.g = new Graph(); this.sess = new Session(g); this.runner = sess.runner(); final boolean hasAssetPrefix = model.startsWith(ASSET_FILE_PREFIX); InputStream is = null; try { String aname = hasAssetPrefix ? model.split(ASSET_FILE_PREFIX)[1] : model; is = assetManager.open(aname); } catch (IOException e) { if (hasAssetPrefix) { throw new RuntimeException("Failed to load model from '" + model + "'", e); } // Perhaps the model file is not an asset but is on disk. try { is = new FileInputStream(model); } catch (IOException e2) { throw new RuntimeException("Failed to load model from '" + model + "'", e); } } } Source: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/android/java/org/tensorflow/contrib/android/TensorFlowInferenceInterface.java
  43. 99.

    public TensorFlowInferenceInterface(AssetManager assetManager, String model) { prepareNativeRuntime(); this.modelName = model;

    this.g = new Graph(); this.sess = new Session(g); this.runner = sess.runner(); final boolean hasAssetPrefix = model.startsWith(ASSET_FILE_PREFIX); InputStream is = null; try { String aname = hasAssetPrefix ? model.split(ASSET_FILE_PREFIX)[1] : model; is = assetManager.open(aname); } catch (IOException e) { if (hasAssetPrefix) { throw new RuntimeException("Failed to load model from '" + model + "'", e); } // Perhaps the model file is not an asset but is on disk. try { is = new FileInputStream(model); } catch (IOException e2) { throw new RuntimeException("Failed to load model from '" + model + "'", e); } } } Source: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/android/java/org/tensorflow/contrib/android/TensorFlowInferenceInterface.java
  44. 104.
  45. 105.
  46. 106.
  47. 108.

    Source: Android Meets TensorFlow: How to Accelerate Your App with

    AI (Google I/O '17) https://www.youtube.com/watch?v=25ISTLhz0ys
  48. 109.

    Source: Android Meets TensorFlow: How to Accelerate Your App with

    AI (Google I/O '17) https://www.youtube.com/watch?v=25ISTLhz0ys
  49. 110.

    Federate Learning Collaborative Machine Learning without Centralized Training Data 110

    Source: https://research.googleblog.com/2017/04/federated-learning-collaborative.html
  50. 111.
  51. 112.

    Resources • Artificial neural network: https://en.wikipedia.org/wiki/Artificial_neural_network • Deep Learning: https://en.wikipedia.org/wiki/Deep_learning

    • Convolutional Neural Network: https://en.wikipedia.org/wiki/Convolutional_neural_network • TensorFlow for Poets: https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/ • TensorFlow for Poets 2: Optimize for Mobile: https://codelabs.developers.google.com/ codelabs/tensorflow-for-poets-2/ • TensorFlow Glossary: https://www.tensorflow.org/versions/r0.12/resources/glossary • Magritte project blog: http://blog.xebia.fr/2017/07/24/on-device-intelligence-integrez-du- deep-learning-sur-vos-smartphones/ 112