Warm up your smartphone's neurons with on-device deep learning

jinqian
October 20, 2017

Deep Learning is talked about everywhere these days: image recognition, sound recognition, text generation, and so on. The announcements of the Android Neural Networks API and TensorFlow Lite, and the release of Apple's Core ML framework, all confirm the trend toward more on-device intelligence. Yet while the techniques and frameworks are becoming widely accessible, concrete applications remain hard to find in companies, and harder still in mobile apps. We therefore decided to build a proof of concept to take on the challenges of the field. Through an educational mobile application that uses Deep Learning for object recognition, we will look at the impact of this kind of model on smartphones, at an architecture for training and deploying models in the cloud, and at building the mobile application with the latest announced features.

Transcript

  1. Warm up your smartphone's neurons with on-device deep learning. Qian Jin, aka "SpeedRabbit" (@bonbonking); Yoann Benoit, aka "TensorMan" (@YoannBENOIT); Sylvain Lequeux, aka "TeddyBière" (@slequeux)
  2. On-Device (Machine) Intelligence

  3. Learned Projection Model Android Wear 2.0 Smart Reply Source: https://research.googleblog.com/2017/02/on-device-machine-intelligence.html

  4. None
  5. None
  6. Source: Qualcomm

  7. Source: Qualcomm

  8. The ultimate goal of on-device intelligence is to improve mobile devices' ability to understand the world.
  9. #datamobile: chat history of the Slack channel

  10. None

  11. None
  12. Magritte: "Ceci n'est pas une pomme" ("This is not an apple")

  13. None

  14. Build TensorFlow android example with Bazel

  15. None

  16. Android developer, Deep Learning noob

  17. NEURONS NEURONS EVERYWHERE

  18. WE CAN RECOGNIZE ALL THE THINGS!

  19. I THOUGHT THERE WERE MODELS FOR EVERYTHING...

  20. Neural networks in a nutshell

  21. None
  22. None
  23. None
  24. Apple: 0.98 Banana: 0.02
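
The class scores on this slide (Apple: 0.98, Banana: 0.02) have the shape of a softmax output, the operation that also ends the retrained network shown later in the deck (`tf.nn.softmax`). A minimal sketch in plain Python, assuming made-up raw scores; the `softmax` helper and the input values are illustrative and do not come from the talk:

```python
import math

def softmax(logits):
    """Turn raw class scores into probabilities that sum to 1."""
    # Subtracting the max keeps exp() numerically stable without changing the result.
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores (logits) for the two classes.
labels = ["Apple", "Banana"]
probs = softmax([4.0, 0.1])
for label, p in zip(labels, probs):
    print(f"{label}: {p:.2f}")
```

A larger gap between the raw scores pushes the winning probability closer to 1, which is why a confident network prints values such as 0.98.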

  25. Training a model

  26. None

  27. None
  28. Apple: 0.34 Banana: 0.66

  29. Apple: 0.34 Banana: 0.66 Prediction error

  30. Apple: 0.34 Banana: 0.66 Prediction error

  31. Apple: 0.34 Banana: 0.66 Prediction error

  32. Apple: 0.27 Banana: 0.73

  33. Transfer learning

  34. None
  35. Use a pre-trained Deep Neural Network
        • Keep all operations but the last one
        • Re-train only the last operation to specialize your network to your classes
        • Keep all weights identical except these ones
  36. Creating the final step in transfer learning

        # TensorFlow 1.x graph code; assumes `import tensorflow as tf`.
        with tf.name_scope("input"):
            bottleneck_input = tf.placeholder_with_default(
                bottleneck_tensor, shape=[None, bottleneck_size])

        # Hidden layer of 500 units with dropout on top of the bottleneck features.
        with tf.name_scope("intermediate_training_ops"):
            layer_weights = tf.Variable(
                tf.truncated_normal([bottleneck_size, 500], stddev=0.001))
            layer_biases = tf.Variable(tf.zeros([500]))
            hidden = tf.nn.relu(tf.matmul(bottleneck_input, layer_weights) + layer_biases)
            dropout = tf.nn.dropout(hidden, keep_prob=keep_prob)

        # Final layer: one logit per class, turned into probabilities by softmax.
        with tf.name_scope("final_training_ops"):
            layer_weights = tf.Variable(
                tf.truncated_normal([500, class_count], stddev=0.001))
            layer_biases = tf.Variable(tf.zeros([class_count]))
            logits = tf.matmul(dropout, layer_weights) + layer_biases
            final_tensor = tf.nn.softmax(logits)
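
The idea behind slides 35 and 36 (freeze the pre-trained network, retrain only the last operation) can be sketched without TensorFlow. The example below is a toy stand-in, not the authors' code: `FROZEN_WEIGHTS`, `bottleneck`, `train_final_layer`, `predict` and the two training samples are all hypothetical, and the "pre-trained network" is reduced to a single fixed ReLU layer.

```python
import math

def softmax(logits):
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Frozen "pre-trained" layer: these weights are never updated during training.
FROZEN_WEIGHTS = [[0.5, -0.2], [0.1, 0.8], [-0.3, 0.4]]  # 3 features from 2 inputs

def bottleneck(x):
    """Fixed feature extractor standing in for the pre-trained network."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in FROZEN_WEIGHTS]

def train_final_layer(samples, class_count, lr=0.5, epochs=200):
    """Fit only the last (softmax) layer on top of the frozen features."""
    size = len(FROZEN_WEIGHTS)
    weights = [[0.0] * class_count for _ in range(size)]
    biases = [0.0] * class_count
    for _ in range(epochs):
        for x, target in samples:
            feats = bottleneck(x)
            logits = [sum(feats[i] * weights[i][c] for i in range(size)) + biases[c]
                      for c in range(class_count)]
            probs = softmax(logits)
            for c in range(class_count):
                # Cross-entropy gradient w.r.t. logit c: p_c - 1 if c is the target, else p_c.
                grad = probs[c] - (1.0 if c == target else 0.0)
                biases[c] -= lr * grad
                for i in range(size):
                    weights[i][c] -= lr * grad * feats[i]
    return weights, biases

def predict(x, weights, biases):
    feats = bottleneck(x)
    logits = [sum(f * weights[i][c] for i, f in enumerate(feats)) + biases[c]
              for c in range(len(biases))]
    return softmax(logits)

# Two hypothetical labelled samples: class 0 and class 1.
samples = [([1.0, 0.0], 0), ([0.0, 1.0], 1)]
weights, biases = train_final_layer(samples, class_count=2)
print(predict([1.0, 0.0], weights, biases))
```

Only `weights` and `biases` change during training; the feature extractor is reused as is, which is what keeps retraining cheap compared with training a whole network.
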
  37. Two things to save
        • Execution graph
        • Weights for each operation
      Two outputs
        • Model as protobuf file (model.pb)
        • Labels in text files (label.txt)

        with gfile.FastGFile(FLAGS.output_graph_path, 'wb') as f:
            f.write(output_graph_def.SerializeToString())
  38. java.lang.UnsupportedOperationException: Op BatchNormWithGlobalNormalization is not available in GraphDef version 21.
  39. Unsupported operation
        • Only keep the operations dedicated to the inference step
        • Remove decoding, training, loss and evaluation operations
  40. Data Scientist, Android noob

  41. CLICK 7 TIMES ON BUILD NUMBER

  42. Build standalone app

  43. Pre-Google I/O: use nightly build
        • Library.so
        • Java API Jar

        android {
            //…
            sourceSets {
                main {
                    jniLibs.srcDirs = ['libs']
                }
            }
        }
  44. POST-Google I/O 1.4.0-rc0

  45. None
  46. App size: ~80 MB

  47. Reducing model size

  48. WHO CARES? MODEL SIZE

  49. All weights are stored as they are (32-bit floats) => 80 MB
  50. Weights quantization: 6.372638493746383 => 6.4; 80 MB => 20 MB
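
A rough illustration of where the 4x saving on this slide can come from, assuming weights stored as 32-bit floats and a simple linear 8-bit scheme. This is a sketch of the general technique, not necessarily what the authors' tooling did; `quantize`, `dequantize` and the sample weights (apart from the value quoted on the slide) are hypothetical.

```python
def quantize(weights, levels=256):
    """Map floats to integer codes in [0, levels - 1] on a linear scale."""
    lo, hi = min(weights), max(weights)
    # Fall back to a scale of 1.0 when all weights are equal, to avoid dividing by zero.
    scale = (hi - lo) / (levels - 1) or 1.0
    codes = [round((w - lo) / scale) for w in weights]
    return codes, lo, scale

def dequantize(codes, lo, scale):
    """Recover approximate float values from the integer codes."""
    return [lo + c * scale for c in codes]

weights = [6.372638493746383, -1.25, 0.0, 3.5]
codes, lo, scale = quantize(weights)
restored = dequantize(codes, lo, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

# Storage arithmetic: 4 bytes per float32 weight versus 1 byte per 8-bit code.
float_bytes = 4 * len(weights)
quant_bytes = 1 * len(weights)
print(codes, max_error, float_bytes / quant_bytes)
```

The price of the smaller file is a bounded rounding error per weight (at most half a quantization step), which is usually tolerable for inference.
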

  51. MobileNets

  52. MobileNets: Mobile-First computer vision models for TensorFlow

  53. 80 MB => 20 MB => 1~5 MB (Image credit: Google Research Blog)
  54. Image credit : https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet_v1.md

  55. None
  56. Magritte Android app architecture

  57. Architecture diagram (labels only). Layers: Android SDK (Java), Android NDK (C++). Components: Camera Preview, Image (Bitmap), Classifier Implementation, TensorFlow JNI wrapper, Trained Model, Overlay Display. Data: input_tensor, top_results, Classifications + Confidence. Steps numbered 1 to 4. Ref: https://jalammar.github.io/Supercharging-android-apps-using-tensorflow/
  58. Image sampling on Android device
        • Get Image from Camera Preview
        • Crop the center square
        • Resize
        • Sample Image
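
The cropping and resizing steps of slide 58 can be sketched on a plain 2D grid of pixel values. This is an illustrative Python version under simplifying assumptions (row-major list of lists, nearest-neighbour resize); `center_crop_square`, `resize_nearest` and the 4x6 test frame are hypothetical and not taken from the Android app.

```python
def center_crop_square(image):
    """Return the largest centered square of a row-major 2D pixel grid."""
    height, width = len(image), len(image[0])
    side = min(height, width)
    top = (height - side) // 2
    left = (width - side) // 2
    return [row[left:left + side] for row in image[top:top + side]]

def resize_nearest(image, size):
    """Resize a square grid to size x size with nearest-neighbour sampling."""
    side = len(image)
    return [[image[y * side // size][x * side // size] for x in range(size)]
            for y in range(size)]

# Hypothetical 4x6 "camera frame" whose pixel value encodes its (row, column).
frame = [[r * 10 + c for c in range(6)] for r in range(4)]
square = center_crop_square(frame)   # 4x4: columns 1 to 4 of every row
sample = resize_nearest(square, 2)   # 2x2 image to feed to the model
print(sample)
```

Cropping before resizing keeps the aspect ratio intact, so objects are not stretched when the frame is reduced to the square input size the model expects.
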
  59. Converts YUV420 to ARGB8888

        public static native void convertYUV420ToARGB8888(
            byte[] y, byte[] u, byte[] v,
            int[] output,
            int width, int height,
            int yRowStride, int uvRowStride, int uvPixelStride,
            boolean halfSize);
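
For readers unfamiliar with the conversion behind this native method, here is a minimal Python sketch using the standard BT.601 video-range formulas. It is not the native implementation from the talk: it assumes tightly packed planes (no row or pixel strides, no `halfSize` option), and `clamp`, `yuv_to_argb` and `convert_yuv420` are hypothetical helper names.

```python
def clamp(value):
    """Round to the nearest integer and keep the result within 0..255."""
    return max(0, min(255, int(round(value))))

def yuv_to_argb(y, u, v):
    """Convert one BT.601 video-range YUV pixel to a packed ARGB8888 int."""
    c, d, e = y - 16, u - 128, v - 128
    r = clamp(1.164 * c + 1.596 * e)
    g = clamp(1.164 * c - 0.392 * d - 0.813 * e)
    b = clamp(1.164 * c + 2.017 * d)
    return (0xFF << 24) | (r << 16) | (g << 8) | b

def convert_yuv420(y_plane, u_plane, v_plane, width, height):
    """In 4:2:0 sampling, each chroma sample covers a 2x2 block of luma samples."""
    out = []
    chroma_width = (width + 1) // 2
    for row in range(height):
        for col in range(width):
            ci = (row // 2) * chroma_width + (col // 2)
            out.append(yuv_to_argb(y_plane[row * width + col], u_plane[ci], v_plane[ci]))
    return out

# 2x2 frame with neutral chroma: dark and bright pixels alternate.
pixels = convert_yuv420([16, 235, 16, 235], [128], [128], 2, 2)
print([hex(p) for p in pixels])
```

The stride parameters in the real signature exist because camera buffers are often padded; a production version has to index the planes with them instead of assuming packed rows.
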
  60. Steps of Recognizing Image

        @Override
        public List<Recognition> recognizeImage(final Bitmap bitmap) {
            // Preprocess bitmap
            bitmap.getPixels(intValues, 0, bitmap.getWidth(), 0, 0, bitmap.getWidth(), bitmap.getHeight());
            for (int i = 0; i < intValues.length; ++i) {
                final int val = intValues[i];
                floatValues[i * 3 + 0] = (((val >> 16) & 0xFF) - imageMean) / imageStd;
                floatValues[i * 3 + 1] = (((val >> 8) & 0xFF) - imageMean) / imageStd;
                floatValues[i * 3 + 2] = ((val & 0xFF) - imageMean) / imageStd;
            }
            // Copy the input data into TensorFlow.
            inferenceInterface.feed(inputName, floatValues, 1, inputSize, inputSize, 3);
            // Run the inference call.
            inferenceInterface.run(outputNames, logStats);
            // Copy the output Tensor back into the output array.
            inferenceInterface.fetch(outputName, outputs);
            // (continued on the next slide)
  61. Steps of Recognizing Image (continued)

            // Find the best classifications.
            PriorityQueue<Recognition> pq = new PriorityQueue<>(
                3,
                (lhs, rhs) -> {
                    // Intentionally reversed to put high confidence at the head of the queue.
                    return Float.compare(rhs.getConfidence(), lhs.getConfidence());
                });
            for (int i = 0; i < outputs.length; ++i) {
                if (outputs[i] > THRESHOLD) {
                    pq.add(new Recognition(
                        "" + i, labels.size() > i ? labels.get(i) : "unknown", outputs[i], null));
                }
            }
            //...
            return recognitions;
        }
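
The two data transformations in `recognizeImage` (pixel normalization before inference, ranking of scores after it) can be isolated from the TensorFlow calls. The Python sketch below mirrors the Java logic under stated assumptions: `preprocess` and `top_results` are hypothetical names, and `max_results` stands in for the part of the method elided on the slide.

```python
import heapq

def preprocess(pixels, image_mean, image_std):
    """Unpack ARGB ints into normalized R, G, B floats, as in the Java loop."""
    values = []
    for val in pixels:
        values.append((((val >> 16) & 0xFF) - image_mean) / image_std)
        values.append((((val >> 8) & 0xFF) - image_mean) / image_std)
        values.append(((val & 0xFF) - image_mean) / image_std)
    return values

def top_results(outputs, labels, threshold, max_results=3):
    """Keep scores above the threshold and return the best ones first."""
    candidates = [
        (score, labels[i] if i < len(labels) else "unknown")
        for i, score in enumerate(outputs)
        if score > threshold
    ]
    return heapq.nlargest(max_results, candidates, key=lambda item: item[0])

# One opaque pixel with R=0x80, G=0x40, B=0x20, normalized around 128.
print(preprocess([0xFF804020], 128, 128.0))
# Three scores but only two labels: the third index falls back to "unknown".
print(top_results([0.05, 0.7, 0.25], ["apple", "banana"], 0.1))
```

The mean and standard deviation must match the values used when the model was trained; otherwise the network sees inputs in a range it was never fitted to.
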
  62. Adding new models

  63. Adding a new model: 2 x 20 MB = 40 MB
  64. DRY: who's a craftsman?

  65. Model stacking
        • Start from previous model to keep all specific operations in the graph
        • Specify all operations to keep when optimizing for inference

        graph_util.convert_variables_to_constants(
            sess, graph.as_graph_def(),
            ["final_result_fruits", "final_result_vegetables"])
  66. Android Makers Paris 2017

  67. None
  68. To cloud and beyond - Training

  69. To cloud and beyond - Training

  70. To cloud and beyond - Training

  71. To cloud and beyond - Training

  72. To cloud and beyond - Training

  73. To cloud and beyond - Training

  74. To cloud and beyond - Training: model.pb, label.txt

  75. To cloud and beyond - Serving: model.pb, label.txt

  76. To cloud and beyond - Serving: model.pb, label.txt

  77. To cloud and beyond - Serving: model.pb, label.txt

  78. None
  79. As for today...
      Serving
        • API available
        • Deployment on AWS, currently migrating to Google Cloud
  80. As for today...
      Serving
        • API available
        • Deployment on AWS, currently migrating to Google Cloud
      Training
        • Model debug done by overheating a laptop
        • Model built on personal GPU
        • Files uploaded manually
  81. As for today...
      Serving
        • API available
        • Deployment on AWS, currently migrating to Google Cloud
      Training
        • Model debug done by overheating a laptop
        • Model built on personal GPU
        • Files uploaded manually
  82. Android App evolves!

  83. None
  84. model.pb label.txt Android FilesDir model.pb Labels

  85. public TensorFlowInferenceInterface(AssetManager var1, String var2) {
            //...
            this.modelName = var2;
            this.g = new Graph();
            this.sess = new Session(this.g);
            this.runner = this.sess.runner();
            boolean var3 = var2.startsWith("file:///android_asset/");
            Object var4 = null;
            try {
                // Try the APK assets first, stripping the asset prefix if present.
                String var5 = var3 ? var2.split("file:///android_asset/")[1] : var2;
                var4 = var1.open(var5);
            } catch (IOException var11) {
                if (var3) {
                    throw new RuntimeException("Failed to load model from '" + var2 + "'", var11);
                }
                // Not an asset path: fall back to a regular file on the device.
                try {
                    var4 = new FileInputStream(var2);
                } catch (IOException var8) {
                    throw new RuntimeException("Failed to load model from '" + var2 + "'", var11);
                }
            }
        }
  86. Model Inception V3 Optimized

  87. Model Mobilenets 1.0

  88. Model Mobilenets 0.5

  89. Model Mobilenets 0.25

  90. Demo time

  91. None

  92. Next Steps

  93. None
  94. None
  95. Federated Learning: Collaborative Machine Learning without Centralized Training Data

  96. Thanks! Merci! 谢谢! xebia-france/magritte