Machine Intelligence at Google Scale: Vision/Speech API, TensorFlow and Cloud Machine Learning

The biggest challenge in deep learning today is scalability. As long as you use a single GPU server, you have to wait hours or days for training results. That doesn't scale to a production service, so eventually you need distributed training on the cloud. Google has been building infrastructure for training large-scale neural networks on the cloud for years, and has now started to share that technology with external developers. In this session, we will introduce new pre-trained ML services, such as the Cloud Vision API and Speech API, that work without any training. We will also look at how TensorFlow and Cloud Machine Learning accelerate custom model training by 10x - 40x with Google's distributed training infrastructure.



Kazunori Sato

April 04, 2016


  1. Machine Intelligence at Google Scale: Vision/Speech API, TensorFlow and Cloud Machine Learning

  2. +Kazunori Sato @kazunori_279 Kaz Sato Staff Developer Advocate Tech Lead

    for Data & Analytics Cloud Platform, Google Inc.
  3. What we’ll cover Deep learning and distributed training Large scale

    neural network on Google Cloud Cloud Vision API and Speech API TensorFlow and Cloud Machine Learning
  4. Deep Learning and Distributed Training

  5. None
  6. From: Andrew Ng

  7. DNN = a large set of matrix ops
     A few GPUs >> a CPU (but it still takes days to train)
     A supercomputer >> a few GPUs (but you don't have a supercomputer)
     You need distributed training on the cloud
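The claim on slide 7 that a DNN reduces to large matrix operations can be illustrated with a minimal NumPy sketch of a two-layer forward pass. The layer sizes, random inputs, and ReLU choice here are illustrative assumptions, not from the talk:

```python
import numpy as np

# One fully connected layer is a matrix multiply plus a bias;
# a deep network is a chain of such matrix ops (shapes are illustrative).
batch, n_in, n_hidden, n_out = 64, 784, 256, 10

x = np.random.rand(batch, n_in)              # input batch
W1 = np.random.randn(n_in, n_hidden) * 0.01  # first-layer weights
b1 = np.zeros(n_hidden)
W2 = np.random.randn(n_hidden, n_out) * 0.01 # second-layer weights
b2 = np.zeros(n_out)

h = np.maximum(0, x @ W1 + b1)               # ReLU(x W1 + b1): one big matmul
logits = h @ W2 + b2                         # another big matmul
print(logits.shape)                          # one row of scores per input
```

Accelerators help precisely because these dense matrix multiplies dominate the cost.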
  8. Google Brain. Large scale neural network on Google Cloud

  9. None
  10. Enterprise Google Cloud is The Datacenter as a Computer

  11. Jupiter network 10 GbE x 100 K = 1 Pbps

    Consolidates servers with microsec latency
  12. Borg No VMs, pure containers 10K - 20K nodes per

    Cell DC-scale job scheduling CPUs, mem, disks and IO
  13. Google Cloud + Neural Network = Google Brain

  14. The Inception model (GoogLeNet, 2015)

  15. What's the scalability of Google Brain? "Large Scale Distributed Systems

    for Training Neural Networks", NIPS 2015 ◦ Inception / ImageNet: 40x with 50 GPUs ◦ RankBrain: 300x with 500 nodes
  16. Large-scale neural network for everyone

  17. None
  18. None
  19. None
  20. Public Beta - Cloud Vision API
      Pre-trained models; no ML skill required
      REST API: receives images and returns JSON
      $2.5 or $5 per 1,000 units (free to try)
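To make the "receives images and returns JSON" flow concrete, here is a hedged Python sketch of building the request body for the Vision API's images:annotate method. The helper name is mine, and the API key and the HTTP call itself are omitted; consult the official API reference for the authoritative request format:

```python
import base64
import json

def make_annotate_request(image_bytes, feature_type="LABEL_DETECTION", max_results=5):
    """Build the JSON body for a Vision API images:annotate call.

    The endpoint expects base64-encoded image content; sending the request
    also requires credentials (e.g. an API key), omitted in this sketch.
    """
    return {
        "requests": [{
            "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
            "features": [{"type": feature_type, "maxResults": max_results}],
        }]
    }

body = make_annotate_request(b"\x89PNG...")  # placeholder bytes, not a real image
print(json.dumps(body)[:80])
```

The response is likewise JSON: a list of annotations (labels, faces, OCR text, etc.) per submitted image.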
  21. None
  22. Demo

  23. Limited Preview - Cloud Speech API
      Pre-trained models; no ML skill required
      REST API: receives audio and returns text
      Supports 80+ languages
      Streaming or non-streaming
  24. Demo Video

  25. TensorFlow

  26. The Machine Learning Spectrum TensorFlow Cloud Machine Learning Machine Learning

    APIs Industry / applications Academic / research
  27. What is TensorFlow?
      Google's open source library for machine intelligence, launched in Nov 2015
      Google's second-generation ML system
      Used by many production ML projects
  28. What is TensorFlow? Tensor: N-dimensional array Flow: data flow computation

    framework (like MapReduce) For Machine Learning and Deep Learning Or any HPC (High Performance Computing) applications
  29. # define the network
      import tensorflow as tf
      x = tf.placeholder(tf.float32, [None, 784])
      W = tf.Variable(tf.zeros([784, 10]))
      b = tf.Variable(tf.zeros([10]))
      y = tf.nn.softmax(tf.matmul(x, W) + b)

      # define a training step
      y_ = tf.placeholder(tf.float32, [None, 10])
      xent = -tf.reduce_sum(y_ * tf.log(y))
      step = tf.train.GradientDescentOptimizer(0.01).minimize(xent)
  30. # initialize session
      init = tf.initialize_all_variables()
      sess = tf.Session()
      sess.run(init)

      # training
      for i in range(1000):
          batch_xs, batch_ys = mnist.train.next_batch(100)
          sess.run(step, feed_dict={x: batch_xs, y_: batch_ys})
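The TensorFlow snippet on slides 29-30 uses the 0.x-era API. To show concretely what each training step computes, here is a plain-NumPy sketch of the same softmax regression update; the synthetic data standing in for mnist.train.next_batch, the learning rate, and the step count are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    # numerically stable row-wise softmax
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# synthetic stand-in for one MNIST mini-batch of 100 examples
xs = rng.random((100, 784))
labels = rng.integers(0, 10, size=100)
ys = np.eye(10)[labels]                    # one-hot targets

W = np.zeros((784, 10))
b = np.zeros(10)

for _ in range(100):
    y = softmax(xs @ W + b)
    grad_logits = (y - ys) / len(xs)       # gradient of mean cross-entropy
    W -= 0.5 * xs.T @ grad_logits          # gradient descent step
    b -= 0.5 * grad_logits.sum(axis=0)
```

TensorFlow derives exactly this kind of gradient step automatically from the graph definition, which is what makes it easy to distribute.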
  31. None
  32. Portable
      • Training on: data centers, CPUs, GPUs, etc.
      • Running on: mobile phones, IoT devices
  33. TensorBoard: visualization tool

  34. Cloud Machine Learning

  35. Limited Preview - Cloud Machine Learning (Cloud ML)
      Fully managed, distributed training and prediction for custom TensorFlow graphs
      Supports regression and classification initially
      Integrated with Cloud Dataflow and Cloud Datalab
  36. None
  37. Distributed Training with TensorFlow

  38. Distributed Training with TensorFlow
      • CPU/GPU scheduling
      • Communications: local, RPC, RDMA; 32/16/8-bit quantization
      • Cost-based optimization
      • Fault tolerance
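The quantization mentioned on slide 38 cuts communication cost by sending fewer bits per parameter. As a hedged illustration (a simple linear min-max scheme, not necessarily TensorFlow's actual one), here is how a float32 tensor can be packed into 8 bits per value and recovered with bounded error:

```python
import numpy as np

def quantize_8bit(w):
    """Linearly map float values onto uint8 levels 0..255 (illustrative scheme)."""
    lo, hi = float(w.min()), float(w.max())
    scale = (hi - lo) / 255.0 or 1.0          # avoid zero scale for constant tensors
    q = np.round((w - lo) / scale).astype(np.uint8)
    return q, lo, scale

def dequantize_8bit(q, lo, scale):
    """Recover approximate float values from the 8-bit codes."""
    return q.astype(np.float32) * scale + lo

w = np.random.default_rng(1).normal(size=1000).astype(np.float32)
q, lo, scale = quantize_8bit(w)
w_hat = dequantize_8bit(q, lo, scale)
max_err = np.abs(w - w_hat).max()             # bounded by half a quantization step
```

The payload shrinks 4x (1 byte instead of 4 per value) at the cost of a reconstruction error of at most half a quantization step, which training typically tolerates.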
  39. Data Parallelism = split data, share model (but an ordinary network
      is 1,000x slower than a GPU's internal bandwidth and doesn't scale)
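Slide 39's "split data, share model" scheme can be sketched in a few lines of NumPy: each worker computes a gradient on its own shard against the same shared parameters, and a parameter server averages the results. The least-squares loss and the sizes here are assumptions for illustration; with equal-sized shards, the averaged gradient equals the full-batch gradient:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((128, 16))        # full batch of inputs
y = rng.random(128)              # targets
w = np.zeros(16)                 # shared model parameters

def grad(Xb, yb, w):
    """Gradient of the mean squared error (1/2n)*||Xb @ w - yb||^2."""
    return Xb.T @ (Xb @ w - yb) / len(Xb)

# Data parallelism: 4 workers, each holding one shard of the batch,
# all computing gradients against the same parameter values.
shards = np.split(np.arange(128), 4)
worker_grads = [grad(X[i], y[i], w) for i in shards]
avg = np.mean(worker_grads, axis=0)   # parameter server averages the shard gradients

full = grad(X, y, w)                  # reference: gradient over the whole batch
```

The catch the slide points out is communication: shipping gradients over an ordinary network is far slower than GPU-local memory, which is why Google's fast interconnects matter.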
  40. Cloud ML demo video

  41. Cloud ML demo (Jeff Dean's keynote: YouTube video)
      Define a custom TensorFlow graph
      Training locally: 8.3 hours with 1 node
      Training on the cloud: 32 min with 20 nodes (15x faster)
      Prediction on the cloud at 300 reqs / sec
  42. Summary

  43. Ready to use Machine Learning models: Cloud Vision API, Cloud Speech API, Cloud Translate API
      Use your own data to train models: Cloud Machine Learning (NEW; Develop - Model - Test), Google BigQuery, Cloud Storage, Cloud Datalab
      (Alpha / Beta / GA status badges shown per product) Stay tuned…
  44. Links & Resources
      "Large Scale Distributed Systems for Training Neural Networks", Jeff Dean and Oriol Vinyals
      Cloud Vision API: Cloud Speech API: TensorFlow: Cloud Machine Learning: Cloud Machine Learning demo video
  45. Thank you!