

Machine Intelligence at Google Scale: Vision/Speech API, TensorFlow and Cloud Machine Learning

The biggest challenge of deep learning is scalability. As long as you train on a single GPU server, you have to wait hours or days to get results. That doesn't scale to a production service, so eventually you need distributed training in the cloud. Google has been building infrastructure for training large-scale neural networks in the cloud for years, and has now started sharing that technology with external developers. In this session, we introduce new pre-trained ML services, such as the Cloud Vision API and Speech API, that work without any training. We also look at how TensorFlow and Cloud Machine Learning accelerate custom model training by 10x to 40x with Google's distributed training infrastructure.

Guillaume Laforge

November 10, 2016


Transcript

  1. Machine Intelligence at Google Scale: Vision/Speech API, TensorFlow and Cloud

    Machine Learning. Guillaume Laforge & Martin Görner, Developer Advocates / Google Cloud Platform, @glaforge & @martin_gorner
  2. How did we escape

    the AI winter? Ongoing research on neural networks. More labeled datasets to learn from. More scalable compute power to train bigger models.
  3. The Machine Learning Spectrum: TensorFlow, Cloud Machine Learning, Machine Learning

    APIs. From academia and research to industry application; from your own ML infrastructure to ML as a Service.
  4. The Machine Learning Spectrum: TensorFlow, Cloud Machine Learning, Machine Learning

    APIs. From academia and research to industry application; from your own ML infrastructure to ML as a Service.
  5. @martin_gorner 8 tensorflow.org Recurrent Neural Networks. An RNN cell maps the input

    Xt and the previous internal state H to the output Yt (softmax readout) and a new internal state (tanh). X: inputs, Y: outputs, H: internal state, N: internal size.
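
    A minimal sketch of a single recurrent step in plain NumPy, for illustration only (this is a basic RNN cell, not the GRU cell used later in the deck; the weight names Wx, Wh, b and the random initialization are assumptions):

      import numpy as np

      ALPHASIZE, CELLSIZE = 98, 512                     # input alphabet size and internal size N
      Wx = np.random.randn(ALPHASIZE, CELLSIZE) * 0.01  # input-to-state weights (hypothetical)
      Wh = np.random.randn(CELLSIZE, CELLSIZE) * 0.01   # state-to-state weights (hypothetical)
      b = np.zeros(CELLSIZE)

      def rnn_step(x_t, h_prev):
          # One time step: combine the current input Xt with the previous internal state H.
          h_t = np.tanh(x_t @ Wx + h_prev @ Wh + b)
          return h_t                                    # a softmax readout on h_t produces Yt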
  6. @martin_gorner 9 tensorflow.org Language model in Tensorflow. The model is character-

    based: the input sequence ("S t _ J o h …") is shifted by one character to form the target sequence ("t _ J o h n …"), and characters are one-hot encoded.
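
    A minimal sketch of the character encoding, assuming a toy character-to-code mapping (the deck works with an alphabet of ALPHASIZE = 98 characters):

      import tensorflow as tf

      ALPHASIZE = 98
      text = "St_John"
      codes = [ord(c) % ALPHASIZE for c in text]    # toy mapping from characters to integer codes
      onehot = tf.one_hot(codes, depth=ALPHASIZE)   # shape [len(text), ALPHASIZE], one 1.0 per row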
  7. @martin_gorner 10 tensorflow.org Language model in Tensorflow

    ALPHASIZE = 98  CELLSIZE = 512  NLAYERS = 3  SEQLEN = 30

    cell = tf.nn.rnn_cell.GRUCell(CELLSIZE)
    mcell = tf.nn.rnn_cell.MultiRNNCell([cell]*NLAYERS, state_is_tuple=False)
    Hr, H = tf.nn.dynamic_rnn(mcell, X, initial_state=Hin)

    dynamic_rnn defines weights and biases internally. (Diagram: the 3-layer GRU stack unrolled over the sequence, with the internal state H flowing from each time step to the next.)
  8. @martin_gorner 11 tensorflow.org Inputs and outputs

    ALPHASIZE = 98  CELLSIZE = 512  NLAYERS = 3  SEQLEN = 30  (Diagram: the unrolled network reads the input characters "S t _ A n d r e …" and produces the outputs Y0 … Y7, the same sequence shifted by one character: "t _ A n d r e w".)
  9. @martin_gorner 12 tensorflow.org Language model in Tensorflow

    ALPHASIZE = 98
    CELLSIZE = 512
    NLAYERS = 3
    SEQLEN = 30

    Xd = tf.placeholder(tf.uint8, [None, None])
    X = tf.one_hot(Xd, ALPHASIZE, 1.0, 0.0)
    Yd_ = tf.placeholder(tf.uint8, [None, None])
    Y_ = tf.one_hot(Yd_, ALPHASIZE, 1.0, 0.0)
    Hin = tf.placeholder(tf.float32, [None, CELLSIZE*NLAYERS])

    # the model
    cell = tf.nn.rnn_cell.GRUCell(CELLSIZE)
    mcell = tf.nn.rnn_cell.MultiRNNCell([cell]*NLAYERS, state_is_tuple=False)
    Hr, H = tf.nn.dynamic_rnn(mcell, X, initial_state=Hin)

    # softmax output layer
    Hf = tf.reshape(Hr, [-1, CELLSIZE])
    Ylogits = layers.linear(Hf, ALPHASIZE)
    Y = tf.nn.softmax(Ylogits)
    Yp = tf.argmax(Y, 1)
    Yp = tf.reshape(Yp, [BATCHSIZE, -1])

    # loss and training step (optimizer)
    loss = tf.nn.softmax_cross_entropy_with_logits(Ylogits, Y_)
    train_step = tf.train.AdamOptimizer(1e-3).minimize(loss)

    # training loop
    for epoch in range(20):
        inH = np.zeros([BATCHSIZE, CELLSIZE*NLAYERS])
        for x, y_ in tf.models.rnn.ptb.reader.ptb_iterator(codedtext, BATCHSIZE, SEQLEN):
            dic = {Xd: x, Yd_: y_, Hin: inH}
            _, y, outH = sess.run([train_step, Yp, H], feed_dict=dic)
            inH = outH
  10. @martin_gorner 13 tensorflow.org ee o no nonnaoter s ee seih

    iae r t i r io i ro s sierota tsohoreroneo rsa esia anehereeo hensh rho etnrhhs iti saoitns t et rsearh tshseoeh ta oirhroren e eaetetnesnareeeoaraihss nshtano eter e oooaoaeee nonn is heh easren ieson httn nihensont t e n a ooe oerhi neaeehteriseat tiet i i ntsh orhi e ohhsiea e aht ohr er ra eeo oeeitrot hethisesaaei o saeii straieiteoeresorh e ooeri e ninesh sort a es h rs hattnteseato sonoanr sniaase s rshninsasi na sntennn oti r etnsnrse oh n r e tiathhnaeeano trrr hhohooon rrt eernre e rnoh Shakespeare 0.03 epochs C1
  11. @martin_gorner 14 tensorflow.org Shakespeare II WERENI Are I I wos

    the wheer boaer. Tin thim mh cals sate bauut site tar oue tinl an bsisonetoal yer an fimireeren. L[IO SI Hns oret bsllssts aaau ton hete me toer frurtor sheus aed trat A faler bis tote oadt tou than male, tel mou ce an cime. ais fauto ws cien whus yas. Ande fert te a ut wond aal sinr be at saar 0.1 epochs C3
  12. @martin_gorner 15 tensorflow.org BERENS Hall hat in she the hir

    meres. Perstr in ame not of heard, me thin hild of shear and ant on of mare. I lore wes lour. DOCHES The chaster'd on not fenst The laldoos more. [Ixeln thrish] And tho priines sith of hamdeling the san wind Shakespeare 0.2 epochs C5 Scenic indication ?
  13. @martin_gorner 16 tensorflow.org KING LEAR Alas, I am not forsworn

    both to bod! And let the firm I have to'st trainoured. KING HENRY VIII I love not my father. PORDIA He tash you will have it. HENRY BLUTIUS Work, thou lovest my son here, thy father's fath! CLIOND Why, then, would say, the beasts are Shakespeare 1 epoch C6 Invented names !
  14. @martin_gorner 17 tensorflow.org Shakespeare 30 epochs TITUS ANDRONICUS ACT I

    SCENE III An ante-chamber. The COUNT's palace. [Enter CLEOMENES, with the Lord SAY] Chamberlain Let me see your worshing in my hands. LUCETTA I am a sign of me, and sorrow sounds it. B10
  15. @martin_gorner 18 tensorflow.org Shakespeare 30 epochs And sorrow far into

    the stars of men, Without a second tears to seek the best and bed, With a strange service, and the foul prince of Rome [Exeunt MARK ANTONY and LEPIDUS] Well said, my lord,-- MENENIUS I do not say so. Well, I will not have no better ways; But not a woman's misery, and yonder to her B10
  16. @martin_gorner 19 tensorflow.org diassts_= =tlns==eti.s=tessn_(( sie_s_nts_ens= dondtnenroe dnar taonte srst

    anttntoilonttiteaen detrtstinsenoaolsesnesoairt( arssserleeeerltrdlesssoeeslslrlslie(e drnnaleeretteaelreesioe niennoarens dssnstssaorns sreeoeslrteasntotnnai(ar dsopelntederlalesdanserl lts(sitae(e) Python code 0.03 epochs A1
  17. @martin_gorner 20 tensorflow.org with self.essors_sigeater(output_dits_allss, self._train. for sampated to than

    ubtexsormations. expeddions = np.randim(natched_collection, ranger, mang_ops, samplering) def assestErrorume_gens(assignex) as and(sampled_veases): eved. Python code 0.1 epochs A2 Python keywords
  18. @martin_gorner 21 tensorflow.org def testGiddenSelfBeShareMecress(self): with self.test_session() as sess: tat

    = tf.contrib.matrix.cast_column_variable([1, 1], [0, 1, 1], [1, 7]], [[1, 1, 1]].file(file, line_state_will_file)) with self.test_session(): self.assertAllEqual(1, l.ex6) self.assertEqual(output_graph_def is_output_tensors_op( tf.pro_context_name.sqrt(sess) def test_shape(self): res = values=value_rns[0].eval()) def tempDimpleSeriesGredicsIothasedWouthAverageData(self): self._testDirector(self): self._test_inv3_size = 5 with tf.train.ConvolutioBailLors_startswith("save_dir_context.PutIsprint().eval()) return tf.contrib.learn.RUCISLCCS: # Check the orfloating so that the nimesting object mumputable othersifier. # dense_keys.tokens_prefix/statch_size of the input1 tensors. @property Python code 0.4 epochs A3 Wrong ([]) nesting Correct use of colons: Hallucinated function names
  19. @martin_gorner 22 tensorflow.org # Copyright 2015 The TensorFlow Authors. All

    Rights Reserved. # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. # You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in [0.1, 2.0, 3.0]] def __init__(self, expected): return np.array([[0, 0, 0], [0, 0, 0]]) self.assertAllEqual(tf.placeholder(tf.float32, shape=(3, 3)),(shape, prior.pack(), tf.float32)) for keys in tensor_list: return np.array([[0, 0, 0]]).astype(np.float32) # Check that we have both scalar tensor for being invalid to a vector of 1 indicating # the total loss of the same shape as the shape of the tensor. sharded_weights = [[0.0, 1.0]] # Create the string op to apply gradient terms that also batch. # The original any operation as a code when we should alw infer to the session case. Python code 12 epochs B10 Correct triple ([]) nesting Recites Apache license Tensorflow tips!
  20. @martin_gorner 23 tensorflow.org ...and more Credit to Andrej Karpathy’s blog:

    The Unreasonable Effectiveness of Recurrent Neural Networks
  21. The Machine Learning Spectrum: TensorFlow, Cloud Machine Learning, Machine Learning

    APIs. From academia and research to industry application; from your own ML infrastructure to ML as a Service.
  22. @martin_gorner 26 tensorflow.org Data-parallel distributed training: parameter servers hold the

    weights, model replicas each compute gradients on their own shard of the data, and the parameter servers apply the updates W’ = W + ∆W, with synchronous or asynchronous updates ("I ♡ noise").
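
    A toy sketch of the parameter-server pattern, for illustration only (not the Cloud Machine Learning implementation; ParameterServer, replica_step and the least-squares toy_gradient are hypothetical names standing in for real model replicas):

      import numpy as np

      class ParameterServer:
          """Holds the shared weights and applies the deltas sent by the model replicas."""
          def __init__(self, w):
              self.w = np.asarray(w, dtype=np.float64)
          def apply(self, dw):
              self.w = self.w + dw      # W' = W + dW, applied as each replica reports in
          def get(self):
              return self.w.copy()

      def toy_gradient(w, data_shard):
          # Toy stand-in for backpropagation: gradient of a least-squares fit on this shard.
          x, y = data_shard
          return 2 * x.T @ (x @ w - y) / len(y)

      def replica_step(ps, data_shard, lr=0.01):
          w = ps.get()                  # fetch the current weights (possibly slightly stale)
          ps.apply(-lr * toy_gradient(w, data_shard))   # send the weight delta back to the server

      # Several replicas, each with its own data shard, update the same parameters.
      # In real training the replica steps run in parallel, which is where the noise comes from.
      ps = ParameterServer(np.zeros(3))
      shards = [(np.random.randn(32, 3), np.random.randn(32)) for _ in range(4)]
      for step in range(100):
          for shard in shards:
              replica_step(ps, shard)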
  23. @martin_gorner 27 tensorflow.org Estimator

    from tensorflow.contrib import learn
    from tensorflow.contrib import layers

    def model(X, Y_):   # "features" and "targets"
        XX = tf.reshape(X, [-1, 28, 28, 1])
        Y1 = layers.conv2d(XX, num_outputs=6, kernel_size=[6, 6])
        Y2 = layers.conv2d(Y1, num_outputs=12, kernel_size=[5, 5], stride=2)
        Y3 = layers.conv2d(Y2, num_outputs=24, kernel_size=[4, 4], stride=2)
        Y4 = layers.flatten(Y3)
        Y5 = layers.relu(Y4, 200)
        Ylogits = layers.linear(Y5, 10)
        pred = tf.nn.softmax(Ylogits)
        loss = tf.nn.softmax_cross_entropy_with_logits(Ylogits, tf.one_hot(Y_, 10))
        train_step = tf.train.AdamOptimizer(1e-3).minimize(loss)   # Adam, as on the earlier slide
        return {"Predictions": pred, "classes": …}, loss, train_step

    estimator = learn.Estimator(model_fn=model)
    estimator.fit(...)
    estimator.predict(...)
  24. @martin_gorner 28 tensorflow.org tf.learn distributed training

    def experiment_fn(outdir):
        return learn.Experiment(
            estimator=learn.Estimator(model_fn=model, model_dir=outdir),
            train_input_fn=…,   # data feed
            eval_input_fn=…,    # data feed
            eval_metrics=evaluationMetrics,
            train_steps=10000,
            eval_steps=1,
            local_eval_frequency=100
        )

    def main(argv=None):
        learn_runner.run(experiment_fn, outdir)

    if __name__ == '__main__':
        main()

    trainingInput:
      scaleTier: STANDARD_1

    Model -> Estimator -> Experiment -> run
  25. @martin_gorner 29 tensorflow.org Run it

    # config.yaml
    trainingInput:
      scaleTier: STANDARD_1

    gcloud beta ml jobs submit training job22 \
      --package-path=trainer \
      --module-name=trainer.task \
      --staging-bucket=gs://mybucket/job22 \
      --config=config.yaml \
      -- \
      --train_dir=gs://mybucket/job22/train

    Model checkpoints and Tensorboard summaries are written to the train_dir.
  26. >TensorFlow and deep learning_ without a PhD

    #Tensorflow @martin_gorner deep Science ! deep Code ...
  27. The Machine Learning Spectrum: TensorFlow, Cloud Machine Learning, Machine Learning

    APIs. From academia and research to industry application; from your own ML infrastructure to ML as a Service.
  28. • Natural Language Processing (NLP) • Sentiment analysis • Entity

    extraction with salience • Syntax analysis with dependency trees Google Natural Language API
  29. Features of the Natural Language API: Syntax Analysis extracts sentences,

    identifies parts of speech and creates dependency parse trees for each sentence. Entity Recognition identifies entities and labels them by type, such as person, organization, location, event, product and media. Sentiment Analysis understands the overall sentiment of a block of text.
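
    A minimal sketch of calling the Natural Language API over REST with the requests library (a sketch, assuming the v1 analyzeEntities endpoint and an API key; API_KEY and the sample sentence are placeholders):

      import requests

      API_KEY = "..."   # an API key with the Natural Language API enabled
      url = "https://language.googleapis.com/v1/documents:analyzeEntities?key=" + API_KEY
      body = {
          "document": {"type": "PLAIN_TEXT",
                       "content": "Google Cloud Platform offers machine learning APIs."},
          "encodingType": "UTF8",
      }
      resp = requests.post(url, json=body).json()
      for entity in resp.get("entities", []):
          print(entity["name"], entity["type"], entity.get("salience"))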
  30. • Translate text between thousands of language pairs. • Let

    websites and programs integrate with Google Translate programmatically Google Cloud Translate API
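
    A minimal sketch of calling the Translate API over REST (a sketch, assuming the v2 endpoint and an API key; API_KEY and the example text are placeholders):

      import requests

      API_KEY = "..."   # an API key with the Translation API enabled
      url = "https://translation.googleapis.com/language/translate/v2"
      resp = requests.post(url, params={"key": API_KEY},
                           data={"q": "Machine intelligence at Google scale",
                                 "source": "en", "target": "fr"}).json()
      print(resp["data"]["translations"][0]["translatedText"])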
  31. • Pass raw audio data and language • Returns a

    transcript of the audio data • Works across >80 languages • Receive response in streaming or non-streaming Google Cloud Speech API
  32. Features of the Speech API: Automatic Speech Recognition (ASR) powered by deep learning

    neural networks, to power applications like voice search or speech transcription. Global Vocabulary: recognizes over 80 languages and variants with an extensive vocabulary. Streaming Recognition: returns partial recognition results immediately, as they become available. Inappropriate Content Filtering: filters inappropriate content in text results. Real-time or Buffered Audio Support: audio input can be captured by an application’s microphone or sent from a pre-recorded audio file; multiple audio file formats are supported, including FLAC, AMR, PCMU and linear-16. Noisy Audio Handling: handles noisy audio from many environments without requiring additional noise cancellation. Integrated API: audio files can be uploaded in the request and, in future releases, integrated with Google Cloud Storage.
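
    A minimal sketch of a non-streaming recognition request over REST (a sketch, assuming the current v1 speech:recognize endpoint and an API key; API_KEY and the FLAC file name are placeholders):

      import base64
      import requests

      API_KEY = "..."   # an API key with the Speech API enabled
      with open("command.flac", "rb") as f:            # a 16 kHz FLAC recording (placeholder)
          audio_content = base64.b64encode(f.read()).decode("ascii")

      url = "https://speech.googleapis.com/v1/speech:recognize?key=" + API_KEY
      body = {
          "config": {"encoding": "FLAC", "sampleRateHertz": 16000, "languageCode": "en-US"},
          "audio": {"content": audio_content},
      }
      resp = requests.post(url, json=body).json()
      for result in resp.get("results", []):
          print(result["alternatives"][0]["transcript"])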
  33. • Detect faces, landmarks, logos, text, and more • Perform

    sentiment analysis • Straightforward REST API • Works on a base64-encoded image • Connects to Google Cloud Storage • Returns label, score pair Google Cloud Vision API
  34. Features of the Vision API: Faces: faces, facial

    landmarks, emotions. OCR: read and extract text, with support for more than 10 languages. Label: detect entities from furniture to transportation. Logos: identify product logos. Landmarks & Image Properties: detect landmarks and the dominant colors of an image. Safe Search: detect explicit content (adult, violent, medical and spoof).
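
    A minimal sketch of a label-detection request over REST with a base64-encoded image, as the API expects (a sketch, assuming the v1 images:annotate endpoint and an API key; API_KEY and the image file name are placeholders):

      import base64
      import requests

      API_KEY = "..."   # an API key with the Vision API enabled
      with open("photo.jpg", "rb") as f:               # a local image (placeholder)
          image_content = base64.b64encode(f.read()).decode("ascii")

      url = "https://vision.googleapis.com/v1/images:annotate?key=" + API_KEY
      body = {"requests": [{
          "image": {"content": image_content},
          "features": [{"type": "LABEL_DETECTION", "maxResults": 5}],
      }]}
      resp = requests.post(url, json=body).json()
      for label in resp["responses"][0].get("labelAnnotations", []):
          print(label["description"], label["score"])  # label, score pairs as on the slide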
  35. The impractical is now simple: all images are searchable. Labels

    are extracted for 2M daily uploaded images; existing user photos have been annotated, and so have free and paid photos. Face detection works across all platforms: coordinates are added to EXIF data, and face-cropping has been added to the image editing capabilities.
  36. 48

  37. Machine Learning products from Google: TensorFlow, Cloud Machine Learning, Machine

    Learning APIs. The APIs are easy to use, for non-ML engineers; TensorFlow and Cloud Machine Learning are customizable, for data scientists.
  38. 50 @glaforge | @martin_gorner Links & Resources • Google Cloud

    Machine Learning bit.ly/gcp-ml • Google Cloud Speech API bit.ly/gcp-speech • Google Cloud Vision API bit.ly/gcp-vision • Google Cloud Translate API bit.ly/gcp-translate • TensorFlow bit.ly/tensorflow-oss • TensorFlow for Poets, by Pete Warden bit.ly/tensorflow-for-poets
  39. >TensorFlow and deep learning_ without a PhD

    #Tensorflow @martin_gorner deep Science ! deep Code ...