Machine Intelligence at Google Scale: Vision/Speech API, TensorFlow and Cloud Machine Learning

The biggest challenge of deep learning is scalability. With a single GPU server, you wait hours or days for the results of your work. This doesn't scale for a production service, so you eventually need distributed training in the cloud. Google has been building infrastructure for training large-scale neural networks in the cloud for years, and has now started to share this technology with external developers. In this session, we introduce new pre-trained ML services, such as the Cloud Vision API and Speech API, that work without any training. We also look at how TensorFlow and Cloud Machine Learning can accelerate custom model training by 10x to 40x with Google's distributed training infrastructure.

Guillaume Laforge

November 10, 2016

Transcript

  1. Machine Intelligence at Google Scale:
    Vision/Speech API, TensorFlow
    and Cloud Machine Learning
    Guillaume Laforge & Martin Görner
    Developer Advocates / Google Cloud Platform
    @glaforge & @martin_gorner

  2. Confidential & Proprietary
    How did we escape the AI winter?
    ● Ongoing research on neural networks
    ● More labeled datasets to learn from
    ● More scalable compute power to train bigger models

  3. Beach
    Woman
    Pool
    Coast
    Water

  4. Machine Learning is everywhere at Google

  5. DEMO

  6. The Machine Learning Spectrum
    TensorFlow
    Cloud Machine Learning
    Machine Learning APIs
    Academia, research
    Industry application
    ML as a Service
    Your own ML infrastructure

  7. The Machine Learning Spectrum
    TensorFlow
    Cloud Machine Learning
    Machine Learning APIs
    Academia, research
    Industry application
    ML as a Service
    Your own ML infrastructure

  8. Recurrent Neural Networks
    @martin_gorner, tensorflow.org
    [Diagram: RNN cell (tanh, softmax) with input Xt, output Yt and internal state H]
    X: inputs
    Y: outputs
    H: internal state
    N: internal size
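As a rough illustration of the cell on this slide, here is a minimal NumPy sketch of one step of a simple RNN. It assumes the common formulation in which the new state is the tanh of a linear function of the concatenated input and previous state, and the output is a softmax over a linear function of the new state. The weight names (`Wh`, `bh`, `Wy`, `by`) and the sizes are hypothetical, chosen only for the example.

```python
import numpy as np

def softmax(z):
    # Subtract the row maximum for numerical stability.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def rnn_step(x_t, h_prev, Wh, bh, Wy, by):
    """One step of a simple RNN cell (assumed formulation, see the text above)."""
    xh = np.concatenate([x_t, h_prev], axis=-1)  # [batch, P + N]
    h_t = np.tanh(xh @ Wh + bh)                  # new internal state, [batch, N]
    y_t = softmax(h_t @ Wy + by)                 # output probabilities, [batch, P]
    return y_t, h_t

# Hypothetical sizes: P = input/output size, N = internal size.
P, N, BATCH = 5, 8, 2
rng = np.random.default_rng(0)
Wh = rng.normal(scale=0.1, size=(P + N, N))
bh = np.zeros(N)
Wy = rng.normal(scale=0.1, size=(N, P))
by = np.zeros(P)

h = np.zeros((BATCH, N))             # initial state
xs = rng.normal(size=(4, BATCH, P))  # a sequence of 4 inputs
for x_t in xs:
    # The state returned by one step is fed back into the next step.
    y, h = rnn_step(x_t, h, Wh, bh, Wy, by)
```

The same weights are reused at every step; only the state H changes as the sequence is consumed.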

  9. Language model in Tensorflow
    Character-based: characters, one-hot encoded
    Input:  S t _ J o h
    Output: t _ J o h n
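The input and output rows on this slide are the same text shifted by one character. A minimal NumPy sketch of that preparation step follows. The `encode` mapping (printable ASCII minus 32) is a hypothetical stand-in, since the character mapping used in the talk is not shown on the slides.

```python
import numpy as np

ALPHASIZE = 98  # alphabet size used on the slides

def encode(text):
    # Hypothetical encoding: printable ASCII mapped to codes 0..94.
    return np.array([ord(c) - 32 for c in text], dtype=np.uint8)

def one_hot(codes, depth):
    # One row per character, with a single 1.0 at the character's code.
    out = np.zeros((len(codes), depth), dtype=np.float32)
    out[np.arange(len(codes)), codes] = 1.0
    return out

text = "St_John"
x_codes = encode(text[:-1])  # input:  S t _ J o h
y_codes = encode(text[1:])   # target: t _ J o h n (the input shifted by one)

X = one_hot(x_codes, ALPHASIZE)
Y_ = one_hot(y_codes, ALPHASIZE)
```

At each position the network is trained to predict the next character, which is why the target sequence is the input sequence moved one step ahead.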

  10. Language model in Tensorflow
    ALPHASIZE = 98
    CELLSIZE = 512
    NLAYERS = 3
    SEQLEN = 30
    cell = tf.nn.rnn_cell.GRUCell(CELLSIZE)
    mcell = tf.nn.rnn_cell.MultiRNNCell([cell]*NLAYERS, state_is_tuple=False)
    Hr, H = tf.nn.dynamic_rnn(mcell, X, initial_state=Hin)
    defines weights and biases internally
    [Diagram: three stacked GRU layers unrolled over the sequence, with inputs X0 … X8, per-layer states H, H’, H”, initial state Hin and final state H]

  11. Inputs and outputs
    ALPHASIZE = 98
    CELLSIZE = 512
    NLAYERS = 3
    SEQLEN = 30
    [Diagram: the unrolled three-layer network with inputs X0 … X8, per-layer states H, H’, H” and outputs Y0 … Y7]
    Inputs:  S t _ A n d r e
    Outputs: t _ A n d r e w

  12. Language model in Tensorflow
    # TensorFlow 0.x API, as shown on the slide.
    # BATCHSIZE, codedtext and sess are defined elsewhere (not shown on the slide).
    import numpy as np
    import tensorflow as tf
    from tensorflow.contrib import layers
    ALPHASIZE = 98
    CELLSIZE = 512
    NLAYERS = 3
    SEQLEN = 30
    # inputs and expected outputs: character codes, one-hot encoded
    Xd = tf.placeholder(tf.uint8, [None, None])
    X = tf.one_hot(Xd, ALPHASIZE, 1.0, 0.0)
    Yd_ = tf.placeholder(tf.uint8, [None, None])
    Y_ = tf.one_hot(Yd_, ALPHASIZE, 1.0, 0.0)
    Hin = tf.placeholder(tf.float32, [None, CELLSIZE*NLAYERS])
    # the model
    cell = tf.nn.rnn_cell.GRUCell(CELLSIZE)
    mcell = tf.nn.rnn_cell.MultiRNNCell([cell]*NLAYERS, state_is_tuple=False)
    Hr, H = tf.nn.dynamic_rnn(mcell, X, initial_state=Hin)
    # softmax output layer
    Hf = tf.reshape(Hr, [-1, CELLSIZE])
    Ylogits = layers.linear(Hf, ALPHASIZE)
    Y = tf.nn.softmax(Ylogits)
    Yp = tf.argmax(Y, 1)
    Yp = tf.reshape(Yp, [BATCHSIZE, -1])
    # loss and training step (optimizer)
    Yflat_ = tf.reshape(Y_, [-1, ALPHASIZE])  # same shape as Ylogits
    loss = tf.nn.softmax_cross_entropy_with_logits(Ylogits, Yflat_)
    train_step = tf.train.AdamOptimizer(1e-3).minimize(loss)
    # training loop: the output state of one batch is the input state of the next
    for epoch in range(20):
        inH = np.zeros([BATCHSIZE, CELLSIZE*NLAYERS])
        for x, y_ in tf.models.rnn.ptb.reader.ptb_iterator(codedtext, BATCHSIZE, SEQLEN):
            dic = {Xd: x, Yd_: y_, Hin: inH}
            _, y, outH = sess.run([train_step, Yp, H], feed_dict=dic)
            inH = outH

  13. Shakespeare, 0.03 epochs
    ee o no nonnaoter s ee seih iae r t i r io i ro s
    sierota tsohoreroneo rsa esia anehereeo hensh
    rho etnrhhs iti saoitns t et rsearh tshseoeh ta
    oirhroren e eaetetnesnareeeoaraihss nshtano eter
    e oooaoaeee nonn is heh easren ieson httn nihensont
    t e n a ooe oerhi neaeehteriseat tiet i i ntsh
    orhi e ohhsiea e aht ohr er ra eeo oeeitrot
    hethisesaaei o saeii straieiteoeresorh e ooeri
    e ninesh sort a es h rs hattnteseato sonoanr sniaase
    s rshninsasi na sntennn oti r etnsnrse oh n
    r e tiathhnaeeano trrr hhohooon rrt eernre e rnoh

  14. Shakespeare, 0.1 epochs
    II WERENI
    Are I I wos the wheer boaer.
    Tin thim mh cals sate bauut site tar oue tinl an
    bsisonetoal yer an fimireeren.
    L[IO SI Hns oret bsllssts aaau ton hete me toer
    frurtor sheus aed trat
    A faler bis tote oadt tou than male, tel mou ce an
    cime. ais fauto ws cien whus yas. Ande fert te a
    ut wond aal sinr be at saar

  15. Shakespeare, 0.2 epochs
    BERENS Hall hat in she the hir meres.
    Perstr in ame not of heard, me thin hild of shear and
    ant on of mare. I lore wes lour.
    DOCHES The chaster'd on not fenst
    The laldoos more.
    [Ixeln thrish]
    And tho priines sith of hamdeling the san wind
    Annotation: stage direction?

  16. Shakespeare, 1 epoch
    KING LEAR Alas, I am not forsworn both to bod!
    And let the firm I have to'st trainoured.
    KING HENRY VIII I love not my father.
    PORDIA He tash you will have it.
    HENRY BLUTIUS Work, thou lovest my son here, thy
    father's fath!
    CLIOND Why, then, would say, the beasts are
    Annotation: invented names!

  17. Shakespeare, 30 epochs
    TITUS ANDRONICUS
    ACT I
    SCENE III An ante-chamber. The COUNT's palace.
    [Enter CLEOMENES, with the Lord SAY]
    Chamberlain Let me see your worshing in my hands.
    LUCETTA I am a sign of me, and sorrow sounds it.

  18. Shakespeare, 30 epochs
    And sorrow far into the stars of men,
    Without a second tears to seek the best and bed,
    With a strange service, and the foul prince of Rome
    [Exeunt MARK ANTONY and LEPIDUS]
    Well said, my lord,--
    MENENIUS I do not say so.
    Well, I will not have no better ways;
    But not a woman's misery, and yonder to her

  19. Python code, 0.03 epochs
    diassts_= =tlns==eti.s=tessn_((
    sie_s_nts_ens= dondtnenroe dnar taonte
    srst anttntoilonttiteaen
    detrtstinsenoaolsesnesoairt(
    arssserleeeerltrdlesssoeeslslrlslie(e
    drnnaleeretteaelreesioe niennoarens
    dssnstssaorns sreeoeslrteasntotnnai(ar
    dsopelntederlalesdanserl
    lts(sitae(e)

  20. Python code, 0.1 epochs
    with
    self.essors_sigeater(output_dits_allss,
    self._train.
    for sampated to than ubtexsormations.
    expeddions = np.randim(natched_collection,
    ranger, mang_ops, samplering)
    def assestErrorume_gens(assignex) as
    and(sampled_veases):
    eved.
    Annotation: Python keywords

  21. Python code, 0.4 epochs
    def testGiddenSelfBeShareMecress(self):
    with self.test_session() as sess:
    tat = tf.contrib.matrix.cast_column_variable([1, 1], [0, 1, 1], [1, 7]],
    [[1, 1, 1]].file(file, line_state_will_file))
    with self.test_session():
    self.assertAllEqual(1, l.ex6)
    self.assertEqual(output_graph_def is_output_tensors_op(
    tf.pro_context_name.sqrt(sess)
    def test_shape(self):
    res = values=value_rns[0].eval())
    def tempDimpleSeriesGredicsIothasedWouthAverageData(self):
    self._testDirector(self):
    self._test_inv3_size = 5
    with tf.train.ConvolutioBailLors_startswith("save_dir_context.PutIsprint().eval())
    return tf.contrib.learn.RUCISLCCS:
    # Check the orfloating so that the nimesting object mumputable othersifier.
    # dense_keys.tokens_prefix/statch_size of the input1 tensors.
    @property
    Annotations: wrong ([]) nesting; correct use of colons; hallucinated function names

  22. Python code, 12 epochs
    # Copyright 2015 The TensorFlow Authors. All Rights Reserved.
    #
    # Licensed under the Apache License, Version 2.0 (the "License");
    # you may not use this file except in compliance with the License.
    # You may obtain a copy of the License at
    #
    # http://www.apache.org/licenses/LICENSE-2.0
    #
    # Unless required by applicable law or agreed to in [0.1, 2.0, 3.0]]
    def __init__(self, expected):
    return np.array([[0, 0, 0], [0, 0, 0]])
    self.assertAllEqual(tf.placeholder(tf.float32, shape=(3, 3)),(shape, prior.pack(),
    tf.float32))
    for keys in tensor_list:
    return np.array([[0, 0, 0]]).astype(np.float32)
    # Check that we have both scalar tensor for being invalid to a vector of 1 indicating
    # the total loss of the same shape as the shape of the tensor.
    sharded_weights = [[0.0, 1.0]]
    # Create the string op to apply gradient terms that also batch.
    # The original any operation as a code when we should alw infer to the session case.
    Annotations: correct triple ([]) nesting; recites Apache license; Tensorflow tips!

  23. ...and more
    Credit to Andrej Karpathy’s blog: The Unreasonable Effectiveness of Recurrent Neural Networks

  24. The Machine Learning Spectrum
    TensorFlow
    Cloud Machine Learning
    Machine Learning APIs
    Academia, research
    Industry application
    ML as a Service
    Your own ML infrastructure

  25. Cloud Machine Learning

  26. Data-parallel distributed training
    [Diagram: model replicas, each fed part of the data, exchange updates with parameter servers]
    W’ = W + ∆W
    Synchronous or asynchronous updates
    I ♡ noise
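The update rule W’ = W + ∆W can be illustrated with a small single-process simulation of synchronous data-parallel training. This is only a sketch under stated assumptions: the linear model, the number of replicas and the learning rate are hypothetical, and the "parameter server" is just a NumPy array rather than Google's infrastructure.

```python
import numpy as np

def gradient(W, X, y):
    # Gradient of the mean squared error for the linear model X @ W.
    return 2.0 * X.T @ (X @ W - y) / len(X)

rng = np.random.default_rng(0)
true_W = np.array([2.0, -3.0])
X = rng.normal(size=(400, 2))
y = X @ true_W

NUM_REPLICAS = 4     # hypothetical number of model replicas
LEARNING_RATE = 0.1  # hypothetical learning rate
shards = list(zip(np.array_split(X, NUM_REPLICAS),
                  np.array_split(y, NUM_REPLICAS)))

W = np.zeros(2)      # parameters held by the simulated parameter server
for step in range(200):
    # Each replica reads the current W and computes a gradient on its own shard.
    grads = [gradient(W, Xs, ys) for Xs, ys in shards]
    # Synchronous update: average the replicas' gradients, then W' = W + dW.
    delta_W = -LEARNING_RATE * np.mean(grads, axis=0)
    W = W + delta_W
```

In an asynchronous variant, each replica's ∆W would be applied as soon as it arrives instead of being averaged with the others, so replicas may work from slightly stale parameters.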

  27. Estimator
    # TensorFlow 0.x (tf.contrib.learn) API, as shown on the slide.
    import tensorflow as tf
    from tensorflow.contrib import learn
    from tensorflow.contrib import layers
    def model(X, Y_):  # “features” and “targets”
        XX = tf.reshape(X, [-1, 28, 28, 1])
        Y1 = layers.conv2d(XX, num_outputs=6, kernel_size=[6, 6])
        Y2 = layers.conv2d(Y1, num_outputs=12, kernel_size=[5, 5], stride=2)
        Y3 = layers.conv2d(Y2, num_outputs=24, kernel_size=[4, 4], stride=2)
        Y4 = layers.flatten(Y3)
        Y5 = layers.relu(Y4, 200)
        Ylogits = layers.linear(Y5, 10)
        pred = tf.nn.softmax(Ylogits)
        loss = tf.nn.softmax_cross_entropy_with_logits(Ylogits, tf.one_hot(Y_, 10))
        train_step = optimizer.minimize(loss)  # optimizer is created elsewhere (not shown)
        return {"Predictions": pred, "classes": …}, loss, train_step
    estimator = learn.Estimator(model_fn=model)
    estimator.fit(...)
    estimator.predict(...)

  28. tf.learn distributed training
    Model -> Estimator -> Experiment -> run
    def experiment_fn(outdir):
        return learn.Experiment(
            estimator=learn.Estimator(model_fn, model_dir=outdir),
            train_input_fn=…, # data feed
            eval_input_fn=…, # data feed
            eval_metrics=evaluationMetrics,
            train_steps=10000, eval_steps=1, local_eval_frequency=100
        )
    def main(argv=None):
        learn_runner.run(experiment_fn, outdir)
    if __name__ == '__main__': main()
    trainingInput:
      scaleTier: STANDARD_1

  29. Run it
    trainingInput:
      scaleTier: STANDARD_1
    gcloud beta ml jobs submit training job22 \
      --package-path=trainer \
      --module-name=trainer.task \
      --staging-bucket=gs://mybucket/job22 \
      --config=config.yaml \
      -- \
      --train_dir=gs://mybucket/job22/train
    Model checkpoints, Tensorboard summaries

  30. Inception
    Car wheel
    Retrain this
    Retrain Inception yourself: goo.gl/Z9eNek

  31. Demo: aucnet
    Retrain Inception yourself: goo.gl/Z9eNek

  32. >TensorFlow and deep learning_ without a PhD
    #Tensorflow @martin_gorner
    deep Science ! deep Code ...

  33. The Machine Learning Spectrum
    TensorFlow
    Cloud Machine Learning
    Machine Learning APIs
    Academia, research
    Industry application
    ML as a Service
    Your own ML infrastructure

  34. Google Natural Language API
    ● Natural Language Processing (NLP)
    ● Sentiment analysis
    ● Entity extraction with salience
    ● Syntax analysis with dependency trees
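To make the sentiment analysis bullet concrete, here is a hedged sketch of a REST call using only the Python standard library. It assumes the v1 `documents:analyzeSentiment` endpoint as publicly documented, which may differ from the version demonstrated in the talk. The helper names and the API key argument are hypothetical, and the network call is defined but not executed here.

```python
import json
import urllib.request

# Public REST endpoint of the Cloud Natural Language API (v1, assumed).
ENDPOINT = "https://language.googleapis.com/v1/documents:analyzeSentiment"

def build_sentiment_request(text):
    """Build the JSON body for a sentiment analysis request."""
    return {
        "document": {"type": "PLAIN_TEXT", "content": text},
        "encodingType": "UTF8",
    }

def analyze_sentiment(text, api_key):
    """Send the request; needs network access and a valid API key."""
    req = urllib.request.Request(
        ENDPOINT + "?key=" + api_key,
        data=json.dumps(build_sentiment_request(text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

body = build_sentiment_request("TensorFlow makes this easy.")
```

No model training is involved on the caller's side: the request carries only the text, and the service returns its analysis as JSON.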

  35. Features
    Google Cloud Platform
    Syntax Analysis: Extract sentences, identify parts of speech and create dependency parse trees for each sentence.
    Entity Recognition: Identify entities and label by types such as person, organization, location, events, products and media.
    Sentiment Analysis: Understand the overall sentiment of a block of text.

  36. DEMO: Cloud Natural Language API

  37. Google Cloud Translate API
    ● Translate text between thousands of language pairs.
    ● Let websites and programs integrate with Google Translate programmatically.
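Programmatic integration can look like the following sketch, which builds a request URL for the v2 REST endpoint of the Translation API with the Python standard library. The helper names and the placeholder API key are hypothetical, and the network call is defined but not executed here.

```python
import json
import urllib.parse
import urllib.request

# Public REST endpoint of the Translation API (v2, assumed).
ENDPOINT = "https://translation.googleapis.com/language/translate/v2"

def build_translate_url(text, target, api_key, source=None):
    """Build the request URL; the source language is optional."""
    params = {"q": text, "target": target, "key": api_key}
    if source is not None:
        params["source"] = source
    return ENDPOINT + "?" + urllib.parse.urlencode(params)

def translate(text, target, api_key, source=None):
    """Send the request; needs network access and a valid API key."""
    url = build_translate_url(text, target, api_key, source)
    with urllib.request.urlopen(url) as resp:
        payload = json.load(resp)
    return payload["data"]["translations"][0]["translatedText"]

url = build_translate_url("Hello world", "fr", "YOUR_API_KEY", source="en")
```

When the source language is omitted, the service is expected to detect it from the text.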

  38. DEMO: Cloud Translate API

  39. Google Cloud Speech API
    ● Pass raw audio data and the language
    ● Returns a transcript of the audio data
    ● Works across more than 80 languages
    ● Receive the response in streaming or non-streaming mode
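The "pass raw audio data and language" step can be sketched as a non-streaming REST request. This assumes the v1 `speech:recognize` endpoint as publicly documented, which may differ from the version available at the time of the talk. The helper names, sample rate and dummy audio bytes are hypothetical, and the network call is defined but not executed here.

```python
import base64
import json
import urllib.request

# Public REST endpoint of the Cloud Speech API (v1, assumed).
ENDPOINT = "https://speech.googleapis.com/v1/speech:recognize"

def build_recognize_request(audio_bytes, language_code="en-US",
                            sample_rate_hertz=16000):
    """Build the JSON body: audio format, language, and base64-encoded audio."""
    return {
        "config": {
            "encoding": "LINEAR16",
            "sampleRateHertz": sample_rate_hertz,
            "languageCode": language_code,
        },
        "audio": {"content": base64.b64encode(audio_bytes).decode("ascii")},
    }

def recognize(audio_bytes, api_key, language_code="en-US"):
    """Send the request; needs network access and a valid API key."""
    req = urllib.request.Request(
        ENDPOINT + "?key=" + api_key,
        data=json.dumps(build_recognize_request(audio_bytes, language_code)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Dummy bytes standing in for real 16-bit linear PCM audio.
body = build_recognize_request(b"\x00\x01" * 8)
```

This covers only the non-streaming case, where the whole recording is sent in one request and the transcript comes back in one response.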

  40. Features
    Automatic Speech Recognition: Automatic Speech Recognition (ASR) powered by deep learning neural networking to power your applications like voice search or speech transcription.
    Global Vocabulary: Recognizes over 80 languages and variants with an extensive vocabulary.
    Streaming Recognition: Returns partial recognition results immediately, as they become available.
    Inappropriate Content Filtering: Filter inappropriate content in text results.
    Real-time or Buffered Audio Support: Audio input can be captured by an application’s microphone or sent from a pre-recorded audio file. Multiple audio file formats are supported, including FLAC, AMR, PCMU and linear-16.
    Noisy Audio Handling: Handles noisy audio from many environments without requiring additional noise cancellation.
    Integrated API: Audio files can be uploaded in the request and, in future releases, integrated with Google Cloud Storage.

  41. DEMO: Cloud Speech API

  42. Google Cloud Vision API
    ● Detect faces, landmarks, logos, text, and more
    ● Perform sentiment analysis
    ● Straightforward REST API
    ● Works on a base64-encoded image
    ● Connects to Google Cloud Storage
    ● Returns label, score pair
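The bullets above (REST API, base64-encoded image, label and score pairs) can be sketched as follows with the Python standard library, assuming the v1 `images:annotate` endpoint as publicly documented. The helper names are hypothetical, the network call is defined but not executed, and `sample_response` contains made-up scores that only illustrate the response shape.

```python
import base64
import json
import urllib.request

# Public REST endpoint of the Cloud Vision API (v1, assumed).
ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"

def build_label_request(image_bytes, max_results=5):
    """Build the JSON body for label detection on a base64-encoded image."""
    return {
        "requests": [{
            "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
            "features": [{"type": "LABEL_DETECTION", "maxResults": max_results}],
        }]
    }

def annotate(image_bytes, api_key):
    """Send the request; needs network access and a valid API key."""
    req = urllib.request.Request(
        ENDPOINT + "?key=" + api_key,
        data=json.dumps(build_label_request(image_bytes)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def extract_labels(response):
    """Turn a response into a list of (label, score) pairs."""
    annotations = response["responses"][0].get("labelAnnotations", [])
    return [(a["description"], a["score"]) for a in annotations]

# Made-up response for illustration only; the scores are not real results.
sample_response = {"responses": [{"labelAnnotations": [
    {"description": "Beach", "score": 0.97},
    {"description": "Water", "score": 0.91},
]}]}
labels = extract_labels(sample_response)
```

Other detections are requested the same way by changing the feature type in the request.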

  43. Faces: faces, facial landmarks, emotions
    OCR: read and extract text, with support for > 10 languages
    Label: detect entities from furniture to transportation
    Logos: identify product logos
    Landmarks & Image Properties: detect landmarks & dominant color of image
    Safe Search: detect explicit content - adult, violent, medical and spoof

  44. DEMO: Cloud Vision API

  45. The Wix Media Platform Example

  47. The impractical is now simple
    All images are searchable
    ● Labels are extracted for 2M daily uploaded images
    ● Existing user photos have been annotated, and so have free and paid photos
    Face detection across all platforms
    ● Coordinates added to EXIF data
    ● Face-cropping added to image editing capabilities

  49. Machine Learning products from Google
    TensorFlow
    Cloud Machine Learning
    Machine Learning APIs
    Easy to use, for non-ML engineers
    Customizable, for data scientists

  50. Links & Resources
    ● Google Cloud Machine Learning: bit.ly/gcp-ml
    ● Google Cloud Speech API: bit.ly/gcp-speech
    ● Google Cloud Vision API: bit.ly/gcp-vision
    ● Google Cloud Translate API: bit.ly/gcp-translate
    ● TensorFlow: bit.ly/tensorflow-oss
    ● TensorFlow for Poets, by Pete Warden: bit.ly/tensorflow-for-poets

  51. Join the Discussion
    bit.ly/gcp-slack
    bit.ly/gcp-github
    bit.ly/gcp-twitter

  52. Thanks for your attention!
    @glaforge | @martin_gorner

  53. >TensorFlow and deep learning_ without a PhD
    #Tensorflow @martin_gorner
    deep Science ! deep Code ...