From Zero to ML on Google Cloud Platform

sararob
February 10, 2018

Everyone’s talking about machine learning, but we hear much less about how to put it into practice. And let’s face it, that can be daunting! Even just 10 years ago you needed access to extensive academic and computing resources to make use of machine learning. Fast forward to today and we’ve seen revolutionary changes in the hardware and software that are making ML accessible for any developer or data scientist. Whether you’re completely new to ML or you’ve already trained and deployed your own model from scratch, Google Cloud Platform has a variety of tools to help you start using ML right now. I’ll start with the basics: how to use a pre-trained ML model with one REST API call. Then you’ll learn how to use your own dataset to customize a pre-trained model with transfer learning. We’ll end by learning how to build your own model from scratch with TensorFlow, and how to train and serve it in the cloud with GCP.

Transcript

  1. Confidential & Proprietary. Zero to ML on GCP
    Sara Robinson, Developer Advocate, @SRobTweets
  2. What is machine learning?
  3. Learning from examples and experience
  4. How did you learn your first language?
  5. ML is...
    Solving problems without explicitly knowing the solution
    Loosely based on how the human brain learns
    Enable systems that improve over time
  6. Framing a machine learning problem
    inputs, labels → Model → predictions
  7. How do we get from input to prediction?
  8. How do we get from input to prediction? “cat”
  9. How do we get from input to prediction? “cat”
  10. How do we get from input to prediction?
  11. How do we get from input to prediction? sports
  12. How do we get from input to prediction? sports, baseball
  13. How do we get from input to prediction? sports, baseball, nytimes
  14. How do we get from input to prediction?
  15. How do we get from input to prediction? mountain pass, nature, birthday cake, family
  16. Is machine learning only for experts?
  17. First neural network, 1957
  18. First neural network, 1957
  20. Democratizing ML
  21. What type of ML problem are you solving?
    Generic task: someone else has solved this before
    Custom task: specific to your dataset
  22. What type of ML problem are you solving? Example: image classification
    Generic task: someone else has solved this before
    Custom task: specific to your dataset
    “cat”, “bob”
  23. What type of ML problem are you solving? Example: NLP
    Generic task: someone else has solved this before
    Custom task: specific to your dataset
    verb, noun, pronoun; programming, Google Cloud
  24. From zero… to ML
    Machine Learning APIs, AutoML, Custom model
    App developers; data scientists & ML practitioners
  25. What resources do you need to solve an ML problem?
    Resources: training data, model code, training + serving infrastructure, prediction code, time
    Approaches: custom model (build from scratch), custom model (transfer learning), AutoML, ML APIs
  26. ML as an API
    Use a pre-trained model with a single REST API request
  27. Use a pre-trained model to accomplish common ML tasks
    Cloud Vision, Cloud Translation, Cloud Natural Language, Cloud Speech, Cloud Video Intelligence
  28. Let’s look at Cloud Vision & NL
    Cloud Vision, Cloud Translation, Cloud Natural Language, Cloud Speech, Cloud Video Intelligence
  29. Cloud Vision
  30. Label & web detection, OCR, logo detection, explicit content detection, crop hints, landmark detection
  31. Cloud Vision in production: GIPHY
    For more insight visit: Engineering.giphy.com @GIPHYEng
  32. Let’s see a demo!
  33. Calling the Vision API from Node.js
    const gcloud = require('google-cloud');
    const vision = gcloud.vision();
    const types = ['face', 'label'];
    vision.detect('image.jpg', types, function(err, detections, apiResponse) {
      // detections = {
      //   faces: [...],
      //   labels: [...]
      // }
    });
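
The slides describe the ML APIs as usable with a single REST API request. As a rough sketch of what such a request might contain for Cloud Vision, the Python below builds the JSON body for the `images:annotate` method without sending it. The helper name `build_vision_request`, the placeholder image bytes, and the `max_results` default are assumptions made for illustration and do not come from the talk.

```python
import base64
import json

# REST endpoint for the Cloud Vision images:annotate method.
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_vision_request(image_bytes, feature_types, max_results=5):
    """Build the JSON body for one image and a list of feature types."""
    return {
        "requests": [
            {
                # Image content is sent as a base64-encoded string.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [
                    {"type": feature_type, "maxResults": max_results}
                    for feature_type in feature_types
                ],
            }
        ]
    }


# Placeholder bytes stand in for the contents of a real image file.
body = build_vision_request(b"fake-image-bytes", ["LABEL_DETECTION", "FACE_DETECTION"])
payload = json.dumps(body)
```

Sending `payload` as the body of an HTTP POST to `VISION_ENDPOINT`, with an API key or OAuth token attached, is the step that would return annotations. The sketch stops before the network call so that it runs without credentials.
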
  34. Cloud Natural Language
  35. Extract entities, detect sentiment, analyze syntax, classify content
  36. Analyze syntax
    Text: The natural language API helps us understand text .
    Dependency parse label: det nn nn nsubj root nsubj ccomp dobj p
    Part of speech: DET NOUN NOUN NOUN VERB PRON VERB NOUN PUNCT
    Lemma: help
    Morphology: Number = SINGULAR, Proper = PROPER; Number = SINGULAR, Proper = PROPER; Number = SINGULAR, Proper = PROPER; Mood = INDICATIVE, Number = SINGULAR, Person = THIRD, Tense = Present; Case = ACCUSATIVE, Number = PLURAL, Person = FIRST; Number = SINGULAR
  37. Classify content
    Rafael Montero Shines in Mets’ Victory Over the Reds. Montero, who was demoted at midseason, took a one-hitter into the ninth inning as the Mets continued to dominate Cincinnati with a win at Great American Ball Park.
    { categories: [ { name: '/Sports/Team Sports/Baseball', confidence: 0.99 } ] }
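
To make the content classification example more concrete, here is a minimal Python sketch of the request body for the Natural Language API's `documents:classifyText` method, plus a helper that reads a response shaped like the one on the slide. The function names `build_classify_request` and `top_category` are hypothetical, and the sample response simply copies the slide's output rather than coming from a live call.

```python
# REST endpoint for the Natural Language classifyText method.
NL_ENDPOINT = "https://language.googleapis.com/v1/documents:classifyText"


def build_classify_request(text):
    """Build the JSON body for classifying a plain-text document."""
    return {"document": {"type": "PLAIN_TEXT", "content": text}}


def top_category(response):
    """Return (name, confidence) of the most confident category, or None."""
    categories = response.get("categories", [])
    if not categories:
        return None
    best = max(categories, key=lambda category: category["confidence"])
    return best["name"], best["confidence"]


article = "Rafael Montero Shines in Mets’ Victory Over the Reds."
request_body = build_classify_request(article)

# Response copied from the slide, not fetched from the API.
sample_response = {
    "categories": [{"name": "/Sports/Team Sports/Baseball", "confidence": 0.99}]
}
best_match = top_category(sample_response)
```
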
  38. Let’s see a demo!
  39. Calling the NL API from Node.js
    const gcloud = require('google-cloud');
    const language = gcloud.language();
    // Cloud Storage client, used to reference the input file
    const gcs = gcloud.storage();
    const bucket = gcs.bucket('my-bucket');
    const file = bucket.file('my-file');
    function callback(err, entities, apiResponse) {}
    language.annotate(file, callback);
  40. Use a pre-trained model to accomplish common ML tasks
    Cloud Vision, Cloud Translation, Cloud Natural Language, Cloud Speech, Cloud Video Intelligence
  41. What if we want to train these APIs on our own custom data?
  42. AutoML (alpha)
    Use your own data to customize a pre-trained API
  43. Let’s say I’m a meteorologist
  44. I want to predict weather trends and flight plans from images
  45. Can we use the cloud to analyze clouds?
  46. There are 10+ different types of clouds
  47. They all indicate different weather patterns
  48. Let’s try the Vision API
  49. AutoML Vision to the rescue
    AutoML Vision: Photo dataset → Train → Deploy → Serve
    Generate predictions with a REST API
  50. Who is using AutoML Vision?
    In the past, ZSL used to track animals and understand their lives by having a human review photos from cameras deployed in the wild. Now they can tag these pictures automatically, enabling a deeper understanding across wider geographies to help protect the world’s wildlife.
    Build vision models to annotate shop Disney’s products with Disney characters, product category, and dominant color. Annotations are being integrated into Disney’s search engine to help users get more relevant results and find their ideal products faster.
    Building a model to create a comprehensive set of product attributes to improve product recommendations, search results, and product filters. Recognizing nuanced characteristics like patterns and neckline styles.
  51. What if you have a custom prediction task specific to your dataset or use case?
  52. Custom text classification
  53. Custom text classification: javascript, c++, sql
  54. Given demographic data, can we predict voting trends?
  55. Given demographic data, can we predict voting trends? inputs → prediction
  56. Identifying the location of objects in an image
  57. Identifying the location of objects in an image: sara 97%, Kitten 99%
  58. Roll your own
    Build and train a custom model with your own data
  59. Build, train, and serve your own models
    TensorFlow, ML Engine
  60. Build, train, and serve your own models
    TensorFlow, ML Engine
  61. TensorFlow
    Created by Google Brain team
    Most popular ML project on GitHub
    Multiple deployment options
  62. TensorFlow Distributed Execution Engine
    CPU, GPU, Android, iOS, ...
    C++ Frontend, Python Frontend, ...
  63. TensorFlow Distributed Execution Engine
    CPU, GPU, Android, iOS, ...
    C++ Frontend, Python Frontend, ...
    tf.layers (build models)
  64. TensorFlow Distributed Execution Engine
    CPU, GPU, Android, iOS, ...
    C++ Frontend, Python Frontend, ...
    tf.layers (build models)
    Estimator, Keras
  65. TensorFlow Distributed Execution Engine
    CPU, GPU, Android, iOS, ...
    C++ Frontend, Python Frontend, ...
    tf.layers (build models)
    Estimator, Keras
    Pre-built Estimators (models in a box)
  66. TensorFlow Distributed Execution Engine
    CPU, GPU, Android, iOS, ...
    C++ Frontend, Python Frontend, ...
    Layers, Estimator, Keras, Pre-built Estimators
  67. Once you’ve got a TensorFlow model, what about… Training & Serving
  68. Cloud Machine Learning Engine (ML Engine)
    Fully managed platform for TensorFlow
    Distributed training with GPUs (and TPUs)
    Fast and scalable online/batch prediction
    cloud.google.com/ml-engine
  69. ML Engine in 3 steps
    Prepare your TensorFlow code and training data on Cloud Storage
    Run CMLE job with gcloud command
    Monitor the job on Cloud Console
  70. Example: building a Taylor Swift detector
  71. TensorFlow Object Detection to build the object recognition model
  72. ...and ML Engine to train, serve, and generate predictions
  73. TensorFlow Object Detection API
  74. TensorFlow Object Detection API
  75. Need a custom model but don’t have enough training data? Transfer learning
    Model trained on lots of data → Your data → Updated output using your training data
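
The transfer learning slide describes keeping a model trained on lots of data and updating only its output using your own training data. The toy Python sketch below illustrates that idea under heavy assumptions: `BASE_WEIGHTS` is an invented stand-in for a pre-trained feature extractor, the four labeled examples are made up, and the new output layer is a plain logistic regression. It is not the Object Detection API workflow from the talk, only an illustration of freezing a base and training a new head.

```python
import math

# Stand-in for a model "trained on lots of data". These weights are invented
# for illustration and are never modified by the training loop below.
BASE_WEIGHTS = [[1.0, -1.0], [0.5, 0.5]]


def base_features(x):
    """Frozen layer: linear projection of the raw input followed by a ReLU."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in BASE_WEIGHTS]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_head(examples, labels, epochs=200, lr=0.5):
    """Train only the new output layer on top of the frozen features."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(examples, labels):
            f = base_features(x)
            p = sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b)
            err = p - y  # gradient of the log loss with respect to the logit
            w = [wi - lr * err * fi for wi, fi in zip(w, f)]
            b -= lr * err
    return w, b


def predict(x, w, b):
    """Probability of class 1 for a raw input x."""
    f = base_features(x)
    return sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b)


# "Your data": a handful of invented labeled examples.
examples = [[2.0, 0.0], [3.0, 1.0], [0.0, 2.0], [1.0, 3.0]]
labels = [1, 1, 0, 0]
head_w, head_b = train_head(examples, labels)
```

Only `head_w` and `head_b` change during training, which is why a small dataset can be enough: the frozen base already supplies the features.
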
  76. How does it all fit together?
    Learn more: bit.ly/swift-detector
  77. What if we have a custom task and enough training data to build + train our model from scratch?
  78. Custom model: build from scratch
    Training data, model code, serving infrastructure, prediction code, time
  79. Example: a classification model to predict voting trends
    There are ~3,000 counties in the US
    Can we use demographic data to predict which way a county will vote?
  80. Get the data → Kaggle
  81. Select features from BigQuery
  82. From this... inputs → prediction
  83. ...to this: inputs → prediction
  84. Preprocess our input data
    import pandas as pd

    df = pd.read_csv('data.csv', header=None)
    new_rows = []
    for idx, row in df.iterrows():
        # Convert % to binary classes: 0 = Trump, 1 = Clinton
        classification_val = 0
        if row[7] >= .50:
            classification_val = 1
        # Copy the row's values into a NumPy array (as_matrix() in older pandas)
        classification_row = row.to_numpy().copy()
        classification_row[7] = classification_val
        new_rows.append(classification_row)
    # Write new_rows to new csv file
  85. Define feature columns
    feature_names = [
        'PctUnder18', 'PctOver65', 'PctFemale', 'PctWhite',
        'PctBachelors', 'PctDem', 'PctGop'
    ]
    feature_columns = [tf.feature_column.numeric_column(k) for k in feature_names]
  86. Define our input function
    def my_input_fn(file_path, repeat_count):
        def decode_csv(line):
            features = tf.decode_csv(line, [[0.],[0.],[0.],[0.],[0.],[0.],[0.],[0.]])
            label = features[-1:]
            del features[-1]
            return dict(zip(feature_names, features)), label

        dataset = (tf.data.TextLineDataset(file_path)
            .map(decode_csv)
            .shuffle(buffer_size=256)
            .repeat(repeat_count)
            .batch(8))
        iterator = dataset.make_one_shot_iterator()
        batch_features, batch_labels = iterator.get_next()
        return batch_features, batch_labels
  88. Define our input function
    def my_input_fn(file_path, repeat_count):
        def decode_csv(line):
            features = tf.decode_csv(line, [[0.],[0.],[0.],[0.],[0.],[0.],[0.],[0.]])
            label = features[-1:]
            del features[-1]
            return dict(zip(feature_names, features)), label

        dataset = (tf.data.TextLineDataset(file_path)
            .map(decode_csv)
            .shuffle(buffer_size=256)
            .repeat(repeat_count)
            .batch(16))
        iterator = dataset.make_one_shot_iterator()
        batch_features, batch_labels = iterator.get_next()
        return batch_features, batch_labels
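
For readers without TensorFlow at hand, the following plain-Python sketch mimics what the input function does to each CSV line: parse eight floats, treat the last one as the label, pair the rest with `feature_names`, and group the results into batches. The helpers `decode_line` and `batch` and the three sample rows are invented for illustration; shuffling and repeating are left out, and the real pipeline produces tensors rather than Python lists.

```python
feature_names = [
    "PctUnder18", "PctOver65", "PctFemale", "PctWhite",
    "PctBachelors", "PctDem", "PctGop",
]


def decode_line(line):
    """Split one CSV line into a feature dict and a one-element label list."""
    values = [float(v) for v in line.split(",")]
    label = values[-1:]      # the last column is the label
    features = values[:-1]   # the remaining seven columns are features
    return dict(zip(feature_names, features)), label


def batch(items, batch_size):
    """Group items into consecutive batches of at most batch_size."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


# Invented rows: seven feature values followed by a 0/1 label.
lines = [
    "0.23,0.15,0.51,0.80,0.30,0.40,0.60,0",
    "0.20,0.12,0.52,0.45,0.45,0.70,0.30,1",
    "0.25,0.18,0.50,0.90,0.20,0.30,0.70,0",
]
examples = [decode_line(line) for line in lines]
batches = batch(examples, 2)
```
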
  90. Create a classification model
    classifier = tf.estimator.LinearClassifier(
        feature_columns=feature_columns
    )
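
As a rough illustration of what a linear classifier computes for a two-class problem, the sketch below takes a weighted sum of the numeric features, adds a bias, and passes the result through a sigmoid to get the probability of class 1. The weights, bias, and county values are invented, and `predict_proba` is a hypothetical helper, not part of the Estimator API; a trained `LinearClassifier` would learn its own weights from the data.

```python
import math


def predict_proba(features, weights, bias):
    """Return [P(class 0), P(class 1)] for one example."""
    logit = bias + sum(weights[name] * value for name, value in features.items())
    p_class_1 = 1.0 / (1.0 + math.exp(-logit))
    return [1.0 - p_class_1, p_class_1]


# Invented weights and inputs, for illustration only.
weights = {"PctBachelors": 4.0, "PctOver65": -2.0}
bias = -1.5
county = {"PctBachelors": 0.20, "PctOver65": 0.18}
probabilities = predict_proba(county, weights, bias)
```
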
  91. Run training & evaluation
    # Run training for 10 epochs
    classifier.train(
        input_fn=lambda: my_input_fn(train_file, 10))
    # Evaluate the accuracy of our trained model
    results = classifier.evaluate(
        input_fn=lambda: my_input_fn(test_file, 1))
  92. Run training & evaluation
    # Evaluation output
    accuracy: 0.965686
    auc_precision_recall: 0.942694
    average_loss: 0.110322
    label/mean: 0.111111
    prediction/mean: 0.130689
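
One way to put the accuracy figure in context is to compare it with a majority-class baseline. Assuming `label/mean` is the average of the 0/1 labels in the evaluation set (so roughly 11% of examples carry label 1), a model that always predicts class 0 would already be right about 89% of the time. The short calculation below works this out from the reported numbers; the variable names are illustrative.

```python
# Values copied from the evaluation output.
accuracy = 0.965686
label_mean = 0.111111  # assumed to be the fraction of examples with label 1

# Accuracy of always predicting the more common class.
majority_baseline = max(label_mean, 1.0 - label_mean)

# Improvement of the trained model over that baseline.
improvement = accuracy - majority_baseline

# Share of the baseline's errors that the model removes.
error_reduction = improvement / (1.0 - majority_baseline)
```

Under that assumption the model is about 7.7 percentage points above the baseline, which corresponds to removing roughly 69% of the baseline's errors.
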
  93. Generate predictions
    predict_results = classifier.predict(
        input_fn=lambda: my_input_fn(test_file, 4)
    )
  94. Generate predictions
    {
        'probabilities': array([0.99163872, 0.00836132], dtype=float32),
        'class_ids': array([0]),
        'classes': array([b'0'], dtype=object)
    }
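
The prediction dictionary can be read as follows: `probabilities` holds one value per class, and `class_ids` is the index of the largest one. The sketch below uses plain Python lists in place of NumPy arrays and maps the class index back to the labels from the preprocessing comment (0 = Trump, 1 = Clinton). `CLASS_NAMES` and `describe` are hypothetical names introduced for illustration.

```python
# Mapping taken from the preprocessing step: 0 = Trump, 1 = Clinton.
CLASS_NAMES = {0: "Trump", 1: "Clinton"}

# Values copied from the slide, with arrays replaced by plain lists.
prediction = {
    "probabilities": [0.99163872, 0.00836132],
    "class_ids": [0],
}


def describe(prediction):
    """Return (class name, probability) for the most likely class."""
    probs = prediction["probabilities"]
    class_id = max(range(len(probs)), key=probs.__getitem__)
    return CLASS_NAMES[class_id], probs[class_id]


predicted_name, predicted_probability = describe(prediction)
```
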
  96. Let’s see it in action!
  97. If you remember 3 things from this presentation
  98. Use a pre-trained API to accomplish common ML tasks like image analysis, NLP, or translation
  99. To build an image classification API trained on your own data, use AutoML Vision
  100. For custom tasks, build a TensorFlow model with your data. Train and serve it on ML Engine.
  101. Resources
    ML APIs
      Video: cloud.google.com/video-intelligence
      Vision: cloud.google.com/vision
      Speech: cloud.google.com/speech
      Natural Language: cloud.google.com/natural-language
      API overview talk: bit.ly/ml-apis-overview
    AutoML
      Sign up for the alpha: cloud.google.com/automl
      Announcement blog post: bit.ly/announcing-automl
      Intro video: bit.ly/automl-intro-video
      Podcast: bit.ly/automl-podcast
    Transfer learning
      Object Detection: bit.ly/tf-obj-detection
      T-Swift detector blog: bit.ly/swift-detector
      Pet detector blog: bit.ly/pet-model-ml-engine
    Custom model
      TensorFlow: tensorflow.org
      ML Engine: cloud.google.com/ml-engine
      TF Estimators: bit.ly/tf-estimators
      Keras blog post: bit.ly/text-classification-keras
  102. Thank you
    @SRobTweets, medium.com/@SRobTweets