From Zero to ML on Google Cloud Platform

sararob
February 10, 2018

Everyone’s talking about machine learning, but we hear much less about how to put it into practice. And let’s face it, that can be daunting! Even just 10 years ago you needed access to extensive academic and computing resources to make use of machine learning. Fast forward to today and we’ve seen revolutionary changes in the hardware and software that are making ML accessible for any developer or data scientist. Whether you’re completely new to ML or you’ve already trained and deployed your own model from scratch, Google Cloud Platform has a variety of tools to help you start using ML right now. I’ll start with the basics: how to use a pre-trained ML model with one REST API call. Then you’ll learn how to use your own dataset to customize a pre-trained model with transfer learning. We’ll end by learning how to build your own model from scratch with TensorFlow, and how to train and serve it in the cloud with GCP.
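
As a rough sketch of what "one REST API call" can look like, the snippet below builds the JSON body for a label-detection request to the Cloud Vision `images:annotate` endpoint. The helper name `build_label_request`, the placeholder image bytes, and the `max_results` default are illustrative assumptions rather than anything from the talk. Actually sending the request also needs an API key or OAuth token, which is left out here.

```python
import base64
import json

# Cloud Vision REST endpoint for annotating images.
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_label_request(image_bytes, max_results=5):
    """Return the JSON body for one LABEL_DETECTION request (hypothetical helper)."""
    body = {
        "requests": [
            {
                # The REST API expects image content as base64-encoded text.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": "LABEL_DETECTION", "maxResults": max_results}],
            }
        ]
    }
    return json.dumps(body)


if __name__ == "__main__":
    # Placeholder bytes stand in for a real image file.
    payload = build_label_request(b"not-a-real-image")
    print("POST", VISION_ENDPOINT)
    print(payload)
```

The body would be sent as a single HTTP POST to the endpoint; the response carries the detected labels for each request in the `requests` array.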

Transcript

  1. Confidential & Proprietary ML is...

    Solving problems without explicitly knowing the solution
    Loosely based on how the human brain learns
    Enable systems that improve over time
    @SRobTweets
  2. How do we get from input to prediction?

    “cat”
  4. How do we get from input to prediction?

    sports baseball
  5. How do we get from input to prediction?

    sports baseball nytimes
  6. How do we get from input to prediction?

    mountain pass nature birthday cake family
  7. What type of ML problem are you solving?

    Generic task: someone else has solved this before
    Custom task: specific to your dataset
  8. What type of ML problem are you solving?

    Generic task: someone else has solved this before
    Custom task: specific to your dataset
    Example: image classification (“cat”, “bob”)
  9. What type of ML problem are you solving?

    Generic task: someone else has solved this before
    Custom task: specific to your dataset
    Example: NLP (verb, noun, pronoun; programming, Google Cloud)
  10. From zero… to ML

    Machine Learning APIs, AutoML, Custom model
    App developers; Data scientists & ML practitioners
  11. What resources do you need to solve an ML problem?

    Resources: training data, model code, training + serving infrastructure, prediction code, time
    Approaches: custom model (build from scratch), custom model (transfer learning), AutoML, ML APIs
  12. Use a pre-trained model to accomplish common ML tasks

    Cloud Vision, Cloud Translation, Cloud Natural Language, Cloud Speech, Cloud Video Intelligence
  13. Let’s look at Cloud Vision & NL

    Cloud Vision, Cloud Translation, Cloud Natural Language, Cloud Speech, Cloud Video Intelligence
  14. Label & web detection; OCR; Logo detection; Explicit content detection; Crop hints; Landmark detection
  15. Cloud Vision in production: GIPHY

    For more insight visit: Engineering.giphy.com @GIPHYEng
  16. Calling the Vision API from Node.js

          const gcloud = require('google-cloud');
          const vision = gcloud.vision();

          const types = [ 'face', 'label' ];

          vision.detect('image.jpg', types, function(err, detections, apiResponse) {
            // detections = {
            //   faces: [...],
            //   labels: [...]
            // }
          });
  17. Analyze syntax

    Sentence: The natural language API helps us understand text .
    Dependency parse label: det nn nn nsubj root nsubj ccomp dobj p
    Part of speech: DET NOUN NOUN NOUN VERB PRON VERB NOUN PUNCT
    Lemma: help
    Morphology: Number = SINGULAR, Proper = PROPER; Number = SINGULAR, Proper = PROPER; Number = SINGULAR, Proper = PROPER; Mood = INDICATIVE, Number = SINGULAR, Person = THIRD, Tense = Present; Case = ACCUSATIVE, Number = PLURAL, Person = FIRST; Number = SINGULAR
  18. Classify Content

    Rafael Montero Shines in Mets’ Victory Over the Reds. Montero, who was demoted at midseason, took a one-hitter into the ninth inning as the Mets continued to dominate Cincinnati with a win at Great American Ball Park.

          { categories:
            [ { name: '/Sports/Team Sports/Baseball',
                confidence: 0.99 } ] }
  19. Calling the NL API from Node.js

          const gcloud = require('google-cloud');
          const language = gcloud.language();
          // Cloud Storage client (assumed; needed for gcs.bucket below)
          const gcs = gcloud.storage();

          const bucket = gcs.bucket('my-bucket');
          const file = bucket.file('my-file');

          function callback(err, entities, apiResponse) {}
          language.annotate(file, callback);
  20. Use a pre-trained model to accomplish common ML tasks

    Cloud Vision, Cloud Translation, Cloud Natural Language, Cloud Speech, Cloud Video Intelligence
  21. What if we want to train these APIs on our own custom data?
  22. AutoML Vision to the rescue

    AutoML Vision: Photo dataset, Train, Deploy, Serve
    Generate predictions with a REST API
  23. Who is using AutoML Vision?

    In the past, ZSL used to track animals and understand their lives by having a human review photos from cameras deployed in the wild. Now they can tag these pictures automatically, enabling a deeper understanding across wider geographies to help protect the world’s wildlife.

    Build vision models to annotate shop Disney’s products with Disney characters, product category, and dominant color. Annotations are being integrated into Disney’s search engine to help users get more relevant results and find their ideal products faster.

    Building a model to create a comprehensive set of product attributes to improve product recommendations, search results, and product filters. Recognizing nuanced characteristics like patterns and neckline styles.
  24. What if you have a custom prediction task specific to your dataset or use case?
  25. TensorFlow

    Created by Google Brain team
    Most popular ML project on GitHub
    Multiple deployment options
  26. TensorFlow Distributed Execution Engine (CPU, GPU, Android, iOS, ...)

    C++ Frontend, Python Frontend, ...
  27. TensorFlow Distributed Execution Engine (CPU, GPU, Android, iOS, ...)

    C++ Frontend, Python Frontend, ...
    tf.layers (build models)
  28. TensorFlow Distributed Execution Engine (CPU, GPU, Android, iOS, ...)

    C++ Frontend, Python Frontend, ...
    tf.layers (build models)
    Estimator, Keras
  29. TensorFlow Distributed Execution Engine (CPU, GPU, Android, iOS, ...)

    C++ Frontend, Python Frontend, ...
    tf.layers (build models)
    Estimator, Keras
    Pre-built Estimators (models in a box)
  30. TensorFlow Distributed Execution Engine (CPU, GPU, Android, iOS, ...)

    C++ Frontend, Python Frontend, ...
    Layers
    Estimator, Keras
    Pre-built Estimators
  31. Cloud Machine Learning Engine (ML Engine)

    Fully managed platform for TensorFlow
    Distributed Training with GPUs (and TPUs)
    Fast and scalable online/batch prediction
    cloud.google.com/ml-engine
  32. ML Engine in 3 steps

    Prepare your TensorFlow code and training data on Cloud Storage
    Run CMLE job with gcloud command
    Monitor the job on Cloud Console
  33. Need a custom model but don’t have enough training data?

    Transfer learning
    Model trained on lots of data
    Your data
    Updated output using your training data
  34. How does it all fit together?

    Learn more: bit.ly/swift-detector
  35. What if we have a custom task and enough training data to build + train our model from scratch?
  36. Custom model: build from scratch

    Training data, Model code, Serving infrastructure, Prediction code, Time
  37. There are ~3,000 counties in the US

    Can we use demographic data to predict which way a county will vote?
    Example: a classification model to predict voting trends
  38. Preprocess our input data

          import pandas as pd

          df = pd.read_csv('data.csv', header=None)
          new_rows = []  # collects the relabelled rows

          for idx, row in df.iterrows():
              # Convert % to binary classes: 0 = Trump, 1 = Clinton
              classification_val = 0
              if row[7] >= .50:
                  classification_val = 1
              # Copy the row's values so the label column can be overwritten
              classification_row = row.values.copy()
              classification_row[7] = classification_val
              new_rows.append(classification_row)

          # Write new_rows to new csv file
  39. Define feature columns

          import tensorflow as tf

          feature_names = [
              'PctUnder18', 'PctOver65', 'PctFemale', 'PctWhite',
              'PctBachelors', 'PctDem', 'PctGop'
          ]

          # One numeric feature column per demographic field
          feature_columns = [tf.feature_column.numeric_column(k) for k in feature_names]
  40. Define our input function

          def my_input_fn(file_path, repeat_count):
              def decode_csv(line):
                  # Parse 8 float columns: 7 features followed by the label
                  features = tf.decode_csv(line, [[0.],[0.],[0.],[0.],[0.],[0.],[0.],[0.]])
                  label = features[-1:]
                  del features[-1]
                  return dict(zip(feature_names, features)), label

              dataset = (tf.data.TextLineDataset(file_path)
                  .map(decode_csv)
                  .shuffle(buffer_size=256)
                  .repeat(repeat_count)
                  .batch(8))
              iterator = dataset.make_one_shot_iterator()
              batch_features, batch_labels = iterator.get_next()
              return batch_features, batch_labels
  42. Define our input function

          def my_input_fn(file_path, repeat_count):
              def decode_csv(line):
                  # Parse 8 float columns: 7 features followed by the label
                  features = tf.decode_csv(line, [[0.],[0.],[0.],[0.],[0.],[0.],[0.],[0.]])
                  label = features[-1:]
                  del features[-1]
                  return dict(zip(feature_names, features)), label

              dataset = (tf.data.TextLineDataset(file_path)
                  .map(decode_csv)
                  .shuffle(buffer_size=256)
                  .repeat(repeat_count)
                  .batch(16))
              iterator = dataset.make_one_shot_iterator()
              batch_features, batch_labels = iterator.get_next()
              return batch_features, batch_labels
  44. Run training & evaluation

          # Run training for 10 epochs
          classifier.train(
              input_fn=lambda: my_input_fn(train_file, 10))

          # Evaluate the accuracy of our trained model
          results = classifier.evaluate(
              input_fn=lambda: my_input_fn(test_file, 1))
  45. Run training & evaluation

          # Evaluation output
          accuracy: 0.965686
          auc_precision_recall: 0.942694
          average_loss: 0.110322
          label/mean: 0.111111
          prediction/mean: 0.130689
  46. Generate predictions

          {
            'probabilities': array([ 0.99163872, 0.00836132], dtype=float32),
            'class_ids': array([0]),
            'classes': array([b'0'], dtype=object)
          }
  48. Use a pre-trained API to accomplish common ML tasks like image analysis, NLP, or translation
  49. To build an image classification API trained on your own data, use AutoML Vision
  50. For custom tasks, build a TensorFlow model with your data. Train and serve it on ML Engine.
  51. ML APIs

    Video: cloud.google.com/video-intelligence
    Vision: cloud.google.com/vision
    Speech: cloud.google.com/speech
    Natural Language: cloud.google.com/natural-language
    API overview talk: bit.ly/ml-apis-overview

    AutoML

    Sign up for the alpha: cloud.google.com/automl
    Announcement blog post: bit.ly/announcing-automl
    Intro video: bit.ly/automl-intro-video
    Podcast: bit.ly/automl-podcast

    Transfer learning

    Object Detection: bit.ly/tf-obj-detection
    T-Swift detector blog: bit.ly/swift-detector
    Pet detector blog: bit.ly/pet-model-ml-engine

    Custom model

    TensorFlow: tensorflow.org
    ML Engine: cloud.google.com/ml-engine
    TF Estimators: bit.ly/tf-estimators
    Keras blog post: bit.ly/text-classification-keras
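
The preprocessing step from slide 38 (turning a vote-share column into a binary class label) can also be sketched without pandas. The function name `binarize_label`, the made-up sample rows, and the default of treating the last column as the label are illustrative assumptions; only the 0.50 threshold follows the slide.

```python
import csv
import io


def binarize_label(rows, label_index=-1, threshold=0.50):
    """Replace the vote-share column with 1 if it is >= threshold, else 0."""
    converted = []
    for row in rows:
        new_row = list(row)
        new_row[label_index] = 1 if float(row[label_index]) >= threshold else 0
        converted.append(new_row)
    return converted


if __name__ == "__main__":
    # Made-up rows: two feature columns followed by a vote share.
    sample = io.StringIO("0.22,0.15,0.61\n0.25,0.12,0.38\n")
    rows = list(csv.reader(sample))
    out = io.StringIO()
    csv.writer(out).writerows(binarize_label(rows))
    print(out.getvalue())
```

Keeping the relabelling in a small pure function makes the threshold easy to test separately from file reading and writing.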