GDG Google Cloud AI/ML Next Ext 19

Google Cloud AI Platform
Carlos Timoteo, Data Scientist
May 2019

• Years in business: 22
• Experts in 35 countries: 400+
• Clients globally: 350+
pythian.com/googlecloudplatform
© The Pythian Group Inc., 2019

Google AI Building Blocks

Building blocks deliver a depth of functionality
Sight: Cloud Vision, Cloud Video Intelligence, Cloud AutoML Vision
• Understand the content of an image
• Classify images into categories
• Detect individual objects and faces within images
• Make videos searchable and discoverable
Conversation: Cloud Speech-to-Text, Dialogflow Enterprise Edition, Cloud Text-to-Speech
• Convert real-time streaming or pre-recorded audio to text
• Synthesize natural-sounding speech with 30+ voices
• Synthesize speech in multiple languages and variants
• Create conversational experiences across devices and platforms
Language: Cloud Translation, Cloud Natural Language, AutoML Natural Language, AutoML Translation
• Extract information from unstructured text
• Reveal structure and meaning of text
• Translate dynamically between languages

[Diagram: the Cloud AI stack, ordered from minimal ML expertise to deep ML expertise]
• Cloud AI solutions: Cloud job discovery, Contact center, Document understanding
• ML professionals & service partners: ASL, Professional services organization
• Cloud AI building blocks
  • Vision: Cloud Vision, Cloud Video Intelligence, Cloud AutoML Vision
  • Language: Cloud Natural Language, Cloud AutoML NL, Cloud Translation, Cloud AutoML Translation
  • Conversation: Cloud Speech-to-Text, Dialogflow Enterprise, Cloud Text-to-Speech
• Cloud AI Platform: Cloud ML Engine, Cloud Dataflow, Cloud Dataproc
• ML accelerators: Cloud GPU, Cloud TPU
• ML libraries: TensorFlow, Kubeflow
• Datasets: Kaggle/datasets
• Data Analytics and ML: Google BigQuery, BigQuery ML, Cloud Dataprep, Google Data Studio

End-to-end ML pipeline
1. Data ingestion
2. Data analysis
3. Data transformation
4. Trainer
5. Model evaluation
6. Model validation
7. Serving
Services shown: Pub/Sub, Data Studio, Datalab, Dataproc, Dataflow, Dataprep, BigQuery

New deep learning VM image: easier and faster ML on GCE
• Fast prototyping: Prototype your ML project quickly with pre-configured VMs for deep learning.
• CPU, GPU and TPU support: Choose to add the latest Cloud TPU or GPUs on Google Cloud to your instance in a single click and accelerate your model training jobs.
• Performance optimized for Google Cloud: We tune the libraries and config to get the optimal performance on our infrastructure, so you don’t need to worry about it.
• Flexibility: Choose between different ML frameworks like TensorFlow, PyTorch, and scikit-learn or install your own on top of our common base image.

Fully managed notebook with configured environments
• Pre-installed GCP client libraries
• Pre-configured environments
[Diagram: the user reaches Jupyter from the Cloud Console through a proxy; in the user's project, a subnetwork hosts Deep Learning VM1, VM2 and VM3, each running a proxy agent.]

Cloud Dataflow
The fully-managed data processing service that simplifies development and management of stream and batch pipelines
• Accelerate development for streaming & batch: Fast, simplified data pipeline development via expressive Java and Python APIs in the Apache Beam SDK.
• Simplified management and operations: Remove operational overhead by letting Cloud Dataflow auto-manage performance, scaling, availability, security and compliance.
• Build on a foundation for machine learning: Add TensorFlow-based Cloud Machine Learning models and APIs to your data processing pipelines for real-time predictions.

Data Labeling
• Get high quality training data: Supports the most popular use cases of image, video, audio, and text annotation with quality assurance.
• Choose with confidence and ease: Allows you to create a plan for our qualified, trusted human labelers to annotate your data with instructions and examples.
• Work seamlessly: Import your labeled data to AutoML to continue with your machine learning development.

Cloud ML Engine
Pipeline stages: 1. Data ingestion, 2. Data analysis, 3. Data transformation, 4. Trainer, 5. Model evaluation, 6. Model validation, 7. Serving
• Managed service to make training & prediction easy
• Easy distributed training
• Hyperparameter tuning
• Top 4 frameworks
• Custom container support coming soon
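
The hyperparameter tuning bullet can be illustrated with a small sketch. This is not the service's tuning algorithm or API: it swaps in plain random search, and the objective function, parameter names and ranges below are hypothetical stand-ins for a real training job.

```python
import random


def validation_loss(learning_rate, batch_size):
    """Hypothetical stand-in for a real training run; lower is better."""
    return (learning_rate - 0.01) ** 2 + ((batch_size - 64) / 1000) ** 2


def random_search(num_trials, seed=0):
    """Sample hyperparameters at random and keep the best trial."""
    rng = random.Random(seed)
    best = None
    for trial in range(num_trials):
        params = {
            # Log-uniform sample between 1e-4 and 1e-1 (assumed range).
            "learning_rate": 10 ** rng.uniform(-4, -1),
            "batch_size": rng.choice([16, 32, 64, 128]),
        }
        loss = validation_loss(**params)
        if best is None or loss < best["loss"]:
            best = {"trial": trial, "loss": loss, **params}
    return best


best = random_search(num_trials=50)
print(best)
```

A managed tuner runs each trial as a separate training job and may use a smarter search strategy; the loop above only shows the shape of the problem: propose parameters, measure a metric, keep the best.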

Built-in models in AI Platform
• Integrated hyperparameter tuning
• Support for popular algorithms
• Discoverable via AI Hub
• Easy to add new algorithms
Algorithm as a container

Serverless training using AI Platform
• Train models without finagling with infrastructure
• Supports all popular data science and machine learning frameworks. You can even run your own Docker container
• Leverage distributed training on the latest GPUs and TPUs to finish jobs faster
• Improve your model quality with the state-of-the-art automated hyperparameter tuning

Deploy your model with ease
• Set up online endpoints for low-latency predictions, or get predictions on massive batches of data
• Deploy models trained on premises or on Google Cloud
• Scale automatically based on your traffic
• Use GPUs for faster predictions
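
As a rough sketch of what calling an online endpoint involves, the snippet below builds (but does not send) an HTTP request in the shape used by the AI Platform online prediction REST method, which takes a JSON body with an `instances` list. The project ID, model name, instance values and access token are hypothetical placeholders, and the URL pattern should be checked against the current API reference before use.

```python
import json
import urllib.request


def build_predict_request(project, model, instances, access_token):
    """Build an online prediction request without sending it."""
    url = f"https://ml.googleapis.com/v1/projects/{project}/models/{model}:predict"
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical project, model and token, for illustration only.
request = build_predict_request(
    "my-project", "my_model", [[1.0, 2.0, 3.0]], "ACCESS_TOKEN"
)
print(request.full_url)
print(request.data.decode("utf-8"))
```

Sending the request would be a matter of passing it to `urllib.request.urlopen` with a valid token; that step is left out so the sketch runs without credentials.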

Increasing complexity and compute needs

Cloud TPU
• Built for AI on Google Cloud
• Custom ASIC by Google to train and execute deep neural networks
• Fast, iterative development
• Offers proven, Google-qualified reference models, optimized for performance, accuracy, and quality

What is available on Google Cloud?
• Cloud TPU v2: 180 teraflops, 64 GB HBM, training and inference
• Cloud TPU v2 Pod: 11.5 petaflops, 4 TB HBM, 2-D toroidal mesh network, training and inference
• Cloud TPU v3: 420 teraflops, 128 GB HBM, training and inference
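
A quick arithmetic check relates the figures on this slide. It assumes the pod numbers are simple multiples of the single-device numbers and that 1 TB is counted as 1024 GB; both are assumptions, not statements from the slide.

```python
# Figures copied from the slide.
TPU_V2_TERAFLOPS = 180
TPU_V2_HBM_GB = 64
TPU_V2_POD_PETAFLOPS = 11.5
TPU_V2_POD_HBM_TB = 4
TPU_V3_TERAFLOPS = 420
TPU_V3_HBM_GB = 128

# How many v2 devices would add up to one pod, by compute and by memory.
devices_by_compute = TPU_V2_POD_PETAFLOPS * 1000 / TPU_V2_TERAFLOPS
devices_by_memory = TPU_V2_POD_HBM_TB * 1024 / TPU_V2_HBM_GB

# How v3 compares with v2 on the same two axes.
v3_compute_ratio = TPU_V3_TERAFLOPS / TPU_V2_TERAFLOPS
v3_memory_ratio = TPU_V3_HBM_GB / TPU_V2_HBM_GB

print(f"v2 devices per pod (compute): {devices_by_compute:.1f}")
print(f"v2 devices per pod (memory):  {devices_by_memory:.1f}")
print(f"v3 vs v2 compute: {v3_compute_ratio:.2f}x, memory: {v3_memory_ratio:.2f}x")
```

Both estimates land at about 64 devices per v2 pod, and v3 offers roughly 2.3 times the compute and twice the memory of v2 per device.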

Cloud AutoML: creating ML solutions
AutoML workflow: Dataset, Train, Deploy, Serve
Generate predictions with a REST API

Higher model quality
• AutoML technology beating ImageNet
• Optimally trading off size for accuracy
[Plot: accuracy versus model size (millions of params)]

Cloud AutoML: AutoML Natural Language
• Upload and label text (example labels: Sports, Lifestyle, Money, Tech, Travel)
• Train your model
• Evaluate
Model is now trained and ready to make predictions. This model can scale as needed to adapt to customer demands.

Cloud AutoML: AutoML Translation
• Upload translated language pairs to train your own custom model
• Train your model
• Evaluate
Model is now trained and ready to make predictions. This model can scale as needed to adapt to customer demands.

AI Platform

Introducing AI Platform
• End-to-end, code-based development environment for AI inside GCP console
• Built on Kubeflow, Google’s open-source project, offers an integrated tool chain from data engineering to model deployment with “no lock-in”
• Allows you to run on-premises or on Google Cloud without significant code changes.
• Access to cutting-edge Google AI technology like TensorFlow, TPUs, and TFX tools as you deploy your AI applications to production.

What is included?
• AI Platform: Deep Learning VM Images, Notebooks, Data Labeling, Training, Predictions, Built-in Algorithms, Kubeflow (on premises), AI Hub
• Integrated with:
  • Cloud Dataflow, for data transformation
  • Cloud Dataproc, for Hadoop and Spark clusters
  • Google BigQuery, for data warehousing
  • Cloud Dataprep, for data cleansing
  • Google Data Studio, for BI dashboards

On-premises training using Kubeflow
• Infrastructure abstraction: Kubernetes manages all underlying dependencies and resources
• Swappable & scalable: Library of ML microservices to deploy training and prediction jobs
• Run where you want: GCP or on-premises Kubernetes
[Diagram: ML microservices for training and prediction, running both in the cloud and on premises]

Manage and analyze models
• Offers integrated TFX tools to analyze models for bias and drift
• Provides a visual interface to compare different versions of a model
  • Plot metrics side by side or in the same chart
  • Load TF metrics in same TensorBoard instance
  • View diff configs, parameters and even code changes from the console

Fairing - Build, train, and deploy anywhere
[Diagram: on premises, Kubeflow provides Kubeflow Notebooks, Training and Prediction; on GCP, AI Platform provides AI Platform Notebooks, Training and Prediction.]

Getting things done with AI Platform: build, train, serve models

Open a new notebook instance

Or discover notebooks from AI Hub

Use AI Platform Notebooks

Import your data from GCS

Use an existing model from AI Hub

Train the notebook anywhere: on-premises or Google Cloud
Change this to train the same notebook on-premises using Kubeflow, or on Google Cloud using AI Platform

Serve the model with ease

AI Hub

AI Hub (Beta)
1. One-stop AI catalog: Easily discover plug & play pipelines & other content built by Google AI and partners.
2. Private hosting: Host pipelines and ML content with private sharing controls within an enterprise to foster reuse within organizations.
3. Easy deployment on GCP and hybrid: Deploy pipelines via Kubeflow on GCP and on premises.

AI Hub: public content + private content
• By Google: Unique AI assets by Google (AutoML, TPUs, Cloud AI Platform, etc.)
• By partners: Created, shared & monetized by anyone.
• By customers: Content shared securely within and with other organizations.

AI Hub: one-stop AI catalog for enterprises
• All users: business analysts, developers, ML engineers, data scientists, data engineers, end users
• Different development paths: ML APIs, AutoML, ML Platform
• Multiple ML frameworks: TensorFlow, XGBoost, SKLearn, PyTorch, Keras
• Entire AI workflows: data ingestion, data analysis, data transformation, training, evaluation and validation, serving
• Content: pipelines, notebooks, TF modules, AI services, deep learning VM images, datasets, etc.

Data and AI Key Announcements

NEXT’19 Launch
• Dataflow Streaming Engine (GA)
  Almost every streaming Cloud Dataflow pipeline needs to shuffle data and store time window state. Currently, Cloud Dataflow performs these operations on worker virtual machines and uses attached Persistent Disks for window state and shuffle storage. The new opt-in Streaming Engine (also now available in beta) moves these operations from worker VMs into a Cloud Dataflow backend service, leading to several improvements.
• Dataflow FlexRS (Beta)
  FlexRS will be a scheduling/pricing option for Dataflow that will offer a lower price to batch processing users whose jobs are not time sensitive (i.e., get it done before AM). By giving them a guarantee over a broader window, we'll be able to lower their costs.
• Dataflow SQL Pipelines (Public Alpha)
  Dataflow SQL will enable Dataflow processing pipelines to be created and activated via simple SQL statements. Integration with BQ will allow SQL users, often those who work within the EDW space, to activate these pipelines directly within the BQ editor.
• NVIDIA T4 GPU for GCE (GA)
  Launch of a new NVIDIA Tensor Core GPU, the Tesla T4, on GCE. This GPU will target ML inference, ML training (entry level pricing), and visualization with NVIDIA's new ray tracing feature. Launching in 8 locations for GA.
• Connected sheets in Sheets (beta*)
  Introducing the datasheet (beta coming soon): a new type of sheet that activates only when using the Sheets data connector, allowing users to access, analyze, visualize and collaborate on up to 10 billion rows of BigQuery data, with no SQL scripts needed. Datasheets make it easier for Sheets users to surface insights without needing the help of a BigQuery expert, saving time for both users and experts. Users can then make sense of that data with spreadsheet formulas or perform deeper analysis with features like Explore, pivot tables and charts. It’s that easy.
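
To make "shuffle data and store time window state" in the Streaming Engine item concrete, here is a minimal pure-Python sketch of fixed (tumbling) windows. It is not the Apache Beam or Dataflow API; the event tuples and the 60-second window size are hypothetical, and the in-memory dictionary stands in for the window state that Dataflow keeps on worker disks or, with Streaming Engine, in its backend service.

```python
from collections import defaultdict


def fixed_window_counts(events, window_seconds):
    """Count events per (window start, key) for fixed-size time windows."""
    state = defaultdict(int)  # stands in for the pipeline's window state
    for timestamp, key in events:
        # Assign the event to the window that contains its timestamp.
        window_start = (timestamp // window_seconds) * window_seconds
        # Grouping by (window, key) is the job of the shuffle step.
        state[(window_start, key)] += 1
    return dict(state)


# Hypothetical (timestamp in seconds, key) events.
events = [(0, "a"), (12, "b"), (59, "a"), (60, "a"), (61, "b"), (125, "a")]
counts = fixed_window_counts(events, window_seconds=60)
for (window_start, key), count in sorted(counts.items()):
    print(window_start, key, count)
```

In a real streaming job this state grows with the number of open windows and keys, which is why where it is stored affects worker sizing and scaling.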

NEXT’19 Launch
• AutoML Tables (Beta)
  AutoML Tables enables teams of data scientists, analysts, and developers to automatically build and deploy state-of-the-art machine learning models on structured data at a massively increased speed and scale.
• BigQuery Insights: BigQuery ML - Core (GA)
  BigQuery ML was created to enable data analysts and data scientists to build machine learning models in minutes by using just SQL, entirely in BigQuery. We have seen 1000s of customers use BigQuery ML in Beta and now we are announcing the GA of BigQuery ML.
• BigQuery: K-means Clustering ML (Beta)
  New algorithm for BigQuery ML to create customer segmentations.
• BigQuery: Matrix Factorization (Alpha)
  New algorithm for BigQuery ML to build product recommendations.
• BigQuery: Import TensorFlow Models (Alpha)
  Import TensorFlow models into BigQuery ML for predictions.
• Data Catalog (Beta)
  Fully managed and scalable metadata management service that allows organizations to quickly discover, manage and understand all their data in Google Cloud.
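
The "just SQL" claim in the BigQuery ML items can be sketched by building the statement for a k-means model. The helper below only assembles a string; the dataset, table and column names are hypothetical, and while `model_type = 'kmeans'` and `num_clusters` are BigQuery ML options, the exact syntax should be confirmed against the current `CREATE MODEL` documentation.

```python
def kmeans_create_model_sql(model, source_table, feature_columns, num_clusters):
    """Build a BigQuery ML CREATE MODEL statement for k-means clustering."""
    columns = ", ".join(feature_columns)
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        f"OPTIONS (model_type = 'kmeans', num_clusters = {int(num_clusters)})\n"
        f"AS SELECT {columns} FROM `{source_table}`"
    )


# Hypothetical dataset, table and columns for a customer segmentation.
sql = kmeans_create_model_sql(
    model="mydataset.customer_segments",
    source_table="mydataset.customers",
    feature_columns=["total_spend", "visits_per_month"],
    num_clusters=4,
)
print(sql)
```

The resulting statement could be pasted into the BigQuery editor or submitted through a client library; no separate training infrastructure is involved.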

NEXT’19 Launch
• Data Fusion Data Analytics (Beta)
  Data Fusion will provide a fully managed, code-free data integration service that helps customers move data to, and subsequently process and land that data within, GCP.
• [Retail] Recommendations AI (Beta)
  This product enables personalized recommendations. At launch, we will focus on the Retail sector. Beyond launch, it will serve additional verticals like Media & Entertainment, Financial Services, and more. We are talking with Disney about featuring their use of this product in a breakout session about Reinventing Retail with AI.
• [Retail] Visual Product Search (GA), AKA Vision API Updates
  This feature is part of the Vision API and enables customers to add Google Lens-type capabilities to their mobile apps. This allows customers to take photos of objects in their environment, and the app to pull up the webpage to purchase the identified product. Nordstrom will be featuring its experience in a breakout session. IKEA will be featuring its experience in the Retail Showcase.
• AutoML Vision for edge devices (Beta)
  AutoML Vision can now export models to the cloud and to the edge (e.g., Android/iOS mobile phones, cameras, IoT devices with limited/unreliable connectivity). This launch involves Edge TPU.
• AutoML Video (Beta)
  We already had the Video Intelligence API. Now we introduce AutoML Video. Video text detection and tracking: the user uploads videos into their GCS bucket and asks for OCR. We return bounding boxes for the detected text along with the text labels.
• Video Intelligence API - enhancements (Beta, GA)
  Video Intelligence API has pre-trained machine learning models that automatically recognize a vast number of objects, places, and actions in stored and streaming video. We’re adding:
  • Streaming API for Video features (Beta)
  • Video OCR (GA)
  • Video Object Tracking (GA)

NEXT’19 Launch
• Explainability: Single Instance explanation for CMLE (Alpha)
  This is the new set of model analysis tools available via the AI Studio within the GCP console. These tools help users interpret models meaningfully and understand model bias and drift closely.
• CMLE Online Prediction: User Code Support (Beta)
  This is an add-on feature to the existing CMLE Prediction service that allows users to customize their online predictions by adding custom functionalities via Python code. This is a relatively low-priority announcement compared to all other platform announcements at Next.
• AI Hub (Beta)
  AI Hub is a hosted repository of plug-and-play AI components, including end-to-end AI pipelines and out-of-the-box algorithms. AI Hub provides enterprise-grade sharing capabilities that let organizations privately host their AI content to foster reuse and collaboration among machine learning developers and users internally. You can also easily deploy unique Google Cloud AI and Google AI technologies for experimentation and ultimately production on Google Cloud and hybrid infrastructures.
• BigQuery BI Engine (Beta)
  BigQuery BI Engine is a blazing-fast, in-memory analysis service for BigQuery that allows users to analyze large and complex data sets interactively with sub-second query response time and with high concurrency. BigQuery BI Engine seamlessly integrates with familiar tools like Data Studio, Looker* and Google Sheets** to accelerate data exploration and analysis. (*by early 2020, **by Q4 2019)
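
For the CMLE Online Prediction user code item, the sketch below shows the general shape of a predictor class that wraps a model with custom pre- and post-processing. It assumes an interface with a `predict(instances, **kwargs)` method and a `from_path(model_dir)` classmethod; treat those names as an assumption to verify against the service documentation. `DoublingModel` is a hypothetical stand-in for a trained model.

```python
class DoublingModel:
    """Hypothetical stand-in for a trained model loaded from disk."""

    def predict(self, rows):
        return [[2 * value for value in row] for row in rows]


class MyPredictor:
    """Wraps a model with custom Python code around each prediction."""

    def __init__(self, model):
        self._model = model

    def predict(self, instances, **kwargs):
        # Custom preprocessing: coerce every input value to float.
        cleaned = [[float(value) for value in row] for row in instances]
        outputs = self._model.predict(cleaned)
        # Custom postprocessing: round outputs to two decimals.
        return [[round(value, 2) for value in row] for row in outputs]

    @classmethod
    def from_path(cls, model_dir):
        # A real routine would load model artifacts from model_dir;
        # this sketch ignores the path and uses the stand-in model.
        return cls(DoublingModel())


predictor = MyPredictor.from_path("model_dir_not_used_here")
predictions = predictor.predict([[1, 2.456], ["3", 4]])
print(predictions)
```

The point of the feature is that the code between receiving `instances` and returning results is the user's own, so input cleaning and output formatting can live next to the model instead of in every client.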

Thank you. Questions?
