From Notebook to Production with Amazon SageMaker

Julio Faerman

July 03, 2018

Transcript

  1. Train and Deploy ML Models with Amazon SageMaker

    Julien Simon (@julsimon), Julio Faerman (@julioaws), July 2018
  2. © 2017, Amazon Web Services, Inc. or its Affiliates. All

    rights reserved. Platform Services AWS ML Stack Deploy machine learning models with high-performance machine learning algorithms, broad framework support, and one-click training, tuning, and inference.
  3. The Machine Learning Process

    Diagram stages: Business Problem – ML problem framing; Data Collection; Data Integration; Data Preparation & Cleaning; Data Visualization & Analysis; Feature Engineering; Model Training & Parameter Tuning; Model Evaluation; Are Business Goals met? (Yes / No); Model Deployment; Monitoring & Debugging; Predictions; Data Augmentation; Feature Augmentation; Re-training
  4. Problem discovery (machine learning process diagram, as on slide 3)

    • Help formulate the right questions
    • Domain knowledge
  5. Integration (machine learning process diagram, as on slide 3)

    • Need a data platform?
    • Amazon S3
    • AWS Glue
    • Amazon Athena
    • Amazon EMR
    • Amazon Redshift Spectrum
  6. ETL as PySpark code with AWS Glue
  7. PySpark notebooks with Amazon EMR
  8. Model Training (machine learning process diagram, as on slide 3)

    • Set up and manage notebook environments
    • Set up and manage training clusters
    • Write data connectors
    • Scale ML algorithms to large datasets
    • Distribute ML training algorithms to multiple machines
    • Secure model artifacts
  9. Model Deployment (machine learning process diagram, as on slide 3)

    • Set up and manage model inference clusters
    • Manage and scale model inference APIs
    • Monitor and debug model predictions
    • Model versioning and performance tracking
    • Automate promotion of new model versions to production (A/B testing)
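
The A/B testing item on this slide comes down to splitting inference traffic between model versions by weight. The following is a minimal Python sketch of that idea only, not SageMaker's routing code; the variant names, the weights, and the `pick_variant` helper are hypothetical.

```python
import random
from collections import Counter


def pick_variant(weights, rng):
    """Pick a variant name with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


# Hypothetical split: about 90% of requests to the current model,
# about 10% to the candidate version under test.
weights = {"model-v1": 9.0, "model-v2": 1.0}

rng = random.Random(42)  # seeded so the simulation is repeatable
counts = Counter(pick_variant(weights, rng) for _ in range(10_000))
print(counts)
```

Promoting the candidate would then amount to shifting weight from `model-v1` to `model-v2` in steps while watching its metrics.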
  10. Amazon SageMaker: Build, train, and deploy machine learning models at scale

    • End-to-End Machine Learning Platform
    • Zero setup
    • Flexible Model Training
    • Pay by the second
  11. Amazon SageMaker

    Build
    • Pre-built notebook instances
    • Highly-optimized machine learning algorithms
  12. Amazon SageMaker

    Build
    • Pre-built notebook instances
    • Highly-optimized machine learning algorithms
    Train
    • One-click training for ML, DL, and custom algorithms
    • Easier training with hyperparameter optimization
  13. NVIDIA Tesla Volta V100 GPUs

    “At the limit of photolithography” - Jen-Hsun Huang, Nvidia's CEO
    5K FP32 GPU Cores | 15.7 Peak FP32 TFLOPS
    640 Tensor Cores, 64 FMA/clock, up to 125 Tensor Core TFLOPS
    16 GB GPU memory with 900 GB/sec peak memory bandwidth
    8xGPU
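
The peak figures on this slide can be roughly reproduced from the core counts. The sketch below is a back-of-the-envelope check, assuming a boost clock of about 1.53 GHz and an exact count of 5120 FP32 cores; neither value appears on the slide, which only says "5K".

```python
# Assumptions (not on the slide): boost clock and exact FP32 core count.
BOOST_CLOCK_HZ = 1.53e9
FP32_CORES = 5120            # the slide rounds this to "5K"

# From the slide.
TENSOR_CORES = 640
FMA_PER_TENSOR_CORE = 64     # fused multiply-adds per clock per Tensor Core

FLOPS_PER_FMA = 2            # one multiply plus one add

# Peak throughput = units x FLOPs per unit per clock x clock rate.
fp32_tflops = FP32_CORES * FLOPS_PER_FMA * BOOST_CLOCK_HZ / 1e12
tensor_tflops = (
    TENSOR_CORES * FMA_PER_TENSOR_CORE * FLOPS_PER_FMA * BOOST_CLOCK_HZ / 1e12
)

print(f"FP32 peak:        {fp32_tflops:.1f} TFLOPS")
print(f"Tensor Core peak: {tensor_tflops:.1f} TFLOPS")
```

Under these assumptions the results land on the slide's 15.7 and roughly 125 TFLOPS. Both are theoretical peaks, not measured training throughput.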
  14. Amazon SageMaker

    Build
    • Pre-built notebook instances
    • Highly-optimized machine learning algorithms
    Train
    • One-click training for ML, DL, and custom algorithms
    • Easier training with hyperparameter optimization
    Deploy
    • Deployment without engineering effort
    • Fully-managed hosting at scale
  15. Amazon SageMaker (architecture diagram)

    Diagram labels: Amazon ECR; Model Training (on EC2); Model Hosting (on EC2); Training data; Model artifacts; Training code; Helper code; Inference code; Ground Truth; Client application; Inference request; Inference response; Inference endpoint
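
On the hosting side of this diagram, the inference code runs in a container that answers HTTP requests: SageMaker sends `GET /ping` as a health check and forwards client payloads to `POST /invocations` on port 8080. The sketch below shows that request/response contract using only the Python standard library. The `predict` function and the `{"features": [...]}` payload shape are hypothetical placeholders, and this is not the server that the AWS-provided containers ship.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(features):
    """Hypothetical stand-in for real inference code: sums the features."""
    return {"score": sum(features)}


def handle(method, path, body):
    """Route one request and return (status_code, response_bytes)."""
    if method == "GET" and path == "/ping":
        return 200, b""  # health check: the container is ready
    if method == "POST" and path == "/invocations":
        try:
            features = json.loads(body)["features"]
            return 200, json.dumps(predict(features)).encode("utf-8")
        except (ValueError, KeyError, TypeError):
            return 400, b""  # malformed payload
    return 404, b""


class InferenceHandler(BaseHTTPRequestHandler):
    """Thin HTTP wrapper around handle()."""

    def _dispatch(self, method):
        length = int(self.headers.get("Content-Length") or 0)
        status, payload = handle(method, self.path, self.rfile.read(length))
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def do_GET(self):
        self._dispatch("GET")

    def do_POST(self):
        self._dispatch("POST")


def make_server(port=8080):
    """Build the server; call .serve_forever() on the result to start it."""
    return HTTPServer(("0.0.0.0", port), InferenceHandler)
```

Keeping the routing in a plain function (`handle`) makes it possible to check the contract without opening a socket, for example `handle("GET", "/ping", b"")`.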
  16. Open Source Containers for TF and MXNet

    https://github.com/aws/sagemaker-tensorflow-containers
    https://github.com/aws/sagemaker-mxnet-containers
    • Customize them
    • Run them locally for development and testing
    • Run them on SageMaker for training and prediction at scale
  17. Bring your own container

    https://github.com/aws/sagemaker-container-support
    • Integration with SageMaker Python SDK Estimators, including:
        • Downloading user-provided Python code
        • Deserializing hyperparameters (preserving their Python types)
    • bin/entry.py, the Docker entrypoint required by SageMaker
    • Reading in the metadata files provided to the container during training
    • nginx + Gunicorn HTTP server for serving inference requests
    https://github.com/awslabs/amazon-sagemaker-examples/tree/master/advanced_functionality/scikit_bring_your_own
    https://github.com/awslabs/amazon-sagemaker-examples/tree/master/advanced_functionality/r_bring_your_own
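
Hyperparameters reach a training container as strings, so "preserving their Python types" means decoding them again inside the container. The sketch below assumes each value was JSON-encoded before the job was submitted. The `deserialize_hyperparameters` helper and the sample values are hypothetical and are not taken from sagemaker-container-support.

```python
import json


def deserialize_hyperparameters(raw):
    """Turn a {name: JSON-encoded string} mapping back into typed values."""
    typed = {}
    for name, value in raw.items():
        try:
            typed[name] = json.loads(value)
        except (TypeError, ValueError):
            typed[name] = value  # not valid JSON: keep the value unchanged
    return typed


# Hypothetical hyperparameters as a container might read them:
# every value is a string, whatever its original type was.
raw = {
    "epochs": "10",
    "learning_rate": "0.01",
    "optimizer": "\"adam\"",
    "shuffle": "true",
    "layers": "[64, 32]",
}

hyperparameters = deserialize_hyperparameters(raw)
print(hyperparameters)
```

Falling back to the raw string keeps the helper usable when a value was passed as plain text rather than JSON.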
  18. Use Label Maker and Amazon SageMaker to automatically map buildings in Vietnam
  19. Amazon SageMaker: Build, train, and deploy machine learning models at scale

    • End-to-End Machine Learning Platform
    • Zero setup
    • Flexible Model Training
    • Pay by the second
  20. Example #1: binary classifier with Linear Learner built-in algo

    https://github.com/awslabs/amazon-sagemaker-examples/tree/master/introduction_to_amazon_algorithms/linear_learner_mnist
    Example #2: multi-class classifier with XGBoost built-in algo (low-level API)
    https://github.com/awslabs/amazon-sagemaker-examples/tree/master/introduction_to_amazon_algorithms/xgboost_mnist
    Example #3: bring your own code (TensorFlow or MXNet, we should let the participants choose)
    • https://github.com/awslabs/amazon-sagemaker-examples/tree/master/sagemaker-python-sdk/tensorflow_distributed_mnist
    • https://github.com/awslabs/amazon-sagemaker-examples/tree/master/sagemaker-python-sdk/mxnet_gluon_sentiment
    Example #4: bring your own model (TensorFlow) and deploy it on SageMaker
    https://github.com/awslabs/amazon-sagemaker-examples/tree/master/advanced_functionality/tensorflow_iris_byom
    Example #5 (optional): bring your own container
    https://github.com/awslabs/amazon-sagemaker-examples/blob/master/advanced_functionality/scikit_bring_your_own/scikit_bring_your_own.ipynb
  21. Resources

    https://aws.amazon.com/machine-learning
    https://aws.amazon.com/blogs/ai
    https://aws.amazon.com/sagemaker
    https://github.com/awslabs/amazon-sagemaker-examples
    https://github.com/aws/sagemaker-python-sdk
    https://github.com/aws/sagemaker-spark
    https://medium.com/@julsimon
    https://youtube.com/juliensimonfr/
    https://www.youtube.com/watch?v=ym7NEYEx9x4 - An overview of Amazon SageMaker