
Amazon SageMaker

AWS User Group Mumbai

February 11, 2019

Transcript

  1. Julien Simon, Global Evangelist, AI & Machine Learning, @julsimon
     From Notebook to Production with Amazon SageMaker
  2. AWS ML Stack: Application Services, Platform Services, Frameworks & Infrastructure. API-driven services: Vision, Language & Speech Services, Chatbots. Deploy machine learning models with high-performance machine learning algorithms, broad framework support, and one-click training, tuning, and inference. Develop sophisticated models with any framework, create managed, auto-scaling clusters of GPUs for large-scale training, or run prediction on trained models.
     https://ml.aws
     https://medium.com/@julsimon/a-map-for-machine-learning-on-aws-a285fcd8d932
  4. The Machine Learning Process: Business Problem → ML problem framing → Data Collection → Data Integration → Data Preparation & Cleaning → Data Visualization & Analysis → Feature Engineering → Model Training & Parameter Tuning → Model Evaluation → "Are business goals met?" If no, iterate with Data Augmentation and Feature Augmentation; if yes, proceed to Model Deployment, Predictions, Monitoring & Debugging, and re-training.
  5. Amazon SageMaker, Build stage: pre-built notebooks for common problems; built-in, high-performance algorithms; Git integration; Elastic Inference.
     ALGORITHMS: K-Means Clustering, Principal Component Analysis, Neural Topic Modelling, Factorization Machines, Linear Learner, XGBoost, Latent Dirichlet Allocation, Image Classification, Seq2Seq, and more.
     FRAMEWORKS: Apache MXNet, Chainer, TensorFlow, PyTorch, scikit-learn.
     The overall workflow: set up and manage environments for training; train and tune the model (trial and error); deploy the model in production; scale and manage the production environment.
  6. Amazon SageMaker, Build stage (continued): new built-in algorithms, scikit-learn environment, model marketplace, Search.
  7. Amazon SageMaker, Train stage: one-click training, hyperparameter optimization. P3dn and C5n instances, TensorFlow on 256 GPUs, resume HPO tuning job.
  8. Amazon SageMaker, Deploy stage: one-click deployment, fully managed hosting with auto-scaling, model compilation, Elastic Inference, inference pipelines.
  9. Amazon SageMaker, all stages together: Build (pre-built notebooks for common problems; built-in, high-performance algorithms; Git integration; new built-in algorithms; scikit-learn environment; model marketplace; Search), Train (one-click training; hyperparameter optimization; P3dn and C5n instances; TensorFlow on 256 GPUs; resume HPO tuning job), Deploy (one-click deployment; fully managed hosting with auto-scaling; model compilation; Elastic Inference; inference pipelines).
  10. The Amazon SageMaker API
      • Python SDK orchestrating all Amazon SageMaker activity (see the sketch below)
      • High-level objects for algorithm selection, training, deployment, automatic model tuning, etc.
      • Spark SDK (Python & Scala)
      • AWS CLI: 'aws sagemaker'
      • AWS SDK: boto3, etc.
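
To make the Python SDK concrete, here is a minimal sketch using the 2019-era SDK (v1 parameter names such as train_instance_count); the bucket paths, instance types, and hyperparameters are placeholder assumptions, not values from the deck.

    import boto3
    import sagemaker
    from sagemaker.amazon.amazon_estimator import get_image_uri
    from sagemaker.estimator import Estimator

    # Resolve the execution role (works inside a SageMaker notebook instance)
    # and the regional container image for the built-in XGBoost algorithm.
    role = sagemaker.get_execution_role()
    container = get_image_uri(boto3.Session().region_name, 'xgboost')

    # Placeholder S3 paths and instance types: adjust to your own account.
    xgb = Estimator(container, role,
                    train_instance_count=1,
                    train_instance_type='ml.m5.xlarge',
                    output_path='s3://my-bucket/output')
    xgb.set_hyperparameters(objective='binary:logistic', num_round=100)

    xgb.fit({'train': 's3://my-bucket/train'})        # one-click training
    predictor = xgb.deploy(initial_instance_count=1,
                           instance_type='ml.m5.large')  # one-click deployment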
  11. Architecture: Model Training (on EC2) and Model Hosting (on EC2). Training code and helper code consume the training data and ground truth, and write model artifacts; inference code and helper code load those artifacts behind an inference endpoint; the client application sends an inference request to the endpoint and receives an inference response.
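
The client side of this diagram is a plain HTTPS call. A hedged sketch with boto3; the endpoint name and payload format are placeholder assumptions:

    import boto3

    # The client application never talks to the training or hosting
    # instances directly, only to the managed inference endpoint.
    runtime = boto3.client('sagemaker-runtime')
    response = runtime.invoke_endpoint(
        EndpointName='my-endpoint',    # placeholder: the endpoint deployed above
        ContentType='text/csv',
        Body='0.5,1.2,3.4')            # one sample, in the format the model expects
    print(response['Body'].read())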
  12. Model options for the training code: Built-in Algorithms (Factorization Machines, Linear Learner, Principal Component Analysis, K-Means Clustering, XGBoost, and more), Bring Your Own Script, Bring Your Own Container.
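
For the "Bring Your Own Script" option, a minimal sketch with the managed Apache MXNet container; the entry point, instance type, framework_version, and hyperparameters are assumptions for illustration:

    from sagemaker.mxnet import MXNet

    # SageMaker runs your local train.py inside the managed MXNet container;
    # 'role' is the execution role from the earlier sketch.
    estimator = MXNet(entry_point='train.py',
                      role=role,
                      train_instance_count=1,
                      train_instance_type='ml.p3.2xlarge',
                      framework_version='1.3',
                      hyperparameters={'epochs': 10, 'learning-rate': 0.01})
    estimator.fit({'training': 's3://my-bucket/train'})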
  13. Built-in algorithms (supervised vs. unsupervised is color-coded on the original slide)
      Supervised:
      • Linear Learner: regression, classification
      • Factorization Machines: regression, classification, recommendation
      • XGBoost: regression, classification, ranking (https://github.com/dmlc/xgboost)
      • K-Nearest Neighbors: non-parametric regression and classification
      • Image Classification: Deep Learning (ResNet)
      • Object Detection (SSD): Deep Learning (VGG or ResNet)
      • Semantic Segmentation: Deep Learning
      • Sequence to Sequence: machine translation, speech to text, and more
      • DeepAR: time-series forecasting (RNN)
      • Object2Vec: general-purpose embedding
      • BlazingText: GPU-based Word2Vec (unsupervised) and text classification (supervised)
      Unsupervised:
      • K-Means: clustering
      • Principal Component Analysis: dimensionality reduction
      • Latent Dirichlet Allocation: topic modeling (mostly unsupervised)
      • Neural Topic Model: topic modeling
      • Random Cut Forest: anomaly detection
      • IP Insights: usage patterns for IP addresses
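
As one example of using a built-in algorithm through its first-party estimator, a hedged sketch with K-Means; the bucket, instance type, and the random training matrix are assumptions:

    import numpy as np
    from sagemaker import KMeans

    # Hypothetical float32 feature matrix; built-in algorithms consume
    # protobuf recordIO, which record_set() produces and uploads to S3.
    train_data = np.random.rand(1000, 50).astype('float32')

    kmeans = KMeans(role=role,                    # role from the earlier sketch
                    train_instance_count=1,
                    train_instance_type='ml.c5.xlarge',
                    k=10,                         # number of clusters
                    data_location='s3://my-bucket/kmeans')  # placeholder bucket
    kmeans.fit(kmeans.record_set(train_data))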
  14. XGBoost
      • Open source project
      • Popular tree-based algorithm for regression, classification, and ranking
      • Builds a collection of trees
      • Handles missing values and sparse data
      • Supports distributed training
      • Can work with datasets larger than RAM
      https://github.com/dmlc/xgboost
      https://xgboost.readthedocs.io/en/latest/
      https://arxiv.org/abs/1603.02754
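
The same open-source library can also be used directly, outside SageMaker. A tiny sketch showing the native missing-value handling mentioned above; the four-row dataset is made up:

    import numpy as np
    import xgboost as xgb

    # NaNs are treated as missing values, which DMatrix handles natively:
    # each tree learns a default direction for missing entries.
    X = np.array([[1.0, 2.0], [np.nan, 3.0], [4.0, np.nan], [5.0, 6.0]])
    y = np.array([0, 1, 0, 1])

    dtrain = xgb.DMatrix(X, label=y, missing=np.nan)
    params = {'objective': 'binary:logistic', 'max_depth': 3, 'eta': 0.1}
    booster = xgb.train(params, dtrain, num_boost_round=50)
    print(booster.predict(dtrain))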
  15. Optimizing TensorFlow
      "Training a ResNet-50 benchmark with the synthetic ImageNet dataset using our optimized build of TensorFlow 1.11 on a c5.18xlarge instance type is 11x faster than training on the stock binaries."
      https://aws.amazon.com/blogs/machine-learning/faster-training-with-optimized-tensorflow-1-6-on-amazon-ec2-c5-and-p3-instances/ (March 2018)
      https://aws.amazon.com/about-aws/whats-new/2018/10/chainer4-4_theano_1-0-2_launch_deep_learning_ami/ (October 2018)
  16. Automatic Model Tuning: finding the optimal set of hyperparameters
      1. Manual Search ("I know what I'm doing")
      2. Grid Search ("X marks the spot")
         • Typically trains hundreds of models
         • Slow and expensive
      3. Random Search ("Spray and pray")
         • Works better and faster than Grid Search
         • But... but... but... it's random!
      4. HPO: use Machine Learning
         • Trains fewer models
         • Gaussian Process Regression and Bayesian Optimization
         • You can now resume from a previous tuning job
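
A hedged sketch of option 4 with the SageMaker Python SDK, reusing the XGBoost estimator from the earlier sketch; the metric name, parameter ranges, and job counts are illustrative assumptions:

    from sagemaker.tuner import (HyperparameterTuner, ContinuousParameter,
                                 IntegerParameter)

    # Assumes the estimator emits 'validation:auc' (e.g. eval_metric='auc'
    # set on the built-in XGBoost algorithm).
    tuner = HyperparameterTuner(
        estimator=xgb,
        objective_metric_name='validation:auc',
        hyperparameter_ranges={
            'eta': ContinuousParameter(0.01, 0.5),
            'max_depth': IntegerParameter(3, 10)},
        max_jobs=20,             # total models trained, far fewer than grid search
        max_parallel_jobs=3)
    tuner.fit({'train': 's3://my-bucket/train',
               'validation': 's3://my-bucket/validation'})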
  17. Amazon SageMaker Neo: optimizing for the underlying hardware
      https://aws.amazon.com/blogs/aws/amazon-sagemaker-neo-train-your-machine-learning-models-once-run-them-anywhere/
      • Train once, run anywhere
      • Frameworks and algorithms: TensorFlow, Apache MXNet, PyTorch, ONNX, and XGBoost
      • Hardware architectures: ARM, Intel, and NVIDIA starting today; Cadence, Qualcomm, and Xilinx hardware coming soon
      • Amazon SageMaker Neo will be released as open source, enabling hardware vendors to customize it for their processors and devices
  18. Demo: Compiling ResNet-50 for the Raspberry Pi
      Configure the compilation job (config.json):
        {
          "RoleArn": $ROLE_ARN,
          "InputConfig": {
            "S3Uri": "s3://jsimon-neo/model.tar.gz",
            "DataInputConfig": "{\"data\": [1, 3, 224, 224]}",
            "Framework": "MXNET"
          },
          "OutputConfig": {
            "S3OutputLocation": "s3://jsimon-neo/",
            "TargetDevice": "rasp3b"
          },
          "StoppingCondition": { "MaxRuntimeInSeconds": 300 }
        }
      Compile the model and fetch the artifact:
        $ aws sagemaker create-compilation-job --cli-input-json file://config.json --compilation-job-name resnet50-mxnet-pi
        $ aws s3 cp s3://jsimon-neo/model-rasp3b.tar.gz .
        $ gtar tfz model-rasp3b.tar.gz
        compiled.params
        compiled_model.json
        compiled.so
      Predict with the compiled model:
        from dlr import DLRModel
        model = DLRModel('resnet50', input_shape, output_shape, device)
        out = model.run(input_data)
  19. Inference Pipelines
      • Linear sequence of 2-5 containers that process inference requests
      • Feature engineering with scikit-learn or SparkML (on AWS Glue or Amazon EMR)
      • Predict with built-in or custom containers
      • The sequence is deployed as a single model
      • Useful to preprocess, predict, and post-process
      • Available for real-time prediction and batch transform
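
A minimal sketch of deploying such a pipeline with the Python SDK; the two fitted model objects (sklearn_model, xgb_model), names, and instance type are assumptions:

    from sagemaker.pipeline import PipelineModel

    # A scikit-learn preprocessing model followed by an XGBoost model,
    # deployed behind a single endpoint; requests flow through the
    # containers in order.
    pipeline = PipelineModel(name='preprocess-then-predict',
                             role=role,
                             models=[sklearn_model, xgb_model])
    pipeline.deploy(initial_instance_count=1,
                    instance_type='ml.m5.large',
                    endpoint_name='pipeline-endpoint')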
  20. Amazon SageMaker recap: Build (pre-built notebooks for common problems; built-in, high-performance algorithms), Train (one-click training; hyperparameter optimization), Deploy (one-click deployment; fully managed hosting with auto-scaling). FREE TIER available.
  21. Getting started
      http://aws.amazon.com/free
      https://ml.aws
      https://aws.amazon.com/sagemaker
      https://github.com/aws/sagemaker-python-sdk
      https://github.com/aws/sagemaker-spark
      https://github.com/awslabs/amazon-sagemaker-examples
      https://gitlab.com/juliensimon/ent321
      https://medium.com/@julsimon
      https://gitlab.com/juliensimon/dlnotebooks