AWS ML Stack
https://ml.aws
https://medium.com/@julsimon/a-map-for-machine-learning-on-aws-a285fcd8d932
Language & Speech Services, Chatbots
Deploy machine learning models with high-performance machine learning algorithms, broad framework support, and one-click training, tuning, and inference.
Develop sophisticated models with any framework, create managed, auto-scaling clusters of GPUs for large-scale training, or run prediction on trained models.
The Machine Learning Process
Collection → Data Integration → Data Preparation & Cleaning → Feature Engineering → Model Training & Parameter Tuning → Model Evaluation → Are Business Goals met?
• Yes: Model Deployment → Predictions → Monitoring & Debugging → Re-training
• No: Data Augmentation, Feature Augmentation
ALGORITHMS: Component Analysis, Neural Topic Modelling, Factorization Machines, Linear Learner, XGBoost, Latent Dirichlet Allocation, Image Classification, Seq2Seq, and more!
FRAMEWORKS: Apache MXNet, Chainer, TensorFlow, PyTorch, scikit-learn
Set up and manage environments for training
Train and tune model (trial and error)
Deploy model in production
Scale and manage the production environment
Built-in, high-performance algorithms
Build
Git integration
Elastic inference
Build
New built-in algorithms
scikit-learn environment
Model marketplace
Search
algorithms
One-click training
Hyperparameter optimization
Train
Deploy model in production
Scale and manage the production environment
P3DN, C5N
TensorFlow on 256 GPUs
Resume HPO tuning job
Build
regression, classification and ranking
• Builds a collection of trees
• Handles missing values and sparse data
• Supports distributed training
• Can work with data sets larger than RAM
https://github.com/dmlc/xgboost
https://xgboost.readthedocs.io/en/latest/
https://arxiv.org/abs/1603.02754
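To make "builds a collection of trees" concrete, here is a minimal sketch of boosting in pure Python. It is not XGBoost's implementation: it assumes a single numeric feature, squared-error loss, and depth-1 trees (stumps), and every function name below is hypothetical. XGBoost itself adds regularization, second-order gradients, sparsity-aware splits, and distributed training.

```python
# Illustrative only: plain gradient boosting with depth-1 trees (stumps)
# on one feature. All names here are hypothetical, not XGBoost APIs.

def fit_stump(xs, residuals):
    """Pick the threshold whose two leaf means minimize squared error."""
    best = None
    # Skip the largest x: using it as a threshold would leave the right leaf empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:  # all x values identical: fall back to a constant leaf
        mean = sum(residuals) / len(residuals)
        return xs[0], mean, mean
    return best[1], best[2], best[3]


def predict(base, stumps, learning_rate, x):
    """Sum the base value and the shrunken output of every stump."""
    total = base
    for threshold, left_value, right_value in stumps:
        total += learning_rate * (left_value if x <= threshold else right_value)
    return total


def fit_boosted(xs, ys, n_trees=20, learning_rate=0.3):
    """Each new stump is fitted to the residuals left by the previous ones."""
    base = sum(ys) / len(ys)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - predict(base, stumps, learning_rate, x)
                     for x, y in zip(xs, ys)]
        stumps.append(fit_stump(xs, residuals))
    return base, stumps


def mean_squared_error(xs, ys, base, stumps, learning_rate):
    return sum((y - predict(base, stumps, learning_rate, x)) ** 2
               for x, y in zip(xs, ys)) / len(ys)
```

Each stump on its own is a weak model; the ensemble improves because every round fits whatever error the previous rounds left behind.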
benchmark with the synthetic ImageNet dataset using our optimized build of TensorFlow 1.11 on a c5.18xlarge instance type is 11x faster than training on the stock binaries.
https://aws.amazon.com/about-aws/whats-new/2018/10/chainer4-4_theano_1-0-2_launch_deep_learning_ami/ (October 2018)
Automatic Model Tuning
Finding the optimal set of hyperparameters
1. Manual Search (“I know what I’m doing”)
2. Grid Search (“X marks the spot”)
• Typically training hundreds of models
• Slow and expensive
3. Random Search (“Spray and pray”)
• Works better and faster than Grid Search
• But… but… but… it’s random!
4. HPO: use Machine Learning
• Training fewer models
• Gaussian Process Regression and Bayesian Optimization
• You can now resume from a previous tuning job
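The difference between grid search and random search can be sketched in a few lines of Python. This is an illustration only: `validation_loss` is a hypothetical stand-in for "train a model and measure its validation loss", and the two hyperparameters are assumptions. The Bayesian approach in item 4 is not shown; it would replace the sampling step with a model that proposes the next candidate from earlier results.

```python
import itertools
import random


def validation_loss(learning_rate, depth):
    """Hypothetical objective; a real one would train and evaluate a model."""
    return (learning_rate - 0.1) ** 2 + 0.01 * (depth - 6) ** 2


def grid_search(lr_values, depth_values):
    """Evaluate every combination: cost grows as len(lr) * len(depth)."""
    candidates = itertools.product(lr_values, depth_values)
    return min(candidates, key=lambda c: validation_loss(*c))


def random_search(lr_range, depth_range, n_trials, seed=0):
    """Evaluate a fixed budget of randomly sampled combinations."""
    rng = random.Random(seed)
    trials = [(rng.uniform(*lr_range), rng.randint(*depth_range))
              for _ in range(n_trials)]
    return min(trials, key=lambda c: validation_loss(*c))


best_grid = grid_search([0.01, 0.1, 1.0], [2, 6, 10])
best_random = random_search((0.01, 1.0), (2, 10), n_trials=9)
```

Both calls above train nine "models", but the grid only ever tries three distinct values per hyperparameter, while random search tries up to nine distinct values of each.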
Optimizing for the underlying hardware
https://aws.amazon.com/blogs/aws/amazon-sagemaker-neo-train-your-machine-learning-models-once-run-them-anywhere/
• Train once, run anywhere
• Frameworks and algorithms
  • TensorFlow, Apache MXNet, PyTorch, ONNX, and XGBoost
• Hardware architectures
  • ARM, Intel, and NVIDIA starting today
  • Cadence, Qualcomm, and Xilinx hardware coming soon
• Amazon SageMaker Neo will be released as open source, enabling hardware vendors to customize it for their processors and devices.
Inference Pipelines
• Linear sequence of 2-5 containers that process inference requests
• Feature engineering with scikit-learn or SparkML (on AWS Glue or Amazon EMR)
• Predict with built-in or custom containers
• The sequence is deployed as a single model
• Useful to preprocess, predict, and post-process
• Available for real-time prediction and batch transform
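The shape of such a pipeline can be sketched in plain Python, with ordinary functions standing in for containers. This is a conceptual illustration, not the SageMaker API: the `InferencePipeline` class, the three step functions, and the scoring rule are all hypothetical.

```python
from typing import Any, Callable, List

Step = Callable[[Any], Any]


class InferencePipeline:
    """A linear chain of steps that callers invoke as if it were one model."""

    def __init__(self, steps: List[Step]):
        # Mirrors the 2-5 container limit stated above.
        if not 2 <= len(steps) <= 5:
            raise ValueError("a pipeline holds between 2 and 5 steps")
        self.steps = list(steps)

    def predict(self, payload: Any) -> Any:
        """Real-time style: the output of each step feeds the next one."""
        for step in self.steps:
            payload = step(payload)
        return payload

    def batch_transform(self, payloads: List[Any]) -> List[Any]:
        """Batch style: run the same chain over every record."""
        return [self.predict(p) for p in payloads]


def preprocess(record):
    """Feature engineering stand-in: rescale raw values."""
    return [value / 10.0 for value in record]


def model(features):
    """Prediction stand-in: a hypothetical linear score."""
    return sum(features)


def postprocess(score):
    """Post-processing stand-in: turn the score into a label."""
    return "positive" if score >= 1.0 else "negative"


pipeline = InferencePipeline([preprocess, model, postprocess])
```

Calling `pipeline.predict(record)` hides the three stages behind a single entry point, which is the property the "deployed as a single model" bullet describes.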