Build, train, and deploy machine learning models at scale

As presented at the AWS Summit in Cape Town 2018, with customer Absa on stage.

Machine learning often feels harder than it should to most developers because the process of building and training models, and then deploying them into production, is too complicated and too slow. Amazon SageMaker is a fully managed service that enables developers and data scientists to quickly and easily build, train, and deploy machine learning models at any scale. In this session, I give a quick introduction to machine learning and walk through how to use SageMaker for your machine learning projects.

Adrian Hornsby

July 12, 2018

Transcript

  1. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Adrian Hornsby, Cloud Architecture Evangelist @ AWS (@adhorn). Build, train, and deploy machine learning models at scale
  2. What are we talking about? AI, Machine Learning, Deep Learning
  3. What is Deep Learning?
  4. Predicting the price of a house with humans. Diagram labels: Price, City, ZipCode, Life Quality, Parking, Size, # Room, Accessibility, Family Friendly
  5. Predicting the price of a house with a neural network. Diagram labels: Price, City, ZipCode, Life Quality, Parking, Size, # Room, Accessibility, Family Friendly, Input, Output, Discovered by the neural network
  6. Deep Learning: Neural Networks. Input Layer, Hidden Layers, Output Layer, Many More…
  7. How do we "build" Machine Learning models?
  8. Deep Learning Training Dataset. The MNIST database of handwritten digits has a training set of 60,000 examples and a test set of 10,000 examples. Human-in-the-loop
  9. Deep Learning Training (https://ml4a.github.io/ml4a/neural_networks/)
  10. Deep Learning Training: 784 x 10 + 10 x 10 = 7940 weights (https://ml4a.github.io/ml4a/neural_networks/)
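
The arithmetic on this slide can be reproduced with a short sketch. It assumes a fully connected network with 784 inputs (the 28 x 28 pixels of an MNIST image), one hidden layer of 10 units, and 10 outputs; the layer sizes are inferred from the slide's formula, and bias terms are left out because the slide's total does not include them. The function name `count_weights` is made up for this illustration.

```python
def count_weights(layer_sizes):
    """Count connection weights (biases excluded) in a fully connected network."""
    # Each pair of adjacent layers contributes n_in * n_out weights.
    return sum(n_in * n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


# 784 input pixels -> 10 hidden units -> 10 output classes
weights = count_weights([784, 10, 10])
print(weights)  # 7940
```
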
  11. Learning = Minimizing the loss (error) function. Backpropagation
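
A minimal sketch of "learning as minimizing the loss", assuming a toy one-weight model `y = w * x` and a mean squared error loss (neither is from the deck). With a single weight the gradient can be written out directly; backpropagation is the chain-rule procedure that produces the same kind of gradients for every weight in a multi-layer network.

```python
def loss(w, data):
    """Mean squared error of the one-weight model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def gradient(w, data):
    """Derivative of the loss with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in data) / len(data)


def train(data, w=0.0, learning_rate=0.05, epochs=100):
    """Plain gradient descent: step against the gradient each epoch."""
    history = []
    for _ in range(epochs):
        history.append(loss(w, data))
        w -= learning_rate * gradient(w, data)
    return w, history


data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # toy data generated by y = 2x
w, history = train(data)
print(round(w, 3), history[0] > history[-1])
```

The learned weight approaches 2, the value used to generate the toy data, and the recorded loss shrinks from the first epoch to the last.
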
  12. Early stopping. Chart: accuracy (up to 100%) and loss plotted against epochs, with curves for training accuracy, validation accuracy, and the loss function, and markers for the best epoch and the overfitting region. « Deep Learning ultimately is about finding a minimum that generalizes well, with bonus points for finding one fast and reliably », Sebastian Ruder
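
One common way to implement early stopping is a patience counter on validation accuracy, sketched below. The patience rule, the function name, and the accuracy values are illustrative assumptions, not taken from the deck.

```python
def best_epoch_with_early_stopping(val_accuracy, patience=2):
    """Return (best_epoch, stop_epoch) for a list of per-epoch validation accuracies.

    Training stops once validation accuracy has gone `patience` epochs
    without improving on the best value seen so far.
    """
    best_epoch, best_acc = 0, float("-inf")
    for epoch, acc in enumerate(val_accuracy):
        if acc > best_acc:
            best_epoch, best_acc = epoch, acc
        elif epoch - best_epoch >= patience:
            return best_epoch, epoch
    return best_epoch, len(val_accuracy) - 1


# Hypothetical validation curve: improves, peaks at epoch 3, then degrades.
val_accuracy = [0.70, 0.80, 0.85, 0.88, 0.87, 0.86, 0.84]
best, stopped = best_epoch_with_early_stopping(val_accuracy)
print(best, stopped)  # 3 5
```

In practice the model weights saved at the best epoch are the ones kept, since later epochs fit the training set better but generalize worse.
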
  13. Deep Learning is a Big Deal. It is able to do better than other ML techniques and humans
  14. CNN: Object Detection (https://github.com/precedenceguo/mx-rcnn, https://github.com/zhreshold/mxnet-yolo)
  15. CNN: Object Detection
  16. CNN: Object Segmentation. FDA-approved medical imaging (https://www.periscope.tv/AWSstartups/1vAGRgevBXRJl, https://www.youtube.com/watch?v=WE81dncwnIc)
  17. CNN: Text Detection and Recognition (https://github.com/Bartzi/stn-ocr)
  18. LSTM: Long Short Term Memory Networks. Oh I remember! https://github.com/awslabs/sockeye, Amazon Polly
  19. Generative Adversarial Networks (GAN). The future at work (already) today. Generating new "celebrity" faces (https://github.com/tkarras/progressive_growing_of_gans)
  20. Wait! There’s more! Models can also generate text from text, text from images, text from video, images from text, sound from video, 3D models from 2D images, etc.
  21. Model Zoos
     • Full implementations of many state-of-the-art models reported in the academic literature.
     • Complete models, with scripts, pre-trained weights and instructions on how to build and fine-tune these models.
  22. Say hello to Transfer Learning (hidden gem)
     • Initialise parameters with a pre-trained model.
     • Use a pre-trained model as a fixed feature extractor and build a model on the extracted features.
     • Why? It takes a long time and a lot of resources to train a neural network from scratch.
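
The "fixed feature extractor" approach can be sketched in plain Python. Everything here is a toy assumption: `PRETRAINED_WEIGHTS` stands in for weights that would normally be loaded from a model zoo, the data is invented, and the new model is a small logistic-regression head. The point is only that the extractor's weights stay frozen while the head is trained.

```python
import math

# Hypothetical stand-in for a pre-trained network's weights (never updated).
PRETRAINED_WEIGHTS = [[1.0, -1.0], [0.5, 0.5]]


def extract_features(x):
    """Frozen feature extractor: a fixed linear layer followed by ReLU."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)))
            for row in PRETRAINED_WEIGHTS]


def predict(head, features):
    """Logistic-regression head on top of the frozen features."""
    z = head["bias"] + sum(w * f for w, f in zip(head["weights"], features))
    return 1.0 / (1.0 + math.exp(-z))


def train_head(samples, labels, learning_rate=0.5, epochs=200):
    """Train only the head with stochastic gradient descent on log loss."""
    features = [extract_features(x) for x in samples]
    head = {"weights": [0.0, 0.0], "bias": 0.0}
    for _ in range(epochs):
        for f, y in zip(features, labels):
            error = predict(head, f) - y  # gradient of log loss w.r.t. z
            head["weights"] = [w - learning_rate * error * fi
                               for w, fi in zip(head["weights"], f)]
            head["bias"] -= learning_rate * error
    return head


samples = [[2.0, 0.0], [3.0, 1.0], [0.0, 2.0], [1.0, 3.0]]  # invented toy data
labels = [1, 1, 0, 0]
head = train_head(samples, labels)
predictions = [round(predict(head, extract_features(x))) for x in samples]
print(predictions)
```

Only the three head parameters are learned here, which is why this approach needs far less data and compute than training the whole network. Fine-tuning, the other option on the slide, would also update the pre-trained weights, usually with a small learning rate.
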
  23. Expedia: Ranking hotel images using deep learning
     • Over 10 million images from 300,000 hotels
     • Fine-tuned a pre-trained Convolutional Neural Network using 100,000 images
     • Hotel descriptions now automatically feature the best available images
     https://news.developer.nvidia.com/expedia-ranking-hotel-images-with-deep-learning/ https://www.youtube.com/watch?v=qGotULKg8e0
  24. ML is still too complicated for everyday developers: collect and prepare training data; choose and optimize your ML algorithm; set up and manage environments for training; train and tune model (trial and error); deploy model in production; scale and manage the production environment
  25. Amazon SageMaker: a fully-managed platform that provides a quick and easy way to get models from idea to production. https://aws.amazon.com/sagemaker/
  26. Amazon SageMaker Workflow: Building, Training, Hosting. Amazon’s fast, scalable algorithms; Distributed TensorFlow, Apache MXNet, Chainer, PyTorch; Bring your own algorithm; Hyperparameter Tuning
  27. Brand is Pervasive Across the Enterprise: Data Structure, Naming Conventions, Callable Units, Sub-routines, Functions, Digital Signage, Bulk Printing, Electronic Statements, Digital Certificates, AD Forests, Domains, Branches, Posters, Statements, Correspondence, Promotions, Websites, Online, Print Media, Electronic Media, Broadcast email, Disclaimers, Documents, Presentations, Spreadsheets, Reports, Layout, Content, Charts, Electronic Messaging
  28. Removal of Barclays From Systems in South Africa (Overall). [Dashboard; chart values omitted.] Status: Complete, In Progress. Categories: Messaging, Voice, Documents, Reports, Forms, Screens. Panels: Effort Required (External), Effort Required (Internal), Combined View, Burndown [% of Total Outstanding]. Series: Constant, CIB, CTO, DIST, FT, RBB, WIMI. Stages: Analysis Effort Completed, Design Effort Completed, Development Effort Completed, Deployment Completed (Live), Certification. Key Metrics: 39,074 Items (Total); 24,250 Collective Workdays; 36 SA Business Units; 293 Systems (SA Only); 442 Touchpoints (2x Touch)
  29. Master Data Store Registry Master Data Rules Automated Analysis Messaging Generators S3 Analysis ODS Email Submission Manual Submission Crowdsourced Submission Images Logos Text Type Categorisation Item Deconstruction Submission Gateway (ETL & Manual) SageMaker Analysis Brand Rules Learned Modelling Historic (Retired) Brand Elements Disallowed Brand Elements Approved Brand Elements Current Templates, Campaigns, etc Case Management Exceptions Logs Image Master Document Master Other Data Master Analytics Data Store Archive 1 Archive 2 Archive …n Consolidate Match Enhance Reporting (Metrics) Reporting (Insights) BUR Data Engine (Inventory) Tableau
  30. Amazon SageMaker Workflow: Building, Training, Hosting. Amazon’s fast, scalable algorithms; Distributed TensorFlow, Apache MXNet, Chainer, PyTorch; Bring your own algorithm; Hyperparameter Tuning
  31. Building Machine Learning Models: create and manage cloud-based notebooks; use SageMaker’s web interface to get started; or build models locally, then upload to SageMaker
  32. Amazon SageMaker Workflow: Building, Training, Hosting. Amazon’s fast, scalable algorithms; Distributed TensorFlow, Apache MXNet, Chainer, PyTorch; Bring your own algorithm; Hyperparameter Tuning
  33. Training: zero setup; streaming datasets + distributed compute; hyperparameter optimization; one-click training
  34. Amazon SageMaker Workflow: Building, Training, Hosting. Amazon’s fast, scalable algorithms; Distributed TensorFlow, Apache MXNet, Chainer, PyTorch; Bring your own algorithm; Hyperparameter Tuning
  35. Hosting: one-click deployment; low latency, high throughput, and high reliability; A/B testing; bring your own model
  36. Amazon SageMaker Workflow: Building, Training, Hosting. Amazon’s fast, scalable algorithms; Distributed TensorFlow, Apache MXNet, Chainer, PyTorch; Bring your own algorithm; Hyperparameter Tuning
  37. SageMaker built-in algorithms: XGBoost, FM, Linear, and Forecasting for supervised learning; K-means, PCA, and Word2Vec for clustering and pre-processing; image classification with convolutional neural networks; LDA and NTM for topic modeling, seq2seq for translation
  38. Amazon SageMaker Workflow: Building, Training, Hosting. Amazon’s fast, scalable algorithms; Distributed TensorFlow, Apache MXNet, Chainer, PyTorch; Bring your own algorithm; Hyperparameter Tuning
  39. Amazon SageMaker Workflow: Building, Training, Hosting. Amazon’s fast, scalable algorithms; Distributed TensorFlow, Apache MXNet, Chainer, PyTorch; Bring your own algorithm; Hyperparameter Tuning
  40. Bring your own algorithm: Pick your preferred framework... add algorithm code to a Docker container... publish your Docker image to ECR
  41. Amazon SageMaker Workflow: Building, Training, Hosting. Amazon’s fast, scalable algorithms; Distributed TensorFlow, Apache MXNet, Chainer, PyTorch; Bring your own algorithm; Hyperparameter Tuning
  42. Hyperparameter Tuning (Automated Model Tuning): run a large set of training jobs with varying hyperparameters... and search the hyperparameter space for improved accuracy.
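
The idea of running many jobs over a hyperparameter space can be illustrated with a plain random search. This is a sketch of the general technique only: it does not use or describe SageMaker's own search strategy, and `train_and_evaluate` is a hypothetical stand-in for a real training job with an invented scoring formula.

```python
import random


def train_and_evaluate(learning_rate, batch_size):
    """Hypothetical training job; returns a made-up validation accuracy.

    The formula is invented so that the best score sits near
    learning_rate=0.1 and batch_size=64.
    """
    return 1.0 - abs(learning_rate - 0.1) - abs(batch_size - 64) / 1000


def random_search(num_jobs, seed=0):
    """Sample hyperparameters at random and keep the best-scoring job."""
    rng = random.Random(seed)
    results = []
    for _ in range(num_jobs):
        params = {
            "learning_rate": rng.uniform(0.001, 0.5),
            "batch_size": rng.choice([16, 32, 64, 128, 256]),
        }
        results.append((train_and_evaluate(**params), params))
    # The job with the highest accuracy wins.
    return max(results, key=lambda result: result[0])


best_accuracy, best_params = random_search(num_jobs=50)
print(best_accuracy, best_params)
```

Each sampled configuration is independent, so the jobs could run in parallel; a managed tuning service takes over that scheduling as well as the choice of which configurations to try next.
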
  43. Pay as you go: ML compute by the second starting at $0.0464/hr; ML storage by the second at $0.14 per GB-month; data processed in notebooks and hosting at $0.016 per GB; free trial to get started quickly
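
A rough cost estimate using the three rates quoted on this slide is sketched below. The rates are the 2018 figures from the deck and should be treated as illustrative; actual prices depend on instance type and region. The workload numbers and function names are invented for the example.

```python
COMPUTE_RATE_PER_HOUR = 0.0464    # USD, lowest ML compute rate quoted on the slide
STORAGE_RATE_PER_GB_MONTH = 0.14  # USD
DATA_RATE_PER_GB = 0.016          # USD


def compute_cost(seconds, rate_per_hour=COMPUTE_RATE_PER_HOUR):
    """Cost of compute billed by the second at an hourly rate."""
    return seconds / 3600 * rate_per_hour


def estimate(compute_seconds, storage_gb_months, data_gb):
    """Sum of compute, storage, and data-processing charges in USD."""
    return (compute_cost(compute_seconds)
            + storage_gb_months * STORAGE_RATE_PER_GB_MONTH
            + data_gb * DATA_RATE_PER_GB)


# Hypothetical workload: 30 minutes of compute, 10 GB stored for a month, 5 GB processed.
total = estimate(compute_seconds=1800, storage_gb_months=10, data_gb=5)
print(f"${total:.4f}")  # $1.5032
```
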
  44. Demo: Sentiment Analysis with SageMaker and MXNet (https://github.com/lupesko/sentiment-analysis-with-sagemaker-mxnet)