Slide 1

Slide 1 text

Industrializing Machine Learning
Setting up CI/CD for your machine learning models
Marc Cabocel, Startup Solutions Architect
November 28th, 2019
© 2019, Amazon Web Services, Inc. or its Affiliates.

Slide 2

Slide 2 text

Successful enterprises follow a common path
Customer Goals, Business Outcomes
- Innovation
- Modernization & Migration
- Foundation

Slide 3

Slide 3 text

Challenges

Slide 4

Slide 4 text

Definition of DevOps
Strong focus on automation and monitoring at all steps of software construction.
Software Development, Software Operations

Slide 5

Slide 5 text

Definition of MLOps
Strong focus on automation and monitoring at all steps of ML System construction.
ML System Development, ML System Operations

Slide 6

Slide 6 text

Specific problem areas
ML System Construction Process: Development, Deployment, Operation
Common anti-patterns in …

Slide 7

Slide 7 text

Specific problem areas
ML System Construction Process: Development, Deployment, Operation
Common anti-patterns in …

Slide 8

Slide 8 text

Development Anti Pattern: Superhero-Dependence
ML Researcher, ML Full Stack, Data Engineer, DevOps Engineer, Infrastructure Architect, Product Manager, Exec Partner
POC → Production

Slide 9

Slide 9 text

Development Anti Pattern: Superhero-Dependence
ML Full Stack
Scalability?

Slide 10

Slide 10 text

Development Anti Pattern: Superhero-Dependence
Scalable Team: ML Researcher, Data Engineer, Infra & Ops Engineer, Product Manager, Business Leader

Slide 11

Slide 11 text

Development Anti Pattern: Blackbox Problem
ML Researcher, Engineer
POC → Production

Slide 12

Slide 12 text

Development Anti Pattern: Blackbox Problem
ML Researcher, Engineer
POC → Production

Slide 13

Slide 13 text

Development Anti Pattern: Deep Embedded Failure
“Just changing one feature”
A change of
• Feature
• Hyper-parameter
• Regularization
• Learning Rate
• Sampling
• Thresholds
can affect the whole result.
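Because any of these changes can alter the whole result, one common safeguard is to fingerprint the full training configuration and treat every new fingerprint as a new model version. The sketch below is only an illustration: the function `config_fingerprint` and all configuration keys and values are hypothetical and not taken from the slides.

```python
import hashlib
import json


def config_fingerprint(config: dict) -> str:
    """Return a stable SHA-256 hex digest of a training configuration."""
    # sort_keys makes the digest independent of dict insertion order.
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical configuration covering the items listed on the slide.
baseline = {
    "features": ["age", "country", "basket_size"],
    "hyperparameters": {"max_depth": 6, "learning_rate": 0.1},
    "regularization": {"l2": 1.0},
    "sampling": {"strategy": "stratified", "fraction": 0.8},
    "threshold": 0.5,
}

# "Just changing one feature" produces a different fingerprint,
# so the change cannot slip into production unnoticed.
changed = dict(baseline, features=["age", "country"])

if __name__ == "__main__":
    print(config_fingerprint(baseline) == config_fingerprint(changed))
```

Storing the fingerprint next to the trained model artifact makes it possible to tell exactly which configuration produced which result.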

Slide 14

Slide 14 text

Development Anti Pattern: Deep Embedded Failure
Semantically Interpretable Models
Model I, Model II, Model III → Ensemble Model
Result: Better robustness and easier troubleshooting
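A minimal sketch of the ensemble idea, assuming a simple majority vote (the slide does not say how the ensemble combines its models). The three rule-based models, their names, and the sample fields are hypothetical stand-ins. Returning each model's individual vote alongside the final answer is what makes troubleshooting easier.

```python
from collections import Counter
from typing import Callable, Dict, Hashable, Tuple

Model = Callable[[dict], Hashable]


def ensemble_predict(models: Dict[str, Model], sample: dict) -> Tuple[Hashable, Dict[str, Hashable]]:
    """Majority vote over named models; also return every individual vote."""
    votes = {name: model(sample) for name, model in models.items()}
    # On a tie, Counter keeps the label that was seen first.
    winner, _ = Counter(votes.values()).most_common(1)[0]
    return winner, votes


# Hypothetical, individually interpretable models (one concern each).
models = {
    "model_i_amount": lambda s: "fraud" if s["amount"] > 1000 else "ok",
    "model_ii_country": lambda s: "fraud" if s["country"] != s["card_country"] else "ok",
    "model_iii_velocity": lambda s: "fraud" if s["tx_last_hour"] > 5 else "ok",
}

sample = {"amount": 1500, "country": "FR", "card_country": "FR", "tx_last_hour": 9}
label, votes = ensemble_predict(models, sample)

if __name__ == "__main__":
    print(label, votes)
```

When the ensemble output looks wrong, the `votes` dictionary shows which sub-model disagreed, so the investigation can start there.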

Slide 15

Slide 15 text

Specific problem areas
ML System Construction Process: Development, Deployment, Operation
Common anti-patterns in …

Slide 16

Slide 16 text

Deployment Anti Pattern: No Deployment Transparency

Slide 17

Slide 17 text

Deployment Anti Pattern: No ML Lifecycle Management
ML Full Stack
POC → Production

Slide 18

Slide 18 text

Can we solve this?

Slide 19

Slide 19 text

Amazon SageMaker
Data Scientist, DevOps

Slide 20

Slide 20 text

Amazon SageMaker Flow
BUILD
- Pre-built notebooks for common problems
- Built-in, high performance algorithms
TRAIN
- One-click training
- Hyperparameter optimization
DEPLOY
- One-click deployment
- Fully managed hosting with auto-scaling
MOST OF THE TIME THIS IS JUST PROTOTYPING

Slide 21

Slide 21 text

Only one small part of an ML project
ML Code, Serving Infrastructure, Feature Extraction, Data Collection

Slide 22

Slide 22 text

How do you …
- Ensure code quality? Maintainability?
- Test your ML code?
- Version code?
- Re-train model?
- Version models?
- Decide when to (re-)deploy?
- Rollback?
- Audit?
- Security?
- …
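Two of these questions, deciding when to (re-)deploy and rolling back, can be made explicit in code. The following is a sketch under stated assumptions, not a method from the talk: `ModelVersion`, `decide`, `rollback_target`, and both thresholds are hypothetical, and accuracy stands in for whatever metric the team actually tracks.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ModelVersion:
    version: str
    accuracy: float  # measured on the same held-out evaluation set


def decide(production: ModelVersion, candidate: ModelVersion,
           min_gain: float = 0.01, min_accuracy: float = 0.80) -> str:
    """Return 'deploy' if the candidate should replace production, else 'keep'."""
    if candidate.accuracy < min_accuracy:
        return "keep"  # never ship a model below the absolute quality bar
    if candidate.accuracy - production.accuracy >= min_gain:
        return "deploy"
    return "keep"  # improvement too small to justify a release


def rollback_target(history: List[ModelVersion]) -> ModelVersion:
    """Return the version that was deployed before the current one."""
    if len(history) < 2:
        raise ValueError("no earlier version to roll back to")
    return history[-2]


history = [ModelVersion("v1", 0.82), ModelVersion("v2", 0.84)]
candidate = ModelVersion("v3", 0.88)

if __name__ == "__main__":
    print(decide(history[-1], candidate), rollback_target(history).version)
```

Keeping the rule in versioned code, rather than in someone's head, also gives an audit trail for every deployment decision.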

Slide 23

Slide 23 text

Treat ML code as “normal” software code!
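As an illustration of what this means in practice, a feature-engineering step can be covered by ordinary unit tests and run by the same CI system as any other code. The `standardize` function and its test cases below are hypothetical examples, written with Python's standard `unittest` module.

```python
import math
import unittest


def standardize(values):
    """Scale values to zero mean and unit variance (population std)."""
    if not values:
        raise ValueError("values must not be empty")
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if std == 0:
        # A constant column carries no signal; avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


class StandardizeTest(unittest.TestCase):
    def test_zero_mean_unit_variance(self):
        out = standardize([1.0, 2.0, 3.0, 4.0])
        self.assertAlmostEqual(sum(out) / len(out), 0.0)
        self.assertAlmostEqual(sum(v * v for v in out) / len(out), 1.0)

    def test_constant_column_does_not_divide_by_zero(self):
        self.assertEqual(standardize([5.0, 5.0, 5.0]), [0.0, 0.0, 0.0])

    def test_empty_input_is_rejected(self):
        with self.assertRaises(ValueError):
            standardize([])
```

Saved as a module, these tests can be run with `python -m unittest` on every commit.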

Slide 24

Slide 24 text

Why is SageMaker great?

Slide 25

Slide 25 text

Amazon SageMaker: Build, Train and Deploy your Machine Learning Models
Models are abstracted into Docker containers
1. BUILD
- Fast & accurate data labeling
- Built-in, high performance algorithms & notebooks
2. TRAIN
- One-click training and tuning
- Model tuning and optimization
3. DEPLOY
- One-click deployment of model or Inference Pipeline
- Fully managed hosting with auto-scaling and elastic inference

Slide 26

Slide 26 text

Amazon SageMaker: algorithms
A. Train and deploy a built-in algorithm
B. Train and deploy your own code in a framework container
C. Train and deploy your own container
D. Deploy a pre-trained model
E. NEW – Train and deploy a Marketplace Algorithm
F. NEW – Deploy a Marketplace Model Package

Slide 27

Slide 27 text

Amazon SageMaker: interface
A. Console
B. AWS Command Line Interface (CLI):
   aws sagemaker create-endpoint --endpoint-name --endpoint-config-name
C. SageMaker Python SDK:
   model.deploy(initial_instance_count=1, instance_type='ml.m4.xlarge')
D. Boto3 (Python) SageMaker client:
   client.create_endpoint(EndpointName='string', EndpointConfigName='string')
E. Spark SDK
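To make the Boto3 option more concrete, here is a hedged sketch of the two calls an endpoint deployment typically needs: `create_endpoint_config` followed by `create_endpoint`. The helper names, the `-config` naming convention, and the variant name `AllTraffic` are assumptions for illustration, and the sketch presumes that a SageMaker model called `model_name` has already been created. The request dictionaries are built separately so they can be checked without AWS credentials.

```python
def build_endpoint_requests(model_name, endpoint_name,
                            instance_type="ml.m4.xlarge", instance_count=1):
    """Build the keyword arguments for create_endpoint_config and create_endpoint."""
    config_name = f"{endpoint_name}-config"  # hypothetical naming convention
    endpoint_config = {
        "EndpointConfigName": config_name,
        "ProductionVariants": [{
            "VariantName": "AllTraffic",
            "ModelName": model_name,
            "InitialInstanceCount": instance_count,
            "InstanceType": instance_type,
        }],
    }
    endpoint = {"EndpointName": endpoint_name, "EndpointConfigName": config_name}
    return endpoint_config, endpoint


def deploy(model_name, endpoint_name):
    """Create the endpoint configuration, then the endpoint itself."""
    import boto3  # imported lazily so the builder above stays usable offline

    client = boto3.client("sagemaker")
    endpoint_config, endpoint = build_endpoint_requests(model_name, endpoint_name)
    client.create_endpoint_config(**endpoint_config)
    return client.create_endpoint(**endpoint)
```

Calling `deploy("my-model", "my-endpoint")` (both names are placeholders) requires valid AWS credentials and incurs hosting costs until the endpoint is deleted.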

Slide 28

Slide 28 text

In summary:

Slide 29

Slide 29 text

Architecture Patterns
From MVP to Production

Slide 30

Slide 30 text

Architecture – Let’s start simple for a POC
[Architecture diagram. Components: Source A, Source B, Source C; Raw Data Lake; SageMaker Jupyter Notebooks; 4. Build; SageMaker Training Platform; Model Repository (DDB, S3); SageMaker Endpoints; User Interface; End User. Legend: Training flow, Inference flow.]

Slide 31

Slide 31 text

Architecture – Let’s look at the data pipeline
[Architecture diagram. Components: Source A, Source B, Source C; Raw Data Lake; ETL (Glue, EMR); Feature Store (Catalog, Analytics Data; DDB, S3); ETL; Model Input Store (Metadata, Data; DDB, S3); 4. Build; SageMaker Training Platform; Model Repository (DDB, S3); SageMaker Endpoints; User Interface; End User. Legend: Training flow, Inference flow.]

Slide 32

Slide 32 text

Architecture – Inference
[Architecture diagram. Components: Source A, Source B, Source C; Raw Data Lake; ETL (Glue, EMR); Feature Store (Catalog, Analytics Data; DDB, S3); ETL; Model Input Store (Metadata, Data; DDB, S3); 4. Build; SageMaker Training Platform; Model Repository (DDB, S3); SageMaker Endpoints; Inference (API Gateway, Lambda, Optimization compute); User Interface; End User. Legend: Training flow, Inference flow.]

Slide 33

Slide 33 text

Architecture – Training Pipeline
[Architecture diagram. Components: Source A, Source B, Source C; Raw Data Lake; ETL (Glue, EMR); Feature Store (Catalog, Analytics Data; DDB, S3); ETL; Model Input Store (Metadata, Data; DDB, S3); 4. Build; SageMaker Training Platform (Train, Evaluate); ECR; Config Store (S3); Model Repository (DDB, S3); SageMaker Endpoints; Inference (API Gateway, Lambda, Optimization compute); User Interface; End User. Legend: Training flow, Inference flow.]
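The Train and Evaluate steps of this pipeline could be orchestrated in several ways. The sketch below assumes AWS Step Functions, which is the service used in the blog post linked on the takeaways slide; the slide itself does not name an orchestrator. It expresses a state machine in Amazon States Language as a Python dictionary. State names, input fields such as `$.job_name`, the accuracy threshold, and the evaluation Lambda function are hypothetical, and the training parameters are reduced to a minimum.

```python
import json

ACCURACY_THRESHOLD = 0.9  # hypothetical quality bar

state_machine = {
    "Comment": "Train a model, evaluate it, and accept it only if it is good enough",
    "StartAt": "Train",
    "States": {
        "Train": {
            "Type": "Task",
            # Step Functions service integration that waits for the training job.
            "Resource": "arn:aws:states:::sagemaker:createTrainingJob.sync",
            "Parameters": {
                "TrainingJobName.$": "$.job_name",
                "RoleArn.$": "$.role_arn",
                "AlgorithmSpecification": {
                    "TrainingImage.$": "$.training_image",  # container image in ECR
                    "TrainingInputMode": "File",
                },
                "OutputDataConfig": {"S3OutputPath.$": "$.model_output_s3"},
                "ResourceConfig": {
                    "InstanceCount": 1,
                    "InstanceType": "ml.m4.xlarge",
                    "VolumeSizeInGB": 10,
                },
                "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
            },
            "ResultPath": "$.training",
            "Next": "Evaluate",
        },
        "Evaluate": {
            "Type": "Task",
            # Hypothetical Lambda function that returns {"accuracy": <float>}.
            "Resource": "arn:aws:states:::lambda:invoke",
            "Parameters": {"FunctionName.$": "$.evaluate_function", "Payload.$": "$"},
            "ResultPath": "$.evaluation",
            "Next": "GoodEnough",
        },
        "GoodEnough": {
            "Type": "Choice",
            "Choices": [{
                "Variable": "$.evaluation.Payload.accuracy",
                "NumericGreaterThanEquals": ACCURACY_THRESHOLD,
                "Next": "Accept",
            }],
            "Default": "Reject",
        },
        "Accept": {"Type": "Succeed"},
        "Reject": {"Type": "Fail", "Error": "ModelRejected",
                   "Cause": "Accuracy below threshold"},
    },
}


def transition_targets(definition):
    """Collect every state name that some other state transitions to."""
    targets = set()
    for state in definition["States"].values():
        if "Next" in state:
            targets.add(state["Next"])
        if "Default" in state:
            targets.add(state["Default"])
        for choice in state.get("Choices", []):
            targets.add(choice["Next"])
    return targets


definition_json = json.dumps(state_machine, indent=2)
```

The `definition_json` string is what would be passed as the state machine definition; the `transition_targets` helper is a cheap local check that no transition points at a missing state.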

Slide 34

Slide 34 text

Architecture – CI for ML Code
[Architecture diagram. Components: Source A, Source B, Source C; Raw Data Lake; ETL (Glue, EMR); Feature Store (Catalog, Analytics Data; DDB, S3); ETL; Model Input Store (Metadata, Data; DDB, S3); 4. Build; CodePipeline (Git, CodeBuild, Test); SageMaker Training Platform (Train, Evaluate); ECR; Config Store (S3); Model Repository (DDB, S3); SageMaker Endpoints; Inference (API Gateway, Lambda, Optimization compute); User Interface; End User. Legend: Training flow, Inference flow, Build flow.]

Slide 35

Slide 35 text

Architecture – 4 Blocks
[Architecture diagram. Blocks: Data Pipeline, Inference, Training Pipeline, Build Pipeline. Components: Source A, Source B, Source C; Raw Data Lake; ETL (Glue, EMR); Feature Store (Catalog, Analytics Data; DDB, S3); ETL; Model Input Store (Metadata, Data; DDB, S3); 4. Build; CodePipeline (Git, CodeBuild, Test); SageMaker Training Platform (Train, Evaluate); ECR; DDB; Config Store (S3); Model Repository (DDB, S3); SageMaker Endpoints (Pre, Inference, Post); API Gateway, Lambda, Optimization compute; User Interface; End User. Legend: Training flow, Inference flow, Build flow.]

Slide 36

Slide 36 text

Takeaways
Treat ML code as normal software code!
https://aws.amazon.com/fr/blogs/machine-learning/automated-and-continuous-deployment-of-amazon-sagemaker-models-with-aws-step-functions/

Slide 37

Slide 37 text

Thank You!