Slide 1

Slide 1 text

Continuous Delivery for Machine Learning
Deploying ML Systems to Production safely and quickly in a sustainable way
Adarsh Shah
Engineering Leader, Coach, Hands-on Architect
Independent Consultant
@shahadarsh
https://shahadarsh.com
nycdevops

Slide 2

Slide 2 text

Adarsh Shah
Engineering Leader, Coach, Hands-on Architect
Independent Consultant

Slide 3

Slide 3 text

Hidden Technical Debt in ML Systems
From the paper "Hidden Technical Debt in Machine Learning Systems"

Slide 4

Slide 4 text

Traditional Software Development: Data + Program -> Results
Machine Learning (Training): Training Data + Desired Results -> Model (Program)
Machine Learning (Prediction): Live Data + Model -> Results
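
The contrast on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the one-feature threshold "model" and the names handwritten_rule, train, and predict are hypothetical and do not come from the talk.

```python
# Traditional software: a developer writes the program (the rule) by hand.
def handwritten_rule(x: float) -> int:
    return 1 if x >= 5.0 else 0  # threshold chosen by a person


# Machine learning: the program (model) is derived from training data
# plus the desired results for that data.
def train(training_data: list[float], desired_results: list[int]) -> float:
    """Return the threshold that classifies the most training examples correctly."""
    def correct(threshold: float) -> int:
        return sum(
            (1 if x >= threshold else 0) == y
            for x, y in zip(training_data, desired_results)
        )

    return max(sorted(set(training_data)), key=correct)


def predict(model: float, live_data: list[float]) -> list[int]:
    """Apply the learned threshold (the model) to live data."""
    return [1 if x >= model else 0 for x in live_data]


model = train([1.0, 2.0, 6.0, 8.0], [0, 0, 1, 1])
results = predict(model, [0.5, 7.0])
```

In the traditional case the threshold lives in source code; in the ML case it is an output of training and changes whenever the training data changes.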

Slide 5

Slide 5 text

[Diagram: ML workflow]
Phases: Data Management, Experimentation, Production Deployment
Steps: Data Acquisition, Data Preparation, Model Development, Training, Prediction, Accuracy Evaluation, Validation, Monitoring / Alerting
Transition labels: Accuracy not reached, Accuracy reached, Retrain, Data Drift, Fix
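
The "Data Drift" to "Retrain" transition in this workflow could be triggered by a check along these lines. This is a minimal sketch, assuming a single numeric feature and a simple mean-shift score; the function names and the threshold of 2.0 are hypothetical, and production systems typically use more robust statistical tests.

```python
from statistics import mean, stdev


def drift_score(training_values: list[float], live_values: list[float]) -> float:
    """Shift of the live mean from the training mean, in training standard deviations."""
    spread = stdev(training_values)
    shift = abs(mean(live_values) - mean(training_values))
    if spread == 0:
        return 0.0 if shift == 0 else float("inf")
    return shift / spread


def needs_retraining(
    training_values: list[float],
    live_values: list[float],
    threshold: float = 2.0,  # hypothetical alerting threshold
) -> bool:
    return drift_score(training_values, live_values) > threshold


training = [10.0, 11.0, 9.0, 10.5, 9.5]
stable_live = [10.2, 9.8, 10.1]
shifted_live = [14.0, 15.0, 13.5]
```

A monitoring job could run such a check on a schedule and raise an alert, or start the training pipeline, when it returns True.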

Slide 6

Slide 6 text

Challenges Unique to ML

Slide 7

Slide 7 text

#1: Data Management
• Data Location
• Large Datasets
• Security
• Compliance
• Data Quality
• Tracking Dataset
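
One way to approach the "Tracking Dataset" challenge is to derive a version identifier from the dataset's content, so that any change to the data yields a new version. The sketch below assumes the dataset fits in memory as a list of text rows; dataset_fingerprint is a hypothetical helper, not a tool named in the talk.

```python
import hashlib


def dataset_fingerprint(rows: list[str]) -> str:
    """Return a SHA-256 hex digest over the rows, in order."""
    digest = hashlib.sha256()
    for row in rows:
        digest.update(row.encode("utf-8"))
        digest.update(b"\n")  # row separator, so ["ab"] and ["a", "b"] differ
    return digest.hexdigest()


v1 = dataset_fingerprint(["id,label", "1,cat", "2,dog"])
v2 = dataset_fingerprint(["id,label", "1,cat", "2,bird"])
```

Storing the fingerprint next to each trained model records exactly which data produced it.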

Slide 8

Slide 8 text

#2: Experimentation
• Code Quality
• Research & Experimentation
• Tracking experiments
• Training Time & Troubleshooting
• Infrastructure Requirements
• Model Accuracy Evaluation
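
For "Tracking experiments", a run is reproducible only if the code version, dataset version, and parameters are recorded together with the resulting accuracy. The following is a rough sketch of such a record; ExperimentRun and its fields are hypothetical names chosen for illustration, and dedicated experiment-tracking tools offer far more.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ExperimentRun:
    run_id: str
    code_version: str      # e.g. a commit hash
    dataset_version: str   # e.g. a dataset fingerprint or tag
    params: dict
    accuracy: float


def to_log_line(run: ExperimentRun) -> str:
    """Serialize a run as one JSON line, suitable for an append-only log."""
    return json.dumps(asdict(run), sort_keys=True)


def best_run(runs: list[ExperimentRun]) -> ExperimentRun:
    return max(runs, key=lambda run: run.accuracy)


runs = [
    ExperimentRun("run-1", "abc123", "v1", {"learning_rate": 0.1}, 0.81),
    ExperimentRun("run-2", "abc123", "v1", {"learning_rate": 0.01}, 0.86),
]
```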

Slide 9

Slide 9 text

#3: Production Deployment
• Offline/Online Prediction
• Monitoring & Alerting

Slide 10

Slide 10 text

#4: Dependency Hell
• ARM architecture

Slide 11

Slide 11 text

What is Continuous Delivery?
"Continuous Delivery is the ability to get changes of all types—including new features, configuration changes, bug fixes and experiments—into production, or into the hands of users, safely and quickly in a sustainable way."
- Jez Humble & Dave Farley (Continuous Delivery Book Authors)

Slide 12

Slide 12 text

Continuous Delivery

Slide 13

Slide 13 text

Continuous Integration
"Continuous Integration is a software development practice where members of a team integrate their work frequently, usually each person integrates at least daily - leading to multiple integrations per day."
- Martin Fowler

Slide 14

Slide 14 text

Continuous Delivery: Push Code -> (auto) Unit Tests -> (auto) Integration Tests -> (auto) Acceptance Tests -> (manual) Deploy to Production
Continuous Deployment: Push Code -> (auto) Unit Tests -> (auto) Integration Tests -> (auto) Acceptance Tests -> (auto) Deploy to Production
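
The only difference between the two pipelines on this slide is whether the final step to production needs a human decision. A minimal sketch of that gate follows, assuming each test stage is a function returning True on success; run_pipeline and its parameters are hypothetical names, not the API of any CI tool.

```python
from typing import Callable, List, Tuple


def run_pipeline(
    stages: List[Tuple[str, Callable[[], bool]]],
    deploy: Callable[[], None],
    auto_deploy: bool,
    approved: bool = False,
) -> str:
    """Run the automated stages in order, then deploy if the gate allows it."""
    for name, check in stages:
        if not check():
            return f"failed: {name}"
    # Continuous Deployment: auto_deploy=True, no human in the loop.
    # Continuous Delivery: deploy only after someone approves the release.
    if auto_deploy or approved:
        deploy()
        return "deployed"
    return "awaiting manual approval"


deployed: List[str] = []
stages = [
    ("unit tests", lambda: True),
    ("integration tests", lambda: True),
    ("acceptance tests", lambda: True),
]
delivery = run_pipeline(stages, lambda: deployed.append("delivery"), auto_deploy=False)
deployment = run_pipeline(stages, lambda: deployed.append("deployment"), auto_deploy=True)
```

In both modes every stage before the gate is automated, so a release candidate is always ready; only the final decision differs.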

Slide 15

Slide 15 text

Principles of Continuous Delivery
• Build quality in
• Work in small batches
• Computers perform repetitive tasks, people solve problems
• Relentlessly pursue continuous improvement (Kaizen)
• Everyone is responsible

Slide 16

Slide 16 text

Toyota Production System

Slide 17

Slide 17 text

Data pipeline
• Data Source A -> Data Acquisition A -> Data Validation A -> Data Preparation A
• Data Source B -> Data Acquisition B -> Data Validation B -> Data Preparation B
• Data Source C -> Data Acquisition C -> Data Validation C -> Data Preparation C
• Output: Training Dataset (Versioned) -> Training Process
• Testing: Bias & Fairness, Security & Compliance
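
The "Data Validation" step in each branch of this pipeline might look roughly like the following. It is a sketch under assumptions: rows are plain dictionaries, and validate_rows, required_fields, and ranges are hypothetical names; schema-validation libraries cover many more cases.

```python
def validate_rows(
    rows: list[dict],
    required_fields: list[str],
    ranges: dict[str, tuple[float, float]],
) -> list[str]:
    """Return human-readable errors; an empty list means the batch is valid."""
    errors = []
    for index, row in enumerate(rows):
        for field in required_fields:
            if row.get(field) is None:
                errors.append(f"row {index}: missing {field}")
        for field, (low, high) in ranges.items():
            value = row.get(field)
            if value is not None and not (low <= value <= high):
                errors.append(f"row {index}: {field}={value} outside [{low}, {high}]")
    return errors


rows = [
    {"age": 34, "income": 52000.0},
    {"age": -3, "income": None},
]
errors = validate_rows(rows, ["age", "income"], {"age": (0, 120)})
```

Failing the pipeline when the error list is non-empty keeps bad records from ever reaching the versioned training dataset.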

Slide 18

Slide 18 text

Continuous Integration (Training Code)
Diagram elements: Training Code, Static Analysis (Linting etc.), Unit Tests, Build Artifact, Artifact Repository, Dev Environment, Validation Tests, Merge to Main Branch
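
Training code can be unit tested like any other code, which is what the "Unit Tests" stage here relies on. As an illustration, the sketch below tests a small preprocessing function with Python's standard unittest module; min_max_scale is a hypothetical example and does not come from the talk.

```python
import unittest


def min_max_scale(values: list[float]) -> list[float]:
    """Scale values linearly into [0, 1]; a constant column maps to all zeros."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(value - low) / (high - low) for value in values]


class MinMaxScaleTest(unittest.TestCase):
    def test_scales_to_unit_interval(self):
        self.assertEqual(min_max_scale([2.0, 4.0, 6.0]), [0.0, 0.5, 1.0])

    def test_constant_column_does_not_divide_by_zero(self):
        self.assertEqual(min_max_scale([3.0, 3.0]), [0.0, 0.0])
```

A CI job would typically run such tests with python -m unittest on every push, before the build artifact is published.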

Slide 19

Slide 19 text

Training (diagram)
• Inputs: Data Pipeline (Training Dataset), Continuous Integration (Training Code), Configuration
• Actor and trigger: Data Scientist, Trigger
• Training Environment: Automated Provisioning/De-provisioning, Log Aggregation, Monitoring/Alerting
• Evaluation: Accuracy Evaluation, Testing (Bias & Fairness)
• Output: Model
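
The "Accuracy Evaluation" and "Testing (Bias & Fairness)" steps can act as an automated gate before a model is published. The sketch below is one simple way to express such a gate, assuming classification labels and a group attribute per example. Using the gap in per-group accuracy as a fairness proxy is an assumption made here for illustration, and the thresholds are hypothetical.

```python
def accuracy(predictions: list[int], labels: list[int]) -> float:
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def accuracy_by_group(
    predictions: list[int], labels: list[int], groups: list[str]
) -> dict[str, float]:
    totals: dict[str, int] = {}
    correct: dict[str, int] = {}
    for p, y, group in zip(predictions, labels, groups):
        totals[group] = totals.get(group, 0) + 1
        correct[group] = correct.get(group, 0) + (p == y)
    return {group: correct[group] / totals[group] for group in totals}


def passes_gate(
    predictions: list[int],
    labels: list[int],
    groups: list[str],
    min_accuracy: float = 0.8,    # hypothetical threshold
    max_group_gap: float = 0.1,   # hypothetical threshold
) -> bool:
    per_group = accuracy_by_group(predictions, labels, groups)
    gap = max(per_group.values()) - min(per_group.values())
    return accuracy(predictions, labels) >= min_accuracy and gap <= max_group_gap


labels = [1, 0, 1, 0, 1, 0, 1, 0]
predictions = [1, 0, 1, 0, 1, 0, 0, 1]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
```

With these sample values the model is perfect on group "a" but no better than chance on group "b", so the gate rejects it even before the overall accuracy threshold is considered.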

Slide 20

Slide 20 text

Continuous Integration (Application Code)
Diagram elements: Application Code, Static Analysis (Linting, Security Scan etc.), Unit Tests, Build Artifact, Artifact Repository, Model (from Training), Ephemeral Environment, Integration Tests, Tag as Tested

Slide 21

Slide 21 text

Bringing it all together
• Data Management: Data Pipeline -> Training Dataset
• Experimentation: Data Scientist, Continuous Integration (Training Code), Configuration, Training -> Model
• Production Deployment: Application Developer, Continuous Integration (Application Code), Deployment, Production Environment, Smoke Tests, Monitoring/Alerting
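
The "Smoke Tests" step after deployment only needs to confirm that the deployed model answers at all and returns well-formed output. A minimal sketch, assuming the production endpoint is wrapped in a predict callable; smoke_test and the sample inputs are hypothetical.

```python
from typing import Callable, Iterable, List, Set


def smoke_test(
    predict: Callable[[str], str],
    samples: Iterable[str],
    allowed_outputs: Set[str],
) -> List[str]:
    """Call the deployed model on a few samples; return a list of failures."""
    failures = []
    for sample in samples:
        try:
            output = predict(sample)
        except Exception as error:  # any exception counts as a failed smoke test
            failures.append(f"{sample!r}: raised {type(error).__name__}")
            continue
        if output not in allowed_outputs:
            failures.append(f"{sample!r}: unexpected output {output!r}")
    return failures


def healthy_model(text: str) -> str:
    return "spam" if "win" in text else "ham"


def broken_model(text: str) -> str:
    raise RuntimeError("model not loaded")
```

A non-empty failure list would be the signal to roll back the deployment and alert the team.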

Slide 22

Slide 22 text

Machine Learning Roles
• ML Researcher
• ML Engineer
• Data Engineer
• MLOps Engineer

Slide 23

Slide 23 text

Team Structure Considerations
• Cross Functional Team
• Separate Data Science Team
• ML Platform Engineering Team

Slide 24

Slide 24 text

Platforms available

Slide 25

Slide 25 text

Platforms

Slide 26

Slide 26 text

Kubeflow

Slide 27

Slide 27 text

References
• continuousdelivery.com
• Dr. Deming’s 14 Points for Management
• Challenges Deploying Machine Learning Models to Production
• State of DevOps Report
• martinfowler.com
• Large image datasets: A pyrrhic win for computer vision?

Slide 28

Slide 28 text

Book Recommendations

Slide 29

Slide 29 text

Questions
Adarsh Shah
Engineering Leader, Coach, Hands-on Architect
Independent Consultant
@shahadarsh
https://shahadarsh.com
nycdevops