Continuous Delivery for Machine Learning - nycdevops

Adarsh Shah
September 17, 2020

Continuous Delivery has been a key approach for deploying changes to Traditional Software in Production safely and quickly in a sustainable way.

Machine Learning (ML) is fundamentally different from Traditional Software. A typical ML workflow includes Data Management, Experimentation (Model Training and Development), Model Deployment, and Prediction. Training a model takes hours, sometimes days, and typically involves a large dataset. Training and Model Prediction also require special resources such as high-density cores and GPUs. For these reasons and others, ML systems face their own challenges when deploying to Production.

In this presentation, we will look at the top challenges in deploying ML systems to Production and at how Continuous Delivery principles can help solve them, so that ML systems can also be deployed to Production safely and quickly in a sustainable way. We will also look at the tools available to enable Continuous Delivery for Machine Learning.

Transcript

  1. Continuous Delivery for Machine Learning: Deploying ML Systems to Production safely and quickly in a sustainable way. Adarsh Shah (Engineering Leader, Coach, Hands-on Architect, Independent Consultant), @shahadarsh, https://shahadarsh.com, nycdevops
  2. Hidden Technical Debt in ML Systems (from the paper Hidden Technical Debt in Machine Learning Systems)
  3. [Diagram] Traditional Software Development: Data, Program, Results. Machine Learning: Training (Training Data, Desired Results, Model); Prediction (Live Data, Program, Results)
  4. [Diagram] ML workflow. Phases: Data Management, Experimentation, Production Deployment. Steps: Data Acquisition, Data Preparation, Model Development, Training, Prediction, Accuracy Evaluation, Validation, Monitoring / Alerting. Loop labels: Accuracy not reached, Retrain, Accuracy reached, Data Drift, Fix
  5. #2: Experimentation. Code Quality; Research & Experimentation; Tracking experiments; Training Time & Troubleshooting; Infrastructure Requirements; Model Accuracy Evaluation
  6. What is Continuous Delivery? "Continuous Delivery is the ability to get changes of all types—including new features, configuration changes, bug fixes and experiments—into production, or into the hands of users, safely and quickly in a sustainable way." - Jez Humble & Dave Farley (Continuous Delivery Book Authors)
  7. Continuous Integration. "Continuous Integration is a software development practice where members of a team integrate their work frequently, usually each person integrates at least daily - leading to multiple integrations per day." - Martin Fowler
  8. [Diagram] Continuous Delivery: Push Code, Unit Tests, Integration Tests, Acceptance Tests, Deploy to Production; every step is automatic (Auto) except the final deploy to Production, which is triggered manually (Manual). Continuous Deployment: the same stages, with every step automatic (Auto), including the deploy to Production
  9. Principles of Continuous Delivery
    ๏ Build quality in
    ๏ Work in small batches
    ๏ Computers perform repetitive tasks, people solve problems
    ๏ Relentlessly pursue continuous improvement (Kaizen)
    ๏ Everyone is responsible
  10. [Diagram] Data pipeline. Data Sources A, B and C each have their own Data Acquisition, Data Validation and Data Preparation steps, feeding a Versioned Training Dataset and the Training Process. Testing: Bias & Fairness; Security & Compliance
  11. [Diagram] Continuous Integration (Training Code). Training Code; Static Analysis (Linting etc.); Unit Tests; Build Artifact; Artifact Repository; Dev Environment; Validation Tests; Merge to Main Branch
  12. [Diagram] Training. Data Pipeline (Training Dataset); Continuous Integration (Training Code); Configuration; Data Scientist; Trigger; Training Environment; Automated Provisioning/De-provisioning; Accuracy Evaluation; Testing (Bias & Fairness); Monitoring/Alerting; Log Aggregation; Model
  13. [Diagram] Continuous Integration (Application Code). Application Code; Static Analysis (Linting, Security Scan etc.); Unit Tests; Build Artifact; Artifact Repository; Training; Model; Ephemeral Environment; Integration Tests; Tag as Tested
  14. [Diagram] Bringing it all together. Phases: Data Management, Experimentation, Production Deployment. Data Pipeline; Training Dataset; Continuous Integration (Training Code); Data Scientist; Configuration; Training; Model; Continuous Integration (Application Code); Application Developer; Deployment; Production Environment; Smoke Tests; Monitoring/Alerting
  15. References
    • continuousdelivery.com
    • Dr. Deming’s 14 Points for Management
    • Challenges Deploying Machine Learning Models to Production
    • State of DevOps Report
    • martinfowler.com
    • Large image datasets: A pyrrhic win for computer vision?
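
The stages named on slides 8, 10, 12 and 14 (Data Validation, Training, Accuracy Evaluation, then a manual or automatic deploy to Production) can be sketched roughly as below. This is only an illustrative toy, not the pipeline from the talk: the `Row` schema, the one-feature threshold "model", the stage names in `PipelineResult`, and the `min_accuracy`, `auto_deploy` and `approved` parameters are all assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical toy schema: one numeric feature (possibly missing) and a 0/1 label.
Row = Tuple[Optional[float], int]


@dataclass
class PipelineResult:
    stage: str                 # last stage the run reached (names are made up)
    deployed: bool
    accuracy: Optional[float]  # None if evaluation never ran


def validate_rows(rows: List[Row]) -> List[Row]:
    """Data Validation: keep rows with a present feature and a 0/1 label."""
    return [(x, y) for x, y in rows if x is not None and y in (0, 1)]


def train(rows: List[Row]) -> Optional[float]:
    """Toy 'training': a threshold halfway between the two class means.

    Assumes class 1 tends to have larger feature values than class 0.
    """
    zeros = [x for x, y in rows if y == 0]
    ones = [x for x, y in rows if y == 1]
    if not zeros or not ones:
        return None  # cannot train without examples of both classes
    return (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2


def accuracy(threshold: float, rows: List[Row]) -> float:
    """Accuracy Evaluation: share of rows where (x >= threshold) matches the label."""
    if not rows:
        return 0.0
    hits = sum(1 for x, y in rows if int(x >= threshold) == y)
    return hits / len(rows)


def run_pipeline(train_rows: List[Row], test_rows: List[Row],
                 min_accuracy: float, auto_deploy: bool,
                 approved: bool = False) -> PipelineResult:
    """Run validation, training and the accuracy gate, then decide on deployment.

    auto_deploy=True mimics Continuous Deployment; auto_deploy=False mimics
    Continuous Delivery, where a person must set approved=True to release.
    """
    threshold = train(validate_rows(train_rows))
    if threshold is None:
        return PipelineResult("data_validation", False, None)
    acc = accuracy(threshold, validate_rows(test_rows))
    if acc < min_accuracy:
        # "Accuracy not reached": stop here and go back to retraining.
        return PipelineResult("accuracy_not_reached", False, acc)
    if auto_deploy or approved:
        return PipelineResult("production", True, acc)
    return PipelineResult("awaiting_approval", False, acc)
```

Calling `run_pipeline(rows, held_out, min_accuracy=0.9, auto_deploy=False)` stops at the approval step even when the accuracy gate passes, which is the only difference from the `auto_deploy=True` path. A real pipeline would replace each function with a separate, versioned stage; keeping them as plain functions here just makes the gating logic easy to unit test.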