Continuous Delivery for Machine Learning - nycdevops

Adarsh Shah
September 17, 2020

Continuous Delivery has been a key approach for deploying changes to Traditional Software into Production safely and quickly in a sustainable way.

Machine Learning (ML) is fundamentally different from Traditional Software. A typical ML workflow includes Data Management, Experimentation (Model Training and Development), Model Deployment, and Prediction. Training a model takes hours, sometimes days, and typically involves a large dataset. Training and Model Prediction also require special resources such as high-density cores and GPUs. For these reasons and others, ML systems face their own challenges when deploying to Production.

In this presentation, we will look at the top challenges of deploying ML systems to Production and how Continuous Delivery principles can help solve them, so that ML systems can also be deployed to Production safely and quickly in a sustainable way. We will also look at the tools available to enable Continuous Delivery for Machine Learning.

Transcript

  1. Continuous Delivery for Machine Learning
    Deploying ML Systems to Production safely and quickly in a sustainable way
    Adarsh Shah
    Engineering Leader, Coach, Hands-on Architect
    Independent Consultant
    @shahadarsh
    https://shahadarsh.com
    nycdevops

  2. Adarsh Shah
    Engineering Leader, Coach, Hands-on Architect
    Independent Consultant

  3. Hidden Technical Debt in ML Systems
    From the paper "Hidden Technical Debt in Machine Learning Systems"

  4. Traditional Software Development vs. Machine Learning
    [Diagram]
    Traditional Software Development: Program + Data -> Results
    Machine Learning, Training: Training Data + Desired Results -> Model (Program)
    Machine Learning, Prediction: Model (Program) + Live Data -> Results
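
The contrast on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the spam-style rule, the single numeric feature, and the names `is_spam_rule`, `train_threshold`, and `predict` are all hypothetical and do not come from the talk.

```python
# Traditional software: a person writes the program (the rule) by hand.
def is_spam_rule(exclamation_marks: int) -> bool:
    # Hypothetical hand-written threshold.
    return exclamation_marks > 3


# Machine learning: training derives the "program" from data plus desired results.
def train_threshold(training_data: list[int], desired_results: list[bool]) -> int:
    """Return the threshold that classifies the training set with the fewest errors."""
    candidates = sorted(set(training_data))
    best_threshold, best_errors = candidates[0], len(training_data) + 1
    for threshold in candidates:
        errors = sum(
            (value > threshold) != label
            for value, label in zip(training_data, desired_results)
        )
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


def predict(threshold: int, live_value: int) -> bool:
    """Prediction: apply the trained model to live data."""
    return live_value > threshold


# Toy training set (made up for illustration).
training_data = [0, 1, 2, 5, 7, 9]
desired_results = [False, False, False, True, True, True]
model = train_threshold(training_data, desired_results)
```

Here `model` plays the role of the program: it is produced by training rather than written by hand, and `predict(model, live_value)` applies it to live data.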

  5. ML workflow
    [Diagram. Phases: Data Management; Experimentation; Production Deployment.
    Stages shown: Data Acquisition; Data Preparation; Validation; Model Development; Training; Accuracy Evaluation; Prediction; Monitoring / Alerting.
    Transitions shown: Accuracy not reached; Accuracy reached; Retrain; Data Drift Fix.]
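
The feedback loops in this workflow (retrain when accuracy is not reached, react when data drifts) might be sketched as follows. The drift test used here, a shift of the live mean measured in training standard deviations, is an assumption chosen for brevity and is not taken from the talk; `has_drifted`, `next_step`, and the default thresholds are hypothetical.

```python
from statistics import mean, stdev


def has_drifted(training_values: list[float], live_values: list[float],
                max_shift_in_stdevs: float = 2.0) -> bool:
    """Flag drift when the live mean moves too far from the training mean."""
    spread = stdev(training_values)
    if spread == 0:
        return mean(live_values) != mean(training_values)
    shift = abs(mean(live_values) - mean(training_values)) / spread
    return shift > max_shift_in_stdevs


def next_step(accuracy: float, target_accuracy: float, drifted: bool) -> str:
    """Pick the next workflow step from the evaluation and monitoring results."""
    if drifted:
        return "fix data drift and retrain"
    if accuracy < target_accuracy:
        return "retrain"
    return "deploy"
```

In practice the drift check would run as part of monitoring and alerting on live prediction inputs; a single-feature mean comparison is only the simplest possible stand-in.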

  6. Challenges Unique to ML

  7. #1: Data Management
    Data Location
    Large Datasets
    Security
    Compliance
    Data Quality
    Tracking Dataset

  8. #2: Experimentation
    Code Quality
    Research & Experimentation
    Tracking experiments
    Training Time & Troubleshooting
    Infrastructure Requirements
    Model Accuracy Evaluation

  9. #3: Production Deployment
    Offline/Online Prediction
    Monitoring & Alerting

  10. #4: Dependency Hell
    Dependency Hell
    ARM architecture

  11. What is Continuous Delivery?
    Continuous Delivery is the ability to get changes of all
    types—including new features, configuration changes,
    bug fixes and experiments—into production, or into the
    hands of users, safely and quickly in a sustainable
    way.
    - Jez Humble & Dave Farley (Continuous Delivery Book Authors)

  12. Continuous Delivery

  13. Continuous Integration
    Continuous Integration is a software development
    practice where members of a team integrate their work
    frequently, usually each person integrates at least daily -
    leading to multiple integrations per day.
    - Martin Fowler

  14. Continuous Delivery vs. Continuous Deployment
    [Diagram]
    Continuous Delivery: Push Code -> (Auto) Unit Tests -> (Auto) Integration Tests -> (Auto) Acceptance Tests -> (Manual) Deploy to Production
    Continuous Deployment: Push Code -> (Auto) Unit Tests -> (Auto) Integration Tests -> (Auto) Acceptance Tests -> (Auto) Deploy to Production
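
The difference between the two pipelines comes down to whether the final step needs a person. A minimal sketch, assuming a hypothetical `run_stage` callback that reports whether a stage passed (the function names and parameters are illustrative, not from the talk):

```python
from typing import Callable

# Stage names follow the slide; the stage runner is a hypothetical stand-in.
STAGES = ["Unit Tests", "Integration Tests", "Acceptance Tests"]


def run_pipeline(run_stage: Callable[[str], bool],
                 continuous_deployment: bool,
                 manual_approval: bool = False) -> list[str]:
    """Return the list of steps completed for one pushed change."""
    completed = ["Push Code"]
    for stage in STAGES:
        if not run_stage(stage):
            return completed  # Stop at the first failing stage.
        completed.append(stage)
    # Continuous Deployment releases automatically; Continuous Delivery
    # waits for a person to approve the production release.
    if continuous_deployment or manual_approval:
        completed.append("Deploy to Production")
    return completed
```

With `continuous_deployment=False`, a change that passes every automated stage stays releasable but is not deployed until `manual_approval=True` is supplied.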

  15. Principles of Continuous Delivery
    • Build quality in
    • Work in small batches
    • Computers perform repetitive tasks, people solve problems
    • Relentlessly pursue continuous improvement (Kaizen)
    • Everyone is responsible

  16. Toyota Production System

  17. Data pipeline
    [Diagram. For each of Data Source A, B, and C: Data Acquisition; Data Validation; Data Preparation.
    Further labels: Training Dataset (Versioned); Training Process; Testing: Bias & Fairness, Security & Compliance.]
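
One way to read this pipeline in code: each source is validated and prepared, the results are merged, and the merged training dataset gets a version derived from its content. The sketch below is an assumption-laden illustration; the record schema in `REQUIRED_FIELDS`, the function names, and the choice of a SHA-256 content hash as the version are all hypothetical.

```python
import hashlib
import json

REQUIRED_FIELDS = {"id", "feature", "label"}  # Hypothetical schema.


def validate(records: list[dict]) -> list[dict]:
    """Data validation: keep only records that have every required field set."""
    return [
        r for r in records
        if REQUIRED_FIELDS.issubset(r) and all(r[f] is not None for f in REQUIRED_FIELDS)
    ]


def prepare(records: list[dict]) -> list[dict]:
    """Data preparation: sort by id so the dataset is deterministic."""
    return sorted(records, key=lambda r: r["id"])


def dataset_version(records: list[dict]) -> str:
    """Version the training dataset by hashing its canonical JSON form."""
    payload = json.dumps(records, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


def build_training_dataset(*sources: list[dict]) -> tuple[list[dict], str]:
    """Run validation and preparation per source, then merge and version."""
    merged = [r for source in sources for r in prepare(validate(source))]
    merged = prepare(merged)
    return merged, dataset_version(merged)
```

Because the version is computed from the content, the same inputs give the same version regardless of source order, and any changed record gives a new one, which lets a training run record exactly which dataset it used.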

  18. Continuous Integration (Training Code)
    [Diagram. Labels shown: Training Code; Static Analysis (Linting etc.); Unit Tests; Build Artifact; Artifact Repository; Dev Environment; Validation Tests; Merge to Main Branch.]

  19. Training
    [Diagram. Labels shown: Data Pipeline; Training Dataset; Continuous Integration (Training Code); Configuration; Data Scientist; Trigger; Training Environment; Automated Provisioning/De-provisioning; Log Aggregation; Monitoring/Alerting; Accuracy Evaluation; Testing (Bias & Fairness); Model.]
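
The Accuracy Evaluation and Testing (Bias & Fairness) steps can act as a gate before a trained model is accepted. A minimal sketch, assuming that fairness is approximated by the accuracy gap between groups; that metric, the thresholds, and the names `accuracy` and `passes_gate` are assumptions for illustration, not the method described in the talk.

```python
def accuracy(predictions: list[int], labels: list[int]) -> float:
    """Fraction of predictions that match the labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def passes_gate(predictions: list[int], labels: list[int], groups: list[str],
                min_accuracy: float = 0.9, max_group_gap: float = 0.1) -> bool:
    """Accept a model only if overall accuracy and the per-group gap are acceptable."""
    if accuracy(predictions, labels) < min_accuracy:
        return False
    per_group = []
    for group in set(groups):
        idx = [i for i, g in enumerate(groups) if g == group]
        per_group.append(
            accuracy([predictions[i] for i in idx], [labels[i] for i in idx])
        )
    # Hypothetical fairness proxy: best and worst group accuracy must be close.
    return max(per_group) - min(per_group) <= max_group_gap
```

A pipeline could call `passes_gate` after each training run and only publish the model artifact when it returns `True`.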

  20. Continuous Integration (Application Code)
    [Diagram. Labels shown: Application Code; Static Analysis (Linting, Security Scan etc.); Unit Tests; Build Artifact; Artifact Repository; Ephemeral Environment; Integration Tests; Tag as Tested; Model; Training.]

  21. Bringing it all together
    [Diagram. Phases: Data Management; Experimentation; Production Deployment.
    Labels shown: Data Pipeline; Training Dataset; Continuous Integration (Training Code); Data Scientist; Configuration; Training; Model; Continuous Integration (Application Code); Application Developer; Deployment; Production Environment; Smoke Tests; Monitoring/Alerting.]
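
The Smoke Tests step after deployment can be as small as one known-good request against the production model. The sketch below assumes a hypothetical `predict` callable that takes a request dictionary and returns a probability; neither the interface nor the check comes from the talk.

```python
from typing import Callable


def smoke_test(predict: Callable[[dict], float], sample_request: dict) -> bool:
    """Return True if the deployed model answers a known-good request sanely."""
    try:
        score = predict(sample_request)
    except Exception:
        return False  # Any error from the deployed model fails the smoke test.
    # Assumed contract: the model returns a probability between 0 and 1.
    return isinstance(score, float) and 0.0 <= score <= 1.0
```

A failing smoke test would typically stop the rollout or trigger a rollback before monitoring and alerting take over.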

  22. Machine Learning Roles
    ML Researcher
    ML Engineer
    Data Engineer
    MLOps Engineer

  23. Team Structure Considerations
    Cross Functional Team
    Separate Data Science Team
    ML Platform Engineering Team

  24. Platforms available

  25. Platforms

  26. Kubeflow

  27. References
    • continuousdelivery.com
    • Dr. Deming’s 14 Points for Management
    • Challenges Deploying Machine Learning Models to Production
    • State of DevOps Report
    • martinfowler.com
    • Large image datasets: A pyrrhic win for computer vision?

  28. Book Recommendations

  29. Questions