Continuous Delivery for Machine Learning Systems - ADDO

Adarsh Shah
November 12, 2020

A Machine Learning workflow includes data management, experiment management (model training and development), model deployment, serving, and retraining. Training a model takes hours, sometimes days, and typically deals with a large dataset. Training and serving a model also require special resources like high-density cores and GPUs.

In this talk, we will look at what Continuous Delivery for Machine Learning looks like, using anecdotes, and at how to use cloud-native technologies to perform the various steps in a Machine Learning workflow. We will also discuss how it differs from deploying other software, which aspects to consider, and the tools available to enable Continuous Delivery for machine learning.

Transcript

  1. Continuous Delivery for
    Machine Learning Systems
    Deploying ML Systems to
    Production safely and quickly
    in a sustainable way
    Adarsh Shah
    Engineering Leader, Coach, Hands-on Architect
    Independent Consultant
    @shahadarsh
    https://shahadarsh.com

  2. Hidden Technical Debt in ML Systems
    From the paper Hidden Technical Debt in Machine Learning Systems

  3. Traditional Software Development: Program + Data -> Results
    Machine Learning, Training: Data + Desired Results -> Model (Program)
    Machine Learning, Prediction: Program (Model) + Live Data -> Results
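The contrast on this slide can be sketched in a few lines of Python. This is only an illustration: the temperature example, the hand-picked threshold, and the names `traditional_program` and `train` are assumptions made here, not something taken from the talk. The point is that in the first case a person writes the rule, while in the second the rule is derived from data plus desired results.

```python
from typing import Callable, List


def traditional_program(temperature_c: float) -> bool:
    """Traditional software: a developer writes the rule by hand."""
    return temperature_c > 30.0  # hypothetical, hand-picked threshold


def train(data: List[float], desired: List[bool]) -> Callable[[float], bool]:
    """Training: data plus desired results go in, a program (the model) comes out.

    Tries the midpoint between each pair of neighbouring values as a
    threshold and keeps the one that misclassifies the fewest examples.
    """
    values = sorted(set(data))
    if len(values) < 2:
        raise ValueError("need at least two distinct data points")
    thresholds = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(threshold: float) -> int:
        return sum((x > threshold) != y for x, y in zip(data, desired))

    best = min(thresholds, key=errors)
    return lambda x: x > best


# Hypothetical training data: temperatures and whether each counts as "hot".
data = [10.0, 20.0, 28.0, 33.0, 35.0, 40.0]
desired = [False, False, False, True, True, True]

model = train(data, desired)      # training: produces the program
print(model(31.0), model(25.0))   # prediction on live data: True False
```

Changing the training data changes the resulting program without touching any code, which is why the data itself has to be managed and versioned as carefully as the code.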

  4. Data Management: Data Acquisition, Data Preparation
    Experimentation: Model Development, Training, Accuracy Evaluation
    Production Deployment: Validation, Prediction, Monitoring / Alerting
    Loops: Accuracy not reached, Accuracy reached, Data Drift, Retrain, Fix
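The loops in this workflow (accuracy not reached, data drift, retrain) can be sketched as a small decision function. This is a minimal sketch under stated assumptions: the drift measure (shift of the live mean in units of the training standard deviation), the limit of 2.0, and the names `drift_score` and `next_step` are all hypothetical choices made for illustration; real systems often use richer statistical tests.

```python
import statistics
from typing import Sequence


def drift_score(training: Sequence[float], live: Sequence[float]) -> float:
    """Shift of the live mean from the training mean, in training standard deviations."""
    mu = statistics.mean(training)
    sigma = statistics.pstdev(training)
    shift = abs(statistics.mean(live) - mu)
    if sigma == 0:
        return 0.0 if shift == 0 else float("inf")
    return shift / sigma


def next_step(accuracy: float, target_accuracy: float,
              drift: float, drift_limit: float = 2.0) -> str:
    """Decide what the workflow should do next."""
    if accuracy < target_accuracy:
        return "retrain"        # accuracy not reached
    if drift > drift_limit:
        return "retrain"        # live data no longer looks like training data
    return "keep serving"       # accuracy reached and no drift detected


training_feature = [1.0, 2.0, 3.0, 4.0, 5.0]   # hypothetical feature values
print(next_step(0.95, 0.90, drift_score(training_feature, [3.0, 3.0, 3.0])))
print(next_step(0.95, 0.90, drift_score(training_feature, [9.0, 10.0, 11.0])))
```

The first call keeps serving because the live data is centred on the training mean; the second asks for a retrain because the live mean has moved several standard deviations away.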

  5. Challenges Unique to ML

  6. #1: Data Management
    Data Location
    Large Datasets
    Security
    Compliance
    Data Quality
    Tracking Dataset

  7. #2: Experimentation
    Code Quality
    Research & Experimentation
    Tracking experiments
    Training Time & Troubleshooting
    Infrastructure Requirements
    Model Accuracy Evaluation

  8. #3: Production Deployment
    Offline/Online Prediction
    Monitoring & Alerting

  9. What is Continuous Delivery?
    Continuous Delivery is the ability to get changes of all
    types—including new features, configuration changes,
    bug fixes and experiments—into production, or into the
    hands of users, safely and quickly in a sustainable
    way.
    - Jez Humble & Dave Farley (Continuous Delivery Book Authors)

  10. Continuous Delivery

  11. Continuous Integration
    Continuous Integration is a software development
    practice where members of a team integrate their work
    frequently, usually each person integrates at least daily -
    leading to multiple integrations per day.
    - Martin Fowler

  12. Principles of Continuous Delivery
    ๏ Build quality in
    ๏ Work in small batches
    ๏ Computers perform repetitive tasks, people solve problems
    ๏ Relentlessly pursue continuous improvement (Kaizen)
    ๏ Everyone is responsible

  13. Data pipeline
    Data Source A -> Data Acquisition A -> Data Validation A -> Data Preparation A
    Data Source B -> Data Acquisition B -> Data Validation B -> Data Preparation B
    Data Source C -> Data Acquisition C -> Data Validation C -> Data Preparation C
    All sources -> Training Dataset (Versioned) -> Training Process
    Testing: Bias & Fairness, Security & Compliance
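Two steps of this pipeline, data validation and producing a versioned training dataset, can be sketched with the Python standard library. Everything specific here is an assumption for illustration: the `SCHEMA` fields, the range checks, and the choice of a truncated SHA-256 content hash as the version identifier are not from the talk, which does not prescribe a tool or scheme.

```python
import hashlib
import json
from typing import Dict, List, Tuple

Row = Dict[str, int]

# Hypothetical schema: field name -> expected type.
SCHEMA = {"id": int, "age": int, "label": int}


def validate(rows: List[Row]) -> Tuple[List[Row], List[Row]]:
    """Split rows into those that satisfy the schema and range checks and those that do not."""
    valid, rejected = [], []
    for row in rows:
        well_typed = all(isinstance(row.get(name), kind) for name, kind in SCHEMA.items())
        if well_typed and 0 <= row["age"] <= 120 and row["label"] in (0, 1):
            valid.append(row)
        else:
            rejected.append(row)
    return valid, rejected


def dataset_version(rows: List[Row]) -> str:
    """Content-based version: the same rows always yield the same identifier."""
    canonical = json.dumps(sorted(rows, key=lambda r: r["id"]), sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


raw = [
    {"id": 1, "age": 34, "label": 1},
    {"id": 2, "age": 250, "label": 0},   # fails the age range check
    {"id": 3, "age": 51, "label": 0},
]
valid, rejected = validate(raw)
version = dataset_version(valid)
print(len(valid), len(rejected), version)
```

Recording this version next to each trained model makes it possible to tell which data a given model was trained on, which is the "Tracking Dataset" concern from the data management challenges.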

  14. Continuous Integration (Training Code)
    Training Code
    Static Analysis (Linting etc.)
    Unit Tests
    Build Artifact
    Artifact Repository
    Dev Environment
    Validation Tests
    Merge to Main Branch
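As an illustration of the unit test stage for training code, here is a minimal sketch using Python's built-in `unittest`. The `min_max_scale` preprocessing function and its tests are hypothetical stand-ins; the idea is only that deterministic pieces of training code (feature scaling, parsing, label encoding) can be tested like any other software before an artifact is built.

```python
import unittest
from typing import List


def min_max_scale(values: List[float]) -> List[float]:
    """Scale values linearly into [0, 1]; constant input maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


class MinMaxScaleTest(unittest.TestCase):
    def test_scales_into_unit_range(self):
        self.assertEqual(min_max_scale([2.0, 4.0, 6.0]), [0.0, 0.5, 1.0])

    def test_constant_input_does_not_divide_by_zero(self):
        self.assertEqual(min_max_scale([3.0, 3.0]), [0.0, 0.0])


def run_tests() -> unittest.TestResult:
    """Run the suite programmatically, as a CI step might."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(MinMaxScaleTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


result = run_tests()
print("tests run:", result.testsRun, "success:", result.wasSuccessful())
```

In a real pipeline the tests would more likely live in their own module and be run with `python -m unittest`, with the merge blocked when the run fails.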

  15. Training
    Data Pipeline: Training Dataset
    Continuous Integration (Training Code)
    Configuration
    Data Scientist
    Trigger
    Training Environment
    Automated Provisioning/De-provisioning
    Accuracy Evaluation
    Testing (Bias & Fairness)
    Monitoring/Alerting
    Log Aggregation
    Model
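The accuracy evaluation and bias and fairness testing steps can be combined into an automated gate that decides whether a trained model moves forward. The sketch below is an assumption-laden example: the slide does not name a fairness metric, so demographic parity (the gap in positive-prediction rates between groups) is used here as one common choice, and the thresholds and the name `release_gate` are hypothetical.

```python
from typing import List, Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the expected labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def demographic_parity_gap(y_pred: Sequence[int], groups: Sequence[str]) -> float:
    """Largest difference in positive-prediction rate between any two groups."""
    rates: List[float] = []
    for group in sorted(set(groups)):
        preds = [p for p, g in zip(y_pred, groups) if g == group]
        rates.append(sum(preds) / len(preds))
    return max(rates) - min(rates)


def release_gate(y_true: Sequence[int], y_pred: Sequence[int], groups: Sequence[str],
                 min_accuracy: float = 0.8, max_gap: float = 0.2) -> bool:
    """Pass only if the model is accurate enough and the parity gap is small enough."""
    return (accuracy(y_true, y_pred) >= min_accuracy
            and demographic_parity_gap(y_pred, groups) <= max_gap)


# Hypothetical evaluation set with a group attribute per example.
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 1]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

print(accuracy(y_true, y_pred))                # 0.875
print(demographic_parity_gap(y_pred, groups))  # 0.25
print(release_gate(y_true, y_pred, groups))    # False: accurate, but the gap is too large
```

Here the model clears the accuracy bar yet is still held back, which shows why the fairness check is a separate test rather than part of accuracy evaluation.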

  16. Continuous Integration (Application Code)
    Application Code
    Static Analysis (Linting, Security Scan etc.)
    Unit Tests
    Build Artifact
    Artifact Repository
    Model (from Training)
    Ephemeral Environment
    Integration Tests
    Tag as Tested

  17. Bringing it all together
    Data Management: Data Pipeline, Training Dataset
    Experimentation: Data Scientist, Configuration, Continuous Integration (Training Code), Training, Model
    Production Deployment: Application Developer, Continuous Integration (Application Code), Deployment, Production Environment, Smoke Tests, Monitoring/Alerting
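The smoke test step after deployment can be sketched as a quick check that the deployed model answers a known request sensibly and quickly. This is a hedged sketch: `smoke_test`, the score range of 0 to 1, the one-second latency budget, and the stand-in `fake_predict` function are all assumptions. In a real deployment `predict` would call the production endpoint instead.

```python
import time
from typing import Callable, Dict, List


def smoke_test(predict: Callable[[Dict[str, float]], float],
               sample: Dict[str, float],
               max_seconds: float = 1.0) -> List[str]:
    """Return a list of failure messages; an empty list means the smoke test passed."""
    start = time.perf_counter()
    try:
        score = predict(sample)
    except Exception as exc:  # any error at all fails the smoke test
        return [f"prediction raised {exc!r}"]
    elapsed = time.perf_counter() - start

    failures = []
    if not isinstance(score, float) or not 0.0 <= score <= 1.0:
        failures.append(f"score out of range: {score!r}")
    if elapsed > max_seconds:
        failures.append(f"too slow: {elapsed:.2f}s")
    return failures


def fake_predict(features: Dict[str, float]) -> float:
    """Stand-in for a call to the deployed prediction service."""
    return 0.7


sample_request = {"age": 34.0, "income": 52000.0}   # hypothetical known-good input
print(smoke_test(fake_predict, sample_request))      # []
```

A smoke test like this only shows that the service is alive and returns plausible output; ongoing model quality is the job of the monitoring and alerting stage.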

  18. Machine Learning Roles
    ML Researcher
    ML Engineer
    Data Engineer
    MLOps Engineer

  19. Team Structure Considerations
    Cross Functional Team
    Separate Data Science Team
    ML Platform Engineering Team

  20. Platforms

  21. References
    • continuousdelivery.com
    • Dr. Deming’s 14 Points for Management
    • Challenges Deploying Machine Learning Models to Production
    • State of DevOps Report
    • martinfowler.com
    • Large image datasets: A pyrrhic win for computer vision?

  22. Book Recommendations

  23. Adarsh Shah
    Engineering Leader, Coach, Hands-on Architect
    Independent Consultant
    @shahadarsh
    https://shahadarsh.com