Deploying Machine Learning Models to Production: Challenges & Solutions using MLOps

For the most part, machine learning is similar to traditional software development, and most of the principles and practices that apply to traditional software development also apply to machine learning. However, deploying ML models to production comes with certain unique challenges.

In this presentation, we will look at the top challenges you face when deploying machine learning models to production and how to tackle them using MLOps.

Key Takeaways:

* How machine learning differs from traditional software development
* Top challenges in deploying ML models to production
* What MLOps is and how to use it to tackle ML-specific challenges
* Anecdotes about deploying ML Models using industry principles and best practices

Adarsh Shah

April 15, 2020

Transcript

  1. Deploying Machine Learning Systems to Production
    Challenges & Solutions using MLOps
    Adarsh Shah
    Engineering Leader, Coach, Hands-on Architect
    Independent Consultant
    @shahadarsh
    shahadarsh.com

  2. Traditional Software Development vs Machine Learning

  3. Machine Learning Workflow
    Steps: Data Acquisition, Data Preparation, Model Development, Model Training, Model Serving, Accuracy Evaluation
    Loops: Code Changes, Retraining
    Phases: Data Management, Experimentation, Production Deployment

  4. Fun Fact about Model Training
    Have you ever helped train an ML model?
    Everybody has. They just don’t realize it.

  5. Hidden Technical Debt in ML Systems
    From the paper Hidden Technical Debt in Machine Learning Systems

  6. Challenges Unique to ML

  7. #1: Data Management
    Data Location
    Large Datasets
    Security
    Compliance

  8. #2: Constant Research and Experimentation
    Code Quality
    Experimentation
    Notebooks
    Tracking experiments

  9. #3: Training Process
    Retraining
    Training Time
    Reproducibility

  10. #4: Infrastructure Requirements
    Edge devices
    GPU & high density cores
    Costs
    Elasticity

  11. #5: Testing
    Model Accuracy
    Data Validation
    Model Bias & Fairness
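
The testing concerns on this slide can be expressed as automated checks. The following is a minimal sketch, assuming binary labels (0 or 1) and one group label per sample. The function names, the accuracy floor, and the use of the positive-prediction-rate gap between groups as the fairness measure are illustrative assumptions, not something the talk specifies.

```python
from typing import Hashable, List, Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def positive_rate_gap(y_pred: Sequence[int], groups: Sequence[Hashable]) -> float:
    """Largest difference in positive-prediction rate between any two groups."""
    totals: dict = {}
    positives: dict = {}
    for pred, group in zip(y_pred, groups):
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + (1 if pred == 1 else 0)
    rates = [positives[g] / totals[g] for g in totals]
    return max(rates) - min(rates)


def failed_gates(y_true, y_pred, groups, min_accuracy: float, max_gap: float) -> List[str]:
    """Return the names of the checks that failed (hypothetical gate names)."""
    failures = []
    if accuracy(y_true, y_pred) < min_accuracy:
        failures.append("accuracy")
    if positive_rate_gap(y_pred, groups) > max_gap:
        failures.append("fairness")
    return failures
```

A pipeline could call `failed_gates` after training and block promotion of the model when the returned list is not empty.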

  12. #6: Dependency Hell
    Dependency Hell
    ARM architecture

  13. MLOps

  14. MLOps
    MLOps = Machine Learning + DevOps
    People + Process + Technology

  15. Roles
    ML Researcher
    ML Engineer
    Data Engineer
    MLOps Engineer

  16. Team Structure Considerations
    Cross-functional Team
    Separate Data Science Team
    ML Platform Engineering Team

  17. Data Pipeline
    Data Source A -> Data Acquisition A -> Data Preparation A -> Data Validation A
    Data Source B -> Data Acquisition B -> Data Preparation B -> Data Validation B
    Data Source N -> Data Acquisition N -> Data Preparation N -> Data Validation N
    Input -> Training Dataset
    Input -> Training Process
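
The per-source flow on this slide (acquisition, preparation, validation, then a combined training dataset) can be sketched in a few lines. This is an illustrative sketch only: the record shape, the rule that preparation drops records with missing values, and all function names are assumptions made for the example.

```python
from typing import Callable, Dict, Iterable, List, Optional, Sequence

Record = Dict[str, Optional[float]]


def prepare(records: Iterable[Record]) -> List[Record]:
    """Data preparation (assumed rule): drop records that contain missing values."""
    return [r for r in records if all(v is not None for v in r.values())]


def validate(records: List[Record], required_fields: Sequence[str], source: str) -> List[Record]:
    """Data validation: fail fast if a record lacks a required field."""
    for index, record in enumerate(records):
        missing = [f for f in required_fields if f not in record]
        if missing:
            raise ValueError(f"source {source}: record {index} is missing {missing}")
    return records


def build_training_dataset(
    sources: Dict[str, Callable[[], Iterable[Record]]],
    required_fields: Sequence[str],
) -> List[Record]:
    """Run acquisition, preparation and validation per source, then combine."""
    dataset: List[Record] = []
    for name, acquire in sources.items():
        raw = list(acquire())                                       # Data Acquisition
        prepared = prepare(raw)                                     # Data Preparation
        dataset.extend(validate(prepared, required_fields, name))   # Data Validation
    return dataset
```

Keeping each stage as a separate function per source means a failure in one source is reported with that source's name, rather than surfacing later inside the training process.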

  18. Training Pipeline
    Training Code
    Continuous Integration
    Training Data
    Data Pipeline
    Pre-trained Weights
    Validation
    Artifact Repository
    Push image
    Training Environment: Cloud, On-Prem or Edge location
    Infra provisioning automation
    GPU support
    Monitoring/Logging/Alerting
    UI or command
    Schedule Training
    Bias & Fairness Testing
    Build & Version Model
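
One way to approach the "Build & Version Model" step is to derive the version from everything that went into the training run. The sketch below is an assumption for illustration, not the scheme used in the talk: it hashes a code revision string, a digest of the training data, and the hyperparameters, so the same inputs always give the same version identifier.

```python
import hashlib
import json


def digest_bytes(data: bytes) -> str:
    """SHA-256 digest of raw bytes, for example the contents of a training data file."""
    return hashlib.sha256(data).hexdigest()


def model_version(code_revision: str, data_digest: str, hyperparams: dict) -> str:
    """Deterministic short version id derived from the inputs of a training run."""
    payload = json.dumps(
        {"code": code_revision, "data": data_digest, "hyperparams": hyperparams},
        sort_keys=True,  # stable key order so equal inputs hash identically
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]
```

Tagging the pushed image with such an identifier makes it possible to tell whether two models were trained from the same code, data and settings.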

  19. Deployment Pipeline
    GitOps
    Monitoring/Logging/Alerting
    Artifact Repository
    Pull Model Image
    Model Training
    Retrain Model (if accuracy below acceptable %)
    Push to Master
    Infra provisioning automation
    GPU support
    Model Serving Environment: Cloud, On-Prem or Edge location
    Deploy Model
    Evaluate Model Accuracy (Periodic)
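
The periodic evaluation and the "retrain if accuracy is below the acceptable %" rule on this slide amount to a simple threshold check. A minimal sketch follows, assuming the caller supplies an `evaluate` function that returns the current accuracy and a `retrain` function that starts a training run. Both names, and the string return values, are hypothetical.

```python
from typing import Callable


def periodic_check(
    evaluate: Callable[[], float],
    retrain: Callable[[], None],
    acceptable_accuracy: float,
) -> str:
    """Evaluate the served model once and trigger retraining if it has degraded."""
    current = evaluate()
    if current < acceptable_accuracy:
        retrain()  # for example, kick off the training pipeline
        return "retrain"
    return "ok"
```

A scheduler would call `periodic_check` at a fixed interval. The threshold is a value the team has to choose for its own model.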

  20. Platforms available

  21. Platforms

  22. Kubeflow

  23. To sum it up
    • Machine Learning Workflow
    • Traditional Software Development vs Machine Learning
    • Unique ML Challenges
      • Data Management
      • Constant Research and Experimentation
      • Training Process
      • Infrastructure Requirements
      • Testing
      • Dependency Hell
    • MLOps
      • Roles & Team Structure Considerations
      • Data, Training & Deployment Pipeline
      • Platforms Available

  24. Questions
    Adarsh Shah
    Engineering Leader, Coach, Hands-on Architect
    Independent Consultant
    @shahadarsh
    shahadarsh.com
