@useautomation | DevOpsDays Austin
A Tale of Two Ops: How MLOps can learn from DevOps
Andre Elizondo
Solutions Architect @ WhyLabs
Who am I?
● Seattle, WA
● Recovering Sysadmin, SRE, Evangelist
○ Chef, Adobe, Datadog, Big Fish Games, Lacework, etc.
● More than 10 years in the DevOps community
Machine Learning is the new OpenStack
We’re here
We’ll get here soon
Machine Learning is the new shadow IT
We’re at risk of another pig over the fence
What is MLOps?
● Applying DevOps practices & culture in Machine Learning
● A process that involves multiple teams/silos
● Not AIOps
https://ml-ops.org/content/mlops-principles
How is it similar to DevOps?
● Shared concepts, but different
○ CI/CD
○ Observability
○ Automation
○ Containers
● Huge silos between teams
○ Data Engineering
○ Data Scientists
○ ML Engineers
○ Product Managers
○ DevOps/SRE/Operations
https://ml-ops.org/content/mlops-principles
CI/CD in MLOps
● Deploying your model
○ Testing is different
○ Scaling is different(ish)
○ Packaging is more or less the same
○ Continuous delivery is possible but harder
● ML Data Pipelines
○ Training
○ Feature
○ Inference
https://ml-ops.org/content/mlops-principles
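One way to picture "testing is different": a CI step for a model often checks a quality metric against the currently deployed model instead of asserting exact outputs. The following is a minimal sketch under that assumption. `EvalResult`, `evaluate`, `should_promote`, and both thresholds are hypothetical names and values chosen for illustration, not part of any specific tool.

```python
from dataclasses import dataclass


@dataclass
class EvalResult:
    accuracy: float   # fraction of held-out examples predicted correctly
    n_examples: int   # size of the held-out evaluation set


def evaluate(predict, examples):
    """Score a predict(features) -> label callable on (features, label) pairs."""
    correct = sum(1 for features, label in examples if predict(features) == label)
    return EvalResult(accuracy=correct / len(examples), n_examples=len(examples))


def should_promote(candidate, baseline, min_examples=100, max_regression=0.01):
    """Gate a deploy: pass only if the candidate is not meaningfully worse.

    The thresholds are illustrative assumptions, not recommended values.
    """
    if candidate.n_examples < min_examples:
        return False  # too little data to trust the comparison
    return candidate.accuracy >= baseline.accuracy - max_regression
```

In a pipeline, a failing `should_promote` would stop the deploy step in the same way a failing unit test stops an application build.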
Observability in MLOps
● Performance is important
○ Some similar metrics, some different ones
○ Threshold baselines are different
● Availability is important
○ Service availability isn’t enough
● External dependencies need to be monitored upstream
● Sometimes batch, sometimes real time
https://www.oreilly.com/library/view/reliable-machine-learning/9781098106218/
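As an illustration of why service availability is not enough, a model can be up and answering while its input data has shifted. A common way to quantify that shift for one numeric feature is the Population Stability Index (PSI). The sketch below is a plain-Python version. The binning scheme and the `1e-6` floor are assumptions made for this example, and any alert threshold would have to be chosen per feature.

```python
import math


def psi(reference, current, bins=10):
    """Population Stability Index between two samples of one numeric feature."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # avoid zero width when all values match

    def proportions(values):
        counts = [0] * bins
        for v in values:
            # Clamp out-of-range values into the first or last bin.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # Small floor keeps log() defined for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    ref_p, cur_p = proportions(reference), proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))
```

Here `reference` would typically be the training data and `current` a recent batch or window of production inputs, which works for both batch and real-time serving.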
Automation in MLOps
● Response workflows
○ Retraining
○ Roll-back
● Infrastructure as code
○ Terraform
● Monitoring as code
https://ml-ops.org/content/mlops-principles
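A small sketch of how "monitoring as code" and response workflows could fit together: monitors are declared as data in version control, and each one names the response to trigger. Every name here (`Monitor`, `Action`, `MONITORS`, `decide`), along with the metrics and thresholds, is hypothetical and only meant to show the shape of the idea.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    NONE = "none"
    RETRAIN = "retrain"
    ROLL_BACK = "roll_back"


@dataclass(frozen=True)
class Monitor:
    metric: str        # name of the metric this monitor watches
    threshold: float   # alert when the observed value exceeds this
    action: Action     # response workflow to trigger on alert


# Hypothetical monitors, kept in version control next to the model code.
MONITORS = [
    Monitor(metric="error_rate", threshold=0.05, action=Action.ROLL_BACK),
    Monitor(metric="feature_drift", threshold=0.25, action=Action.RETRAIN),
]


def decide(observed, monitors=MONITORS):
    """Return the action of the first breached monitor, or Action.NONE."""
    for monitor in monitors:
        value = observed.get(monitor.metric)
        if value is not None and value > monitor.threshold:
            return monitor.action
    return Action.NONE
```

Listing the roll-back monitor first is a deliberate choice in this sketch: when serving errors and drift both fire, restoring the last good model comes before retraining.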
Containers in MLOps
● Dependency isolation
● Yes, it’s still Kubernetes.
○ With all of its usual complaints.
● Sometimes controlled directly, most times through a platform
○ Kubeflow
○ SageMaker
○ AzureML
○ Vertex AI
● Scaled for model serving, training, and pipelines
What is unique about MLOps?
● Machine learning systems tend to be:
○ Fragile to changes in data
○ Harder to test
○ Harder to scale
● Complex to measure
○ What is good vs what is bad?
○ You may not know if something is good or bad for a while
● Models get worse over time, not better
● Data Scientists <3 Jupyter notebooks
https://cloud.google.com/architecture/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning
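To make "you may not know if something is good or bad for a while" concrete: ground-truth labels often arrive days after a prediction is served, so accuracy can only be computed for the portion of traffic that has been labeled so far. The sketch below assumes a hypothetical record layout (`request_id`, `day`, `predicted_label`) and a dictionary of labels that fills in over time.

```python
from collections import defaultdict


def accuracy_by_day(predictions, labels):
    """Accuracy per prediction day, using only predictions whose label has arrived.

    predictions: iterable of (request_id, day, predicted_label)
    labels: dict of request_id -> true_label, filled in as ground truth arrives
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for request_id, day, predicted in predictions:
        if request_id not in labels:
            continue  # ground truth not known yet; cannot score this one
        totals[day] += 1
        hits[day] += int(predicted == labels[request_id])
    return {day: hits[day] / totals[day] for day in sorted(totals)}
```

Days with no labels yet are simply absent from the result, which is itself a useful signal: the most recent days are the ones a dashboard cannot score.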
Why should you be excited about MLOps?
● There’s a TON of innovation happening
● There is a desperate need for operating experience
● MLOps is where DevOps was ~8-10 years ago
● Open source development is happening fast
● ML is here to stay
We have a LOT of knowledge to share
What should you do?
● Find out what models you’re running (or planning to run) in production
● Get involved, share knowledge and experiences
● Start experimenting with open source models & examples
● Talk about this with your team and think about how you can avoid surprises