Operationalizing Ray Serve
Shreyas Krishnaswamy @ Anyscale
Slide 2
Outline
● Existing workflow for deploying Ray Serve
● Ray Serve’s new ops-friendly workflow
● Walk-through examples from the new Serve CLI
● Integration with deployment graphs
Slide 3
As you might have seen…
Slide 4
Configuring Deployments
Slide 5
Goal: Python-Centric API + Centralized Config
CLI
REST
Version Control
MLOps
Slide 6
Current Ray Serve Workflow
MLOps
Developers
Config Change
Deploy to Prod
Develop App
Slide 7
Challenges with Config in Python
● No source of truth
● Configuration mixed with code
● Tough to build custom ops tooling on top of Serve
Slide 8
Ray Serve’s New Ops-Friendly Workflow
MLOps
Developers
Deploy to Prod
Develop App
Slide 9
Operational Loop
Current Workflow
New Workflow
Slide 10
Operational Advantages
● Structured config is single source of truth
● Automation: easier access to configuration options
● Enables custom ops tooling for Serve using the new YAML config interface
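As a rough illustration of custom ops tooling built on a structured config, the sketch below changes the replica count of one deployment in a config held as a plain Python dict. The field names (`import_path`, `deployments`, `name`, `num_replicas`), the `app:graph` import path, the deployment names, and the helper `scale_deployment` are all assumptions made for illustration, not taken from the slides.

```python
import copy


def scale_deployment(config, name, num_replicas):
    """Return a copy of `config` with one deployment's replica count changed.

    `config` is a dict shaped like a parsed Serve config file (assumed layout).
    """
    if num_replicas < 1:
        raise ValueError("num_replicas must be at least 1")
    updated = copy.deepcopy(config)  # leave the caller's config untouched
    for deployment in updated.get("deployments", []):
        if deployment.get("name") == name:
            deployment["num_replicas"] = num_replicas
            return updated
    raise KeyError(f"no deployment named {name!r}")


# Hypothetical config, as it might look after loading a YAML file.
config = {
    "import_path": "app:graph",
    "deployments": [
        {"name": "Preprocessor", "num_replicas": 1},
        {"name": "Model", "num_replicas": 2},
    ],
}

scaled = scale_deployment(config, "Model", 8)
print(scaled["deployments"][1])
```

Because the helper returns a new dict instead of mutating its input, the original config stays available for comparison or rollback.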
Slide 11
Structured Config File
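A minimal sketch of what a structured config might contain, built as a Python dict and printed as JSON (which YAML parsers generally accept). The field names are assumptions about the Serve config schema, and the import path and deployment names are invented for the example.

```python
import json

# Hypothetical application config. Field names are assumptions about the
# Serve config schema; the import path and deployment names are invented.
serve_config = {
    "import_path": "app:graph",  # where Serve would look for the application
    "runtime_env": {},           # placeholder for dependencies, env vars, etc.
    "deployments": [
        {"name": "Preprocessor", "num_replicas": 2},
        {
            "name": "Model",
            "num_replicas": 4,
            "ray_actor_options": {"num_cpus": 1},
        },
    ],
}

# Indented output keeps version-control diffs easy to read.
config_text = json.dumps(serve_config, indent=2)
print(config_text)
```

Keeping every tunable setting in one file like this is what lets the file act as the single source of truth for a deployment.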
Slide 12
Ray Serve Offers the Best of Both Worlds
Developer
● Quick updates
● Few Replicas
● Python
Operator
● Consistent updates
● Many Replicas
● YAML
Slide 13
Updates: While Developing
Slide 14
Updates: In Production
New!
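Assuming production updates are made by editing the config file and redeploying it, a reviewer or CI job may want to see what a change would do before it is applied. The sketch below compares replica counts between two config versions. The config layout, the default of one replica when `num_replicas` is omitted, and the helper `replica_changes` are assumptions for illustration.

```python
def replica_changes(old, new):
    """Map deployment name -> (old_replicas, new_replicas) where they differ.

    A deployment missing from one config gets None on that side. A deployment
    that omits `num_replicas` is treated as having 1 replica (assumed default).
    """

    def by_name(config):
        return {
            d["name"]: d.get("num_replicas", 1)
            for d in config.get("deployments", [])
        }

    before, after = by_name(old), by_name(new)
    changes = {}
    for name in sorted(before.keys() | after.keys()):
        pair = (before.get(name), after.get(name))
        if pair[0] != pair[1]:
            changes[name] = pair
    return changes


# Hypothetical config versions, e.g. two commits of the same file.
current = {"deployments": [
    {"name": "Preprocessor", "num_replicas": 1},
    {"name": "Model", "num_replicas": 2},
]}
proposed = {"deployments": [
    {"name": "Preprocessor", "num_replicas": 1},
    {"name": "Model", "num_replicas": 8},
    {"name": "Router"},
]}

changes = replica_changes(current, proposed)
print(changes)
```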
Slide 15
Deployment Info: While Developing
Slide 16
Deployment Info: In Production
New!
Slide 17
Deployment Health Monitoring
New!
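One way ops tooling might consume health information is to reduce a status report to the list of deployments that need attention. The sketch below does that for a status payload held as a Python dict. The payload shape (`app_status`, `deployment_statuses`, and the `HEALTHY` / `UNHEALTHY` / `UPDATING` values) is an assumption about what Serve reports, and the sample data and helper name are invented.

```python
def unhealthy_deployments(status):
    """Return sorted names of deployments whose status is not HEALTHY.

    Anything other than HEALTHY is flagged, including deployments that are
    still updating, so the caller can decide how strict to be.
    """
    return sorted(
        d["name"]
        for d in status.get("deployment_statuses", [])
        if d.get("status") != "HEALTHY"
    )


# Hypothetical status payload; the real output format may differ.
sample_status = {
    "app_status": {"status": "RUNNING", "message": ""},
    "deployment_statuses": [
        {"name": "Preprocessor", "status": "HEALTHY", "message": ""},
        {"name": "Model", "status": "UNHEALTHY", "message": "health check failed"},
        {"name": "Router", "status": "UPDATING", "message": ""},
    ],
}

failing = unhealthy_deployments(sample_status)
print(failing)
```

A monitoring job could run a check like this on a schedule and alert when the returned list is non-empty.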
Slide 18
Deployment Graphs: serve run
New!
Slide 19
Deployment Graphs: serve build
New!
Slide 20
Future Plans: Improved Kubernetes Support
● Structured config is basis for a better Kubernetes integration
● Easily deploy, update, and monitor Ray Serve on K8s
● Enable automated workflows like CI/CD, continual learning
Slide 21
Future Plans: MLOps Integrations
● Ray Serve is a scalable compute layer
● Integrations with best-of-breed MLOps tooling
○ Model monitoring
○ Drift detection
○ Experiment tracking
○ Model management
Slide 22
Please get in touch
● Join the community
○ discuss.ray.io
○ github.com/ray-project/ray
○ @raydistributed and @anyscalecompute
● Fill out our survey (QR code) for:
○ Feedback to help shape the future of Ray Serve
○ One-on-one sessions with developers
○ Updates about upcoming features