Slide 1

Operationalizing Ray Serve
Shreyas Krishnaswamy @ Anyscale

Slide 2

Outline
● Existing workflow for deploying Ray Serve
● Ray Serve’s new ops-friendly workflow
● Walk-through examples from the new Serve CLI
● Integration with deployment graphs

Slide 3

As you might have seen…

Slide 4

Configuring Deployments

Slide 5

Goal: Python-Centric API + Centralized Config
(Diagram labels: CLI, REST, Version Control, MLOps)

Slide 6

Current Ray Serve Workflow
(Diagram labels: Developers, MLOps, Develop App, Config Change, Deploy to Prod)

Slide 7

Challenges with Config in Python
● No source of truth
● Configuration mixed with code
● Tough to build custom ops tooling on top of Serve

Slide 8

Ray Serve’s New Ops-Friendly Workflow
(Diagram labels: Developers, MLOps, Develop App, Deploy to Prod)

Slide 9

Operational Loop
(Diagram comparing the operational loop in the current workflow and the new workflow)

Slide 10

Operational Advantages
● Structured config is the single source of truth
● Automation: easier access to configuration options
● Enables custom ops tooling for Serve using the new YAML config interface
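The automation point can be illustrated with a small stdlib-only sketch: ops tooling edits the structured config rather than the application code. The dict schema below is an assumption modeled loosely on Serve's YAML config, and `bump_replicas` is a hypothetical helper, not part of the Serve API.

```python
import copy

# A Serve-style config as a plain dict (schema assumed for illustration;
# real configs are YAML files, e.g. the output of `serve build`).
config = {
    "import_path": "app:graph",
    "deployments": [
        {"name": "Preprocessor", "num_replicas": 1},
        {"name": "Model", "num_replicas": 2},
    ],
}

def bump_replicas(cfg, deployment, replicas):
    """Hypothetical ops helper: return a new config with an updated
    replica count, leaving the application code untouched."""
    new_cfg = copy.deepcopy(cfg)
    for d in new_cfg["deployments"]:
        if d["name"] == deployment:
            d["num_replicas"] = replicas
    return new_cfg

scaled = bump_replicas(config, "Model", 10)
print(scaled["deployments"][1]["num_replicas"])  # → 10
print(config["deployments"][1]["num_replicas"])  # original untouched → 2
```

Because the config, not the code, is the source of truth, this kind of change can be reviewed, version-controlled, and applied by automation.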

Slide 11

Structured Config File
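A minimal sketch of such a structured config file (field names follow the schema emitted by `serve build` in Ray 2.x; the import path and deployment names here are hypothetical):

```yaml
import_path: app:graph

runtime_env: {}

deployments:
  - name: Preprocessor
    num_replicas: 1
  - name: Model
    num_replicas: 2
    ray_actor_options:
      num_cpus: 1
```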

Slide 12

Ray Serve Offers the Best of Both Worlds

Developer:
● Quick updates
● Few Replicas
● Python

Operator:
● Consistent updates
● Many Replicas
● YAML

Slide 13

Updates: While Developing

Slide 14

Updates: In Production New!

Slide 15

Deployment Info: While Developing

Slide 16

Deployment Info: In Production New!

Slide 17

Deployment Health Monitoring New!
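Health monitoring goes through the same CLI; a sketch assuming an app has already been deployed to a running cluster:

```
$ serve status
# Prints the status of each deployment
# (e.g. HEALTHY, UPDATING, or UNHEALTHY), so ops tooling can
# poll it to detect failed or in-progress rollouts.
```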

Slide 18

Deployment Graphs: serve run New!

Slide 19

Deployment Graphs: serve build New!
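The two commands fit together in the developer-to-operator handoff; a sketch assuming a deployment graph importable as `app:graph` (the import path and file names are placeholders):

```
# Developer loop: run the graph locally for quick iteration.
$ serve run app:graph

# Handoff: generate a structured YAML config from the same graph.
$ serve build app:graph -o serve_config.yaml

# Operator loop: deploy the reviewed config to a running cluster.
$ serve deploy serve_config.yaml
```

`serve build` is what turns the Python-centric graph into the centralized config file that the rest of the ops workflow revolves around.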

Slide 20

Future Plans: Improved Kubernetes Support
● Structured config is the basis for a better Kubernetes integration
● Easily deploy, update, and monitor Ray Serve on K8s
● Enable automated workflows like CI/CD and continual learning

Slide 21

Future Plans: MLOps Integrations
● Ray Serve is a scalable compute layer
● Integrations with best-in-breed MLOps tooling
  ○ Model monitoring
  ○ Drift detection
  ○ Experiment tracking
  ○ Model management

Slide 22

Please get in touch
● Join the community
  ○ discuss.ray.io
  ○ github.com/ray-project/ray
  ○ @raydistributed and @anyscalecompute
● Fill out our survey (QR code) for:
  ○ Feedback to help shape the future of Ray Serve
  ○ One-on-one sessions with developers
  ○ Updates about upcoming features