
Ray Serve: Overview and future roadmap

In this introductory session, we’ll discuss the motivation behind Ray Serve, who’s using Ray Serve and why, and recent features and updates, including a look at the future feature roadmap as we approach Ray 2.0.

Anyscale

April 14, 2022

Transcript

  1. Outline
     • Ray Serve Introduction
     • Feedback from the Community
     • Plans for Ray 2.0 and beyond
       ◦ Preview for other talks today!
  2. Ray Serve TL;DR: flexible, scalable compute for model serving
     1. Scalable
     2. Low latency
     3. Efficient
     • First-class support for multi-model serving
     • Python-native: mix business logic & ML
  3. Multi-model Serving
     • Write a unified Python program
     • Use your favorite tools & libraries
     • Scale across CPUs and GPUs
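The "unified Python program" idea can be sketched with a minimal deployment. This is a hedged sketch, not code from the talk: the names `SentimentService` and `classify` are illustrative, the `@serve.deployment` / `serve.run` calls follow the Ray 2.x-style API, and the import is guarded so the plain-Python business logic runs even where Ray is not installed.

```python
# Sketch of a unified serving program: plain-Python business logic plus a
# stand-in "model", scaled out with Ray Serve. All names are illustrative.
def classify(text: str) -> str:
    # Stand-in for real model inference, mixed freely with business logic.
    positive = {"good", "great", "love"}
    return "positive" if positive & set(text.lower().split()) else "negative"

try:
    from ray import serve  # guarded so the sketch still loads without Ray

    @serve.deployment(num_replicas=2)  # scale across replicas (CPUs/GPUs)
    class SentimentService:
        async def __call__(self, request):
            data = await request.json()
            return {"label": classify(data["text"])}

    if __name__ == "__main__":
        serve.run(SentimentService.bind())  # serve the deployment over HTTP
except ImportError:
    pass  # Ray not installed; classify() above still works standalone
```

The point of the pattern is that `classify` is ordinary Python: the same function can be unit-tested locally and called from inside a scaled deployment.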
  4. Feedback from the Community
     • Multi-model serving is a big need and key strength
     • 💸 ML inference is expensive! Efficiency is key.
     • We need better support & documentation for CI/CD
     • Emerging pattern: continual learning
  5. In-progress for Ray 2.0
     • Double down on multi-model: the Deployment Graph API
     1. REST API & improved Kubernetes support
     2. Integrations with best-in-breed MLOps tooling
     • Seamless interoperability with Ray AIR
     Hear from Jiao and Shreyas later today! 🤩
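Conceptually, a deployment graph composes independent models and Python steps into one served DAG. Here is a framework-free sketch of the shape such a graph expresses (all function names are hypothetical and this is not the Ray Serve Deployment Graph API itself, which the talk defers to Jiao's session):

```python
# Framework-free sketch of a multi-model graph: preprocess -> two models
# -> combine. In a deployment graph, each node would be a separately
# scaled deployment; here they are plain callables to show the dataflow.
def preprocess(text: str) -> list[str]:
    return text.lower().split()

def model_a(tokens: list[str]) -> float:
    return len(tokens) / 10            # toy "score": longer input, higher score

def model_b(tokens: list[str]) -> float:
    return float(any(len(t) > 5 for t in tokens))  # toy binary signal

def combine(a: float, b: float) -> float:
    return (a + b) / 2                 # simple ensemble of the two models

def graph(text: str) -> float:
    tokens = preprocess(text)
    return combine(model_a(tokens), model_b(tokens))

print(graph("deployment graphs compose models"))  # -> 0.7
```

Because each node is independent, a serving framework can replicate the expensive model nodes without touching the cheap preprocessing step.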
  6. Extended Roadmap
     • Scale-to-zero
     • gRPC support
     • Model multiplexing (100s-1000s of small models)
     • Shared memory for model weights
     • …
     We want to hear from you!
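Model multiplexing here means serving hundreds of small models behind one endpoint without keeping them all in memory. A framework-free sketch of the idea (the `ModelMultiplexer` class and its LRU policy are hypothetical, not Ray Serve's eventual API):

```python
from collections import OrderedDict

class ModelMultiplexer:
    """Keep at most `capacity` models loaded, evicting the least recently used."""

    def __init__(self, loader, capacity=2):
        self.loader = loader          # callable: model_id -> loaded model
        self.capacity = capacity
        self.cache = OrderedDict()    # model_id -> loaded model, in LRU order

    def get(self, model_id):
        if model_id in self.cache:
            self.cache.move_to_end(model_id)      # mark as recently used
        else:
            if len(self.cache) >= self.capacity:
                self.cache.popitem(last=False)    # evict least-recently-used
            self.cache[model_id] = self.loader(model_id)
        return self.cache[model_id]

# Usage: each "model" is just a function of the input here.
mux = ModelMultiplexer(loader=lambda mid: (lambda x: f"{mid}:{x}"), capacity=2)
print(mux.get("a")("query"))  # loads model "a" on first use
```

The roadmap items pair naturally: shared memory for model weights would lower the cost of each slot in such a cache, and scale-to-zero is the `capacity == 0` limit of the same idea.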
  7. Please get in touch
     • Join the community
       ◦ discuss.ray.io
       ◦ github.com/ray-project/ray
       ◦ @raydistributed and @anyscalecompute
     • Fill out our survey (QR code) for:
       ◦ Feedback to help shape the future of Ray Serve
       ◦ One-on-one sessions with developers
       ◦ Updates about upcoming features