
RLlib: Scalable RL training and serving in the cloud with Ray

RLlib on Ray is an industrial-strength reinforcement learning (RL) framework with the power of Ray autoscaling built in.

Get an overview of RLlib, and learn why organizations like Wildlife Studios and Dow Chemicals are using it to apply RL to business problems like recommendations and supply chain optimization.

This webinar will also show how to set up an environment, train a model, and deploy to an HTTPS endpoint to serve your policy in production.



December 01, 2021


  1. Sven Mika, ML Engineer @ Anyscale; Will Drevo, PM for ML @ Anyscale. Scalable RL training and serving in the cloud with Ray. 11/30/2021
  2. Agenda: 1. What brings you here? 2. When to use RL? 3. Challenges in using RL 4. Why use Ray RLlib? 5. Demo!
  3. About us! Hi! We are: • Sven Mika ◦ Lead Ray RLlib core developer @ Anyscale • Will Drevo ◦ Product Manager, open-source ML @ Anyscale. While building RLlib, we’ve worked with many RL practitioners in the field to put real, scalable, production-ready RL to use. Today we’d love to share those insights.
  4. (1) What brings you here? What is your use case?

  5. (2) When to use RL?

  6. http://rllib.io Why reinforcement learning? • Reinforcement learning (RL) can learn decision-making “policies” in an end-to-end fashion. [Diagram: inputs → agent → actions]
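
The inputs → agent → actions picture on this slide can be sketched as a plain loop. This is a minimal illustration in standard-library Python, not RLlib code: the `Corridor` environment, the `always_right` policy, and the `rollout` helper are hypothetical names made up for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Corridor:
    """Hypothetical toy environment: walk from cell 0 to the last cell."""
    length: int = 5
    position: int = 0

    def reset(self) -> int:
        # Start every episode at the left end and return the first observation.
        self.position = 0
        return self.position

    def step(self, action: int):
        # action 1 moves right, anything else moves left; stay inside the corridor.
        move = 1 if action == 1 else -1
        self.position = max(0, min(self.length - 1, self.position + move))
        done = self.position == self.length - 1
        reward = 1.0 if done else -0.1  # small step cost, bonus at the goal
        return self.position, reward, done


def always_right(observation: int) -> int:
    """A hand-written policy: maps any input (observation) to the action 'right'."""
    return 1


def rollout(env, policy, max_steps: int = 50) -> float:
    """Run one episode and return the summed reward."""
    observation = env.reset()
    total = 0.0
    for _ in range(max_steps):
        action = policy(observation)                  # agent: input -> action
        observation, reward, done = env.step(action)  # environment responds
        total += reward
        if done:
            break
    return total


if __name__ == "__main__":
    print(rollout(Corridor(), always_right))
```

In RL the policy is learned from the rewards instead of being written by hand; the loop around it keeps this shape.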
  7. Big categories • Offline learning – Recommendations – Training from a black-box simulator • Multi-agent – Simulating game play with multiple agents for game fairness / leveling • Locomotion – Moving robotic arms, self-driving cars • Game play – AlphaGo, playing Atari-like games • Scheduling
  8. RL “industrial-strength” use cases (collected from RLlib’s industry users)

  9. Other (deep) RL application examples: AlphaGo, control systems, offer campaigns, scheduling, trading, decisioning
  10. RL application examples II (RLlib is now part of several industry RL platforms)
  11. (3) Challenges in using RL

  12. Challenges in using RL • Difficult to stabilize and run deterministically • Difficult to tune and find the best policy • Difficult to scale out training in a distributed setting • Difficult to implement the newest state-of-the-art algorithms • Framework-specific serving solutions are inflexible and require proxying through additional services, increasing complexity for end Python users
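
The first two challenges (running deterministically, and tuning) can be made concrete with a toy experiment. The sketch below is plain standard-library Python, not RLlib: `run_bandit` and `sweep_epsilon` are hypothetical helpers, and the arm payout probabilities are made-up numbers. It assumes that all randomness flows through one explicitly seeded generator, which is what makes a run repeatable.

```python
import random


def run_bandit(epsilon: float, seed: int, steps: int = 500) -> float:
    """Epsilon-greedy on a 3-armed bandit; returns the average reward per step."""
    rng = random.Random(seed)      # local, seeded RNG: no hidden global state
    payout_prob = [0.2, 0.5, 0.8]  # made-up arm payout probabilities
    counts = [0, 0, 0]
    estimates = [0.0, 0.0, 0.0]
    total = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(3)                             # explore
        else:
            arm = max(range(3), key=lambda a: estimates[a])    # exploit
        reward = 1.0 if rng.random() < payout_prob[arm] else 0.0
        counts[arm] += 1
        # Incremental mean of the rewards observed for this arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total += reward
    return total / steps


def sweep_epsilon(epsilons, seed: int) -> dict:
    """A tiny grid search: one seeded run per candidate epsilon."""
    return {eps: run_bandit(eps, seed) for eps in epsilons}


if __name__ == "__main__":
    # Same seed and same settings give the same number on every run.
    assert run_bandit(0.1, seed=0) == run_bandit(0.1, seed=0)
    results = sweep_epsilon((0.0, 0.1, 0.3), seed=0)
    print(max(results, key=results.get), results)
```

A single seed per setting is only enough to show the mechanics. A real comparison would average several seeds per setting, because the ranking between settings can change from seed to seed.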
  13. (4) Why use Ray RLlib?

  14. Why we built RLlib on top of Ray. Fig. courtesy OpenAI: 100-fold’ish over 2 years. Fig. courtesy NVidia Inc.: 3-4-fold’ish over 2 years -> not enough: we need the cloud!
  15. What is Ray? [Diagram: native libraries, 3rd-party libraries, your app here!] • Universal framework for distributed computing • Run anywhere • Library + app ecosystem
  16. RLlib reference algorithms • Cohesive API across – 14 TensorFlow algorithms – 20 PyTorch algorithms • Scale algorithms from laptop to a cluster • Customize algorithms for complex use cases, e.g., multi-agent RL
  17. A universal framework for distributed computing. Notable users of Ray. Ray RLlib users!
  18. Some of RLlib’s Industry Users

  19. RLlib Benchmarks @ https://github.com/ray-project/rl-experiments

  20. (5) Demo!

  21. RLlib demo • Training a policy from scratch • Using a distributed GPU cluster
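
The transcript does not capture the demo itself. As a stand-in, here is what "training a policy from scratch" means at the smallest possible scale: tabular Q-learning on a made-up six-cell corridor, in standard-library Python. This is not the RLlib demo and uses no RLlib or Ray calls; `step`, `train`, `greedy_policy`, and all hyperparameter values are assumptions chosen for illustration.

```python
import random

N_STATES = 6      # cells 0..5; reaching cell 5 ends the episode
ACTIONS = (0, 1)  # 0 = left, 1 = right


def step(state: int, action: int):
    """Deterministic toy dynamics: small cost per move, reward 1.0 at the goal."""
    nxt = max(0, min(N_STATES - 1, state + (1 if action == 1 else -1)))
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else -0.01), done


def train(episodes=300, alpha=0.5, gamma=0.95, epsilon=0.2, seed=0):
    """Learn a Q-table starting from all zeros (no prior knowledge)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the episode length
            # Explore with probability epsilon, and also when the two
            # estimates are tied, so an empty table favors neither action.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            nxt, reward, done = step(state, action)
            # Q-learning update: move the estimate toward the one-step target.
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
    return q


def greedy_policy(q):
    """Read the learned policy off the table: best action per state."""
    return [0 if left > right else 1 for left, right in q]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

The slide's second bullet is about what changes at scale: many workers collect experience in parallel and the update runs on GPUs, which this single-process sketch does not attempt to show.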
  22. Thanks! Ray Slack Channel: https://forms.gle/9TSdDYUgxYs8SA9e8 Forum: https://discuss.ray.io/