RLlib: Scalable RL training and serving in the cloud with Ray

Sven Mika, ML Engineer @ Anyscale
Will Drevo, PM for ML @ Anyscale
11/30/2021

Agenda
1. What brings you here?
2. When to use RL?
3. Challenges in using RL
4. Why use Ray RLlib?
5. Demo!

About us! Hi! We are:
• Sven Mika
  ◦ Lead Ray RLlib core developer @ Anyscale
• Will Drevo
  ◦ Product Manager, open-source ML @ Anyscale
While building RLlib, we’ve worked with countless RL practitioners in the field to put real, scalable, production-ready RL to use. Today we’d love to share those insights.

(1) What brings you here? What is your use case?

(2) When to use RL?

Why Reinforcement Learning? (http://rllib.io)
• Reinforcement Learning (RL) can learn decision-making “policies” in an end-to-end fashion.
[Figure: an agent maps inputs to actions]
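To make "learning a policy end-to-end" concrete, here is a minimal sketch of tabular Q-learning in plain Python. It is not RLlib code: the corridor environment, the `step`, `train`, and `greedy_policy` helpers, and all hyperparameter values are assumptions made up for illustration. The agent only ever sees a state (input) and a reward, and ends up with a mapping from states to actions.

```python
import random

N_STATES = 5          # hypothetical corridor with cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)    # step left or step right

def step(state, action):
    """Toy environment: returns (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                a = rng.randrange(2)                       # explore
            else:
                a = 0 if q[state][0] > q[state][1] else 1  # exploit
            next_state, reward, done = step(state, ACTIONS[a])
            # Bootstrapped target: reward plus discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q

def greedy_policy(q):
    """Map every non-terminal state to its highest-valued action."""
    return [ACTIONS[0] if row[0] > row[1] else ACTIONS[1] for row in q[:-1]]

q_table = train()
policy = greedy_policy(q_table)  # one action per non-terminal cell
```

Deep RL replaces the table with a neural network, but the loop of acting, observing a reward, and updating the policy keeps the same shape.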

Big categories
• Offline learning
  – Recommendations
  – Training from black-box simulator
• Multi-agent
  – Simulating game play with multiple agents for game fairness / leveling
• Locomotion
  – Moving robotic arms, self-driving cars
• Game play
  – AlphaGo, playing Atari-like games
• Scheduling

RL “industrial-strength” use cases (collected from RLlib’s industry users)

Other (Deep) RL application examples
• AlphaGo
• Control systems
• Offer campaigns
• Scheduling
• Trading
• Decisioning

RL application examples II (RLlib is now part of several industry RL platforms)

(3) Challenges in using RL

Challenges in using RL
• Difficult to stabilize and run deterministically
• Difficult to tune and find the best policy
• Difficult to scale out training in a distributed setting
• Difficult to implement the newest state-of-the-art algorithms
• Framework-specific serving solutions are inflexible and require proxying through additional services, increasing complexity for end Python users

(4) Why use Ray RLlib?

Why we built RLlib on top of Ray
• Roughly 100-fold over 2 years (Fig. courtesy OpenAI)
• Roughly 3-4-fold over 2 years (Fig. courtesy NVidia Inc.) -> not enough: we need the cloud!

What is Ray?
• Universal framework for distributed computing
• Run anywhere
• Library + app ecosystem
[Diagram labels: Native Libraries, 3rd Party Libraries, Your app here!]

RLlib reference algorithms
• Cohesive API across
  – 14 TensorFlow algorithms
  – 20 PyTorch algorithms
• Scale algorithms from laptop to a cluster
• Customize algorithms for complex use cases, e.g., multi-agent RL
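The "laptop to a cluster" point rests on a fan-out/gather pattern: several workers collect experience independently and a central learner aggregates it. The sketch below shows only that structure, using Python's standard-library thread pool as a stand-in for distributed workers. It is not RLlib's API; `rollout`, `collect`, the coin-flip environment, and the worker count are hypothetical names and values chosen for illustration.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def rollout(worker_seed, episode_len=100):
    """Hypothetical rollout worker: plays one episode of a coin-flip
    environment with a random policy and returns the total reward."""
    rng = random.Random(worker_seed)  # per-worker seed keeps runs reproducible
    return sum(1.0 if rng.random() < 0.5 else 0.0 for _ in range(episode_len))

def collect(num_workers):
    """Fan rollouts out to a pool of workers and gather results in order."""
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        return list(pool.map(rollout, range(num_workers)))

returns = collect(num_workers=4)
mean_return = sum(returns) / len(returns)  # what a learner would aggregate
```

Threads in CPython do not give CPU parallelism for this kind of work, so the example illustrates the shape of the computation rather than a speedup. Ray expresses the same pattern with remote tasks and actors that can run on other machines, and giving each worker its own seed is one simple way to keep such runs repeatable.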

A universal framework for distributed computing
Notable users of Ray
Ray RLlib users!

Some of RLlib’s Industry Users

RLlib Benchmarks @ https://github.com/ray-project/rl-experiments

(5) Demo!

RLlib demo
• Training a policy from scratch
• Using a distributed GPU cluster

Thanks!
Ray Slack Channel: https://forms.gle/9TSdDYUgxYs8SA9e8
Forum: https://discuss.ray.io/
