Upgrade to Pro — share decks privately, control downloads, hide ads and more …

RLlib: Scalable RL training and serving in the ...

Anyscale
December 01, 2021

RLlib: Scalable RL training and serving in the cloud with Ray

RLlib on Ray is an industrial-strength reinforcement learning (RL) framework with the power of Ray autoscaling built in.

Get an overview of RLlib, and learn why organizations like Wildlife Studios and Dow Chemicals are using it to apply RL to business problems like recommendations and supply chain optimization.

This webinar will also show how to set up an environment, train a model, and deploy to an HTTPS endpoint to serve your policy in production.

Anyscale

December 01, 2021
Tweet

More Decks by Anyscale

Other Decks in Technology

Transcript

  1. Sven Mika, ML Engineer @ Anyscale Will Drevo, PM for

    ML @ Anyscale Scalable RL training and serving in the cloud with Ray 11/30/2021
  2. 1. What brings you here 2. When to use RL?

    3. Challenges in using RL 4. Why use Ray RLlib? 5. Demo! Agenda
  3. About us! Hi! We are: • Sven Mika ◦ Lead

    Ray RLlib core developer @ Anyscale • Will Drevo ◦ Product Manager, open-source ML @ Anyscale We’ve worked with countless RL practitioners in the field to use real, scalable, production-ready RL in the process of building RLlib. Today we’d love to share those insights.
  4. http://rllib.io • Reinforcement Learning (RL) can learn decision making “policies”

    in an end-to-end fashion. agent Inputs Why Reinforcement learning? Actions
  5. http://rllib.io Big categories • Offline learning – Recommendations – Training

    from block-box simulator • Multi-agent – Simulating game play with multiple agents for game fairness / leveling • Locomotion – Moving robotic arms, self-driving cars • Game play – AlphaGo, playing Atari-like games • Scheduling
  6. http://rllib.io 1 0 RL application examples II (RLlib is now

    part of several industry RL platforms)
  7. http://rllib.io Challenges in using RL • Difficult to stabilize and

    run deterministically • Difficult to tune and find best policy • Difficult to scale out training in distributed setting • Difficult to implement newest state-of-the-art algorithms • Framework-specific serving solutions are inflexible and require proxying through additional services, increasing complexity to end Python users
  8. http://rllib.io 1 4 Why we built RLlib on top of

    Ray Fig. courtesy OpenAI 100 fold’ish over 2 years 3-4 fold’ish over 2 years -> not enough: we need the cloud! Fig. courtesy NVidia Inc.
  9. What is Ray Native Libraries 3rd Party Libraries Your app

    here! Universal framework for distributed computing Run anywhere Library + app ecosystem
  10. http://rllib.io RLlib reference algorithms 1 6 • Cohesive API across

    – 14 TensorFlow algorithms – 20 PyTorch algorithms • Scale algorithms from laptop to a cluster • Customize algorithms for complex use cases, e.g., multi-agent RL