Using RLlib in an Enterprise Scale Reinforcement Learning Solution (Jeroen Bédorf & Ishaan Sood, minds.ai)

DeepSim is an optimization platform that can use advanced Reinforcement Learning (RL) methods to develop neural network-based controller software. DeepSim supports various RL libraries, including RLlib. In this talk, we discuss how RLlib and the Tune hyperparameter optimizer are used to develop controller software. In addition to the default set of features that RLlib offers, DeepSim provides its users with custom loggers, action distributions, and network architectures for improved controller performance. The training runs required to train the neural network are executed on a Kubernetes-based Ray cluster and can be monitored via command-line interface tools as well as via TensorBoard. Finally, we show how the trained neural network can be exported, for example via Keras, to be deployed on target hardware.

All of the above is demonstrated using two concrete examples: in the first, the fuel efficiency of a Hybrid Electric Vehicle is optimized; in the second, we develop cruise control software using the Ansys VRXPERIENCE autonomous driving simulator.
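As a rough illustration of what a hyperparameter optimizer such as Tune automates, the sketch below expands a small grid of settings into individual trial configurations and picks the best one. It is plain Python and deliberately does not use the Tune API; the parameter names, the values, and the `placeholder_score` function are all assumptions made up for this example, standing in for a real training run.

```python
import itertools

# Hypothetical search space: parameter names and values are illustrative,
# not settings from the talk.
search_space = {
    "lr": [1e-3, 1e-4],
    "gamma": [0.95, 0.99],
    "hidden_units": [64, 128],
}


def expand_grid(space):
    """Return one config dict per combination of the listed values."""
    keys = list(space)
    combos = itertools.product(*(space[key] for key in keys))
    return [dict(zip(keys, combo)) for combo in combos]


def placeholder_score(config):
    """Stand-in for the reward a real training run would report.

    This toy formula is made up so the example runs without a simulator.
    """
    lr_penalty = abs(config["lr"] - 1e-4) * 1000
    return config["gamma"] + config["hidden_units"] / 1000 - lr_penalty


trials = expand_grid(search_space)
best = max(trials, key=placeholder_score)
print(len(trials), "trials; best config:", best)
```

In a real setup each trial would be a full training run scheduled on the cluster, which is the part a tool like Tune handles; the grid expansion and best-trial selection shown here are only the bookkeeping around it.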



July 21, 2021


  1. Using RLlib in an enterprise scale reinforcement learning solution. Ray Summit 2021. Jeroen Bédorf, jeroen@minds.ai; Ishaan Sood, ishaan@minds.ai
  2. ©minds.ai Outline: Problem Statements; Integration and usage of RLlib and Tune; DeepSim Platform; Adaptive Cruise Control Demo; Hybrid Electric Vehicle Demo
  3. Trend: Exploding complexity and proliferation of smart systems. DeepSim: Bring RL to Subject Matter Experts. Electrification; Autonomy; Automation; Renewables
  4. DeepSim: Bring RL to Subject Matter Experts. Controllers: Brains behind complex systems. Reinforcement Learning Controllers: Trained for operating complex systems. Diagram labels: PID Controller, Process, Feedback, Input, Output; RL Agent (neural network), Environment, Input, Output
  5. DeepSim: Platform Overview. Environment Integration & Scenario support; Training libraries; Data Analysis & Visualization Toolkit; HPO & NAS; Neural Network Models & definition method; Front end. Diagram labels: TFAgents, RLlib, Ray, Internal, MPI, Horovod, Tune, Internal, Public Cloud, Backend, Algorithms, Distribution method
  6. DeepSim: Usage of Ray, RLlib and Tune. Custom Action Distributions; Easy Model Definition Method; Custom Logging; Custom Models; Export Methods; Analysis Tools; RLlib; Tune; Ray; Inference Methods
  7. Typical end-user workflow. Configure simulation, reward, etc.; Status & Progress information; Export trained Agent; step labels 1, 2, 3; Set up training runs (HPO & NAS); Tune; Train
  8. Deployment. Optimizer (ONNX, TensorRT, etc.); Trained Agent; Ray Serve; Embedded; Laptop/Workstation; Inference System / Controller; RLlib Checkpoint; Inference Library
  9. Use Cases
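
Slide 4 contrasts a PID controller in a feedback loop with an RL agent acting on an environment. The sketch below shows the classical side of that comparison: a textbook PID controller driving a toy speed model toward a setpoint. The plant model, the gains, and the setpoint are assumptions chosen for illustration and are not taken from the talk; in an RL setup, a trained neural network would take the place of the `act` method, mapping observations to actions instead of applying a fixed formula.

```python
from dataclasses import dataclass


@dataclass
class PIDController:
    """Textbook PID controller; the gains used below are illustrative only."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: float = 0.0

    def act(self, setpoint: float, measurement: float, dt: float) -> float:
        # Feedback: the output depends on the gap between target and measurement.
        error = setpoint - measurement
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def step_process(speed: float, throttle: float, dt: float) -> float:
    """Hypothetical first-order vehicle model: throttle accelerates, drag slows."""
    drag = 0.1 * speed
    return speed + (throttle - drag) * dt


def run_loop(controller, setpoint=25.0, steps=2000, dt=0.05):
    """Closed loop: measure speed, compute throttle, advance the process."""
    speed = 0.0
    for _ in range(steps):
        throttle = controller.act(setpoint, speed, dt)
        speed = step_process(speed, throttle, dt)
    return speed


final_speed = run_loop(PIDController(kp=2.0, ki=0.5, kd=0.0))
print(f"final speed: {final_speed:.3f}")
```

Because `run_loop` only calls `controller.act(...)`, any object exposing the same method could be substituted, which is one simple way to compare a hand-tuned controller against a learned one on the same process.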