Applying Ray and RLlib to Real-life Industrial Use Cases (Edward Junprung & Sahar Esmaeilzadeh, Pathmind)

In this session, we’ll explore industrial applications of reinforcement learning and compare the performance of an RL policy to traditional heuristics and optimizers. We have found that in certain use cases, RL can outperform all other approaches by more than 10%. We will cover the following topics:

1. A comparison of reinforcement learning versus heuristics and optimizers.
2. Bridging Ray with a simulation IDE such as AnyLogic to train a reinforcement learning policy.
3. A demo using a heating, ventilation, and air conditioning (HVAC) system.

At the conclusion of this session, you should be able to identify use cases suitable for RL and gain intuition on how reinforcement learning can be applied to industrial use cases.
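To make the heuristic versus optimizer comparison concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the toy room model, the constants, the cost function, and the names `simulate`, `optimize`, and `hand_tuned` are hypothetical and are not taken from the talk or from Pathmind's tooling. The heuristic is a fixed thermostat rule, and the optimizer is a simple grid search over that rule's two thresholds.

```python
TARGET = 21.0        # desired room temperature in degrees C (assumed)
ENERGY_PRICE = 0.5   # assumed cost per step of running the heater


def simulate(on_below, off_above, outside=5.0, start=18.0, steps=200):
    """Run a toy room model under a thermostat rule and return its total cost."""
    temp, heater_on, cost = start, False, 0.0
    for _ in range(steps):
        # Heuristic: a static "if this, then that" rule with hysteresis.
        if temp < on_below:
            heater_on = True
        elif temp > off_above:
            heater_on = False
        # Toy dynamics: heat leaks toward the outside temperature,
        # and the heater adds a fixed amount per step while it is on.
        temp += 0.1 * (outside - temp) + (3.0 if heater_on else 0.0)
        # Cost combines discomfort and energy use.
        cost += abs(temp - TARGET) + ENERGY_PRICE * heater_on
    return cost


def optimize(outside):
    """Grid-search both thresholds, constrained so off_above > on_below."""
    candidates = [
        (i * 0.5, i * 0.5 + band)
        for i in range(30, 49)            # on_below from 15.0 to 24.0
        for band in (0.5, 1.0, 2.0, 4.0)  # gap between the two thresholds
    ]
    return min(candidates, key=lambda p: simulate(*p, outside=outside))


hand_tuned = (19.0, 23.0)            # hypothetical expert rule of thumb
best_mild = optimize(outside=5.0)
best_cold = optimize(outside=-5.0)   # conditions changed, so the search is re-run

hand_cost = simulate(*hand_tuned, outside=5.0)
best_cost = simulate(*best_mild, outside=5.0)
print("hand-tuned cost:", round(hand_cost, 1))
print("optimized cost: ", round(best_cost, 1), "with thresholds", best_mild)
print("thresholds after re-running for colder weather:", best_cold)
```

The optimizer can only tune the parameters of the rule it is given, and it has to be run again when the outside temperature changes. An RL policy would instead be trained to pick the action directly from each observation.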



July 21, 2021


  1. Applying RLlib to Real-Life Industrial Use Cases

  2. Agenda
     - Pathmind and Ray
     - Industrial Engineering workflow and optimization tactics
     - Heuristics vs Optimizers vs RL
     - When and why you should try Reinforcement Learning
     - Simulation + RL Demo
  3. Pathmind and Ray
     - Bridge between real-life industrial processes and RL
     - We use RLlib out of the box (PPO, Population-Based Training)
     - Digital Twin
  4. Industrial Engineering Workflow
     - Simulation/Digital Twin (Observations), Decisions (Action), Outcomes (Reward)
     - Heuristics: Static rules defined by a domain expert (if this, then that).
     - Optimizers: Automatically finds the best parameters of a model, with respect to certain constraints.
     - RL
  5. Heuristics vs Optimizers vs RL
     • Heuristics
       ◦ Pros
         ▪ Easy to understand and implement.
         ▪ Factory manager knows exactly what is going on.
       ◦ Cons
         ▪ Not scalable. Can grow to hundreds of rules.
         ▪ Inflexible and doesn’t react to change.
  6. Heuristics vs Optimizers vs RL
     • Optimizers
       ◦ Pros
         ▪ Relatively easy to set up
         ▪ Not a black box
       ◦ Cons
         ▪ Clunky and slow in complex scenarios.
         ▪ Static in nature. Have to re-run optimizer whenever things change.
  7. When and Why Reinforcement Learning
     • Reinforcement Learning
       ◦ Pros
         ▪ Good with high variability
         ▪ Handles large state spaces
         ▪ Can navigate multiple contradictory objectives
       ◦ Cons
         ▪ Needs a data scientist
         ▪ Black box, hard to explain why a policy made a decision
  8. Use Cases

  9. Simulation Demo

  10. THANK YOU edward@pathmind.com sahar@pathmind.com
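
The observation, action, and reward loop named on slide 4, and the contradictory objectives mentioned on slide 7, can be sketched in plain Python as follows. This is only an illustration under stated assumptions: `ToyHvacEnv`, its dynamics, and the reward weights are hypothetical, and it is not an AnyLogic, Pathmind, or RLlib API. The reward trades comfort against energy use, which are two objectives that pull in opposite directions.

```python
class ToyHvacEnv:
    """Hypothetical stand-in for a simulation model (not a real AnyLogic or Pathmind API)."""

    def __init__(self, target=21.0, outside=5.0, comfort_weight=1.0, energy_weight=0.5):
        self.target = target
        self.outside = outside
        self.comfort_weight = comfort_weight
        self.energy_weight = energy_weight
        self.temp = 18.0
        self.steps_left = 50

    def reset(self):
        """Start a new episode and return the first observation (room temperature)."""
        self.temp = 18.0
        self.steps_left = 50
        return self.temp

    def step(self, action):
        """Apply action (0 = heater off, 1 = heater on); return (observation, reward, done)."""
        # Toy dynamics: heat leaks toward the outside, the heater adds a fixed amount.
        self.temp += 0.1 * (self.outside - self.temp) + 3.0 * action
        self.steps_left -= 1
        # Two competing objectives folded into one reward:
        # stay close to the target temperature, but use as little energy as possible.
        discomfort = abs(self.temp - self.target)
        reward = -(self.comfort_weight * discomfort + self.energy_weight * action)
        return self.temp, reward, self.steps_left == 0


def rollout(env, policy):
    """Run one episode, asking the policy for an action at every observation."""
    obs, done, total = env.reset(), False, 0.0
    while not done:
        obs, reward, done = env.step(policy(obs))
        total += reward
    return total


def thermostat(temp):
    """A one-rule heuristic policy: heat whenever the room is below 21 degrees C."""
    return 1 if temp < 21.0 else 0


def always_off(temp):
    """A baseline policy that never heats."""
    return 0


print("thermostat return:", round(rollout(ToyHvacEnv(), thermostat), 1))
print("always-off return:", round(rollout(ToyHvacEnv(), always_off), 1))
```

A trained RL policy would take the place of `thermostat` here: it maps each observation to an action in the same way, but its behavior is learned from the reward rather than written by hand.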