Slide 1

Slide 1 text

Applying RLlib to Real-Life Industrial Use Cases

Slide 2

Slide 2 text

Agenda
- Pathmind and Ray
- Industrial engineering workflow and optimization tactics
- Heuristics vs Optimizers vs RL
- When and why you should try Reinforcement Learning
- Simulation + RL Demo

Slide 3

Slide 3 text

Pathmind and Ray
- Bridge between real-life industrial processes and RL
- We use RLlib out of the box (PPO, Population-Based Training), as sketched below

[Diagram: Digital Twin]
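As a rough illustration of "out of the box": a minimal Ray Tune sketch that trains RLlib's PPO and tunes it with Population-Based Training. The environment (CartPole-v1), the mutation values, and the stopping criterion are placeholder assumptions; a real setup would point at a digital twin instead.

    import ray
    from ray import tune
    from ray.tune.schedulers import PopulationBasedTraining

    ray.init()

    # Population-Based Training periodically clones the best trials and
    # perturbs their hyperparameters; these values are illustrative only.
    pbt = PopulationBasedTraining(
        time_attr="training_iteration",
        metric="episode_reward_mean",
        mode="max",
        perturbation_interval=5,
        hyperparam_mutations={
            "lr": [1e-3, 5e-4, 1e-4],
            "clip_param": [0.1, 0.2, 0.3],
        },
    )

    # "PPO" resolves to RLlib's PPO trainer; CartPole-v1 stands in for
    # a digital-twin environment here.
    tune.run(
        "PPO",
        config={
            "env": "CartPole-v1",
            "num_workers": 2,
            "lr": 5e-4,
            "clip_param": 0.2,
        },
        scheduler=pbt,
        num_samples=4,
        stop={"training_iteration": 50},
    )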

Slide 4

Slide 4 text

Industrial Engineering Workflow

[Diagram] Simulation/Digital Twin (Observations) → Decisions (Actions) → Outcomes (Rewards)

Approaches to the decision step:
- Heuristics: static rules defined by a domain expert (if this, then that).
- Optimizers: automatically find the best parameters of a model, with respect to certain constraints.
- RL
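To make the observations → actions → rewards loop concrete, here is a minimal gym-style environment sketch. The machine/queue dynamics are invented for illustration; a real digital twin would supply the simulation step.

    import numpy as np
    import gym
    from gym import spaces

    class ToyFactoryEnv(gym.Env):
        """Hypothetical digital twin: observations -> decisions (actions)
        -> outcomes (rewards)."""

        def __init__(self, num_machines=3, horizon=50):
            self.num_machines = num_machines
            self.horizon = horizon
            # Observations: normalized queue length at each machine.
            self.observation_space = spaces.Box(
                0.0, 1.0, shape=(num_machines,), dtype=np.float32)
            # Decision: which machine receives the next job.
            self.action_space = spaces.Discrete(num_machines)

        def reset(self):
            self.t = 0
            self.queues = np.zeros(self.num_machines, dtype=np.float32)
            return self.queues.copy()

        def step(self, action):
            self.t += 1
            self.queues[action] += 0.1                         # job assigned
            self.queues = np.maximum(self.queues - 0.05, 0.0)  # machines work
            self.queues = np.minimum(self.queues, 1.0)
            # Outcome: shorter queues are better.
            reward = -float(self.queues.sum())
            done = self.t >= self.horizon
            return self.queues.copy(), reward, done, {}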

Slide 5

Slide 5 text

Heuristics vs Optimizers vs RL

● Heuristics
  ○ Pros
    ■ Easy to understand and implement.
    ■ Factory manager knows exactly what is going on.
  ○ Cons
    ■ Not scalable; can grow to hundreds of rules.
    ■ Inflexible and doesn’t react to change.
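For contrast, an "if this, then that" rule set for the toy environment sketched earlier. The rules themselves are hypothetical, and in practice such lists keep growing as edge cases appear.

    import numpy as np

    def heuristic_policy(obs):
        # Rule 1: if any machine is idle, send the job there.
        idle = np.where(obs == 0.0)[0]
        if len(idle) > 0:
            return int(idle[0])
        # Rule 2: otherwise, pick the machine with the shortest queue.
        # (Real rule sets can grow to hundreds of such cases.)
        return int(np.argmin(obs))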

Slide 6

Slide 6 text

Heuristics vs Optimizers vs RL

● Optimizers
  ○ Pros
    ■ Relatively easy to set up.
    ■ Not a black box.
  ○ Cons
    ■ Clunky and slow in complex scenarios.
    ■ Static in nature; the optimizer has to be re-run whenever things change.
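A sketch of the optimizer approach, assuming a toy congestion model: scipy finds static routing fractions under a constraint, but the answer must be recomputed whenever the inputs (here, the assumed machine speeds) change.

    import numpy as np
    from scipy.optimize import minimize

    speeds = np.array([1.0, 0.8, 0.6])   # assumed machine processing rates

    def congestion(x):
        # Toy cost model: load relative to speed, squared to penalize imbalance.
        return float(np.sum((x / speeds) ** 2))

    result = minimize(
        congestion,
        x0=np.full(3, 1.0 / 3.0),        # start from an even split
        bounds=[(0.0, 1.0)] * 3,
        constraints=[{"type": "eq", "fun": lambda x: x.sum() - 1.0}],
    )
    print(result.x)  # static parameters; re-run whenever `speeds` change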

Slide 7

Slide 7 text

When and Why Reinforcement Learning

● Reinforcement Learning
  ○ Pros
    ■ Good with high variability.
    ■ Handles large state spaces.
    ■ Can navigate multiple contradictory objectives (a reward sketch follows below).
  ○ Cons
    ■ Needs a data scientist.
    ■ Black box: hard to explain why a policy made a decision.
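One common way RL navigates contradictory objectives is to scalarize them into a single reward signal; the objectives and weights below are hypothetical stand-ins.

    def combined_reward(throughput, energy_used, lateness,
                        weights=(1.0, 0.2, 0.5)):
        """Hypothetical scalarization: reward throughput while penalizing
        energy use and late orders. The weights encode the trade-off the
        policy is trained against."""
        w_tp, w_en, w_late = weights
        return w_tp * throughput - w_en * energy_used - w_late * lateness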

Slide 8

Slide 8 text

Use Cases

Slide 9

Slide 9 text

Simulation Demo

Slide 10

Slide 10 text