Applying Ray and RLlib to Real-life Industrial Use Cases (Edward Junprung & Sahar Esmaeilzadeh, Pathmind)

Applying RLlib to Real-Life Industrial Use Cases

Agenda
- Pathmind and Ray
- Industrial Engineering workflow and optimization tactics
- Heuristics vs Optimizers vs RL
- When and why you should try Reinforcement Learning
- Simulation + RL Demo

| Pathmind and Ray
- Bridge between real-life industrial processes and RL
- We use RLlib out of the box (PPO, Population-Based Training)
- Digital Twin
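To make the Population-Based Training mention concrete, here is a minimal plain-Python sketch of a single exploit/explore step. It is not Ray's implementation. The `pbt_step` function, the `lr` hyperparameter, and the scores are hypothetical, and a real PBT run would also copy model weights from the donor.

```python
import random

def pbt_step(population, rng, quantile=0.25, perturb=(0.8, 1.2)):
    """One exploit/explore step over a list of {'lr': float, 'score': float} dicts."""
    ranked = sorted(population, key=lambda m: m["score"], reverse=True)
    cutoff = max(1, int(len(ranked) * quantile))
    top, bottom = ranked[:cutoff], ranked[-cutoff:]
    for member in bottom:
        donor = rng.choice(top)
        # Exploit: copy the hyperparameter of a well-performing member.
        # Explore: perturb the copied value so the population keeps searching.
        member["lr"] = donor["lr"] * rng.choice(perturb)
    return population

rng = random.Random(0)
# Hypothetical population: eight trials with different learning rates and scores.
population = [{"lr": 10 ** -k, "score": float(s)}
              for k, s in zip(range(1, 9), [3, 9, 7, 1, 5, 2, 8, 4])]
pbt_step(population, rng)
```

After the step, the two lowest-scoring trials carry a perturbed copy of a top trial's learning rate, while the other trials are left untouched.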

| Industrial Engineering Workflow
Simulation/Digital Twin (Observations), Decisions (Action), Outcomes (Reward)
- Heuristics: static rules defined by a domain expert (if this, then that).
- Optimizers: automatically find the best parameters of a model, with respect to certain constraints.
- RL
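The observation, action, reward loop on this slide can be sketched in plain Python. The `ToyMachineSim` class, its queue dynamics, and the reward shape are hypothetical illustrations, not Pathmind's simulation and not an RLlib API.

```python
import random

class ToyMachineSim:
    """Hypothetical stand-in for a digital twin: one machine with a job queue."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.queue = 0

    def reset(self):
        self.queue = 0
        return self.queue  # observation: current queue length

    def step(self, action):
        # Action: how many jobs the machine may process this tick.
        arrivals = self.rng.randint(0, 2)
        processed = min(self.queue + arrivals, action)
        self.queue = self.queue + arrivals - processed
        # Reward: finished jobs minus a small penalty for jobs left waiting.
        reward = processed - 0.1 * self.queue
        return self.queue, reward


def run_episode(sim, policy, steps=50):
    """Feed observations to a policy, apply its actions, and sum the rewards."""
    obs = sim.reset()
    total = 0.0
    for _ in range(steps):
        action = policy(obs)
        obs, reward = sim.step(action)
        total += reward
    return total


def always_two(obs):
    return 2  # trivial policy: always process up to two jobs


total_reward = run_episode(ToyMachineSim(seed=0), always_two)
```

Any decision-maker (a heuristic, an optimizer's output, or a trained RL policy) can be passed in as `policy`, which makes it possible to compare them on the same simulation.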

| Heuristics vs Optimizers vs RL
• Heuristics
  ◦ Pros
    ▪ Easy to understand and implement.
    ▪ Factory manager knows exactly what is going on.
  ◦ Cons
    ▪ Not scalable. Can grow to hundreds of rules.
    ▪ Inflexible and doesn’t react to change.
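As a minimal sketch of static "if this, then that" rules, consider a hypothetical dispatch rule that routes a job to one of two machines. The function name, the thresholds, and the inputs are assumptions made for illustration.

```python
def heuristic_dispatch(queue_a, queue_b, a_is_down=False):
    """Hand-written rules, as a domain expert might state them (illustrative only)."""
    if a_is_down:
        return "B"  # rule 1: never send work to a broken machine
    if queue_a >= 10 and queue_b < 10:
        return "B"  # rule 2: avoid an overloaded machine A
    if queue_a <= queue_b:
        return "A"  # rule 3: otherwise prefer the shorter queue
    return "B"


choice = heuristic_dispatch(queue_a=3, queue_b=5)
```

Every branch is easy to read, which is the appeal. Each new exception (a third machine, a rush order, a maintenance window) adds another branch, which is how such rule sets grow.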

| Heuristics vs Optimizers vs RL
• Optimizers
  ◦ Pros
    ▪ Relatively easy to set up
    ▪ Not a black box
  ◦ Cons
    ▪ Clunky and slow in complex scenarios.
    ▪ Static in nature. Have to re-run optimizer whenever things change.
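A rough sketch of the optimizer approach: search a parameter grid for the best value of an objective, subject to a constraint. The throughput and cost models below are invented for illustration, and exhaustive grid search stands in for whatever optimizer a real project would use.

```python
from itertools import product

def throughput(machines, batch_size):
    # Hypothetical model: more machines help; very large batches add delay.
    return machines * batch_size / (1 + 0.04 * batch_size ** 2)

def cost(machines, batch_size):
    # Hypothetical cost model in arbitrary currency units.
    return 100 * machines + 5 * batch_size

def grid_search(budget):
    """Return (score, machines, batch_size) maximizing throughput within budget."""
    best = None
    for machines, batch_size in product(range(1, 11), range(1, 21)):
        if cost(machines, batch_size) > budget:
            continue  # constraint: stay within budget
        score = throughput(machines, batch_size)
        if best is None or score > best[0]:
            best = (score, machines, batch_size)
    return best

best_score, best_machines, best_batch = grid_search(budget=500)
```

The result is only valid for the budget it was computed with. If the budget or the models change, `grid_search` has to be run again, which is the "static in nature" drawback noted on the slide.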

| When and Why Reinforcement Learning
• Reinforcement Learning
  ◦ Pros
    ▪ Good with high variability
    ▪ Handles large state spaces
    ▪ Can navigate multiple contradictory objectives
  ◦ Cons
    ▪ Needs a data scientist
    ▪ Black box, hard to explain why a policy made a decision
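One common way to present contradictory objectives to an RL agent is to fold them into a single weighted reward. The sketch below assumes three hypothetical metrics (throughput, energy use, late orders) and illustrative weights. The talk does not specify how its reward functions were built.

```python
def combined_reward(throughput, energy_kwh, late_orders,
                    w_throughput=1.0, w_energy=0.2, w_late=5.0):
    """Scalarize competing objectives into one reward (weights are illustrative)."""
    return (w_throughput * throughput
            - w_energy * energy_kwh
            - w_late * late_orders)


# Two hypothetical outcomes of a simulation step.
fast_but_costly = combined_reward(throughput=12, energy_kwh=30, late_orders=0)
slow_but_cheap = combined_reward(throughput=8, energy_kwh=10, late_orders=1)
```

Changing the weights changes which trade-off the trained policy favors, so they are typically tuned together with the domain expert.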

| Use Cases

Simulation Demo

THANK YOU
