Slide 1

Slide 1 text

Fast Approximations for Job Shop Scheduling: A Lagrangian Dual Deep Learning Method
AAAI 2022 reading
Khang Pham, Data Scientist @ BCG Gamma

Slide 2

Slide 2 text

Motivation
• Job shop scheduling is an interesting problem that I have had the chance to work on before
• I have been wondering how to use deep learning (DL) to solve numerical optimization problems such as vehicle routing and scheduling

Slide 3

Slide 3 text

Agenda
1. Job Shop Scheduling Problem
2. Lagrangian Dual
3. Model design
4. Results and conclusion

Slide 4

Slide 4 text

Job Shop Scheduling: a popular NP-hard problem in industry
• Job Shop Scheduling Problem (JSP): build a schedule for multiple "machines" executing multiple jobs, where each job consists of several tasks that must be performed on specific machines, so that the total time to finish all jobs is minimized
• Examples: manufacturing an automobile, building a house, etc.
• Formula (notation):
  • J: jobs, T: tasks, M: machines
  • u: makespan (total time to finish all jobs)
  • $s_t$: start time of task t
  • $d_t$: processing time of task t
• This paper studies the classical setting in which each job has as many tasks as there are machines, so that each job has exactly one task assigned to each machine
• Reference: https://developers.google.com/optimization/scheduling/job_shop
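
To make the problem statement concrete, here is a minimal Python sketch that checks a candidate schedule against the two kinds of JSP constraints (task order within a job, no overlap on a machine) and computes the makespan u. The instance, the data layout (`jobs[j]` as an ordered list of `(machine, duration)` pairs) and the function names are illustrative assumptions, not taken from the paper.

```python
from typing import Dict, List, Tuple

Task = Tuple[int, int]  # (machine, duration); hypothetical layout for this sketch


def makespan(jobs: List[List[Task]], start: List[List[int]]) -> int:
    """u = latest completion time over all tasks."""
    return max(start[j][t] + d
               for j, job in enumerate(jobs)
               for t, (_, d) in enumerate(job))


def is_feasible(jobs: List[List[Task]], start: List[List[int]]) -> bool:
    # Precedence: task t+1 of a job may only start once task t has finished.
    for j, job in enumerate(jobs):
        for t in range(len(job) - 1):
            if start[j][t] + job[t][1] > start[j][t + 1]:
                return False
    # No overlap: two tasks on the same machine may not run at the same time.
    by_machine: Dict[int, List[Tuple[int, int]]] = {}
    for j, job in enumerate(jobs):
        for t, (m, d) in enumerate(job):
            by_machine.setdefault(m, []).append((start[j][t], start[j][t] + d))
    for intervals in by_machine.values():
        intervals.sort()
        for (_, end_a), (start_b, _) in zip(intervals, intervals[1:]):
            if end_a > start_b:
                return False
    return True


# Toy instance: 2 jobs, 2 machines, one task per machine in each job.
jobs = [[(0, 3), (1, 2)], [(1, 2), (0, 4)]]
start = [[0, 3], [0, 3]]
print(is_feasible(jobs, start), makespan(jobs, start))
```

Moving the last task of the second job one time unit earlier (`[[0, 3], [0, 2]]`) makes two tasks overlap on machine 0, which the check rejects.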

Slide 5

Slide 5 text

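Lagrangian relaxation
• Optimization problem:
  $$P = \operatorname*{argmin}_{y} f(y) \quad \text{subject to} \quad g_i(y) \le 0 \quad (\forall i \in [m])$$
• Lagrangian function:
  $$f_{\boldsymbol{\lambda}}(y) = f(y) + \sum_{i=1}^{m} \lambda_i \max\big(0, g_i(y)\big), \qquad \lambda_i \ge 0, \quad \boldsymbol{\lambda} = (\lambda_1, \ldots, \lambda_m)$$
• Then $LR_{\boldsymbol{\lambda}} = \operatorname*{argmin}_{y} f_{\boldsymbol{\lambda}}(y)$ gives a lower bound for the original problem, i.e. $f(LR_{\boldsymbol{\lambda}}) \le f(P)$, hence we can obtain the strongest approximation of P by finding the best Lagrangian multipliers:
  $$LD = \operatorname*{argmax}_{\boldsymbol{\lambda} \ge 0} f(LR_{\boldsymbol{\lambda}})$$
• In the JSP context, the functions $g_i(y)$ are the violation degrees $\nu_{2b}$ and $\nu_{2c}$ of the problem constraints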
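
The violation degrees can be sketched in a few lines of Python. This is an assumption-laden illustration: it takes (2b) to be the precedence constraint within a job and (2c) to be the no-overlap constraint between two tasks on the same machine, and measures each violation as the amount of time by which the constraint is broken. The function names are hypothetical.

```python
def violation_precedence(s_t: float, d_t: float, s_next: float) -> float:
    """Assumed form of nu_2b: how far task t runs past the start of its successor."""
    return max(0.0, s_t + d_t - s_next)


def violation_overlap(s_a: float, d_a: float, s_b: float, d_b: float) -> float:
    """Assumed form of nu_2c: zero if either ordering of the two tasks is respected."""
    a_before_b = max(0.0, s_a + d_a - s_b)  # violation of "a finishes before b starts"
    b_before_a = max(0.0, s_b + d_b - s_a)  # violation of "b finishes before a starts"
    return min(a_before_b, b_before_a)


print(violation_precedence(0.0, 3.0, 2.0))    # successor starts 1 unit too early
print(violation_overlap(0.0, 3.0, 2.0, 4.0))  # tasks overlap by 1 unit
```

Both functions return 0 exactly when the corresponding constraint holds, so they fit the $\max(0, g_i(y))$ penalty form used in the Lagrangian function.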

Slide 6

Slide 6 text

Model design: objective function
• Objective: predict the start times $s_i$ from the task processing times $d_i$, for a fixed set of machines and jobs, i.e. $s_i = P(d_i)$. The learning problem becomes:
  $$\min_{\theta} \sum_{i=1}^{N} L\big(s_i, \hat{P}_{\theta}(d_i)\big) \quad \text{subject to} \quad C\big(\hat{P}_{\theta}(d_i), d_i\big)$$
• Using Lagrangian relaxation, the loss function becomes:
  $$\mathcal{L}_{\boldsymbol{\lambda}}(s, \hat{s}, d) = L(s, \hat{s}) + \sum_{c \in C} \lambda_c \, \nu_c(\hat{s}, d)$$
• During learning, the model parameters $\theta_i$ and the Lagrangian multipliers $\lambda_i$ are updated in turn at each step of the training algorithm
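
The alternating update can be illustrated on a deliberately tiny problem. The sketch below is not the paper's algorithm: it assumes a "model" with a single parameter theta that directly predicts one start time, a squared-error loss against a label, one precedence constraint, plain gradient descent for the primal step, and a multiplier update proportional to the remaining violation. All names and hyperparameters are hypothetical.

```python
def dual_learning(label=2.0, duration=3.0, epochs=40, inner_steps=20,
                  lr=0.05, rho=0.5):
    """Toy Lagrangian dual loop for one predicted start time s_hat = theta.

    Loss: (theta - label)^2. Constraint: the task may not start before its
    predecessor (start 0, processing time `duration`) has finished.
    """
    theta, lam = 0.0, 0.0
    for _ in range(epochs):
        # Primal step: gradient descent on theta with the multiplier held fixed.
        for _ in range(inner_steps):
            violation = max(0.0, duration - theta)
            grad = 2.0 * (theta - label)
            if violation > 0.0:
                grad -= lam  # derivative of lam * (duration - theta)
            theta -= lr * grad
        # Dual step: raise the multiplier in proportion to the remaining violation.
        lam += rho * max(0.0, duration - theta)
    return theta, lam


theta, lam = dual_learning()
print(f"predicted start {theta:.4f}, multiplier {lam:.4f}")
```

The label (2.0) violates the constraint on purpose. In this toy setting the multiplier keeps growing while the prediction is infeasible, which pushes the predicted start away from the label and towards the earliest feasible start time.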

Slide 7

Slide 7 text

Model design: network structure
• The network contains two parts, machine layers and job layers; each machine has its own hidden layers, so there are no connections between machines
• Machine layers:
  • M components
  • Each component has 4 layers of sizes J, 2J, 2J, J
• Job layers:
  • J components
  • Each component has 4 layers of sizes M, 2M, 2M, M
• Shared layers:
  • Concatenation of the outputs of all machine and job components
  • Sizes: 2MJ, 2MJ, X (not mentioned), MJ (output)
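
The layer sizes above can be tabulated with a short Python sketch. It assumes every layer is a dense layer with a bias term and treats the unspecified width X as a free parameter (`shared_hidden`); both are assumptions, and the function names are hypothetical.

```python
from typing import Dict, List


def dense_params(widths: List[int]) -> int:
    """Weights plus biases for a stack of dense layers with the given widths."""
    return sum(w_in * w_out + w_out for w_in, w_out in zip(widths, widths[1:]))


def layer_widths(num_jobs: int, num_machines: int, shared_hidden: int) -> Dict[str, List[int]]:
    J, M = num_jobs, num_machines
    machine_branch = [J, 2 * J, 2 * J, J]  # one such stack per machine (M stacks)
    job_branch = [M, 2 * M, 2 * M, M]      # one such stack per job (J stacks)
    concat = M * machine_branch[-1] + J * job_branch[-1]  # equals 2 * M * J
    shared = [concat, 2 * M * J, shared_hidden, M * J]    # output: one start time per task
    return {"machine": machine_branch, "job": job_branch, "shared": shared}


def total_params(num_jobs: int, num_machines: int, shared_hidden: int) -> int:
    w = layer_widths(num_jobs, num_machines, shared_hidden)
    return (num_machines * dense_params(w["machine"])
            + num_jobs * dense_params(w["job"])
            + dense_params(w["shared"]))


widths = layer_widths(num_jobs=3, num_machines=2, shared_hidden=8)
print(widths, total_params(3, 2, 8))
```

The concatenated width works out to 2MJ because the M machine components each emit J values and the J job components each emit M values, which matches the first shared-layer size listed on the slide.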

Slide 8

Slide 8 text

Construct feasible schedule from model result
• The neural network output approximates the start time of each task, but with some degree of constraint violation
• From this approximation, a feasible schedule is built with a greedy algorithm
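
One plausible greedy recovery step is sketched below in Python. The slide text does not spell out the paper's exact procedure, so this is an assumption: a list-scheduling rule that repeatedly takes, among each job's next unscheduled task, the one with the earliest predicted start, and places it as early as its job and machine allow. The data layout and function name are hypothetical.

```python
from typing import Dict, List, Tuple

Task = Tuple[int, int]  # (machine, duration); hypothetical layout for this sketch


def greedy_schedule(jobs: List[List[Task]], predicted: List[List[float]]) -> List[List[int]]:
    num_jobs = len(jobs)
    next_task = [0] * num_jobs            # index of each job's next unscheduled task
    job_ready = [0] * num_jobs            # time at which each job's previous task ends
    machine_ready: Dict[int, int] = {}    # time at which each machine becomes free
    start = [[0] * len(job) for job in jobs]
    remaining = sum(len(job) for job in jobs)
    while remaining:
        # Among the tasks whose predecessors are done, follow the predicted order.
        candidates = [j for j in range(num_jobs) if next_task[j] < len(jobs[j])]
        j = min(candidates, key=lambda k: predicted[k][next_task[k]])
        t = next_task[j]
        machine, duration = jobs[j][t]
        # Earliest start that respects both precedence and machine availability.
        begin = max(job_ready[j], machine_ready.get(machine, 0))
        start[j][t] = begin
        job_ready[j] = machine_ready[machine] = begin + duration
        next_task[j] += 1
        remaining -= 1
    return start


jobs = [[(0, 3), (1, 2)], [(1, 2), (0, 4)]]
predicted = [[0.0, 2.5], [0.2, 2.0]]  # infeasible as given: overlaps on machine 0
print(greedy_schedule(jobs, predicted))
```

Because every task is placed no earlier than both its job predecessor and the previous task on its machine, the result is feasible by construction, whatever the quality of the predictions.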

Slide 9

Slide 9 text

Experimental result
• Data: instances chosen from JSPLIB 2014; each instance is solved using the IBM CP-Opt software with a time limit of 1800 s to obtain a sub-optimal solution
• Compared against: several heuristic algorithms, the IBM software, and one fully connected (FC) network model with the same number of parameters
• Result:
  • Model inference time is less than 30 ms
  • Better results than all compared heuristic methods
  • Results very close to those of the SoTA solver, in significantly less time

Slide 10

Slide 10 text

Conclusions
• From the paper:
  • Introduced a deep-learning approach that produces approximations of JSPs in milliseconds
  • Combined it with the Lagrangian dual to include the constraints in the learning process
  • Introduced efficient recovery techniques to build a schedule from the model result
  • The model showed promising results: the SOTA commercial CP solver takes a significant amount of time to obtain solutions of the same quality, so this model can be used as a warm start
• My personal view:
  • Interesting ideas on the model structure and on using the Lagrangian to incorporate the troublesome constraints of the problem
  • The model solves a simplified setting that is not common in real-world problems, so it would need further research and design adjustments:
    – Each job has exactly 1 task for each machine
    – The number of jobs and machines is fixed for one model