I had a chance to try this before
• I have been wondering how to use deep learning (DL) to solve numerical optimization problems: vehicle routing, scheduling, etc.
Job Shop Scheduling Problem (JSP): build a schedule for multiple "machines" executing multiple jobs, where each job consists of several tasks that must each be performed on a given machine, so that the total time to finish all jobs is minimized
• Examples: manufacturing an automobile, building a house, etc.
• Formulation: this paper studies a classical setting where each job has as many tasks as there are machines, so that each job has exactly one task assigned to each machine
https://developers.google.com/optimization/scheduling/job_shop
J: jobs, T: tasks, M: machines
u: makespan (total time to finish all jobs)
$s_t$: start time of task t
$d_t$: processing time of task t
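To make the definitions concrete, here is a minimal sketch of a tiny JSP instance with a makespan and feasibility check. The instance data, the `jobs` layout, and the function names are hypothetical and chosen only for illustration; they do not come from the paper.

```python
# Toy JSP instance (hypothetical data): jobs[j] is the ordered list of
# (machine, duration) tasks of job j, with one task per machine in every job.
jobs = [
    [(0, 3), (1, 2)],  # job 0: machine 0 for 3 units, then machine 1 for 2
    [(1, 2), (0, 4)],  # job 1: machine 1 for 2 units, then machine 0 for 4
]


def makespan(jobs, start):
    """u: latest finish time over all tasks; start[j][t] is the start of task t of job j."""
    return max(
        start[j][t] + d
        for j, job in enumerate(jobs)
        for t, (_, d) in enumerate(job)
    )


def is_feasible(jobs, start):
    """Check task order inside each job and no overlap on each machine."""
    # Precedence: a task may start only after the previous task of its job ends.
    for j, job in enumerate(jobs):
        for t in range(len(job) - 1):
            if start[j][t] + job[t][1] > start[j][t + 1]:
                return False
    # No overlap: collect the busy intervals of every machine and compare neighbours.
    by_machine = {}
    for j, job in enumerate(jobs):
        for t, (m, d) in enumerate(job):
            by_machine.setdefault(m, []).append((start[j][t], start[j][t] + d))
    for intervals in by_machine.values():
        intervals.sort()
        for (_, end_a), (start_b, _) in zip(intervals, intervals[1:]):
            if end_a > start_b:
                return False
    return True


schedule = [[0, 3], [0, 3]]      # start times per job and task
print(makespan(jobs, schedule))  # 7
print(is_feasible(jobs, schedule))
```

In this toy instance machine 0 carries 7 units of work in total, so a makespan of 7 cannot be improved.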
$f(y)$ subject to $g_i(y) \le 0 \;\; (\forall i \in [m])$
• Lagrangian function: $f_{\boldsymbol{\lambda}}(y) = f(y) + \sum_{i=1}^{m} \lambda_i \max(0, g_i(y))$, with $\lambda_i \ge 0$ and $\boldsymbol{\lambda} = (\lambda_1, \dots, \lambda_m)$
• Then $LR_{\boldsymbol{\lambda}} = \operatorname{argmin}_y f_{\boldsymbol{\lambda}}(y)$ gives a lower bound for the original problem P, hence we can obtain the strongest approximation of P by finding the best Lagrangian multipliers: $LD = \operatorname{argmax}_{\boldsymbol{\lambda} \ge 0} f(LR_{\boldsymbol{\lambda}})$
• In the JSP context, the functions $g_i(y)$ are the violation degrees $\nu_{2b}$, $\nu_{2c}$ of the constraints, as:
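A minimal sketch of such violation degrees follows. It assumes that (2b) is the task-precedence constraint and (2c) is the machine no-overlap constraint, and that a violation is measured as the amount by which an inequality is exceeded. Both the function names and these exact forms are assumptions for illustration, not formulas taken from the paper.

```python
def violation_leq(a, b):
    """Degree to which the constraint a <= b is violated (0 when it holds)."""
    return max(0.0, a - b)


def precedence_violation(s_t, d_t, s_next):
    """Assumed form for (2b): task t must finish before the next task of its job starts."""
    return violation_leq(s_t + d_t, s_next)


def overlap_violation(s_a, d_a, s_b, d_b):
    """Assumed form for (2c): two tasks on the same machine must not overlap.

    The constraint is a disjunction (a before b OR b before a), so it is
    violated only by the smaller of the two individual violations.
    """
    return min(violation_leq(s_a + d_a, s_b), violation_leq(s_b + d_b, s_a))


print(precedence_violation(0, 3, 2))  # task ends at 3, next starts at 2 -> 1.0
print(overlap_violation(0, 3, 2, 4))  # intervals [0, 3] and [2, 6] overlap -> 1.0
```

Because each degree is zero when its constraint holds, multiplying it by $\lambda_i \ge 0$ only penalizes infeasible predictions.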
times $s_i$, where the input is the processing time of each task for a fixed set of machines and jobs, $s_i = P(d_i)$, and the objective function becomes:
$\min_{\theta} \sum_{i=1}^{N} L(s_i, \hat{P}_{\theta}(d_i))$ subject to: $C(\hat{P}_{\theta}(d_i), d_i)$
• Using Lagrangian relaxation, the objective function becomes: $\mathcal{L}(s, \hat{s}, d) = L(s, \hat{s}) + \sum_{c \in C} \lambda_c \nu_c(\hat{s}, d)$
• During the learning process, the model parameters $\theta_i$ and the Lagrangian multipliers $\lambda_i$ are updated in turn, following this algorithm:
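The alternating update can be sketched on a toy problem. This is not the paper's algorithm listing: the "model" here is just two free parameters standing in for network weights, there is a single precedence constraint, and the step sizes `lr` and `rho` are hypothetical. The sketch only shows the pattern of a gradient step on $\theta$ with $\lambda$ fixed, followed by an increase of $\lambda$ in proportion to the remaining violation.

```python
def violation(s_hat, d):
    """Violation degree of the precedence constraint s_hat[0] + d[0] <= s_hat[1]."""
    return max(0.0, s_hat[0] + d[0] - s_hat[1])


def train(s, d, epochs=20, inner_steps=50, lr=0.1, rho=0.5):
    theta = [0.0, 0.0]  # stand-in for network weights: here the prediction itself
    lam = 0.0           # Lagrangian multiplier of the single constraint
    for _ in range(epochs):
        # Step 1: keep lam fixed, update theta by (sub)gradient descent on
        #   sum_t (s_t - s_hat_t)^2 + lam * violation(s_hat, d)
        for _ in range(inner_steps):
            active = 1.0 if violation(theta, d) > 0 else 0.0
            grad = [
                2 * (theta[0] - s[0]) + lam * active,
                2 * (theta[1] - s[1]) - lam * active,
            ]
            theta = [t - lr * g for t, g in zip(theta, grad)]
        # Step 2: keep theta fixed, raise lam by rho times the remaining violation
        lam += rho * violation(theta, d)
    return theta, lam


s_target = [0.0, 2.5]  # toy label that breaks precedence by 0.5
durations = [3.0, 2.0]
theta, lam = train(s_target, durations)
print(theta, lam, violation(theta, durations))
```

Fitting the label alone would leave a violation of 0.5. With the multiplier update the prediction moves toward start times of about -0.25 and 2.75, which satisfy the constraint, and the multiplier settles near 0.5.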
Machine layers and Job layers: each machine has its own hidden layers, so there are no connections between machines
• Machine layers:
  • M components
  • Each component has 4 layers: J, 2J, 2J, J
• Job layers:
  • J components
  • Each component has 4 layers: M, 2M, 2M, M
• Shared layers:
  • Concatenation of the outputs of all Machine and Job components
  • Sizes: 2MJ, 2MJ, X (not mentioned), MJ (output)
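The layer sizes above can be tabulated for a concrete J and M. This is a rough sketch that assumes the four numbers of each component are consecutive layer widths joined by dense layers with bias terms. Because the width X is not given, it is passed in as the hypothetical parameter `shared_hidden`.

```python
def dense_params(widths):
    """Weights plus biases of dense layers connecting consecutive widths."""
    return sum(a * b + b for a, b in zip(widths, widths[1:]))


def layer_sizes(num_jobs, num_machines, shared_hidden):
    J, M = num_jobs, num_machines
    machine_component = [J, 2 * J, 2 * J, J]  # one such component per machine
    job_component = [M, 2 * M, 2 * M, M]      # one such component per job
    # M components with J outputs plus J components with M outputs: 2*M*J values
    concat_width = M * machine_component[-1] + J * job_component[-1]
    shared = [concat_width, 2 * M * J, shared_hidden, M * J]
    total = (
        M * dense_params(machine_component)
        + J * dense_params(job_component)
        + dense_params(shared)
    )
    return {
        "machine_component": machine_component,
        "job_component": job_component,
        "shared": shared,
        "total_params": total,
    }


sizes = layer_sizes(num_jobs=3, num_machines=2, shared_hidden=8)
print(sizes)
```

For J = 3 and M = 2 the concatenated width is 12 = 2MJ, which matches the first shared-layer size on the slide, and the output has MJ = 6 values, one start time per task.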
The model result gives an approximation of the start time of each task, but with some degree of constraint violation
• From this approximation, a schedule is built with a greedy algorithm as:
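One simple way to turn approximate start times into a feasible schedule is list scheduling, sketched below. This is a stand-in under stated assumptions and not necessarily the recovery procedure used in the paper: the predicted start times are used only as priorities, and every task is placed at the earliest time at which both its job and its machine are free. The instance data and predicted values are hypothetical.

```python
def greedy_schedule(jobs, predicted_start):
    """Build a feasible schedule from approximate start times.

    jobs[j] is the ordered list of (machine, duration) tasks of job j and
    predicted_start[j][t] is the model's approximate start time of that task.
    """
    next_task = [0] * len(jobs)  # index of the next unscheduled task per job
    job_ready = [0] * len(jobs)  # time at which each job can continue
    machine_ready = {}           # time at which each machine becomes free
    start = [[None] * len(job) for job in jobs]
    remaining = sum(len(job) for job in jobs)
    while remaining:
        # Among the next task of every unfinished job, pick the one the model
        # predicts to start first; task order inside a job is never broken.
        j = min(
            (k for k in range(len(jobs)) if next_task[k] < len(jobs[k])),
            key=lambda k: predicted_start[k][next_task[k]],
        )
        t = next_task[j]
        machine, duration = jobs[j][t]
        begin = max(job_ready[j], machine_ready.get(machine, 0))
        start[j][t] = begin
        job_ready[j] = machine_ready[machine] = begin + duration
        next_task[j] += 1
        remaining -= 1
    return start


jobs = [[(0, 3), (1, 2)], [(1, 2), (0, 4)]]  # hypothetical toy instance
predicted = [[0.2, 2.6], [-0.1, 2.9]]        # hypothetical model output
print(greedy_schedule(jobs, predicted))      # [[0, 3], [0, 3]]
```

Since each task starts only once its job predecessor and its machine are free, the result satisfies both precedence and no-overlap by construction, whatever violations the predictions contain.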
instance is solved using the IBM CP-Opt software with a time limit of 1800 s to get a sub-optimal solution
• Compared against: several heuristic algorithms, the IBM software, and a fully connected (FC) network model with the same number of parameters
• Results:
  • Model inference time is less than 30 ms
  • Better results than all of the compared heuristic methods
  • Results very close to the SoTA solver in significantly less time
produce approximations of JSPs that run in milliseconds
• Combined with the Lagrangian dual to include the constraints in the learning process
• Introduced efficient recovery techniques to build a schedule from the model result
• The model showed promising results where a SOTA commercial CP solver takes a significant amount of time to obtain solutions of the same quality -> the model can be used as a warm start
• My personal view:
  • Interesting idea on the model structure and on using the Lagrangian to incorporate the troublesome constraints of the problem
  • The model solves a simplified setting that is not common in real-world problems, so it would need further research and design adjustments
    – Each job has exactly 1 task for each machine
    – The number of jobs and machines is fixed for a given model