Fast Approximations for Job Shop Scheduling: A Lagrangian Dual Deep Learning Method (AAAI Reading)

AAAI reading material for the paper:
Fast Approximations for Job Shop Scheduling: A Lagrangian Dual Deep Learning Method

Khang Pham

April 06, 2022

Transcript

  1. Fast Approximations for Job Shop Scheduling: A Lagrangian Dual Deep Learning Method (AAAI 2022 reading)
    Khang Pham, Data Scientist @ BCG Gamma
  2. Motivation
    • Job Shop Scheduling is an interesting problem that I have had the chance to tackle before
    • I have been wondering how to use deep learning (DL) to solve numerical optimization problems such as vehicle routing and scheduling
  3. Agenda
    1. Job Shop Scheduling Problem
    2. Lagrangian Dual
    3. Model design
    4. Result and conclusion
  4. Job Shop Scheduling: a popular NP-hard problem in industry
    • Job Shop Scheduling Problem (JSP): build a schedule for multiple "machines" executing multiple jobs, where each job consists of several tasks that must be performed on given machines, so that the total time to finish all jobs is minimized
    • Examples: manufacturing an automobile, building a house, etc.
    • Formula: [formula shown on slide]
      J: jobs, T: tasks, M: machines
      $u$: makespan (total time to finish all jobs)
      $s_t$: start time of task $t$
      $d_t$: processing time of task $t$
    • This paper studies a classical setting where the number of tasks in each job equals the number of machines, so that each job has exactly one task assigned to each machine
    • Reference: https://developers.google.com/optimization/scheduling/job_shop
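To make the definitions above concrete, here is a minimal Python sketch of a JSP instance in the classical setting, with a makespan computation and a feasibility check. The 2-job, 2-machine instance, the `(machine, duration)` encoding, and the function names are illustrative assumptions; none of them come from the paper or the slides.

```python
from itertools import combinations

# Hypothetical instance: each job is an ordered list of (machine, duration) tasks,
# and every job visits every machine exactly once (the classical setting).
JOBS = [
    [(0, 3), (1, 2)],  # job 0: machine 0 for 3 units, then machine 1 for 2
    [(1, 2), (0, 4)],  # job 1: machine 1 for 2 units, then machine 0 for 4
]

def makespan(jobs, starts):
    """u = latest finish time over all tasks."""
    return max(s + d for job, st in zip(jobs, starts)
               for (_, d), s in zip(job, st))

def is_feasible(jobs, starts):
    """Check task precedence within jobs and no overlap on each machine."""
    by_machine = {}
    for job, st in zip(jobs, starts):
        for t, ((m, d), s) in enumerate(zip(job, st)):
            if s < 0:
                return False
            if t + 1 < len(job) and s + d > st[t + 1]:
                return False  # next task of the job starts too early
            by_machine.setdefault(m, []).append((s, s + d))
    for intervals in by_machine.values():
        for (a0, a1), (b0, b1) in combinations(intervals, 2):
            if a0 < b1 and b0 < a1:
                return False  # two tasks overlap on one machine
    return True

starts = [[0, 3], [0, 3]]  # start time of each task, per job
print(makespan(JOBS, starts), is_feasible(JOBS, starts))  # 7 True
```

Machine 0 carries 3 + 4 = 7 units of work in this toy instance, so a makespan of 7 cannot be improved.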
  5. Lagrangian relaxation
    • Optimization problem: $P = \operatorname{argmin}_{y} f(y)$ subject to $g_i(y) \le 0 \ (\forall i \in [m])$
    • Lagrangian function: $f_{\boldsymbol{\lambda}}(y) = f(y) + \sum_{i=1}^{m} \lambda_i \max(0, g_i(y))$, with $\lambda_i \ge 0$ and $\boldsymbol{\lambda} = (\lambda_1, \ldots, \lambda_m)$
    • Then $LR_{\boldsymbol{\lambda}} = \operatorname{argmin}_{y} f_{\boldsymbol{\lambda}}(y)$ gives a lower bound for the original problem $P$, hence we can obtain the strongest approximation of $P$ by finding the best Lagrangian multipliers: $LD = \operatorname{argmax}_{\boldsymbol{\lambda} \ge 0} f(LR_{\boldsymbol{\lambda}})$
    • In the JSP context, the functions $g_i(y)$ are the violation degrees of the JSP constraints, defined as: [formulas shown on slide]
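As a rough illustration of the relaxation above, the sketch below computes violation degrees for a small schedule and the penalized objective $f(y) + \sum_i \lambda_i \max(0, g_i(y))$. The exact violation formulas were an image on the slide, so the ones used here (a positive-part gap for precedence, the smaller of the two gaps for the no-overlap disjunction) are my assumption. The instance and function names are hypothetical as well.

```python
from itertools import combinations

# Hypothetical instance: each job is an ordered list of (machine, duration) tasks.
JOBS = [[(0, 3), (1, 2)], [(1, 2), (0, 4)]]

def violation_degrees(jobs, starts):
    """One non-negative value max(0, g_i) per constraint; 0 means satisfied."""
    v = []
    by_machine = {}
    for job, st in zip(jobs, starts):
        for t, ((m, d), s) in enumerate(zip(job, st)):
            if t + 1 < len(job):
                # precedence: the next task must not start before this one ends
                v.append(max(0, s + d - st[t + 1]))
            by_machine.setdefault(m, []).append((s, d))
    for m in sorted(by_machine):
        for (sa, da), (sb, db) in combinations(by_machine[m], 2):
            # no-overlap is a disjunction, so take the smaller of the two gaps
            v.append(min(max(0, sa + da - sb), max(0, sb + db - sa)))
    return v

def lagrangian(jobs, starts, lambdas):
    """Makespan plus multiplier-weighted violation degrees."""
    u = max(s + d for job, st in zip(jobs, starts)
            for (_, d), s in zip(job, st))
    return u + sum(l * g for l, g in zip(lambdas, violation_degrees(jobs, starts)))

infeasible = [[0, 2], [0, 2]]  # job 0 starts its second task one unit too early
print(violation_degrees(JOBS, infeasible))      # [1, 0, 1, 0]
print(lagrangian(JOBS, infeasible, [0.5] * 4))  # 7.0
```

With all multipliers at zero, this infeasible schedule scores 6, below the best feasible makespan of 7 for the instance. That gap is the lower-bound behaviour that raising the multipliers is meant to tighten.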
  6. Model design: objective function
    • Objective: predict the start times $s_i$ from the processing time of each task, given fixed machines and jobs, so that $s_i = P(d_i)$. The objective function becomes: $\min_{\theta} \sum_{i=1}^{N} L(s_i, \hat{P}_{\theta}(d_i))$ subject to $C(\hat{P}_{\theta}(d_i), d_i)$
    • Using Lagrangian relaxation, the objective function becomes: $\mathcal{L}(s, \hat{s}, d) = L(s, \hat{s}) + \sum_{c \in C} \lambda_c \nu_c(\hat{s}, d)$
    • During the learning process, the model parameters $\theta_i$ and the Lagrangian multipliers $\lambda_i$ are updated in turn, following this algorithm: [algorithm shown on slide]
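The algorithm on this slide was an image, so the following is only a toy sketch of the alternating scheme described above: a few gradient steps on the model parameters for fixed multipliers, then a multiplier update proportional to the remaining violation. The one-parameter "model", the constraint $\hat{s} \ge 2$, the step sizes, and the update rule $\lambda \leftarrow \lambda + \rho\,\nu$ are all assumptions chosen for illustration, not the paper's implementation.

```python
def train(epochs=200, inner_steps=20, lr=0.05, rho=0.1):
    """Toy Lagrangian dual training with a single parameter and one constraint."""
    target = 1.0   # label s; deliberately violates the toy constraint
    bound = 2.0    # toy constraint: s_hat >= 2, so nu = max(0, 2 - s_hat)
    theta = 1.0    # the "model" simply predicts s_hat = theta
    lam = 0.0      # Lagrangian multiplier for the single constraint
    for _ in range(epochs):
        # Primal step: minimize (theta - target)^2 + lam * nu with lam fixed.
        for _ in range(inner_steps):
            penalty_grad = -lam if theta < bound else 0.0
            grad = 2.0 * (theta - target) + penalty_grad
            theta -= lr * grad
        # Dual step: raise the multiplier in proportion to the violation.
        nu = max(0.0, bound - theta)
        lam += rho * nu
    return theta, lam

theta, lam = train()
print(round(theta, 2), round(lam, 2))
```

Without the penalty term the prediction would stay at the label value 1. As the multiplier grows, the prediction is pushed toward the feasible boundary at 2.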
  7. Model design: network structure
    • The network contains 2 parts, Machine layers and Job layers; each machine has its own hidden layers, so there are no connections between machines
    • Machine layers:
      • M components
      • Each component has 4 layers: J, 2J, 2J, J
    • Job layers:
      • J components
      • Each component has 4 layers: M, 2M, 2M, M
    • Shared layers:
      • Concatenation of all Machine and Job layers
      • Size: 2MJ, 2MJ, X (not mentioned), MJ (output)
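The layer sizes listed above can be tabulated with a few lines of Python, as a sanity check that the numbers fit together. This sketch assumes the shared layers receive the concatenated outputs of every Machine and Job component, which is my reading of the slide. The function name is hypothetical, and the unspecified width "X" is left as `None`.

```python
def layer_widths(num_jobs, num_machines):
    """Return the layer widths of each part for J jobs and M machines."""
    J, M = num_jobs, num_machines
    machine_parts = [[J, 2 * J, 2 * J, J] for _ in range(M)]  # M components
    job_parts = [[M, 2 * M, 2 * M, M] for _ in range(J)]      # J components
    # Assumption: the shared layers take the outputs of all components, concatenated.
    concat_width = sum(p[-1] for p in machine_parts + job_parts)
    # None stands for the width "X" that the slide says is not mentioned.
    shared = [2 * M * J, 2 * M * J, None, M * J]
    return machine_parts, job_parts, concat_width, shared

machine_parts, job_parts, concat_width, shared = layer_widths(num_jobs=3, num_machines=2)
print(concat_width, shared)  # 12 [12, 12, None, 6]
```

Under that assumption the concatenated width is M·J + J·M = 2MJ, which matches the first shared layer. The MJ outputs correspond to one predicted start time per task.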
  8. Construct feasible schedule from model result
    • The neural network's output approximates the start time of each task, but with some degree of constraint violation
    • From this approximation, a feasible schedule is built with a greedy algorithm: [algorithm shown on slide]
  9. Experimental result
    • Data: chosen from JSPLIB (2014); each instance is solved with IBM CP Optimizer under a time limit of 1800 s to obtain a (possibly sub-optimal) solution
    • Baselines: several heuristic algorithms, the IBM solver, and one fully connected (FC) network with the same number of parameters
    • Result:
      • Model inference time is less than 30 ms
      • Better results than all compared heuristic methods
      • Results very close to the SoTA solver in significantly less time
  10. Conclusions
    • From the paper:
      • Introduced a deep-learning approach that produces approximations of JSPs in milliseconds
      • Combined it with the Lagrangian dual to include the constraints in the learning process
      • Introduced efficient recovery techniques to build a schedule from the model result
      • The model showed promising results in cases where a SOTA commercial CP solver takes a significant amount of time to obtain solutions of the same quality -> the model can be used as a warm start
    • My personal view:
      • Interesting ideas on the model structure and on using the Lagrangian to incorporate troublesome constraints of the problem
      • The model solves a simplified setting that is not common in real-world problems, so it would need further research and design adjustments:
        - Each job has exactly 1 task for each machine
        - The number of jobs and machines is fixed for 1 model