Quick Multi-Agent Path Planning

Slide 1

Slide 1 text

7th Sep. 2022 at VUB Artificial Intelligence Lab Quick Multi-Agent Path Planning Keisuke Okumura Tokyo Institute of Technology, Japan

Slide 2

Slide 2 text

/64 2 Ph.D. student at Tokyo Institute of Technology, Japan (Apr. 2020–) Advisor: Prof. Xavier DEFAGO Keisuke Okumura Research Interests Controlling Multiple Moving Agents AI & Robotics / Multi-Agent Planning / Multi-Robot Coordination https://kei18.github.io/ visiting researcher at LIP6, Sorbonne Univ. (Mar. 2023–) Advisor: Prof. Sebastien TIXEUIL

Slide 3

Slide 3 text

/64 3 YouTube/Mind Blowing Videos logistics YouTube/WIRED manufacturing YouTube/Tokyo 2020 entertainment Swarm { is, will be } necessary everywhere

Slide 4

Slide 4 text

/64 4 objective-1 Representation objective-2 Planning Common Knowledge? Cooperation? (increased) Uncertainty Execution Navigation for a Team of Agents Who Plans? Huge Search Space

Slide 5

Slide 5 text

/64 TODAY 5 representation planning execution integration domain-independent planning [preprint-22] [AAMAS-22] data-driven roadmap construction ≥1000 agents within 1 sec [IJCAI-19, IROS-21, ICAPS-22*, AIJ-22, …] [AAAI-21, ICRA-21, ICAPS-22*, IJCAI-22, …] async, (decentralized) *Best Student Paper Award How do we control multiple moving agents adaptively & smoothly?

Slide 6

Slide 6 text

/64 Outline multi-agent path planning in: continuous spaces CTRMs: data-driven approach [AAMAS-22] SSSP: domain-independent planning [preprint-22] discretized spaces PIBT: scalable & convenient algorithm [IJCAI-19 => AIJ-22] LaCAM: complete algorithm [unpublished yet]

Slide 7

Slide 7 text

Slide 8

Slide 8 text

/64 8 MAPF: Multi-Agent Path Finding given agents (starts) graph goals solution paths without collisions optimization is intractable in various criteria [Yu+ AAAI-13, Ma+ AAAI-16, Banfi+ RA-L-17, Geft+ AAMAS-22]
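The problem statement above can be made concrete with a small checker. This is a minimal Python sketch, not taken from the talk: the function name `is_valid_solution`, the adjacency-dict graph encoding, and the two conflict types checked (vertex and swap conflicts, the usual ones in the MAPF literature) are assumptions made for illustration.

```python
def is_valid_solution(graph, starts, goals, paths):
    """Return True if `paths` solves the MAPF instance (illustrative checker).

    graph:  dict mapping each vertex to a list of adjacent vertices
    starts: start vertex of each agent; goals: goal vertex of each agent
    paths:  paths[i][t] is the vertex of agent i at timestep t
    """
    n = len(starts)
    if len(paths) != n or any(len(p) == 0 for p in paths):
        return False
    horizon = max(len(p) for p in paths)
    # Assumption: agents that finish early wait at their last vertex.
    padded = [list(p) + [p[-1]] * (horizon - len(p)) for p in paths]
    for i, p in enumerate(padded):
        if p[0] != starts[i] or p[-1] != goals[i]:
            return False
        # Each step must either wait or traverse an edge of the graph.
        if any(u != v and v not in graph[u] for u, v in zip(p, p[1:])):
            return False
    for t in range(horizon):
        if len({p[t] for p in padded}) < n:
            return False  # vertex conflict: two agents on one vertex
        if t == 0:
            continue
        for i in range(n):
            for j in range(i + 1, n):
                if (padded[i][t] == padded[j][t - 1]
                        and padded[j][t] == padded[i][t - 1]):
                    return False  # swap conflict: i and j exchange vertices
    return True


# Usage: two agents exchange corners of a 4-cycle by going around it.
square = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
print(is_valid_solution(square, [0, 3], [3, 0], [[0, 1, 3], [3, 2, 0]]))
```

Checking a solution is cheap; the intractability cited on the slide concerns finding optimal solutions, not verifying them.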

Slide 9

Slide 9 text

/64 10 Quality vs Speed with MAPF Benchmark [Stern+ SOCS-19] example; 194x194; |V|=13,214 *limiting to popular methods solution quality optimal ≥1000 agents in seconds speed & scalability ~100 agents in minutes Push & Swap/Rotate [Luna+ IJCAI-11, de Wilde+ AAMAS-13] EECBS [Li+ AAAI-21] HCA* [Silver AIIDE-05] BCP [Lam+ COR-22] CBS [Sharon+ AIJ-15, Li+ AIJ-21] frontier line

Slide 10

Slide 10 text

/64 11 Put Quality Aside solution quality optimal ≥1000 agents in seconds speed & scalability ~100 agents in minutes Push & Swap/Rotate [Luna+ IJCAI-11, de Wilde+ AAMAS-13] EECBS [Li+ AAAI-21] HCA* [Silver AIIDE-05] BCP [Lam+ COR-22] CBS [Sharon+ AIJ-15, Li+ AIJ-21] [Okumura+ IROS-21] iterative refinement for known solutions planning time (sec) cost / lower bound 300 agents

Slide 11

Slide 11 text

/64 12 Take Speed! solution quality optimal ≥1000 agents in seconds speed & scalability ~100 agents in minutes Push & Swap/Rotate [Luna+ IJCAI-11, de Wilde+ AAMAS-13] EECBS [Li+ AAAI-21] HCA* [Silver AIIDE-05] BCP [Lam+ COR-22] CBS [Sharon+ AIJ-15, Li+ AIJ-21] [Okumura+ IROS-21] iterative refinement developing quick & scalable sub-optimal methods

Slide 12

Slide 12 text

/64 13 solution quality optimal ≥1000 agents in seconds speed & scalability ~100 agents in minutes Push & Swap/Rotate [Luna+ IJCAI-11, de Wilde+ AAMAS-13] EECBS [Li+ AAAI-21] HCA* [Silver AIIDE-05] BCP [Lam+ COR-22] CBS [Sharon+ AIJ-15, Li+ AIJ-21] PIBT [Okumura+ AIJ-22] my work! Take Speed!

Slide 13

Slide 13 text

/64 14 planning online planning applicable to lifelong scenarios ≥500 agents within 50ms scalable sub-optimal algorithm (PIBT) to solve MAPF iteratively ensuring that all agents eventually reach their destinations https://kei18.github.io/pibt2 IJCAI-19 => AIJ-22 Priority Inheritance with Backtracking for Iterative Multi-agent Path Finding KO, Manao Machida, Xavier Defago & Yasumasa Tamura

Slide 14

Slide 14 text

/64 15 locations at t=1 t=2 t=3 repeat one-timestep prioritized planning high low mid How PIBT works – 1/6 … 1 2 3 4 5 6 7 8 9 decision order time-window

Slide 15

Slide 15 text

/64 16 How PIBT works – 2/6 simple prioritized planning is incomplete high low mid stuck

Slide 16

Slide 16 text

/64 17 How PIBT works – 3/6 high low mid as high priority inheritance [Sha+ IEEE Trans Comput-90]

Slide 17

Slide 17 text

/64 18 high low mid How PIBT works – 4/6 1 3 2 decision order … …

Slide 18

Slide 18 text

/64 19 How PIBT works – 5/6 high as high as high as high as high stuck but still not feasible

Slide 19

Slide 19 text

/64 20 How PIBT works – 6/6 introduce backtracking: valid = "You can move"; invalid = "You must re-plan, I will stay" (re-plan) repeat this one-timestep planning until termination
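The one-timestep procedure described on the "How PIBT works" slides can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the author's implementation (that is linked at https://kei18.github.io/pibt2): the names `pibt_step`, `plan`, and `bfs_dist`, the adjacency-dict graph, and tie-breaking by list order are all illustrative choices.

```python
from collections import deque


def bfs_dist(graph, goal):
    """Hop distance from every vertex to `goal` (graph assumed connected)."""
    dist = {goal: 0}
    queue = deque([goal])
    while queue:
        v = queue.popleft()
        for u in graph[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist


def pibt_step(graph, dists, locs, priorities):
    """Plan one timestep; returns the next vertex of every agent.

    locs[i]: current vertex of agent i, dists[i][v]: distance from v to the
    goal of agent i, priorities[i]: larger values plan earlier.
    """
    n = len(locs)
    nxt = [None] * n
    occupied_now = {v: i for i, v in enumerate(locs)}
    occupied_next = {}

    def plan(i):
        here = locs[i]
        # Candidates: neighbors and the current vertex, closest to goal first.
        for v in sorted(graph[here] + [here], key=lambda u: dists[i][u]):
            if v in occupied_next:
                continue  # already reserved by another agent
            j = occupied_now.get(v)
            if j is not None and j != i and nxt[j] == here:
                continue  # would swap places with agent j
            nxt[i] = v
            occupied_next[v] = i
            # Priority inheritance: an undecided agent on v plans right now.
            if j is not None and j != i and nxt[j] is None and not plan(j):
                continue  # backtracking said "invalid": try the next vertex
            return True  # "valid": the caller may move
        # Stuck: stay here and tell the caller to re-plan.
        nxt[i] = here
        occupied_next[here] = i
        return False

    for i in sorted(range(n), key=lambda k: -priorities[k]):
        if nxt[i] is None:
            plan(i)
    return nxt


# Usage on a 4-cycle: agent 0 (high priority) pushes agent 1 out of its way.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
goals = [2, 1]
dists = [bfs_dist(cycle, g) for g in goals]
print(pibt_step(cycle, dists, [0, 1], priorities=[1, 0]))
```

Calling `pibt_step` repeatedly, with priorities updated between calls, gives the "repeat this one-timestep planning until termination" loop; how priorities are updated is left out of this sketch.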

Slide 20

Slide 20 text

/64 21 Theoretical Result With dynamic priorities, in biconnected graphs, all agents reach their destinations within a finite number of timesteps convenient in lifelong scenarios important note: PIBT is incomplete for MAPF (figure: an instance that is unsolvable, but the theorem is valid)

Slide 21

Slide 21 text

/64 22 Multi-agent Pickup & Delivery Sushi Sushi plates are ensured to be delivered

Slide 22

Slide 22 text

/64 23 PIBT Performance on one-shot MAPF 25 instances, 30sec timeout, on desktop PC, sufficiently long timestep limit, 194x194 four-connected grid. Compared methods: prioritized planning w/distance-based heuristics [Silver AIIDE-05]; A* with operator decomposition, greedy version [Standley AAAI-10]; EECBS, CBS-based, bounded sub-optimal [Li+ AAAI-21]; LNS2, large neighborhood search for MAPF [Li+ AAAI-22]; PIBT. [Plots vs. number of agents (50 to 1000): sum-of-costs (normalized; min: 1), runtime (sec), success rate] not so bad not so bad blazing fast! worst: 550ms

Slide 23

Slide 23 text

/64 24 PIBT is great but… [Plots: success rate in 30sec of PIBT vs. number of agents on three maps: room-64-64-8 (64x64, |V|=3,232; 50 to 1000 agents), ost003d (194x194, |V|=13,214; 50 to 1000 agents), random-32-32-20 (32x32, |V|=819; 50 to 400 agents); annotation: 0%] important note: PIBT is incomplete for MAPF We need one more jump!

Slide 24

Slide 24 text

/64 25 planning LaCAM: Search-Based Algorithm for Quick Multi-Agent Pathfinding KO (under review) quick & complete algorithm for MAPF (LaCAM; lazy constraints addition search) [Plots: success rate in 30sec of LaCAM vs. number of agents on room-64-64-8 and ost003d (50 to 1000 agents each) and random-32-32-20 (50 to 400 agents); annotations: 100%; worst: 699ms, worst: 394ms, worst: 11sec]

Slide 25

Slide 25 text

/64 26 … … … … … search node (configuration) Vanilla A* for MAPF greedy search: 44 nodes in general: (5^N)xT nodes N: agents, T: depth intractable even with perfect heuristics goal configuration
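A back-of-the-envelope reading of the node count above, assuming a four-connected grid where each agent has at most five actions per timestep (four moves plus waiting); this assumption is not stated on the slide, and the numeric examples are added for illustration:

```latex
\[
  \underbrace{5 \times 5 \times \cdots \times 5}_{N \text{ agents}} = 5^{N}
  \ \text{successors per configuration},
  \qquad
  \text{so up to } 5^{N} \times T \text{ generated nodes along a depth-} T \text{ greedy path.}
\]
\[
  N = 10:\ 5^{10} = 9{,}765{,}625,
  \qquad
  N = 100:\ 5^{100} \approx 7.9 \times 10^{69}.
\]
```

Even a heuristic that always expands the right configuration still has to generate all of its successors, which is why the slide calls the search intractable even with perfect heuristics.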

Slide 26

Slide 26 text

/64 27 PIBT for MAPF PIBT PIBT PIBT greedy search: 44 nodes only 4 configurations repeat one-timestep planning until termination use PIBT to guide exhaustive search initial configuration goal configuration

Slide 27

Slide 27 text

/64 28 … … … … … Concept of LaCAM PIBT PIBT PIBT the algorithm is beautiful but complicated use other MAPF algorithms to generate a promising configuration configurations are generated in a lazy manner by a two-level search scheme exhaustive search but node generation is dramatically reduced => quick & complete MAPF
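The two-level idea can be illustrated with a heavily simplified sketch. It is written in Python under several assumptions and is not the author's algorithm as published: a greedy generator stands in for PIBT, agents are constrained in index order, and the names `lacam`, `generate`, and `bfs_dist` are illustrative. The high level searches over configurations depth-first; the low level keeps, per configuration, a queue of constraints ("agent k must go to vertex v next") that is extended lazily each time the configuration is revisited.

```python
from collections import deque


def bfs_dist(graph, goal):
    """Hop distance from every vertex to `goal` (graph assumed connected)."""
    dist = {goal: 0}
    queue = deque([goal])
    while queue:
        v = queue.popleft()
        for u in graph[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist


def generate(graph, dists, config, constraint):
    """Greedy stand-in for PIBT: a collision-free successor of `config` that
    obeys `constraint` (next vertices of agents 0..len(constraint)-1), or None."""
    nxt = []
    for i, here in enumerate(config):
        if i < len(constraint):
            cands = [constraint[i]]
        else:
            cands = sorted(graph[here] + [here], key=lambda v: dists[i][v])
        for v in cands:
            # Reject vertex conflicts and swap conflicts with agents 0..i-1.
            if all(v != nxt[j] and not (v == config[j] and nxt[j] == here)
                   for j in range(i)):
                nxt.append(v)
                break
        else:
            return None
    return tuple(nxt)


def lacam(graph, starts, goals):
    dists = [bfs_dist(graph, g) for g in goals]
    start, goal = tuple(starts), tuple(goals)
    parent = {start: None}
    tree = {start: deque([()])}  # per-configuration queue of constraints
    stack = [start]              # depth-first search over configurations
    while stack:
        config = stack[-1]
        if config == goal:
            path = []
            while config is not None:
                path.append(config)
                config = parent[config]
            return path[::-1]
        if not tree[config]:
            stack.pop()          # every successor has been tried
            continue
        constraint = tree[config].popleft()
        k = len(constraint)
        if k < len(config):      # lazily add constraints for the next agent
            for v in graph[config[k]] + [config[k]]:
                tree[config].append(constraint + (v,))
        new_config = generate(graph, dists, config, constraint)
        if new_config is None:
            continue
        if new_config not in parent:
            parent[new_config] = config
            tree[new_config] = deque([()])
        stack.append(new_config)
    return None  # the whole reachable configuration space was exhausted


# Usage: two agents must swap ends of a corridor that has one side pocket.
corridor = {0: [1], 1: [0, 2, 3], 2: [1], 3: [1]}
print(lacam(corridor, [0, 2], [2, 0]))
```

Each visit to a configuration generates only one successor instead of all of them, while the constraint queue still enumerates every successor eventually, which is the sense in which the search stays exhaustive.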

Slide 28

Slide 28 text

/64 29 Performance on one-shot MAPF 25 instances, on laptop, 30sec timeout, sufficiently long timestep limit, 194x194 four-connected grid. Compared methods: prioritized planning w/distance-based heuristics [Silver AIIDE-05]; A* with operator decomposition, greedy version [Standley AAAI-10]; EECBS, CBS-based, bounded sub-optimal [Li+ AAAI-21]; LNS2, large neighborhood search for MAPF [Li+ AAAI-22]; PIBT [Okumura+ AIJ-22]; LaCAM. [Plots vs. number of agents (50 to 1000): sum-of-costs (normalized; min: 1), runtime (sec), success rate] not so bad blazing fast! worst: 699ms perfect!

Slide 29

Slide 29 text

/64 30 Performance with x1000 agents 25 instances, timeout of 1000 sec, on desktop PC warehouse-20-40-10-2-2, 340x164; |V|=38,756 [Plots for PIBT and LaCAM vs. agents (x1000), from 2 to 10: success rate in 1000sec, runtime (sec, log scale)] perfect! blazing fast! worst: 26sec for 10,000 agents wrong thought: “centralized algorithms are not scalable” => NO, the game is changing demo of 10,000 agents

Slide 30

Slide 30 text

/64 31 LaCAM is great but… [Plots: success rate in 30sec of LaCAM vs. number of agents on three maps: maze-32-32-2 (32x32, |V|=666; 50 to 300 agents), lt_gallowstemplar_n (251x180, |V|=10,021; 50 to 1000 agents), warehouse-20-40-10-2-1 (321x123, |V|=22,599; 50 to 1000 agents); annotation: 0%] narrow corridors can be a bottleneck My research will continue

Slide 31

Slide 31 text

/64 32 Take Speed! solution quality optimal ≥1000 agents in seconds speed & scalability ~100 agents in minutes Push & Swap/Rotate [Luna+ IJCAI-11, de Wilde+ AAMAS-13] EECBS [Li+ AAAI-21] HCA* [Silver AIIDE-05] BCP [Lam+ COR-22] CBS [Sharon+ AIJ-15, Li+ AIJ-21] developing quick & scalable sub-optimal methods [Okumura+ IROS-21] iterative refinement

Slide 32

Slide 32 text

/64 33 Take Speed! solution quality optimal ≥1000 agents in seconds speed & scalability ~100 agents in minutes Push & Swap/Rotate [Luna+ IJCAI-11, de Wilde+ AAMAS-13] EECBS [Li+ AAAI-21] HCA* [Silver AIIDE-05] BCP [Lam+ COR-22] CBS [Sharon+ AIJ-15, Li+ AIJ-21] [Okumura+ IROS-21] iterative refinement LaCAM PIBT [Okumura+ AIJ-22]

Slide 33

Slide 33 text

Slide 34

Slide 34 text

/64 35 MAPF definition again given agents (starts) graph goals solution paths without collisions reality is continuous

Slide 35

Slide 35 text

/64 36 Cool coordination but not efficient! why not move diagonally? robots follow grid [Okumura+ ICAPS-22]

Slide 36

Slide 36 text

/64 37 Multi-Agent Path Planning in Continuous Spaces given agents (starts) goals solution paths without collisions finding solutions itself is tremendously challenging [Spirakis+ 84, Hopcroft+ IJRR, Hearn+ TCS-05] workspace

Slide 37

Slide 37 text

/64 38 artificial potential field sampling-based rule-based goal start Strategies to Solve Single-Agent Path Planning in Continuous Spaces constructing roadmap

Slide 38

Slide 38 text

/64 39 SBMP: Sampling-Based Motion Planning state of robot: (x, y) (x, y) should be in this region configuration space random sampling & construct roadmap pathfinding on roadmap same scheme even in high-dimension
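The sampling-based scheme above (random sampling, roadmap construction, pathfinding on the roadmap) can be sketched for a point robot. This is a minimal PRM-style sketch in Python; the unit-square workspace, circular obstacles given as `(cx, cy, r)`, the connection radius, and all function names are assumptions made for illustration.

```python
import heapq
import math
import random


def is_free(p, obstacles):
    """A point is valid if it lies outside every circular obstacle."""
    return all(math.dist(p, (cx, cy)) > r for cx, cy, r in obstacles)


def edge_free(p, q, obstacles, step=0.01):
    """Approximate edge check: test evenly spaced points on the segment."""
    k = max(1, int(math.dist(p, q) / step))
    return all(
        is_free((p[0] + (q[0] - p[0]) * s / k, p[1] + (q[1] - p[1]) * s / k),
                obstacles)
        for s in range(k + 1)
    )


def prm(start, goal, obstacles, n_samples=200, radius=0.3, seed=0):
    """Return a list of points from start to goal, or None if not connected."""
    rng = random.Random(seed)
    nodes = [start, goal]
    while len(nodes) < n_samples + 2:  # random sampling in the unit square
        p = (rng.random(), rng.random())
        if is_free(p, obstacles):
            nodes.append(p)
    edges = {i: [] for i in range(len(nodes))}  # roadmap construction
    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            d = math.dist(nodes[i], nodes[j])
            if d <= radius and edge_free(nodes[i], nodes[j], obstacles):
                edges[i].append((j, d))
                edges[j].append((i, d))
    best, parent, heap = {0: 0.0}, {0: None}, [(0.0, 0)]
    while heap:  # pathfinding on the roadmap (Dijkstra)
        cost, i = heapq.heappop(heap)
        if i == 1:
            path = []
            while i is not None:
                path.append(nodes[i])
                i = parent[i]
            return path[::-1]
        if cost > best[i]:
            continue  # stale queue entry
        for j, d in edges[i]:
            if cost + d < best.get(j, math.inf):
                best[j], parent[j] = cost + d, i
                heapq.heappush(heap, (cost + d, j))
    return None


# Usage: plan around a disc in the middle of the unit square.
print(prm((0.1, 0.1), (0.9, 0.9), obstacles=[(0.5, 0.5, 0.2)]))
```

Only `is_free` and `edge_free` depend on the robot and workspace, which is why the same scheme carries over to higher-dimensional configuration spaces.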

Slide 39

Slide 39 text

/64 40 Naïve Strategy to Solve Multi-Agent Path Planning in Continuous Spaces 1. Construct agent-wise roadmaps by SBMP (sampling-based motion planning) methods 2. Solve MAPF on those roadmaps

Slide 40

Slide 40 text

/64 41 Pitfall – There is a trade-off (roadmaps produced by PRM [Kavraki+ 96]) dense roadmap: large planning effort, high solution quality; sparse roadmap: small planning effort, low solution quality. big impact in multi-agent scenarios ideal: small roadmaps containing high-quality solutions

Slide 41

Slide 41 text

/64 42 Countermeasure biased sampling sampling from important region of each agent how to identify? agent-specific features + interactions between agents design manually?

Slide 42

Slide 42 text

/64 43 Countermeasure biased sampling sampling from important region of each agent how to identify? agent-specific features + interactions between agents This is a machine learning problem! supervised learning: planning demonstration as training data

Slide 43

Slide 43 text

/64 44 representation CTRMs: Learning to Construct Cooperative Timed Roadmaps for Multi-agent Path Planning in Continuous Spaces KO,* Ryo Yonetani, Mai Nishimura & Asako Kanezaki https://omron-sinicx.github.io/ctrm AAMAS-22 *work done as an intern at OMRON SINIC X data-driven roadmap construction, reducing planning effort significantly constructed roadmaps MAPF algorithm solution

Slide 44

Slide 44 text

/64 45 model training instances & solutions predict next locations MAPF algorithm new instance random walk sampling module next locations for all agents starts path generation compositing solution … t=0 t=1 t=2 CTRMs Workflow Online Inference Offline Training CVAE: Conditional Variational Autoencoder [Sohn+ NeurIPS-15] +importance sampling [Salzmann+ ECCV-20] +multi-agent attention [Hoshen NeurIPS-17]

Slide 45

Slide 45 text

/64 46 next position instance & solution occupancy cost-to-go env. info Offline Training & Model Arch. [Sohn+ NeurIPS-15] CVAE features ?

Slide 46

Slide 46 text

/64 47 + + goal-driven features relative positions, size, speeds, etc Offline Training & Model Arch.

Slide 47

Slide 47 text

/64 48 + + comm. features attention Offline Training & Model Arch. [Hoshen NeurIPS-17]

Slide 48

Slide 48 text

/64 49 go right [0,0,1] indicator feature Offline Training & Model Arch.

Slide 49

Slide 49 text

/64 50 + + + + go right [0,0,1] next position goal-driven features comm. features indicator feature instance & solution occupancy cost-to-go env. info relative positions, size, speeds, etc attention Offline Training & Model Arch. [Sohn+ NeurIPS-15] CVAE [Hoshen NeurIPS-17]

Slide 50

Slide 50 text

/64 51 observations for agent-i next predicted location for agent-i trained model likely to be used by planners Online Inference

Slide 51

Slide 51 text

/64 52 Online Inference timestep t timestep t+1 next predicted locations for all agents observations for all agents

Slide 52

Slide 52 text

/64 53 Online Inference t=0 t=1 t=2 t=T t=T-1 … initial locations timed path for agent-i each path is agent-specific and cooperative hyperparameter

Slide 53

Slide 53 text

/64 54 … … … … compositing t=0 t=1 t=2 t=T t=T-1 timed roadmap for agent-i each roadmap is agent-specific and cooperative hyperparameter: #(path generation) Online Inference
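The online inference steps above (generate several timed paths for all agents, then composite them into one timed roadmap per agent) can be sketched as follows. This Python sketch rests on strong assumptions: `sample_next` is a hypothetical noisy step toward the goal that stands in for the trained model, the workspace is the unit square, and two vertices in consecutive layers are connected when they lie within one step length. None of these details come from the slides.

```python
import math
import random


def sample_next(i, locs, goals, rng, step):
    """Hypothetical stand-in for the trained model: a noisy step of agent i
    toward its goal. A learned sampler would also use the other agents in
    `locs`; this placeholder ignores them."""
    (x, y), (gx, gy) = locs[i], goals[i]
    if math.hypot(gx - x, gy - y) <= step:
        return goals[i]
    angle = math.atan2(gy - y, gx - x) + rng.gauss(0.0, 0.5)
    nx = min(1.0, max(0.0, x + step * math.cos(angle)))
    ny = min(1.0, max(0.0, y + step * math.sin(angle)))
    return (nx, ny)


def build_timed_roadmaps(starts, goals, horizon, n_paths, step=0.1, seed=0):
    """Return (layers, roadmaps); layers[i][t] is the set of locations of
    agent i at timestep t, roadmaps[i] maps (t, p) to vertices (t + 1, q)."""
    rng = random.Random(seed)
    n = len(starts)
    layers = [[{starts[i]}] + [set() for _ in range(horizon)] for i in range(n)]
    for _ in range(n_paths):  # hyperparameter: #(path generation)
        locs = list(starts)
        for t in range(1, horizon + 1):
            # Predict next locations for all agents from the current ones.
            locs = [sample_next(i, locs, goals, rng, step) for i in range(n)]
            for i in range(n):
                layers[i][t].add(locs[i])
    roadmaps = []
    for i in range(n):  # compositing: link consecutive layers
        edges = {}
        for t in range(horizon):
            for p in layers[i][t]:
                edges[(t, p)] = [(t + 1, q) for q in layers[i][t + 1]
                                 if math.dist(p, q) <= step + 1e-9]
        roadmaps.append(edges)
    return layers, roadmaps


# Usage: two agents crossing the unit square, three sampled paths each.
layers, roadmaps = build_timed_roadmaps(
    [(0.1, 0.1), (0.9, 0.1)], [(0.9, 0.9), (0.1, 0.9)], horizon=5, n_paths=3)
print([len(layer) for layer in layers[0]])
```

Because vertices carry a timestep, each roadmap is a layered graph; a MAPF algorithm can then search these agent-specific roadmaps for collision-free timed paths.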

Slide 54

Slide 54 text

/64 55 Roadmap Visualization SPARS [Dobson & Bekris, IJRR-14] (random) simplified PRM [Karaman & Frazzoli, IJRR-11] square as agent-specific roadmaps grid as used in MAPF studies CTRMs 20-30 homogeneous agents corresponding to 32x32 grids CTRMs produce small but effective roadmaps specific to each agent

Slide 55

Slide 55 text

/64 56 Quantitative Results [Plot: sum-of-costs / agents (0 to 40) vs. expanded nodes / agents (10^3 to 10^5), average over 40/100 instances, from sparse to dense roadmaps; roadmaps: CTRMs, random, grid, SPARS, square] 20-30 homogeneous agents corresponding to 32x32 grids 100 instances solved by prioritized planning [Silver, AIIDE-05, Van Den Berg & Overmars IROS-05, etc] CTRMs reduce planning effort while keeping solution qualities params of CTRMs: #(path generations)

Slide 56

Slide 56 text

/64 57 Quantitative Results [Plot: runtime (sec, 0 to 500), split into roadmap and planner, for CTRMs, random, grid, SPARS, square, from sparse to dense; average over 40/100 instances] 20-30 homogeneous agents corresponding to 32x32 grids 100 instances solved by prioritized planning [Silver, AIIDE-05, Van Den Berg & Overmars IROS-05, etc] CTRMs achieve efficient path-planning from the end-to-end perspective quick! Roadmap construction can be much faster. Check our latest implementation: https://github.com/omron-sinicx/jaxmapp

Slide 57

Slide 57 text

/64 58 representation of environment is critical for planning develop agent-wise roadmaps according to multi-agent search progress

Slide 58

Slide 58 text

/64 59 Quick Multi-Robot Motion Planning by Combining Sampling & Search KO & Xavier Defago (under review) planning representation algorithm (SSSP) to solve multi-robot motion planning quickly simultaneously perform roadmap construction & collision-free pathfinding https://kei18.github.io/sssp 32 robots

Slide 59

Slide 59 text

/64 60 MRMP: Multi-Robot Motion Planning MAPF is a special case of MRMP Solution: collision-free trajectories Each agent has its own configuration space To make MRMP domain-independent, only five utility functions are available: sample, collide, steer, dist, connect [figure labels: free space; true, false; true, false; 0.18]
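To make the five utility functions concrete, here is a toy domain that implements them for disc-shaped robots. It is a sketch in Python under assumptions: the slide names the functions but not their signatures, so every argument list below, the class name `PointRobot2D`, and the obstacle-free unit-square workspace are hypothetical.

```python
import math
import random


class PointRobot2D:
    """Disc robot in an obstacle-free unit square (toy domain)."""

    def __init__(self, radius=0.05, max_step=0.1, seed=0):
        self.radius = radius
        self.max_step = max_step
        self.rng = random.Random(seed)

    def sample(self):
        """Draw a random configuration."""
        return (self.rng.random(), self.rng.random())

    def dist(self, q1, q2):
        """Distance between two configurations."""
        return math.dist(q1, q2)

    def steer(self, q_from, q_to):
        """Move from q_from toward q_to by at most max_step."""
        d = self.dist(q_from, q_to)
        if d <= self.max_step:
            return q_to
        s = self.max_step / d
        return (q_from[0] + s * (q_to[0] - q_from[0]),
                q_from[1] + s * (q_to[1] - q_from[1]))

    def connect(self, q_from, q_to):
        """Can the robot move from q_from to q_to in a single step?"""
        inside = all(self.radius <= c <= 1 - self.radius for c in q_to)
        return inside and self.dist(q_from, q_to) <= self.max_step + 1e-9

    def collide(self, q1_from, q1_to, other, q2_from, q2_to):
        """Do this robot and `other` touch while both move linearly and
        simultaneously? Uses the closest approach of the relative motion."""
        ax, ay = q1_from[0] - q2_from[0], q1_from[1] - q2_from[1]
        bx = (q1_to[0] - q1_from[0]) - (q2_to[0] - q2_from[0])
        by = (q1_to[1] - q1_from[1]) - (q2_to[1] - q2_from[1])
        bb = bx * bx + by * by
        # Relative position is a + s * b for s in [0, 1]; minimize its norm.
        s = 0.0 if bb == 0 else min(1.0, max(0.0, -(ax * bx + ay * by) / bb))
        return math.hypot(ax + s * bx, ay + s * by) < self.radius + other.radius


# Usage: two robots trying to swap positions head-on.
r1, r2 = PointRobot2D(), PointRobot2D(seed=1)
print(r1.collide((0.2, 0.5), (0.3, 0.5), r2, (0.3, 0.5), (0.2, 0.5)))
```

A planner that calls only these five functions never needs to know whether the robot is a point, an arm, or a snake, which is the sense of "domain-independent" here.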

Slide 60

Slide 60 text

/64 61 Proposed Algorithm: SSSP search progress solution MAPF A* with operator decomposition [Standley AAAI-10] SBMP EST: expansive space trees [Hsu+ ICRA-97] integration +many tricks inspired by SBMP & MAPF studies exhaustive search like vanilla A* while constructing roadmaps through random walks

Slide 61

Slide 61 text

/64 62 Performance of SSSP [Plots: runtime (sec, 0 to 300) vs. solved instances (0 to 1000) for PRM, RRT, RRT-C, PP, CBS, and SSSP on eight scenarios: Point2d (DOF: 2N), Point3d (DOF: 3N), Line2d (DOF: 3N), Capsule3d (DOF: 6N), Arm22 (DOF: 2N), Arm33 (DOF: 6N), Dubins2d (DOF: 3N), Snake2d (DOF: 6N; PRM not listed)] very promising quick!

Slide 62

Slide 62 text

/64 63 ML Machine Learning as Heuristics integration SBMP Sampling-Based Motion Planning MAPF Multi-Agent Path Finding My Future Research? establish practical methodologies for MRMP

Slide 63

Slide 63 text

/64 64 Takeaways Quick Multi-Agent Path Planning: difficult, but will be possible; still a hot topic in the next decade, e.g., MRMP is not mature at all

Slide 64

Slide 64 text

/64 More Info? => Check My Website! https://kei18.github.io/ Thank You for Listening! Collaboration is welcome!