Fault-Tolerant Offline Multi-Agent Path Planning

Slide 1

Slide 1 text

Fault-Tolerant Offline Multi-Agent Path Planning
Keisuke Okumura (Tokyo Institute of Technology, Japan)
Sebastien Tixeuil (Sorbonne University, CNRS, LIP6, Institut Universitaire de France, France)
AAAI-23, Feb. 7th – 14th, 2023, Washington, DC, USA
https://kei18.github.io/mappcf
Figure labels: fault; solution (multiple paths assuming crashes)

Slide 2

Slide 2 text

Motivation
Multi-agent path planning (MAPP) is important (image source: aws.amazon.com).
Building reliable systems is a necessity, yet cutting-edge studies assume perfect agents.
Robot faults are common, e.g., the mean time between failures of one robot is 125 days*. => fault tolerance
MAPPCF (MAPP with crash faults): path planning where agents may unexpectedly crash at runtime.
*https://fr.autostoresystem.com/benefits/reliable

Slide 3

Slide 3 text

Conventional Solution Concept (timestep 0). Figure labels: graph, start, goal, solution.

Slide 4

Slide 4 text

Conventional Solution Concept (timestep 1).

Slide 5

Slide 5 text

Conventional Solution Concept (timestep 2).

Slide 6

Slide 6 text

Conventional Solution Concept (timestep 3): done!

Slide 7

Slide 7 text

With Unforeseen Crash (timestep 0).

Slide 8

Slide 8 text

With Unforeseen Crash (timestep 1).
One agent has crashed (it stops forever).
Crash detected => then? Either online replanning, or an offline approach: preparing backup paths from the beginning.

Slide 9

Slide 9 text

Solution Concept of MAPPCF (timestep 0).
Figure labels: primary path & backup path; transition rule: when a crash is detected.

Slide 10

Slide 10 text

Solution Concept of MAPPCF (timestep 1).

Slide 11

Slide 11 text

Solution Concept of MAPPCF (timestep 2).

Slide 12

Slide 12 text

Solution Concept of MAPPCF (timestep 3): done!
Multiple agents may crash => backup path of backup path.
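The idea of a "backup path of backup path" can be pictured as a nested plan. The sketch below is only an illustration under strong assumptions, not the authors' data structure: it follows a single agent, treats crashed agents as a fixed set of blocked vertices, and keys each transition rule by the pair (current vertex, vertex where the crash is detected). The names `Plan`, `backups`, and `follow` are hypothetical.

```python
from __future__ import annotations
from dataclasses import dataclass, field

Vertex = str


@dataclass
class Plan:
    """A path plus transition rules to backup plans (hypothetical structure)."""
    path: list[Vertex]
    # Transition rule: (current vertex, vertex where a crash is detected) -> backup plan.
    # Assumption: every backup path starts at the current vertex.
    backups: dict[tuple[Vertex, Vertex], Plan] = field(default_factory=dict)

    def depth(self) -> int:
        """Number of crashes the deepest chain of backups can react to."""
        if not self.backups:
            return 0
        return 1 + max(b.depth() for b in self.backups.values())


def follow(plan: Plan, crashed: set[Vertex]) -> list[Vertex]:
    """Walk the plan, switching to a backup whenever the next vertex holds a crashed agent."""
    visited = [plan.path[0]]
    i = 0
    while i + 1 < len(plan.path):
        here, nxt = plan.path[i], plan.path[i + 1]
        if nxt in crashed:
            if (here, nxt) not in plan.backups:
                raise RuntimeError("no backup path prepared for this crash")
            plan = plan.backups[(here, nxt)]  # apply the transition rule
            i = 0                             # restart at the head of the backup path
            continue
        visited.append(nxt)
        i += 1
    return visited


# Toy instance: primary a-b-c-g, backup a-d-e-g, backup of backup d-f-g.
primary = Plan(
    ["a", "b", "c", "g"],
    {("a", "b"): Plan(["a", "d", "e", "g"], {("d", "e"): Plan(["d", "f", "g"])})},
)
print(follow(primary, set()))
print(follow(primary, {"b"}))
print(follow(primary, {"b", "e"}))
```

In this toy instance the nesting depth is 2, which matches the intuition that tolerating two crashes needs one level of backup per crash.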

Slide 13

Slide 13 text

Problem Formulation of MAPPCF
Given: a problem instance (shown in the figure) & the maximum number of crashes f.
Solution: paths (shown in the figure) & transition rules, s.t. all non-crashed agents eventually reach their destination, regardless of crashes (up to f).
Defined with a failure detector & an execution model.
Centralized planning followed by decentralized execution.
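The quantifier "regardless of crashes (up to f)" can be made concrete by enumerating crash patterns. The following is a rough sketch under simplifying assumptions that are not from the slides: a crash pattern is just a choice of at most f agents plus, for each, the index on its own path at which it crashes, and the check for a single pattern is delegated to a caller-supplied predicate. The names `crash_patterns`, `is_fault_tolerant`, and `all_survivors_arrive` are hypothetical.

```python
from itertools import combinations, product
from typing import Callable, Iterator


def crash_patterns(path_lengths: list[int], f: int) -> Iterator[dict[int, int]]:
    """Yield {agent: step at which it crashes} for every pattern with at most f crashes."""
    agents = range(len(path_lengths))
    for k in range(f + 1):
        for crashed in combinations(agents, k):
            # Every crashing agent may stop at any index of its own path.
            for steps in product(*(range(path_lengths[a]) for a in crashed)):
                yield dict(zip(crashed, steps))


def is_fault_tolerant(
    path_lengths: list[int],
    f: int,
    all_survivors_arrive: Callable[[dict[int, int]], bool],
) -> bool:
    """True if the (assumed) per-pattern check passes for every crash pattern."""
    return all(all_survivors_arrive(p) for p in crash_patterns(path_lengths, f))


# Two agents with paths of 3 and 2 vertices.
print(len(list(crash_patterns([3, 2], f=1))))
print(len(list(crash_patterns([3, 2], f=2))))
```

The number of patterns grows quickly with the number of agents and with f, which gives some intuition for why exhaustive checking does not scale.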

Slide 14

Slide 14 text

Failure Detectors
An oracle that tells the status of neighboring vertices, c.f. [Chandra+ JACM-96].
An agent queries a vertex and receives one of three responses:
1. no agent
2. non-crashed agent
3. crashed agent
Named FD: identifies which agent crashed.
Anonymous FD: unable to identify who crashed.
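The three responses and the named/anonymous distinction can be written down as a small oracle. This is a minimal sketch, assuming the world state is given as a map from vertices to agent names plus a set of crashed agents; it further assumes that the named FD reveals an identity only for crashed agents. The names `Status`, `named_fd`, and `anonymous_fd` are hypothetical.

```python
from __future__ import annotations
from enum import Enum


class Status(Enum):
    NO_AGENT = 1
    NON_CRASHED = 2
    CRASHED = 3


def named_fd(occupancy: dict[str, str], crashed: set[str], vertex: str) -> tuple[Status, str | None]:
    """Named FD: status of the queried vertex plus the name of a crashed agent, if any."""
    agent = occupancy.get(vertex)
    if agent is None:
        return Status.NO_AGENT, None
    if agent in crashed:
        return Status.CRASHED, agent
    return Status.NON_CRASHED, None


def anonymous_fd(occupancy: dict[str, str], crashed: set[str], vertex: str) -> Status:
    """Anonymous FD: same status, but the identity of the crashed agent is hidden."""
    status, _ = named_fd(occupancy, crashed, vertex)
    return status


occupancy = {"v1": "a1", "v2": "a2"}  # vertex -> agent currently on it
crashed = {"a2"}
print(named_fd(occupancy, crashed, "v2"))
print(anonymous_fd(occupancy, crashed, "v2"))
```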

Slide 15

Slide 15 text

Execution Models: how agents are scheduled at runtime
Synchronous model: all agents act simultaneously; solutions avoid collisions. Cf. MAPF: multi-agent pathfinding [Stern+ SOCS-19].
Sequential model (async): each agent acts spontaneously while locally avoiding collisions; solutions avoid deadlocks. Cf. offline time-independent MAPP [Okumura+ IJCAI-22].
Figure labels: solution; possible schedule.
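The difference between the two models can be illustrated with two tiny simulators, without crashes. This is a sketch under assumptions not stated on the slide: the synchronous check forbids both vertex collisions and swaps, agents wait at their last vertex, and in the sequential model an activated agent simply stays put when its next vertex is occupied. The function names are hypothetical.

```python
def run_synchronous(paths: list[list[str]]) -> bool:
    """All agents step simultaneously; return False on a vertex or swap collision."""
    def at(path: list[str], t: int) -> str:
        return path[min(t, len(path) - 1)]  # agents wait at their last vertex

    for t in range(max(len(p) for p in paths)):
        now = [at(p, t) for p in paths]
        if len(set(now)) < len(now):
            return False  # two agents on the same vertex
        if t > 0:
            prev = [at(p, t - 1) for p in paths]
            for i in range(len(paths)):
                for j in range(i + 1, len(paths)):
                    if now[i] == prev[j] and now[j] == prev[i]:
                        return False  # two agents swapped positions
    return True


def run_sequential(paths: list[list[str]], schedule: list[int]) -> bool:
    """Agents are activated one at a time; return True if all of them finish their paths."""
    idx = [0] * len(paths)
    for a in schedule:
        if idx[a] + 1 >= len(paths[a]):
            continue  # agent a is already at its destination
        nxt = paths[a][idx[a] + 1]
        occupied = {paths[b][idx[b]] for b in range(len(paths)) if b != a}
        if nxt not in occupied:  # local collision avoidance: move only if free
            idx[a] += 1
    return all(idx[a] == len(paths[a]) - 1 for a in range(len(paths)))


crossing = [["a", "b", "c"], ["d", "b", "e"]]  # both want vertex b at the same time
head_on = [["a", "b"], ["b", "a"]]             # agents face each other

print(run_synchronous(crossing), run_sequential(crossing, [0, 1, 1, 0, 1, 1]))
print(run_synchronous(head_on), run_sequential(head_on, [0, 1] * 10))
```

The crossing paths collide when executed synchronously yet finish under the given sequential schedule, while the head-on paths can never finish sequentially because each agent blocks the other, which is the kind of deadlock a sequential solution has to rule out.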

Slide 16

Slide 16 text

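Model Power Analyses
Abbreviations: SYN = synchronous model; SEQ = sequential model; NFD = named failure detector; AFD = anonymous FD.
Models compared (by their sets of solvable instances): SYN + NFD, SYN + AFD, SEQ + NFD, SEQ + AFD.
Strictly stronger (SYN+AFD vs. SEQ+AFD): some instances are solvable in SYN but unsolvable in SEQ.
Weakly stronger (SYN+NFD vs. SYN+AFD): the figure shows two alternative set diagrams for SYN+AFD and SYN+NFD, joined by "or".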

Slide 17

Slide 17 text

Computational Complexity
MAPPCF is computationally intractable:
1. finding solutions is NP-hard
2. verification is co-NP-complete
Both results hold regardless of FD types or execution models; the proofs are reductions from 3-SAT.

Slide 18

Slide 18 text

Solving MAPPCF
Proposal: decoupled crash faults resolution framework (DCRF)
1. find initial paths
2. identify unresolved events
3. compute backup path & update solution
4. back to step 2
Example* used in the walkthrough: synchronous model + named FD, maximum number of crashes f = 2.
*DCRF is applicable to other models
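The four steps above can be read as a simple control loop around an event queue. The skeleton below is only a sketch of that loop, not the authors' implementation: the four callables (`find_initial_paths`, `identify_unresolved_events`, `compute_backup`, `update_solution`) are hypothetical placeholders for the actual planning routines, events are assumed to be hashable, and returning `None` on failure is an assumption.

```python
from collections import deque


def dcrf(find_initial_paths, identify_unresolved_events, compute_backup, update_solution):
    """Skeleton of the resolve-events loop; returns a solution, or None on failure."""
    solution = find_initial_paths()                        # step 1
    if solution is None:
        return None
    queue = deque()
    resolved = set()
    while True:
        for event in identify_unresolved_events(solution):  # step 2
            if event not in resolved and event not in queue:
                queue.append(event)
        if not queue:
            return solution                                 # nothing left to resolve
        event = queue.popleft()
        backup = compute_backup(solution, event)            # step 3
        if backup is None:
            return None                                     # give up on this instance
        solution = update_solution(solution, event, backup)
        resolved.add(event)                                 # step 4: loop back to step 2


# Stub callables standing in for real planners (purely illustrative).
def find_initial():
    return {}

def identify(solution):
    return [e for e in ("e1", "e2") if e not in solution]

def backup_for(solution, event):
    return "backup-for-" + event

def update(solution, event, backup):
    return {**solution, event: backup}

print(dcrf(find_initial, identify, backup_for, update))
```

Keeping the loop separate from the planners reflects the "decoupled" idea in the framework's name: each unresolved event is handled by one backup-path computation at a time.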

Slide 19

Slide 19 text

How DCRF Solves MAPPCF: find initial paths

Slide 20

Slide 20 text

How DCRF Solves MAPPCF: identify unresolved events (figure: unresolved events queue)

Slide 21

Slide 21 text

How DCRF Solves MAPPCF: identify unresolved events (figure: unresolved events queue)

Slide 22

Slide 22 text

How DCRF Solves MAPPCF: identify unresolved events (figure: unresolved events queue)

Slide 23

Slide 23 text

How DCRF Solves MAPPCF: resolve event

Slide 24

Slide 24 text

How DCRF Solves MAPPCF: resolve event

Slide 25

Slide 25 text

How DCRF Solves MAPPCF: update solution

Slide 26

Slide 26 text

How DCRF Solves MAPPCF: identify unresolved events

Slide 27

Slide 27 text

How DCRF Solves MAPPCF: resolve event

Slide 28

Slide 28 text

How DCRF Solves MAPPCF: resolve event

Slide 29

Slide 29 text

How DCRF Solves MAPPCF: update solution

Slide 30

Slide 30 text

How DCRF Solves MAPPCF: identify unresolved events

Slide 31

Slide 31 text

How DCRF Solves MAPPCF: resolve event

Slide 32

Slide 32 text

How DCRF Solves MAPPCF: resolve event

Slide 33

Slide 33 text

How DCRF Solves MAPPCF: update solution

Slide 34

Slide 34 text

How DCRF Solves MAPPCF: identify unresolved events

Slide 35

Slide 35 text

How DCRF Solves MAPPCF: the unresolved events queue is empty => obtain solution.
DCRF is correct but incomplete.

Slide 36

Slide 36 text

Empirical Results: synchronous (SYN) vs. sequential (SEQ)
Setup: random-32-32-10, 32x32 (|V|=922), from [Stern+ SOCS-19]; 30sec timeout; with named FD.
Plot 1: success rate vs. #agents, fixed #crashes: f = 1; series SYN and SEQ.
Plot 2: success rate vs. #crashes f, fixed #agents: 15; series SYN and SEQ.
Solving MAPPCF becomes difficult with more {agents, crashes}.
SEQ is harder than SYN.

Slide 37

Slide 37 text

Empirical Results: vs. finding vertex disjoint paths (adapted from CBS [Sharon+ AIJ-15])
Setup: random-32-32-10, 32x32 (|V|=922), from [Stern+ SOCS-19]; 30sec timeout; with named FD; fixed #crashes: f = 1.
Plot 1: success rate vs. #agents; series DCRF/SYN and disjoint paths.
Plot 2: costs / lower bound (traveling time when no crashes) vs. #agents; series DCRF/SYN and disjoint paths.
MAPPCF provides a better solution concept than finding disjoint paths.

Slide 38

Slide 38 text

Concluding Remarks
MAPPCF: a novel path planning problem for multiple agents that may crash at runtime.
Figure labels: fault; solution (multiple paths assuming crashes).
Future directions: complete algorithms, optimization, other types of failure detectors.
https://kei18.github.io/mappcf