Koh Takeuchi (2, 1), Keisuke Fujii (3, 1), Yasuo Tabei (1)
(1) RIKEN AIP, (2) Kyoto Univ., (3) Nagoya Univ.
28th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, November 3–6, 2020, Seattle, Washington, USA (virtual)
trajectories are ubiquitous in research and industry
  ▹ Similarity search over a huge collection of trajectories is indispensable for turning these datasets into knowledge
• Our contribution
  ▹ Develop an efficient trajectory similarity search method
    – Powerful measure: Fréchet distance
    – Fast search: locality-sensitive hashing (LSH) + trie search algorithm
    – Scalability: compressed trie implementation using succinct data structures
  ▹ Experiments using huge real-world datasets
    – Demonstrate that our method outperforms state-of-the-art ones
an owner and his dog with a leash
  ▹ Both walk along their own trajectories at their own speeds, but neither can go backward
  ▹ The Fréchet distance is the minimum leash length needed for the walk
• The computation time is O(traj-length^2) by dynamic programming
[Figure label: max = Fréchet(owner, dog)]
The computational cost makes it difficult to design an efficient exact solution :(
LSH enables us to solve such difficult search problems quickly :)
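The quadratic-time dynamic program mentioned above can be sketched as follows. This is a minimal illustration that assumes the discrete Fréchet distance over 2D points with Euclidean ground distance; the slide does not say which variant is used, and the function name `discrete_frechet` is hypothetical.

```python
import math


def discrete_frechet(p, q):
    """Discrete Fréchet distance between two point sequences.

    Assumption: trajectories are non-empty lists of (x, y) tuples and the
    ground distance is Euclidean. Runs in O(len(p) * len(q)) time.
    """
    n, m = len(p), len(q)
    if n == 0 or m == 0:
        raise ValueError("trajectories must be non-empty")
    # dp[i][j] = shortest leash that lets the owner reach p[i] and the
    # dog reach q[j] without either of them stepping backward.
    dp = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(p[i], q[j])
            if i == 0 and j == 0:
                dp[i][j] = d
            elif i == 0:
                dp[i][j] = max(dp[0][j - 1], d)
            elif j == 0:
                dp[i][j] = max(dp[i - 1][0], d)
            else:
                # Either walker, or both, advanced by one point.
                best_prev = min(dp[i - 1][j], dp[i - 1][j - 1], dp[i][j - 1])
                dp[i][j] = max(best_prev, d)
    return dp[n - 1][m - 1]


owner = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
dog = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
leash = discrete_frechet(owner, dog)  # parallel paths one unit apart
```

The table has one cell per pair of points, which is where the quadratic cost comes from; running this against every trajectory in a large database is what makes exact search expensive.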
n sketches S1, S2, …, Sn
  ▹ Query sketch T
  ▹ Hamming distance threshold K
• Output
  ▹ All sketches Si whose Hamming distance to T is within K
    – i.e., { Si : Ham(Si, T) ≤ K }
• Issues :(
  ▹ Most existing methods are designed for binary sketches and are inefficient for integer ones
  ▹ Existing methods for integer sketches are memory-inefficient
We develop a novel similarity search method called tSTAT
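The problem statement above can be written down directly. The sketch below is illustrative only and is not the tSTAT implementation: `linear_search` applies the definition { Si : Ham(Si, T) ≤ K } by scanning every sketch, and `trie_search` shows, with a plain pointer-based trie (an assumption, not the compressed structure from the talk), how shared prefixes let a search abandon a whole subtree once more than K mismatches have accumulated. All function names are hypothetical, and all sketches are assumed to have the same length as the query.

```python
def hamming(a, b):
    """Number of positions at which two equal-length sketches differ."""
    if len(a) != len(b):
        raise ValueError("sketches must have the same length")
    return sum(x != y for x, y in zip(a, b))


def linear_search(sketches, query, k):
    """Ids of all sketches within Hamming distance k of the query."""
    return [i for i, s in enumerate(sketches) if hamming(s, query) <= k]


def build_trie(sketches):
    """Nested-dict trie; sketch ids are stored under the key None."""
    root = {}
    for i, s in enumerate(sketches):
        node = root
        for symbol in s:
            node = node.setdefault(symbol, {})
        node.setdefault(None, []).append(i)
    return root


def trie_search(root, query, k):
    """Same answer as linear_search, pruning prefixes with > k mismatches."""
    results = []
    stack = [(root, 0, 0)]  # (node, depth, mismatches so far)
    while stack:
        node, depth, errors = stack.pop()
        if depth == len(query):
            results.extend(node.get(None, []))
            continue
        for symbol, child in node.items():
            if symbol is None:
                continue
            e = errors + (symbol != query[depth])
            if e <= k:  # otherwise the whole subtree is skipped
                stack.append((child, depth + 1, e))
    return sorted(results)


sketches = [[0, 1, 2, 3], [0, 1, 2, 0], [3, 3, 3, 3], [0, 1, 2, 3]]
query = [0, 1, 2, 3]
hits = trie_search(build_trie(sketches), query, 1)
```

The nested dictionaries here use far more memory than the sketches themselves, which is the kind of overhead a compressed trie is meant to avoid.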
represented using directly addressable tables H
Tree navigation can be performed in O(1) time by Rank/Select queries over H
Proposed Method
H can be implemented as a succinct trit array in σ N_in log2 3 + o(N_in) bits of compressed space
Close to the theoretical lower bound on space
We developed an efficient implementation of the succinct trit array supporting Rank/Select
(σ: number of kinds of integers, N_in: number of inner nodes)
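To make the trit array and its Rank/Select operations concrete, here is a small, non-succinct sketch. It assumes a common packing trick, five trits per byte (3^5 = 243 ≤ 256), which costs 8/5 = 1.6 bits per trit against the lower bound of log2 3 ≈ 1.585; whether tSTAT packs trits this way is not stated on the slide. The class name `TritArray`, the per-byte rank samples, and the binary-search `select` are all illustrative assumptions; in particular, the samples take more than o(n) extra space and `select` is O(log n), not O(1).

```python
class TritArray:
    """Array over {0, 1, 2} packed five trits per byte (illustrative only)."""

    PER_BYTE = 5  # 3**5 = 243 fits in one byte

    def __init__(self, trits):
        self.n = len(trits)
        self.data = bytearray()
        # samples[c][b] = number of occurrences of c before byte b
        self.samples = [[0], [0], [0]]
        for start in range(0, self.n, self.PER_BYTE):
            chunk = trits[start:start + self.PER_BYTE]
            value = 0
            for t in reversed(chunk):  # chunk[0] is the lowest base-3 digit
                if t not in (0, 1, 2):
                    raise ValueError("trits must be 0, 1, or 2")
                value = value * 3 + t
            self.data.append(value)
            for c in range(3):
                self.samples[c].append(self.samples[c][-1] + chunk.count(c))

    def __getitem__(self, i):
        if not 0 <= i < self.n:
            raise IndexError(i)
        return (self.data[i // self.PER_BYTE] // 3 ** (i % self.PER_BYTE)) % 3

    def rank(self, c, i):
        """Number of occurrences of c among the first i trits."""
        if not 0 <= i <= self.n:
            raise IndexError(i)
        block = i // self.PER_BYTE
        count = self.samples[c][block]
        for j in range(block * self.PER_BYTE, i):
            count += self[j] == c
        return count

    def select(self, c, k):
        """Position of the k-th (1-based) occurrence of c."""
        if k < 1 or k > self.rank(c, self.n):
            raise ValueError("no such occurrence")
        lo, hi = 0, self.n - 1
        while lo < hi:  # smallest i with rank(c, i + 1) >= k
            mid = (lo + hi) // 2
            if self.rank(c, mid + 1) >= k:
                hi = mid
            else:
                lo = mid + 1
        return lo


trits = [1, 0, 2, 1, 1, 0, 2, 2, 1, 0, 1]
h = TritArray(trits)
```

Here `h.rank(1, 11)` counts the 1s in the whole array and `h.select(1, 3)` returns the position of the third 1; a trie laid out in such an array can use these two operations to move between a node and its children without storing pointers.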
games in the 2015/16 seasons
• Query set: 1000 trajectories randomly extracted from the dataset
• Competitors
  ▹ LS: strawman baseline with linear search (without any auxiliary data structure)
  ▹ HmSearch: state of the art in similarity search for integer sketches [SSDBM13]
  ▹ FRESH: state of the art in approximate trajectory similarity search [WADS19]
[Figure: memory usage (GiB), with Fréchet radii R set to find 1, 10, and 100 solutions on average per query]
• Memory usage: 17x smaller than FRESH, 10x smaller than HmSearch
[Figure: average search time (ms/query), with Fréchet radii R set to find 1, 10, and 100 solutions on average per query]
• Search time: 34x faster than FRESH, 12x faster than HmSearch
Proposed a novel similarity search method tSTAT
  ▹ Showed the efficiency through experiments using real-world datasets
• Example: for a short movement of Rajon Rondo in Kings vs. Thunder on Dec. 6, 2015 (database of 3.3 million trajectories)
  ▹ Query
    – Date: 12/06/2015, Match: SAC vs OKC, PlayerName: Rajon Rondo (No 9)
    – Q4 07:09.74 to Q4 07:00.29
  ▹ Result 1
    – Date: 10/31/2015, Match: NOP vs GSW, PlayerName: Toney Douglas (No 16)
    – Q3 00:36.15 to Q3 00:31.75, Distance: 0.363737
  ▹ Result 2
    – Date: 12/09/2015, Match: SAS vs TOR, PlayerName: Tim Duncan (No 21)
    – Q1 09:48.32 to Q1 09:43.59, Distance: 0.423995
  ▹ Result 3
    – Date: 01/12/2016, Match: PHX vs IND, PlayerName: P. J. Tucker (No 17)
    – Q4 06:20.51 to Q4 06:17.35, Distance: 0.395999