Slide 4
Problem Statement
• Sketch x of length m is an m-dimensional vector of non-negative integers
• We have a dataset X = {x1, x2, …, xn}, which is a dynamic set of n sketches
• Given sketch y and Hamming radius r as a query, we want to quickly find the similar sketches {xi : H(xi, y) ≤ r}
▹ H(∙, ∙) is the Hamming distance (i.e., the number of dimensions in which the two sketches differ)
Example (m = 6, n = 4)
Query: y = 111021, r = 1
Dataset X:
x1 = 111020   H(x1, y) = 1 ≤ r   similar
x2 = 001020   H(x2, y) = 3
x3 = 032021   H(x3, y) = 3
x4 = 113021   H(x4, y) = 1 ≤ r   similar
Figure labels: Generality, Dynamics
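
The problem statement can be sketched in a few lines of Python. This is only a naive linear-scan baseline, not the method the talk proposes; the names `hamming`, `search`, and `parse` are hypothetical and chosen for illustration. It reproduces the example above (y = 111021, r = 1).

```python
from typing import List, Sequence

Sketch = Sequence[int]  # an m-dimensional vector of non-negative integers


def hamming(x: Sketch, y: Sketch) -> int:
    """Number of dimensions in which x and y differ."""
    if len(x) != len(y):
        raise ValueError("sketches must have the same length m")
    return sum(a != b for a, b in zip(x, y))


def search(dataset: List[Sketch], y: Sketch, r: int) -> List[int]:
    """Return the 0-based indices i with H(dataset[i], y) <= r.

    Assumption: a plain linear scan, costing O(n * m) per query.
    """
    return [i for i, x in enumerate(dataset) if hamming(x, y) <= r]


def parse(s: str) -> List[int]:
    """Turn a digit string such as '111020' into a sketch (one digit per dimension)."""
    return [int(c) for c in s]


# Dataset and query taken from the slide's example.
X = [parse(s) for s in ("111020", "001020", "032021", "113021")]
y = parse("111021")

distances = [hamming(x, y) for x in X]
hits = search(X, y, r=1)
print(distances)  # [1, 3, 3, 1]
print(hits)       # [0, 3], i.e. x1 and x4
```

Because X is a plain list here, supporting a dynamic set is trivial (append or remove a sketch), but every query still scans all n sketches. That cost is what makes "quickly" the hard part of the problem.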