Slide 28
Count-Min Sketch: Probability
• The width determines the error rate:
A larger width means more counters to spread the counts across, so collisions are less likely to
inflate a counter and the error rate drops. If we are conservative and assume that a counter can
receive up to twice the average amount of collision noise, the error rate is: e = 2/w
• The depth determines the confidence in this error rate: 1 - (½)^d
A greater depth means more rows, which reduces the likelihood that all rows overestimate
simultaneously because of collisions. By Markov's inequality, the chance that a single row
overestimates by more than twice the average is at most 50%. With d independent rows, the chance
that every row does so is at most (½)^d.
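The two formulas above can be checked with a few lines of Python. This is a minimal sketch that assumes the slide's conservative bounds (an error rate of 2/w and a per-row failure chance of at most ½); the function names `error_rate` and `confidence` are illustrative and do not come from any library.

```python
from fractions import Fraction


def error_rate(width: int) -> Fraction:
    """Conservative error bound from the slide: e = 2 / w."""
    if width <= 0:
        raise ValueError("width must be positive")
    return Fraction(2, width)


def confidence(depth: int) -> Fraction:
    """Chance that at least one row stays within the error bound: 1 - (1/2)^d."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    return 1 - Fraction(1, 2) ** depth


# Illustrative dimensions (not taken from the slide): width 200, depth 4.
print(f"error rate: {float(error_rate(200)):.2%}")   # error rate: 1.00%
print(f"confidence: {float(confidence(4)):.2%}")     # confidence: 93.75%
```

Exact fractions are used so that the results are not blurred by floating-point rounding; converting to `float` only happens when printing.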
For a Sketch of 5/3 (width 5, depth 3):
• Error rate: 2/5 = 40%
• Confidence in this error rate: 1 - (½)^3 = 87.5%
87.5% of the time, the counter will overestimate
the true value by at most 40% of the total count
For a Sketch of 2000/10 (width 2000, depth 10):
• Error rate: 2/2000 = 0.1%
• Confidence in this error rate: 1 - (½)^10 ≈ 99.9%
About 99.9% of the time, the counter will overestimate
the true value by at most 0.1% of the total count
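To make the width and depth parameters concrete, here is a minimal Count-Min Sketch in Python. It is a sketch under stated assumptions, not a reference implementation: the class name `CountMinSketch`, the `total` field, and the choice of SHA-256 with the row number as a prefix (to simulate one hash function per row) are all illustrative choices.

```python
import hashlib


class CountMinSketch:
    """Toy Count-Min Sketch: `depth` rows of `width` counters each."""

    def __init__(self, width: int, depth: int) -> None:
        if width <= 0 or depth <= 0:
            raise ValueError("width and depth must be positive")
        self.width = width
        self.depth = depth
        self.total = 0  # total count added so far (the N in the error bound)
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, item: str, row: int) -> int:
        # Assumption: prefixing the row number gives a different hash per row.
        digest = hashlib.sha256(f"{row}:{item}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def add(self, item: str, count: int = 1) -> None:
        # Increment one counter in every row.
        for row in range(self.depth):
            self.rows[row][self._index(item, row)] += count
        self.total += count

    def estimate(self, item: str) -> int:
        # Collisions only ever add to a counter, so the smallest counter
        # across the rows is the least inflated estimate.
        return min(
            self.rows[row][self._index(item, row)] for row in range(self.depth)
        )


sketch = CountMinSketch(width=2000, depth=10)
for word in ["apple", "pear", "apple", "plum", "apple"]:
    sketch.add(word)

# The estimate never undercounts: "apple" was added 3 times.
assert sketch.estimate("apple") >= 3
```

Taking the minimum across rows is what ties the structure to the depth formula: the estimate is only off by more than the error bound when every one of the d rows is badly inflated at once.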