✘ PrestoとHyperLogLogで、大量ログからユニークユーザー数を高速に推定する(実践編) [Fast estimation of unique user counts from large logs with Presto and HyperLogLog (practical edition)]
✘ RedisのBitCountとHyperLogLogを使用した超高速Unique User数集計 [Ultra-fast unique user counting with Redis BitCount and HyperLogLog]
✘ An Improved Data Stream Summary: The Count-Min Sketch and its Applications
  ✗ The original Count-Min Sketch paper
✘ A Linear-Time Probabilistic Counting Algorithm for Database Applications
  ✗ The original Linear Counting paper
✘ Hyperloglog: The analysis of a near-optimal cardinality estimation algorithm
  ✗ The original HyperLogLog paper
✘ Hyperloglog: The analysis of a near-optimal cardinality estimation algorithm
  ✗ Google's improvements to HyperLogLog
✘ All-Distances Sketches, Revisited: HIP Estimators for Massive Graphs Analysis
✘ Hash functions: An empirical comparison - strchr.com
✘ Algebird & Approximation algorithms by Stephan H on Prezi
  ✗ An explanation tied to Algebird, Twitter's library
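# Count-Min Sketch size as a function of delta (x) and epsilon (y)
set palette model XYZ functions gray**0.35, gray**0.5, gray**0.8
set title "sketch size [GB]"
# x axis: delta, log scale
set xrange [1e-6:1e-1]
set logscale x 10
set xtics auto
set xlabel "delta"
# y axis: epsilon, log scale
set yrange [1e-9:1e-7]
set logscale y 10
set ytics auto
set ylabel "epsilon"
# width = ceil(e/epsilon), depth = ceil(ln(1/delta)); 2.718 approximates e
# the product is the number of counters, scaled by 1024^3
splot ceil(2.718/(y))*ceil(log(1/(x)))/(1024*1024*1024)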
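The quantity plotted by the gnuplot script can also be checked numerically for a single (epsilon, delta) pair. The following is a minimal Python sketch of the same formula, not part of the original slides. The function names `cms_dimensions` and `cms_size_gib` and the `bytes_per_counter` parameter are hypothetical, introduced only for illustration. The default of 1 byte per counter is an assumption chosen so that the result matches the plotted expression, which divides the raw counter count by 1024^3.

```python
import math


def cms_dimensions(epsilon: float, delta: float) -> tuple[int, int]:
    """Return (width, depth) of a Count-Min Sketch.

    width = ceil(e / epsilon), depth = ceil(ln(1 / delta)),
    the same expression used in the splot command.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if not 0 < delta < 1:
        raise ValueError("delta must be in (0, 1)")
    width = math.ceil(math.e / epsilon)
    depth = math.ceil(math.log(1.0 / delta))
    return width, depth


def cms_size_gib(epsilon: float, delta: float, bytes_per_counter: int = 1) -> float:
    """Sketch size in units of 1024^3 bytes.

    bytes_per_counter=1 is an assumption that reproduces the plotted value;
    use 4 or 8 for 32-bit or 64-bit counters.
    """
    width, depth = cms_dimensions(epsilon, delta)
    return width * depth * bytes_per_counter / 1024**3


if __name__ == "__main__":
    print(cms_dimensions(0.01, 0.01))
    print(cms_size_gib(1e-8, 1e-3))
    print(cms_size_gib(1e-8, 1e-3, bytes_per_counter=4))
```

With 32-bit counters the size is four times the plotted value, so the counter width matters as much as epsilon and delta when reading the graph.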