HyperLogLog in 15 minutes

Paul Mucur
November 28, 2018

A brief explanation of the HyperLogLog algorithm for estimating the cardinality of large sets given at the Drover Ruby Meetup on Wednesday, 28th November 2018.

Transcript

  1. animals = Set.new
     => #<Set: {}>
     animals << "dog"
     => #<Set: {"dog"}>
     animals << "dog"
     => #<Set: {"dog"}>
     animals << "cat"
     => #<Set: {"dog", "cat"}>
     animals.size
     => 2
  2-10. [Figure slides; the only text recovered is the numerals 1, 1, 1, 1, 1, 1 2, 2, 2, 2 5.]

  11. $P(0) = \frac{1}{2^{1}} = \frac{1}{2}$
      $P(1) = \frac{1}{2^{2}} = \frac{1}{4}$
      $P(2) = \frac{1}{2^{3}} = \frac{1}{8}$
      $\ldots$
      $P(n) = \frac{1}{2^{n+1}}$
  12. > PFADD tweets 1 2 3 4 5 6
      (integer) 1
      > PFCOUNT tweets
      (integer) 6
  13. Harmonic mean: $\frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \dots + \frac{1}{x_n}}$
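
The transcript preserves only fragments of the talk, so the following is a minimal Ruby sketch of how those fragments might fit together, not a reconstruction of the speaker's code and not how Redis implements `PFADD` and `PFCOUNT`. It assumes the probabilities on slide 11 describe a hash that starts with exactly n zero bits followed by a one, and that the harmonic mean on slide 13 is used to combine per-register estimates. The names `TinyHyperLogLog`, `index_bits` and `harmonic_mean`, and the choice of MD5 as the hash, are assumptions made for illustration. The bias constant 0.7213 / (1 + 1.079 / m) and the small-range fallback come from the published HyperLogLog algorithm.

```ruby
require "digest"

# Harmonic mean: n / (1/x1 + 1/x2 + ... + 1/xn).
def harmonic_mean(values)
  values.size.to_f / values.sum { |x| 1.0 / x }
end

# Toy HyperLogLog-style counter. Illustrative only; all names are hypothetical.
class TinyHyperLogLog
  HASH_BITS = 64

  def initialize(index_bits = 10)
    @index_bits = index_bits                   # leading hash bits that select a register
    @register_count = 1 << index_bits          # m = 2**index_bits registers
    @registers = Array.new(@register_count, 0)
  end

  def add(item)
    # First 64 bits of an MD5 digest (assumption: any well-mixed hash would do).
    hash = Digest::MD5.digest(item.to_s).unpack1("Q>")
    rest_bits = HASH_BITS - @index_bits
    index = hash >> rest_bits
    rest = hash & ((1 << rest_bits) - 1)
    # Rank = number of leading zero bits in the remainder, plus one.
    # A rank of n + 1 occurs with probability 1 / 2**(n + 1).
    rank = rest_bits - rest.bit_length + 1
    @registers[index] = rank if rank > @registers[index]
    self
  end

  def count
    m = @register_count
    alpha = 0.7213 / (1 + 1.079 / m)           # bias correction, valid for m >= 128
    # Raw estimate: alpha * m * harmonic mean of 2**register over all registers.
    estimate = alpha * m * harmonic_mean(@registers.map { |r| 2.0**r })
    empty = @registers.count(0)
    # Small-range correction: fall back to linear counting while many registers are empty.
    estimate = m * Math.log(m.to_f / empty) if estimate <= 2.5 * m && empty > 0
    estimate.round
  end
end

hll = TinyHyperLogLog.new
10_000.times { |i| hll.add("tweet-#{i}") }
puts hll.count  # an estimate near 10,000, not an exact count
```

With 1,024 registers the expected standard error is roughly 1.04 / sqrt(1024), about 3%, and the counter uses a fixed amount of memory no matter how many items are added. Adding the same item twice leaves the registers unchanged, which mirrors the `Set` behaviour on slide 1.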