HyperLogLog in 15 minutes

Paul Mucur
November 28, 2018

A brief explanation of the HyperLogLog algorithm for estimating the cardinality of large sets, given at the Drover Ruby Meetup on Wednesday, 28th November 2018.

Transcript

  1. animals = Set.new
    => #<Set: {}>
    animals << "dog"
    => #<Set: {"dog"}>
    animals << "dog"
    => #<Set: {"dog"}>
    animals << "cat"
    => #<Set: {"dog", "cat"}>
    animals.size
    => 2
  2. 1

  3. 1

  4. 1

  5. 1

  6. 1

  7. 1 2

  8. 2

  9. 2

  10. 2 5

  11. $P(0) = \frac{1}{2^{1}} = \frac{1}{2}$
    $P(1) = \frac{1}{2^{2}} = \frac{1}{4}$
    $P(2) = \frac{1}{2^{3}} = \frac{1}{8}$
    $\ldots$
    $P(n) = \frac{1}{2^{n+1}}$
  12. > PFADD tweets 1 2 3 4 5 6
    (integer) 1
    > PFCOUNT tweets
    (integer) 6
  13. Harmonic mean: $\frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \ldots + \frac{1}{x_n}}$
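
The slides only show fragments (an exact `Set`, the probabilities $1/2^{n+1}$, the Redis commands and the harmonic mean), so the following is a rough sketch of how those pieces might fit together in Ruby, not the implementation used in the talk or in Redis. The class name `TinyHyperLogLog`, the choice of MD5 as the hash, the default of 1024 registers, and the bias constant and small-range correction (taken from the commonly published description of the algorithm) are all assumptions made for illustration.

```ruby
require "digest"
require "set"

# Harmonic mean: n / (1/x1 + 1/x2 + ... + 1/xn).
def harmonic_mean(values)
  values.size / values.sum { |x| 1.0 / x }
end

# Hypothetical, minimal HyperLogLog sketch for illustration only.
class TinyHyperLogLog
  # index_bits selects the number of registers (2**index_bits).
  # The bias constant used in #count assumes at least 128 registers,
  # so index_bits should be 7 or more.
  def initialize(index_bits = 10)
    @index_bits = index_bits
    @register_count = 1 << index_bits
    @registers = Array.new(@register_count, 0)
  end

  def add(item)
    # Assumption: the first 8 bytes of an MD5 digest serve as a 64-bit hash.
    hash = Digest::MD5.digest(item.to_s).unpack1("Q>")

    # The top index_bits bits pick a register.
    index = hash >> (64 - @index_bits)

    # The remaining bits are scanned for leading zeros. A run of n zeros
    # followed by a one occurs with probability 1 / 2**(n + 1).
    rest_bits = 64 - @index_bits
    rest = hash & ((1 << rest_bits) - 1)
    rank = rest_bits - rest.bit_length + 1

    # Each register remembers the longest run it has seen.
    @registers[index] = rank if rank > @registers[index]
    self
  end

  def count
    m = @register_count
    alpha = 0.7213 / (1 + 1.079 / m)

    # Raw estimate: alpha * m * harmonic mean of 2**register.
    raw = alpha * m * harmonic_mean(@registers.map { |r| 2.0**r })

    # Small-range correction (linear counting) while registers are still empty.
    zeros = @registers.count(0)
    if raw <= 2.5 * m && zeros > 0
      (m * Math.log(m.to_f / zeros)).round
    else
      raw.round
    end
  end
end

# Usage: compare an exact Set with the estimate.
exact = Set.new
sketch = TinyHyperLogLog.new
50_000.times do |i|
  value = "user-#{i % 20_000}" # 20,000 distinct values, each seen more than once
  exact << value
  sketch.add(value)
end
puts "exact: #{exact.size}, estimated: #{sketch.count}"
```

The `Set` has to keep every distinct value, while the sketch keeps a fixed array of 1024 small integers regardless of how many items are added. The price is that the count is an estimate: with 1024 registers the typical relative error is a few percent. Adding a value that has already been seen never changes a register, which is why duplicates do not affect the result.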