Distinct Query using HyperLogLog

hama_du
October 05, 2018

Understanding the intuition behind HyperLogLog, using a distinct query as the example

Transcript

  1. Computing the hash values

    hash(AB) = 0x36f… = 0011 0110 1111 …
    hash(CD) = 0xc90… = 1100 1001 0000 …
    hash(EF) = 0x01e… = 0000 0001 1110 …
  2. How many 0s are at the front?

    zero(hash(AB)) = zero(0011 0110 1111…) = 2
    zero(hash(CD)) = zero(1100 1001 0000…) = 0
    zero(hash(EF)) = zero(0000 0001 1110…) = 7
  3. In other words…

    D = max(?, ?, …, 7, …, ?, ?)
    zero(hash(?)) = zero(0000 0001 …) = 7
  4. In other words…

    D = max(?, ?, …, 7, …, ?, ?)
    zero(hash(?)) = zero(0000 0001 …) = 7
    Quite rare!
  5. How rare?

    D = max(?, ?, …, 7, …, ?, ?)
    zero(hash(?)) = zero(0000 0001 …) = 7
    1/2^7 = 1/128
  6. Update with the larger value!

    hash(AB) = 0x36f… = 0011 0110 … 1010

    index: 0  1  …  9  10  11  …  14  15
    D:     2  1  …  0   2   0  …   0   0
  7. Estimating the number of elements

    Bucket values: 1 1 2 4
    C × 4 × [4 / (1/2^2 + 1/2^1 + 1/2^1 + 1/2^4)]
    The bracketed term is the cardinality per bucket.
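
The steps in slides 1 to 7 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the author's implementation:

- The slides do not name the hash function, so `hash32` (a hypothetical helper) takes the first 32 bits of SHA-256 and will not reproduce the slide values `0x36f…`, `0xc90…`, `0x01e…`.
- Picking the bucket from the last four bits is inferred from the `… 1010` in slide 6.
- The constant `C` is not given in the transcript, so `c=1.0` is only a placeholder.
- The transcript does not show whether the exponents in slide 7 are the stored values or the stored values plus one, so `estimate` simply takes the exponents as input.

```python
import hashlib

HASH_BITS = 32                  # assumed hash width; the slides elide it
INDEX_BITS = 4                  # low 4 bits pick one of 16 buckets (slide 6)
NUM_BUCKETS = 1 << INDEX_BITS


def hash32(key: str) -> int:
    """Stand-in hash (assumption): the first 32 bits of SHA-256(key)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def zero(value: int, width: int = HASH_BITS) -> int:
    """Count the leading 0 bits of `value` viewed as a `width`-bit word (slide 2)."""
    if not 0 <= value < (1 << width):
        raise ValueError("value does not fit in `width` bits")
    return width - value.bit_length()


def update(D: list, value: int, width: int = HASH_BITS) -> int:
    """Keep the larger leading-zero count in the bucket chosen by the low bits (slide 6)."""
    index = value & (NUM_BUCKETS - 1)
    D[index] = max(D[index], zero(value, width))
    return index


def estimate(exponents, c: float = 1.0) -> float:
    """Slide 7: C * m * (m / sum(1 / 2^e)). c=1.0 is a placeholder, not a tuned constant."""
    m = len(exponents)
    per_bucket = m / sum(2.0 ** -e for e in exponents)
    return c * m * per_bucket


# Slide 2, using the 12-bit prefixes printed on the slide.
print([zero(v, 12) for v in (0x36F, 0xC90, 0x01E)])   # [2, 0, 7]

# Slide 6, with 0x36FA as a made-up 16-bit stand-in for 0011 0110 … 1010.
demo = [0] * NUM_BUCKETS
print(update(demo, 0x36FA, 16), demo[10])             # 10 2

# Slide 7, with the exponents from the slide's denominator.
print(estimate([2, 1, 1, 4]))                         # about 12.19 when c = 1

# Distinct query: adding the same key again never changes D.
D = [0] * NUM_BUCKETS
for key in ("AB", "CD", "EF", "AB"):
    update(D, hash32(key))
```

Because each bucket only stores a maximum, inserting a duplicate key leaves `D` unchanged, which is why the structure counts distinct elements rather than total insertions.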