Using Monoids for Business Metrics

Presentation given at TW Geeknight Chennai, October '17.
Video can be found at https://www.youtube.com/watch?v=RJepu3sbmkU

Ashwanth Kumar

October 26, 2017

Transcript

  1. Uniques @ scale

    Diagram: Host 1, Host 2, Host 3, Host 4. Reduce at individual hosts.
  2. Uniques @ scale

    Diagram: Master Host.
    * Not the right way to perform uniques, but the simplest.
  3. Wouldn’t it be nice if we could express both kinds of aggregation in a consistent way?
  4. Monoid. An operation ( . ) is considered a monoid if:

    (x . y) . z = x . (y . z)          (associativity, aka semigroup)
    identity . x = x . identity = x    (identity)

    trait Semigroup[T] { def plus(left: T, right: T): T }
    trait Monoid[T] extends Semigroup[T] { def zero: T }
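To make the two laws concrete, here is a minimal sketch of two instances of the traits from the slide: a sum over `Long` and a set union (exact uniques). The names `MonoidSketch`, `longSum`, `setUnion` and `combineAll` are illustrative assumptions, not taken from the talk.

```scala
trait Semigroup[T] { def plus(left: T, right: T): T }
trait Monoid[T] extends Semigroup[T] { def zero: T }

// Hypothetical helper object; names are for illustration only.
object MonoidSketch {
  // Sum of Longs: addition is associative and 0 is its identity.
  val longSum: Monoid[Long] = new Monoid[Long] {
    def zero: Long = 0L
    def plus(left: Long, right: Long): Long = left + right
  }

  // Set union: associative, with the empty set as identity (exact uniques).
  def setUnion[A]: Monoid[Set[A]] = new Monoid[Set[A]] {
    def zero: Set[A] = Set.empty[A]
    def plus(left: Set[A], right: Set[A]): Set[A] = left union right
  }

  // Fold any sequence of partial results using only zero and plus.
  def combineAll[T](parts: Seq[T])(m: Monoid[T]): T =
    parts.foldLeft(m.zero)(m.plus)
}
```

Because `combineAll` relies only on `zero` and `plus`, the same function can combine per-host sums and per-host unique sets, which is the "consistent way" the previous slide asks for.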
  5. Approx. monoids

    ➔ While a sum can be computed in constant memory, distinct counts cannot.
    ➔ Approximate structures like HyperLogLog can find unique counts in constant memory (under a known error bound).
    ➔ Two or more HLLs can be merged, and HLLs under this merge operation form a monoid.
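The merge property can be sketched with a deliberately small HyperLogLog. This is a simplified illustration under stated assumptions (32-bit `MurmurHash3` hashes, no large-range correction, hypothetical names `Hll` and `hllMonoid`), not the implementation used in the talk; a production system would normally use a tested library.

```scala
import scala.util.hashing.MurmurHash3

trait Semigroup[T] { def plus(left: T, right: T): T }
trait Monoid[T] extends Semigroup[T] { def zero: T }

object HllSketch {
  // Simplified HyperLogLog with 2^p registers (assumption: 4 <= p <= 16).
  final case class Hll(p: Int, registers: Vector[Int]) {
    require(p >= 4 && p <= 16, "p must be between 4 and 16")
    private def m: Int = 1 << p

    def add(item: String): Hll = {
      val h = MurmurHash3.stringHash(item)   // 32-bit hash of the item
      val idx = h >>> (32 - p)               // top p bits select a register
      val rest = h << p                      // remaining bits, left-aligned
      // Rank = position of the first 1-bit in the remaining bits.
      val rank = math.min(Integer.numberOfLeadingZeros(rest) + 1, 32 - p + 1)
      if (rank > registers(idx)) copy(registers = registers.updated(idx, rank)) else this
    }

    // Raw estimate with the usual small-range (linear counting) correction.
    def estimate: Double = {
      val alpha = 0.7213 / (1 + 1.079 / m)   // constant intended for m >= 128
      val raw = alpha * m * m / registers.map(r => math.pow(2.0, -r)).sum
      val zeros = registers.count(_ == 0)
      if (raw <= 2.5 * m && zeros > 0) m * math.log(m.toDouble / zeros) else raw
    }
  }

  // Merge = register-wise max; the all-zero sketch is the identity.
  def hllMonoid(p: Int): Monoid[Hll] = new Monoid[Hll] {
    def zero: Hll = Hll(p, Vector.fill(1 << p)(0))
    def plus(left: Hll, right: Hll): Hll = {
      require(left.p == right.p, "sketches must use the same precision")
      Hll(left.p, left.registers.zip(right.registers).map { case (a, b) => math.max(a, b) })
    }
  }
}
```

Since each register only ever keeps a maximum, merging the sketches built on two hosts yields exactly the registers that a single sketch over all the items would hold, so overlapping items are not double counted.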
  6. Uniques @ scale

    Diagram: Host 1, Host 2, Host 3, Host 4. Reduce at individual hosts using HLL. Scatter.
  7. Distributed abel architecture

    Nodes: 1.1.1.1, 1.1.1.2, 1.1.1.3
    DNS based Load Balancing: A stats.service.ix 1.1.1.1 1.1.1.2 1.1.1.3
    Metrics: Count(“a”, 1L), Unique(“a”, 1L), Count(“c”, 1L), Unique(“a”, 1L), Count(“b”, 1L)
    Aggregation: monoid.plus (shown three times in the diagram)
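The aggregation in the diagram can be sketched as follows. This is not abel's actual API: `Stats`, `count`, `unique` and `statsMonoid` are hypothetical names, and uniques are tracked with exact sets here to keep the example short, where the talk suggests an approximate structure such as HLL.

```scala
trait Semigroup[T] { def plus(left: T, right: T): T }
trait Monoid[T] extends Semigroup[T] { def zero: T }

// Hypothetical model of per-node state; not taken from abel itself.
object NodeMergeSketch {
  final case class Stats(counts: Map[String, Long], uniques: Map[String, Set[Long]])

  // Merge two maps key by key, combining values that share a key with f.
  private def mergeMaps[V](l: Map[String, V], r: Map[String, V])(f: (V, V) => V): Map[String, V] =
    r.foldLeft(l) { case (acc, (k, v)) =>
      acc.updated(k, acc.get(k).map(f(_, v)).getOrElse(v))
    }

  // Counts add up; unique sets are unioned. Empty maps are the identity.
  val statsMonoid: Monoid[Stats] = new Monoid[Stats] {
    def zero: Stats = Stats(Map.empty, Map.empty)
    def plus(left: Stats, right: Stats): Stats =
      Stats(
        mergeMaps(left.counts, right.counts)(_ + _),
        mergeMaps(left.uniques, right.uniques)(_ union _)
      )
  }

  // Single-event constructors mirroring Count(key, n) and Unique(key, value) on the slide.
  def count(key: String, n: Long): Stats = Stats(Map(key -> n), Map.empty)
  def unique(key: String, value: Long): Stats = Stats(Map.empty, Map(key -> Set(value)))
}
```

Because `plus` is associative, the order in which the load balancer spreads events across nodes, or in which node results are later combined, does not change the final aggregate.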