Using Monoids for Business Metrics

Presentation given at TW Geeknight Chennai, October '17.
Video can be found at https://www.youtube.com/watch?v=RJepu3sbmkU

Ashwanth Kumar

October 26, 2017

Transcript

  1. Uniques @ scale: Host 1, Host 2, Host 3, Host 4. Reduce at individual hosts.
  2. Uniques @ scale: Master Host. (* Not the right way to perform uniques, but the simplest.)
  3. Wouldn’t it be nice if we could express both kinds of aggregation in a consistent way?
  4. monoid: An operation ( . ) is considered a monoid if:

    (x . y) . z = x . (y . z)          (associativity, aka semigroup)
    identity . x = x . identity = x    (identity)

        trait Semigroup[T] { def plus(left: T, right: T): T }
        trait Monoid[T] extends Semigroup[T] { def zero: T }
  5. approx. monoids

    ➔ While a sum can be computed in constant memory, distinct counts cannot.
    ➔ Approximate structures like HyperLogLog can estimate unique counts in constant memory (within a known error bound).
    ➔ Two or more HLLs can be merged, and the merge operation forms a monoid.
  6. Uniques @ scale: Host 1, Host 2, Host 3, Host 4. Reduce at individual hosts using HLL. (Diagram label: Scatter.)
  7. distributed abel architecture

    [Diagram] DNS-based load balancing: the A record for stats.service.ix resolves to 1.1.1.1, 1.1.1.2 and 1.1.1.3. Metrics such as Count(“a”, 1L), Unique(“a”, 1L), Count(“c”, 1L) and Count(“b”, 1L) arrive at the hosts, and each host combines them with monoid.plus.
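
The monoid traits from slide 4, the mergeable HLL from slide 5 and the per-host monoid.plus from slide 7 can be sketched together as below. This is a minimal illustration under stated assumptions, not the code from the talk or from abel. The names MetricsSketch, countMonoid, uniqueMonoid, mapMonoid, sketchOf, estimate and combineAll are hypothetical. The sketch is a simplified HyperLogLog (32-bit MurmurHash3 hash, 1024 registers, small-range correction only) and is not tuned for production use.

```scala
import scala.util.hashing.MurmurHash3

// Same shape as the traits on slide 4.
trait Semigroup[T] { def plus(left: T, right: T): T }
trait Monoid[T] extends Semigroup[T] { def zero: T }

object MetricsSketch {
  // Counts: plus is addition, zero is 0.
  val countMonoid: Monoid[Long] = new Monoid[Long] {
    def zero: Long = 0L
    def plus(left: Long, right: Long): Long = left + right
  }

  // Simplified HyperLogLog-style sketch (an assumption, not abel's code):
  // 2^P registers, each holding the largest rank seen for its bucket.
  val P = 10
  val M = 1 << P
  type Sketch = Vector[Int]

  def sketchOf(items: Iterable[String]): Sketch = {
    val regs = Array.fill(M)(0)
    items.foreach { item =>
      val h = MurmurHash3.stringHash(item)
      val idx = h >>> (32 - P) // top P bits pick the register
      val rest = h << P        // remaining bits, left-aligned
      // rank = position of the first 1-bit in the remaining bits
      val rank = math.min(Integer.numberOfLeadingZeros(rest), 32 - P) + 1
      if (rank > regs(idx)) regs(idx) = rank
    }
    regs.toVector
  }

  // Merging is a register-wise max; the all-zero sketch is the identity.
  val uniqueMonoid: Monoid[Sketch] = new Monoid[Sketch] {
    def zero: Sketch = Vector.fill(M)(0)
    def plus(left: Sketch, right: Sketch): Sketch =
      left.zip(right).map { case (a, b) => math.max(a, b) }
  }

  // HyperLogLog estimate with linear counting for small cardinalities.
  def estimate(s: Sketch): Double = {
    val alpha = 0.7213 / (1 + 1.079 / M)
    val raw = alpha * M * M / s.map(r => math.pow(2.0, -r)).sum
    val zeros = s.count(_ == 0)
    if (raw <= 2.5 * M && zeros > 0) M * math.log(M.toDouble / zeros) else raw
  }

  // Per-key metrics: merge two maps by combining values that share a key.
  def mapMonoid[K, V](v: Monoid[V]): Monoid[Map[K, V]] = new Monoid[Map[K, V]] {
    def zero: Map[K, V] = Map.empty[K, V]
    def plus(left: Map[K, V], right: Map[K, V]): Map[K, V] =
      right.foldLeft(left) { case (acc, (k, x)) =>
        acc.updated(k, v.plus(acc.getOrElse(k, v.zero), x))
      }
  }

  // One generic reducer serves both exact and approximate aggregations.
  def combineAll[T](values: Seq[T])(m: Monoid[T]): T =
    values.foldLeft(m.zero)(m.plus)
}
```

Each host would call sketchOf (or keep a running plus) on its own events and ship only the small partial result. A master can then run combineAll over the partial counts with countMonoid and over the partial sketches with uniqueMonoid, in any grouping, because both operations are associative. Wrapping either monoid in mapMonoid gives per-key metrics in the style of Count(“a”, 1L) and Unique(“a”, 1L).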