Using Monoids for Large Scale Business Stats

Talk given at HackersAvenue (October 2017), Data Engineering.

References for the presentation can be found at https://github.com/ashwanthkumar/large-scale-business-stats-talk

Ashwanth Kumar

October 07, 2017

Transcript

  1. ashwanth kumar, @_ashwanthkumar, principal engineer: using monoids for large scale business stats
  2. overview: - Stats for Batch Jobs - Stats for Streaming Workloads - Generalizing aggregations with Monoids - Abel - Some cool logos!
  3. None
  4. but stats for MR jobs?

  5. None
  6. stats as map-reduce jobs: Used Scalding (from Twitter). + Simple to express aggregations

  7. stats as map-reduce jobs: Used Scalding (from Twitter). + Simple to express aggregations - Have to include intermediary data in the output just for stats

  8. stats as map-reduce jobs: Used Scalding (from Twitter). + Simple to express aggregations - Have to include intermediary data in the output just for stats - Have to think about writing stats after writing production code

  9. stats as map-reduce jobs: Used Scalding (from Twitter). + Simple to express aggregations - Have to include intermediary data in the output just for stats - Have to think about writing stats after writing production code - Not updated at least till the next run (not “realtime”)
  10. but what about services running forever?

  11. (riemann.io)

  12. Service A Service B Service C

  13. stats for streaming workloads: Push computed rollups to InfluxDB. + Realtime

  14. stats for streaming workloads: Push computed rollups to InfluxDB. + Realtime + Allows arbitrary functions as rollups (code as config)

  15. stats for streaming workloads: Push computed rollups to InfluxDB. + Realtime + Allows arbitrary functions as rollups (code as config) - Not distributable, since it allows arbitrary functions

  16. stats for streaming workloads: Push computed rollups to InfluxDB. + Realtime + Allows arbitrary functions as rollups (code as config) - Not distributable, since it allows arbitrary functions - Stats emission and rollups are at separate places, making it difficult to test and keep them in sync
  17. lessons so far

  18. stats for map-reduce jobs: - Have to include intermediary data in the output just for stats - Have to think about writing stats after writing production code - Not updated at least till the next run (not “realtime”). stats for streaming workloads: - Not distributable, since it allows arbitrary functions - Stats emission and rollups are at separate places, making it difficult to test and keep them in sync
  19. what do we really want in stats?

  20. aggregations distribution / parallelism real time

  21. aggregates as monoids

  22. aggregates as monoids

  23. monoids: An operation is considered a monoid if: (x . y) . z = x . (y . z) (associativity, aka semigroup) and identity . x = x . identity = x (identity). trait Semigroup[T] { def plus(left: T, right: T): T } trait Monoid[T] extends Semigroup[T] { def zero: T }
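The slide's traits can be made concrete with a small, self-contained sketch: an integer-sum monoid (`IntSum` is a name invented here for illustration, not from the talk):

```scala
// Semigroup/Monoid exactly as defined on the slide.
trait Semigroup[T] { def plus(left: T, right: T): T }
trait Monoid[T] extends Semigroup[T] { def zero: T }

// Integer addition: plus is associative, and 0 is the identity.
object IntSum extends Monoid[Int] {
  def zero: Int = 0
  def plus(left: Int, right: Int): Int = left + right
}

// Folding any chunking of the data against zero gives the same answer.
val total = List(3, 4, 7).foldLeft(IntSum.zero)(IntSum.plus) // total == 14
```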
  24. monoids: A monoid can also be commutative: x . y = y . x. The commutative property of monoids is used for parallel processing on large datasets.
  25. monoids - count / sum: sum is associative. sum(sum(2, 6), 6) == sum(2, sum(6, 6)); sum(8, 6) == sum(2, 12); 14 == 14
  26. monoids - average: The average of an average is not an average, i.e. not associative. avg(avg(2, 6), 6) != avg(2, avg(6, 6)); avg(4, 6) != avg(2, 6); 5 != 4
  27. monoids - average: But average can be made associative if we keep total & count individually. case class Average(total: Double, count: Long) { def toAvg: Double = total / count }
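The slide's `Average` becomes a monoid once a `plus` and an identity are added (a sketch following the `Semigroup`/`Monoid` traits from the earlier slide):

```scala
// Keep (total, count) and only divide at read time; plus is then
// associative and commutative, unlike averaging averages directly.
case class Average(total: Double, count: Long) {
  def plus(another: Average): Average =
    Average(total + another.total, count + another.count)
  def toAvg: Double = total / count
}
val zero = Average(0.0, 0L) // identity element

// avg of 2, 6, 6: any grouping of plus yields Average(14.0, 3)
val grouped1 = Average(2, 1).plus(Average(6, 1)).plus(Average(6, 1))
val grouped2 = Average(2, 1).plus(Average(6, 1).plus(Average(6, 1)))
// grouped1 == grouped2, and both .toAvg give 14.0 / 3
```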
  28. monoids : themes

  29. monoids : parallelism

  30. parallel aggregations (diagram): the input is split into three partitions, each reduced independently to Σ A, Σ B and Σ C; Σ = Σ A + Σ B + Σ C
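What the diagram shows can be written out directly; `grouped` stands in for the partitioning (the partition size of 4 is arbitrary here):

```scala
// Each partition reduces independently; the partial sums combine in
// any order because + on Int is associative and commutative.
val data = Vector(3, 4, 7, 2, 1, 3, 8, 7, 5, 1)
val partials = data.grouped(4).map(_.sum).toVector // Σ_A, Σ_B, Σ_C
val sigma = partials.sum                           // Σ = Σ_A + Σ_B + Σ_C
// sigma == data.sum == 41
```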
  31. monoids : approximates

  32. monoids - approximates: - While sum is accurate, distinct counts in constant memory are not

  33. monoids - approximates: - While sum is accurate, distinct counts in constant memory are not - Approximate structures like HyperLogLog can find unique counts in constant memory (and with a known error bound)

  34. monoids - approximates: - While sum is accurate, distinct counts in constant memory are not - Approximate structures like HyperLogLog can find unique counts in constant memory (and with a known error bound) - Two HLLs can be merged, and their merge is both associative and commutative, so it can be expressed as a monoid
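The talk uses twitter/algebird's HLL; as a self-contained illustration of why the merge is a monoid, here is a toy register-based sketch (not algebird's API, and omitting the estimation formula and bias corrections a real HLL applies):

```scala
import scala.util.hashing.MurmurHash3

// Toy HyperLogLog: 64 registers, each remembering the maximum "rank"
// (position of the first 1-bit in the hash) observed. merge is an
// element-wise max, which is associative, commutative and idempotent:
// a commutative monoid.
case class HLL(registers: Vector[Int]) {
  private val p = 6 // 2^6 = 64 registers

  def add(item: String): HLL = {
    val h = MurmurHash3.stringHash(item)
    val idx = h >>> (32 - p)                  // top p bits pick a register
    val rank = Integer.numberOfLeadingZeros(h << p) + 1
    HLL(registers.updated(idx, math.max(registers(idx), rank)))
  }

  def merge(other: HLL): HLL =
    HLL(registers.zip(other.registers).map { case (a, b) => math.max(a, b) })
}

object HLL {
  val zero: HLL = HLL(Vector.fill(64)(0)) // identity for merge
}
```

Because merge is just element-wise max, sketches built on different machines can be combined in any order and any grouping, which is exactly the property the slides exploit.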
  35. approximate stats now are better than accurate stats tomorrow

  36. learnings so far: - Stats can naturally be expressed as Monoids

  37. learnings so far: - Stats can naturally be expressed as Monoids - Since Monoids are associative (and some are also commutative), we can exploit them for massively parallel processing

  38. learnings so far: - Stats can naturally be expressed as Monoids - Since Monoids are associative (and some are also commutative), we can exploit them for massively parallel processing - We need our stats to be real time, even if they’re approximate for some metrics, as long as the error bounds are known
  39. abel

  40. abel: Written in Scala. Backed by RocksDB. Uses twitter/algebird for HLL. Uses Kafka for stats delivery. Consumes stats in (near) realtime. Exposes aggregations over HTTP. Crunches 1M events in less than 15 seconds on 1 machine.
  41. abel data flow (diagram): events such as Count(“a”, 1L), Unique(“a”, 1L), Count(“c”, 1L), Unique(“a”, 1L), Count(“b”, 1L) flow into stats.service.ix
  42. abel: internals

  43. abel internals: Metric = Key * Aggregate (Semigroup). case class Metric[T <: Aggregate[T]](key: Key, value: T with Aggregate[T])
  44. abel internals: trait Aggregate[T <: Aggregate[_]] { self: T => def plus(another: T): T; def show: JsValue }
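A minimal aggregate against that trait might look like this (a sketch: `show` returns a String instead of `JsValue` so the example has no JSON dependency, the slide's `Aggregate[_]` bound is tightened to `Aggregate[T]`, and `Count` is an illustrative name):

```scala
// The trait from the slide, with show simplified to String.
trait Aggregate[T <: Aggregate[T]] { self: T =>
  def plus(another: T): T
  def show: String
}

// A counter is the simplest aggregate: plus is just addition.
case class Count(value: Long) extends Aggregate[Count] {
  def plus(another: Count): Count = Count(value + another.value)
  def show: String = value.toString
}
```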
  45. abel internals: Key = Name * Tags * Granularity * Timestamp. case class Time(time: Long, granularity: Long); case class Key(name: String, tags: SortedSet[String], time: Time = Time.Forever)
  46. abel internals: client.send(Metric(Key(name = "unique-ups-per-hour", tags = SortedSet("site:www.amazon.com"), time = Time.ThisHour), UniqueCount("825633348769")))
  47. abel internals: Let’s find the unique count of a UPC occurring per site and across all sites, at the granularities of every hour, every day and overall. That would need 6 metrics per record.
  48. abel internals: client.send(Metrics("unique-upcs", (tag("site:www.amazon.com") | `#`) * (perhour | perday | forever) * now, UniqueCount("825633348769")))
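The DSL call above fans one event out into (site, all-sites) x (hour, day, forever) key combinations; the cross product can be sketched in plain Scala (the strings here are placeholders, not abel's real key encoding):

```scala
// 2 site dimensions x 3 time granularities = 6 metric keys per record.
val sites  = List("site:www.amazon.com", "#") // this site, and all sites
val grains = List("perhour", "perday", "forever")
val keys   = for (s <- sites; g <- grains) yield (s, g)
// keys.size == 6; one UniqueCount("825633348769") is emitted per key
```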
  49. abel: distributed (experimental) made possible with suuchi (github.com/ashwanthkumar/suuchi)

  50. distributed abel: - Peer-to-Peer system built using Suuchi - Kafka consumer auto-rebalances the partitions across instances - Uses the scatter-gather primitive in Suuchi to perform query-time reductions of the metrics before serving them to users
  51. distributed abel architecture (diagram): DNS-based load balancing resolves stats.service.ix to 1.1.1.1, 1.1.1.2, 1.1.1.3; incoming stats (Count(“a”, 1L), Unique(“a”, 1L), Count(“c”, 1L), Unique(“a”, 1L), Count(“b”, 1L)) are merged on each node via monoid.plus
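The query-time reduction described above is just a fold of the partial aggregates returned by each node with the monoid's plus (a sketch; `gather` is a name invented here, not Suuchi's API):

```scala
// Scatter: ask every node for its partial aggregate for a key.
// Gather: reduce the replies with the monoid's associative plus;
// the result is independent of the order the replies arrive in.
def gather[T](partials: Seq[T])(plus: (T, T) => T): T =
  partials.reduce(plus)

// e.g. three nodes each return a partial count for the same key
val answer = gather(Seq(3L, 5L, 7L))(_ + _) // answer == 15
```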
  52. twitter/algebird ashwanthkumar/suuchi

  53. questions? https://github.com/ashwanthkumar/large-scale-business-stats-talk

  54. -- . - .-

  55. suuchi toolkit for building distributed function shipping applications github.com/ashwanthkumar/suuchi

  56. rocksdb

  57. rocksdb: Open sourced by Facebook. Fast persistent KV store. Server workloads. Embeddable. Optimized for SSDs. Fork of LevelDB. Modelled after BigTable. LSM-tree based. SST files. Written in C++.
  58. rocksdb: Simple C++ API. Has bindings in C, Java, Go, Python.
  59. rocksdb @indix

  60. rocksdb @indix: - Serving our API in production for 3+ years - Search on hierarchical documents (dynamic fields didn’t scale well on Solr); Brand / Store / Category counts for a filter - Price History Service: more than a billion prices, served online to REST queries
  61. rocksdb @indix: - Stats (as Monoids) storage system: all we wanted was approximate aggregates in real time - HTML archive system: stores ~120TB of URL- and timestamp-indexed HTML pages - Real-time scheduler for our crawlers: finds out which 20 URLs to crawl now out of 3+ billion URLs; helps the crawler crawl 20+ million URLs every day
  62. recursive reduction

  63. recursive reduction: - sum / multiplication - (sorted) top-K elements - operations on a graph, e.g. link reach on the Twitter graph - the function should be associative and optionally commutative
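For instance, sorted top-K is associative and commutative, so it reduces recursively across partitions (a sketch; `topK` is a name invented here):

```scala
// Merge two already-reduced top-k lists and keep the k largest.
// Associative and commutative: any reduction tree gives the same result.
def topK(k: Int)(left: List[Int], right: List[Int]): List[Int] =
  (left ++ right).sorted(Ordering[Int].reverse).take(k)

val merge = topK(3) _
val result = merge(merge(List(9, 4, 1), List(7, 3)), List(8, 2)) // List(9, 8, 7)
```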
  64. EOF