Summingbird: Streaming MapReduce at Twitter

Summingbird: Streaming MapReduce at Twitter

Summingbird is a platform for streaming map/reduce used at Twitter to build aggregations in real-time or on hadoop. When the programmer describes her job, that job can be run without change on Storm or Hadoop. Additionally, summingbird can manage merging realtime/online computations with offline batches so that small errors in real-time do not accumulate. Put another way, summingbird gives eventual consistency in a manner that is easy for the programmer to reason about.


Sam Ritchie

June 20, 2013