Rise of the Real-time Stack

Druid
May 06, 2014


Data Collective and Metamarkets hosted leading technologists in the open source world for drinks, food and conversation about the rise of real-time processing systems, the current limitations of the data engineering space, the value of open source solutions, and where real-time technology is headed.


Transcript

  1. The Rise of the Real-Time Stack: An Evening with Open Source Technologists
     Storm, Druid, Kafka, Spark
     @dcvc @metamarkets @foundersden
  2. Opening Remarks: Mike Driscoll & Fangjin Yang, Metamarkets
  3. ENGINEERING AT MMX: A BRIEF HISTORY (2013): Event Streams → Hadoop → Insight
  4. ENGINEERING AT MMX: A BRIEF HISTORY (2013): Event Streams → Hadoop → Hadoop → Druid → Insight
  5. ENGINEERING AT MMX: A BRIEF HISTORY (2013): Event Streams → Kafka → Storm / MR (Spark/Hadoop) → Druid → Insight
  6. Next: Andy Feng, Yahoo! (STORM)
  7. Apache Storm: Distributed Real-time Computing. Andy Feng (afeng@apache.org), Distinguished Architect, Yahoo; Committer, Apache Storm
  8. Apache Storm
     • "Hadoop for Real-time": a distributed, fault-tolerant, and high-performance real-time computation system that provides strong guarantees on the processing of data
     • Emerging standard
       – Initially developed and deployed at BackType in 2011
       – Open sourced via github.com in September 2011
       – Entered Apache incubation in September 2013
     • Apache Storm 0.9.1-incubating release coming: 7 committers, 50 contributors, 12,000 readers
  9. Companies Using Storm, and many others (https://github.com/nathanmarz/storm/wiki/Powered-By)

  10. Storm Use Cases @ Yahoo: 100 topologies, 400 nodes and growing
      – Ad budget management
      – Ad preprocessing
      – Content preprocessing
      – Social signal processing
      – User interest understanding
      – Yahoo Finance
      – Yahoo Weather
      – E-Commerce
      – System monitoring
      – ...
  11. Storm Concepts
      • Data
        – Stream: unbounded sequence of tuples
        – Tuple: ordered list of elements
      • Application
        – Topology: DAG of spouts & bolts
        – Spout: source of streams
        – Bolt: processes input streams, produces new streams
  12. Parallelism & Routing: N tasks per spout/bolt
      Routing
      • Shuffle grouping: tuples are randomly distributed
      • Fields grouping: the stream is partitioned by the specified fields
      • All grouping: the stream is replicated to ALL tasks
      • Global grouping: the stream goes to a SINGLE task
      • Local or shuffle grouping: tasks within the current process are preferred; otherwise shuffle grouping applies
      • Direct grouping: the producer selects the receiving task
      • Customized routing
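The grouping semantics above can be illustrated with a toy Python simulation (not Storm's implementation; `shuffle_grouping`, `fields_grouping`, and the sample `events` are purely illustrative):

```python
import random
import zlib

def shuffle_grouping(num_tasks, tuples, seed=0):
    # Shuffle grouping: each tuple goes to a uniformly random task.
    rng = random.Random(seed)
    routed = {t: [] for t in range(num_tasks)}
    for tup in tuples:
        routed[rng.randrange(num_tasks)].append(tup)
    return routed

def fields_grouping(num_tasks, tuples, field):
    # Fields grouping: equal values of `field` always land on the same task,
    # here by hashing the field value modulo the task count.
    routed = {t: [] for t in range(num_tasks)}
    for tup in tuples:
        routed[zlib.crc32(str(tup[field]).encode()) % num_tasks].append(tup)
    return routed

events = [{"user": u, "n": i} for i, u in enumerate(["a", "b", "a", "c", "b", "a"])]
by_user = fields_grouping(4, events, "user")
# all three "a" tuples are routed to a single task
```

Fields grouping is what makes per-key state (counts, sessions) possible downstream, since one task sees every tuple for a given key.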
  13. Reliability Guarantees: tuple tree, 3 options of guarantees
      • BEST EFFORT
        – Storm will not track the tuple tree
      • AT LEAST ONCE
        – A spout tuple is not fully processed until all tuples in the tree have been completed
        – Storm tracks tuple trees for you in an extremely efficient way
      • EXACTLY ONCE
        – A small batch of tuples is processed at a time
        – Each batch completely succeeds or completely fails
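The "extremely efficient" tuple-tree tracking rests on an XOR trick: the acker XORs the random 64-bit id of every tuple emitted into the tree and every tuple acked out of it, and the tree is complete exactly when the running value returns to zero. A toy Python sketch of the idea (the `Acker` class is illustrative, not Storm's API):

```python
import os

class Acker:
    # Tracks one spout tuple's tree in O(1) space: XOR every emitted id in,
    # XOR every acked id out; the tree is done when the value returns to 0.
    def __init__(self):
        self.val = 0
    def emit(self, tuple_id):
        self.val ^= tuple_id
    def ack(self, tuple_id):
        self.val ^= tuple_id
    def fully_processed(self):
        return self.val == 0

acker = Acker()
ids = [int.from_bytes(os.urandom(8), "big") for _ in range(3)]
for i in ids:      # bolts emit three child tuples anchored to the spout tuple
    acker.emit(i)
for i in ids:      # each child tuple is acked after processing
    acker.ack(i)
assert acker.fully_processed()
```

Because ids are random 64-bit values, an accidental zero before completion is vanishingly unlikely, which is how Storm gets per-tree tracking in constant memory.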
  14. Trident: Higher Abstraction
      • High-level operations: Filter, Join, Group, Aggregation, Function
      • Exactly-once guarantee
        – Stream = small batches of tuples
        – State updates are ordered among batches
      • Stateful processing, e.g., in-memory, Memcached, HBase
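The exactly-once pattern that ordered batch updates enable can be sketched as state that remembers the last batch transaction id applied per key, so a replayed batch is a no-op (a toy Python sketch of the idea, not Trident's actual API):

```python
class TransactionalState:
    # Stores (last_txid, value) per key; a replayed batch carrying an
    # already-applied txid is skipped, making the update idempotent.
    # This relies on batches being applied in txid order, as Trident ensures.
    def __init__(self):
        self.store = {}
    def apply_batch(self, txid, counts):
        for key, delta in counts.items():
            last_txid, value = self.store.get(key, (None, 0))
            if last_txid == txid:
                continue  # batch already applied; replay does not double-count
            self.store[key] = (txid, value + delta)

state = TransactionalState()
state.apply_batch(1, {"a": 2})
state.apply_batch(1, {"a": 2})  # replayed batch after a failure: no effect
state.apply_batch(2, {"a": 1})
# state.store["a"] == (2, 3)
```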
  15. Storm Topology on Cluster (diagram: Nimbus, ZooKeepers, Supervisors, Workers)
  16. Storm: Player in the Big-Data Ecosystem

  17. Additional Info
      • Documents
        – https://github.com/nathanmarz/storm/wiki
        – http://storm-project.net/
      • Code
        – https://github.com/apache/incubator-storm
      • Mailing Lists
        – Storm users: user@storm.incubator.apache.org
        – Storm developers: dev@storm.incubator.apache.org
  18. Backup  

  19. Storm Enhanced for Multi-tenancy @ Yahoo: https://github.com/yahoo/incubator-storm/tree/security

  20. Next: Eric Tschetter, Creator of Druid (DRUID)
  21. DRUID: IMMEDIATE, SLICE-N-DICE AGGREGATION ENGINE. ERIC TSCHETTER
  22. REQUIREMENTS

  23. (image-only slide)
  24. INGESTION  

  25. (image-only slide)
  26. SLICE-­‐N-­‐DICE  

  27. (image-only slide)
  28. AVAILABLE  

  29. REQUIREMENTS
      • Ingestion: make it queryable in real-time
      • Slice-N-Dice: arbitrary boolean filters
      • Available: downtime is evil
  30. HOW?

  31. (image-only slide)

  32. “SUM OF PARTS” ARCHITECTURE

  33. WHERE DOES IT FIT? Storm, Kafka, Druid, Spark

  34. BUZZWORD SOUP
      • Column-oriented: only scan what you need
      • Distributed: parallelize through horizontal scaling; fault-tolerance through replication
      • Real-time: make data queryable immediately upon ingestion
  35. BUZZWORD SOUP - PART 2
      • Bitmap indexes: apply boolean filters without looking at data
      • Compressed: dictionary encoding, LZF
      • Fault-tolerant: hot replication, rebalancing
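How dictionary encoding and per-value bitmaps let boolean filters be answered without touching the raw column can be shown in a minimal Python sketch (illustrative only; Druid's actual segment format and roaring/concise bitmaps differ):

```python
def build_column(values):
    # Dictionary-encode a string column and build one bitmap per value.
    dictionary = sorted(set(values))
    ids = {v: i for i, v in enumerate(dictionary)}
    encoded = [ids[v] for v in values]          # small ints instead of strings
    bitmaps = {v: 0 for v in dictionary}
    for row, v in enumerate(values):
        bitmaps[v] |= 1 << row                  # set bit `row` where value occurs
    return dictionary, encoded, bitmaps

def filter_or(bitmaps, a, b):
    # An OR filter evaluated purely on the bitmaps, without scanning rows.
    bits = bitmaps.get(a, 0) | bitmaps.get(b, 0)
    return [row for row in range(bits.bit_length()) if bits >> row & 1]

values = ["SF", "NY", "SF", "LA", "NY"]
dictionary, encoded, bitmaps = build_column(values)
matching = filter_or(bitmaps, "SF", "LA")   # -> [0, 2, 3]
```

AND, OR, and NOT filters all reduce to bitwise operations on these bitmaps, which is why arbitrary boolean filters stay cheap.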
  36. BUZZWORD SOUP - PART 3
      • Approximation: HyperLogLog, approximate TopN
      • Caching: perfect cache invalidation, per-segment cache
      • Rolling deploys: no-downtime software deploys
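The HyperLogLog trade of exactness for tiny memory can be sketched in a few lines: hash each item, use the top bits to pick a register, and keep the maximum "leading-zero rank" per register. This is a simplified sketch without the small/large-range corrections of the full algorithm, and not Druid's implementation:

```python
import hashlib

def hll_estimate(items, p=10):
    # m = 2^p registers; each keeps the max leading-zero count (+1) seen
    # among hashes routed to it. Estimate via a bias-corrected harmonic mean.
    m = 1 << p
    registers = [0] * m
    for item in items:
        h = int.from_bytes(hashlib.sha1(str(item).encode()).digest()[:8], "big")
        bucket = h >> (64 - p)                    # top p bits pick the register
        rest = h & ((1 << (64 - p)) - 1)          # remaining 64-p bits
        rank = (64 - p) - rest.bit_length() + 1   # leading zeros in `rest`, +1
        registers[bucket] = max(registers[bucket], rank)
    alpha = 0.7213 / (1 + 1.079 / m)              # bias correction for large m
    return alpha * m * m / sum(2.0 ** -r for r in registers)

est = hll_estimate(range(100_000))
# ~1 KB of registers estimates 100,000 distinct items to within a few percent
```

With 1,024 registers the standard error is about 1.04/sqrt(1024) ≈ 3%, which is why approximate distinct counts scale to billions of events.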
  37. CLUSTER SIZES
      • Cluster 1
        – > 3 trillion events (impressions, bids, etc.)
        – > 110 billion rows
        – 80 machines
        – 90% of query latencies < 1s, 95% < 2s
      • Cluster 2
        – real-time ingestion of 150k events/s (7B events/day)
  38. DRUID IS OPEN
      URL: http://www.druid.io
      LICENSE: GPL v2
      Get involved:
      – github.com/metamx/druid
      – druid-development@googlegroups.com
      – https://groups.google.com/d/forum/druid-development
      – #druid-dev on irc.freenode.net
  39. Next: Jun Rao, LinkedIn (KAFKA)
  40. Building a Real-Time Data Pipeline: Apache Kafka at LinkedIn. Rise of the Real-Time Stack, Jun Rao, Feb 2014
  41. We have a lot of data. We want to leverage this data to build products. Data pipeline
  42. Network update stream

  43. System and application metrics/logging

  44. Point-to-point pipelines (diagram: production services such as Oracle, Espresso, and Voldemort, plus logs and operational metrics, each wired directly into Hadoop, search, monitoring, data warehouse, social graph, rec. engine, email, and security consumers)
  45. Central data pipeline (diagram: the same producers and consumers connected through a single shared data pipeline instead of point-to-point links)
  46. Apache Kafka: a messaging system (diagram: producers write to topic partitions on brokers; consumers read from them)
  47. Features and technologies
      • High throughput: simple storage, batch API, zero-copy transfer
      • Distributed: built-in cluster management, parallel coordinated consumption
      • Fault tolerant: auto data replication, auto failover on producer/consumer
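The storage model behind that throughput can be sketched as append-only partition logs that consumers pull from by offset, so the broker keeps no per-consumer state (a toy Python sketch; `Broker` and its methods are illustrative, not Kafka's API):

```python
import zlib

class Broker:
    # A topic as append-only partition logs; consumers pull by (partition,
    # offset) and track their own position, keeping the broker simple.
    def __init__(self, topic, num_partitions):
        self.topic = topic
        self.logs = [[] for _ in range(num_partitions)]
    def produce(self, key, message):
        p = zlib.crc32(key.encode()) % len(self.logs)   # same key, same partition
        self.logs[p].append(message)                    # append-only: cheap writes
        return p, len(self.logs[p]) - 1                 # (partition, offset)
    def fetch(self, partition, offset, max_messages=100):
        # Sequential reads from an offset enable efficient batched consumption.
        return self.logs[partition][offset:offset + max_messages]

broker = Broker("page-views", num_partitions=4)
p, off = broker.produce("user-42", b"view:/home")
broker.produce("user-42", b"view:/jobs")
assert broker.fetch(p, off) == [b"view:/home", b"view:/jobs"]
```

Append-only logs and offset-based pulls are also what make batching and zero-copy transfer natural: the broker only ever hands out contiguous runs of the log.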
  48. Usage at LinkedIn
      • 16 brokers in each cluster
      • 28 billion messages/day
      • Peak rates: writes 460,000 messages/second, reads 2,300,000 messages/second
      • ~700 topics
      • Every production service is a producer
      • ~50 live services consuming user-activity data
      • Many ad hoc consumers
      • 3k connections/broker
  49. Next: Matei Zaharia, Databricks (SPARK)
  50. Next: Panel
  51. Next: Q&A

  52. Thanks For Coming! THE END