
Rise of the Real-time Stack

Druid
May 06, 2014

Data Collective and Metamarkets hosted leading technologists in the open source world for drinks, food and conversation about the rise of real-time processing systems, the current limitations of the data engineering space, the value of open source solutions, and where real-time technology is headed.

Transcript

  1. The Rise of the Real-Time Stack: An Evening with Open Source Technologists. Storm, Druid, Kafka, Spark. Presented by @dcvc & @metamarkets at @foundersden.
  2. ENGINEERING AT MMX: A BRIEF HISTORY (2013). [Pipeline diagram: Event, Insight, Streams, Hadoop, Druid, Kafka, Storm, MR (Spark/Hadoop), Druid.]
  3. Apache Storm: Distributed Realtime Computing. Andy Feng ([email protected]), Distinguished Architect, Yahoo; Committer, Apache Storm.
  4. Apache Storm
     • "Hadoop for Realtime" – a distributed, fault-tolerant, and high-performance realtime computation system that provides strong guarantees on the processing of data
     • Emerging standard
       – Initially developed and deployed at BackType in 2011
       – Open sourced via github.com in September 2011
       – Apache incubation in September 2013
     • Apache Storm 0.9.1-incubating release coming
       – 7 committers, 50 contributors, 12,000 readers
  5. Storm Use Cases @ Yahoo
     100 topologies, 400 nodes & growing:
     – Ad budget management
     – Ad preprocessing
     – Content preprocessing
     – Social signal processing
     – User interest understanding
     – Yahoo Finance
     – Yahoo Weather
     – E-Commerce
     – System monitoring
     – ...
  6. Storm Concepts
     • Data
       – Stream: unbounded sequence of tuples
       – Tuple: ordered list of elements
     • Application
       – Topology: DAG of spouts & bolts
       – Spout: source of streams
       – Bolt: processes input streams and produces new streams
     (A minimal bolt sketch follows.)
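To make the stream/tuple/bolt vocabulary concrete, here is a minimal bolt in Java against the Storm 0.9-era API (package names as in the incubating release); the sentence-emitting spout it would be wired to is assumed to exist elsewhere.

```java
import backtype.storm.topology.BasicOutputCollector;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseBasicBolt;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Tuple;
import backtype.storm.tuple.Values;

// A bolt consumes an input stream of tuples and emits a new stream.
public class SplitSentenceBolt extends BaseBasicBolt {
    @Override
    public void execute(Tuple input, BasicOutputCollector collector) {
        // Each incoming tuple carries a "sentence" field; emit one tuple per word.
        for (String word : input.getStringByField("sentence").split(" ")) {
            collector.emit(new Values(word));
        }
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("word")); // names the single field of the output tuples
    }
}
```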
  7. Parallelism & Routing
     N tasks for 1 spout/bolt.
     Routing:
     • Shuffle grouping – tuples are randomly distributed
     • Fields grouping – the stream is partitioned by the specified fields
     • All grouping – the stream is replicated to ALL tasks
     • Global grouping – the stream goes to a SINGLE task
     • Local or shuffle grouping – tasks within the current process are preferred; otherwise apply shuffle grouping
     • Direct grouping – the producer selects the receiving task
     • Customized routing
     (A wiring sketch follows this list.)
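A sketch of how parallelism hints and groupings are declared when wiring a topology, again against the 0.9-era API; SentenceSpout, WordCountBolt, and ReportBolt are hypothetical components assumed to be defined elsewhere.

```java
import backtype.storm.generated.StormTopology;
import backtype.storm.topology.TopologyBuilder;
import backtype.storm.tuple.Fields;

public class WiringSketch {
    public static StormTopology build() {
        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("sentences", new SentenceSpout(), 2);    // 2 spout tasks (hypothetical spout)
        builder.setBolt("split", new SplitSentenceBolt(), 8)
               .shuffleGrouping("sentences");                     // tuples randomly distributed over 8 tasks
        builder.setBolt("count", new WordCountBolt(), 12)
               .fieldsGrouping("split", new Fields("word"));      // same word always routed to the same task
        builder.setBolt("report", new ReportBolt())
               .globalGrouping("count");                          // whole stream goes to a single task
        return builder.createTopology();
    }
}
```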
  8. Reliability Guarantees
     Tuple tree. Three options of guarantees:
     • BEST EFFORT – Storm will not track the tuple tree
     • AT LEAST ONCE – a spout tuple is not fully processed until all tuples in the tree have been completed; Storm tracks tuple trees for you in an extremely efficient way
     • EXACTLY ONCE – a small batch of tuples is processed at a time; each batch completely succeeds or completely fails
     (An anchoring/acking sketch follows.)
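For the at-least-once mode, a bolt anchors its emitted tuples to the input tuple and then acks or fails it; a minimal sketch, assuming a trivial processing step:

```java
import java.util.Map;
import backtype.storm.task.OutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseRichBolt;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Tuple;
import backtype.storm.tuple.Values;

// At-least-once processing: anchor emitted tuples to their input, then ack or fail.
public class AnchoringBolt extends BaseRichBolt {
    private OutputCollector collector;

    @Override
    public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
        this.collector = collector;
    }

    @Override
    public void execute(Tuple input) {
        try {
            collector.emit(input, new Values(process(input))); // anchoring links the output to the tuple tree
            collector.ack(input);                              // marks this node of the tree complete
        } catch (Exception e) {
            collector.fail(input);                             // the spout will replay the original tuple
        }
    }

    private String process(Tuple input) {
        return input.getString(0).toUpperCase();               // placeholder processing step
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("result"));
    }
}
```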
  9. Trident: Higher Abstraction
     • High-level operations – Filter, Join, Group, Aggregation, Function
     • Exactly-once guarantee – a stream is processed as small batches of tuples; state updates are ordered among batches
     • Stateful processing – e.g., in-memory, Memcached, HBase
     (A Trident word-count sketch follows.)
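A minimal Trident word count, assuming the 0.9-era trident packages; FixedBatchSpout and MemoryMapState come from Trident's testing utilities, and the in-memory state stands in for Memcached or HBase.

```java
import backtype.storm.generated.StormTopology;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Values;
import storm.trident.TridentTopology;
import storm.trident.operation.BaseFunction;
import storm.trident.operation.TridentCollector;
import storm.trident.operation.builtin.Count;
import storm.trident.testing.FixedBatchSpout;
import storm.trident.testing.MemoryMapState;
import storm.trident.tuple.TridentTuple;

public class TridentWordCountSketch {
    // A high-level Function: splits each sentence tuple into word tuples.
    public static class Split extends BaseFunction {
        @Override
        public void execute(TridentTuple tuple, TridentCollector collector) {
            for (String word : tuple.getString(0).split(" ")) {
                collector.emit(new Values(word));
            }
        }
    }

    public static StormTopology build() {
        // The spout replays fixed batches of sentences, 3 tuples per batch.
        FixedBatchSpout spout = new FixedBatchSpout(new Fields("sentence"), 3,
                new Values("the cow jumped over the moon"),
                new Values("four score and seven years ago"));
        spout.setCycle(true);

        TridentTopology topology = new TridentTopology();
        topology.newStream("sentences", spout)
                .each(new Fields("sentence"), new Split(), new Fields("word")) // Function
                .groupBy(new Fields("word"))                                   // Group
                .persistentAggregate(new MemoryMapState.Factory(),             // stateful, per-batch updates
                        new Count(), new Fields("count"));                     // Aggregation
        return topology.build();
    }
}
```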
  10. Storm Topology on Cluster
      [Cluster diagram: Nimbus, ZooKeepers, Supervisors, and Worker processes, with numbered steps 1–7. A submission sketch follows.]
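How a topology reaches that cluster, sketched with the 0.9-era client API: the client submits to Nimbus, which distributes the code to Supervisors that launch Worker JVMs, with ZooKeeper holding coordination state. The topology name and worker count below are illustrative.

```java
import backtype.storm.Config;
import backtype.storm.StormSubmitter;

public class SubmitSketch {
    public static void main(String[] args) throws Exception {
        Config conf = new Config();
        conf.setNumWorkers(4); // 4 Worker JVMs spread across the Supervisor nodes

        // Reuses the wiring sketch from the parallelism slide.
        StormSubmitter.submitTopology("word-count", conf, WiringSketch.build());
    }
}
```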
  11. Additional Info
      • Documents
        – https://github.com/nathanmarz/storm/wiki
        – http://storm-project.net/
      • Code
        – https://github.com/apache/incubator-storm
      • Mailing lists
        – Storm users ... [email protected]
        – Storm developers ... [email protected]
  12. REQUIREMENTS
      • Ingestion – make it queryable in real time
      • Slice-n-dice – arbitrary Boolean filters
      • Available – downtime is evil
  13. BUZZWORD SOUP
      • Column-oriented – only scan what you need
      • Distributed – parallelize through horizontal scaling; fault tolerance through replication
      • Real-time – make data queryable immediately upon ingestion
  14. BUZZWORD SOUP - PART 2
      • Bitmap indexes – apply Boolean filters without looking at data (see the sketch after this list)
      • Compressed – dictionary encoding, LZF
      • Fault-tolerant – hot replication, rebalancing
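A toy illustration of how dictionary encoding plus bitmap indexes let a Boolean filter be answered without scanning row data; this is a generic Java sketch, not Druid's actual segment format, and the column and value names are made up.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// One index per dimension column: values are dictionary-encoded,
// and each encoded value keeps a bitmap of the rows that contain it.
public class DimensionIndex {
    private final Map<String, Integer> dictionary = new HashMap<>(); // value -> small int id
    private final Map<Integer, BitSet> bitmaps = new HashMap<>();    // id -> rows holding that value
    private int rowCount = 0;

    public void addRow(String value) {
        int id = dictionary.computeIfAbsent(value, v -> dictionary.size());
        bitmaps.computeIfAbsent(id, k -> new BitSet()).set(rowCount);
        rowCount++;
    }

    // Rows matching a single value; an empty bitmap if the value was never seen.
    public BitSet matching(String value) {
        Integer id = dictionary.get(value);
        return id == null ? new BitSet() : (BitSet) bitmaps.get(id).clone();
    }
}
```

A filter then becomes pure bitmap arithmetic, e.g. country.matching("US").and(device.matching("mobile")) with country and device being two such indexes; no row data is touched until the surviving row set is aggregated.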
  15. BUZZWORD SOUP - PART 3
      • Approximation – HyperLogLog, approximate TopN (a toy HyperLogLog sketch follows this list)
      • Caching – perfect cache invalidation, per-segment cache
      • Rolling deploys – no-downtime software deploys
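To show the flavor of the HyperLogLog approximation, here is a toy register-based cardinality estimator; it is an illustrative simplification (no small/large-range corrections), not Druid's implementation, and it assumes inputs are already hashed with a well-mixed 64-bit hash.

```java
// Toy HyperLogLog-style cardinality estimator (illustrative only).
public class TinyHll {
    private static final int P = 11;           // 2^11 = 2048 registers
    private static final int M = 1 << P;
    private final byte[] registers = new byte[M];

    // 'hash' must be a well-mixed 64-bit hash of the original value.
    public void add(long hash) {
        int idx = (int) (hash >>> (64 - P));                 // top P bits choose a register
        int rank = Long.numberOfLeadingZeros(hash << P) + 1; // position of first 1-bit in the rest
        if (rank > registers[idx]) {
            registers[idx] = (byte) rank;                    // keep the maximum rank per register
        }
    }

    public double estimate() {
        double sum = 0.0;
        for (byte r : registers) {
            sum += Math.pow(2.0, -r);
        }
        double alpha = 0.7213 / (1 + 1.079 / M);             // standard bias-correction constant
        return alpha * M * M / sum;                          // harmonic-mean estimate
    }
}
```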
  16. CLUSTER SIZES
      • Cluster 1
        – > 3 trillion events (impressions, bids, etc.)
        – > 110 billion rows
        – 80 machines
        – 90% of query latencies < 1 s, 95% < 2 s
      • Cluster 2
        – real-time, 150k events/s (7B events/day)
  17. DRUID IS OPEN
      URL: http://www.druid.io
      LICENSE: GPL v2
      Get involved:
      – github.com/metamx/druid
      – [email protected]
      – https://groups.google.com/d/forum/druid-development
      – #druid-dev on irc.freenode.net
  18. Building a Real-Time Data Pipeline: Apache Kafka at LinkedIn. Rise of the Real-Time Stack. Jun Rao, Feb 2014.
  19. We have a lot of data. We want to leverage this data to build products. Data pipeline.
  20. Point-to-point pipelines
      [Diagram: Oracle, Espresso, Voldemort, user tracking, logs, operational metrics, and production services each wired directly to Hadoop, search, monitoring, data warehouse, social graph, rec. engine, email, and security.]
  21. Central data pipeline
      [Diagram: the same sources and destinations as before, now connected through a single central data pipeline.]
  22. Apache Kafka: a messaging system
      [Diagram: producers publish to topic partitions hosted on brokers; consumers read from Topic 1 and Topic 2. A producer sketch follows.]
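A minimal producer against the Kafka 0.8-era Java API (the API current around the time of this talk); the broker hosts, topic name, key, and payload below are illustrative.

```java
import java.util.Properties;
import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

public class ProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("metadata.broker.list", "broker1:9092,broker2:9092"); // hypothetical brokers
        props.put("serializer.class", "kafka.serializer.StringEncoder");

        Producer<String, String> producer = new Producer<String, String>(new ProducerConfig(props));
        // The key determines the topic partition; messages with the same key stay ordered.
        producer.send(new KeyedMessage<String, String>(
                "user-activity", "member-42", "{\"event\":\"page_view\"}"));
        producer.close();
    }
}
```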
  23. Features and technologies
      • High throughput – simple storage, batch API, zero-copy transfer
      • Distributed – built-in cluster management, parallel coordinated consumption
      • Fault tolerant – automatic data replication, automatic failover on producer/consumer
      (A consumer sketch follows this list.)
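The coordinated-consumption side, sketched with the 0.8-era high-level consumer: consumers that share a group id divide the topic's partitions among themselves. The ZooKeeper address, group id, and topic name are illustrative.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;

public class ConsumerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("zookeeper.connect", "zk1:2181");   // hypothetical ZooKeeper host
        props.put("group.id", "activity-readers");    // consumers in one group share partitions

        ConsumerConnector connector =
                Consumer.createJavaConsumerConnector(new ConsumerConfig(props));

        // Ask for one stream (thread) over the "user-activity" topic.
        Map<String, List<KafkaStream<byte[], byte[]>>> streams =
                connector.createMessageStreams(Collections.singletonMap("user-activity", 1));

        ConsumerIterator<byte[], byte[]> it = streams.get("user-activity").get(0).iterator();
        while (it.hasNext()) {
            System.out.println(new String(it.next().message()));
        }
    }
}
```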
  24. Usage at LinkedIn
      • 16 brokers in each cluster
      • 28 billion messages/day
      • Peak rates – writes: 460,000 messages/second; reads: 2,300,000 messages/second
      • ~700 topics
      • Every production service is a producer
      • ~50 live services consuming user-activity data
      • Many ad hoc consumers
      • 3k connections/broker