Outline of Big Data, Real-time Processing and Storm

Outline for workshop on "Big Data, Real-time Processing and Storm" proposed for The Fifth Elephant, 2013 [http://fifthelephant.in/2013], Bangalore, India

Prashanth Babu

March 28, 2013

Transcript

  1. Big Data, Real-time Processing and Storm
     Prashanth Babu
     http://About.Me/Prashanth
     Outline for workshop proposed for The Fifth Elephant, 2013, Bangalore
     Please let me know your feedback and / or comments on this outline for the workshop.
  2. Outline of the session
     - Big Data
     - Batch processing
     - Real-time processing
     - Real-time vs. Batch processing
     - Tools / frameworks
     - Complex Event Processing
     - Event Stream Processing
     - CEP vs. ESP
     - Storm overview
     - Use cases of Storm
     - Comparison with other open source Big Data solutions
     - Storm vs. Hadoop
  3. Outline of the session
     - Storm Dependencies
     - Storm Architecture
     - Storm Components
       o Nimbus
       o Supervisor
       o ZooKeeper
     - Storm Concepts
       o Tuples
       o Streams
       o Spouts
       o Bolts
       o Topologies
  4. Outline of the session
     - Storm UI
     - Modes of operation
       o Local Mode
       o Production Mode
     - Storm through an example: Word Count demo
     - Trident
     - How the combination of Storm and Hadoop can help analyze data
     - Other features & Community Support
     - Real-time analysis of tweets with Storm
     - References
     - Q & A
  5. Storm Overview
     - Developed at BackType by Nathan Marz and team
     - Open sourced by Twitter [after acquiring BackType] in August, 2011
     - Stream processing
     - Distributed RPC
     - Continuous computation
     - Guaranteed data processing
     - Fault tolerance
     - Horizontal scalability
     - CEP
     - Further discussion in the session.
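
To make the Storm concepts from slide 3 (tuples, streams, spouts, bolts, topologies) a little more concrete ahead of the Word Count demo listed in slide 4, here is a minimal single-process sketch in plain Python. It is not Storm's API and is not taken from the workshop material: the names `sentence_spout`, `split_bolt`, `count_bolt` and `run_topology` are hypothetical, chosen only to show how the pieces relate. A real Storm topology runs continuously and in parallel across a cluster, whereas this sketch processes a finite list in one process.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical stand-ins for Storm concepts; none of these names are Storm's API.


def sentence_spout(sentences: Iterable[str]) -> Iterator[Tuple[str]]:
    """Spout: the source of a stream. Emits one single-field tuple per sentence."""
    for sentence in sentences:
        yield (sentence,)


def split_bolt(stream: Iterable[Tuple[str]]) -> Iterator[Tuple[str]]:
    """Bolt: consumes sentence tuples and emits one tuple per word."""
    for (sentence,) in stream:
        for word in sentence.lower().split():
            yield (word,)


def count_bolt(stream: Iterable[Tuple[str]]) -> Iterator[Tuple[str, int]]:
    """Bolt: keeps a running count per word and emits (word, count) on each update."""
    counts: Counter = Counter()
    for (word,) in stream:
        counts[word] += 1
        yield (word, counts[word])


def run_topology(sentences: Iterable[str]) -> Dict[str, int]:
    """Topology: wires spout -> split bolt -> count bolt and returns the latest counts."""
    latest: Dict[str, int] = {}
    for word, count in count_bolt(split_bolt(sentence_spout(sentences))):
        latest[word] = count
    return latest


if __name__ == "__main__":
    print(run_topology(["the cow jumped over the moon", "the moon is bright"]))
```

Each generator stands in for one component: the stream is the sequence of tuples flowing between them, and the topology is nothing more than the wiring. In this sketch the count bolt emits an updated count for every incoming word rather than a single total at the end, which is closer to the continuous computation mentioned in slide 5 than to a batch job.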