
StormTrooperStreams.pdf

rauluka7
June 10, 2020

Processing unbounded streams of data in a distributed system sounds like a challenge. Fortunately, there is a tool that can make it easier. Łukasz will share his experience as a "Storm Trooper", a user of the Apache Storm framework, which was announced as the first streaming engine to break the 1-microsecond latency barrier. His story starts with the processing model: he'll tell you how to build a distributed application using spouts, bolts, and topologies. Then he'll move on to the components that make your apps work in a distributed way. That's the part where three guys, Nimbuses, Supervisors, and ZooKeepers, join in and help to build a cluster. With those in place, he'll show you a demo app running on Apache Storm. As you know, Storm Troopers are famous for missing targets, so Łukasz will sum up the talk by sharing the drawbacks and the ideas he missed when he first met this technology. After the presentation, you can start playing with stream processing or compare your current approach with the Apache Storm model. And who knows, maybe you'll become a Storm Trooper.


Transcript

  1. The Storm Trooper Way of Processing Streams • 10.06.2020 • J On The Beach • Post apocalypse session • By Łukasz Gebel
  2. Tuple • Set of named fields and values • Single chunk of data stream • Any serializable object
  3. Spout • Starting point of a pipeline • Can produce data • Can poll data from external sources
  4. Light Side • Easy to start • Flexible • Easily integrates with other tools • Minimal performance overhead • Out of the box distributed architecture
  5. Dark Side • Topologies are static • Scaling: run multiple topologies or rebalance (the topology may stop for a moment!) • When a task fails, the whole worker is restarted • Mind serialization and network latency when designing a topology
  6. Summary • We know how to build topologies • We know how to run them on a cluster • We know the main pros and cons
  7. Call to action • If you're new to distributed processing, start your journey with Storm • If you're experienced, benchmark its performance • You can get a managed cloud Storm cluster on Azure using HDInsight
  8. Bibliography • Code: https://github.com/rauluka/stormtrooper-streams • Presentation: https://speakerdeck.com/rauluka7/stormtrooperstreams • https://storm.apache.org/releases/2.1.0/Tutorial.html • https://www.cloudera.com/tutorials/storm-in-trucking-iot/3.html • https://azure.microsoft.com/en-us/services/hdinsight/ • https://github.com/yahoo/streaming-benchmarks • https://storm.apache.org/releases/2.1.0/Trident-tutorial.html • https://storm.apache.org/releases/2.1.0/Guaranteeing-message-processing.html • https://www.cloudera.com/products/open-source/apache-hadoop/apache-storm.html • https://dzone.com/articles/apache-spark-vs-apache-storm • https://docs.microsoft.com/en-us/azure/hdinsight/storm/apache-storm-quickstart
  9. Graphics • https://pixabay.com/photos/stormtrooper-star-wars-lego-storm-2899993/ • https://pixabay.com/photos/stormtrooper-lego-push-effort-4601509/ • https://pixabay.com/photos/stormtrooper-star-wars-lego-storm-2899982/ • https://pixabay.com/photos/stormtrooper-skateboard-lego-1995015/ • https://pixabay.com/photos/stormtrooper-star-wars-lego-storm-1343877/ • https://pixabay.com/photos/office-people-accused-accusing-2539844/ • https://pixabay.com/illustrations/ship-cosmos-space-technology-3857479/ • https://pixabay.com/photos/star-wars-storm-trooper-costume-2592430/ • https://i.ytimg.com/vi/Yv93L1KqMMc/maxresdefault.jpg • https://2.allegroimg.com/s1024/0cc3cf/5792fb7d49229863ad2b51536bb2 • https://pm1.narvii.com/6287/eaeff089928ea166e5987e2169201437459488fc_hq.jpg • https://pixabay.com/photos/star-wars-darth-wader-villain-2463926/ • https://pixabay.com/photos/stormtrooper-star-wars-lego-storm-1343772/
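
The Tuple and Spout slides can be illustrated with a small plain-Java model. This is only a sketch of the concepts, not Storm's API and not the speaker's demo code: `SketchTuple`, `SketchSpout`, `ListPollingSpout`, and `SpoutDemo` are hypothetical names, and the in-memory list merely stands in for an external source. In Storm itself a spout typically extends `BaseRichSpout` and emits tuples through a `SpoutOutputCollector` rather than returning them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a tuple: a set of named fields and their values.
final class SketchTuple {
    private final Map<String, Object> values = new LinkedHashMap<>();

    SketchTuple(List<String> fields, List<Object> vals) {
        if (fields.size() != vals.size()) {
            throw new IllegalArgumentException("each field needs exactly one value");
        }
        for (int i = 0; i < fields.size(); i++) {
            values.put(fields.get(i), vals.get(i));
        }
    }

    Object getValueByField(String field) {
        return values.get(field);
    }
}

// Hypothetical stand-in for a spout: the starting point of a pipeline.
interface SketchSpout {
    // Returns the next tuple, or null when nothing is available right now.
    SketchTuple nextTuple();
}

// A spout that polls an in-memory list standing in for an external source.
final class ListPollingSpout implements SketchSpout {
    private final List<String> source;
    private int position = 0;

    ListPollingSpout(List<String> source) {
        this.source = source;
    }

    @Override
    public SketchTuple nextTuple() {
        if (position >= source.size()) {
            return null; // source drained for now
        }
        String sentence = source.get(position++);
        // One tuple is a single chunk of the stream with two named fields.
        return new SketchTuple(Arrays.asList("sentence", "length"),
                               Arrays.asList(sentence, sentence.length()));
    }
}

final class SpoutDemo {
    // Drains the spout and collects the "sentence" field of every tuple.
    static List<Object> drain(SketchSpout spout) {
        List<Object> seen = new ArrayList<>();
        for (SketchTuple t = spout.nextTuple(); t != null; t = spout.nextTuple()) {
            seen.add(t.getValueByField("sentence"));
        }
        return seen;
    }

    public static void main(String[] args) {
        SketchSpout spout = new ListPollingSpout(
                Arrays.asList("use the force", "these are not the droids"));
        // prints [use the force, these are not the droids]
        System.out.println(drain(spout));
    }
}
```

Returning `null` when the source is empty keeps the model short; a real spout would simply emit nothing on that call and be polled again later.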