Spring XD - Unifying Stream & Batch processing - Paris Data Geeks

Eric Bottard
September 18, 2014

Find more about Spring XD at http://projects.spring.io/spring-xd/

Transcript

  1. Unifying Stream & Batch Processing with Spring XD. Eric Bottard. Paris Data Geeks - Streaming Platforms, Sept. 18, 2014 - Criteo. Unless otherwise indicated, these slides are © 2013-2014 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
  3. “One stop shop for developing and deploying Big Data applications.” - a guy in the elevator
  4. What’s a Big Data Application, anyway?
  5. What’s a Big Data Application, anyway?
  6. What’s a Big Data Application, anyway?
  7. What’s a Big Data Application, anyway?
  8. What’s a Big Data Application, anyway?
  9. “Big Data problems are Integration problems” - M. Fisher, Spring XD co-lead
  10. Spring XD – Streams
  11. Spring XD – Streams
  12. Spring XD – Streams: How can we make this easier?
  13. Spring XD – Streams: http | filter | file
  14. Spring XD – Streams: http | filter | file
  15. Spring XD – Streams: http | filter | file. Non-linear stream definitions are also supported.
  16. Spring XD Features
      • Unified Platform: Streams and Batch
      • Developer Productivity: Interactive Shell; DSL for Streams & Jobs; Many pre-built modules; UI for Batch & Stream management
      • Hadoop distro agnostic
      • Architecture: Distributed, Scalable; Fault Tolerant; Pluggable Middleware
      • Portable Runtime: Standalone; Amazon EC2; Hadoop YARN; Cloud Foundry
  17. Spring XD – Streams
  18. Spring XD – Streams: HTTP, Tail, File, TCP, Kafka, Mail, Twitter, Gemfire, Syslog, JMS, RabbitMQ, MQTT
  19. Spring XD – Streams: HTTP, Tail, File, TCP, Kafka, Mail, Twitter, Gemfire, Syslog, JMS, RabbitMQ, MQTT; Filter, Transformer, Aggregator, Splitter, PMML model, Shell, Groovy Script, Java Code
  20. Spring XD – Streams: HTTP, Tail, File, TCP, Kafka, Mail, Twitter, Gemfire, Syslog, JMS, RabbitMQ, MQTT; Filter, Transformer, Aggregator, Splitter, PMML model, Shell, Groovy Script, Java Code; File, HDFS, JDBC, TCP, Analytics*, MongoDB, Mail, RabbitMQ, Gemfire, Splunk, MQTT, Dynamic Router
  21. Spring XD – Streams: HTTP, Tail, File, TCP, Kafka, Mail, Twitter, Gemfire, Syslog, JMS, RabbitMQ, MQTT; Filter, Transformer, Aggregator, Splitter, PMML model, Shell, Groovy Script, Java Code; File, HDFS, JDBC, TCP, Analytics*, MongoDB, Mail, RabbitMQ, Gemfire, Splunk, MQTT, Dynamic Router; MessageBus
  22. Demo: Hello World
  23. “Spring XD is to Spring Integration & Spring Batch what ElasticSearch is to Lucene” - Eric Bottard
  24. Spring XD: Unified Platform for Big Data. [Diagram labels: Spring XD Runtime; Streams; Jobs; ingest; workflow; export; taps; BIDIRECTIONAL; Compute; HDFS; RDBMS; NoSQL; Redis; R, SAS; Predictive Modeling]
  25. Spring XD: Quality of Service
      • High Availability: Cluster Management built upon ZooKeeper; Leader Election for Admin Nodes; Module Redeployment across Container Nodes
      • Customizable Deployment: Module Count and Criteria for placement; Data Partitioning by Key; Direct Binding for Co-located Modules
  26. Spring XD – Runtime
  27. Spring XD – Runtime: XD Shell
  28. Spring XD – Runtime: XD Shell; HTTP POST /streams/myStream "http | file"
  29. Spring XD – Runtime: XD Shell; HTTP POST /streams/myStream "http | file"
  30. Spring XD – Runtime: XD Shell; HTTP POST /streams/myStream "http | file"; XD Admin
  31. Spring XD – Runtime: XD Shell; HTTP POST /streams/myStream "http | file"; XD Admin; ZooKeeper, Container State
  32. Spring XD – Runtime: XD Shell; HTTP POST /streams/myStream "http | file"; XD Admin, Assigns Modules to Containers; ZooKeeper, Container State; XD Container, XD Container
  33. Spring XD – Runtime: XD Shell; HTTP POST /streams/myStream "http | file"; XD Admin, Assigns Modules to Containers; ZooKeeper, Container State; XD Container, XD Container; Spring App Context; HTTP Module
  34. Spring XD – Runtime: XD Shell; HTTP POST /streams/myStream "http | file"; XD Admin, Assigns Modules to Containers; ZooKeeper, Container State; XD Container, XD Container; Spring App Context; HTTP Module; File Module
  35. Spring XD – Runtime: XD Shell; HTTP POST /streams/myStream "http | file"; XD Admin, Assigns Modules to Containers; ZooKeeper, Container State; XD Container, XD Container; Spring App Context; HTTP Module; File Module; Message Bus
  36. Spring XD – Runtime: XD Shell; XD Admin, Assigns Modules to Containers; ZooKeeper, Container State; XD Container, XD Container; Spring App Context; HTTP Module; File Module; Message Bus
  37. Spring XD – Runtime: XD Shell; HTTP POST /streams/aStream "M1 | M2"; XD Admin, Assigns Modules to Containers; ZooKeeper, Container State; XD Container, XD Container; Spring App Context; HTTP Module; File Module; Message Bus
  38. Spring XD – Runtime: XD Shell; HTTP POST /streams/aStream "M1 | M2"; XD Admin, Assigns Modules to Containers; ZooKeeper, Container State; XD Container, XD Container; Spring App Context; HTTP Module; File Module; M1; Message Bus
  39. Spring XD – Runtime: XD Shell; HTTP POST /streams/aStream "M1 | M2"; XD Admin, Assigns Modules to Containers; ZooKeeper, Container State; XD Container, XD Container; Spring App Context; HTTP Module; File Module; M1; M2; Message Bus
  40. Spring XD – Runtime – Fault Tolerance: XD Shell; HTTP POST /streams/aStream "M1 | M2"; XD Admin (leader), XD Admin, XD Admin; ZooKeeper, Container State; XD Container, XD Container; Spring App Context; HTTP Module; File Module; M1; M2; Message Bus
  41. Spring XD – Runtime – Fault Tolerance: XD Shell; HTTP POST /streams/aStream "M1 | M2"; XD Admin (leader), XD Admin, XD Admin; ZooKeeper, Container State; XD Container; File Module; M2; Message Bus
  42. Spring XD – Runtime – Fault Tolerance: XD Shell; HTTP POST /streams/aStream "M1 | M2"; XD Admin (leader), XD Admin, XD Admin; ZooKeeper, Container State; XD Container; File Module; M2; HTTP Module; M1; Message Bus
  43. Demo Domain: Smart Grid Data. Inspired by the ACM DEBS 2014 Grand Challenge (http://www.cse.iitb.ac.in/debs2014): demonstrate the applicability of event-based systems to provide scalable, real-time analytics over high volume sensor data to compute load forecasting.
      Field Name: Description
      • ID: Unique ID of the measurement
      • Timestamp: Number of seconds since epoch
      • Load: Load in watts
      • House ID: The house where the plug is located
      • Household ID: The household inside the house
      • Plug ID: The unique ID of the smart plug
  44. Demo Objectives
      • Ingest household energy data to HDFS via HTTP
      • Real-time aggregation and visualization
      • Real-time model evaluation to predict energy demand
      Stream definitions:
      ingest: http | hdfs
      count: tap:stream:ingest > aggregate-counter
      predict: tap:stream:ingest > pmml | aggregate-counter
      partition on house
  45. Demo Topology. [Diagram labels: Generator; Load Balancer; XD Admin; ZooKeeper (x3); XD Container HTTP (x2); XD Container HDFS (x3); Hadoop (x3); Rabbit; Redis]
  46. Demo: Smart Grid
  47. Looking Ahead
      • Programming Model: Further Unification of the Batch and Stream Models; Reactive Streams
      • Developer Experience: Support for Java Config and Spring Integration DSLs; Spring Boot-based Module Deployment and Packaging
      • Deployment Targets: Cloud Foundry Service; Docker; Mesos
  48. “I am hoping that I can be known as a great writer and actor some day, rather than a sex symbol” - Steven Seagal
  49. Thank You. Eric Bottard, Pivotal. @ebottard · ericbottard · http://projects.spring.io/spring-xd/
  50. Thank You. Eric Bottard, Pivotal. @ebottard · ericbottard · http://projects.spring.io/spring-xd/
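
The runtime walkthrough on slides 28 to 42 (a pipe-delimited stream definition is posted to XD Admin, which assigns modules to containers and redeploys them when a container is lost) can be sketched roughly as follows. This is a toy Python model written for illustration, not Spring XD code: the names `Module`, `parse_stream`, `assign` and `redeploy` are hypothetical, the round-robin placement is an assumption (the slides only say "Assigns Modules to Containers"), and the real DSL also handles module options and taps, which this parser ignores.

```python
from __future__ import annotations

from dataclasses import dataclass
from itertools import cycle


@dataclass(frozen=True)
class Module:
    name: str  # e.g. "http", "filter", "file"
    role: str  # "source", "processor" or "sink", inferred from position


def parse_stream(definition: str) -> list[Module]:
    """Split a definition such as "http | filter | file" into modules."""
    names = [part.strip() for part in definition.split("|")]
    if len(names) < 2 or not all(names):
        raise ValueError(f"need at least a source and a sink: {definition!r}")
    modules = []
    for index, name in enumerate(names):
        if index == 0:
            role = "source"
        elif index == len(names) - 1:
            role = "sink"
        else:
            role = "processor"
        modules.append(Module(name, role))
    return modules


def assign(modules: list[Module], containers: list[str]) -> dict[str, list[Module]]:
    """Place modules on containers round-robin (assumed policy)."""
    if not containers:
        raise ValueError("no containers available")
    placement: dict[str, list[Module]] = {c: [] for c in containers}
    for module, container in zip(modules, cycle(containers)):
        placement[container].append(module)
    return placement


def redeploy(placement: dict[str, list[Module]], failed: str) -> dict[str, list[Module]]:
    """Move the modules of a failed container onto the surviving ones."""
    survivors = [c for c in placement if c != failed]
    if not survivors:
        raise ValueError("no surviving containers")
    result = {c: list(placement[c]) for c in survivors}
    for module, container in zip(placement.get(failed, []), cycle(survivors)):
        result[container].append(module)
    return result


if __name__ == "__main__":
    stream = parse_stream("http | file")
    placement = assign(stream, ["container-1", "container-2"])
    print(placement)
    print(redeploy(placement, "container-1"))
```

Running the script places `http` and `file` on two hypothetical containers, then simulates the loss of `container-1`, after which both modules end up on `container-2`, mirroring the fault-tolerance slides.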