Cloud Native Data Streaming Microservices With Spring Cloud and Kafka

[Talk given at the Kafka Summit NYC, May 8, 2017]

Marius Bogoevici

May 10, 2017

Transcript

  1. Cloud Native Data Streaming Microservices with Spring Cloud and Kafka
    Marius Bogoevici, Pivotal, @mariusbogoevici
    Unless otherwise indicated, these slides are © 2013-2017, Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
  2. Who am I?
    • Software Engineer with Pivotal
      – Project Lead, Spring Cloud Stream
    • Spring ecosystem contributor since 2008:
      – Spring Integration, Spring XD, Spring Integration Kafka,
      – Spring Cloud Stream, Spring Cloud Data Flow
    • Co-author, “Spring Integration in Action”, Manning, 2012
  3. Why microservices for data processing?
    • Cohesiveness around business capability (“do one thing and do it well”)
    • Organizational alignment (Conway’s Law), cross-team collaboration
    • Development agility
    • Optimized for replacement
    • Enable continuous delivery
    • Failure isolation
    • Granular resource tuning:
      • scaling out the critical parts of the pipeline
      • per-process: memory, CPU, network
  5. Event-driven (Messaging) Microservices
    • Solving communication complexity for data processing
    • Decoupling:
      • Physical: discovery
      • Temporal: availability
    • Eventual consistency vs. shared stores/distributed transactions
      • especially over heterogeneous resources
    • Pub-sub makes it easy to add new elements to the topology
  6. Data and Event Streaming: Conceptually Similar
    Data Streaming: ingestion, analytics
    Async interaction, event sourcing
  7. Solution: messaging microservices with Kafka
    [Diagram labels: file, jms, http, Kafka, cassandra]
  8. Solution: messaging microservices with Kafka (and Kafka Streams!)
    [Diagram labels: file, jms, http, Kafka, cassandra, count-words]
  9. How about operational complexity?
    • Distributed systems are inherently complex, operating them even more so
    • Operational prerequisites:
      • Self-servicing and provisioning
        • elastic infrastructure
      • Monitoring
      • Rapid delivery
        • CI/CD, deployment pipeline
        • automation
      • “DevOps culture”
    https://martinfowler.com/bliki/MicroservicePrerequisites.html
  11. Cloud native applications and platforms
    • Target a platform that makes running apps reliable, transparent and boring
    • In-built resource management
      • Memory, CPU, networking
    • Elastic scaling
    • Monitoring and failover
      • Health, logging, metrics
    • Routing and load balancing
    • Rolling upgrades
    [Platforms shown: Apache YARN, Apache Mesos, Kubernetes]
  12. Cloud-native event-driven microservices with Kafka
    [Diagram labels: file, jms, http, Kafka, Platform, cassandra, count-words]
  13. The Monolith, the Platform and the Microservice(s)
    [Diagram labels: Spring Cloud Stream, 2015, Spring XD, Spring Cloud Data Flow, Spring Cloud Task]
  14. Simple things should be simple; complex things should be possible. (Alan Kay)
  15. Easy to build
  16. Microservice development challenge: reducing the boilerplate code
    [Diagram labels: Monolith, Boilerplate, Business code, In practice, Microservices, In theory]
  17. Spring Boot
    • Spring Framework for microservices
    • Simplified application structure:
      • Opinionated autoconfiguration based on application dependencies
      • Eliminate boilerplate, focus on business code
    • Externalized configuration
      • Environment, run-time arguments, system properties
    • Immutable artifact
      • Uberjar with nested dependencies
      • Executes from command line
    • Management and monitoring (JMX, HTTP)
      • Actuator endpoints: metrics, health, pause/shutdown
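
The externalized configuration bullet can be illustrated with a small lookup over prioritized property sources. This is only a sketch in plain Java, not Spring Boot's implementation: the class `ConfigPrecedenceSketch`, the method `resolve` and the sample property names are hypothetical. The ordering used in `main` (command-line arguments first, then system properties, then environment) mirrors the order Spring Boot documents for those three sources.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper: first property source that defines a name wins. */
final class ConfigPrecedenceSketch {

    /** Returns the value from the highest-priority source that contains the property. */
    static Optional<String> resolve(String name, List<Map<String, String>> sourcesByPriority) {
        return sourcesByPriority.stream()
                .filter(source -> source.containsKey(name))
                .map(source -> source.get(name))
                .findFirst();
    }

    public static void main(String[] args) {
        // Sample values are made up for illustration.
        Map<String, String> commandLine = Collections.singletonMap("server.port", "9000");
        Map<String, String> systemProperties = Collections.singletonMap("server.port", "8081");
        Map<String, String> environment = new LinkedHashMap<>();
        environment.put("server.port", "8080");
        environment.put("app.name", "uppercase");

        List<Map<String, String>> sources =
                Arrays.asList(commandLine, systemProperties, environment);

        System.out.println(resolve("server.port", sources).orElse("unset")); // prints 9000
        System.out.println(resolve("app.name", sources).orElse("unset"));    // prints uppercase
    }
}
```

The same artifact can therefore run unchanged in different environments, with only the property sources varying.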
  18. Spring Initializr: start.spring.io
  19. Trend of projects generated with Spring Initializr
  20. Spring Cloud Stream
    • Event-driven microservice framework
    • Goal: simplify writing messaging applications, bias towards event/data streaming
    • Built on battle-tested components
      • Spring Boot: full-stack standalone applications
      • Spring Integration: Messaging, EIP patterns, connectors
    • Opinionated primitives provided as configuration options:
      • Durable publish-subscribe
      • Consumer groups
      • Semantic partitioning
    • Pluggable middleware abstractions
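
The interplay of publish-subscribe and consumer groups listed above can be sketched with a toy in-memory model. This is not the Spring Cloud Stream API: `ConsumerGroupSketch`, `subscribe` and `publish` are hypothetical names, and per-message round-robin is a simplification (Kafka assigns partitions to group members instead). The point it illustrates is that every group sees every message, while within a group only one instance handles it.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy in-memory model of pub-sub with consumer groups (illustration only). */
final class ConsumerGroupSketch {

    // group name -> one inbox per instance registered in that group
    private final Map<String, List<List<String>>> groups = new LinkedHashMap<>();
    // group name -> number of messages delivered so far, used for round-robin
    private final Map<String, Integer> delivered = new LinkedHashMap<>();

    /** Registers a new instance in the given group and returns its inbox. */
    List<String> subscribe(String group) {
        List<String> inbox = new ArrayList<>();
        groups.computeIfAbsent(group, g -> new ArrayList<>()).add(inbox);
        delivered.putIfAbsent(group, 0);
        return inbox;
    }

    /** Delivers the message once to every group, to a single instance per group. */
    void publish(String message) {
        for (Map.Entry<String, List<List<String>>> entry : groups.entrySet()) {
            List<List<String>> instances = entry.getValue();
            int count = delivered.get(entry.getKey());
            instances.get(count % instances.size()).add(message);
            delivered.put(entry.getKey(), count + 1);
        }
    }

    public static void main(String[] args) {
        ConsumerGroupSketch topic = new ConsumerGroupSketch();
        List<String> hdfs1 = topic.subscribe("hdfs");
        List<String> hdfs2 = topic.subscribe("hdfs");
        List<String> audit = topic.subscribe("audit");

        topic.publish("a");
        topic.publish("b");

        // The two "hdfs" instances split the messages; "audit" receives both.
        System.out.println(hdfs1 + " " + hdfs2 + " " + audit); // prints [a] [b] [a, b]
    }
}
```

Adding a new group is what makes it cheap to attach another consumer to an existing topology without affecting the current ones.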
  21. Spring Cloud Stream in a nutshell
  22. Binder abstraction
  25. … 10000 ft nutshell
  26. Programming model @EnableBinding + Binder Implementation
    [Diagram labels: Apache Kafka, JMS, Google PubSub; Production-ready; Experimental]
  27. Programming model: individual message handling

    import org.springframework.boot.autoconfigure.SpringBootApplication;
    import org.springframework.cloud.stream.annotation.EnableBinding;
    import org.springframework.cloud.stream.annotation.StreamListener;
    import org.springframework.cloud.stream.messaging.Processor;
    import org.springframework.messaging.handler.annotation.SendTo;

    @SpringBootApplication
    @EnableBinding(Processor.class)
    public class UppercaseProcessor {

        // Handles each payload arriving on "input" and sends the result to "output"
        @StreamListener("input")
        @SendTo("output")
        public String process(String s) {
            return s.toUpperCase();
        }
    }
  28. Spring Cloud Stream + Kafka Streams
    • Overlapping paradigm: encapsulated input/output
    • Spring Cloud Stream + KStream
      • subscribes input KStreams to topics
      • connects the output KStream to topics
    • OOTB stateful processing support with KStream
    • Spring Cloud Stream content type negotiation
      • Or use Confluent Schema Registry directly
    • Underlying Spring Boot support
      • Flexible configuration: program arguments, environment variables
      • Actuator endpoints: health, metrics
  29. Programming model: Kafka Streams

    @StreamListener("input")
    @SendTo("output")
    public KStream<String, Long> wordCount(KStream<?, String> input) {
        // Re-key each record by its upper-cased word, then count occurrences per key
        return input.map((key, word) -> new KeyValue<>(word.toUpperCase(), 1))
                .groupByKey()
                .count("Counts")
                .toStream();
    }
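
To make the expected result of the word-count topology concrete, here is a batch analogue that uses only the JDK. It is not Kafka Streams code: `WordCountSketch` is a hypothetical class, and a finite list stands in for the unbounded stream. It follows the same three steps (upper-case each word, group by it, count per group).

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

/** Batch analogue of the word-count topology (illustration only). */
final class WordCountSketch {

    /** Counts occurrences per upper-cased word; keys are sorted for stable output. */
    static Map<String, Long> wordCount(List<String> words) {
        return words.stream()
                .map(String::toUpperCase)
                .collect(Collectors.groupingBy(word -> word, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        // "kafka" and "KAFKA" collapse into the same key after upper-casing.
        System.out.println(wordCount(Arrays.asList("kafka", "Spring", "KAFKA")));
        // prints {KAFKA=2, SPRING=1}
    }
}
```

In the streaming version the counts are never final: each new record updates the count for its key and emits the updated value downstream.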

  30. Spring Cloud Stream + Kafka Streams
    [Diagram labels: input, output, Spring Cloud Stream KStream Processor, words, word-counts, Spring Boot, KStream API, Spring Cloud Stream]
    Programming model (developer focus)
    Application model (configuration options, StreamConfig based on Spring Boot properties, KStreamBuilder, KStream, lifecycle)
    Externalized configuration, uberjar construction, health monitoring endpoints (framework focus)
  31. Programming model: Kafka Streams (2)

    @StreamListener
    @SendTo("clicksImpressions")
    public KStream<byte[], ClicksImpressions> join(
            @Input("clicks") KStream<byte[], Click> clicks,
            @Input("impressions") KStream<byte[], Impressions> impressions) {
        // join clicks and impressions
    }

    Functional programming model with multiple inputs and outputs
  32. Easy to orchestrate and deploy
  33. Simple topologies: (relatively) easy to deploy …
    [Diagram labels: http, HDFS, httphdfs.1]
    spring.cloud.stream.bindings.output.destination=httphdfs
    spring.cloud.stream.bindings.input.destination=httphdfs
    spring.cloud.stream.bindings.input.group=httphdfs
  34. … but how about complex topologies?
    [Diagram labels: http, raw-sensor-data, averages, top-n, Calculator, Failure detector, HDFS]
  35. Spring Cloud Data Flow
    • Orchestration:
      • DSL for Stream topologies
      • REST API
      • Shell
      • UI
    • Portable Deployment SPI
    • OOTB apps for common integration use-cases
  36. Spring Cloud Data Flow - Stream DSL
    Stream definition: httpfile = http | file |
    Spring Boot Apps built with Spring Cloud Stream
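
The pipe notation of the stream DSL reads like a Unix pipeline: each name is an app, and each `|` is a messaging link between two neighbours. The following is a minimal sketch of that reading in plain Java, not the Spring Cloud Data Flow parser; `StreamDslSketch`, `apps` and `links` are hypothetical names, and real definitions can also carry app options that this sketch ignores.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical mini-parser for "name = app1 | app2 | ..." definitions. */
final class StreamDslSketch {

    /** Returns the ordered app names; the optional "name =" prefix is dropped. */
    static List<String> apps(String definition) {
        String pipeline = definition.substring(definition.indexOf('=') + 1);
        List<String> apps = new ArrayList<>();
        for (String part : pipeline.split("\\|")) {
            String app = part.trim();
            if (!app.isEmpty()) {
                apps.add(app);
            }
        }
        return apps;
    }

    /** Each adjacent pair of apps is connected by one messaging destination. */
    static List<String> links(String definition) {
        List<String> apps = apps(definition);
        List<String> links = new ArrayList<>();
        for (int i = 0; i + 1 < apps.size(); i++) {
            links.add(apps.get(i) + " -> " + apps.get(i + 1));
        }
        return links;
    }

    public static void main(String[] args) {
        System.out.println(links("httpfile = http | file")); // prints [http -> file]
    }
}
```

A definition with n apps therefore needs n - 1 destinations on the middleware, which the orchestrator can wire up through binding properties.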
  37. Spring Cloud Data Flow Deployment Platforms
    [Diagram labels: Data Flow Server, REST API, Deployer SPI, SCDF Flo, SCDF Shell]
  38. stream1 = http | count-words | file
    stream2 = jms | cassandra
    [Diagram labels: cassandra, gpfdist, http, Kafka, Data Flow Server, DB, jms, file, Platform, count-words]
  39. Deployment: Partitioning and Instance Count
    [Diagram labels: Load Balancer, http (×2), work (×3), hdfs (×4)]
    stream create s1 --definition "http | work | hdfs"
    stream deploy s1 --propertiesFile ingest.properties
    app.http.count=2
    app.work.count=3
    app.hdfs.count=4
    app.http.producer.partitionKeyExpression=payload.id
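
With a partition key such as `payload.id`, all messages carrying the same id should reach the same downstream instance. One common way to achieve that is to hash the key modulo the number of partitions. The sketch below assumes that scheme; `PartitionSketch` and `partitionFor` are hypothetical names, the sample ids are made up, and the selector actually used by a given binder may differ.

```java
/** Hash-based partition selection (illustration only). */
final class PartitionSketch {

    /** Maps a key to a partition index in [0, partitionCount). */
    static int partitionFor(Object key, int partitionCount) {
        // floorMod keeps the result non-negative even for negative hash codes
        return Math.floorMod(key.hashCode(), partitionCount);
    }

    public static void main(String[] args) {
        int workInstances = 3; // matches app.work.count=3 on the slide
        for (String id : new String[] {"sensor-1", "sensor-2", "sensor-1"}) {
            // Both "sensor-1" lines print the same instance index.
            System.out.println(id + " -> work instance " + partitionFor(id, workInstances));
        }
    }
}
```

Because the mapping depends on the partition count, changing the instance count of a partitioned consumer also changes which instance owns a given key.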
  40. Deployment: Resource Management
    [Diagram labels: http (×2), work (×3)]
    app.work.spring.cloud.deployer.cloudfoundry.memory=2048
  41. Conclusions
    • Spring Cloud Stream, Spring Cloud Data Flow and Kafka are complementary
    • Kafka provides:
      • High-throughput, low latency messaging middleware (transport)
      • Stream processing engine via Kafka Streams
    • Spring Cloud Stream provides:
      • Spring Boot integration
      • Boilerplate reduction via opinionated application model
    • Spring Cloud Data Flow provides:
      • High-level orchestration for sophisticated topologies
      • Simplified deployment on a number of platforms: Cloud Foundry, Kubernetes, Mesos, Yarn
  42. Some links …
    http://cloud.spring.io/spring-cloud-stream
    http://cloud.spring.io/spring-cloud-dataflow
    https://github.com/mbogoevici/spring-cloud-stream-binder-kstream
    https://github.com/spring-cloud/spring-cloud-stream-samples
  43. Questions?