Slide 1

Slide 1 text

Unless otherwise indicated, these slides are © 2013-2017, Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/

Cloud Native Data Streaming Microservices with Spring Cloud and Kafka
Marius Bogoevici, Pivotal
@mariusbogoevici

Slide 2

Slide 2 text

Who am I?
• Software Engineer with Pivotal
  – Project Lead, Spring Cloud Stream
• Spring ecosystem contributor since 2008:
  – Spring Integration, Spring XD, Spring Integration Kafka, Spring Cloud Stream, Spring Cloud Data Flow
• Co-author, “Spring Integration in Action”, Manning, 2012

Slide 3

Slide 3 text

Why microservices for data processing?
• Cohesiveness around business capability (“do one thing and do it well”)
• Organizational alignment (Conway’s Law), cross-team collaboration
• Development agility
• Optimized for replacement
• Enable continuous delivery
• Failure isolation
• Granular resource tuning:
  • scaling out the critical parts of the pipeline
  • per-process: memory, CPU, network

Slide 4

Slide 4 text


Slide 5

Slide 5 text

Event-driven (Messaging) Microservices
• Solving communication complexity for data processing
• Decoupling:
  • Physical: discovery
  • Temporal: availability
• Eventual consistency vs. shared stores/distributed transactions
  • especially over heterogeneous resources
• Pub-sub makes it easy to add new elements to the topology

Slide 6

Slide 6 text

Data and Event Streaming: Conceptually Similar
• Data Streaming: ingestion, analytics
• Async interaction, event sourcing

Slide 7

Slide 7 text

Solution: messaging microservices with Kafka
[Diagram labels: file, jms, http, Kafka, cassandra]

Slide 8

Slide 8 text

Solution: messaging microservices with Kafka (and Kafka Streams!)
[Diagram labels: file, jms, http, Kafka, cassandra, count-words]

Slide 9

Slide 9 text

How about operational complexity?
• Distributed systems are inherently complex, operating them even more so
• Operational prerequisites:
  • Self-servicing and provisioning
    • elastic infrastructure
  • Monitoring
  • Rapid delivery
    • CI/CD, deployment pipeline
    • automation
  • “DevOps culture”
https://martinfowler.com/bliki/MicroservicePrerequisites.html

Slide 10

Slide 10 text


Slide 11

Slide 11 text

Cloud native applications and platforms
• Target a platform that makes running apps reliable, transparent and boring
• In-built resource management
  • Memory, CPU, networking
• Elastic scaling
• Monitoring and failover
  • Health, logging, metrics
• Routing and load balancing
• Rolling upgrades
[Logos: Apache YARN, Apache Mesos, Kubernetes]

Slide 12

Slide 12 text

Cloud-native event-driven microservices with Kafka
[Diagram labels: file, jms, http, Kafka, Platform, cassandra, count-words]

Slide 13

Slide 13 text

The Monolith, the Platform and the Microservice(s)
[Diagram labels: Spring XD, 2015, Spring Cloud Stream, Spring Cloud Data Flow, Spring Cloud Task]

Slide 14

Slide 14 text

Simple things should be simple; complex things should be possible.
- Alan Kay

Slide 15

Slide 15 text

Easy to build

Slide 16

Slide 16 text

Microservice development challenge: reducing the boilerplate code
[Diagram labels: Monolith, Microservices, Boilerplate, Business code, In theory, In practice]

Slide 17

Slide 17 text

Spring Boot
• Spring Framework for microservices
• Simplified application structure:
  • Opinionated autoconfiguration based on application dependencies
  • Eliminate boilerplate, focus on business code
• Externalized configuration
  • Environment, run-time arguments, system properties
• Immutable artifact
  • Uberjar with nested dependencies
  • Executes from command line
• Management and monitoring (JMX, HTTP)
  • Actuator endpoints: metrics, health, pause/shutdown

Slide 18

Slide 18 text

Spring Initializr: start.spring.io

Slide 19

Slide 19 text

Trend of projects generated with Spring Initializr

Slide 20

Slide 20 text

Spring Cloud Stream
• Event-driven microservice framework
• Goal: simplify writing messaging applications, bias towards event/data streaming
• Built on battle-tested components
  • Spring Boot: full-stack standalone applications
  • Spring Integration: Messaging, EIP patterns, connectors
• Opinionated primitives provided as configuration options:
  • Durable publish-subscribe
  • Consumer groups
  • Semantic partitioning
• Pluggable middleware abstractions
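The interplay between durable publish-subscribe and consumer groups can be pictured with a toy in-memory model. This is only a sketch under stated assumptions: the class `ConsumerGroupSketch`, its methods and the round-robin choice are hypothetical and are not part of the Spring Cloud Stream API. It shows the intended semantics: every group sees each message, while inside a group only one member receives it.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy in-memory model of publish-subscribe with consumer groups (hypothetical, not a framework API). */
final class ConsumerGroupSketch {

    /** Group name -> per-member inboxes, in subscription order. */
    private final Map<String, List<List<String>>> groups = new LinkedHashMap<>();

    /** Group name -> number of messages delivered so far, used for round-robin selection. */
    private final Map<String, Integer> delivered = new LinkedHashMap<>();

    /** Adds one member to a group and returns the inbox that member reads from. */
    List<String> subscribe(String group) {
        List<String> inbox = new ArrayList<>();
        groups.computeIfAbsent(group, g -> new ArrayList<>()).add(inbox);
        delivered.putIfAbsent(group, 0);
        return inbox;
    }

    /** Every group sees the message; within a group exactly one member receives it. */
    void publish(String message) {
        for (Map.Entry<String, List<List<String>>> entry : groups.entrySet()) {
            int count = delivered.get(entry.getKey());
            List<List<String>> members = entry.getValue();
            members.get(count % members.size()).add(message);
            delivered.put(entry.getKey(), count + 1);
        }
    }

    public static void main(String[] args) {
        ConsumerGroupSketch topic = new ConsumerGroupSketch();
        List<String> hdfs1 = topic.subscribe("hdfs");
        List<String> hdfs2 = topic.subscribe("hdfs");
        List<String> audit = topic.subscribe("audit");
        topic.publish("a");
        topic.publish("b");
        // The two "hdfs" members split the messages; the single "audit" member sees both.
        System.out.println(hdfs1 + " " + hdfs2 + " " + audit);
    }
}
```

In a real binder the middleware (for example Kafka) performs this distribution; the application only names its group in configuration.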

Slide 21

Slide 21 text

Spring Cloud Stream in a nutshell

Slide 22

Slide 22 text

Binder abstraction

Slide 23

Slide 23 text


Slide 24

Slide 24 text


Slide 25

Slide 25 text

… 10000 ft nutshell

Slide 26

Slide 26 text

Programming model: @EnableBinding + Binder Implementation
[Diagram labels: Apache Kafka, JMS, Google PubSub; Production-ready, Experimental]

Slide 27

Slide 27 text

Programming model: individual message handling

import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.stream.annotation.EnableBinding;
import org.springframework.cloud.stream.annotation.StreamListener;
import org.springframework.cloud.stream.messaging.Processor;
import org.springframework.messaging.handler.annotation.SendTo;

@SpringBootApplication
@EnableBinding(Processor.class)
public class UppercaseProcessor {

    // Handles each message arriving on "input" and sends the result to "output".
    @StreamListener("input")
    @SendTo("output")
    public String process(String s) {
        return s.toUpperCase();
    }
}

Slide 28

Slide 28 text

Spring Cloud Stream + Kafka Streams
• Overlapping paradigm: encapsulated input/output
• Spring Cloud Stream + KStream
  • subscribes input KStreams to topics
  • connects the output KStream to topics
• OOTB stateful processing support with KStream
• Spring Cloud Stream content type negotiation
  • Or use Confluent Schema Registry directly
• Underlying Spring Boot support
  • Flexible configuration: program arguments, environment variables
  • Actuator endpoints: health, metrics

Slide 29

Slide 29 text

Programming model: Kafka Streams

// The key and value type parameters are assumed; the slide text does not preserve them.
@StreamListener("input")
@SendTo("output")
public KStream<String, Long> wordCount(KStream<Object, String> input) {
    return input
        .map((key, word) -> new KeyValue<>(word.toUpperCase(), 1))
        .groupByKey()
        .count("Counts")
        .toStream();
}
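To make the "stateful processing" in the word count concrete, here is a plain-Java analogue with no Kafka involved. It is a sketch only: `WordCountSketch` and its methods are hypothetical names, and the in-memory map merely plays the role that the "Counts" state store plays in the Kafka Streams version. Each incoming word is re-keyed to its upper-case form and the running count for that key is updated and emitted.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Plain-Java analogue of map -> groupByKey -> count (hypothetical, no Kafka involved). */
final class WordCountSketch {

    /** Stands in for the state store: upper-cased word -> running count. */
    private final Map<String, Long> counts = new LinkedHashMap<>();

    /** Processes one record and returns the updated count for its key. */
    long accept(String word) {
        return counts.merge(word.toUpperCase(Locale.ROOT), 1L, Long::sum);
    }

    /** Returns a copy of the current counts. */
    Map<String, Long> snapshot() {
        return new LinkedHashMap<>(counts);
    }

    public static void main(String[] args) {
        WordCountSketch processor = new WordCountSketch();
        for (String word : Arrays.asList("kafka", "spring", "Kafka")) {
            // Each record produces an updated count, like a changelog turned into a stream.
            System.out.println(word.toUpperCase(Locale.ROOT) + " -> " + processor.accept(word));
        }
    }
}
```

Unlike this sketch, a Kafka Streams state store is fault tolerant and partitioned across instances; the map here only illustrates the per-key update.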


Slide 30

Slide 30 text

Spring Cloud Stream + Kafka Streams
[Diagram: Spring Cloud Stream KStream Processor; input: words, output: word-counts]
• KStream API: programming model (developer focus)
• Spring Cloud Stream: application model (configuration options, StreamConfig based on Spring Boot properties, KStreamBuilder, KStream, lifecycle)
• Spring Boot: externalized configuration, uberjar construction, health monitoring endpoints (framework focus)

Slide 31

Slide 31 text

Programming model: Kafka Streams (2)

@StreamListener
@SendTo("clicksImpressions")
public KStream join(
        @Input("clicks") KStream clicks,
        @Input("impressions") KStream impressions) {
    // join clicks and impressions
}

Functional programming model with multiple inputs and outputs

Slide 32

Slide 32 text

Easy to orchestrate and deploy

Slide 33

Slide 33 text

Simple topologies: (relatively) easy to deploy …
[Diagram labels: http, HDFS, httphdfs.1]
spring.cloud.stream.bindings.output.destination=httphdfs
spring.cloud.stream.bindings.input.destination=httphdfs
spring.cloud.stream.bindings.input.group=httphdfs

Slide 34

Slide 34 text

… but how about complex topologies?
[Diagram labels: http, raw-sensor-data, averages, top-n, Calculator, Failure detector, HDFS]

Slide 35

Slide 35 text

Spring Cloud Data Flow
• Orchestration:
  • DSL for Stream topologies
  • REST API
  • Shell
  • UI
• Portable Deployment SPI
• OOTB apps for common integration use-cases

Slide 36

Slide 36 text

Spring Cloud Data Flow - Stream DSL
Stream definition: httpfile = http | file
Spring Boot Apps built with Spring Cloud Stream
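The pipe syntax reads as "the output of each app feeds the input of the next". The sketch below illustrates that reading for the simplest definitions only. It is an assumption-laden toy: `StreamDslSketch`, `apps` and `links` are hypothetical names, and the real Data Flow DSL parser handles far more (app options, labels, named destinations) than splitting on the pipe character.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy reader for pipe-separated stream definitions (hypothetical, not the Data Flow parser). */
final class StreamDslSketch {

    /** Splits a definition such as "http | file" into its app names, in pipeline order. */
    static List<String> apps(String definition) {
        List<String> apps = new ArrayList<>();
        for (String part : definition.split("\\|")) {
            String name = part.trim();
            if (name.isEmpty()) {
                // Rejects blank segments, as in "http || file".
                throw new IllegalArgumentException("empty app name in: " + definition);
            }
            apps.add(name);
        }
        return apps;
    }

    /** Lists the producer-to-consumer connections implied by adjacent apps. */
    static List<String> links(String definition) {
        List<String> apps = apps(definition);
        List<String> links = new ArrayList<>();
        for (int i = 0; i + 1 < apps.size(); i++) {
            links.add(apps.get(i) + " -> " + apps.get(i + 1));
        }
        return links;
    }

    public static void main(String[] args) {
        System.out.println(apps("http | file"));
        System.out.println(links("http | count-words | file"));
    }
}
```

Each link corresponds to a messaging destination that the orchestrator configures on both sides, so the apps themselves stay unaware of their neighbours.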

Slide 37

Slide 37 text

Spring Cloud Data Flow Deployment Platforms
[Diagram labels: Data Flow Server, REST API, Deployer SPI, SCDF Flo, SCDF Shell]

Slide 38

Slide 38 text

stream1 = http | count-words | file
stream2 = jms | cassandra
[Diagram labels: Data Flow Server, DB, Platform, Kafka, cassandra, gpfdist, http, jms, file, count-words]

Slide 39

Slide 39 text

Deployment: Partitioning and Instance Count
[Diagram labels: Load Balancer, http × 2, work × 3, hdfs × 4]
stream create s1 --definition "http | work | hdfs"
stream deploy s1 --propertiesFile ingest.properties
app.http.count=2
app.work.count=3
app.hdfs.count=4
app.http.producer.partitionKeyExpression=payload.id
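A partition key expression such as payload.id only pays off if equal keys always reach the same downstream instance. One common way to achieve that is hash-modulo selection, sketched below. The class `PartitionSketch` and the method `partitionFor` are hypothetical, and the exact selection strategy used by a given binder is an assumption here; the point is only that the mapping is deterministic per key.

```java
/** Hash-modulo partition selection (a hypothetical sketch, not a binder implementation). */
final class PartitionSketch {

    /** Maps a key to a partition index in [0, partitionCount). */
    static int partitionFor(Object key, int partitionCount) {
        if (partitionCount <= 0) {
            throw new IllegalArgumentException("partitionCount must be positive");
        }
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(key.hashCode(), partitionCount);
    }

    public static void main(String[] args) {
        int workInstances = 3; // mirrors app.work.count=3
        for (int id : new int[] {41, 42, 43, 42}) {
            System.out.println("payload.id=" + id + " -> work instance "
                    + partitionFor(id, workInstances));
        }
    }
}
```

With three work instances, both records carrying id 42 land on the same instance, which is what allows per-key state to stay local to one process.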

Slide 40

Slide 40 text

Deployment: Resource Management
[Diagram labels: http × 2, work × 3]
app.work.spring.cloud.deployer.cloudfoundry.memory=2048

Slide 41

Slide 41 text

Conclusions
• Spring Cloud Stream, Spring Cloud Data Flow and Kafka are complementary
• Kafka provides:
  • High-throughput, low-latency messaging middleware (transport)
  • Stream processing engine via Kafka Streams
• Spring Cloud Stream provides:
  • Spring Boot integration
  • Boilerplate reduction via opinionated application model
• Spring Cloud Data Flow provides:
  • High-level orchestration for sophisticated topologies
  • Simplified deployment on a number of platforms: Cloud Foundry, Kubernetes, Mesos, YARN

Slide 42

Slide 42 text

Some links …
http://cloud.spring.io/spring-cloud-stream
http://cloud.spring.io/spring-cloud-dataflow
https://github.com/mbogoevici/spring-cloud-stream-binder-kstream
https://github.com/spring-cloud/spring-cloud-stream-samples

Slide 43

Slide 43 text

Questions?