Slide 1

Slide 1 text

Kafka Streams for Event-Driven Microservices
DevNation Live - Oct 18, 2018
Marius Bogoevici
Principal Specialist Solution Architect
@mariusbogoevici

Slide 2

Slide 2 text

Marius Bogoevici
● Principal Specialist Solution Architect at Red Hat
  ○ Specialize in Integration/Messaging/Data Streaming
● OSS contributor since 2008
  ○ Spring Integration
  ○ Spring XD, Spring Integration Kafka
  ○ Former Spring Cloud Stream project lead
● Co-author “Spring Integration in Action”, Manning, 2012

Slide 3

Slide 3 text

Why event-driven microservices?
● Event-driven architecture reduces friction
  ○ From a technical standpoint:
    ■ Building robust and resilient distributed architectures
  ○ From a development process standpoint:
    ■ Composability encourages agility and experimentation
  ○ From a business standpoint:
    ■ Aligns digital business with the real world
● Microservice use cases
  ○ State propagation: CDC/Event Sourcing/CQRS
  ○ Event-driven DDD

Slide 4

Slide 4 text

Coordinating state is hard

Slide 5

Slide 5 text

Coordinating events and state is harder

Slide 6

Slide 6 text

Kafka as a distributed messaging system
How about applications that are both producers and consumers and perform complex computations?

Slide 7

Slide 7 text

Kafka Streams Overview
● Client library for stream processing
  ○ Embed stream processing features into regular Java applications
  ○ Create sophisticated topologies of independent applications
  ○ One-record-at-a-time processing (no microbatching)
● Kafka-to-Kafka semantics
  ○ Event/State management coordination
  ○ Stateful processing support
  ○ Transactions/exactly once
[Diagram labels: Kafka Cluster, Application, Kafka Streams, Events, State]

Slide 8

Slide 8 text

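Kafka Streams - high level functional DSL

    // builder and timeWindows are assumed to be defined elsewhere.
    // groupByKey(Serde, Serde) and count(windows, storeName) are overloads from older
    // Kafka Streams releases; newer releases express the same steps with
    // windowedBy(...) and count(Materialized.as(...)).
    KStream<String, String> words = builder.stream("words");
    KTable<Windowed<String>, Long> countsTable = words
        .flatMapValues(value -> Arrays.asList(value.toLowerCase().split("\\W+")))
        .map((key, value) -> new KeyValue<>(value, value))
        .groupByKey(Serdes.String(), Serdes.String())
        .count(timeWindows, "WordCounts");
    KStream<Windowed<String>, Long> counts = countsTable.toStream();
    counts.to("counts");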
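To make the data flow of the word-count topology easier to follow, here is a minimal plain-Java sketch of what it computes, with no Kafka dependency. It is not the Kafka Streams API: the `TimedLine` type, the `word@windowStart` key format, and the tumbling-window arithmetic are assumptions made for illustration only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java sketch of a windowed word count; all names here are hypothetical.
final class WindowedWordCountSketch {

    // Hypothetical input record: a line of text with an event timestamp in milliseconds.
    static final class TimedLine {
        final long timestampMs;
        final String text;
        TimedLine(long timestampMs, String text) {
            this.timestampMs = timestampMs;
            this.text = text;
        }
    }

    // Counts words per tumbling window. Result keys have the form "word@windowStartMs".
    static Map<String, Long> count(List<TimedLine> lines, long windowSizeMs) {
        Map<String, Long> counts = new TreeMap<>();
        for (TimedLine line : lines) {                      // one record at a time
            long windowStart = line.timestampMs - (line.timestampMs % windowSizeMs);
            for (String word : line.text.toLowerCase().split("\\W+")) {  // like flatMapValues
                if (word.isEmpty()) {
                    continue;                               // split can yield an empty leading token
                }
                // Re-key by word, group, and count within the window.
                counts.merge(word + "@" + windowStart, 1L, Long::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<TimedLine> lines = Arrays.asList(
            new TimedLine(1_000L, "Hello Kafka"),
            new TimedLine(4_000L, "hello streams"),
            new TimedLine(12_000L, "hello again"));
        System.out.println(count(lines, 10_000L));
    }
}
```

With a 10-second window, the two early lines fall into the window starting at 0 and the last line into the window starting at 10000, so the same word is counted separately per window.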

Slide 9

Slide 9 text

Key Kafka Streams abstractions
● KStream
  ○ Record stream abstraction
  ○ Read from/written to external topic as is
● KTable/GlobalKTable
  ○ Key/Value map abstraction
  ○ Read from/written to topic as a sequence of updates based on record key
  ○ Complex operations: joins, aggregations
● Stream/Table Duality
  ○ KStream -> KTable - read a stream as a changelog centered around the key
  ○ KTable -> KStream - table updates are produced as a stream
● Time windowing for aggregate operations
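The stream/table duality can be illustrated without Kafka at all. The sketch below is a plain-Java model, not the KStream/KTable API: `Update`, `Table`, and `fromStream` are hypothetical names. It assumes the usual changelog convention that a record with a null value deletes its key.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of stream/table duality; all names here are hypothetical.
final class StreamTableDualitySketch {

    // Hypothetical keyed record; a null value marks a delete (tombstone).
    static final class Update {
        final String key;
        final Long value;
        Update(String key, Long value) {
            this.key = key;
            this.value = value;
        }
    }

    // Table view: the latest value per key, plus the stream of updates it has produced.
    static final class Table {
        final Map<String, Long> rows = new LinkedHashMap<>();
        final List<Update> updates = new ArrayList<>();

        // Stream -> table: apply one changelog record as an upsert or a delete.
        void apply(Update u) {
            if (u.value == null) {
                rows.remove(u.key);
            } else {
                rows.put(u.key, u.value);
            }
            updates.add(u);  // Table -> stream: every change is emitted as a record.
        }
    }

    // Reads a whole stream as a changelog and returns the resulting table.
    static Table fromStream(List<Update> stream) {
        Table table = new Table();
        for (Update u : stream) {
            table.apply(u);
        }
        return table;
    }

    public static void main(String[] args) {
        List<Update> stream = new ArrayList<>();
        stream.add(new Update("alice", 1L));
        stream.add(new Update("bob", 1L));
        stream.add(new Update("alice", 2L));
        stream.add(new Update("bob", null));

        Table table = fromStream(stream);
        Table replayed = fromStream(table.updates);
        System.out.println(table.rows + " rebuilt identically: " + replayed.rows.equals(table.rows));
    }
}
```

Replaying the updates emitted by a table into a fresh table reproduces the same rows, which is the property that lets a topic serve as the durable form of a table.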

Slide 10

Slide 10 text

Kafka Streams and Kubernetes
● Kubernetes makes running complex topologies reliable, transparent and boring
● Not only for applications, but also messaging infrastructure
● In-built resource management
  ○ Memory, CPU, disk
● Elastic scaling
● Monitoring and failover
  ○ Health, logging, metrics
● Routing and load balancing
● Rolling upgrades and CI/CD
● Namespacing

Slide 11

Slide 11 text

Strimzi: Provisioning Kafka on Kubernetes
What is Strimzi?
● Open source project focused on running Apache Kafka on Kubernetes and OpenShift
● Available as a part of Red Hat AMQ
● Licensed under Apache License 2.0
● Web site: http://strimzi.io/
● GitHub: https://github.com/strimzi
● Slack: strimzi.slack.com
● Mailing list: strimzi@redhat.com
● Twitter: @strimziio

Slide 12

Slide 12 text

Cluster Controller
Creating and managing Apache Kafka clusters
[Diagram labels: Zookeeper, Kafka, Cluster Controller, Config Map, Manages]

Slide 13

Slide 13 text

Topic Controller
Creating and managing Kafka topics
[Diagram labels: Zookeeper, Kafka, Topic Controller, Config Map, Manages topics]

Slide 14

Slide 14 text

Kafka Streams on Kubernetes
[Diagram labels: Kafka Cluster; three Containers, each labeled Application, Kafka Streams, changelog, events]

Slide 15

Slide 15 text

Kafka Streams: stateful and stateless deployments
● Changes propagated to changelog topic
● Stored locally for recovery/restart
● Fully stateless deployments must replay the topic on restart/failover
● State store recovery can be optimized by using stateful deployments
[Diagram labels: Kafka Cluster, Application, Kafka Streams, In-memory state store, Local disk, changelog, events]
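The difference between the two recovery paths can be sketched in plain Java. This is an illustrative model, not the Kafka Streams state store API: the in-memory list standing in for the changelog topic, the `put` and `replay` helpers, and the integer checkpoint offset are all assumptions.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of changelog-backed state recovery; all names here are hypothetical.
final class ChangelogRestoreSketch {

    // Writes to the local store and appends the same change to the changelog.
    static void put(Map<String, Long> store, List<Map.Entry<String, Long>> changelog,
                    String key, Long value) {
        store.put(key, value);
        changelog.add(new AbstractMap.SimpleImmutableEntry<String, Long>(key, value));
    }

    // Replays changelog records from fromOffset onward into store.
    // Returns how many records had to be replayed.
    static int replay(List<Map.Entry<String, Long>> changelog, int fromOffset,
                      Map<String, Long> store) {
        int replayed = 0;
        for (int i = fromOffset; i < changelog.size(); i++) {
            Map.Entry<String, Long> record = changelog.get(i);
            store.put(record.getKey(), record.getValue());
            replayed++;
        }
        return replayed;
    }

    public static void main(String[] args) {
        Map<String, Long> store = new HashMap<>();
        List<Map.Entry<String, Long>> changelog = new ArrayList<>();
        put(store, changelog, "hello", 1L);
        put(store, changelog, "kafka", 1L);

        // Stateful deployment: local disk survives, together with the offset it reflects.
        Map<String, Long> survivingDisk = new HashMap<>(store);
        int checkpoint = changelog.size();

        put(store, changelog, "hello", 2L);

        // Stateless deployment: nothing survives, so the whole changelog is replayed.
        Map<String, Long> rebuilt = new HashMap<>();
        int fullReplay = replay(changelog, 0, rebuilt);
        int partialReplay = replay(changelog, checkpoint, survivingDisk);

        System.out.println("stateless replayed " + fullReplay + ", stateful replayed " + partialReplay);
        System.out.println(rebuilt.equals(store) && survivingDisk.equals(store));
    }
}
```

Both paths end with the same state; the stateful path only replays the records written after its checkpoint, which is why keeping local disk across restarts shortens recovery.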

Slide 16

Slide 16 text

Kafka Streams with Kubernetes StatefulSets
[Diagram: three Pods (word-count-0, word-count-1, word-count-2), each labeled Application and Kafka Streams, with volumes volume-word-count-0, volume-word-count-1, volume-word-count-2]
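The pod and volume names in the diagram follow the StatefulSet naming convention: pods are named after the set plus an ordinal, and each persistent volume claim is named after its claim template plus the pod name. The small Java sketch below reproduces that mapping. The helper names are hypothetical, and it assumes a StatefulSet called `word-count` with a claim template called `volume`, as the names on the slide suggest.

```java
// Plain-Java sketch of StatefulSet pod and volume claim naming; helper names are hypothetical.
final class StatefulSetNamingSketch {

    // Pods get a stable identity: <statefulSet>-<ordinal>.
    static String podName(String statefulSet, int ordinal) {
        return statefulSet + "-" + ordinal;
    }

    // Each pod's claim is named <claimTemplate>-<podName>, so a restarted pod
    // with the same ordinal is matched with the same volume.
    static String claimName(String claimTemplate, String statefulSet, int ordinal) {
        return claimTemplate + "-" + podName(statefulSet, ordinal);
    }

    public static void main(String[] args) {
        for (int ordinal = 0; ordinal < 3; ordinal++) {
            System.out.println(podName("word-count", ordinal)
                + " -> " + claimName("volume", "word-count", ordinal));
        }
    }
}
```

Because the name is derived only from the ordinal, a Kafka Streams instance that restarts as `word-count-1` finds the local state it previously wrote to `volume-word-count-1`.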

Slide 17

Slide 17 text

THANK YOU
plus.google.com/+RedHat
linkedin.com/company/red-hat
youtube.com/user/RedHatVideos
facebook.com/redhatinc
twitter.com/RedHat