
2019-04 Kafka on Kubernetes

Presented by: Marius Bogoevici
Video recording: https://youtu.be/CXy_T_rWcLE

Event-centric design and event-driven architecture are powerful tools for designing scalable distributed systems, capable of taking advantage of the agility and organizational efficiencies promised by microservices. In this presentation we will show you how to build such an architecture using Kafka and Kubernetes. Doing so requires a reliable and scalable messaging system (Kafka), a powerful programming model (Spring/Kafka Streams), and a platform where they can all run reliably and resiliently (Kubernetes).

In this presentation, you will see a demo-centric introduction to how these technologies complement each other and deliver a cohesive solution:

* how to run Kafka on Kubernetes using the Strimzi operator for Kafka;
* how to build microservices using Spring and Kafka Streams;
* how to bring them all together in complex data processing topologies on Kubernetes.

Toronto Java Users Group

April 25, 2019

Transcript

  1. Event-Driven Microservices with Kafka & Kubernetes Toronto Java User Group

    - April 25, 2019 Marius Bogoevici Principal Specialist Solutions Architect Red Hat [email protected] @mariusbogoevici
  2. Marius Bogoevici • Principal Specialist Solutions Architect at Red Hat

    ◦ Specialize in Integration/Messaging/Data Streaming • OSS contributor since 2008 ◦ Spring Integration ◦ Spring XD, Spring Integration Kafka ◦ Former Spring Cloud Stream project lead • Co-author “Spring Integration in Action”, Manning, 2012
  3. Still, exactly why microservices? Fast value delivery, meaning:
    fixes, new features, experiments, increased confidence
  4. Adopting microservices means dealing with the inherent complexity of distributed systems
  5. Request-reply vs. event-driven communication
    • Request-reply: synchronous & ephemeral ◦ low composability ◦ simplified model ◦ low tolerance to failure ◦ best practices evolved as REST
    • Event-driven: asynchronous and persistent ◦ decoupled ◦ highly composable ◦ complex model ◦ high tolerance to failure ◦ best practices are still evolving
  6. Event-driven architecture reduces friction
    • From a technical standpoint: building robust and resilient distributed architectures
    • From a development process standpoint: high composability encourages agility and experimentation
    • From a business standpoint: aligning digital business with the real world
  7. Why event-driven microservices?
    • Asynchronous communication patterns ◦ decoupling: logical, spatial, temporal
    • Enable eventual consistency across heterogeneous resources ◦ an alternative to distributed transactions
    • Composability via pub-sub integration
  8. Key challenges of event-driven microservices
    • Programming model ◦ requires higher abstractions than vendor-specific producer/consumer APIs ◦ requires higher-level DSLs than simple message handling
    • Messaging infrastructure ◦ large number of producers/consumers ◦ complex interaction patterns, especially for pub-sub
    • Complex operations ◦ scaling, elasticity, resiliency, etc.
  9. What is Apache Kafka? A publish/subscribe messaging system. A data streaming platform. A distributed, horizontally scalable, fault-tolerant commit log.
  10. Traditional messaging queue (producer → consumer)
    • Reference-count-based message retention model ◦ when a message is consumed, it is deleted from the broker
    • “Smart broker, dumb client” ◦ the broker knows about all consumers ◦ it can perform per-consumer filtering
  11. Apache Kafka topic (producer → consumer, ordered offsets)
    • Time-based message retention model by default ◦ messages are retained according to topic config (time or capacity) ◦ also “compacted topics” – like a “last-value topic”
    • “Dumb broker, smart client” ◦ the client maintains its own position in the message stream ◦ the message stream can be replayed
  12. Kafka concepts – high availability: leader and follower partitions are spread across the cluster (diagram: partitions T1-P1 … T2-P2 replicated across brokers 1–3)
  13. Kafka concepts – high availability: if the broker holding a leader partition goes down, a new leader is elected on a different node (diagram: partition leadership moving between brokers 1–3)
  14. Kafka concepts – clients interact with leaders: producers and consumers always talk to the current leader of each partition (diagram: producers P1–P2 and consumers C1–C3 connected to leader partitions across brokers 1–3)
  15. Consumer groups – partition assignment (diagram: a topic with partitions 0–3 distributed across the two consumers of group 1 and the three consumers of group 2)
  16. Consumer groups – rebalancing (diagram: the same four partitions reassigned among the consumers when group membership changes)
  17. Consumer groups – max parallelism & idle consumers (diagram: a group of five consumers on a four-partition topic, leaving one consumer idle)
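The assignment rule behind these three slides can be sketched in plain Java. This is a simplified illustration of the semantics, not Kafka's actual assignor implementation; the class and method names are invented for the example:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified sketch of consumer-group partition assignment:
// each partition is owned by exactly one consumer in the group,
// so consumers beyond the partition count sit idle.
public class PartitionAssignment {

    static Map<String, List<Integer>> assign(int partitions, List<String> consumers) {
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String c : consumers) {
            assignment.put(c, new ArrayList<>());
        }
        for (int p = 0; p < partitions; p++) {
            // round-robin style distribution across group members
            assignment.get(consumers.get(p % consumers.size())).add(p);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // 4 partitions, 5 consumers in one group: max parallelism is 4,
        // so the fifth consumer receives no partitions (idle).
        Map<String, List<Integer>> a =
            assign(4, List.of("c1", "c2", "c3", "c4", "c5"));
        System.out.println(a); // {c1=[0], c2=[1], c3=[2], c4=[3], c5=[]}
    }
}
```

The same rule explains rebalancing: when group membership changes, the partitions are simply redistributed over the new member list.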
  18. Kafka vs. traditional messaging
    • Traditional messaging – advantage in: individual message exchanges (transactionality, acknowledgment, error handling/DLQs), P2P/competing consumer support ◦ strong support for queueing/competing consumers ◦ publish-subscribe support (with limitations) ◦ no replay support
    • Kafka – advantage in: long-term persistence, replay and late-coming subscribers, semantic partitioning, large publisher/subscriber imbalances ◦ weak support for individual message acknowledgment, P2P/competing consumers
  19. Containerization
    • Reduced overhead in running services
    • Higher density/utilization gains
    • Portable across deployment platforms
    • Rich ecosystem (see Kubernetes!)
  20. Kubernetes as a runtime platform
    • Kubernetes makes running complex topologies reliable, transparent and boring
    • Stateless and stateful workloads ◦ not only applications, but also messaging infrastructure
    • Built-in resource management ◦ memory, CPU, disk
    • Elastic scaling
    • Monitoring and failover ◦ health, logging, metrics
    • Routing and load balancing
    • Rolling upgrades and CI/CD
    • Namespacing
  21. Strimzi: provisioning Kafka on Kubernetes. What is Strimzi?
    • Open source project focused on running Apache Kafka on Kubernetes and OpenShift
    • Available as a part of Red Hat AMQ
    • Licensed under Apache License 2.0
    • Web site: http://strimzi.io/ • GitHub: https://github.com/strimzi • Slack: strimzi.slack.com • Mailing list: [email protected] • Twitter: @strimziio
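As an illustration of what deploying a cluster through Strimzi looks like, a minimal `Kafka` custom resource might resemble the following sketch. Field names follow the Strimzi CRDs, but the exact `apiVersion` and listener syntax depend on the Strimzi version installed:

```yaml
apiVersion: kafka.strimzi.io/v1beta1   # version-dependent; check your Strimzi release
kind: Kafka
metadata:
  name: my-cluster
spec:
  kafka:
    replicas: 3                  # broker count
    storage:
      type: persistent-claim     # durable broker state survives pod restarts
      size: 100Gi
  zookeeper:
    replicas: 3
    storage:
      type: persistent-claim
      size: 100Gi
```

Applying this resource with `kubectl apply` hands the rest (pods, services, storage) to the Cluster Operator.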
  22. Kafka on Kubernetes?
    • As more application workloads move to Kubernetes, it makes sense to bring Kafka to the same environment ◦ serve as the foundation for event-driven microservices ◦ benefit from Kubernetes' core strengths
    • However, Kafka is stateful, which requires ◦ a stable broker identity ◦ a way for the brokers to discover each other on the network ◦ durable broker state (i.e., the messages) ◦ the ability to recover broker state after a failure
    • Kubernetes primitives help, but it is still not easy
  23. Goals for Strimzi
    • Simplifying the deployment of Apache Kafka on Kubernetes
    • Using Kubernetes-native mechanisms for ◦ provisioning the cluster ◦ persistence ◦ ordering and identity ◦ managing topics and users
    • Providing better integration with applications running on Kubernetes ◦ microservices, data streaming, event sourcing, etc.
  24. StatefulSets and persistent volumes
    • Description ◦ provides an identity to each pod of the set that corresponds to that pod's persistent volume(s) ◦ if a StatefulSet pod is lost, a new pod with the same virtual identity is reinstated and the associated storage is reattached
    • Benefits ◦ alleviates complex, state-related problems ◦ automates a manual process ◦ makes it easy to run stateful applications at scale
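For reference, the shape of a StatefulSet that gives each broker pod a stable identity and its own persistent volume looks roughly like this. It is an illustrative sketch with a placeholder image; in practice Strimzi generates the real manifest for you:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: my-cluster-kafka
spec:
  serviceName: my-cluster-kafka      # headless service: stable per-pod DNS names
  replicas: 3                        # stable identities: ...-kafka-0, -1, -2
  selector:
    matchLabels:
      app: kafka
  template:
    metadata:
      labels:
        app: kafka
    spec:
      containers:
        - name: kafka
          image: kafka-image:tag     # placeholder image for the sketch
  volumeClaimTemplates:              # one PersistentVolumeClaim per pod,
    - metadata:                      # reattached to the same identity after failure
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 100Gi
```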
  25. The operator pattern
    • An application used to create, configure and manage other complex applications ◦ contains domain-specific operational knowledge
    • The operator works based on input from Custom Resource Definitions (CRDs) ◦ the user describes the desired state ◦ the controller applies this state to the application
    • It watches both the *desired* state and the *actual* state and makes forward progress to reconcile them (observe → analyze → act)
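The observe → analyze → act loop can be sketched as a toy reconcile function. The class and its names are illustrative, not part of any real operator framework:

```java
// Toy reconcile step in the observe → analyze → act loop: compare the
// desired state (declared in a custom resource) with the observed state
// and pick the action that moves the system toward the desired state.
public class ReconcileSketch {

    enum Action { NONE, SCALE_UP, SCALE_DOWN }

    static Action reconcile(int desiredReplicas, int actualReplicas) {
        if (actualReplicas < desiredReplicas) return Action.SCALE_UP;
        if (actualReplicas > desiredReplicas) return Action.SCALE_DOWN;
        return Action.NONE; // observed state already matches the resource
    }

    public static void main(String[] args) {
        // A Kafka CR asks for 3 brokers but only 2 are running: scale up.
        System.out.println(reconcile(3, 2)); // SCALE_UP
        System.out.println(reconcile(3, 3)); // NONE
    }
}
```

A real operator runs this step continuously, so the cluster converges back to the declared state after any failure.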
  26. Strimzi operators (diagram): the Cluster Operator watches the Kafka CR and deploys & manages the Kafka and ZooKeeper clusters; the Topic Operator and User Operator watch the Topic and User CRs and manage topics & users
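A declaratively managed topic, as handled by the Topic Operator, might look like the following illustrative `KafkaTopic` resource (the `strimzi.io/cluster` label ties it to a cluster named `my-cluster`; the config keys are standard Kafka topic settings, and the `apiVersion` again depends on the Strimzi release):

```yaml
apiVersion: kafka.strimzi.io/v1beta1   # version-dependent
kind: KafkaTopic
metadata:
  name: my-topic
  labels:
    strimzi.io/cluster: my-cluster     # ties the topic to the Kafka cluster CR
spec:
  partitions: 4
  replicas: 3
  config:
    retention.ms: 604800000            # time-based retention (7 days)
    # cleanup.policy: compact          # alternative: a compacted "last-value" topic
```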
  27. Kafka Streams overview
    • Client library for stream processing ◦ embed stream processing features into regular Java applications ◦ create sophisticated topologies of independent applications ◦ one-record-at-a-time processing (no microbatching)
    • Kafka-to-Kafka semantics ◦ event/state management coordination ◦ stateful processing support ◦ transactions/exactly-once
    (diagram: an application embedding Kafka Streams, exchanging events and state with the Kafka cluster)
  28. Kafka Streams – high-level functional DSL

    KStream<String, String> words = builder.stream("words");
    KTable<Windowed<String>, Long> countsTable = words
        .flatMapValues(value -> Arrays.asList(value.toLowerCase().split("\\W+")))
        .map((key, value) -> new KeyValue<>(value, value))
        .groupByKey(Serdes.String(), Serdes.String())
        .count(timeWindows, "WordCounts");
    KStream<Windowed<String>, Long> counts = countsTable.toStream();
    counts.to("counts");
  29. Key Kafka Streams abstractions
    • KStream ◦ record stream abstraction ◦ read from/written to an external topic as-is
    • KTable/GlobalKTable ◦ key/value map abstraction ◦ read from/written to a topic as a sequence of updates based on the record key ◦ complex operations: joins, aggregations
    • Stream/table duality ◦ KStream → KTable: read a stream as a changelog centered around the key ◦ KTable → KStream: table updates are produced as a stream
    • Time windowing for aggregate operations
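The stream/table duality described above can be sketched without any Kafka dependency: folding a changelog of (key, value) records into a latest-value-per-key map is exactly the KStream → KTable direction. This is a simplified illustration, not Kafka Streams code:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Stream/table duality in miniature: replaying a changelog stream and
// keeping only the latest value per key materializes the table
// (KStream -> KTable); each table update is itself a stream record.
public class StreamTableDuality {

    static Map<String, Long> toTable(List<Map.Entry<String, Long>> changelog) {
        Map<String, Long> table = new LinkedHashMap<>();
        for (Map.Entry<String, Long> update : changelog) {
            table.put(update.getKey(), update.getValue()); // last value wins
        }
        return table;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Long>> changelog = List.of(
            Map.entry("alice", 1L), Map.entry("bob", 1L), Map.entry("alice", 2L));
        // The table reflects only the most recent value for "alice".
        System.out.println(toTable(changelog)); // {alice=2, bob=1}
    }
}
```

The reverse direction falls out for free: every `put` into the table is one record of the update stream.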
  30. Kafka Streams on Kubernetes (diagram: several application containers, each embedding Kafka Streams, exchanging events and changelog records with the Kafka cluster)
  31. Kafka Streams: stateful and stateless deployments
    • State changes are propagated to a changelog topic in Kafka and also stored locally (in-memory state store plus local disk) for recovery/restart
    • Fully stateless deployments require replaying the changelog topic on restart/failover
    • State store recovery can be optimized by giving stateful deployments access to their persisted local state
  32. Kafka Streams with Kubernetes StatefulSets (diagram: pods word-count-0/1/2, each with its own persistent volume volume-word-count-0/1/2 reattached across restarts)
  33. Back to the future – related projects
    • Istio – https://istio.io/
    • Camel K – https://github.com/apache/camel-k
    • Knative – https://github.com/knative/
    • Quarkus – https://quarkus.io/
    • Debezium – https://debezium.io/