Slide 1

Apache Kafka with Red Hat AMQ Streams
Distributed streaming made easy on the cloud

Aykut M. Bulgu, Middleware Consultant, Red Hat
@systemcraftsman

Slide 2

Who am I?

#oc apply -f aykutbulgu.yaml
apiVersion: redhat/v1.5
kind: MiddlewareConsultant
metadata:
  name: Aykut Bulgu
  namespace: Red Hat Consulting, CEMEA
  annotations:
    twitter: @systemcraftsman
    organizer: Software Craftsmanship Turkey
    founder: System Craftsman
  labels:
    married: yes
    children: daughter
    interests: openshift, kubernetes, spring boot, middleware, infinispan, kafka, strimzi, camel
spec:
  replicas: 1
  containers:
    - image: aykut:latest

Slide 3

Messaging ≠ Messaging

Very different needs hide behind the word "messaging":
- Low-latency pub/sub
- Cross-cloud backbone
- Temporal decoupling
- Load levelling
- Load balancing
- Enterprise application integration
- IoT device connectivity
- Message-driven beans
- Event-driven microservices
- Long-term message storage
- Replayable streams
- Event sourcing
- Geo-aware routing
- Database change data capture

Slide 4

Red Hat AMQ: flexible messaging for the enterprise, cloud and Internet of Things

Broker
- Queuing and pub/sub
- Rich feature set, JMS 2.0 compliance
- Best-in-class performance
- Based on Apache ActiveMQ Artemis

Interconnect
- Message routing
- Secure messaging backbone for hybrid cloud
- Based on Apache Qpid Dispatch Router

Streams
- Durable pub/sub, replayable streams
- Highly scalable
- Based on Apache Kafka

Online
- Scalable, "self-service" messaging-as-a-service utility based on OpenShift
- Available for self-managed and Red Hat-managed deployments

Across the family: standard protocols, polyglot clients, common management. (Red Hat AMQ, 2019)

Slide 5

What is Apache Kafka?

- A publish/subscribe messaging system
- A data streaming platform
- A distributed, horizontally scalable, fault-tolerant commit log

Slide 6

What is Apache Kafka?

- Developed at LinkedIn back in 2010, open sourced in 2011
- Distributed by design
- High throughput; designed to be fast, scalable, durable and highly available
- Data partitioning (sharding)
- Ability to handle a huge number of consumers

Slide 7

Traditional Messaging

[Diagram: Producer → Queue holding messages 1, 2, 3 → Consumer]

- Reference count-based message retention model: when a message is consumed, it is deleted from the broker
- "Smart broker, dumb client": the broker knows about all consumers and can perform per-consumer filtering
- Throughput typically up to 4K-10K messages/sec
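The delete-on-consume model can be sketched in a few lines of Python. This is a toy illustration of the retention semantics, not any real broker's API:

```python
from collections import deque


class BrokerQueue:
    """Toy traditional-messaging queue: the broker deletes a message
    as soon as it has been delivered to a consumer."""

    def __init__(self):
        self._messages = deque()

    def publish(self, msg):
        self._messages.append(msg)

    def consume(self):
        # Delivery removes the message from the broker entirely,
        # so it can never be re-read or replayed.
        return self._messages.popleft() if self._messages else None


q = BrokerQueue()
for m in ("1", "2", "3"):
    q.publish(m)
first = q.consume()  # "1" is delivered and deleted from the broker
```

Once `consume()` returns, the broker holds no copy of the message, which is exactly the behavior Kafka's retention model avoids.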

Slide 8

Apache Kafka

[Diagram: Producer → Kafka topic holding messages 1, 2, 3 → Consumer; the messages remain on the broker after being read]

- Time-based message retention model by default: messages are retained according to topic configuration (time or capacity)
- Also "compacted topics", which behave like "last-value" topics
- "Dumb broker, smart client": the client maintains its own position in the message stream, so the stream can be replayed
- Throughput up to 1 million messages/sec
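The "compacted topic" retention mode keeps only the most recent value per key. A minimal sketch of that last-value semantics (record keys and values here are invented for illustration):

```python
def compact(records):
    """Log-compaction sketch: for each key, keep only the latest value.
    `records` is a list of (key, value) pairs in offset order."""
    latest = {}
    for key, value in records:
        latest[key] = value  # later offsets overwrite earlier ones
    return latest


log = [("user-1", "Ankara"), ("user-2", "Istanbul"), ("user-1", "Izmir")]
snapshot = compact(log)  # {"user-1": "Izmir", "user-2": "Istanbul"}
```

Real Kafka compaction runs in the background on log segments, but the observable result is the same: a topic that converges to one record per key.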

Slide 9

Kafka Concepts - Producers

[Diagram: a topic with three partitions (0, 1, 2), each an append-only log of numbered offsets growing from old to new; the producer appends records to the tail of a partition]
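How a producer picks a partition for a keyed record can be sketched as hashing the key modulo the partition count. The real Java client uses murmur2; CRC32 below is just a deterministic stand-in for illustration:

```python
import zlib


def choose_partition(key: bytes, num_partitions: int) -> int:
    """Same key -> same partition, which preserves per-key ordering."""
    return zlib.crc32(key) % num_partitions


p1 = choose_partition(b"user-42", 3)
p2 = choose_partition(b"user-42", 3)
# p1 == p2: records for one key always land on the same partition
```

Because the mapping is stable, all records for a given key end up in the same partition, and Kafka guarantees ordering within that partition.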

Slide 10

Kafka Concepts - Consumers

[Diagram: the same three-partition topic; the consumer reads from a position (offset) within each partition, moving from old to new]
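The "smart client" side of Kafka's model is that the consumer, not the broker, tracks its read position, so rewinding the offset replays the stream. A toy sketch of one partition (this is not the real consumer API):

```python
class Consumer:
    """Toy consumer for a single partition: the client, not the broker,
    tracks its own read position (offset)."""

    def __init__(self, log):
        self.log = log    # the broker keeps all retained messages
        self.offset = 0   # client-side position

    def poll(self):
        if self.offset < len(self.log):
            msg = self.log[self.offset]
            self.offset += 1
            return msg
        return None

    def seek(self, offset):
        self.offset = offset  # replay is just rewinding the offset


log = ["a", "b", "c"]
c = Consumer(log)
first_read = [c.poll(), c.poll()]          # reads "a", "b"
c.seek(0)                                  # rewind: the broker still has everything
replayed = [c.poll(), c.poll(), c.poll()]  # reads "a", "b", "c" again
```

Nothing is deleted on read, so any number of consumers can read the same partition at their own pace.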

Slide 11

Kafka Concepts - High Availability

[Diagram: three brokers, each holding replicas of partitions T1-P1, T1-P2, T2-P1 and T2-P2]

Leaders and followers are spread across the cluster.

Slide 12

Kafka Concepts - High Availability

[Diagram: the same three brokers, with one broker failed]

If a broker hosting a leader partition goes down, a new leader is elected for that partition on a different node.
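Failover can be sketched as picking, per partition, the first replica whose broker is still alive. This is a simplification of Kafka's in-sync-replica based election, with invented broker and partition names:

```python
def elect_leaders(assignments, live_brokers):
    """For each partition, pick the first live replica as leader.
    `assignments` maps partition name -> ordered list of replica brokers."""
    leaders = {}
    for partition, replicas in assignments.items():
        for broker in replicas:
            if broker in live_brokers:
                leaders[partition] = broker
                break
    return leaders


assignments = {"T1-P1": [1, 2, 3], "T1-P2": [2, 3, 1]}
all_up = elect_leaders(assignments, {1, 2, 3})  # leaders on brokers 1 and 2
one_down = elect_leaders(assignments, {2, 3})   # broker 1 down: T1-P1 fails over
```

Because every partition has replicas on multiple brokers, losing one broker only moves leadership; no data is lost as long as a replica survives.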

Slide 13

Kafka Concepts - Interaction with Leaders

[Diagram: producers P1, P2 and consumers C1, C2, C3 each connect to the broker hosting the leader replica of the partition they use]

Slide 14

Red Hat AMQ Streams

Enterprise distribution of Apache Kafka:
- Broker, Connect, Streams, Mirror Maker, Java clients & management tools
- Apache Zookeeper (as a Kafka dependency)

Simplified deployment on OpenShift:
- Based on the OSS project Strimzi
- Container images for running Apache Kafka and Zookeeper
- Operators for managing and configuring Apache Kafka clusters, topics and users

Slide 15

Enterprise Kubernetes

A comprehensive enterprise-grade application platform, built for containers with Kubernetes.

- Orchestration: orchestrates computing, networking, and storage infrastructure on behalf of user workloads
- Modularity: better software management through modularity
- Simplified deployment: simplifies the deployment and update of software at scale
- High availability & auto scalability: provides high availability, horizontal autoscaling, rolling updates, canary deployments

Slide 16

Goals for AMQ Streams on OpenShift

- Simplifying the deployment of Apache Kafka on OpenShift
- Using OpenShift-native mechanisms for:
  - Provisioning the cluster
  - Managing topics and users
- Providing better integration with applications running on OpenShift: microservices, data streaming, event sourcing, etc.

Slide 17

Challenges

Accessing Kafka isn't so simple. A Kafka cluster requires:
- A stable broker identity and a stable network address for each broker
- A way for brokers to discover each other and communicate
- Durable state on brokers and storage recovery
- Brokers to be directly accessible from clients

It runs alongside a Zookeeper ensemble, which requires:
- Each node to have the configuration of the others
- Nodes to be able to communicate with each other

Slide 18

How OpenShift Helps

OpenShift provides:
- StatefulSets for stable identity and networking, together with headless Services for internal discovery
- Services for accessing the cluster
- Secrets and ConfigMaps for handling configuration
- PersistentVolumes and PersistentVolumeClaims for durable storage

OpenShift/Kubernetes primitives help, but running Kafka by hand is still not easy.

Slide 19

The Operator Pattern

- An application used to create, configure and manage other complex applications
- Encodes domain-specific operational knowledge
- Works based on input from Custom Resource Definitions (CRDs): the user describes the desired state, and a controller applies it to the application
- Watches the *desired* state and the *actual* state and makes forward progress to reconcile them (observe → analyze → act)
- See OperatorHub.io for a catalog of operators
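The observe → analyze → act loop boils down to diffing desired state against actual state and emitting corrective actions. A minimal sketch, with made-up resource names and a plain dict standing in for the cluster's state:

```python
def reconcile(desired, actual):
    """One pass of the reconcile loop: compute the actions that move
    the actual state toward the desired state."""
    actions = []
    for name, spec in desired.items():
        if name not in actual:
            actions.append(("create", name, spec))      # missing resource
        elif actual[name] != spec:
            actions.append(("update", name, spec))      # drifted resource
    for name in actual:
        if name not in desired:
            actions.append(("delete", name, None))      # orphaned resource
    return actions


desired = {"my-topic": {"partitions": 3}}
actual = {"my-topic": {"partitions": 1}, "old-topic": {"partitions": 1}}
actions = reconcile(desired, actual)
```

A real operator runs this loop continuously, triggered by watches on the custom resources and on the managed objects themselves.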

Slide 20

AMQ Streams Operators

- Cluster Operator: watches Kafka custom resources; deploys and manages the Kafka and Zookeeper clusters
- Topic Operator: watches Topic custom resources; manages topics
- User Operator: watches User custom resources; manages users
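A Kafka custom resource consumed by the Cluster Operator looks roughly like the fragment below. This is a sketch in the Strimzi `v1beta1` style of the time; the cluster name, replica counts and storage sizes are illustrative, and the exact schema should be checked against the AMQ Streams documentation for your version:

```yaml
apiVersion: kafka.strimzi.io/v1beta1
kind: Kafka
metadata:
  name: my-cluster
spec:
  kafka:
    replicas: 3
    listeners:
      plain: {}
      tls: {}
    storage:
      type: persistent-claim
      size: 100Gi
  zookeeper:
    replicas: 3
    storage:
      type: persistent-claim
      size: 100Gi
  entityOperator:
    topicOperator: {}
    userOperator: {}
```

Applying this one resource is enough for the Cluster Operator to create the StatefulSets, Services, ConfigMaps and Secrets described on the previous slides.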

Slide 21

Accessing Kafka

[Diagram: producers P1, P2 and consumers C1, C2, C3 connecting to the three-broker cluster, each reaching the broker that leads its partition]

Slide 22

Kafka’s Discovery Protocol

Slide 23

OpenShift Cluster Internal Access

Slide 24

OpenShift Cluster External Access

Slide 25

AMQ Streams on OpenShift

Configurable aspects include: high availability, mirroring, scale up and scale down, off-cluster access, authentication, authorization, encryption, storage, memory and CPU resources, JVM configuration, logging, metrics, healthchecks, tolerations, affinity, Source2Image, configuration.

Slide 26

AMQ Streams Use Cases

- Messaging: replacement of a traditional message broker. High scale, high throughput, built-in partitioning, replication and fault tolerance; some limitations compared to a traditional broker (filtering, standard protocols, JMS, ...)
- Website activity tracking: rebuild the user activity tracking pipeline as a set of real-time publish-subscribe feeds; activity is published to central topics with one topic per activity type
- Metrics: aggregation of statistics from distributed applications to produce centralized feeds of operational data
- Log aggregation: abstracts away the details of files and gives event data as a stream of messages; offers good performance and stronger durability guarantees due to replication
- Stream processing: enables continuous, real-time applications built to react to, process, or transform streams
- Data integration: captures streams of events or data changes and feeds them to other data systems (see the Debezium project)

Slide 27

Demo

Slide 28

Summary

AMQ Streams 1.3.0 is generally available now.

Productized distribution of Apache Kafka (v2.3.0):
- Broker, Connect, Streams, Mirror Maker, Java clients & management tools
- Apache Zookeeper (as a Kafka dependency)

Simplified deployment and management on OpenShift:
- Based on the OSS project Strimzi
- Container images for running Apache Kafka and Zookeeper
- Operators for managing and configuring Apache Kafka clusters, topics and users

Slide 29

Q/A

Slide 30

Thank you

Red Hat AMQ Streams: https://access.redhat.com/products/red-hat-amq
Strimzi: http://strimzi.io/ (@strimziio)
Apache Kafka: https://kafka.apache.org/

linkedin.com/company/red-hat
youtube.com/user/RedHatVideos
facebook.com/redhatinc
twitter.com/RedHat