Cloud Native Apache Kafka with Strimzi on Kubernetes and Openshift

This presentation gives a short introduction to Apache Kafka and then dives into how to run and manage an Apache Kafka cluster on OpenShift and Kubernetes.

The Strimzi open-source project implements a Kubernetes Operator for managing various aspects of Apache Kafka on OpenShift and Kubernetes:
* Cluster configuration
* Topic management
* User management

Here is a gist to get started with minikube and Strimzi:
https://gist.github.com/matzew/a5efcaa60eedeb910711becaa1534e01

Matthias Wessendorf

October 11, 2018

Transcript

  1. Cloud Native Apache Kafka with Strimzi.io

    Matthias Wessendorf, Principal Software Engineer, @mwessendorf
  2. What is Apache Kafka?

    • A publish/subscribe messaging system?
    • A streaming data platform?
    • A distributed, horizontally-scalable, fault-tolerant commit log?
  3. DEMO : WebSocket to Kafka

  4. Apache Kafka Concepts

    • Messages are sent to and received from a topic
      ◦ Topics are split into one or more partitions (aka shards)
      ◦ All actual work is done on partition level, topic is just a virtual object
    • Each message is written to exactly one selected partition
      ◦ Partitioning is usually done based on the message key
      ◦ Message ordering within the partition is fixed
    • Retention
      ◦ Based on size / message age
      ◦ Compacted based on message key
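
The partitioning and compaction bullets above can be illustrated with a small, library-free Python sketch. It does not use the Kafka client API: `choose_partition`, `append` and `compact` are hypothetical helpers, and CRC32 is an assumed stand-in for the hash used by the real partitioner (the Kafka Java client uses murmur2 by default).

```python
import zlib
from collections import defaultdict


def choose_partition(key: bytes, num_partitions: int) -> int:
    """Map a message key to a partition.

    Assumption: CRC32 is used here only because it is stable across runs;
    it is not the hash Kafka itself uses.
    """
    return zlib.crc32(key) % num_partitions


def append(log, key: bytes, value, num_partitions: int) -> int:
    """Write a message into exactly one partition and return its number."""
    partition = choose_partition(key, num_partitions)
    log[partition].append((key, value))  # order inside a partition is append order
    return partition


def compact(partition_log):
    """Keep only the latest value per key, preserving offset order."""
    latest = {}
    for offset, (key, value) in enumerate(partition_log):
        latest[key] = (offset, value)
    ordered = sorted(latest.items(), key=lambda item: item[1][0])
    return [(key, value) for key, (_offset, value) in ordered]


topic = defaultdict(list)  # partition number -> list of (key, value)
first = append(topic, b"user-1", "login", num_partitions=3)
second = append(topic, b"user-1", "logout", num_partitions=3)
```

Because both messages share the key `user-1`, they land in the same partition and keep their relative order. Messages with different keys may end up in different partitions, where no cross-partition ordering is implied.
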
  5. Kafka concepts: Topics & partitions with Producers

    [Figure: a Producer appending messages to Partition 0 (offsets 0-11), Partition 1 (offsets 0-6) and Partition 2 (offsets 0-10); each partition runs from old to new]
  6. Kafka concepts: Topics & partitions with Consumers

    [Figure: a Consumer reading from Partition 0 (offsets 0-11), Partition 1 (offsets 0-6) and Partition 2 (offsets 0-10); each partition runs from old to new]
  7. Kafka concepts: Consumer Groups

    [Figure: Group1 (C1, C2) and Group2 (C3, C4, C5, C6) consuming from a cluster in which Broker 1 holds T1 - P1 and T1 - P2, and Broker 2 holds T1 - P3 and T1 - P4]

    • Logical grouping of consumers
      ◦ The group receives the message...
      ◦ Consumer might have a partition assigned
    • Separate scaling of groups
      ◦ Scaling on use-case…
        ▪ Non-time-sensitive (down)
        ▪ Time-sensitive (up)
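
How partitions are spread over the members of a group can be sketched as follows. This is a simplified round-robin assignment written for illustration. `assign_partitions` is a hypothetical helper, not the assignor used by the Kafka client, and the partition and consumer names are taken from the slide.

```python
def assign_partitions(partitions, consumers):
    """Give every partition to exactly one consumer of the group.

    Consumers beyond the number of partitions end up with an empty list,
    which matches "Consumer might have a partition assigned".
    """
    if not consumers:
        return {}
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(sorted(partitions)):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


partitions = ["T1-P1", "T1-P2", "T1-P3", "T1-P4"]
group1 = assign_partitions(partitions, ["C1", "C2"])              # two partitions each
group2 = assign_partitions(partitions, ["C3", "C4", "C5", "C6"])  # one partition each
```

Each group is assigned independently, so every group sees all four partitions while a single partition is never shared inside one group. That independence is what allows a time-sensitive group to be scaled up and a non-time-sensitive one to be scaled down.
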
  8. Kafka concepts: High availability

    [Figure: Broker 1, Broker 2 and Broker 3 each hold a replica of T1 - P1, T1 - P2, T2 - P1 and T2 - P2]

    Leaders and followers spread across the cluster
  9. Kafka concepts: High availability

    [Figure: Broker 1, Broker 2 and Broker 3 each hold a replica of T1 - P1, T1 - P2, T2 - P1 and T2 - P2]

    If a broker with a leader partition goes down, a new leader partition is elected on a different node
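
The failover described on this slide can be sketched in a few lines of Python. This is a deliberately simplified model: `elect_leaders` is a hypothetical helper that promotes the first surviving replica, whereas real Kafka has its controller choose among in-sync replicas. The broker and partition names mirror the slide.

```python
def elect_leaders(replicas, current_leaders, live_brokers):
    """Return the leader for every partition after some brokers have failed.

    replicas:        partition -> ordered list of brokers holding a copy
    current_leaders: partition -> broker currently leading it
    live_brokers:    set of brokers that are still up
    """
    new_leaders = {}
    for partition, brokers in replicas.items():
        leader = current_leaders.get(partition)
        if leader in live_brokers:
            new_leaders[partition] = leader  # leader survived, nothing to do
            continue
        survivors = [broker for broker in brokers if broker in live_brokers]
        # Assumption: the first surviving replica takes over.
        new_leaders[partition] = survivors[0] if survivors else None
    return new_leaders


replicas = {
    "T1-P1": ["broker-1", "broker-2", "broker-3"],
    "T1-P2": ["broker-2", "broker-3", "broker-1"],
}
leaders = {"T1-P1": "broker-1", "T1-P2": "broker-2"}
after_failure = elect_leaders(replicas, leaders, live_brokers={"broker-2", "broker-3"})
```

With `broker-1` down, `T1-P1` moves to another broker that holds a replica, while `T1-P2` keeps its leader because that broker is still alive.
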
  10. DEMO : WebSocket behind the scenes (the Java API)

  11. Apache Kafka on Kubernetes & OpenShift: The challenges

    • Apache Kafka is *stateful* which means we require …
      ◦ … a stable broker identity
      ◦ … a way for the brokers to discover each other on the network
      ◦ … durable broker state (i.e., the messages)
      ◦ … the ability to recover broker state after a failure
    • All the above are true for Apache Zookeeper as well
    • StatefulSets, PersistentVolumeClaims, Services can help but …
  12. It’s not easy!

  13. Apache Kafka on Kubernetes & OpenShift: What is Strimzi?

    • Open source project focused on running Apache Kafka on Kubernetes and OpenShift
    • Provides:
      ◦ Docker images for running Apache Kafka and Zookeeper
      ◦ Tooling for managing and configuring Apache Kafka clusters and topics
    • Follows the Kubernetes “operator” model
    • Licensed under Apache License 2.0
    • Web site: http://strimzi.io/
    • GitHub: https://github.com/strimzi
    • Slack: strimzi.slack.com
    • Mailing list: strimzi@redhat.com
    • Twitter: @strimziio
  14. Strimzi on Kubernetes & OpenShift: Goals

    • Simplifying the Apache Kafka deployment on OpenShift/k8s
    • Using the OpenShift / kube-native mechanisms for...
      ◦ Provisioning the cluster
      ◦ Managing the topics
    • … thereby removing the need to use Kafka command-line tools
    • Providing a better integration with applications running on OpenShift/k8s
      ◦ microservices, data streaming, event-sourcing, etc.
  15. Strimzi on Kubernetes & OpenShift: The “Operator” model

    • An application used to create, configure and manage other complex applications
      ◦ Contains specific domain / application knowledge
    • Controller operates based on input from Config Maps or Custom Resource Definitions
      ◦ User describes the desired state
      ◦ Controller applies this state to the application
    • It watches the *desired* state and the *actual* state …
      ◦ … taking appropriate actions

    [Figure: Observe, Analyze, Act loop]
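
The observe / analyze / act loop can be sketched as a plain reconciliation function. This is a minimal illustration of the pattern and not Strimzi's implementation. `analyze`, `act` and `reconcile` are hypothetical names, and the dictionaries stand in for the desired state (what the user declared) and the actual state (what is running).

```python
def analyze(desired, actual):
    """Compare desired and actual state and list the actions needed."""
    actions = []
    for name, spec in desired.items():
        if name not in actual:
            actions.append(("create", name, spec))
        elif actual[name] != spec:
            actions.append(("update", name, spec))
    for name in actual:
        if name not in desired:
            actions.append(("delete", name, None))
    return actions


def act(actual, actions):
    """Apply the actions so that the actual state converges on the desired one."""
    for verb, name, spec in actions:
        if verb == "delete":
            actual.pop(name, None)
        else:
            actual[name] = dict(spec)
    return actual


def reconcile(desired, actual):
    """One pass of the loop: observe both states, analyze, then act."""
    actions = analyze(desired, actual)
    act(actual, actions)
    return actions


desired = {"my-cluster": {"replicas": 3}}
actual = {"my-cluster": {"replicas": 1}, "old-cluster": {"replicas": 1}}
first_pass = reconcile(desired, actual)
second_pass = reconcile(desired, actual)  # nothing left to do
```

A real operator runs this loop continuously, triggered by watch events, and the "act" step calls the Kubernetes API instead of mutating a dictionary. The key property carried over from the sketch is idempotence: once the states match, a further pass produces no actions.
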
  16. Strimzi on Kubernetes & OpenShift: Config Map versus Custom Resource Definitions

    • OLD version: used Config Maps for configuration...
      ◦ Main advantage of Config Maps is no need for special permissions to install Strimzi/AMQ Streams on OpenShift
    • However,... CRDs have some advantages as well
      ◦ Flexible data structure
      ◦ Possibility to set permissions for the CRD resources
  17. Cluster Operator: Creating and managing Apache Kafka clusters

    [Figure: Cluster CR, Cluster Operator, "Manages", Zookeeper and Kafka]
  18. DEMO : CLUSTER DEPLOYMENT (using minikube)

  19. Topic Operator: Creating and managing Kafka topics

    [Figure: Topic CR, Topic Operator, "Manages Topics", Zookeeper and Kafka]
  20. Topic Operator: Creating and managing Kafka topics

    [Figure: Topic Operator (3-way diff) between the Topic CR, the Kafka topics, and Zookeeper (Topic Operator’s own storage)]
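
The idea of a 3-way diff can be sketched as a merge of two changed copies against a stored baseline. The slide does not say how conflicts are resolved, so raising an error here is an assumption, and `three_way_merge` is a hypothetical helper rather than Strimzi code. The three inputs correspond to the operator's own stored copy, the Topic CR, and the topic as it exists in Kafka.

```python
def three_way_merge(stored, topic_cr, kafka_topic):
    """Merge changes made on either side relative to the stored copy.

    A key changed on only one side takes that side's value; a key changed
    differently on both sides is treated as a conflict (assumption).
    A missing key means "not set", so removals propagate as well.
    """
    merged = {}
    for key in set(stored) | set(topic_cr) | set(kafka_topic):
        base = stored.get(key)
        from_cr = topic_cr.get(key)
        from_kafka = kafka_topic.get(key)
        if from_cr == from_kafka:
            value = from_cr       # both sides agree
        elif from_cr == base:
            value = from_kafka    # only the Kafka side changed
        elif from_kafka == base:
            value = from_cr       # only the Topic CR changed
        else:
            raise ValueError(f"conflicting change for {key!r}")
        if value is not None:
            merged[key] = value
    return merged


stored = {"partitions": 1, "retention.ms": "1000"}
topic_cr = {"partitions": 3, "retention.ms": "1000"}     # user edited the CR
kafka_topic = {"partitions": 1, "retention.ms": "2000"}  # someone changed Kafka directly
merged = three_way_merge(stored, topic_cr, kafka_topic)
```

Keeping a third, stored copy is what makes it possible to tell which side changed. With only the CR and the Kafka topic to compare, a difference between them would not reveal whose edit should win.
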
  21. DEMO : TOPICS MANAGEMENT

  22. Outlook: A rich ecosystem for Apache Kafka

    • Kafka-CDI / reactive Messaging (MP)
    • Eclipse Vert.x
      ◦ Reactive wrappers for Apache Kafka client API
    • Debezium.io: CDC platform
      ◦ Not just Apache Kafka! Gunnar knows more
      ◦ Contains KafkaCluster for unit tests
    • AMQP-Kafka-Bridge (Strimzi)
    • ...
  23. Resources

    • Strimzi : http://strimzi.io/
    • Apache Kafka : https://kafka.apache.org/
    • Demo : https://github.com/matzew/kafka-presentation
  24. THANK YOU

    plus.google.com/+RedHat
    linkedin.com/company/red-hat
    youtube.com/user/RedHatVideos
    facebook.com/redhatinc
    twitter.com/RedHatNews