Event-driven microservices with Apache Kafka

Running event-driven microservices with Apache Kafka on Kubernetes and OpenShift. The talk presents Strimzi, which provides a "Kubernetes Operator" to manage your Apache Kafka cluster on Kubernetes or OpenShift.

http://strimzi.io/

Materials: https://github.com/matzew/kafka-presentation

Matthias Wessendorf

June 08, 2018

Transcript

  1. Enterprise Java Standards History
    [Figure: release cadence timeline, 2000 to 2020, showing J2EE 1.2, J2EE 1.3, J2EE 1.4, Java EE 5, Java EE 6, Java EE 7 and Java EE 8]
  2. MicroProfile Background
    • Began as a collection of independent discussions
      ◦ Many innovative “microservices” efforts in existing Java EE projects
        ▪ WildFly Swarm (now: Thorntail)
        ▪ WebSphere Liberty
        ▪ Payara
        ▪ TomEE
      ◦ Projects already leveraging both Java EE and non-Java EE technologies
      ◦ Creating new features/capabilities to address microservices architectures
    • Quickly realized there is common ground
    • Java EE technologies are already being used for microservices, but we can do better
  3. MicroProfile Release Philosophy
    • Release 1.0 (Sept 2016): JAX-RS, CDI, JSON-P
    • Build consensus
    • Standardize
    • Rapidly iterate and innovate
  4. What is Apache Kafka?
    • A publish/subscribe messaging system?
    • A streaming data platform?
    • A distributed, horizontally scalable, fault-tolerant commit log?
  5. Apache Kafka Concepts
    • Messages are sent to and received from a topic
      ◦ Topics are split into one or more partitions (aka shards)
      ◦ All actual work is done at the partition level; the topic is just a virtual object
    • Each message is written to exactly one selected partition
      ◦ Partitioning is usually done based on the message key
      ◦ Message ordering within the partition is fixed
    • Retention
      ◦ Based on size / message age
      ◦ Compacted based on message key
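
The partitioning rule on this slide can be sketched in a few lines of Python. This is a minimal in-memory model, not the Kafka client: `Topic`, `send` and `choose_partition` are hypothetical names, and CRC32 is only a stand-in for the hash a real partitioner would apply to the key.

```python
import zlib


def choose_partition(key: bytes, num_partitions: int) -> int:
    """Pick a partition from the message key with a stable hash (CRC32 stand-in)."""
    return zlib.crc32(key) % num_partitions


class Topic:
    """Hypothetical in-memory topic: a list of append-only partitions."""

    def __init__(self, num_partitions: int) -> None:
        self.partitions = [[] for _ in range(num_partitions)]

    def send(self, key: bytes, value: str) -> tuple:
        """Append the message to exactly one partition; return (partition, offset)."""
        partition = choose_partition(key, len(self.partitions))
        log = self.partitions[partition]
        log.append((key, value))
        return partition, len(log) - 1


# Three events with the same key land in the same partition, in send order.
topic = Topic(num_partitions=3)
placements = [topic.send(b"order-42", f"event-{i}") for i in range(3)]
```

Because the partition is derived from the key alone, all events for one key share a partition, which is what makes the per-partition ordering guarantee useful in practice.
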
  6. Kafka concepts: Topics & partitions with Producers
    [Figure: a Producer appends messages to Partition 0, Partition 1 and Partition 2; within each partition, offsets are numbered from 0 (old) upward (new)]
  7. Kafka concepts: Topics & partitions with Consumers
    [Figure: a Consumer reads messages from Partition 0, Partition 1 and Partition 2; within each partition, offsets are numbered from 0 (old) upward (new)]
  8. Kafka concepts: Consumer Groups
    [Figure: consumer groups Group1 and Group2 (consumers C1 to C6) read from a cluster with Broker 1 (T1 - P1, T1 - P2) and Broker 2 (T1 - P3, T1 - P4)]
    • Logical grouping of consumers
      ◦ The group receives the message...
      ◦ Consumer might have a partition assigned
    • Separate scaling of groups
      ◦ Scaling on use-case…
        ▪ Non-time-sensitive (down)
        ▪ Time-sensitive (up)
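
How partitions might be spread over the consumers of a group can be illustrated with a simple round-robin assignment. This is a sketch under stated assumptions: `assign_partitions` and the consumer names are hypothetical, and Kafka's actual assignment strategies are more involved than this.

```python
def assign_partitions(partitions, consumers):
    """Give each partition to exactly one consumer of a group, round-robin."""
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


partitions = ["T1-P1", "T1-P2", "T1-P3", "T1-P4"]

# A small group: every consumer handles two partitions.
small_group = assign_partitions(partitions, ["A1", "A2"])

# A group with more consumers than partitions: two consumers stay idle.
large_group = assign_partitions(partitions, ["B1", "B2", "B3", "B4", "B5", "B6"])
```

Each group is assigned independently, so both groups see every partition. Within a group, adding consumers beyond the partition count does not add throughput, which is why a consumer only "might" have a partition assigned.
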
  9. Kafka concepts: High availability
    [Figure: Broker 1, Broker 2 and Broker 3 each hold a replica of T1 - P1, T1 - P2, T2 - P1 and T2 - P2]
    • Leaders and followers are spread across the cluster
  10. Kafka concepts: High availability
    [Figure: Broker 1, Broker 2 and Broker 3 each hold a replica of T1 - P1, T1 - P2, T2 - P1 and T2 - P2]
    • If a broker hosting a leader partition goes down, a new leader is elected on a different node
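
The failover behaviour can be sketched as a function that re-elects leaders after a broker failure. This is a simplified model, not Kafka's election procedure: `elect_leaders` is a hypothetical name, the replica layout is made up for illustration, and the sketch simply promotes the first surviving replica.

```python
def elect_leaders(replicas, leaders, failed_broker):
    """Return the leader of each partition after `failed_broker` goes down."""
    new_leaders = {}
    for partition, leader in leaders.items():
        if leader != failed_broker:
            new_leaders[partition] = leader  # unaffected partition keeps its leader
            continue
        # Assumption: promote the first replica that is still alive.
        survivors = [b for b in replicas[partition] if b != failed_broker]
        new_leaders[partition] = survivors[0] if survivors else None
    return new_leaders


# Hypothetical layout: every partition is replicated on brokers 1, 2 and 3,
# and the first broker in each list starts as the leader.
replicas = {
    "T1-P1": [1, 2, 3],
    "T1-P2": [2, 3, 1],
    "T2-P1": [3, 1, 2],
    "T2-P2": [1, 2, 3],
}
leaders = {partition: brokers[0] for partition, brokers in replicas.items()}
after_failure = elect_leaders(replicas, leaders, failed_broker=1)
```
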
  11. Apache Kafka on OpenShift: What is Strimzi?
    • Open source project focused on running Apache Kafka on Kubernetes and OpenShift
    • Provides:
      ◦ Docker images for running Apache Kafka and Zookeeper
      ◦ Tooling for managing and configuring Apache Kafka clusters and topics
    • Follows the Kubernetes “operator” model
    • Licensed under Apache License 2.0
    • Web site: http://strimzi.io/
    • GitHub: https://github.com/strimzi
    • Slack: strimzi.slack.com
    • Mailing list: [email protected]
    • Twitter: @strimziio
  12. Apache Kafka on OpenShift: The challenges
    • Apache Kafka is *stateful*, which means we require …
      ◦ … a stable broker identity
      ◦ … a way for the brokers to discover each other on the network
      ◦ … durable broker state (i.e., the messages)
      ◦ … the ability to recover broker state after a failure
    • All the above are true for Apache Zookeeper as well
    • StatefulSets, PersistentVolumeClaims, Services can help but …
  13. Strimzi on OpenShift: Goals
    • Simplifying the Apache Kafka deployment on OpenShift
    • Using the OpenShift native mechanisms for...
      ◦ Provisioning the cluster
      ◦ Managing the topics
    • … thereby removing the need to use Kafka command-line tools
    • Providing a better integration with applications running on OpenShift
      ◦ microservices, data streaming, event-sourcing, etc.
  14. Strimzi on OpenShift: The “Operator” model
    • An application used to create, configure and manage other complex applications
      ◦ Contains specific domain / application knowledge
    • Controller operates based on input from Config Maps or Custom Resource Definitions
      ◦ User describes the desired state
      ◦ Controller applies this state to the application
    • It watches the *desired* state and the *actual* state …
      ◦ … taking appropriate actions
    [Figure: Observe, Analyze, Act loop]
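
The Observe / Analyze / Act loop can be sketched as a reconcile function over two dictionaries. This is a minimal illustration, not Strimzi's controller: `analyze`, `act` and `reconcile` are hypothetical names, and the "cluster" is just an in-memory dict of topic specs.

```python
def analyze(desired, actual):
    """Compare desired and actual state; return the actions needed to converge."""
    actions = []
    for name, spec in desired.items():
        if name not in actual:
            actions.append(("create", name, spec))
        elif actual[name] != spec:
            actions.append(("update", name, spec))
    for name in actual:
        if name not in desired:
            actions.append(("delete", name, None))
    return actions


def act(actions, actual):
    """Apply the actions to the (in-memory) actual state."""
    for verb, name, spec in actions:
        if verb == "delete":
            del actual[name]
        else:
            actual[name] = spec


def reconcile(desired, actual):
    """One pass of the loop: observe both states, analyze the diff, act on it."""
    actions = analyze(desired, actual)
    act(actions, actual)
    return actions


desired = {"orders": {"partitions": 3}, "payments": {"partitions": 1}}
actual = {"orders": {"partitions": 1}, "legacy": {"partitions": 2}}
first_pass = reconcile(desired, actual)   # update, create and delete actions
second_pass = reconcile(desired, actual)  # nothing left to do
```

Running the loop a second time yields no actions, which is the property that lets a controller re-run safely whenever it observes a change.
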
  15. Strimzi on OpenShift: Config Map versus Custom Resource Definitions
    • Operators currently use Config Maps for configuration
      ◦ Main advantage of Config Maps: no special permissions are needed to install Strimzi/AMQ Streams on OpenShift
    • CRDs have some advantages as well
      ◦ Flexible data structure
      ◦ Possibility to set permissions for the CRD resources
    • Support for CRDs is on the backlog for the future
  16. Topic Operator: Creating and managing Kafka topics
    [Figure: the Topic Operator performs a 3-way diff between the Config Map, the Kafka topics and Zookeeper (the Topic Operator’s own storage)]
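
The slide only names a "3-way diff" across the Config Map, the Kafka topic and the operator's own storage. One way such a diff could work is sketched below, assuming the stored copy serves as the common base for both sides. `three_way_merge` is a hypothetical name, and the conflict policy (the Kafka side wins) is an assumption made for illustration, not a statement about Strimzi's behaviour.

```python
def three_way_merge(base, config_map, kafka):
    """Merge two edited copies of a topic config against their common base.

    Returns (merged_config, conflicting_keys).
    """
    merged, conflicts = {}, []
    for key in sorted(set(base) | set(config_map) | set(kafka)):
        b, c, k = base.get(key), config_map.get(key), kafka.get(key)
        if c == k:
            value = c            # both sides agree
        elif c == b:
            value = k            # only the Kafka side changed
        elif k == b:
            value = c            # only the Config Map side changed
        else:
            conflicts.append(key)
            value = k            # assumption: Kafka wins on a true conflict
        if value is not None:    # None means the key was removed
            merged[key] = value
    return merged, conflicts


base = {"partitions": 3, "retention.ms": 1000, "cleanup.policy": "delete"}
config_map = {"partitions": 3, "retention.ms": 2000}  # edited retention, removed policy
kafka = {"partitions": 6, "retention.ms": 1000, "cleanup.policy": "delete"}  # more partitions
merged, conflicts = three_way_merge(base, config_map, kafka)
```

With a base copy available, independent changes on both sides can be combined instead of one side blindly overwriting the other, which a plain two-way comparison cannot do.
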
  17. Apache Kafka and MicroProfile: CDI to the rescue
    • Kafka library is easy to integrate
    • Wiring of Producers/Consumers with CDI
    • Contexts and Dependency Injection (CDI) for the Java EE* platform
      ◦ Contexts
        ▪ The ability to bind the lifecycle and interactions of stateful components to well-defined but extensible lifecycle contexts
      ◦ Dependency Injection
        ▪ The ability to inject components into an application in a typesafe way, including the ability to choose at deployment time which implementation of a particular interface to inject
    • THE FOUNDATION for frameworks, extensions and integration with other technologies!
  18. Outlook: KStreams API
    • KStream API
      ◦ Based on vanilla Apache Kafka client API
      ◦ Processor API
      ◦ DSL / functional programming for filter/map/reduce streams
      ◦ No external Streaming cluster needed ...
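
The filter/map/reduce style mentioned on this slide can be illustrated with plain Python over a finite list of records. This is only an analogy for the shape of such a pipeline, not the Kafka Streams API: `total_per_key` is a hypothetical name, and a real stream would be unbounded and processed incrementally.

```python
from functools import reduce


def total_per_key(records, min_amount):
    """Filter small amounts, convert to cents, then sum per key."""
    # filter: drop records below the threshold
    large = filter(lambda kv: kv[1] >= min_amount, records)
    # map: convert each amount to cents
    in_cents = map(lambda kv: (kv[0], kv[1] * 100), large)

    # reduce: fold the records into a running total per key
    def add(totals, kv):
        key, cents = kv
        totals[key] = totals.get(key, 0) + cents
        return totals

    return reduce(add, in_cents, {})


records = [("alice", 5), ("bob", 1), ("alice", 7), ("bob", 12)]
totals = total_per_key(records, min_amount=5)
```
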
  19. Outlook II: A rich ecosystem for Apache Kafka
    • Eclipse Vert.x
      ◦ Reactive wrappers for Apache Kafka client API
    • Debezium.io: CDC platform
      ◦ Not just Apache Kafka! Gunnar knows more
      ◦ Contains KafkaCluster for unit tests
    • AMQP-Kafka-Bridge (Strimzi)
    • ...
  20. Resources
    • Strimzi: http://strimzi.io/
    • Apache Kafka: https://kafka.apache.org/
    • Kafka-CDI: https://github.com/aerogear/kafka-cdi
    • Demo: https://github.com/matzew/kafka-presentation