Event-driven microservices with Apache Kafka

Running event-driven microservices with Apache Kafka on Kubernetes and OpenShift. The talk presents Strimzi, which provides a "Kubernetes Operator" to manage your Apache Kafka cluster on Kubernetes or OpenShift.

http://strimzi.io/

Materials: https://github.com/matzew/kafka-presentation

Matthias Wessendorf

June 08, 2018

Transcript

  1. Enterprise Java Standards History

    [Timeline figure, "Release Cadence": J2EE 1.2, J2EE 1.3, J2EE 1.4, Java EE 5, Java EE 6, Java EE 7 and Java EE 8 on an axis from 2000 to 2020]
  2. MicroProfile Background

    • Began as a collection of independent discussions
      ◦ Many innovative “microservices” efforts in existing Java EE projects
        ▪ WildFly Swarm (now: Thorntail)
        ▪ WebSphere Liberty
        ▪ Payara
        ▪ TomEE
      ◦ Projects already leveraging both Java EE and non-Java EE technologies
      ◦ Creating new features/capabilities to address microservices architectures
    • Quickly realized there is common ground
    • Java EE technologies are already being used for microservices, but we can do better
  3. MicroProfile Release Philosophy

    [Diagram labels: Build consensus; Standardize; Rapidly iterate and innovate. Release 1.0 (Sept 2016): JAX-RS, CDI, JSON-P]
  4. What is Apache Kafka?

    • A publish/subscribe messaging system?
    • A streaming data platform?
    • A distributed, horizontally scalable, fault-tolerant commit log?
  5. Apache Kafka Concepts

    • Messages are sent to and received from a topic
      ◦ Topics are split into one or more partitions (aka shards)
      ◦ All actual work is done at the partition level; the topic is just a virtual object
    • Each message is written to exactly one selected partition
      ◦ Partitioning is usually based on the message key
      ◦ Message ordering within a partition is fixed
    • Retention
      ◦ Based on size / message age
      ◦ Compacted based on message key
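
The key-based partitioning mentioned on this slide can be illustrated with a minimal Java sketch. It assumes a plain array hash as a stand-in for the hash used by the real Kafka client, and the class and method names (`KeyPartitioner`, `partitionFor`) are hypothetical. The point it shows is that equal keys always land in the same partition, which is what makes per-key ordering possible.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Illustrative only: maps a message key to one partition of a topic. */
final class KeyPartitioner {

    /**
     * Returns the partition index for a key.
     * Assumption: Arrays.hashCode stands in for the hash function
     * that the real Kafka client applies to the serialized key.
     */
    static int partitionFor(String key, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        // floorMod keeps the result in [0, numPartitions) even for negative hashes.
        return Math.floorMod(Arrays.hashCode(keyBytes), numPartitions);
    }

    public static void main(String[] args) {
        int partitions = 3;
        // "order-1" appears twice and is routed to the same partition both times.
        for (String key : new String[] {"order-1", "order-2", "order-1"}) {
            System.out.println(key + " -> partition " + partitionFor(key, partitions));
        }
    }
}
```

Because the mapping depends on the partition count, changing the number of partitions of a topic would route existing keys differently in this sketch.
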
  6. Kafka concepts: Topics & partitions with Producers

    [Figure: a Producer appending to Partition 0 (offsets 0 to 11), Partition 1 (offsets 0 to 6) and Partition 2 (offsets 0 to 10); offsets run from old to new]
  7. Kafka concepts: Topics & partitions with Consumers

    [Figure: a Consumer reading from Partition 0 (offsets 0 to 11), Partition 1 (offsets 0 to 6) and Partition 2 (offsets 0 to 10); offsets run from old to new]
  8. Kafka concepts: Consumer Groups

    [Figure: a Cluster with Broker 1 (T1 - P1, T1 - P2) and Broker 2 (T1 - P3, T1 - P4), consumed by Group1 (C1, C2) and Group2 (C3, C4, C5, C6)]

    • Logical grouping of consumers
      ◦ The group receives the message...
      ◦ Consumer might have a partition assigned
    • Separate scaling of groups
      ◦ Scaling on use-case…
        ▪ Non-time-sensitive (down)
        ▪ Time-sensitive (up)
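
To make "a consumer might have a partition assigned" concrete, here is a minimal Java sketch that spreads the partitions of one topic over the members of one consumer group. The round-robin rule and the names (`GroupAssignment`, `assign`) are assumptions for illustration, not the assignment strategy Kafka itself applies, and partitions are numbered from 0 here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: spreads topic partitions over the consumers of one group. */
final class GroupAssignment {

    /**
     * Assigns partitions 0..numPartitions-1 to consumers in round-robin order.
     * Each partition gets exactly one owner within the group; consumers beyond
     * the partition count end up with an empty list (they stay idle).
     */
    static Map<String, List<Integer>> assign(List<String> consumers, int numPartitions) {
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String consumer : consumers) {
            assignment.put(consumer, new ArrayList<>());
        }
        if (consumers.isEmpty()) {
            return assignment;
        }
        for (int partition = 0; partition < numPartitions; partition++) {
            String owner = consumers.get(partition % consumers.size());
            assignment.get(owner).add(partition);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Two groups reading the same four-partition topic, scaled differently.
        System.out.println("Group1: " + assign(List.of("C1", "C2"), 4));
        System.out.println("Group2: " + assign(List.of("C3", "C4", "C5", "C6"), 4));
    }
}
```

In this sketch a small group gives each consumer several partitions, while a group with as many consumers as partitions gives each consumer one, which mirrors the "scale down" and "scale up" cases on the slide.
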
  9. Kafka concepts: High availability

    [Figure: Broker 1, Broker 2 and Broker 3, each holding a replica of T1 - P1, T1 - P2, T2 - P1 and T2 - P2]

    Leaders and followers are spread across the cluster.
  10. Kafka concepts: High availability

    [Figure: Broker 1, Broker 2 and Broker 3, each holding a replica of T1 - P1, T1 - P2, T2 - P1 and T2 - P2]

    If a broker holding a leader partition goes down, a new leader partition is elected on a different node.
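
The failover described on this slide can be sketched in a few lines of Java. This is a deliberately simplified model: it picks the first replica whose broker is still alive, whereas a real Kafka cluster also considers whether a replica is in sync. The names `LeaderFailover` and `electLeader` are hypothetical.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

/** Illustrative only: chooses a new leader for one partition after a broker failure. */
final class LeaderFailover {

    /**
     * Returns the first broker in the replica list that is still alive,
     * or an empty Optional if every replica is on a failed broker.
     * Assumption: replica order expresses leader preference.
     */
    static Optional<Integer> electLeader(List<Integer> replicas, Set<Integer> liveBrokers) {
        return replicas.stream().filter(liveBrokers::contains).findFirst();
    }

    public static void main(String[] args) {
        List<Integer> replicasOfT1P1 = List.of(1, 2, 3);
        // Broker 1 was the leader and has gone down; brokers 2 and 3 remain.
        System.out.println("New leader: " + electLeader(replicasOfT1P1, Set.of(2, 3)));
    }
}
```
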
  11. Apache Kafka on OpenShift: What is Strimzi?

    • Open source project focused on running Apache Kafka on Kubernetes and OpenShift
    • Provides:
      ◦ Docker images for running Apache Kafka and Zookeeper
      ◦ Tooling for managing and configuring Apache Kafka clusters and topics
    • Follows the Kubernetes “operator” model
    • Licensed under Apache License 2.0
    • Web site: http://strimzi.io/
    • GitHub: https://github.com/strimzi
    • Slack: strimzi.slack.com
    • Mailing list: [email protected]
    • Twitter: @strimziio
  12. Apache Kafka on OpenShift: The challenges

    • Apache Kafka is *stateful*, which means we require …
      ◦ … a stable broker identity
      ◦ … a way for the brokers to discover each other on the network
      ◦ … durable broker state (i.e., the messages)
      ◦ … the ability to recover broker state after a failure
    • All the above are true for Apache Zookeeper as well
    • StatefulSets, PersistentVolumeClaims, Services can help but …
  13. Strimzi on OpenShift: Goals

    • Simplifying the Apache Kafka deployment on OpenShift
    • Using the OpenShift native mechanisms for...
      ◦ Provisioning the cluster
      ◦ Managing the topics
    • … thereby removing the need to use Kafka command-line tools
    • Providing a better integration with applications running on OpenShift
      ◦ microservices, data streaming, event-sourcing, etc.
  14. Strimzi on OpenShift: The “Operator” model

    • An application used to create, configure and manage other complex applications
      ◦ Contains specific domain / application knowledge
    • Controller operates based on input from Config Maps or Custom Resource Definitions
      ◦ User describes the desired state
      ◦ Controller applies this state to the application
    • It watches the *desired* state and the *actual* state …
      ◦ … taking appropriate actions

    [Diagram: Observe, Analyze, Act]
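
The observe/analyze/act cycle can be illustrated with a minimal Java sketch of a single reconciliation step. It models only one piece of state, the number of broker replicas, and returns the actions as strings instead of calling any Kubernetes API. The names `ReconcileSketch` and `reconcile` and the action strings are hypothetical and are not taken from Strimzi.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: one step of an operator-style reconciliation loop. */
final class ReconcileSketch {

    /**
     * Compares the desired replica count (what the user declared) with the
     * actual count (what was observed) and returns the actions that would
     * bring the actual state in line with the desired state.
     */
    static List<String> reconcile(int desiredReplicas, int actualReplicas) {
        List<String> actions = new ArrayList<>();
        // Too few brokers: start the missing ones, lowest index first.
        for (int i = actualReplicas; i < desiredReplicas; i++) {
            actions.add("start broker-" + i);
        }
        // Too many brokers: stop the surplus ones, highest index first.
        for (int i = actualReplicas - 1; i >= desiredReplicas; i--) {
            actions.add("stop broker-" + i);
        }
        return actions;
    }

    public static void main(String[] args) {
        System.out.println(reconcile(3, 1)); // scale up
        System.out.println(reconcile(1, 3)); // scale down
        System.out.println(reconcile(2, 2)); // already converged, nothing to do
    }
}
```

A real operator would run such a step repeatedly, re-observing the cluster each time, so that a failed action is simply retried on the next pass.
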
  15. Strimzi on OpenShift: Config Map versus Custom Resource Definitions

    • Operators are currently using Config Maps for configuration
      ◦ Main advantage of Config Maps is that no special permissions are needed to install Strimzi/AMQ Streams on OpenShift
    • CRDs have some advantages as well
      ◦ Flexible data structure
      ◦ Possibility to set permissions for the CRD resources
    • Support for CRDs is on the backlog for the future
  16. Topic Operator: Creating and managing Kafka topics

    [Diagram: the Topic Operator performs a 3-way diff between Config Maps, Kafka topics, and Zookeeper (the Topic Operator’s own storage)]
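
One way to read the "3-way diff" on this slide is sketched below in Java. The sketch assumes that the operator's own storage holds the last state it reconciled and that this copy is used to tell which side changed. It compares a single value, a partition count, and is not Strimzi's actual algorithm. `ThreeWayDiff`, `Action` and `decide` are hypothetical names.

```java
/** Illustrative only: a three-way comparison for a single topic setting. */
final class ThreeWayDiff {

    enum Action { IN_SYNC, APPLY_TO_KAFKA, APPLY_TO_CONFIG_MAP, CONFLICT }

    /**
     * Decides which side to update for one value, for example a partition count.
     *
     * @param stored    last value recorded in the operator's own storage
     * @param configMap value currently declared in the Config Map
     * @param kafka     value currently reported by Kafka
     */
    static Action decide(int stored, int configMap, int kafka) {
        boolean configMapChanged = configMap != stored;
        boolean kafkaChanged = kafka != stored;
        if (!configMapChanged && !kafkaChanged) {
            return Action.IN_SYNC;
        }
        if (configMapChanged && !kafkaChanged) {
            return Action.APPLY_TO_KAFKA;      // the user edited the Config Map
        }
        if (!configMapChanged) {
            return Action.APPLY_TO_CONFIG_MAP; // the topic was changed in Kafka directly
        }
        // Both sides changed: fine if they agree, otherwise it needs a policy decision.
        return configMap == kafka ? Action.IN_SYNC : Action.CONFLICT;
    }

    public static void main(String[] args) {
        System.out.println(decide(3, 6, 3));
        System.out.println(decide(3, 3, 6));
        System.out.println(decide(3, 6, 9));
    }
}
```

The stored copy is what distinguishes this from a plain two-way comparison: without it, a mismatch between the Config Map and Kafka would not reveal which of the two was edited.
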
  17. Apache Kafka and MicroProfile: CDI to the rescue

    • Kafka library is easy to integrate
    • Wiring of Producers/Consumers with CDI
    • Contexts and Dependency Injection (CDI) for the Java EE* platform
      ◦ Contexts
        ▪ The ability to bind the lifecycle and interactions of stateful components to well-defined but extensible lifecycle contexts
      ◦ Dependency Injection
        ▪ The ability to inject components into an application in a typesafe way, including the ability to choose at deployment time which implementation of a particular interface to inject
    • THE FOUNDATION for frameworks, extensions and integration with other technologies!
  18. Outlook - KStreams API

    • KStream API
      ◦ Based on vanilla Apache Kafka client API
      ◦ Processor API
      ◦ DSL / functional programming for filter/map/reduce streams
      ◦ No external Streaming cluster needed
    ...
  19. Outlook II: A rich ecosystem for Apache Kafka

    • Eclipse Vert.x
      ◦ Reactive wrappers for Apache Kafka client API
    • Debezium.io: CDC platform
      ◦ Not just Apache Kafka! Gunnar knows more
      ◦ Contains KafkaCluster for unit tests
    • AMQP-Kafka-Bridge (Strimzi)
    • ...
  20. Resources

    • Strimzi: http://strimzi.io/
    • Apache Kafka: https://kafka.apache.org/
    • Kafka-CDI: https://github.com/aerogear/kafka-cdi
    • Demo: https://github.com/matzew/kafka-presentation