Kafka Streams for Event-Driven Microservices

Kafka Streams makes it easy to build Java™ or Scala applications that interact with Kafka clusters, providing features that have traditionally been available in streaming platforms as part of standalone applications. Enterprises around the world use it to build solutions for data streaming, real-time analytics, or event-driven architecture.

In this session, we’ll introduce the Kafka Streams application programming interface (API) and the Kafka Streams processing engine. We’ll show how to write and deploy Kafka Streams applications and how to take advantage of the enterprise Kubernetes platform, OpenShift®, for deploying microservice-based, event-driven, and data streaming solutions.

Marius Bogoevici

October 18, 2018

Transcript

  1. Kafka Streams for Event-Driven Microservices

    DevNation Live - Oct 18, 2018
    Marius Bogoevici, Principal Specialist Solution Architect, @mariusbogoevici
  2. Marius Bogoevici

    • Principal Specialist Solution Architect at Red Hat
      ◦ Specialize in Integration/Messaging/Data Streaming
    • OSS contributor since 2008
      ◦ Spring Integration
      ◦ Spring XD, Spring Integration Kafka
      ◦ Former Spring Cloud Stream project lead
    • Co-author “Spring Integration in Action”, Manning, 2012
  3. Why event-driven microservices?

    • Event-driven architecture reduces friction
      ◦ From a technical standpoint:
        ▪ Building robust and resilient distributed architectures
      ◦ From a development process standpoint:
        ▪ Composability encourages agility and experimentation
      ◦ From a business standpoint:
        ▪ Aligns digital business with the real world
    • Microservice use cases
      ◦ State propagation: CDC/Event Sourcing/CQRS
      ◦ Event-driven DDD
  4. Coordinating state is hard

  5. Coordinating events and state is harder
  6. Kafka as a distributed messaging system

    How about applications that are both producers and consumers and perform complex computations?
  7. Kafka Streams Overview

    • Client library for stream processing
      ◦ Embed stream processing features into regular Java applications
      ◦ Create sophisticated topologies of independent applications
      ◦ One-record-at-a-time processing (no microbatching)
    • Kafka-to-Kafka semantics
      ◦ Event/State management coordination
      ◦ Stateful processing support
      ◦ Transactions/exactly once

    [Diagram labels: Kafka Cluster, Application, Kafka Streams, Events, State]
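  8. Kafka Streams - high-level functional DSL

    // timeWindows is assumed to be a TimeWindows instance defined elsewhere
    KStream<String, String> words = builder.stream("words");
    KTable<Windowed<String>, Long> countsTable = words
        // split each line into lower-case words
        .flatMapValues(value -> Arrays.asList(value.toLowerCase().split("\\W+")))
        // re-key each record by the word itself
        .map((key, value) -> new KeyValue<>(value, value))
        .groupByKey(Serdes.String(), Serdes.String())
        // count occurrences of each word per time window
        .count(timeWindows, "WordCounts");
    KStream<Windowed<String>, Long> counts = countsTable.toStream();
    counts.to("counts");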
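    The windowed count on this slide can be pictured without a Kafka cluster. The following is a minimal sketch in plain Java, not the Kafka Streams API: the class name `WindowedWordCount`, the tumbling-window arithmetic, the `word@windowStart` key format, and the sample timestamps are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java model of a windowed word count; this is NOT the Kafka Streams API.
class WindowedWordCount {

    // Tumbling windows: each (non-negative) timestamp falls into exactly one
    // window, identified here by the window's start time.
    static long windowStart(long timestampMs, long windowSizeMs) {
        return timestampMs - (timestampMs % windowSizeMs);
    }

    // Tokenization as on the slide (lower-case, split on non-word characters),
    // with empty tokens dropped.
    static List<String> words(String line) {
        List<String> result = new ArrayList<>();
        for (String w : line.toLowerCase().split("\\W+")) {
            if (!w.isEmpty()) {
                result.add(w);
            }
        }
        return result;
    }

    // Hypothetical key format: "word@windowStart" -> count.
    private final Map<String, Long> counts = new TreeMap<>();
    private final long windowSizeMs;

    WindowedWordCount(long windowSizeMs) {
        this.windowSizeMs = windowSizeMs;
    }

    // Processes one record at a time, updating the count for each word
    // in the window that the record's timestamp belongs to.
    void process(long timestampMs, String line) {
        long start = windowStart(timestampMs, windowSizeMs);
        for (String word : words(line)) {
            counts.merge(word + "@" + start, 1L, Long::sum);
        }
    }

    Map<String, Long> counts() {
        return counts;
    }

    public static void main(String[] args) {
        WindowedWordCount wc = new WindowedWordCount(60_000L); // one-minute windows
        wc.process(1_000L, "Hello Kafka");
        wc.process(30_000L, "hello streams");
        wc.process(61_000L, "Hello again");   // lands in the next window
        System.out.println(wc.counts());
    }
}
```

    In this model the same word is counted separately in each window, which is why the slide's count result is keyed by both the word and the window.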
  9. Key Kafka Streams abstractions

    • KStream
      ◦ Record stream abstraction
      ◦ Read from/written to external topic as is
    • KTable/GlobalKTable
      ◦ Key/Value map abstraction
      ◦ Read from/written to topic as a sequence of updates based on record key
      ◦ Complex operations: joins, aggregations
    • Stream/Table Duality
      ◦ KStream -> KTable - read a stream as a changelog centered around the key
      ◦ KTable -> KStream - table updates are produced as a stream
    • Time windowing for aggregate operations
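    The stream/table duality can be illustrated with ordinary collections. This is a minimal sketch in plain Java, not the Kafka Streams API: `StreamTableDuality`, `Record`, and `Table` are hypothetical names, and a null value is assumed to model a delete.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of stream/table duality; this is NOT the Kafka Streams API.
class StreamTableDuality {

    // A keyed record; a null value is assumed to model a delete.
    static final class Record {
        final String key;
        final String value;

        Record(String key, String value) {
            this.key = key;
            this.value = value;
        }

        @Override
        public String toString() {
            return key + "=" + value;
        }
    }

    // Stream -> table: replay the changelog, keeping only the latest value per key.
    static Map<String, String> toTable(List<Record> changelog) {
        Map<String, String> table = new LinkedHashMap<>();
        for (Record r : changelog) {
            if (r.value == null) {
                table.remove(r.key);
            } else {
                table.put(r.key, r.value);
            }
        }
        return table;
    }

    // Table -> stream: every update applied to the table is also emitted as a record.
    static final class Table {
        private final Map<String, String> state = new LinkedHashMap<>();
        private final List<Record> updates = new ArrayList<>();

        void put(String key, String value) {
            state.put(key, value);
            updates.add(new Record(key, value));
        }

        List<Record> toStream() {
            return updates;
        }

        Map<String, String> snapshot() {
            return state;
        }
    }

    public static void main(String[] args) {
        List<Record> changelog = Arrays.asList(
                new Record("alice", "berlin"),
                new Record("bob", "paris"),
                new Record("alice", "rome"));
        System.out.println("table from stream: " + toTable(changelog));

        Table table = new Table();
        table.put("alice", "berlin");
        table.put("alice", "rome");
        System.out.println("stream from table: " + table.toStream());
    }
}
```

    Replaying the stream produced by `Table.toStream()` through `toTable` yields the table's current contents again, which is the round trip the duality refers to.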
  10. Kafka Streams and Kubernetes

    • Kubernetes makes running complex topologies reliable, transparent and boring
    • Not only for applications, but also messaging infrastructure
    • Built-in resource management
      ◦ Memory, CPU, disk
    • Elastic scaling
    • Monitoring and failover
      ◦ Health, logging, metrics
    • Routing and load balancing
    • Rolling upgrades and CI/CD
    • Namespacing
  11. Strimzi: Provisioning Kafka on Kubernetes

    What is Strimzi?

    • Open source project focused on running Apache Kafka on Kubernetes and OpenShift
    • Available as a part of Red Hat AMQ
    • Licensed under Apache License 2.0
    • Web site: http://strimzi.io/
    • GitHub: https://github.com/strimzi
    • Slack: strimzi.slack.com
    • Mailing list: strimzi@redhat.com
    • Twitter: @strimziio
  12. Cluster Controller

    Creating and managing Apache Kafka clusters

    [Diagram labels: Zookeeper, Kafka, Cluster Controller, Config Map; arrow label: Manages]
  13. Topic Controller

    Creating and managing Kafka topics

    [Diagram labels: Zookeeper, Kafka, Topic Controller, Config Map; arrow label: Manages topics]
  14. Kafka Streams on Kubernetes

    [Diagram: a Kafka Cluster and three Containers, each holding an Application with Kafka Streams; each container's connections to the cluster are labelled "changelog" and "events"]
  15. Kafka Streams: stateful and stateless deployments

    • Changes propagated to changelog topic
    • Stored locally for recovery/restart
    • Fully stateless deployments require replaying the topic on restart/failover
    • State store recovery can be optimized by providing access to stateful deployments

    [Diagram labels: Kafka Cluster, Application, Kafka Streams, In-memory state store, Local disk, changelog, events]
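    The difference between the two recovery paths can be modelled in a few lines. This is a minimal sketch in plain Java, not the Kafka Streams API: `ChangelogBackedStore`, the in-memory list standing in for the changelog topic, and the "surviving state plus offset" constructor arguments are assumptions made for illustration.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of a changelog-backed state store; this is NOT the Kafka Streams API.
class ChangelogBackedStore {
    private final List<Map.Entry<String, Long>> changelog; // stands in for the changelog topic
    private final Map<String, Long> local;                 // stands in for the local state store
    private final int replayed;                            // records replayed during startup

    // survivingState and survivingOffset model what was left on local disk.
    // A stateless deployment starts with an empty map and offset 0.
    ChangelogBackedStore(List<Map.Entry<String, Long>> changelog,
                         Map<String, Long> survivingState,
                         int survivingOffset) {
        this.changelog = changelog;
        this.local = new HashMap<>(survivingState);
        // Replay only the changelog records not yet reflected in the surviving state.
        for (int i = survivingOffset; i < changelog.size(); i++) {
            Map.Entry<String, Long> entry = changelog.get(i);
            local.put(entry.getKey(), entry.getValue());
        }
        this.replayed = changelog.size() - survivingOffset;
    }

    // Every change is applied locally and propagated to the changelog.
    void put(String key, Long value) {
        local.put(key, value);
        changelog.add(new AbstractMap.SimpleImmutableEntry<String, Long>(key, value));
    }

    Long get(String key) {
        return local.get(key);
    }

    int replayedOnStartup() {
        return replayed;
    }

    Map<String, Long> snapshot() {
        return new HashMap<>(local);
    }

    int offset() {
        return changelog.size();
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Long>> topic = new ArrayList<>();
        ChangelogBackedStore first =
                new ChangelogBackedStore(topic, Collections.<String, Long>emptyMap(), 0);
        first.put("kafka", 1L);
        first.put("streams", 1L);
        first.put("kafka", 2L);
        Map<String, Long> onDisk = first.snapshot(); // state that survives on a volume
        int onDiskOffset = first.offset();
        first.put("openshift", 1L);                  // written after the snapshot

        ChangelogBackedStore stateless =
                new ChangelogBackedStore(topic, Collections.<String, Long>emptyMap(), 0);
        ChangelogBackedStore stateful =
                new ChangelogBackedStore(topic, onDisk, onDiskOffset);
        System.out.println("stateless restart replayed " + stateless.replayedOnStartup() + " records");
        System.out.println("stateful restart replayed " + stateful.replayedOnStartup() + " records");
    }
}
```

    Both restarts end with the same contents. The stateful one replays only the changelog records written after its surviving local state, which is the optimization the last bullet describes.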
  16. Kafka Streams with Kubernetes StatefulSets

    [Diagram: three Pods (word-count-0, word-count-1, word-count-2), each holding an Application with Kafka Streams, and three volumes (volume-word-count-0, volume-word-count-1, volume-word-count-2)]
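    One way an application in such a pod might derive its Kafka Streams configuration is sketched below, using only the JDK. The keys `application.id`, `bootstrap.servers`, and `state.dir` are Kafka Streams configuration names. Everything else is an assumption for illustration: the class `PodStreamsConfig`, the bootstrap address `my-cluster-kafka:9092`, the mount path `/var/lib/kafka-streams`, the convention of deriving the application id from the pod name, and reading the pod name from the `HOSTNAME` environment variable.

```java
import java.util.Properties;

// Sketch of per-pod Kafka Streams configuration for a StatefulSet; the names
// and paths are hypothetical, and only java.util.Properties is used.
class PodStreamsConfig {

    // StatefulSet pods are named "<set-name>-<ordinal>", e.g. "word-count-1".
    static int ordinal(String podName) {
        return Integer.parseInt(podName.substring(podName.lastIndexOf('-') + 1));
    }

    // Assumption: the application id is the pod name without its ordinal,
    // so that all pods of the set share one id.
    static String applicationId(String podName) {
        return podName.substring(0, podName.lastIndexOf('-'));
    }

    static Properties forPod(String podName, String bootstrapServers, String volumeMountPath) {
        Properties props = new Properties();
        props.put("application.id", applicationId(podName));
        props.put("bootstrap.servers", bootstrapServers);
        // Keep local state on the pod's own volume so it can be reused after a restart.
        props.put("state.dir", volumeMountPath);
        return props;
    }

    public static void main(String[] args) {
        // Assumption: the container exposes the pod name as HOSTNAME.
        String podName = System.getenv("HOSTNAME");
        if (podName == null || podName.lastIndexOf('-') < 0) {
            podName = "word-count-0"; // fallback for running outside Kubernetes
        }
        Properties props = forPod(podName, "my-cluster-kafka:9092", "/var/lib/kafka-streams");
        System.out.println(props);
    }
}
```

    The resulting `Properties` object would be handed to the Kafka Streams application at startup. Because each pod keeps the same name and volume across restarts, the state directory it finds is the one it wrote before.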
  17. THANK YOU

    plus.google.com/+RedHat
    linkedin.com/company/red-hat
    youtube.com/user/RedHatVideos
    facebook.com/redhatinc
    twitter.com/RedHat