
"I need my data and a little bit of your data." - Integrating services with Apache Kafka (Confluent Streaming Event Munich)

René Kerner
October 09, 2018


We split our monoliths into services. Every service has its own bounded context, with its own data in its own datastore. But at some point, to do its job, a service needs additional data from other services or datastores. I'll show you how to integrate them with Apache Kafka and make parts of the system reactive.
All integrated and reactive everywhere!

René Kerner

October 09, 2018


Transcript

  1. “I need my data and a little bit

    of your data.”: Integrating services with Apache Kafka (CSE Edition) René Kerner, Senior Consultant at codecentric since 06/2018, former Software-Engineer/Architect at trivago 11/2011 - 06/2018 Mail: [email protected] || Twitter: @rk3rn3r https://lparchive.org/The-Secret-of-Monkey-Island/Update%201/1-somi_001.gif
  2. René Kerner 2 - >9 years Software-Engineer and Software-Architect -

    ~2 years of experience with Apache Kafka, CDC, Protobuf, Data Streaming - 6.5 yrs Software-Engineer/-Architect at trivago - since June 2018 Senior Consultant at codecentric - Software-Architecture, Distributed Systems - Streaming, Reactive Systems - Cloud, DevOps, Ops, Virtualisation - Development: Java, Kotlin, Spring Framework, PHP, Bash @rk3rn3r
  3. 3 "I need my data and a little bit

    of your data." Integrating Services with Apache Kafka
  4. Integrating Services with Apache Kafka 4 “I need my data

    and a little bit of your data.” - Tim Berglund, Confluent
  5. My Data - Your Data… huh?! 5 Imagine a web

    e-commerce shop architecture - Web UI / GUI - Web backend - Services for - Products - Orders - Inventory/stock - Payment - Shipping
  6. Example: Dependencies of a web shop in a service world

    6 - Web UI / GUI talks to its backend service
  7. Web Shop Backend Dependencies 7 - Web UI / GUI

    talks to its backend service - The backend needs to talk to a lot of other services
  8. Orders Service Dependencies 8 - Web UI / GUI talks

    to its backend service - The backend needs to talk to a lot of other services - The order service needs some more data too...
  9. Products Service Dependencies 9 - Web UI / GUI talks

    to its backend service - The backend needs to talk to a lot of other services - The order service needs some more data too… - Product service might fetch stocks from stock service
  10. Payment checks stocks and shipping 10 - Web UI /

    GUI talks to its backend service - The backend needs to talk to a lot of other services - The order service needs some more data too… - Product service might fetch stocks from stock service - Payment checks stock and triggers shipping
  11. Shipping decreases stock 11 - Web UI / GUI talks

    to its backend service - The backend needs to talk to a lot of other services - The order service needs some more data too… - Product service might fetch stocks from stock service - Payment checks stock and triggers shipping - Shipping decreases stock
  12. Example: Dependencies of a Web Shop 12 - DONE! -

    But we left out - Accounting - SEO/SEM - Email/Connectivity - Business Intelligence - and, and, and, … → Loooots of dependencies! - Services need data from other services - Dependencies on data and schema (and maybe behavior)
  13. The Wannabe-Netflix Way / HTTP APIs 22 - Very flexible,

    but with a lot of technical dependencies - Cascading requests, traffic and failures - Availability of a request chain across 500 services, each 99.9% available: → 99.9% ^ 500 ≈ 60.6% - Upstream performance needs (availability, speed) are put on downstream services - sometimes multiplied
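The availability figure on this slide can be checked with a few lines of Java; the chain length (500 services) and per-service availability (99.9%) are the slide's numbers:

```java
public class ChainAvailability {
    public static void main(String[] args) {
        // A request that must traverse 500 services in sequence succeeds
        // only if every single hop succeeds: 0.999^500
        double perService = 0.999;
        int services = 500;
        double overall = Math.pow(perService, services);
        System.out.printf("overall availability: %.1f%%%n", overall * 100); // ≈ 60.6%
    }
}
```

Even with every service at "three nines", roughly two out of five requests across such a chain would hit at least one failure.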
  14. HTTP APIs in distributed systems are HARD! 24 Optimistic Lock,

    Pessimistic Lock, MVCC, Session Pinning, Server Stickiness
  15. 28 Share your data. But without a central database or

    centrally managed integration pattern! https://i.ytimg.com/vi/8eDYVtPwWiM/maxresdefault.jpg
  16. Kafka Integration Scenarios 29 - Direct Integration - Producer →

    Producer API - Consumer → Consumer API - Data Replication - Connector → Connect API - Sink Connector - Source Connector - Reactive Data Transformation - Stream Processor (SP) → Processor API → Streams API - KSQL on top of Streams API
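The direct-integration scenario can be sketched with the plain Producer API. A minimal sketch, assuming a local broker, a topic called `new-orders`, and an invented JSON payload:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class NewOrderProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker address
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.ACKS_CONFIG, "all"); // the producer's work is done after the ACK

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Key = order id, value = order payload (plain JSON string for brevity)
            producer.send(new ProducerRecord<>("new-orders", "order-4711",
                    "{\"productId\":\"p-1\",\"quantity\":2}"));
        }
    }
}
```

A consumer on the other side subscribes to the same topic with the Consumer API; producer and consumer never know about each other, only about the topic.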
  19. Step 1: Split up old DB or HTTP Integration 32

    What can we do here, instead of HTTP? http://cdn.onlinewebfonts.com/svg/img_494692.png
  20. Step 1: Replicate! 33 Instead of an HTTP request to fetch

    the data, capture all changes of the “New Orders” DB table and replicate them into the datastore of the Stock service.
  21. Step 1: Replicate data using Kafka Connect API 34 Instead

    of an HTTP request to fetch the data, capture all changes of the “New Orders” DB table and replicate them into the datastore of the Stock service. Kafka Topic
  22. Step 1: Replicate data using Kafka Connect API 35 You

    could, for example, use Change Data Capture (CDC) with Debezium. Kafka Topic
  23. Change Data Capture (CDC) and Debezium (DBZ) 36 - Classical:

    mark changed rows - hard to handle primary-key changes and deletes - Modern: capture the datastore’s changelog / commit log / replication log - Debezium - supports MySQL - supports PostgreSQL - supports MongoDB - supports Oracle DB - alpha: SQL Server 2016 SP1+ - supports deletes and PK changes - can handle DDL changes
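This step is configured rather than coded. A source-connector configuration along these lines would be POSTed to the Kafka Connect REST API; all hostnames, credentials, and table names here are invented, and the property names follow the Debezium MySQL connector as of 2018:

```json
{
  "name": "orders-source",
  "config": {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "database.hostname": "orders-db",
    "database.port": "3306",
    "database.user": "debezium",
    "database.password": "secret",
    "database.server.id": "184054",
    "database.server.name": "ordersdb",
    "table.whitelist": "shop.new_orders",
    "database.history.kafka.bootstrap.servers": "kafka:9092",
    "database.history.kafka.topic": "dbhistory.orders"
  }
}
```

Debezium then streams every insert, update, and delete on that table as change events into a Kafka topic, from which a sink connector can write into the Stock service's datastore.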
  24. DB lookups from Stock service 40 We still kept the

    database lookup scenarios. Maybe our Stock service recalculates stocks every few hours/minutes. A common scenario (batch, …).
  25. Can we improve? 42 When our stock goes too low,

    wouldn’t it be cool to directly place a new order?
  26. Can we directly process our new state? 43 When our

    stock goes too low, wouldn’t it be cool to directly place a new order? → Directly calculate our new state
  27. Step 2: Make the service reactive using Kafka Consumer API

    47 On every New Order event/dataset our business logic is triggered to look up and update the stock. When the stock is too low, we can directly place a new order.
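A minimal sketch of that poll loop with the Consumer API; broker address, group id, topic name, and the stock-keeping helper are assumptions:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ReactiveStockService {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "stock-service");           // assumed group id
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("new-orders")); // assumed topic name
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    // Business logic runs per event: update stock, reorder when too low
                    updateStockAndMaybeReorder(record.value());
                }
            }
        }
    }

    // hypothetical stock-keeping logic
    static void updateStockAndMaybeReorder(String orderJson) { }
}
```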
  28. And another batch job... 48 Hmmm… This Recommendation service is

    just a cronjob that updates recommendations for the different categories once a day. At night… Can’t we?
  29. Step 3: Update recommendations in a reactive way 49 Use

    the Streams API to recalculate recommendation weights after each order. Sink them into the recommendations database table with a Sink Connector.
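A minimal Streams API sketch of such a continuous recalculation, assuming orders are keyed by product id; topic names and application id are made up. The resulting changelog topic is what a sink connector would write into the recommendations table:

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Produced;

public class RecommendationWeights {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();

        // Count orders per product as a continuously updated KTable
        KTable<String, Long> ordersPerProduct = builder
                .stream("new-orders", Consumed.with(Serdes.String(), Serdes.String()))
                .groupByKey() // assumes the record key is the product id
                .count();

        // Publish the changelog; a sink connector writes it to the DB table
        ordersPerProduct.toStream()
                .to("recommendation-weights", Produced.with(Serdes.String(), Serdes.Long()));

        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "recommendation-weights"); // assumed
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");      // assumed
        new KafkaStreams(builder.build(), props).start();
    }
}
```

A plain count stands in for whatever weighting the Recommendation service actually computes; the point is that the nightly batch job becomes an always-on topology.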
  30. Step 4: Send orders to Orders service 52 - Orders

    service doesn’t need some of the fields of the “New Orders” messages - Some data needs to be processed before it is stored
  31. Step 4: Transform, Filter and Pre-Process Messages 53 - Orders

    service doesn’t need some of the fields of the “New Orders” messages - Some data needs to be processed before it is stored → use the Kafka Streams API → to transform messages and remove unnecessary fields → to make precalculations or filter → store the result in a new Kafka topic and sink it into the datastore
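The transform-and-filter step sketched above maps onto a short Streams topology; topic names, application id, and the field-stripping helper are assumptions:

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Produced;

public class OrdersPreprocessor {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();

        builder.stream("new-orders", Consumed.with(Serdes.String(), Serdes.String()))
                .filter((orderId, json) -> json != null)            // drop tombstones/garbage
                .mapValues(OrdersPreprocessor::stripUnneededFields) // remove fields Orders doesn't need
                .to("orders-preprocessed", Produced.with(Serdes.String(), Serdes.String()));

        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "orders-preprocessor"); // assumed
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");   // assumed
        new KafkaStreams(builder.build(), props).start();
    }

    // hypothetical helper: real code would parse the JSON and rewrite the payload
    static String stripUnneededFields(String orderJson) {
        return orderJson;
    }
}
```

A sink connector on `orders-preprocessed` then lands the cleaned messages in the Orders service's datastore.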
  32. Architectural Pattern: Kafka Integration 55 - Simple and easy, truly

    decoupled architecture - Many problems of a distributed system are handled - Real decoupling of producers and consumers - Producer’s work is done after the ACK - Consumers are free to do whatever they want - Supports independent teams by decoupling readers and writers → CQRS, democratizing data
  33. When to use Connect API? 56 If you connect to

    an external system that’s not able to connect natively to Kafka. Or, if you want to keep your well-known datastore access/lookup behavior (e.g. legacy applications).
  34. When to use Connect API? 57 If your application data

    access scenario is natively a lookup scenario. Incoming messages don’t necessarily change the application state.
  35. When to use Consumer API? 58 When every message that

    comes in must trigger your business logic or is supposed to update your application state. e.g. Stream Processors, Reactive Dashboards, “realtime stuff”
  36. When to use Streams API? 59 When you are going

    to write an application that consumes, processes and produces.
  37. When to use a database? 60 When your application or

    service natively needs a lookup scenario. For example: a user wants to see all products of a specific type.
  38. 63 codecentric helps you! We build, we consult, we

    enable: https://www.codecentric.de