Slide 1

Slide 1 text

Up Next: “I need my data and a little bit of your data.”: Integrating services with Apache Kafka (CSE Edition) René Kerner, Senior Consultant at codecentric since 06/2018, former Software-Engineer/Architect at trivago 11/2011 - 06/2018 Mail: [email protected] || Twitter: @rk3rn3r https://lparchive.org/The-Secret-of-Monkey-Island/Update%201/1-somi_001.gif

Slide 2

Slide 2 text

René Kerner
- >9 years Software-Engineer and Software-Architect
- ~2 years of experience with Apache Kafka, CDC, Protobuf, Data Streaming
- 6.5 years Software-Engineer/-Architect at trivago
- since June 2018 Senior Consultant at codecentric
- Software-Architecture, Distributed Systems
- Streaming, Reactive Systems
- Cloud, DevOps, Ops, Virtualisation
- Development: Java, Kotlin, Spring Framework, PHP, Bash
@rk3rn3r

Slide 3

Slide 3 text

"I need my data and a little bit of your data." Integrating Services with Apache Kafka

Slide 4

Slide 4 text

Integrating Services with Apache Kafka 4 “I need my data and a little bit of your data.” - Tim Berglund, Confluent

Slide 5

Slide 5 text

My Data - Your Data… huh?!
Imagine a web e-commerce shop architecture
- Web UI / GUI
- Web backend
- Services for
  - Products
  - Orders
  - Inventory/stock
  - Payment
  - Shipping

Slide 6

Slide 6 text

Example: Dependencies of a web shop in a service world 6 - Web UI / GUI talks to its backend service

Slide 7

Slide 7 text

Web Shop Backend Dependencies 7 - Web UI / GUI talks to its backend service - The backend needs to talk to a lot of other services

Slide 8

Slide 8 text

Orders Service Dependencies 8 - Web UI / GUI talks to its backend service - The backend needs to talk to a lot of other services - The order service needs some more data too...

Slide 9

Slide 9 text

Products Service Dependencies 9 - Web UI / GUI talks to its backend service - The backend needs to talk to a lot of other services - The order service needs some more data too… - Product service might fetch stocks from stock service

Slide 10

Slide 10 text

Payment checks stocks and shipping 10 - Web UI / GUI talks to its backend service - The backend needs to talk to a lot of other services - The order service needs some more data too… - Product service might fetch stocks from stock service - Payment checks stock and triggers shipping

Slide 11

Slide 11 text

Shipping decreases stock 11 - Web UI / GUI talks to its backend service - The backend needs to talk to a lot of other services - The order service needs some more data too… - Product service might fetch stocks from stock service - Payment checks stock and triggers shipping - Shipping decreases stock

Slide 12

Slide 12 text

Example: Dependencies of a Web Shop
- DONE!
- But we left out
  - Accounting
  - SEO/SEM
  - Email/Connectivity
  - Business Intelligence
  - and, and, and, …
→ Loooots of dependencies!
- Services need data from other services
- Dependencies on data and schema (and maybe behavior)

Slide 13

Slide 13 text

https://i.imgflip.com/1d7btz.jpg

Slide 14

Slide 14 text

First try... 14 Let’s build one or two services...

Slide 15

Slide 15 text

First try... 15 Let’s build one or two services... ...that use a central database!

Slide 16

Slide 16 text

Database Integration 16

Slide 17

Slide 17 text

Database Integration 17 Doesn’t look that bad!

Slide 18

Slide 18 text

Database Integration and its Evolution 18 growth & time natural growth

Slide 19

Slide 19 text

https://i.kinja-img.com/gawker-media/image/upload/s--wl85UrLl--/c_fill,f_auto,fl_progressive,g_center,h_675,q_80,w_1200/i4fvyeoxdyxc4pdwxaob.jpg

Slide 20

Slide 20 text

Second try... 20 Let’s do it like NETFLIX...

Slide 21

Slide 21 text

Second try... 21 Let’s do it like NETFLIX... ...HTTP APIs for the win!

Slide 22

Slide 22 text

The Wannabe-Netflix Way / HTTP APIs
- Very flexible, but with a lot of technical dependencies
- Cascading requests, traffic and failures
- Availability for a chain of 500 services: → 99.9% ^ 500 ≈ 60.6% (at 99% per service: 99% ^ 500 ≈ 0.66%)
- Upstream performance needs (availability, speed) are put on downstream services, sometimes multiplied
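The compound availability can be checked with a few lines. This is a minimal sketch, assuming a request has to pass through every service in series and that failures are independent; the function name `chain_availability` is made up for illustration.

```python
def chain_availability(per_service: float, services: int) -> float:
    """Availability of a request that needs every service in the chain to be up.

    Assumes independent failures and a strictly serial call chain.
    """
    return per_service ** services


if __name__ == "__main__":
    # Compare "three nines" and "two nines" per service for 500 services.
    for per_service in (0.999, 0.99):
        total = chain_availability(per_service, 500)
        print(f"{per_service:.1%} per service, 500 services: {total:.2%}")
```

Real call graphs are rarely a single chain of 500 hops, so the number is an upper bound on the pain, but the direction is the point: every synchronous dependency multiplies into the caller's availability.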

Slide 23

Slide 23 text

HTTP APIs in distributed systems are HARD! 23

Slide 24

Slide 24 text

HTTP APIs in distributed systems are HARD! 24 Optimistic Lock, Pessimistic Lock, MVCC, Session Pinning, Server Stickiness

Slide 25

Slide 25 text

25 http://screencrush.com/files/2015/08/Patrick-Stewart-Star-Trek.jpg

Slide 26

Slide 26 text

https://i.imgflip.com/1d7btz.jpg

Slide 27

Slide 27 text

27 Share your data. https://i.ytimg.com/vi/8eDYVtPwWiM/maxresdefault.jpg

Slide 28

Slide 28 text

28 Share your data. But without a central database or centrally managed integration pattern! https://i.ytimg.com/vi/8eDYVtPwWiM/maxresdefault.jpg

Slide 29

Slide 29 text

Kafka Integration Scenarios
- Direct Integration
  - Producer → Producer API
  - Consumer → Consumer API
- Data Replication
  - Connector → Connect API
    - Sink Connector
    - Source Connector
- Reactive Data Transformation
  - Stream Processor (SP) → Processor API → Streams API
  - KSQL on top of Streams API

Slide 32

Slide 32 text

Step 1: Split up old DB or HTTP Integration 32 What can we do here, instead of HTTP? http://cdn.onlinewebfonts.com/svg/img_494692.png

Slide 33

Slide 33 text

Step 1: Replicate! 33 Instead of HTTP request to fetch data, capture all changes of the “New Orders” DB table and replicate it into the datastore of the Stock service.

Slide 34

Slide 34 text

Step 1: Replicate data using Kafka Connect API 34 Instead of HTTP request to fetch data, capture all changes of the “New Orders” DB table and replicate it into the datastore of the Stock service. Kafka Topic

Slide 35

Slide 35 text

Step 1: Replicate data using Kafka Connect API 35 You could, maybe, use Change Data Capturing (CDC) with Debezium. Kafka Topic

Slide 36

Slide 36 text

Change Data Capturing (CDC) and Debezium (DBZ)
- Classical: Mark changed rows
  - hard to handle Primary Key changes and Deletes
- Modern: Capture datastore’s Changelog / Commitlog / Replicationlog
- Debezium
  - supports MySQL
  - supports PostgreSQL
  - supports MongoDB
  - supports Oracle DB
  - alpha: SQL Server 2016 SP1+
  - supports Deletes and PK-changes
  - can handle DDL changes
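Why the changelog approach copes with deletes and primary key changes can be sketched in plain Python. This is a toy replay, not Debezium itself: the events are simplified dicts loosely modelled on a CDC envelope with an operation code and before/after row images, and the field names, the `apply_change` helper and the sample rows are assumptions for illustration.

```python
from typing import Optional

# Simplified change events (assumed shape): "op" is "c" (create), "u" (update)
# or "d" (delete); "before" and "after" hold the row images.


def apply_change(replica: dict, event: dict, key: str = "id") -> None:
    """Apply one change event to a replica keyed by primary key."""
    before: Optional[dict] = event.get("before")
    after: Optional[dict] = event.get("after")
    # Drop the old row image first; this covers deletes and primary key changes.
    if before is not None:
        replica.pop(before[key], None)
    # Write the new row image for creates and updates.
    if after is not None:
        replica[after[key]] = after


changelog = [
    {"op": "c", "before": None, "after": {"id": 1, "product": "grog", "qty": 2}},
    {"op": "c", "before": None, "after": {"id": 2, "product": "map", "qty": 1}},
    {"op": "u", "before": {"id": 1, "product": "grog", "qty": 2},
     "after": {"id": 1, "product": "grog", "qty": 5}},
    # Primary key change: row 2 becomes row 20.
    {"op": "u", "before": {"id": 2, "product": "map", "qty": 1},
     "after": {"id": 20, "product": "map", "qty": 1}},
    # Delete: only a "before" image is present.
    {"op": "d", "before": {"id": 1, "product": "grog", "qty": 5}, "after": None},
]

replica: dict = {}
for event in changelog:
    apply_change(replica, event)

print(replica)
```

A "mark changed rows" poller would never see the deleted row and would keep the stale row 2 next to the new row 20. Replaying the log in order avoids both problems because every event carries the old and the new row image.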

Slide 37

Slide 37 text

Make use of Apache Kafka 37 Kafka Connect Kafka Connect

Slide 38

Slide 38 text

That’s it! 38 Services Integrated! We’re done. Thank you!

Slide 39

Slide 39 text

That’s it! 39 But let’s see, if we can improve!

Slide 40

Slide 40 text

DB lookups from Stock service 40 We still kept the database lookup scenarios. Maybe our Stock service recalculates stocks every few hours/minutes. A common scenario (batch, …).

Slide 41

Slide 41 text

Is there more? 41 But do we need that?

Slide 42

Slide 42 text

Can we improve? 42 When our stock goes too low, wouldn’t it be cool to directly place a new order?

Slide 43

Slide 43 text

Can we directly process our new state? 43 When our stock goes too low, wouldn’t it be cool to directly place a new order? → Directly calculate our new state

Slide 44

Slide 44 text

Step 2: Reactive! 44 Yes! Let’s make it reactive!

Slide 45

Slide 45 text

How to make Stock service reactive? 45 But how?

Slide 46

Slide 46 text

Step 2: Reactive! 46 Make your service a Kafka consumer using the Kafka Consumer API

Slide 47

Slide 47 text

Step 2: Make the service reactive using Kafka Consumer API 47 On every New Order event/dataset our business logic will be triggered to lookup and update stock. When stock is too low, we can directly place the order.
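The business logic behind this step can be sketched without any Kafka dependency. In a real service `handle_new_order` would be called from a consumer poll loop; here a plain list stands in for the "New Orders" topic. The class name, the thresholds and the message fields are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class StockService:
    """Toy stock service that reacts to every incoming order event."""
    stock: dict
    reorder_threshold: int = 5    # assumed value
    reorder_quantity: int = 20    # assumed value
    reorders: list = field(default_factory=list)
    pending: set = field(default_factory=set)

    def handle_new_order(self, order: dict) -> None:
        product = order["product"]
        # Update our own state as soon as the event arrives.
        self.stock[product] = self.stock.get(product, 0) - order["quantity"]
        # React immediately when stock runs low, but only once per product.
        if self.stock[product] < self.reorder_threshold and product not in self.pending:
            self.pending.add(product)
            self.reorders.append({"product": product, "quantity": self.reorder_quantity})


# Stand-in for records polled from the "New Orders" topic.
new_orders = [
    {"order_id": 1, "product": "grog", "quantity": 3},
    {"order_id": 2, "product": "grog", "quantity": 4},
    {"order_id": 3, "product": "map", "quantity": 1},
]

service = StockService(stock={"grog": 10, "map": 50})
for order in new_orders:
    service.handle_new_order(order)

print(service.stock, service.reorders)
```

Compared with the batch recalculation, there is no window in which the stock is known to be too low but nobody has acted on it.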

Slide 48

Slide 48 text

And another batch job... 48 Hmmm… This Recommendation service is just a cronjob that updates recommendations for the different categories once a day. At night… Can’t we?

Slide 49

Slide 49 text

Step 3: Update recommendations in a reactive way 49 Use Streams API to recalculate recommendations weights after each order. Sink them into the recommendations database table with a Sink Connector.
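The idea of recalculating weights after each order, instead of once a night, can be sketched as an incremental aggregation. This is plain Python standing in for a stream processor; the weight definition (share of units sold within a category), the class name and the message fields are assumptions, not the formula from the talk.

```python
from collections import defaultdict


class RecommendationAggregator:
    """Keeps running sales counts per category and emits fresh weights per order."""

    def __init__(self) -> None:
        # category -> product -> units sold so far
        self._sold = defaultdict(lambda: defaultdict(int))

    def on_order(self, order: dict) -> dict:
        category = self._sold[order["category"]]
        category[order["product"]] += order["quantity"]
        total = sum(category.values())
        # Weight = share of units sold within the category (assumed definition).
        return {product: qty / total for product, qty in category.items()}


orders = [
    {"category": "pirate-supplies", "product": "grog", "quantity": 1},
    {"category": "pirate-supplies", "product": "sword", "quantity": 1},
    {"category": "maps", "product": "treasure-map", "quantity": 1},
    {"category": "pirate-supplies", "product": "grog", "quantity": 2},
]

aggregator = RecommendationAggregator()
recommendations_table: dict = {}  # stand-in for the table a sink connector would fill
for order in orders:
    # Each order only touches its own category, so only that row is rewritten.
    recommendations_table[order["category"]] = aggregator.on_order(order)

print(recommendations_table)
```

The nightly cronjob and this loop end up with the same table; the difference is that here the table is current after every single order.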

Slide 50

Slide 50 text

Engage! https://cdn1.tvnz.co.nz/content/dam/images/news/2018/08/05/picard.jpg.hashed.7bdfa2df.desktop.story.inline.jpg

Slide 51

Slide 51 text

Step 4: Send orders to Orders service 51

Slide 52

Slide 52 text

Step 4: Send orders to Orders service 52 - Orders service doesn’t need some of the fields of the “New Orders” messages - Some data need to be processed before they are stored

Slide 53

Slide 53 text

Step 4: Transform, Filter and Pre-Process Messages
- Orders service doesn’t need some of the fields of the “New Orders” messages
- Some data need to be processed before they are stored
→ use Kafka Streams API
  → to transform messages and remove unnecessary fields
  → to make precalculations or filter
  → store result in a new Kafka topic and sink it into the datastore
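The transform, precalculate and filter steps can be sketched as one function applied to every message. Lists stand in for the input and output topics, and the field names, the "cancelled" filter rule and the `to_orders_record` helper are assumptions for illustration.

```python
from typing import Optional


def to_orders_record(message: dict) -> Optional[dict]:
    """Turn a "New Orders" message into what the Orders service stores.

    Returns None when the message should be filtered out.
    """
    # Filter: drop messages the Orders service does not care about (assumed rule).
    if message.get("status") == "cancelled":
        return None
    # Transform: keep only the needed fields and precalculate the total.
    return {
        "order_id": message["order_id"],
        "product": message["product"],
        "quantity": message["quantity"],
        "total_cents": message["quantity"] * message["unit_price_cents"],
    }


# Stand-in for the "New Orders" topic.
new_orders_topic = [
    {"order_id": 1, "product": "grog", "quantity": 2, "unit_price_cents": 350,
     "status": "new", "customer_email": "guybrush@example.com", "user_agent": "scumm/1.0"},
    {"order_id": 2, "product": "map", "quantity": 1, "unit_price_cents": 999,
     "status": "cancelled", "customer_email": "elaine@example.com", "user_agent": "scumm/1.0"},
]

# Stand-in for the new topic that a sink connector would write to the datastore.
orders_topic = []
for message in new_orders_topic:
    record = to_orders_record(message)
    if record is not None:
        orders_topic.append(record)

print(orders_topic)
```

Keeping the result in its own topic means the Orders datastore only ever sees the slimmed-down shape, and other consumers can still read the full messages from the original topic.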

Slide 54

Slide 54 text

Step 5: Move forward! 54

Slide 55

Slide 55 text

Architectural Pattern: Kafka Integration
- Simple and easy, real decoupled architecture
- Many problems of a distributed system are handled
- Real decoupling of producers and consumers
  - Producer’s work is done after ACK
  - Consumers are free to do whatever they want
- Supports independent teams by decoupling readers and writers → CQRS, democratizing data
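The decoupling of producers and consumers comes from the log sitting between them. A minimal in-memory sketch, assuming a single partition and no persistence; `TopicLog` and its methods are made up for illustration and are not the Kafka client API.

```python
class TopicLog:
    """Append-only in-memory log with one read offset per consumer group."""

    def __init__(self) -> None:
        self._records: list = []
        self._offsets: dict = {}

    def append(self, record) -> int:
        """Store a record and return its offset; this is the producer's ACK."""
        self._records.append(record)
        return len(self._records) - 1

    def poll(self, group: str, max_records: int = 10) -> list:
        """Return the next records for this group and advance only its offset."""
        start = self._offsets.get(group, 0)
        batch = self._records[start:start + max_records]
        self._offsets[group] = start + len(batch)
        return batch


topic = TopicLog()

# The producer is done as soon as it has the ACK; it knows nothing about readers.
acks = [topic.append({"order_id": i}) for i in (1, 2, 3)]

# Two independent consumer groups read at their own pace.
stock_batch = topic.poll("stock-service")
bi_first = topic.poll("business-intelligence", max_records=2)
bi_second = topic.poll("business-intelligence", max_records=2)

print(acks, stock_batch, bi_first, bi_second)
```

A slow or offline consumer group does not block the producer or the other groups; it simply has a lower offset and catches up later.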

Slide 56

Slide 56 text

When to use Connect API? 56 If you connect to an external system that’s not directly able to connect natively to Kafka. Or, if you want to keep your well-known datastore access/lookup behavior. (e.g. legacy applications)

Slide 57

Slide 57 text

When to use Connect API? 57 If your application data access scenario is natively a lookup scenario. Incoming messages don’t necessarily change the application state.

Slide 58

Slide 58 text

When to use Consumer API? 58 When every message that comes in must trigger your business logic or is supposed to update your application state. e.g. Stream Processors, Reactive Dashboards, “realtime stuff”

Slide 59

Slide 59 text

When to use Streams API? 59 When you are going to write an application that consumes, processes and produces.

Slide 60

Slide 60 text

When to use a database? 60 When your application or service natively needs a lookup scenario. For example: a user wants to see all products of a specific type.

Slide 61

Slide 61 text

Kafka Usage And Benefits 61

Slide 62

Slide 62 text

62 How to actually do this?

Slide 63

Slide 63 text

codecentric helps you! We build, we consult, we enable: https://www.codecentric.de

Slide 64

Slide 64 text

The future is coming... https://vignette.wikia.nocookie.net/memoryalpha/images/9/9d/USS_Enterprise-D.jpeg/revision/latest?cb=20131125162836&path-prefix=de

Slide 65

Slide 65 text

Thank you!

Slide 66

Slide 66 text

Q & A https://muppetmindset.files.wordpress.com/2016/05/question-mark.jpg?w=700 http://indigo.ie/~rdshiels/monkey/guyswing.gif