Slide 1

Slide 1 text

Change Data Streaming Patterns in Distributed Systems

Gunnar Morling, Software Engineer, Red Hat (@gunnarmorling)
Hans-Peter Grahsl, Technical Trainer, Netconomy (@hpgrahsl)

Slide 2

Slide 2 text

#CDCPatterns @gunnarmorling @hpgrahsl

Today’s Objectives

… implemented using Change Data Capture

Slide 3

Slide 3 text

Gunnar Morling

● Open source software engineer at Red Hat
  ○ Debezium
  ○ Quarkus
● Spec Lead for Bean Validation 2.0
● Java Champion
● @gunnarmorling

Slide 4

Slide 4 text

Hans-Peter Grahsl

● Technical Trainer at NETCONOMY
● Independent Engineer & Consultant
● Confluent Community Catalyst
● MongoDB Champion
● @hpgrahsl

Slide 5

Slide 5 text

Debezium: Log-based Change Data Capture

● Taps into the transaction (TX) log to capture INSERT/UPDATE/DELETE events
● Events are propagated to consumers via Apache Kafka and Kafka Connect

Slide 6

Slide 6 text

Debezium in a Nutshell

● A CDC Platform
  ■ Based on transaction logs
  ■ Snapshotting, filtering, etc.
  ■ Outbox support
  ■ Web-based UI
● Fully open-source, very active community
● Large production deployments

Slide 7

Slide 7 text

Debezium: Connectors

● Stable
  ■ MySQL
  ■ Postgres
  ■ MongoDB
  ■ SQL Server
  ■ Db2
  ■ Oracle
● Incubating
  ■ Vitess
  ■ Cassandra

Slide 8

Slide 8 text

Debezium: Deployment Alternatives

Embedded Engine and Debezium Server

Slide 9

Slide 9 text

Data Change Events

● Old and new row state
● Metadata on table, TX id, etc.
● Operation type, timestamp
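As a rough illustration of the three bullet points above, a change event for an UPDATE can be pictured as a dictionary. The field names `before`, `after`, `source`, `op`, and `ts_ms` are loosely modeled on the Debezium event envelope; the table, the column values, and the `describe` helper are invented for this sketch and are not taken from the slides.

```python
# Illustrative change event for an UPDATE; all values are invented.
update_event = {
    "before": {"id": 1001, "email": "old@example.com"},  # old row state
    "after": {"id": 1001, "email": "new@example.com"},   # new row state
    "source": {"db": "inventory", "table": "customers", "txId": 556},  # metadata
    "op": "u",  # c = create, u = update, d = delete, r = snapshot read
    "ts_ms": 1620000000000,  # timestamp in milliseconds
}

OPERATIONS = {"c": "INSERT", "u": "UPDATE", "d": "DELETE", "r": "SNAPSHOT READ"}


def describe(event):
    """Summarize a change event: operation, table, and changed columns."""
    before = event.get("before") or {}
    after = event.get("after") or {}
    # A column counts as changed if its value differs between old and new state.
    changed = sorted(
        column
        for column in set(before) | set(after)
        if before.get(column) != after.get(column)
    )
    table = event["source"]["table"]
    return f"{OPERATIONS[event['op']]} on {table}, changed columns: {changed}"


print(describe(update_event))
```

Because old and new row state travel together, a consumer can work out what changed without querying the source database.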

Slide 10

Slide 10 text

Data Change Events

● Old and new row state
● Metadata on table, TX id, etc.
● Operation type, timestamp

Slide 11

Slide 11 text

Data Change Events

● Old and new row state
● Metadata on table, TX id, etc.
● Operation type, timestamp

Slide 12

Slide 12 text

Outbox Pattern

Slide 13

Slide 13 text

Challenge: Microservices Data Exchange

● Services need to update their database,
● send messages to other services,
● and do both consistently!

Slide 14

Slide 14 text

Outbox Pattern

“Dual writes” are prone to inconsistencies!
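A minimal sketch of the idea behind the outbox pattern, assuming a relational database with local transactions: instead of writing to the database and to a message broker separately (the dual write), the service inserts the business row and an event row into an outbox table in one transaction, and CDC later picks the event up from the log. The table layout, the column names, and the `place_order` function are assumptions made for this example; SQLite stands in for the service's database.

```python
import json
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchase_order (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
)
conn.execute(
    "CREATE TABLE outbox ("
    "id TEXT PRIMARY KEY, aggregate_type TEXT, aggregate_id TEXT, "
    "type TEXT, payload TEXT)"
)


def place_order(conn, customer, total):
    """Store the order and its outbox event in one local transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        cursor = conn.execute(
            "INSERT INTO purchase_order (customer, total) VALUES (?, ?)",
            (customer, total),
        )
        order_id = cursor.lastrowid
        payload = json.dumps(
            {"orderId": order_id, "customer": customer, "total": total}
        )
        conn.execute(
            "INSERT INTO outbox VALUES (?, ?, ?, ?, ?)",
            (str(uuid.uuid4()), "Order", str(order_id), "OrderCreated", payload),
        )
    return order_id


order_id = place_order(conn, "alice", 59.90)
outbox_rows = conn.execute(
    "SELECT aggregate_type, aggregate_id, type FROM outbox"
).fetchall()
print(outbox_rows)
```

Either both rows are committed or neither is, so the event can never disagree with the stored order. Delivering the outbox rows to other services is left to the CDC pipeline and is not shown here.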

Slide 15

Slide 15 text

Outbox Pattern

Slide 16

Slide 16 text

Outbox Pattern

Slide 17

Slide 17 text

Outbox Pattern

Slide 18

Slide 18 text

Outbox Pattern

Slide 19

Slide 19 text

Outbox Pattern

With enrichment from an external system using Flink

Slide 20

Slide 20 text

Integrating Debezium With Apache Flink

Debezium → Kafka Topic → Flink

Slide 21

Slide 21 text

Integrating Debezium With Apache Flink

Flink CDC Connectors (Debezium Embedded Engine) → Flink

Slide 22

Slide 22 text

Strangler Fig Pattern

Slide 23

Slide 23 text

Challenge: Migrating Systems

● Gradually evolve from old into new
● Support temporary coexistence
● Avoid big bang cut-over

Slide 24

Slide 24 text

CDC-based Strangler Fig Pattern

Slide 25

Slide 25 text

CDC-based Strangler Fig Pattern

Slide 26

Slide 26 text

CDC-based Strangler Fig Pattern

Slide 27

Slide 27 text

CDC-based Strangler Fig Pattern

Slide 28

Slide 28 text

CDC-based Strangler Fig Pattern

Slide 29

Slide 29 text

CDC-based Strangler Fig Pattern

Slide 30

Slide 30 text

CDC-based Strangler Fig Pattern

Slide 31

Slide 31 text

CDC-based Strangler Fig Pattern

Demo Repo: https://bit.ly/ff21-sfp

Slide 32

Slide 32 text

Benefits

● Incremental migration → “baby steps”
● Pause or stop the migration without losing the effort already invested
● Migration steps are ideally reversible

Rationale: ⚠ minimize risk ⚠

Slide 33

Slide 33 text

CDC Pipeline Considerations

● Data model leaking from the monolith?
● “1:1 replication” → building aggregates?

Slide 34

Slide 34 text

Enhanced CDC Processing

Single Message Transforms
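To make the idea of a single message transform concrete, here is a small sketch in plain Python. It is not the Kafka Connect API; it only mimics the concept of a chain of stateless, per-record transformations that can keep a legacy data model from leaking to consumers. The field names and the functions `rename_fields`, `drop_fields`, and `apply_chain` are hypothetical.

```python
def rename_fields(record, mapping):
    """Return a copy of the record with fields renamed per the mapping."""
    return {mapping.get(name, name): value for name, value in record.items()}


def drop_fields(record, excluded):
    """Return a copy of the record without the excluded fields."""
    return {name: value for name, value in record.items() if name not in excluded}


def apply_chain(record, transforms):
    """Apply each transform in order; every step sees one record at a time."""
    for transform in transforms:
        record = transform(record)
    return record


# Hypothetical row as it might look in a legacy (monolith) schema.
legacy_row = {"CUST_ID": 42, "CUST_NM": "Alice", "INTERNAL_FLAGS": "X9"}

chain = [
    lambda r: drop_fields(r, {"INTERNAL_FLAGS"}),
    lambda r: rename_fields(r, {"CUST_ID": "id", "CUST_NM": "name"}),
]

public_record = apply_chain(legacy_row, chain)
print(public_record)
```

Each step looks at a single record and keeps no state, which is why anything that needs to combine several events (joins, aggregates) calls for a stream processor instead.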

Slide 35

Slide 35 text

Enhanced CDC Processing

Custom stream processing with Flink

Slide 36

Slide 36 text

Example: join with custom aggregation

Slide 37

Slide 37 text

Example: join with custom aggregation

Flink Table API

Slide 38

Slide 38 text

Example: join with custom aggregation

Flink SQL
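The code shown on these slides is not part of the transcript. As a stand-in, the following plain-Python sketch illustrates what a join with a custom aggregation over change events might compute: orders joined with their order lines, with the lines nested into one aggregate per order. It is not Flink code, it processes a finished list of events rather than a stream, and the table layout and the `build_aggregates` function are assumptions.

```python
from collections import defaultdict


def build_aggregates(order_events, line_events):
    """Join orders with their order lines and nest the lines per order."""
    orders = {}                # order id -> latest order row
    lines = defaultdict(dict)  # order id -> {line id: latest line row}

    for event in order_events:
        if event["op"] == "d":
            orders.pop(event["before"]["id"], None)
        else:
            row = event["after"]
            orders[row["id"]] = row

    for event in line_events:
        if event["op"] == "d":
            old = event["before"]
            lines[old["order_id"]].pop(old["id"], None)
        else:
            row = event["after"]
            lines[row["order_id"]][row["id"]] = row

    # Custom aggregation: collect all current lines of an order into a list.
    return {
        order_id: {
            **order,
            "lines": sorted(lines[order_id].values(), key=lambda line: line["id"]),
        }
        for order_id, order in orders.items()
    }


order_events = [
    {"op": "c", "before": None, "after": {"id": 1, "customer": "alice"}},
]
book = {"id": 10, "order_id": 1, "item": "book", "qty": 1}
pen = {"id": 11, "order_id": 1, "item": "pen", "qty": 3}
line_events = [
    {"op": "c", "before": None, "after": book},
    {"op": "c", "before": None, "after": pen},
    {"op": "u", "before": pen, "after": {**pen, "qty": 5}},
    {"op": "d", "before": book, "after": None},
]

aggregates = build_aggregates(order_events, line_events)
print(aggregates)
```

A stream processor would maintain the same state incrementally and emit an updated aggregate whenever an order or one of its lines changes.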

Slide 39

Slide 39 text

Saga Pattern

Slide 40

Slide 40 text

Challenge: Long-running Business Transactions

● Multiple services need to act collaboratively to achieve a consistent outcome
● Without 2-phase commit protocols
● Ensure correctness in case of failures

Slide 41

Slide 41 text

Saga Pattern

Slide 42

Slide 42 text

Saga Pattern

Message Flow

Slide 43

Slide 43 text

Saga Pattern

Compensation
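The compensation idea can be sketched as follows: each saga step comes with an action and a compensating action, and when a step fails, the steps that already completed are undone in reverse order. This is an in-process illustration only. It leaves out the messaging between services entirely, and the step names and the `run_saga` function are hypothetical.

```python
def run_saga(steps):
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed = []
    log = []
    for name, action, compensation in steps:
        try:
            action()
            log.append(f"{name}: done")
            completed.append((name, compensation))
        except Exception as error:
            log.append(f"{name}: failed ({error})")
            # Undo every completed step, most recent first.
            for done_name, undo in reversed(completed):
                undo()
                log.append(f"{done_name}: compensated")
            return "ABORTED", log
    return "COMPLETED", log


balance = {"reserved": 0}


def reserve_credit():
    balance["reserved"] += 100


def release_credit():
    balance["reserved"] -= 100


def charge_payment():
    raise RuntimeError("card declined")


def refund_payment():
    pass  # never called here because the payment step did not complete


status, log = run_saga([
    ("credit-approval", reserve_credit, release_credit),
    ("payment", charge_payment, refund_payment),
])
print(status)
for entry in log:
    print(entry)
```

In a distributed setup, the actions and compensations would be requests sent to other services, and the saga state would need to be persisted so that it survives restarts.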

Slide 44

Slide 44 text

Saga Pattern

Configuration

Slide 45

Slide 45 text

Saga Pattern

Execution Flow

Slide 46

Slide 46 text

Saga Pattern

Execution Flow

Slide 47

Slide 47 text

Saga Pattern

Execution Flow

Slide 48

Slide 48 text

Saga Pattern

Expanding Partial Change Events with Flink

Slide 49

Slide 49 text

Saga Pattern

Expanding Partial Change Events with Flink
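The slide text does not say what makes a change event partial. One plausible reading, which is an assumption here, is that an event may omit column values that did not change, so a stateful processor has to fill them in from the last complete row it has seen for that key. The sketch below shows this with plain Python rather than Flink; the `MISSING` placeholder, the column names, and the `expand` function are hypothetical.

```python
MISSING = "__unavailable__"  # hypothetical placeholder for an omitted column value


def expand(events):
    """Fill omitted column values from the last known full row per key."""
    last_known = {}  # keyed state: primary key -> latest complete row
    expanded = []
    for event in events:
        previous = last_known.get(event["id"], {})
        # Take the value from the event unless it is the placeholder; in that
        # case fall back to the previous value (or keep the placeholder).
        full = {
            column: previous.get(column, value) if value == MISSING else value
            for column, value in event.items()
        }
        last_known[event["id"]] = full
        expanded.append(full)
    return expanded


events = [
    {"id": 1, "status": "STARTED", "payload": "large JSON document"},
    {"id": 1, "status": "SUCCEEDED", "payload": MISSING},
]

expanded = expand(events)
print(expanded[1])
```

In a stream processor, `last_known` would correspond to state keyed by the primary key, so each incoming event is completed against the state stored for its own key.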

Slide 50

Slide 50 text

Wrap-Up

Slide 51

Slide 51 text

Takeaways

● CDC: a powerful tool in the box for event-driven architectures
● Debezium: open-source CDC for a variety of databases
● Debezium + Apache Flink = ❤

Slide 52

Slide 52 text

Resources

● Outbox implementation: https://debezium.io/blog/2019/02/19/reliable-microservices-data-exchange-with-the-outbox-pattern/
● Strangler fig pattern: https://martinfowler.com/bliki/StranglerFigApplication.html
● Saga implementation: https://www.infoq.com/articles/saga-orchestration-outbox/
● Demo repo: https://github.com/debezium/debezium-examples

Slide 53

Slide 53 text

Thank You!

Q & A

📧 [email protected] @gunnarmorling
📧 [email protected] @hpgrahsl

Slide 54

Slide 54 text

Image Credits (In Order of Appearance)

Unsplash (https://unsplash.com/license)
● © Pablo García Saldaña https://unsplash.com/photos/lPQIndZz8Mo
● © David Clode https://unsplash.com/photos/T49WTav4LgU
● © Aaron Burden https://unsplash.com/photos/GFpxQ2ZyNc0
● © Nathan Dumlao https://unsplash.com/photos/wQDysNUCKfw
● © mari lezhava https://unsplash.com/photos/q65bNe9fW-w
● © Michał Parzuchowski https://unsplash.com/photos/Bt0PM7cNJFQ
● © Charles Forerunner https://unsplash.com/photos/3fPXt37X6UQ

Flickr
● Attribution 2.0 Generic (https://creativecommons.org/licenses/by/2.0/): © Thomas Kamann https://flic.kr/p/coa2c
● CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/): © Wall Boat https://flic.kr/p/Y6zkmX
● Attribution-ShareAlike 2.0 Generic (https://creativecommons.org/licenses/by-sa/2.0/): © Andrew Hart https://flic.kr/p/dmjkSk