Change Data Streaming Patterns in Distributed Systems @ Flink Forward 2021

Microservices are one of the big trends in software engineering of the last few years; organising business functionality in several self-contained, loosely coupled services helps teams to work efficiently, make the most suitable technical decisions, and react quickly to new business requirements.

In this session we'll discuss and showcase how open-source change data capture (CDC) with Debezium can help developers with typical challenges they often face when working on microservices. Come and join us to learn how to:

* Employ the outbox pattern for reliable, eventually consistent data exchange between microservices, without incurring unsafe dual writes or tight coupling
* Gradually extract microservices from existing monolithic applications, using CDC, the strangler fig pattern and Apache Flink
* Coordinate long-running business transactions across multiple services using CDC-based saga orchestration, ensuring each transaction is consistently applied or aborted by all participating services

Demo Repository: https://github.com/hpgrahsl/flinkforward21


Hans-Peter Grahsl

October 26, 2021

  1. Change Data Streaming Patterns in Distributed Systems

    • Gunnar Morling, Software Engineer, Red Hat, @gunnarmorling
    • Hans-Peter Grahsl, Technical Trainer, Netconomy, @hpgrahsl
  2. #CDCPatterns @gunnarmorling @hpgrahsl … implemented using Change Data Capture Today’s

  3. Gunnar Morling

    • Open source software engineer at Red Hat
      ◦ Debezium
      ◦ Quarkus
    • Spec Lead for Bean Validation 2.0
    • Java Champion
    • @gunnarmorling
  4. Hans-Peter Grahsl

    • Technical Trainer at NETCONOMY
    • Independent Engineer & Consultant
    • Confluent Community Catalyst
    • MongoDB Champion
    • @hpgrahsl
  5. Debezium: Log-based Change Data Capture

    • Taps into TX log to capture INSERT/UPDATE/DELETE events
    • Propagated to consumers via Apache Kafka and Kafka Connect
  6. Debezium in a Nutshell

    • A CDC Platform
      ▪ Based on transaction logs
      ▪ Snapshotting, filtering, etc.
      ▪ Outbox support
      ▪ Web-based UI
    • Fully open-source, very active community
    • Large production deployments
  7. Debezium: Connectors

    • Stable
      ▪ MySQL
      ▪ Postgres
      ▪ MongoDB
      ▪ SQL Server
      ▪ Db2
      ▪ Oracle
    • Incubating
      ▪ Vitess
      ▪ Cassandra
  8. Debezium: Deployment Alternatives Embedded Engine and Debezium

  9. Data Change Events

    • Old and new row state
    • Metadata on table, TX id, etc.
    • Operation type, timestamp
  10. Data Change Events

  11. Data Change Events
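
To make the event structure named on the Data Change Events slides more concrete, here is a minimal Python sketch that reads one Debezium-style change event. The field names `before`, `after`, `source`, `op` and `ts_ms` follow Debezium's event envelope. Everything else is an assumption for illustration: the `customers` table, its columns, the sample values and the helper `describe_change` are hypothetical, and `txId` is assumed to be present as it is in the Postgres connector's source block.

```python
import json

# Debezium operation codes carried in the "op" field of the envelope.
OPERATIONS = {"c": "INSERT", "u": "UPDATE", "d": "DELETE", "r": "SNAPSHOT READ"}


def describe_change(event_json):
    """Summarize one change event: operation, metadata, old and new row state."""
    event = json.loads(event_json)
    # With schemas enabled in the JSON converter, the envelope sits under "payload".
    payload = event.get("payload", event)
    source = payload.get("source") or {}
    return {
        "operation": OPERATIONS.get(payload["op"], payload["op"]),
        "table": source.get("table"),
        "tx_id": source.get("txId"),  # connector-specific; assumed Postgres here
        "timestamp_ms": payload.get("ts_ms"),
        "before": payload.get("before"),  # None for inserts
        "after": payload.get("after"),    # None for deletes
    }


# Hypothetical UPDATE event for a hypothetical "customers" table.
SAMPLE_EVENT = json.dumps({
    "before": {"id": 1001, "email": "old@example.com"},
    "after": {"id": 1001, "email": "new@example.com"},
    "source": {"connector": "postgresql", "table": "customers", "txId": 556},
    "op": "u",
    "ts_ms": 1635206400000,
})
```

Calling `describe_change(SAMPLE_EVENT)` returns a dictionary with the operation name plus the old and new row state, which is roughly the information a downstream consumer needs to decide how to react to a change.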
  12. Outbox Pattern

  13. Challenge: Microservices Data Exchange

    • Services need to update their database,
    • send messages to other services,
    • and do both consistently!
  14. “Dual writes” are prone to inconsistencies! Outbox

  15. Outbox Pattern

  16. Outbox Pattern

  17. Outbox Pattern

  18. Outbox Pattern
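
The core idea behind the Outbox Pattern slides can be sketched in a few lines: the service writes its business row and an outbox row in one local database transaction, so there is no dual write, and CDC later picks up the outbox row from the transaction log. The sketch below uses Python's built-in `sqlite3` purely as a stand-in database. The `purchase_order` table, the `OrderCreated` event type and the `fail_before_commit` switch are hypothetical. The outbox column names (`id`, `aggregatetype`, `aggregateid`, `type`, `payload`) are assumed to match the defaults of Debezium's outbox event router, and the CDC part itself is not shown.

```python
import json
import sqlite3
import uuid


def create_schema(conn):
    """Create a hypothetical business table and the outbox table."""
    conn.execute(
        "CREATE TABLE purchase_order ("
        "id INTEGER PRIMARY KEY, customer_id INTEGER NOT NULL, "
        "total_cents INTEGER NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE outbox ("
        "id TEXT PRIMARY KEY, aggregatetype TEXT NOT NULL, "
        "aggregateid TEXT NOT NULL, type TEXT NOT NULL, payload TEXT NOT NULL)"
    )


def place_order(conn, customer_id, total_cents, fail_before_commit=False):
    """Insert the order and its outbox event atomically."""
    # "with conn" commits if the block succeeds and rolls back on an exception,
    # so both rows become visible together or not at all.
    with conn:
        cursor = conn.execute(
            "INSERT INTO purchase_order (customer_id, total_cents) VALUES (?, ?)",
            (customer_id, total_cents),
        )
        order_id = cursor.lastrowid
        payload = json.dumps(
            {"orderId": order_id, "customerId": customer_id, "totalCents": total_cents}
        )
        conn.execute(
            "INSERT INTO outbox (id, aggregatetype, aggregateid, type, payload) "
            "VALUES (?, ?, ?, ?, ?)",
            (str(uuid.uuid4()), "order", str(order_id), "OrderCreated", payload),
        )
        if fail_before_commit:
            # Simulated crash: neither the order nor the event may survive.
            raise RuntimeError("simulated failure before commit")
    return order_id
```

After `create_schema(conn)` on a fresh connection, a successful `place_order` call leaves exactly one row in each table, while a call with `fail_before_commit=True` leaves both tables unchanged. That all-or-nothing behavior is what separates the outbox approach from writing to the database and a message broker separately.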

  19. Outbox Pattern with enrichment from external system using Flink

  20. Integrating Debezium With Apache Flink

    Debezium → Kafka Topic → Flink
  21. Integrating Debezium With Apache Flink

    Flink CDC Connectors (Debezium Embedded Engine) → Flink
  22. Strangler Fig Pattern

  23. Challenge: Migrating Systems

    • Gradually evolve from old into new
    • Support temporary coexistence
    • Avoid big bang cut-over
  24. CDC-based Strangler Fig Pattern

  25. CDC-based Strangler Fig Pattern

  26. CDC-based Strangler Fig Pattern

  27. CDC-based Strangler Fig Pattern

  28. CDC-based Strangler Fig Pattern

  29. CDC-based Strangler Fig Pattern

  30. CDC-based Strangler Fig Pattern

  31. CDC-based Strangler Fig Pattern

    Demo Repo: https://bit.ly/ff21-sfp

  32. Benefits

    • Incremental migration → “baby steps”
    • Pause or stop migration without losing the effort already spent
    • Migration steps ideally reversible
    Rationale: ⚠ minimize risk ⚠
  33. CDC Pipeline Considerations

    • Data model leaking from monolith?
    • “1:1 replication” → building aggregates?
  34. Enhanced CDC Processing Single Message Transforms

  35. Enhanced CDC Processing custom stream processing with

  36. Example: join with custom aggregation

  37. Example: join with custom aggregation Flink Table

  38. Example: join with custom aggregation Flink SQL
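
The slides above mention a join with custom aggregation as a way to move from “1:1 replication” to building aggregates, but the slide code itself is not part of this transcript. As a rough illustration of the idea only, and explicitly not the Flink Table API or Flink SQL shown in the talk, the plain Python sketch below folds change events from two tables into one nested document per customer. The `customer` and `address` tables, their columns and the `CustomerAggregator` class are all hypothetical.

```python
class CustomerAggregator:
    """Materialize a customer-with-addresses aggregate from change events."""

    def __init__(self):
        self.customers = {}   # customer id -> latest customer row
        self.addresses = {}   # customer id -> {address id -> latest address row}

    def apply(self, table, event):
        """Apply one change event carrying 'before' and 'after' row state."""
        before, after = event.get("before"), event.get("after")
        if table == "customer":
            if after is None:  # delete
                self.customers.pop(before["id"], None)
            else:              # insert, update or snapshot read
                self.customers[after["id"]] = after
        elif table == "address":
            if before is not None:
                # Drop the old row first; this also covers a changed customer_id.
                self.addresses.get(before["customer_id"], {}).pop(before["id"], None)
            if after is not None:
                self.addresses.setdefault(after["customer_id"], {})[after["id"]] = after

    def aggregate(self, customer_id):
        """Return the joined document, or None if the customer is unknown."""
        customer = self.customers.get(customer_id)
        if customer is None:
            return None
        rows = self.addresses.get(customer_id, {}).values()
        return {**customer, "addresses": sorted(rows, key=lambda row: row["id"])}
```

A stream processor keeps this kind of state per key and emits the updated aggregate whenever either side changes. The sketch keeps everything in memory and ignores ordering, fault tolerance and state cleanup, which is where an engine such as Flink comes in.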

  39. Saga Pattern

  40. Challenge: Long-running Business Transactions

    • Multiple services need to act collaboratively to achieve a consistent outcome
    • Without 2-phase commit protocols
    • Ensure correctness in case of failures
  41. Saga Pattern

  42. Saga Pattern Message Flow

  43. Saga Pattern Compensation

  44. Saga Pattern Configuration

  45. Saga Pattern Execution Flow

  46. Saga Pattern Execution Flow

  47. Saga Pattern Execution Flow

  48. Saga Pattern Expanding Partial Change Events with

  49. Saga Pattern Expanding Partial Change Events with
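
The compensation idea behind the Saga Pattern slides can be sketched independently of the messaging layer: an orchestrator runs the steps in order and, when one participant rejects its step, undoes the already completed steps in reverse order. The Python sketch below makes that control flow explicit under simplifying assumptions. It calls participants in-process rather than exchanging messages via an outbox table and CDC as in the talk, and the names `SagaStep`, `SagaResult` and `run_saga` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class SagaStep:
    name: str
    action: Callable[[dict], bool]        # returns True if the participant accepts
    compensation: Callable[[dict], None]  # undoes a previously accepted action


@dataclass
class SagaResult:
    status: str
    log: List[str] = field(default_factory=list)


def run_saga(steps, payload):
    """Run steps in order; on the first rejection, compensate in reverse order."""
    result = SagaResult(status="STARTED")
    completed = []
    for step in steps:
        if step.action(payload):
            result.log.append(f"{step.name}:SUCCEEDED")
            completed.append(step)
            continue
        result.log.append(f"{step.name}:FAILED")
        # Only steps that actually succeeded need to be undone.
        for done in reversed(completed):
            done.compensation(payload)
            result.log.append(f"{done.name}:COMPENSATED")
        result.status = "ABORTED"
        return result
    result.status = "COMPLETED"
    return result
```

With two hypothetical steps, say a credit approval that always succeeds and a payment that rejects large amounts, a rejected payment yields the status `ABORTED` and a log in which the credit approval is compensated after the payment fails. A real orchestrator would also persist the saga state so it can resume after a crash, which this sketch leaves out.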

  50. Wrap-Up

  51. Takeaways

    • CDC: a powerful tool in the box for event-driven architectures
    • Debezium: open-source CDC for a variety of databases
    • Debezium + Apache Flink = ❤
  52. Resources

    • Outbox implementation https://debezium.io/blog/2019/02/19/reliable-microservices-data-exchange-with-the-outbox-pattern/
    • Strangler fig pattern https://martinfowler.com/bliki/StranglerFigApplication.html
    • Saga implementation https://www.infoq.com/articles/saga-orchestration-outbox/
    • Demo repo https://github.com/debezium/debezium-examples
  53. Q & A

    📧 gunnar@hibernate.org, @gunnarmorling
    📧 grahslhp@gmail.com, @hpgrahsl
    Thank You!
  54. Image Credits In Order of Appearance

    Unsplash https://unsplash.com/license
    © Pablo García Saldaña https://unsplash.com/photos/lPQIndZz8Mo
    © David Clode https://unsplash.com/photos/T49WTav4LgU
    © Aaron Burden https://unsplash.com/photos/GFpxQ2ZyNc0
    © Nathan Dumlao https://unsplash.com/photos/wQDysNUCKfw
    © mari lezhava https://unsplash.com/photos/q65bNe9fW-w
    © Michał Parzuchowski https://unsplash.com/photos/Bt0PM7cNJFQ
    © Charles Forerunner https://unsplash.com/photos/3fPXt37X6UQ
    Flickr
    Attribution 2.0 Generic https://creativecommons.org/licenses/by/2.0/
    © Thomas Kamann https://flic.kr/p/coa2c
    CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
    © Wall Boat https://flic.kr/p/Y6zkmX
    Attribution-ShareAlike 2.0 Generic https://creativecommons.org/licenses/by-sa/2.0/
    © Andrew Hart https://flic.kr/p/dmjkSk