Slide 1

Slide 1 text

Change Data Streaming Patterns for Microservices With Debezium
Gunnar Morling
@gunnarmorling

Slide 2

Slide 2 text

Gunnar Morling
- Open source software engineer at Red Hat
- Debezium
- Hibernate
- Spec Lead for Bean Validation 2.0
- Other projects: Deptective, MapStruct
- [email protected]
- @gunnarmorling
- http://in.relation.to/gunnar-morling/
#Debezium @gunnarmorling

Slide 3

Slide 3 text

Change Data Capture
What is it about?
- Get an event stream with all data and schema changes in your DB
[Diagram: DB 1, Apache Kafka, ?]

Slide 4

Slide 4 text

CDC Use Cases
Data Replication
- Replicate data to other DB
- Feed analytics system or DWH
- Feed data to other teams
[Diagram: DB 1, Apache Kafka, DB 2]

Slide 5

Slide 5 text

CDC Use Cases
Others
- Auditing/Historization
- Update or invalidate caches
- Enable full-text search via Elasticsearch, Solr etc.
- Update CQRS read models
- UI live updates
- Enable streaming queries

Slide 6

Slide 6 text

Change Data Capture With Debezium

Slide 7

Slide 7 text

Debezium
Change Data Capture Platform
- Retrieves change events from TX logs from different DBs
- Transparent to writing apps
- Comprehensive type support (PostGIS etc.)
- Snapshotting, Filtering etc.
- Fully open-source, very active community
- Latest version: 0.9 (based on Kafka 2.0)
- Production deployments at multiple companies (e.g. WePay, Trivago, BlaBlaCar etc.)

Slide 8

Slide 8 text

Advantages of Log-based CDC
Tailing the transaction log
- All data changes are captured
- No polling delay or overhead
- Transparent to writing applications and models
- Can capture deletes
- Can capture old record state and further meta data
- Different formats/APIs, but Debezium deals with this

Slide 9

Slide 9 text

Change Event Structure
- Key (PK of table) and Value
- Payload: Before state, After state, Source info
- Serialization format:
  - JSON
  - Avro (with Confluent Schema Registry)

    {
      "schema": { ... },
      "payload": {
        "before": null,
        "after": {
          "id": 1004,
          "first_name": "Anne",
          "last_name": "Kretchmar",
          "email": "[email protected]"
        },
        "source": {
          "name": "dbserver1",
          "server_id": 0,
          "ts_sec": 0,
          "file": "mysql-bin.000003",
          "pos": 154,
          "row": 0,
          "snapshot": true,
          "db": "inventory",
          "table": "customers"
        },
        "op": "c",
        "ts_ms": 1486500577691
      }
    }
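
A consumer reading the JSON form of such an event could unpack it roughly as follows. This is a minimal sketch, not part of the talk: the sample event is a trimmed copy of the one on the slide, the function name `describe_change` is hypothetical, and the operation codes "u" and "d" are Debezium's update and delete codes, which the slide itself does not show.

```python
import json

# Trimmed, hypothetical copy of the slide's event; "schema" is left empty here.
RAW_EVENT = """
{
  "schema": {},
  "payload": {
    "before": null,
    "after": {"id": 1004, "first_name": "Anne", "last_name": "Kretchmar"},
    "source": {"name": "dbserver1", "db": "inventory",
               "table": "customers", "snapshot": true},
    "op": "c",
    "ts_ms": 1486500577691
  }
}
"""

# "c" appears on the slide; "u" and "d" are not shown there.
OPERATIONS = {"c": "create", "u": "update", "d": "delete"}


def describe_change(raw_event):
    """Summarise one change event: operation, origin table and row state."""
    payload = json.loads(raw_event)["payload"]
    source = payload["source"]
    # Deletes carry no "after" state, so fall back to the "before" state.
    row = payload["after"] if payload["after"] is not None else payload["before"]
    return {
        "operation": OPERATIONS.get(payload["op"], payload["op"]),
        "table": source["db"] + "." + source["table"],
        "from_snapshot": bool(source.get("snapshot", False)),
        "row": row,
    }


if __name__ == "__main__":
    print(describe_change(RAW_EVENT))
```

Falling back to "before" keeps delete events usable, since log-based CDC can deliver the old record state.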

Slide 10

Slide 10 text

Debezium Connectors
- MySQL
- Postgres
- MongoDB
- SQL Server
- Oracle (Tech Preview, based on XStream)
- Possible future additions
  - Cassandra?
  - MariaDB?

Slide 11

Slide 11 text

CDC with Debezium and Kafka Connect
[Diagram: Postgres, MySQL, Apache Kafka]

Slide 12

Slide 12 text

CDC with Debezium and Kafka Connect
[Diagram: Postgres, MySQL, Apache Kafka, Kafka Connect (x2)]

Slide 13

Slide 13 text

CDC with Debezium and Kafka Connect
[Diagram: Postgres, MySQL, Apache Kafka, Kafka Connect (x2), DBZ PG, DBZ MySQL]
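
A Debezium connector is deployed by posting a JSON document to the Kafka Connect REST API (`POST /connectors`). The sketch below only builds that document. It is not from the talk, and it assumes the property names of the 0.9-era MySQL connector (later versions renamed some of them, so check the documentation for the version in use). All host names, credentials and topic names are placeholders.

```python
import json


def build_connector_request(name):
    """Build the JSON body for registering a Debezium MySQL connector."""
    return {
        "name": name,
        "config": {
            "connector.class": "io.debezium.connector.mysql.MySqlConnector",
            # Placeholder connection details for the source database.
            "database.hostname": "mysql",
            "database.port": "3306",
            "database.user": "debezium",
            "database.password": "changeme",
            # Unique numeric ID under which the connector joins as a replica.
            "database.server.id": "184054",
            # Logical name, used as prefix for the change event topics.
            "database.server.name": "dbserver1",
            "database.whitelist": "inventory",
            # Topic in which the connector records the schema history.
            "database.history.kafka.bootstrap.servers": "kafka:9092",
            "database.history.kafka.topic": "schema-changes.inventory",
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_connector_request("inventory-connector"), indent=2))
```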

Slide 14

Slide 14 text

CDC with Debezium and Kafka Connect
[Diagram: Postgres, MySQL, Kafka Connect (x2), Apache Kafka, DBZ PG, DBZ MySQL, Elasticsearch, ES Connector]

Slide 15

Slide 15 text

Microservice CDC Patterns

Slide 16

Slide 16 text

Pattern: Microservice Data Synchronization
Microservice Architectures
- Propagate data between different services without coupling
- Each service keeps optimised views locally
[Diagram: Order, Item and Stock, each with App and Local DB; Item Changes, Stock Changes]

Slide 17

Slide 17 text

Pattern: Outbox
Avoiding Dual Writes
[Diagram: Order Service, Source DB (with "Events" table), Kafka Connect, DBZ, Apache Kafka, Order Events, Credit Worthiness Check Events, Shipment Service, Customer Service]

ID    Category   Type                  Payload
123   Order      OrderCreated          { "id" : 123, ... }
456   Order      OrderDetailCanceled   { "id" : 456, ... }
789   ...        ...                   ...
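
The write side of the outbox pattern can be sketched as below: the business row and the event row go into the same local transaction, so there is no second write to Kafka that could fail on its own. This is an illustration only. It uses an in-memory SQLite database in place of the service's real DB, and the table layout, column names and the function `place_order` are assumptions modeled on the "Events" table above.

```python
import json
import sqlite3


def create_schema(conn):
    """Create a business table and an outbox ("events") table."""
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE events ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "category TEXT, type TEXT, payload TEXT)"
    )


def place_order(conn, order_id, item):
    """Insert the order and its OrderCreated event in one transaction."""
    # "with conn" commits if the block succeeds and rolls back on an exception,
    # so either both rows are stored or neither is.
    with conn:
        conn.execute("INSERT INTO orders (id, item) VALUES (?, ?)", (order_id, item))
        conn.execute(
            "INSERT INTO events (category, type, payload) VALUES (?, ?, ?)",
            ("Order", "OrderCreated", json.dumps({"id": order_id, "item": item})),
        )


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    create_schema(connection)
    place_order(connection, 123, "book")
    print(connection.execute("SELECT category, type, payload FROM events").fetchall())
```

A CDC connector capturing only the events table would then publish each inserted row to Kafka for the downstream services.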

Slide 18

Slide 18 text

Pattern: Microservice Extraction
Migrating from Monoliths to Microservices
- Extract microservice for single component(s)
- Keep write requests against running monolith
- Stream changes to extracted microservice
- Test new functionality
- Switch over, evolve schema only afterwards

Slide 19

Slide 19 text

Pattern: Ensuring Data Quality
Detecting Missing or Wrong Data
- Constantly compare record counts on source and sink side
  - Raise alert if threshold is reached
- Compare every n-th record field by field
  - E.g. have all records compared within one week
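
Both checks can be sketched in a few lines. This is a hypothetical illustration, not code from the talk: rows are modeled as dicts keyed by primary key, and the function names and the `offset` parameter are assumptions. Rotating `offset` from 0 to n-1 over successive runs (e.g. n = 7 with one run per day) compares every record within one cycle.

```python
def count_mismatch(source_count, sink_count, threshold):
    """Return True if the record counts differ by at least the threshold."""
    return abs(source_count - sink_count) >= threshold


def compare_every_nth(source_rows, sink_rows, n, offset=0):
    """Compare every n-th source record with its sink copy, field by field.

    Returns the primary keys of sampled records that are missing or
    different on the sink side.
    """
    mismatches = []
    for index, key in enumerate(sorted(source_rows)):
        if index % n != offset % n:
            continue  # not part of this run's sample
        if source_rows[key] != sink_rows.get(key):
            mismatches.append(key)
    return mismatches


if __name__ == "__main__":
    source = {i: {"id": i, "qty": i * 10} for i in range(1, 15)}
    sink = {key: dict(row) for key, row in source.items()}
    sink[3]["qty"] = 0   # wrong data on the sink side
    del sink[10]         # missing record on the sink side
    print(count_mismatch(len(source), len(sink), threshold=1))
    for day in range(7):
        print(day, compare_every_nth(source, sink, n=7, offset=day))
```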

Slide 20

Slide 20 text

Pattern: Leverage the Powers of SMTs
Single Message Transformations
- Aggregate sharded tables to single topic
- Keep compatibility with existing consumers
- Format conversions, e.g. for dates
- Ensure compatibility with sink connectors
- Extracting "after" state only
- Expand MongoDB's JSON structures
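
Real SMTs are Java classes plugged into Kafka Connect and enabled through connector configuration. The Python below is not such a class. It only illustrates the per-message logic of two of the transformations listed above, under assumed names: the `_shardN` topic suffix, the topic names and all function names are hypothetical.

```python
import re

# Assumed naming scheme: sharded tables end in "_shard<number>".
SHARD_TOPIC = re.compile(r"^(?P<base>.+)_shard\d+$")


def route_to_logical_topic(topic):
    """Map the topic of a sharded table to one shared logical topic."""
    match = SHARD_TOPIC.match(topic)
    return match.group("base") if match else topic


def extract_after_state(value):
    """Drop the change event envelope and keep only the "after" state."""
    # For delete events "after" is null, so this returns None.
    return value["payload"]["after"]


def transform(topic, value):
    """Apply both steps to a single message, as a chain of SMTs would."""
    return route_to_logical_topic(topic), extract_after_state(value)


if __name__ == "__main__":
    event = {"payload": {"before": None, "after": {"id": 1004}, "op": "c"}}
    print(transform("dbserver1.inventory.customers_shard2", event))
```

Emitting only the flat "after" state is what lets sink connectors that expect plain rows consume the topic.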

Slide 21

Slide 21 text

Demo

Slide 22

Slide 22 text

Running Debezium on Kubernetes
AMQ Streams: Enterprise Distribution of Apache Kafka
- Provides:
  - Container images for Apache Kafka, Connect, Zookeeper and MirrorMaker
  - Operators for managing/configuring Apache Kafka clusters, topics and users
  - Kafka Consumer, Producer and Admin clients, Kafka Streams
- Supported by Red Hat
- Upstream Community: Strimzi

Slide 23

Slide 23 text

Summary
- CDC enables use cases such as replication, microservices data exchange and much more
- Debezium: CDC for a growing number of databases
- Contributions welcome!
- Tell us about your feature requests and ideas!

Slide 24

Slide 24 text

Resources
- Website: http://debezium.io/
- Source code, examples, Compose files etc.: https://github.com/debezium
- Discussion group: https://groups.google.com/forum/#!forum/debezium
- Strimzi (Kafka on Kubernetes/OpenShift): http://strimzi.io/
- Latest news: @debezium

Slide 25

Slide 25 text

No content