Change Data Capture with Debezium @ Java Vienna - June 2024, Austria

Abstract:

This session introduces you to Debezium (https://debezium.io), one of the most powerful open source change data capture platforms. You will learn about its core features and explore the different ways Debezium lets you build streaming data pipelines using popular databases as data sources. Examples focus on using Debezium together with Apache Kafka, but also briefly show how to stream captured change events to other types of data sinks, such as a simple HTTP endpoint.

These are the main takeaways:

* Understand the benefits of log-based change data capture (CDC)
* Know the 3 ways to run and work with Debezium
* Use Debezium’s PostgreSQL connector to publish database changes to Apache Kafka
* Run Debezium Server to set up a CDC pipeline between PostgreSQL and a web API endpoint
* Work with Debezium UI to configure and inspect a MySQL source connector
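
To make the PostgreSQL-to-Kafka takeaway more concrete, here is a minimal sketch of the JSON body one would typically send to the Kafka Connect REST API (`POST /connectors`) to register a Debezium PostgreSQL source connector. The property keys are standard Debezium connector options; the connector name, host, credentials, database, and table are placeholder assumptions and are not taken from the talk or the tutorial.

```python
import json


def build_postgres_connector(name: str, topic_prefix: str) -> dict:
    """Build the registration body for a Debezium PostgreSQL source connector."""
    return {
        "name": name,
        "config": {
            "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
            # pgoutput is PostgreSQL's built-in logical decoding plugin.
            "plugin.name": "pgoutput",
            # All connection values below are hypothetical placeholders.
            "database.hostname": "postgres",
            "database.port": "5432",
            "database.user": "postgres",
            "database.password": "postgres",
            "database.dbname": "inventory_db",
            # Prefix for the Kafka topics the connector writes to.
            "topic.prefix": topic_prefix,
            # Capture only this table (schema-qualified name).
            "table.include.list": "public.customers",
        },
    }


payload = build_postgres_connector("inventory-connector", "dbserver1")
print(json.dumps(payload, indent=2))
```

With these assumed values, change events for the captured table would be expected on a topic named after the prefix, schema, and table, i.e. `dbserver1.public.customers`.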

Tutorial Docs: https://redhat-scholars.github.io/debezium-tutorial/debezium-tutorial/index.html

Tutorial Repo: https://github.com/redhat-scholars/debezium-tutorial

Recording: pending

Hans-Peter Grahsl

June 03, 2024

Transcript

  1. Change Data Capture with Debezium ☕ Java Vienna 🎡 | June 3, 2024 Hans-Peter Grahsl @hpgrahsl Developer 🥑 Advocate Red Hat
  2. debezium.io Hans-Peter Grahsl “I’m passionate about event-driven architectures, distributed stream processing systems and data engineering.” Developer 🥑 Advocate @ Red Hat based in Graz, Austria open-source enthusiast Confluent Community Catalyst MongoDB Champion since 2020 speaker at dev & tech conferences @[email protected]
  3. Agenda • What is Debezium and Change Data Capture (CDC)? • Benefits of Log-based CDC • Databases supported by Debezium • Change Event Payload Structure • Debezium deployment modes • 🎬 Debezium in Action 🎬
  4. Debezium in a Nutshell • fully open-source + very active community • change data capture (CDC) platform ◦ based on transaction logs ◦ snapshotting, filtering, routing, flattening etc. ◦ web-based UI • large production deployments
  5. Currently supported Databases • 10 databases ◦ relational + non-relational ◦ 8 production-ready ◦ 2 incubating: Vitess + Informix • + JDBC sink connector
  6. Currently supported Databases • 3 “categories” of connectors ◦ core ◦ community-led ◦ external • 🎊 DB support is steadily growing 🤩
  7. Benefits of Log-based CDC • data model agnostic • get additional meta-data • no changes missed • access to previous state • low latency & little overhead • captures delete operations
  8. Change Event Payload Structure • Message Key: table’s primary key • Message Value: ◦ old & new data state ◦ meta-data on table, TX id, etc. ◦ operation type, timestamp • Serialization → JSON, Avro, … • CloudEvents spec also supported
  10. Debezium enables various Use Cases 🤩 All of these and many more can be powered by change data capture and Debezium! 🤩
  11. Further CDC-related materials • What is change data capture (CDC)? https://www.redhat.com/en/topics/integration/what-is-change-data-capture • Using Change Data Capture for Stack Modernization https://www.solutionpatterns.io/solution-pattern-modernization-cdc/solution-pattern-modernization-cdc/index.html • Use Case: Emerging Disease Detection https://validatedpatterns.io/patterns/emerging-disease-detection/
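
The change event payload structure described on the slides (old and new data state, source meta-data, operation type, timestamp) can be illustrated with a small sketch. The event below is a hand-written, hypothetical example of a JSON-serialized message value with the schema part omitted; the field names `before`, `after`, `source`, `op`, and `ts_ms` follow Debezium's event envelope, while all values and the `describe` helper are made up for illustration.

```python
import json

# Hypothetical change event value in Debezium's envelope format.
RAW_EVENT = """
{
  "before": {"id": 1001, "email": "old@example.com"},
  "after": {"id": 1001, "email": "new@example.com"},
  "source": {"connector": "postgresql", "db": "inventory_db",
             "schema": "public", "table": "customers", "txId": 556},
  "op": "u",
  "ts_ms": 1717400000000
}
"""

# Debezium encodes the operation type as a single character.
OPERATIONS = {"c": "create", "u": "update", "d": "delete", "r": "read (snapshot)"}


def describe(event: dict) -> str:
    """Summarize an event: operation, source table, and changed fields."""
    source = event["source"]
    table = f'{source["schema"]}.{source["table"]}'
    op = OPERATIONS.get(event["op"], event["op"])
    before, after = event.get("before"), event.get("after")
    # Only updates carry both states; creates lack "before", deletes lack "after".
    if before and after:
        changed = sorted(k for k in after if before.get(k) != after[k])
    else:
        changed = []
    return f"{op} on {table}, changed fields: {changed}"


event = json.loads(RAW_EVENT)
summary = describe(event)
print(summary)
```

Because the previous state travels with the event, a consumer can compute a field-level diff without querying the source database, which is one of the benefits of log-based CDC listed above. For a delete, `after` would be null and `before` would hold the removed row.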