
Apache Kafka JDBC Source Connector: What could go wrong?

Slides of my #KafkaSummit talk "Apache Kafka JDBC Source Connector: What could go wrong?"

When you need to source database events into Apache Kafka, the JDBC source connector is usually the first choice, thanks to its flexibility and the almost-zero setup required on the database side. But sometimes simplicity comes at the cost of accuracy, and missing events can have catastrophic impacts on our data pipelines.

In this session we'll see how the JDBC source connector works and explore the various modes it can operate in to load data in a bulk or incremental manner. Having covered the basics, we'll analyse the edge cases that cause things to go wrong, such as infrequent snapshot times, out-of-order events, non-incremental sequences, or hard deletes.
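To make the out-of-order edge case concrete, here is a minimal sketch that models timestamp-mode polling in plain Python. It is a simplified model of the idea (fetch rows whose timestamp is greater than the last one seen), not the connector's actual code; `Row`, `poll_timestamp_mode`, and the sample values are hypothetical names and data chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Row:
    id: int
    name: str
    ts: float  # change timestamp written by the application (hypothetical column)


def poll_timestamp_mode(table, last_ts):
    """Model one poll: SELECT * FROM table WHERE ts > last_ts ORDER BY ts."""
    new_rows = sorted((r for r in table if r.ts > last_ts), key=lambda r: r.ts)
    # The stored offset moves to the highest timestamp returned by this poll.
    new_offset = new_rows[-1].ts if new_rows else last_ts
    return new_rows, new_offset


# First poll sees three rows and stores 10.03 as the offset.
table = [Row(1, "a", 10.00), Row(2, "b", 10.01), Row(3, "c", 10.03)]
seen, offset = poll_timestamp_mode(table, 0.0)

# A row stamped 10.02 becomes visible only after that poll
# (for example because its transaction committed late).
table.append(Row(4, "d", 10.02))
late, offset = poll_timestamp_mode(table, offset)

print([r.id for r in seen])  # rows picked up by the first poll
print(late)                  # empty: 10.02 is not greater than the stored 10.03
```

In this model the late row is never emitted, because every later poll still filters on `ts > 10.03`.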

Finally, we'll look at other approaches, like the Debezium source connector, and demonstrate how a little more configuration on the database side helps avoid problems and sets up a reliable source of events for our streaming pipeline.
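As a rough illustration of what a log-based connector can expose, the sketch below reads a Debezium-style change event and reports the operation. The `before`, `after`, `op`, and `ts_ms` fields follow the Debezium event envelope; the payload itself is hand-written sample data, and `describe` and `OPS` are hypothetical helpers, not part of any library.

```python
# Operation codes used in the Debezium event envelope.
OPS = {"c": "create", "u": "update", "d": "delete", "r": "snapshot read"}


def describe(event):
    """Summarise a change event: operation, affected key, event time."""
    op = OPS.get(event["op"], "unknown")
    before, after = event.get("before"), event.get("after")
    # For a delete, only the "before" image carries the row.
    key = (after or before or {}).get("id")
    return f"{op} id={key} at {event['ts_ms']}"


# Hand-written sample of a hard delete (illustrative values only).
delete_event = {
    "before": {"id": 3, "name": "c"},
    "after": None,
    "op": "d",
    "ts_ms": 1650967380000,
    "source": {"table": "people"},
}

summary = describe(delete_event)
print(summary)  # delete id=3 at 1650967380000
```

The point of the sketch is that a hard delete arrives as an explicit event with the previous row state, something a polling query cannot observe once the row is gone.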

Want to reliably take your database events into Apache Kafka? This session is for you!

FTisiot

April 26, 2022

Transcript

  1. Francesco Tisiot - Developer Advocate @ftisiot - @aiven_io JDBC Source Connector: What Could Go Wrong?

  8. Kafka Connect

  9. JDBC Source Connector

  10. List of Tables, Polling Interval, Query Mode

  11. Bulk Mode

  12. Incremental Mode: WHERE ID > 4, WHERE ID > 6

  13. Timestamp Mode: WHERE TS > 10.03, WHERE TS > 10.05

  14. Query Mode: WHERE COL = △

  15. Problems

  16. Which JDBC Connector? https://github.com/aiven/jdbc-connector-for-apache-kafka

  17. Common Challenges: Data Types and Number Mapping (numeric.mapping), Out of Memory Errors (defaultRowFetchSize)

  18. ERROR java.lang.IllegalArgumentException: Number of groups must be positive (table.types)

  19. Everything is Fine (Not)

  20. Fast Events

  21. Polling Interval, State, Event

  22. Ghost Events

  23. Id, Name table (rows 1, 2, 3): Incremental = No Updates!

  24. Name, Change Timestamp table (10:00, 10:01, 10:02, 10:03): No Hard Deletes!

  25. Out of Order Events

  26. Name, Change Timestamp table: 10:00, 10:01, 10:03, 10:03; 10:00, 10:01, 10:03, 10:02

  27. Why? Device Clock, Network Lag, Transaction Duration, Batching

  28. Polling Interval: 10:01, 10:02, 10:02. Polling Interval: 10:02 > 10:01

  29. Polling Interval, timestamp.delay.interval.ms, Delay

  30. JDBC Limits: Polling Time, Out of Order Events, Load on the DB, Updates/Deletions Require Extra Fields

  31. Log Based Approach

  32. Write Ahead Log - PostgreSQL, binlog - MySQL, oplog - MongoDB

  33. Debezium Connector

  34. Video

  35. Video

  36. JDBC Limits: Polling Time, Out of Order Events, Load on the DB, Updates/Deletions Require Extra Fields. All Events: Near Real Time, Tracked as per Log, Minimal Load, No Extra Fields

  37. Additional Benefit - Enhanced Metadata!

  38. Timestamps, Pre-Post status, Operation Type, Sequence Number

  40. JDBC, Debezium

  41. https://aiven.io, Debezium Connector, JDBC Source Connector in Action, Debezium Connector in Action, JDBC Connector, https://ftisiot.net/talks/kafka-jdbc-what-can-go-wrong/, kafka-summit-2022 500$
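
The configuration properties named in the slides (`numeric.mapping`, `defaultRowFetchSize`, `table.types`, `timestamp.delay.interval.ms`) could be combined roughly as in the sketch below, which builds a payload in the shape accepted by the Kafka Connect REST API (`name` plus `config`). The host, database, table, column, credentials, and all values are assumptions for illustration, and the accepted values for each property should be checked against the connector version in use.

```python
import json

# Hypothetical connector definition; every value here is a placeholder.
payload = {
    "name": "jdbc-source-example",
    "config": {
        "connector.class": "io.aiven.connect.jdbc.JdbcSourceConnector",
        # defaultRowFetchSize is a PostgreSQL JDBC driver parameter that
        # bounds how many rows are fetched per round trip.
        "connection.url": (
            "jdbc:postgresql://db.example.com:5432/shop"
            "?defaultRowFetchSize=1000"
        ),
        "connection.user": "connect_user",
        "connection.password": "change-me",
        "table.whitelist": "orders",
        "table.types": "TABLE",
        "mode": "timestamp",
        "timestamp.column.name": "updated_at",
        "poll.interval.ms": "5000",
        # Only pick up rows older than this delay, leaving time for
        # late-committing transactions to become visible.
        "timestamp.delay.interval.ms": "2000",
        "numeric.mapping": "best_fit",
        "topic.prefix": "shop_",
    },
}

rendered = json.dumps(payload, indent=2)
print(rendered)
```

The delay trades latency for completeness: a larger `timestamp.delay.interval.ms` tolerates slower transactions but pushes events into Kafka later.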