Streaming Data Integration with Apache Kafka Connect @ Graz Kafka Meetup #1

Abstract:

Stream processing has gained a lot of traction recently, and more often than not we find Apache Kafka, the de facto standard event streaming platform, acting as the central nervous system of company-wide data architectures. Nevertheless, many real-world use cases still need operational data stores as complementary components to meet various application-related requirements and challenges. Join this session to learn how Kafka Connect enables robust integration paths between both worlds. The focus lies on configuration-based data-in-motion scenarios that leverage turnkey connector implementations to lay out streaming data pipelines without writing a single line of code.

Bio:

Hans-Peter Grahsl works as a technical trainer at NETCONOMY in Graz, Austria. As an independent engineer & consultant, he works with customers to build on-premises or cloud-based data architectures using NoSQL data stores and event streaming platforms such as Apache Kafka. Hans-Peter is also an associate lecturer for Software Engineering at CAMPUS 02 and occasionally speaks at developer conferences.

Event Page:
https://www.meetup.com/Graz-Kafka/events/265837901/

Recording:
There was no recording of this session.

Hans-Peter Grahsl

November 27, 2019

Transcript

  1. Streaming Data Integration
    with Apache Kafka Connect

  2. Hans-Peter Grahsl
    • technical trainer at NETCONOMY
    • independent consultant & engineer
    • associate lecturer at CAMPUS 02
    • occasional conference speaker
    @hpgrahsl | Apache Kafka® Meetup, 27th Nov. 2019, Graz - Austria 2

  3. Apache
    Kafka

  5. STREAMING
    Platform

  6. Apache Kafka
    • pub / sub to event streams
    • (permanently) store event streams
    • process streams in near real-time
    ➔ horizontal scalability
    ➔ high availability

  7. EVENTS
    ...events everywhere

  8. APIs for "Everything"

  20. Apache Kafka
    Connect

  21. Apache Kafka Connect

  26. disentangling
    spaghetti
    architectures

  27. Apache Kafka Connect
    • often about data stores

  28. Apache Kafka Connect
    • concrete examples

  31. MongoDB Connector
    • officially supported by MongoDB
    • developed open-source on GitHub
    • verified Gold by Confluent

  32. MongoDB Connector
    • recommendation: use the official connector
    https://confluent.io/hub/mongodb/kafka-connect-mongodb
    • instead of my community sink connector
    https://confluent.io/hub/hpgrahsl/kafka-connect-mongodb
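
To give a rough idea of what a configuration-only setup can look like, the sketch below builds the JSON payload that could be sent to the Kafka Connect REST API (`POST /connectors`) to register the official MongoDB sink connector. The connector name, topics, database, collection and connection URI are placeholder assumptions and are not taken from the talk.

```python
import json


def mongodb_sink_payload(name, topics, connection_uri, database, collection):
    """Build a registration payload for the official MongoDB sink connector."""
    return {
        "name": name,
        "config": {
            "connector.class": "com.mongodb.kafka.connect.MongoSinkConnector",
            "tasks.max": "1",
            "topics": ",".join(topics),       # Connect expects a comma-separated list
            "connection.uri": connection_uri,
            "database": database,
            "collection": collection,
        },
    }


# All values below are hypothetical placeholders.
payload = mongodb_sink_payload(
    name="mongodb-sink-example",
    topics=["orders", "customers"],
    connection_uri="mongodb://localhost:27017",
    database="demo",
    collection="events",
)

print(json.dumps(payload, indent=2))
```

The printed JSON is the whole "program": no connector code has to be written, only this configuration has to be submitted to a running Connect worker.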

  33. Source Connectors

  35. Change-Data-Capture
    • react to database changes
    • INSERTs and UPDATEs
    • DELETEs (if applicable)

  36. Change-Data-Capture
    "...is one giant enabler [...] ultimately,
    it's liberation for your data."
    — Gunnar Morling

  38. Change-Data-Capture
                                            query-based   log-based
    no changes missed                            !            ✔
    low delay & low polling overhead             !            ✔
    data model agnostic                          !            ✔
    captures deletions & previous state          !            ✔
    installation & configuration                 ✔            !
    Debezium Blog https://bit.ly/2CRUvxo
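
The difference between the two approaches can be illustrated with a small, self-contained simulation. This is only a sketch: the `Table` class, its change log and both reader functions are hypothetical stand-ins and do not correspond to any real connector or database API.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    """Toy table that keeps current rows plus an append-only change log."""
    rows: dict = field(default_factory=dict)  # key -> (value, version)
    log: list = field(default_factory=list)   # (op, key, before, after)
    version: int = 0

    def upsert(self, key, value):
        before = self.rows.get(key, (None, None))[0]
        self.version += 1
        self.rows[key] = (value, self.version)
        op = "INSERT" if before is None else "UPDATE"
        self.log.append((op, key, before, value))

    def delete(self, key):
        before = self.rows.pop(key)[0]
        self.version += 1
        self.log.append(("DELETE", key, before, None))


def poll_query_based(table, last_seen_version):
    """Query-based CDC: select rows whose version is newer than the last poll."""
    changed = {k: v for k, (v, ver) in table.rows.items() if ver > last_seen_version}
    return changed, table.version


def read_log_based(table, offset):
    """Log-based CDC: read every log entry appended since the last offset."""
    return table.log[offset:], len(table.log)


table = Table()
table.upsert(1, "draft")
table.upsert(1, "published")  # the intermediate state "draft" is overwritten
table.upsert(2, "temp")
table.delete(2)               # row 2 is gone before the next poll happens

polled, _ = poll_query_based(table, last_seen_version=0)
events, _ = read_log_based(table, offset=0)

print(polled)                 # only the latest surviving row state
for event in events:
    print(event)              # every change, including the delete and prior values
```

In this toy run the poll only sees the final state of row 1, while the log reader sees all four changes, which mirrors the first and fourth rows of the comparison.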

  40. Apache Kafka
    Source Connectors
    Demo

  41. ...break apart
    the silos

  42. to unleash
    your data

  43. Source Connectors

  44. Single Message Transforms
    • cast types
    • drop key / value
    • mask fields
    • blacklist / whitelist fields
    • convert timestamps
    • topic routing
    • ...
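
Single Message Transforms are declared purely in connector configuration. The sketch below assembles such a chain as flat configuration keys; the helper `smt_chain`, the aliases and the field names (`email`, `created_at`) are illustrative assumptions, while the transform class names and their options are those of the transforms shipped with Apache Kafka.

```python
import json


def smt_chain(transforms):
    """Flatten an ordered list of (alias, type, options) into Connect config keys."""
    config = {"transforms": ",".join(alias for alias, _, _ in transforms)}
    for alias, smt_type, options in transforms:
        config[f"transforms.{alias}.type"] = smt_type
        for key, value in options.items():
            config[f"transforms.{alias}.{key}"] = value
    return config


# Transforms are applied in the order listed; field names are hypothetical.
config = smt_chain([
    ("maskEmail",
     "org.apache.kafka.connect.transforms.MaskField$Value",
     {"fields": "email"}),
    ("convertTs",
     "org.apache.kafka.connect.transforms.TimestampConverter$Value",
     {"field": "created_at", "target.type": "string", "format": "yyyy-MM-dd"}),
    ("route",
     "org.apache.kafka.connect.transforms.RegexRouter",
     {"regex": "(.*)", "replacement": "cdc.$1"}),
])

print(json.dumps(config, indent=2))
```

These keys would be merged into a source or sink connector configuration; each record then passes through the masking, timestamp conversion and topic routing steps one after the other.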

  45. Source Connectors

  46. Converters
    • Kafka only stores bytes
    • converters do (de)serializations
    • e.g. String, JSON, Avro, ...
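
The idea behind converters can be sketched in a few lines: structured data goes in, bytes come out, and the reverse happens on the consuming side. The two toy classes below are assumptions made for illustration only and are not the actual Kafka Connect `Converter` interface; in a real setup the converter is selected via the `key.converter` and `value.converter` settings, for example `org.apache.kafka.connect.json.JsonConverter`.

```python
import json


class ToyStringConverter:
    """Treats every value as plain UTF-8 text."""

    def from_connect_data(self, value):
        return str(value).encode("utf-8")

    def to_connect_data(self, data):
        return data.decode("utf-8")


class ToyJsonConverter:
    """Serializes structured values as JSON text encoded in UTF-8."""

    def from_connect_data(self, value):
        return json.dumps(value, sort_keys=True).encode("utf-8")

    def to_connect_data(self, data):
        return json.loads(data.decode("utf-8"))


record = {"id": 42, "name": "Alice"}   # hypothetical record value

raw = ToyJsonConverter().from_connect_data(record)    # what the broker stores
restored = ToyJsonConverter().to_connect_data(raw)    # structured again
as_text = ToyStringConverter().to_connect_data(raw)   # same bytes, read as text

print(raw)
print(restored)
print(as_text)
```

The same bytes yield a dictionary or a plain string depending on which converter reads them, which is why producers and consumers of a topic need to agree on the serialization format.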

  47. Serialization Formats

  52. Schema Governance
    "If your dev process doesn't validate
    schema compatibility somewhere
    between your IDE and production - you
    are screwed and don't know it."
    — Gwen Shapira

  53. "Best bet" currently
    together with Confluent's Schema Registry

  54. Sink Connectors

  58. Apache Kafka
    Sink Connectors
    Demo

  59. Exemplary
    Use Cases

  60. Customer 360° View

  65. Synchronization across Services

  70. Near Real-Time Recommendations