Keep your cache always fresh with Debezium! (Current 22)

The saying goes that there are only two hard things in Computer Science: cache invalidation and naming things. Well, it turns out the first one is actually solved. Join us for this session to learn how to keep read views of your data in distributed caches close to your users, always kept in sync with your primary data stores via change data capture. You will learn how to:

* Implement a low-latency data pipeline for cache updates based on Debezium, Apache Kafka, and Infinispan
* Create denormalized views of your data using Kafka Streams and make them accessible via plain key look-ups from a cache cluster close by
* Propagate updates between cache clusters using cross-site replication

We'll also touch on some advanced concepts, such as detecting and rejecting writes to the system of record which are derived from outdated cached state, and show in a demo how all the pieces come together, of course connected via Apache Kafka.
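
The denormalized views mentioned in the list above can be pictured with a small sketch. The talk builds them with Kafka Streams; the plain-Python version below only illustrates the join logic, and the class name, the `order`/`order line` tables, and all field names are hypothetical.

```python
from collections import defaultdict


class OrderViewBuilder:
    """Joins order and order-line change events into one document per order."""

    def __init__(self):
        self.orders = {}                 # order id -> latest order row
        self.lines = defaultdict(dict)   # order id -> {line id: latest line row}

    def on_order(self, order):
        """Handle a changed order row and return the refreshed view."""
        self.orders[order["id"]] = order
        return self._view(order["id"])

    def on_line(self, line):
        """Handle a changed order-line row and return the refreshed view."""
        self.lines[line["order_id"]][line["id"]] = line
        return self._view(line["order_id"])

    def _view(self, order_id):
        order = self.orders.get(order_id)
        if order is None:
            return None  # parent row not seen yet, so nothing to publish
        lines = sorted(self.lines[order_id].values(), key=lambda l: l["id"])
        return {**order, "lines": lines}
```

Each returned view would then be stored in the cache under the order id, so that readers need only a single key look-up.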

Gunnar Morling

October 05, 2022

Transcript

  1. Image © Nathalie https://flic.kr/p/21Ghf2g (CC BY 2.0)
    Keep Your Cache Always Fresh With Debezium!
    Gunnar Morling
    Software Engineer, Red Hat
    @gunnarmorling

  2. #Debezium @gunnarmorling
    The Challenge
    Multi-site Application With Shared Database

  5. The Challenge
    Multi-site Application With Shared Database
    ● Deployments in different geographies, single DB
    ● 90% read requests
    ● Complex queries

  6. Today’s Mission 🤔
    Explore How to Build a…
    … Multi-site application with shared system-of-record database
    … With local, denormalized read views (CQRS)
    … Automatically kept in sync after writes

  7. Gunnar Morling
    ● Software engineer at Red Hat
      ○ Debezium
      ○ Quarkus
    ● kcctl 🧸, JfrUnit, ModiTect, MapStruct
    ● Spec Lead for Bean Validation 2.0
    ● Java Champion

  8. The Idea
    Caching to the Rescue!

  10. https://flic.kr/p/PFDvkY Public Domain, Angelo Brathot

  11. Infinispan
    100% Open-source In-Memory Distributed Data Store
    ● Interoperability
    ● Resilient, Fault Tolerant Data
    ● Clustered Processing
    ● Query
    ● ACID Tx

  12. Infinispan Deployment
    Distributed Cache
    [Diagram: three nodes (App 1, App 2, App 3), each with local data; the entries (1, Maria), (2, Jenny), and (3, Juan) are each stored on two nodes]

  13. Infinispan Deployment
    Distributed Cache
    [Diagram: three nodes (App 1, App 2, App 3), each with local data; the entries (1, Maria), (2, Jenny), and (3, Juan) are each stored on two nodes; two "Put 4 Will" operations are shown]

  14. Infinispan Deployment
    Distributed Cache
    [Diagram: three nodes (App 1, App 2, App 3), each with local data; the entries (1, Maria), (2, Jenny), (3, Juan), and (4, Will) are each stored on two nodes]
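
The behavior in these three slides, where each entry ends up on two of the three nodes, can be mimicked with a toy model. This is a minimal sketch assuming two owners per key and simple modulo placement; Infinispan itself uses segment-based consistent hashing, and the class and method names here are hypothetical.

```python
import hashlib


class DistributedCache:
    """Toy distributed cache: every key is stored on `num_owners` nodes."""

    def __init__(self, node_names, num_owners=2):
        self.names = list(node_names)
        self.nodes = {name: {} for name in self.names}  # per-node local data
        self.num_owners = num_owners

    def owners(self, key):
        """Pick the owning nodes from a stable hash of the key."""
        digest = hashlib.sha256(str(key).encode()).digest()
        start = int.from_bytes(digest[:8], "big") % len(self.names)
        return [self.names[(start + i) % len(self.names)]
                for i in range(self.num_owners)]

    def put(self, key, value):
        # The write goes to every owner, so one node can fail without data loss.
        for name in self.owners(key):
            self.nodes[name][key] = value

    def get(self, key):
        # Reads are served by the first (primary) owner in this sketch.
        return self.nodes[self.owners(key)[0]].get(key)


cache = DistributedCache(["App 1", "App 2", "App 3"])
for key, value in [(1, "Maria"), (2, "Jenny"), (3, "Juan"), (4, "Will")]:
    cache.put(key, value)
```

Which two nodes hold a given key depends on the hash, so the exact placement will differ from the diagram.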

  16. Infinispan
    Client/Server
    [Diagram: Service 1, Service 2, and Service 3 connect via binary (Hot Rod), REST, ... to an Infinispan Cluster of three data nodes]

  17. Infinispan Cross-Site Replication
    [Diagram: a load balancer in front of application services at two sites, AWS (LON) and GCP (NYC); each site keeps the shared state in its own data nodes (Data LON, Data NYC), linked via RELAY2]

  18. The Question
    How To Keep The Cache In Sync?

  19. Dual Writes
    Easy to Get Wrong
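
One way dual writes go wrong is through interleaving of concurrent requests. The sketch below is an illustration only, with plain dictionaries standing in for the database and the cache and a hypothetical helper `dual_write_steps`; it replays one possible ordering of two requests that each write to both stores.

```python
def dual_write_steps(database, cache, key, value):
    """Return the two independent writes of one request as separate steps."""
    def write_database():
        database[key] = value

    def write_cache():
        cache[key] = value

    return write_database, write_cache


database, cache = {}, {}
a_db, a_cache = dual_write_steps(database, cache, "item-1", "A")
b_db, b_cache = dual_write_steps(database, cache, "item-1", "B")

# Request A reaches the database first, but request B reaches the cache first.
a_db()
b_db()
b_cache()
a_cache()

# The two stores now disagree and nothing will reconcile them.
diverged = database["item-1"] != cache["item-1"]
```

Nothing failed here, yet the database ends with "B" while the cache ends with "A". A failure between the two writes of a single request leads to the same kind of divergence.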

  20. https://flic.kr/p/PFDvkY Public Domain, Angelo Br

  21. Debezium in a Nutshell
    Open-Source Change Data Capture
    ● A CDC Platform
      ○ Based on transaction logs
      ○ Snapshotting, filtering, etc.
      ○ Outbox support
      ○ Web-based UI
    ● Fully open-source, very active community
    ● Large production deployments

  22. Debezium
    Becoming the De-Facto CDC Standard
    https://debezium.io/blog/2021/09/22/deep-dive-into-a-debezium-community-connector-scylla-cdc-source-connector/

  23. Debezium: Deployment Alternatives
    Embedded Engine and Debezium Server

  24. Solution Overview
    Capturing Changes From the Database
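
Once changes are captured, each event has to be applied to the cache. The following is a minimal sketch, assuming events arrive as already-parsed Debezium envelopes without the schema wrapper (fields `op` and `after`), with a dictionary standing in for the cache; the function name and the sample row are hypothetical.

```python
def apply_change_event(cache, key, event):
    """Apply one change event to a key-value cache."""
    if event is None:
        # Tombstone record that follows a delete: make sure the key is gone.
        cache.pop(key, None)
        return
    op = event["op"]
    if op in ("c", "u", "r"):
        # Create, update, or snapshot read: store the new row state.
        cache[key] = event["after"]
    elif op == "d":
        # Delete: evict the entry.
        cache.pop(key, None)
    # Other operation types (for example truncate) are ignored in this sketch.


cache = {}
apply_change_event(cache, 4, {"op": "c", "before": None,
                              "after": {"id": 4, "name": "Will"}})
apply_change_event(cache, 4, {"op": "u", "before": {"id": 4, "name": "Will"},
                              "after": {"id": 4, "name": "William"}})
```

Because the events come from the transaction log in commit order per key, applying them one by one avoids the ordering problems of writing to the database and the cache separately.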

  25. https://flic.kr/p/PFDvkY Public Domain, Angelo Brathot
    Demo

  26. Quarkus - Supersonic Subatomic Java
    A Stack for Building Cloud-native Apps
    ● Fast start-up, low memory consumption
    ● Developer joy
    ● Imperative and Reactive
    ● Best-of-breed libraries
    ● Run via HotSpot and GraalVM native binaries

  27. https://flic.kr/p/PFDvkY Public Domain, Angelo Brathot
    Demo

  28. Eventual Consistency
    Dealing With Stale Data
    ● Cached views updated asynchronously
    ● How to ensure read-your-own-writes semantics?
    ● How to prevent taking action based on stale data?
    🤔

  29. Eventual Consistency
    Read Your Own Writes
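
The transcript does not spell out how read-your-own-writes is achieved. One common approach, assumed here rather than taken from the talk, is for a client to remember the version of its last write and to accept a cached entry only once the cache has caught up to that version. The function and field names below are hypothetical.

```python
def read_own_writes(key, min_version, cache, database):
    """Serve from the cache only if it reflects the caller's last write."""
    cached = cache.get(key)
    if cached is not None and cached["version"] >= min_version:
        return cached, "cache"
    # The cache is still behind: fall back to the system of record.
    return database[key], "database"


database = {"order-1": {"version": 2, "status": "PAID"}}
cache = {"order-1": {"version": 1, "status": "NEW"}}  # update not yet applied

value, source = read_own_writes("order-1", 2, cache, database)
```

The fallback costs a round trip to the primary database, but only for the short window until the change event has reached the cache.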

  33. Eventual Consistency
    Detecting Writes Derived From Stale State
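
A standard technique for rejecting writes that are based on outdated cached state is optimistic locking with a version number. The sketch below assumes that technique; the mechanism used in the talk's demo may differ, and `SystemOfRecord` and `StaleWriteError` are hypothetical names.

```python
class StaleWriteError(Exception):
    """Raised when a write was derived from an outdated version."""


class SystemOfRecord:
    """Toy primary store that keeps a version counter per row."""

    def __init__(self):
        self.rows = {}  # key -> (version, value)

    def insert(self, key, value):
        self.rows[key] = (1, value)

    def update(self, key, new_value, expected_version):
        current_version, _ = self.rows[key]
        if expected_version != current_version:
            # The caller read an older version, for example from a lagging cache.
            raise StaleWriteError(
                f"expected version {expected_version}, found {current_version}")
        self.rows[key] = (current_version + 1, new_value)
        return current_version + 1


store = SystemOfRecord()
store.insert("order-1", "NEW")
store.update("order-1", "PAID", expected_version=1)   # accepted, now version 2
try:
    store.update("order-1", "CANCELLED", expected_version=1)  # based on stale read
    rejected = False
except StaleWriteError:
    rejected = True
```

The cached view has to carry the row version for this to work, so that a client can send back the version its decision was based on.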

  34. Today’s Mission 🤩
    Explore How to Build a…
    … Multi-site application with shared system-of-record database ✅
    … With local, denormalized read views (CQRS) ✅
    … Automatically kept in sync after writes ✅

  35. Discussion
    What Have We Gained?
    ● Pro:
      ○ Lower latencies
      ○ Reduced load on primary database
      ○ Increased availability
    ● Con:
      ○ Increased complexity

  36. What’s New in Debezium?
    ● Incremental snapshotting
    ● Postgres logical decoding messages
    ● Multi-DB support (SQL Server)
    ● Debezium Server sinks
    ● MongoDB change streams support
    ● Debezium UI
    ● Debezium 2.0

  37. Learn More
    ● Infinispan: @infinispan | https://infinispan.org/
    ● Debezium: @debezium | https://debezium.io/
    ● Demo: https://github.com/debezium/debezium-examples/ → distributed-caching
    ● kcctl 🧸: https://github.com/kcctl/kcctl/

  38. Thank You!
    Q & A
    📧 [email protected]
    @gunnarmorling