Keeping your Caches Hot with Apache Kafka and the Connector API

Ricardo Ferreira

September 18, 2019

Transcript

  1. @riferrei | #oraclecodeone | @CONFLUENTINC Keep your caches hot with Apache Kafka® and the Connector API [DEV1346]
  2. KS19Meetup. CONFLUENT COMMUNITY DISCOUNT CODE 25% OFF*

  3. About us:
    • Ricardo Ferreira
      ❑ Developer Advocate @ Confluent
      ❑ Ex-Oracle, Red Hat, IONA Tech
      ❑ Ricardo@confluent.io
      ❑ https://riferrei.net
    • Alexa (Amazon Echo)
      ❑ The voice behind Amazon
      ❑ Ex-Raspberry Pi, Arduino
      ❑ She is a female in character!
    @riferrei @alexa99
  4. Data is only useful if it is fresh and contextual
  5. Airbag systems are made of three key components:
    • The bag itself
    • The sensors, which detect a probable collision based on speed
    • The inflation system, which combines chemical compounds (sodium azide and potassium nitrate) to produce the gas that inflates the bag
    What if the airbag deploys 30 seconds after the collision?
  6. Caches are great to keep data fresh
  7. APIs need to access large amounts of data freely and easily
    • Data should never be the scarce resource of APIs
    • Latency should be kept to a minimum
    • Data should not be static: keep it always updated
    • Find ways to handle large amounts of data easily
    [Diagram labels: Cache, API, Read, Write]
  8. Caches can be built-in or distributed
    • If the data fits into the API's memory space, use a built-in cache
    • Use distributed caches for large amounts of data
    • Some cache implementations provide both options
    • For distributed caches, make sure to use one that has O(1) retrieval time
    [Diagram labels: Built-in Caches (Cache, API); Distributed Caches (Cache, Cache, API); Read, Write]
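A built-in cache with O(1) retrieval can be sketched as a hash-based map living in the API's own memory. The class below is a minimal illustration, not something shown in the talk: the name `BuiltInCache` and the least-recently-used eviction policy are assumptions added for the example.

```python
from collections import OrderedDict


class BuiltInCache:
    """Tiny in-process cache (hypothetical). Lookups are hash-based, so O(1) on average."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._entries = OrderedDict()

    def get(self, key, default=None):
        if key not in self._entries:
            return default
        # Mark the entry as most recently used.
        self._entries.move_to_end(key)
        return self._entries[key]

    def put(self, key, value):
        self._entries[key] = value
        self._entries.move_to_end(key)
        if len(self._entries) > self.capacity:
            # Evict the least recently used entry (assumed policy).
            self._entries.popitem(last=False)


cache = BuiltInCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # touch "a" so that "b" becomes the eviction candidate
cache.put("c", 3)  # exceeds capacity, so "b" is evicted
```

Once the data no longer fits in one process, the same get/put contract would move to a distributed cache reached over the network.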
  9. Why Apache Kafka?

  10. 01 Messaging done right; 02 Stream processing; 03 Persistent storage
  11. Time for fun

  12. None

  13. Caching patterns

  14. Caching pattern: refresh ahead
    • Proactively keep the cache updated with the latest state
    • Keep the entries always in sync for better consistency
    • Ideal for latency-sensitive use cases such as APIs
    • Ideal when the data is costly to get from the backend
    • It may need data loading
    [Diagram labels: Kafka Connect, Cache, Kafka Connect, API]
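The refresh-ahead pattern can be sketched with in-memory stand-ins. In the example below, `changes_topic` plays the role of a Kafka topic fed by a source connector and the plain dict plays the role of the cache; both, along with the record shape and function names, are assumptions made for illustration only.

```python
from collections import deque

# Hypothetical stand-in for a Kafka topic that a source connector keeps feeding.
changes_topic = deque([
    {"key": "sku-1", "value": {"price": 10}},
    {"key": "sku-2", "value": {"price": 25}},
    {"key": "sku-1", "value": {"price": 12}},  # a later update for the same key
    {"key": "sku-2", "value": None},           # a null value is treated as a delete
])

cache = {}


def refresh_ahead(topic, cache):
    """Drain the topic in order and apply every record to the cache."""
    applied = 0
    while topic:
        record = topic.popleft()
        if record["value"] is None:
            cache.pop(record["key"], None)
        else:
            cache[record["key"]] = record["value"]
        applied += 1
    return applied


def api_get_price(sku):
    """The API only reads the cache; it never calls the backend."""
    entry = cache.get(sku)
    return None if entry is None else entry["price"]


applied = refresh_ahead(changes_topic, cache)
```

Because the cache is updated before any request arrives, `api_get_price` never pays the cost of a backend round trip.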
  15. Caching pattern: refresh ahead/adapt
    • Proactively keep the cache updated with the latest state
    • Keep the entries always in sync for better consistency
    • Ideal for latency-sensitive use cases such as APIs
    • Ideal when the data is costly to get from the backend
    • It may need data loading
    [Diagram labels: Kafka Connect, Cache, Kafka Connect, API; Transform and adapt records before delivery; Schema Registry for canonical models]
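The "adapt" step can be pictured as a function applied to each record before it is delivered to the cache. The sketch below is an assumption-heavy illustration: the source column names (`CUST_NM`, `EMAIL_ADDR`), the canonical field names, and the `customer:` key prefix are all made up for the example and do not come from the talk.

```python
def adapt(record):
    """Map a source-specific record (hypothetical columns) to a canonical shape."""
    source = record["value"]
    return {
        "key": f"customer:{record['key']}",
        "value": {
            "id": record["key"],
            "name": source["CUST_NM"].strip().title(),
            "email": source["EMAIL_ADDR"].lower(),
        },
    }


def deliver(records, cache, transform):
    """Apply the transform to each record, then write the result to the cache."""
    for record in records:
        adapted = transform(record)
        cache[adapted["key"]] = adapted["value"]


cache = {}
raw_records = [
    {"key": 42, "value": {"CUST_NM": "  ada LOVELACE ", "EMAIL_ADDR": "Ada@Example.COM"}},
]
deliver(raw_records, cache, adapt)
```

Keeping the transform separate from delivery means the API only ever sees the canonical model, whatever the backend's schema looks like.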
  16. Caching pattern: write behind
    • Removes I/O pressure from the API, allowing scalability
    • True horizontal scalability
    • Ensures event ordering and persistence (replayability)
    • Reduces database code complexity in the API
    • Handles database failures gracefully via replication
    [Diagram labels: Kafka Connect, Cache, Kafka Connect, API]
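A minimal write-behind sketch, under stated assumptions: a `queue.Queue` stands in for the Kafka topic, a dict stands in for the database, and a background thread stands in for a sink connector. None of these names come from the talk.

```python
import queue
import threading

cache = {}
write_log = queue.Queue()  # stand-in for a Kafka topic (ordered, drained asynchronously)
database = {}              # stand-in for the backend database


def api_write(key, value):
    """The API updates the cache and records the write; it never touches the database."""
    cache[key] = value
    write_log.put((key, value))


def sink_worker():
    """Stand-in for a sink connector: applies logged writes to the database in order."""
    while True:
        item = write_log.get()
        if item is None:  # sentinel used to stop the worker
            write_log.task_done()
            break
        key, value = item
        database[key] = value
        write_log.task_done()


worker = threading.Thread(target=sink_worker, daemon=True)
worker.start()

api_write("order-1", {"status": "NEW"})
api_write("order-1", {"status": "PAID"})

write_log.join()     # wait until every logged write has been applied
write_log.put(None)  # ask the worker to stop
worker.join()
```

The API call returns as soon as the write is logged, which is where the I/O relief comes from; ordering is preserved because a single worker drains the log sequentially.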
  17. Caching pattern: write behind/adapt
    • Removes I/O pressure from the API, allowing scalability
    • True horizontal scalability
    • Ensures event ordering and persistence (replayability)
    • Reduces database code complexity in the API
    • Handles database failures gracefully via replication
    [Diagram labels: Kafka Connect, Cache, Kafka Connect, API; Transform and adapt records before delivery; Schema Registry for canonical models]
  18. Caching pattern: event federation
    • Replicates data across regions around the globe
    • Keeps multiple regions in sync
    • Great for improving RTO/RPO
    • Handles network slowness
    • While keeping disparate clusters in sync, it also allows the caches to be global
    [Diagram label: Confluent Replicator]
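Event federation can be sketched as copying records from one region's log to another and rebuilding the cache from the local copy. The example below is only an illustration with Python lists as logs; it does not model Confluent Replicator itself, and the function names and offset handling are assumptions.

```python
def replicate(source_log, target_log, committed_offset):
    """Copy records past the committed offset and return the new offset."""
    for record in source_log[committed_offset:]:
        target_log.append(record)
    return len(source_log)


def rebuild_cache(log):
    """Replay a log in order; the last value written for a key wins."""
    cache = {}
    for key, value in log:
        cache[key] = value
    return cache


us_log = [("user-1", "online"), ("user-2", "offline")]
eu_log = []

offset = replicate(us_log, eu_log, 0)       # initial copy
us_log.append(("user-1", "away"))           # a new event arrives in the source region
offset = replicate(us_log, eu_log, offset)  # only the new record is copied

eu_cache = rebuild_cache(eu_log)
```

Tracking the offset lets replication resume after a slow or interrupted link without copying records twice.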
  19. Kafka Connect implementation strategies

  20. Kafka connectors for popular caches
    • Connector for Redis is available on Confluent Hub
    • Connector for Memcached is available on Confluent Hub
    • Connectors for GridGain and Apache Ignite are available
    • Connector for Infinispan is available for Red Hat users
    [Diagram labels: Kafka Connect (×4)]
  21. Some caches may need different strategies
    • Oracle provides HotCache for GoldenGate and Coherence
    • Hazelcast has the Jet SDK, which supports Connect
    • Pivotal GemFire and Apache Geode have Spring Data
    • Good news: you can always write your own connectors using the Connector API
    [Diagram labels: GoldenGate, Hazelcast Jet, Spring Data, Connect Framework, Any Cache]
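The real Connector API is Java, where a sink task implements lifecycle methods such as `start`, `put`, and `stop`. The Python class below only mirrors that shape to show where cache writes would go; the class name, the record format, and the `cache.key.prefix` setting are hypothetical and not part of Kafka Connect.

```python
class CacheSinkTask:
    """Hypothetical Python analog of a sink task; not the actual Connect API."""

    def __init__(self, cache):
        self.cache = cache
        self.running = False
        self.key_prefix = ""

    def start(self, config):
        # "cache.key.prefix" is a made-up setting used only for this sketch.
        self.key_prefix = config.get("cache.key.prefix", "")
        self.running = True

    def put(self, records):
        """Write a batch of records to the cache; a null value deletes the entry."""
        if not self.running:
            raise RuntimeError("task not started")
        for record in records:
            key = self.key_prefix + str(record["key"])
            if record["value"] is None:
                self.cache.pop(key, None)
            else:
                self.cache[key] = record["value"]

    def stop(self):
        self.running = False


any_cache = {}
task = CacheSinkTask(any_cache)
task.start({"cache.key.prefix": "orders:"})
task.put([
    {"key": 1, "value": "NEW"},
    {"key": 2, "value": "NEW"},
    {"key": 1, "value": "PAID"},
    {"key": 2, "value": None},
])
task.stop()
```

In an actual connector the framework would call these methods and handle offsets, retries, and scaling; the task itself only needs to know how to talk to the cache.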
  22. Want to convert your database to streams?
    • Meet Debezium: a platform to perform database CDC
    • Works at the log level, which means true CDC behavior for your projects
    • Open source and maintained by Red Hat, with a broad set of connectors available
    • It is built on Kafka Connect
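To show how CDC events could keep a cache hot, the sketch below applies simplified change events to a dict. It assumes the general shape of a Debezium change event (an `op` code with `before` and `after` row images); real events carry more fields and may be wrapped in a schema envelope, and the `apply_change_event` helper and sample rows are hypothetical.

```python
def apply_change_event(cache, event, key_field="id"):
    """Apply one simplified change event to the cache, keyed by the row's primary key."""
    op = event["op"]
    if op in ("c", "u", "r"):  # create, update, snapshot read
        row = event["after"]
        cache[row[key_field]] = row
    elif op == "d":            # delete: only the "before" image is populated
        row = event["before"]
        cache.pop(row[key_field], None)
    else:
        raise ValueError(f"unsupported operation: {op}")


events = [
    {"op": "c", "before": None, "after": {"id": 1, "name": "Grace"}},
    {"op": "c", "before": None, "after": {"id": 2, "name": "Alan"}},
    {"op": "u", "before": {"id": 1, "name": "Grace"}, "after": {"id": 1, "name": "Grace H."}},
    {"op": "d", "before": {"id": 2, "name": "Alan"}, "after": None},
]

cache = {}
for event in events:
    apply_change_event(cache, event)
```

Because the events come from the database log, the cache sees every committed change in order, including deletes that polling-based approaches tend to miss.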
  23. None

  24. None