Apache Kafka — The Hard Parts

SQUER Solutions
Vienna Apache Kafka® Meetup by Confluent
June 14, 2022

Transcript

  1. Apache Kafka @duffleit THE HARD PARTS

  2. David Leitner @duffleit Coding Architect david@squer.at @duffleit

  3. The Basics Cluster Node A Node B Node C Producer

    Consumer Consumer Consumer Group Topic Partition A Partition B Partition C Partition A' Partition A'' Partition B' Partition B'' Partition C' Partition C''
  4. The Basics Cluster Node A Node B Node C Producer

    Consumer Consumer Consumer Group Topic Partition A Partition B Partition C Partition A' Partition A'' Partition B' Partition B'' Partition C' Partition C'' Producer
  5. The Basics Cluster Node A Node B Node C Producer

    Consumer Consumer Consumer Group Topic Partition A Partition B Partition C Partition A' Partition A'' Partition B' Partition B'' Partition C' Partition C'' Consumer Consumer
  6. The Basics Cluster Node A Node B Node C Producer

    Consumer Consumer Consumer Group Topic Partition A Partition B Partition C Partition A' Partition A'' Partition B' Partition B'' Partition C' Partition C'' Consumer Consumer The number of partitions is the limiting factor for consumer instances.
  7. The Basics Cluster Node A Node B Node C Producer

    Consumer Consumer Consumer Group Topic Partition A Partition B Partition C Partition A' Partition A'' Partition B' Partition B'' Partition C' Partition C'' Consumer Consumer The number of partitions is the limiting factor for consumer instances. User: Bob User: Alice User: Tim selection of partition: = hash(key) % #ofpartitions User: Bob
  8. The Basics Cluster Node A Node B Node C Producer

    Consumer Consumer Consumer Group Topic Partition A Partition B Partition C Partition A' Partition A'' Partition B' Partition B'' Partition C' Partition C'' Consumer Consumer The number of partitions is the limiting factor for consumer instances. Partition D Partition '' Partition ' User: Bob User: Alice User: Tim selection of partition: = hash(key) % 3 to 4 User: Bob User: Bob User: Bob User: Alice Topic.v2
  9. The Basics Cluster Node A Node B Node C Producer

    Consumer Consumer Consumer Group Topic Partition A Partition B Partition C Partition A' Partition A'' Partition B' Partition B'' Partition C' Partition C'' Consumer Consumer The number of partitions is the limiting factor for consumer instances. Select it wisely & over-partition a bit. Partition D Partition '' Partition ' User: Bob User: Alice User: Tim selection of partition: = hash(key) % 3 to 4 User: Bob User: Bob User: Bob User: Alice Topic.v2 Select something that can be divided by multiple numbers, e.g. 6, 12, 24, ...
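
A minimal sketch of the idea behind "hash(key) % #ofpartitions" from these slides. Kafka's real DefaultPartitioner hashes the serialized key with murmur2 rather than hashCode, and the class and method names below are invented for illustration; the point is only to show why growing a topic from 3 to 4 partitions remaps keys such as "Bob".

```java
// Illustrative only: shows the "hash(key) % #ofpartitions" idea and why
// changing the partition count remaps keys (breaking per-key ordering).
import java.util.List;

public class PartitionSelectionSketch {

    static int selectPartition(String key, int numPartitions) {
        // Math.abs guards against negative hash codes; a real partitioner
        // masks the sign bit instead.
        return Math.abs(key.hashCode()) % numPartitions;
    }

    public static void main(String[] args) {
        for (String user : List.of("Bob", "Alice", "Tim")) {
            System.out.printf("%s -> partition %d of 3, partition %d of 4%n",
                    user, selectPartition(user, 3), selectPartition(user, 4));
        }
        // The same key can land on a different partition once the topic grows
        // from 3 to 4 partitions, so records for one user are no longer ordered.
    }
}
```
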
  10. The Basics Topic: UserChanges User: Bob User: Alice User: Tim

    User: Bob User: Bob log.cleanup.policy: delete
  11. The Basics Topic: UserChanges User: Bob User: Alice User: Tim

    User: Bob User: Bob log.cleanup.policy: delete 2 weeks or some size
  12. The Basics Topic: UserChanges User: Bob User: Alice User: Tim

    User: Bob User: Bob log.cleanup.policy: delete 2 weeks
  13. The Basics Topic: UserChanges User: Bob User: Alice User: Tim

    User: Bob User: Bob log.cleanup.policy: compact we only keep the latest record for a specific key.
  14. The Basics Topic: UserChanges User: Bob User: Alice User: Tim

    User: Bob User: Bob log.cleanup.policy: compact we only keep the latest record for a specific key. How Kafka Stores Data userchanges.active.segment 📄 userchanges.segment.1 userchanges.segment.2 log.segment.bytes = 1GB Active Compaction ⚙ Compaction ⚙
  15. The Basics Topic: UserChanges User: Bob User: Alice User: Tim

    User: Bob User: Bob log.cleanup.policy: compact we only keep the latest record for a specific key. How Kafka Stores Data userchanges.active.segment 📄 userchanges.segment.1 userchanges.segment.2 log.segment.bytes = 1GB Active Compaction ⚙ Compaction ⚙
  16. The Basics Topic: UserChanges User: Bob User: Alice User: Tim

    User: Bob User: Bob log.cleanup.policy: compact we only keep the latest record for a specific key. How Kafka Stores Data userchanges.active.segment 📄 userchanges.segment.1 userchanges.segment.2 log.segment.bytes = 1GB Active Compaction ⚙ Compaction ⚙ log.segment.ms = 1week Especially in GDPR-related use cases, think explicitly about segment size and roll time.
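
One possible way to set these knobs explicitly is per topic with the Java AdminClient. Note that the topic-level config names differ slightly from the broker defaults shown on the slide (segment.bytes vs. log.segment.bytes). The topic name, partition and replica counts below are illustrative, and the values simply mirror the slide: compaction, 1 GB segments, weekly roll, one-day tombstone retention.

```java
// Sketch: creating a compacted topic with explicit segment size and roll time.
// Topic name, partition/replica counts and the values are illustrative, not
// recommendations.
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

import java.util.Map;
import java.util.Properties;
import java.util.Set;

public class CreateCompactedTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            NewTopic userChanges = new NewTopic("userchanges", 6, (short) 3)
                    .configs(Map.of(
                            "cleanup.policy", "compact",                              // keep only the latest record per key
                            "segment.bytes", String.valueOf(1024L * 1024 * 1024),     // roll at 1 GB
                            "segment.ms", String.valueOf(7L * 24 * 60 * 60 * 1000),   // roll at least weekly
                            "delete.retention.ms", String.valueOf(24L * 60 * 60 * 1000))); // keep tombstones 1 day
            admin.createTopics(Set.of(userChanges)).all().get();
        }
    }
}
```
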
  17. The Basics Topic: UserChanges User: Bob User: Alice User: Tim

    User: Bob User: Bob log.cleanup.policy: compact Tombstone: Tim delete.retention.ms = 1day A slow consumer that starts fresh needs more than one day to read all events from the topic. Tombstone: Tim
  18. The Basics Topic: UserChanges User: Bob User: Alice User: Tim

    User: Bob User: Bob log.cleanup.policy: compact Tombstone: Tim delete.retention.ms = 1day A slow consumer that starts fresh needs more than one day to read all events from the topic. User: Tim Keep delete.retention.ms in sync with the given topic retention.
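
For reference, a tombstone is simply a record with a key and a null value; compaction later removes it after delete.retention.ms, which is why a consumer slower than that window may never see the delete. A minimal producer sketch, with the topic name assumed from the slides:

```java
// Sketch: deleting "Tim" from a compacted topic by producing a tombstone,
// i.e. a record with the key and a null value.
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

public class ProduceTombstone {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // null value = tombstone for key "Tim"
            producer.send(new ProducerRecord<>("userchanges", "Tim", null));
            producer.flush();
        }
    }
}
```
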
  19. The Basics Cluster Node A Node B Node C Producer

    Consumer Consumer Consumer Group Topic Partition A Partition B Partition C Partition A' Partition A'' Partition B' Partition B'' Partition C' Partition C''
  20. The Basics Cluster Node C Producer Consumer Consumer Consumer Group

    Topic Partition A Partition B Partition C Partition A' Partition A'' Partition B' Partition B'' Partition C' Partition C'' Node B Node A Node F Node E Node D AZ 1 AZ 2
  21. The Basics Cluster Node C Producer Consumer Consumer Consumer Group

    Topic Partition A Partition B Partition C Partition A' Partition A'' Partition B' Partition B'' Partition C' Partition C'' Node B Node A Node F Node E Node D AZ 1 AZ 2 Partition D Partition E Partition F Partition D' Partition D'' Partition E' Partition E'' Partition F' Partition F'' KIP-36: Rack aware replica assignment
  22. Cluster The Basics Node C Producer Consumer Consumer Consumer Group

    Topic Partition A Partition B Partition C Partition A' Partition A'' Partition B' Partition B'' Partition C' Partition C'' Node B Node A Node F Node E Node D AZ 1 AZ 2 Partition D Partition E Partition F Partition D' Partition D'' Partition E' Partition E'' Partition F' Partition F'' KIP-36: Rack aware replica assignment
  23. Multi Region

  24. Cluster Multi Region? Node C Producer Consumer Consumer Consumer Group

    Topic Partition A Partition B Partition C Node B Node A Node F Node E Node D Region 1 Region 2 Partition D' Partition D'' Partition E' Partition E'' Partition F' Partition F'' Partition A' Partition A'' Partition B' Partition B'' Partition C' Partition C'' Partition D Partition E Partition F Usually the latency between multiple regions is too big to span a single cluster across them.
  25. Cluster Region West Cluster Region East Multi Region? Node C

    Producer Consumer Consumer Consumer Group Topic Partition A Partition B Partition C Node B Node A Node F Node E Node D Region 1 Region 2 Partition D' Partition D'' Partition E' Partition E'' Partition F' Partition F''
  26. Cluster Region West Cluster Region East Multi Region? Node C

    Producer Consumer Consumer Consumer Group Topic A B C Node B Node A Node F Node E Node D Region 1 Region 2 A B C
  27. Cluster Region West Cluster Region East Multi Region? Node C

    Producer East Consumer Consumer Consumer Group Topic A B C Node B Node A Node F Node E Node D Region 1 Region 2 A B C Producer West
  28. Cluster Region West — "west" Cluster Region East — "east"

    Multi Region? Node C Producer East Consumer Consumer Consumer Group Topic A B C Node B Node A Node F Node E Node D Region 1 Region 2 A B C Producer West Mirror Maker 2 east.C west.C *.C Ordering Guarantees?!
  29. @duffleit Order is Guaranteed in a single Partition. Are you

    sure?
  30. @duffleit Producer Partition max.in.flight.requests.per.connection = 5 Message A Message B

    Message A retry Message B Message A retries = MAX_INT
  31. @duffleit Producer Partition max.in.flight.requests.per.connection = 5 Message A Message B

    Message A retry Message B Message A Legacy Solution: max.in.flight.requests.per.connection = 1 State-of-the-Art Solution: enable.idempotence = true retries = MAX_INT max.in.flight.requests.per.connection = 5 acks = all SEQ#: 1 SEQ#: 2 OutOfOrderSequenceException SEQ#: 2 If you don't want to set your retries to infinite, prefer "delivery.timeout.ms" over "retries".
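
The "state-of-the-art" settings from this slide, collected into one producer-config sketch. The delivery.timeout.ms value is only an example; retries is left at its default (Integer.MAX_VALUE with idempotence enabled), as the slide suggests bounding the send time instead of the retry count.

```java
// Sketch: idempotent producer that keeps per-partition ordering even with
// retries and several in-flight requests.
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

public class OrderedProducerConfig {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);

        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);
        props.put(ProducerConfig.ACKS_CONFIG, "all");
        props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, 5);
        // Bound how long a send may take instead of capping the retry count.
        props.put(ProducerConfig.DELIVERY_TIMEOUT_MS_CONFIG, 120_000);

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // producer.send(...) as usual
        }
    }
}
```
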
  32. Node A Node B Node C Topic Partition A Partition

    B Partition C Partition C' Partition B'' Partition A' Partition C'' Partition B' Partition A'' Producer acks = none acks = 1 acks = all min.insync.replicas = 3 @duffleit
  33. Node A Node B Node C Topic Partition A Partition

    B Partition C Partition C' Partition B'' Partition A' Partition C'' Partition B' Partition A'' Producer acks = none acks = 1 acks = all min.insync.replicas = 2 @duffleit
  34. min.insync.replicas = 2 Node A Node B Node C Topic

    Partition A Partition B Partition C Partition C' Partition B'' Partition A' Partition C'' Partition B' Partition A'' Producer acks = all @duffleit CAP Theorem Consistency Availability Partitioning ❌ min.insync.replicas++ min.insync.replicas--
  35. min.insync.replicas = 3 Node A Node B Node C Topic

    Partition A Partition B Partition C Partition C' Partition B'' Partition A' Partition C'' Partition B' Partition A'' Producer acks = all @duffleit CAP Theorem Consistency Availability Partitioning ❌ min.insync.replicas++ min.insync.replicas--
  36. min.insync.replicas = 3 Node A Node B Node C Topic

    Partition A Partition B Partition C Partition C' Partition B'' Partition A' Partition C'' Partition B' Partition A'' Producer acks = all @duffleit CAP Theorem Consistency Availability Partitioning ❌ min.insync.replicas++ min.insync.replicas--
  37. min.insync.replicas = 2 Node A Node B Node C Topic

    Partition A Partition B Partition C Partition C' Partition B'' Partition A' Partition C'' Partition B' Partition A'' Producer acks = all @duffleit CAP Theorem Consistency Availability Partitioning ❌ min.insync.replicas++ min.insync.replicas--
  38. min.insync.replicas = 2 Node A Node B Node C Topic

    Partition A Partition B Partition C Partition C' Partition B'' Partition A' Partition C'' Partition B' Partition A'' Producer acks = all @duffleit CAP Theorem Consistency Availability Partitioning ❌ min.insync.replicas++ min.insync.replicas-- Possible Data Loss There is no "ad-hoc fsync" by default. It can be configured via "log.default.flush.interval.ms".
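
One way to move the consistency/availability slider from these slides is to change min.insync.replicas per topic. A sketch using the AdminClient's incrementalAlterConfigs; the topic name and value are purely illustrative, and the effect only kicks in for producers that use acks=all.

```java
// Sketch: trading availability for consistency by raising min.insync.replicas
// on an existing topic. With acks=all, writes fail once fewer replicas are in sync.
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

import java.util.List;
import java.util.Map;
import java.util.Properties;

public class RaiseMinInsyncReplicas {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "payments");
            AlterConfigOp op = new AlterConfigOp(
                    new ConfigEntry("min.insync.replicas", "2"), AlterConfigOp.OpType.SET);
            admin.incrementalAlterConfigs(Map.of(topic, List.of(op))).all().get();
        }
    }
}
```
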
  39. @duffleit Keep in mind that rack assignment is ignored for

    in-sync replicas. Node C Node B Node A Node D Node E Node F AZ 1 AZ 2 replicas=6 min.insync.replicas = 4
  40. @duffleit Keep in mind that rack assignment is ignored for

    in-sync replicas. Node C Node B Node A Node D Node E Node F AZ 1 AZ 2 replicas=5 min.insync.replicas = 4 fail on > 1
  41. @duffleit Keep in mind that rack assignment is ignored for

    in-sync replicas. Node C Node B Node A Node D Node E Node F Node G Node H Node I AZ 1 AZ 2 AZ 3 replicas=9 min.insync.replicas = 7
  42. @duffleit @duffleit Let's talk about Lost Messages

  43. @duffleit Partition Message A Consumer Message B Message C enable.auto.commit=true

    auto.commit.interval.ms=5_SEC
  44. @duffleit Partition Message A Consumer Message B Message C enable.auto.commit=true

    auto.commit.interval.ms=5_SEC Auto-Commit: A,B,C
  45. @duffleit Partition Message D Consumer enable.auto.commit=true auto.commit.interval.ms=5_SEC Message A Message

    B Message C enable.auto.commit=false
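
A minimal consumer sketch of the "disable auto-commit" advice from these slides: process first, commit afterwards, so a crash between poll() and processing cannot silently mark unprocessed records as consumed. Topic and group id are made up.

```java
// Sketch: manual offset commits instead of the background auto-commit every 5 s.
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.time.Duration;
import java.util.List;
import java.util.Properties;

public class ManualCommitConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "userchanges-consumer");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false); // no background commit

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("userchanges"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    process(record); // handle the message first ...
                }
                consumer.commitSync(); // ... then commit what was actually processed
            }
        }
    }

    static void process(ConsumerRecord<String, String> record) {
        System.out.println(record.key() + " -> " + record.value());
    }
}
```
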
  46. @duffleit

  47. Producer Stream Consistency Kafka, and "Non-Kafka". Transaction to achieve Message

  48. Producer Stream Consistency Kafka, and "Non-Kafka". Transaction to achieve Message

    Message
  49. Producer Stream Consistency Kafka, and "Non-Kafka". Transaction to achieve Message

    Message Producer Consumers Achieve Exactly Once Semantics. Transaction to achieve Message
  50. Producer Stream Consistency Kafka, and "Non-Kafka". Transaction to achieve Message

    Message Producer Consumers Achieve Exactly Once Semantics. Transaction to achieve Message Exactly Once. Atomicity between multiple Topic Operations. Transaction to achieve Transactions Balances Producer Message
  51. Producer Stream Consistency Kafka, and "Non-Kafka". Transaction to achieve Message

    Message Producer Consumers Achieve Exactly Once Semantics. Transaction to achieve Message Exactly Once. Atomicity between multiple Topic Operations. Transaction to achieve Transactions Balances Producer Message Message
  52. @duffleit Onboarding UserUpdate Stream 💥 ✅

  53. @duffleit Onboarding UserUpdate Stream User UserEvents CDC Outbox Pattern ✅
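
A rough sketch of the Outbox Pattern mentioned on this slide, using plain JDBC: the domain change and the event land in the same database transaction, and a CDC connector (e.g. Debezium) or a polling relay publishes the outbox rows to Kafka afterwards. The connection URL, table and column names, and the JSON payload are invented for illustration.

```java
// Sketch: write the domain row and the outbox row atomically in one DB transaction.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class OutboxWrite {
    public static void main(String[] args) throws Exception {
        try (Connection con = DriverManager.getConnection("jdbc:postgresql://localhost/app")) {
            con.setAutoCommit(false);
            try (PreparedStatement user = con.prepareStatement(
                         "UPDATE users SET age = ? WHERE name = ?");
                 PreparedStatement outbox = con.prepareStatement(
                         "INSERT INTO outbox (aggregate_id, type, payload) VALUES (?, ?, ?::jsonb)")) {
                user.setInt(1, 21);
                user.setString(2, "Bob");
                user.executeUpdate();

                outbox.setString(1, "Bob");
                outbox.setString(2, "UserUpdated");
                outbox.setString(3, "{\"name\":\"Bob\",\"age\":21}");
                outbox.executeUpdate();

                con.commit(); // both rows are committed or neither is, avoiding the dual-write problem
            } catch (Exception e) {
                con.rollback();
                throw e;
            }
        }
    }
}
```
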

  54. @duffleit Onboarding UserUpdate Stream ✅ User Listen to yourself Pattern.

  55. @duffleit Onboarding UserUpdated (age: 21) Stream UserUpdated (age: 21) Advertisment

    User (age: 21) User (age: 21) EventSourcing
  56. @duffleit Onboarding UserUpdated (age: 22) Stream UserUpdated (age: 21) Advertisment

    User (age: 21) User (age: 21) UserUpdated (age: 22) EventSourcing
  57. @duffleit Onboarding UserUpdated (age: 22) Stream UserUpdated (age: 21) Advertisment

    User (age: 22) User (age: 22) UserUpdated (age: 22) UserUpdated (age: 23) EventSourcing
  58. @duffleit Onboarding UserUpdated (age: 22) Stream Global EventSourcing UserUpdated (age:

    21) Advertisment User (age: 22) User (age: 22) UserUpdated (age: 22) UserUpdated (age: 23) 👻 it often breaks information hiding & data isolation.
  59. Stream @duffleit UpdateUser (age: 21) Stream Local EventSourcing UserAgeChanged (age:

    21) UserUpdated (age: 21) Onboarding 🔒 👻 💅
  60. Producer Producer Stream Consumers Achieve Exactly Once Semantics. Transaction to

    achieve Consistency Kafka, and "Non-Kafka". Transaction to achieve Atomicity between multipe Topic Operations. Transaction to achieve Transactions Balances Producer Message Message Message Outbox Pattern Listen to Yourself Local Eventsourcing Message
  61. @duffleit "Kafka Transactions" Producer Consumers Producer Producer Consumers Consumers Stream

    Processor Stream Processor Stream Processor enable.idempotence = true isolation.level = read_committed Deduplication Inbox
  62. Producer Stream Achieve Exactly Once Semantics. Transaction to achieve Consistency

    Kafka, and "Non-Kafka". Transaction to achieve Atomicity between multipe Topic Operations. Transaction to achieve Message Outbox Pattern Listen to Yourself Local Eventsourcing Deduplication Inbox Idempotency Producer Consumers Transactions Balances Producer Message Message Message
  63. Transfers Payment Service Alice -> Bob Alice -10€ Bob +10€

    Transaction_Coordinator
  64. Transfers Payment Service Alice -> Bob Alice -10€ Bob +10€

    __transaction_state Transaction: ID __consumer_offset payments: 1 P1 P1 P2 Transfers P3 P2 P3 C C C C isolation.level=read_committed
  65. Transfers Payment Service Alice -> Bob Alice -10€ __transaction_state Transaction:

    ID __consumer_offset payments: 1 P1 P1 P2 Transfers P3 P2 isolation.level=read_committed Service A A A Transaction: ID2
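
Roughly how the payment example on these slides maps onto Kafka's transactional API: a consume-transform-produce loop in which the balance records and the consumed-offset commit belong to one transaction, and downstream consumers with isolation.level=read_committed only ever see committed results. Topic names, group id and the transactional.id are assumptions, parsing of the transfer value and error handling (abortTransaction) are omitted.

```java
// Sketch: exactly-once consume-transform-produce for the "Alice -> Bob" transfer.
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;

import java.time.Duration;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

public class PaymentServiceEos {
    public static void main(String[] args) {
        Properties cProps = new Properties();
        cProps.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        cProps.put(ConsumerConfig.GROUP_ID_CONFIG, "payments");
        cProps.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        cProps.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        cProps.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
        cProps.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, "read_committed"); // only see committed transactions

        Properties pProps = new Properties();
        pProps.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        pProps.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        pProps.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        pProps.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "payment-service-1"); // implies idempotence

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(cProps);
             KafkaProducer<String, String> producer = new KafkaProducer<>(pProps)) {
            producer.initTransactions();
            consumer.subscribe(List.of("transfers"));

            while (true) {
                ConsumerRecords<String, String> transfers = consumer.poll(Duration.ofMillis(500));
                if (transfers.isEmpty()) continue;

                producer.beginTransaction();
                Map<TopicPartition, OffsetAndMetadata> offsets = new HashMap<>();
                for (ConsumerRecord<String, String> transfer : transfers) {
                    // "Alice -> Bob" becomes two balance updates (parsing omitted)
                    producer.send(new ProducerRecord<>("balances", "Alice", "-10"));
                    producer.send(new ProducerRecord<>("balances", "Bob", "+10"));
                    offsets.put(new TopicPartition(transfer.topic(), transfer.partition()),
                            new OffsetAndMetadata(transfer.offset() + 1));
                }
                // commit the consumed offsets as part of the same transaction
                producer.sendOffsetsToTransaction(offsets, consumer.groupMetadata());
                producer.commitTransaction();
            }
        }
    }
}
```
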
  66. Producer Stream Achieve Exactly Once Semantics. Transaction to achieve Consistency

    Kafka, and "Non-Kafka". Transaction to achieve Atomicity between multipe Topic Operations. Transaction to achieve Message Outbox Pattern Listen to Yourself Listen to yourself Deduplication Inbox Idempotency Producer Consumers Transactions Balances Producer Message Message Message Kafka's Exactly-Once Semantics Outbox Pattern
  67. @duffleit

  68. @duffleit Producer Consumers Topic 📜 📜 Producer Producer Producer Producer

    Producer Producer Producer Consumers Consumers Consumers Consumers Consumers Consumers Consumers Topic Topic Topic Topic Topic Topic Topic
  69. Consumers Consumers Consumers Consumers Consumers Consumers Consumers Producer Producer Producer

    Producer Producer Producer Producer @duffleit Producer Consumers Topic 📜 📜 Topic Topic Topic Topic Topic Topic Topic
  70. @duffleit Producer Consumers Topic Schema Registry Faulty Message Producer Producer

    Producer Producer Producer Producer Producer Broker Side Validation, FTW
  71. @duffleit Producer Consumers Topic Schema Registry Faulty Message Broker Side

    Validation 🤚 Deserialization on Broker 😱 MagicByte SubjectId Payload ✅ Check if MagicByte Exists. ✅ Check if SubjectId is Valid. ✅ Check if Payload Matches Schema. The more to the right, the more expensive it gets.
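
A sketch of the two cheap left-hand checks from this slide, assuming the Confluent Schema Registry wire format (magic byte 0x00 followed by a 4-byte schema id). The expensive third check, deserializing the payload against the schema, is deliberately left out; the method and class names are invented.

```java
// Sketch: cheap structural checks on a Schema-Registry-framed message.
import java.nio.ByteBuffer;
import java.util.Set;

public class WireFormatCheck {

    private static final byte MAGIC_BYTE = 0x0;

    static boolean looksValid(byte[] message, Set<Integer> knownSchemaIds) {
        if (message == null || message.length < 5) return false;
        ByteBuffer buffer = ByteBuffer.wrap(message);
        if (buffer.get() != MAGIC_BYTE) return false; // check 1: magic byte exists
        int schemaId = buffer.getInt();               // check 2: schema id is known
        return knownSchemaIds.contains(schemaId);
        // check 3 (payload matches schema) would require fetching the schema
        // from the registry and deserializing, which is the expensive part.
    }

    public static void main(String[] args) {
        byte[] framed = ByteBuffer.allocate(7).put(MAGIC_BYTE).putInt(42).put(new byte[]{1, 2}).array();
        System.out.println(looksValid(framed, Set.of(42)));             // true
        System.out.println(looksValid("plain".getBytes(), Set.of(42))); // false: no magic byte
    }
}
```
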
  72. @duffleit squer.link/broker-side-valdiation-sidecar

  73. @duffleit Cluster Node A Node B Node C Go Proxy

    Go Proxy Go Proxy ⏳ ⏳ ⏳
  74. @duffleit Cluster Node A Node B Node C Go Proxy

    Go Proxy Go Proxy Go Proxy Go Proxy Go Proxy Race Condition We can no longer guarantee ordering.
  75. @duffleit squer.link/broker-side-valdiation-sidecar

  76. Ok, Lets sum up. @duffleit

  77. @duffleit Multi AZ, Multi Region, Multi Cloud Consistency vs. Availability

    Disable Autocommit! Different Options to Achieve Transactional Guarantees in Kafka Broker Side Schema Validation Segment Size Partition Size: "over-partition a bit" and 200+ more Configuration Properties. What we have seen 👀
  78. @duffleit Multi AZ, Multi Region, Multi Cloud Consistency vs. Availability

    Disable Autocommit! Different Options to Achieve Transactional Guarantees in Kafka Broker Side Schema Validation Segment Size Partition Size: "over-partition a bit" and 200+ more Configuration Properties. What we have seen 👀 We were able to handle them, and so can you. 💪
  79. David Leitner @duffleit Coding Architect david@squer.at @duffleit