Apache Kafka — The Hard Parts

Vienna Apache Kafka® Meetup by Confluent

SQUER Solutions

June 14, 2022

Transcript

  1. The Basics [Diagram: a cluster of Nodes A, B, C hosting a Topic with Partitions A, B, C and their replicas A'/A'', B'/B'', C'/C''; one Producer writes, a Consumer Group of three Consumers reads.]

  2. The Basics [Same diagram, now with a second Producer.]

  3. The Basics [Same diagram, now with two additional Consumers in the Consumer Group.]

  4. The Basics [Same diagram.] The number of partitions is the limiting factor for consumer instances.
  5. The Basics [Same diagram.] Selection of partition = hash(key) % #ofPartitions. [Example keys: User: Bob, User: Alice, User: Tim; every record for User: Bob lands on the same partition.]

  6. The Basics [Same diagram, plus a new Partition D and its replicas.] Growing the topic changes hash(key) % 3 into hash(key) % 4, so existing keys such as User: Bob suddenly map to different partitions; re-keying requires a new topic (Topic.v2).

  7. The Basics [Same diagram.] Select it wisely & over-partition a bit. Select something that can be divided by multiple numbers, e.g. 6, 12, 24, ...
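
To make the modulo arithmetic concrete, here is a minimal sketch of key-based partition selection in the spirit of the Java client's default partitioner (which hashes the serialized key with murmur2); the keys and partition counts are the ones from the slides:

```java
// Sketch of key-based partition selection: hash the serialized key,
// then take it modulo the number of partitions.
import org.apache.kafka.common.utils.Utils;
import java.nio.charset.StandardCharsets;

public class PartitionSelection {

    static int selectPartition(String key, int numPartitions) {
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        // toPositive clears the sign bit so the modulo result is never negative
        return Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
    }

    public static void main(String[] args) {
        // With 3 partitions, every record for "User: Bob" lands on the same one...
        System.out.println(selectPartition("User: Bob", 3));
        // ...but growing the topic to 4 partitions changes the mapping,
        // which is why re-keying needs a new topic (Topic.v2).
        System.out.println(selectPartition("User: Bob", 4));
    }
}
```
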
  8. The Basics. Topic: UserChanges [records: User: Bob, User: Alice, User: Tim, User: Bob, User: Bob], log.cleanup.policy: delete.

  9. The Basics. [Same slide.] log.cleanup.policy: delete, i.e. records are removed after 2 weeks or once some size limit is reached.

  10. The Basics. [Same slide.] After 2 weeks, the older records are gone.
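
"2 weeks or some size" maps to the topic-level settings retention.ms and retention.bytes, whichever limit is hit first. A minimal sketch using the Admin API, assuming a topic named userchanges, a local broker, and example limits of two weeks and 10 GB per partition:

```java
// Sketch: creating a delete-policy topic whose records expire after
// 2 weeks, or once a partition exceeds ~10 GB, whichever comes first.
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;
import java.util.List;
import java.util.Map;
import java.util.Properties;

public class CreateUserChangesTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed address
        try (AdminClient admin = AdminClient.create(props)) {
            NewTopic topic = new NewTopic("userchanges", 6, (short) 3)
                    .configs(Map.of(
                            "cleanup.policy", "delete",
                            "retention.ms", String.valueOf(14L * 24 * 60 * 60 * 1000),   // 2 weeks
                            "retention.bytes", String.valueOf(10L * 1024 * 1024 * 1024))); // per partition
            admin.createTopics(List.of(topic)).all().get();
        }
    }
}
```
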
  11. The Basics. Topic: UserChanges, log.cleanup.policy: compact, i.e. we only keep the latest record for a specific key.
  12. How Kafka Stores Data: userchanges.active.segment 📄 (the active segment), plus the closed segments userchanges.segment.1 and userchanges.segment.2; log.segment.bytes = 1GB. Compaction ⚙ runs on the closed segments only, never on the active one.

  13. [Same slide as 12.]
  14. [Same slide.] log.segment.ms = 1week: segments are also rolled by time, not only by size. Especially in GDPR-related use cases, think explicitly about segment size and roll time.
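
The slide's broker-level settings have topic-level counterparts (segment.bytes and segment.ms) that can be tuned per topic. A sketch using the Admin API's incremental config update, assuming a topic named userchanges and a local broker:

```java
// Sketch: tightening segment size and roll time on an existing compacted
// topic, so compaction (which only runs on closed segments) can pick up
// stale keys quickly enough, e.g. for GDPR deletion deadlines.
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;
import java.util.List;
import java.util.Map;
import java.util.Properties;

public class TuneSegmentConfig {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed address
        try (AdminClient admin = AdminClient.create(props)) {
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "userchanges");
            admin.incrementalAlterConfigs(Map.of(topic, List.of(
                    new AlterConfigOp(new ConfigEntry("segment.bytes",
                            String.valueOf(1024L * 1024 * 1024)), AlterConfigOp.OpType.SET), // 1 GB
                    new AlterConfigOp(new ConfigEntry("segment.ms",
                            String.valueOf(7L * 24 * 60 * 60 * 1000)), AlterConfigOp.OpType.SET) // roll weekly
            ))).all().get();
        }
    }
}
```
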
  15. The Basics. Topic: UserChanges, log.cleanup.policy: compact. A delete is expressed as a tombstone (Tombstone: Tim), and the tombstone itself is removed after delete.retention.ms = 1day. Problem: a slow consumer that starts fresh and needs more than one day to read all events from the topic may never see the tombstone.

  16. [Same slide.] To that consumer, User: Tim still exists. Keep delete.retention.ms in sync with the given topic retention.
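
A tombstone is just a record that carries the key to delete and a null value. A minimal producing sketch, assuming the userchanges topic and string serialization:

```java
// Sketch: deleting "User: Tim" from a compacted topic by writing a
// tombstone, i.e. the key with a null value. Compaction removes the
// tombstone itself once it is older than delete.retention.ms.
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import java.util.Properties;

public class DeleteTim {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed address
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("userchanges", "User: Tim", null));
        }
    }
}
```
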
  17. The Basics [the single-cluster diagram from slide 1 again].

  18. The Basics [the same cluster stretched across two availability zones: Nodes A, B, C in AZ 1 and Nodes D, E, F in AZ 2].

  19. The Basics [same diagram, with Partitions A-F and their replicas spread across both AZs]. KIP-36: Rack aware replica assignment (brokers declare their zone via the broker.rack setting).

  20. [Same slide as 19.]
  21. Multi Region? [One cluster spanning Region 1 (Nodes A, B, C) and Region 2 (Nodes D, E, F).] Usually the latency between multiple regions is too big to span a single cluster over it.
  22. Multi Region? [Two separate clusters instead: Cluster Region West and Cluster Region East, each with its own nodes and replicas.]

  23. Multi Region? [Same diagram; each cluster hosts its own Topic with partitions A, B, C.]

  24. Multi Region? [Same diagram, with Producer West writing to the west cluster and Producer East writing to the east cluster.]

  25. Multi Region? [Same diagram, the clusters aliased "west" and "east".] Mirror Maker 2 replicates the topic between the clusters, where it shows up as east.C and west.C next to the local topic; consumers can read the merged stream via *.C. Ordering Guarantees?!
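
MirrorMaker 2 prefixes each replicated topic with its source cluster alias, so next to the local C a consumer on the east cluster also sees west.C. A sketch of reading the merged stream via a pattern subscription (broker address and group id are assumptions); note that ordering is only guaranteed per partition of each source topic, never across the merged stream:

```java
// Sketch: subscribing to the local topic C and its MirrorMaker 2 mirror
// west.C at once via a regex, i.e. the "*.C" from the slide.
import org.apache.kafka.clients.consumer.KafkaConsumer;
import java.util.Properties;
import java.util.regex.Pattern;

public class MergedRegionConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "east-cluster:9092"); // assumed address
        props.put("group.id", "c-merged-readers");           // assumed group id
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // matches "C" and any "<alias>.C" mirror such as "west.C"
            consumer.subscribe(Pattern.compile("([a-z]+\\.)?C"));
            // poll loop omitted
        }
    }
}
```
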
  26. @duffleit Producer ordering. With max.in.flight.requests.per.connection = 5, a failed Message A can be retried after Message B and end up behind it. Legacy solution: max.in.flight.requests.per.connection = 1. State-of-the-art solution: enable.idempotence = true, retries = MAX_INT, max.in.flight.requests.per.connection = 5, acks = all; the broker tracks per-partition sequence numbers (SEQ#: 1, SEQ#: 2) and rejects out-of-order writes with an OutOfOrderSequenceException. If you don't want to set your retries to infinite, prefer "delivery.timeout.ms" over "retries".
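
The state-of-the-art settings from the slide, written out as a producer configuration sketch (broker address and serializers are assumptions):

```java
// Sketch: an idempotent producer that keeps ordering even with 5 in-flight
// requests; the broker de-duplicates retries via per-partition sequence numbers.
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import java.util.Properties;

public class OrderedProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");
        props.put(ProducerConfig.ACKS_CONFIG, "all");
        props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, "5");
        // instead of retries = MAX_INT, bound the total time a send may take:
        props.put(ProducerConfig.DELIVERY_TIMEOUT_MS_CONFIG, "120000");
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // send records as usual; retries can no longer reorder messages
        }
    }
}
```
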
  27. @duffleit [Diagram: Nodes A, B, C with Partitions A, B, C and their replicas.] Producer acknowledgement options: acks = none, acks = 1, or acks = all with min.insync.replicas = 3.

  28. @duffleit [Same diagram.] acks = all with min.insync.replicas = 2.
  29. @duffleit min.insync.replicas = 2, acks = all. [Same diagram.] CAP Theorem: partition tolerance is non-negotiable ❌, so you trade Consistency against Availability; min.insync.replicas++ buys consistency, min.insync.replicas-- buys availability.

  30. @duffleit min.insync.replicas = 3. [Same slide.]

  31. @duffleit min.insync.replicas = 3. [Same slide.]

  32. @duffleit min.insync.replicas = 2. [Same slide.]

  33. @duffleit min.insync.replicas = 2. [Same slide.] Possible Data Loss: there is no ad-hoc fsync by default; flushing can be configured via "log.flush.interval.ms".
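
What the consistency end of that dial feels like from the producer: with acks = all, a write to a partition with fewer in-sync replicas than min.insync.replicas fails instead of being silently under-replicated. A sketch, with broker address and topic assumed:

```java
// Sketch: with acks = all, Kafka refuses the write when the in-sync
// replica set is smaller than min.insync.replicas: consistency is kept,
// availability is sacrificed for this partition.
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.errors.NotEnoughReplicasException;
import java.util.Properties;
import java.util.concurrent.ExecutionException;

public class AcksAllProducer {
    public static void main(String[] args) throws InterruptedException {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put(ProducerConfig.ACKS_CONFIG, "all"); // wait for all in-sync replicas

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("userchanges", "User: Bob", "age: 22")).get();
        } catch (ExecutionException e) {
            if (e.getCause() instanceof NotEnoughReplicasException) {
                // fewer ISRs than min.insync.replicas: the cluster chose
                // consistency over availability for this write
            }
        }
    }
}
```
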
  34. @duffleit Keep in mind that rack assignment is ignored for in-sync replicas. [Diagram: Nodes A, B, C in AZ 1, Nodes D, E, F in AZ 2; replicas = 6, min.insync.replicas = 4.]

  35. @duffleit [Same setup with replicas = 5, min.insync.replicas = 4: writes fail as soon as more than 1 replica is down.]

  36. @duffleit [The same exercise with three zones: Nodes A-I across AZ 1-3, replicas = 9, min.insync.replicas = 7.]
  37. Transactions: what do we want to achieve? (1) Achieve Exactly Once Semantics between a Producer, a Stream processor, and Consumers. (2) Consistency between Kafka and "Non-Kafka", e.g. a Producer that also updates Balances in a database.

  38. Transactions: additionally, (3) Atomicity between multiple Topic Operations, i.e. one Producer writing several Messages as a unit.

  39. [Same slide, with one more Message in the diagram.]
  40. @duffleit EventSourcing. [Diagram: an Onboarding service emits UserUpdated (age: 21) and UserUpdated (age: 22) onto a Stream; an Advertisement service consumes the stream and currently holds User (age: 21).]

  41. @duffleit [Same diagram; Advertisement has caught up to User (age: 22), and a UserUpdated (age: 23) is next.]

  42. @duffleit Global EventSourcing 👻: it often breaks information hiding & data isolation.
  43. Back to the three goals: Exactly Once Semantics, Consistency between Kafka and "Non-Kafka" (Balances), Atomicity between multiple Topic Operations. Candidate patterns for the Kafka/"Non-Kafka" case: Outbox Pattern, Listen to Yourself, Local Eventsourcing (a sketch of the outbox write follows below).
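
A hedged sketch of the Outbox Pattern named above: the business change and the outgoing event are written in one database transaction, and a separate relay (not shown) ships the outbox rows to Kafka. The JDBC URL, table, and column names are made up for illustration:

```java
// Sketch: outbox pattern. The balance update and the event that announces
// it are committed in ONE database transaction; a relay process later
// reads the outbox table and produces its rows to Kafka.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class OutboxWrite {
    public static void main(String[] args) throws Exception {
        try (Connection db = DriverManager.getConnection(
                "jdbc:postgresql://localhost/payments")) { // assumed database
            db.setAutoCommit(false);
            try (PreparedStatement update = db.prepareStatement(
                         "UPDATE balances SET amount = amount - 10 WHERE owner = ?");
                 PreparedStatement outbox = db.prepareStatement(
                         "INSERT INTO outbox (topic, key, payload) VALUES (?, ?, ?)")) {
                update.setString(1, "Alice");
                update.executeUpdate();
                outbox.setString(1, "Transfers");
                outbox.setString(2, "Alice");
                outbox.setString(3, "{\"from\":\"Alice\",\"to\":\"Bob\",\"amount\":10}");
                outbox.executeUpdate();
                db.commit(); // atomic: either both rows land, or neither does
            } catch (Exception e) {
                db.rollback();
                throw e;
            }
        }
    }
}
```
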
  44. @duffleit "Kafka Transactions" Producer Consumers Producer Producer Consumers Consumers Stream

    Processor Stream Processor Stream Processor enable.idempotence = true isolation.level = read_committed Deduplication Inbox
  45. The map so far: Exactly Once Semantics and Atomicity between multiple Topic Operations via Transactions; Consistency between Kafka and "Non-Kafka" (Balances) via Outbox Pattern, Listen to Yourself, or Local Eventsourcing; Deduplication, Inbox, and Idempotency on the consumer side (see the consumer sketch below).
  46. How Kafka transactions work: the Payment Service consumes "Alice -> Bob" from the Transfers topic (partitions P1-P3) and produces "Alice -10€" and "Bob +10€"; the transaction (Transaction: ID) is tracked in __transaction_state, and the consumed offset (payments: 1) is committed to __consumer_offsets within the same transaction. Downstream consumers (C) read with isolation.level=read_committed.

  47. [Same diagram extended: a downstream Service (A) consumes "Alice -10€" inside its own transaction, Transaction: ID2.]
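
A condensed consume-transform-produce sketch of that flow, assuming the topic names from the slide and hard-coding the booking records (parsing of "Alice -> Bob" is omitted); the point is that the produced records and the consumed offsets commit in one transaction:

```java
// Sketch: one Kafka transaction covering both the produced records AND the
// consumed offset, which is what __transaction_state / __consumer_offsets
// coordinate behind the scenes.
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.TopicPartition;
import java.time.Duration;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

public class PaymentService {
    public static void main(String[] args) {
        Properties cProps = new Properties();
        cProps.put("bootstrap.servers", "localhost:9092"); // assumed address
        cProps.put("group.id", "payments");
        cProps.put("enable.auto.commit", "false");          // offsets go into the transaction
        cProps.put("isolation.level", "read_committed");
        cProps.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        cProps.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        Properties pProps = new Properties();
        pProps.put("bootstrap.servers", "localhost:9092");
        pProps.put("transactional.id", "payment-service-1"); // assumed stable id
        pProps.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        pProps.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(cProps);
             KafkaProducer<String, String> producer = new KafkaProducer<>(pProps)) {
            producer.initTransactions();
            consumer.subscribe(List.of("Transfers"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                if (records.isEmpty()) continue;
                producer.beginTransaction();
                Map<TopicPartition, OffsetAndMetadata> offsets = new HashMap<>();
                for (ConsumerRecord<String, String> rec : records) { // e.g. "Alice -> Bob"
                    producer.send(new ProducerRecord<>("payments", "Alice", "-10€"));
                    producer.send(new ProducerRecord<>("payments", "Bob", "+10€"));
                    offsets.put(new TopicPartition(rec.topic(), rec.partition()),
                            new OffsetAndMetadata(rec.offset() + 1));
                }
                // commit consumed offsets and produced records atomically
                producer.sendOffsetsToTransaction(offsets, consumer.groupMetadata());
                producer.commitTransaction();
            }
        }
    }
}
```
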
  48. Recap with the usual picks highlighted: Kafka's Exactly-Once Semantics for the all-Kafka goals, and the Outbox Pattern (or Listen to Yourself) for consistency between Kafka and "Non-Kafka".
  49. @duffleit [Diagram: one Producer and its Consumers share a contract 📜 over a Topic; then the picture scales out to many Producers, Consumers, and Topics.]

  50. [Same slide, rearranged.]
  51. @duffleit [The many-producers diagram, now with a Schema Registry; still, one Producer gets a Faulty Message through, because the registry is only enforced client-side.] Broker Side Validation, FTW.
  52. @duffleit Broker Side Validation 🤚 means Deserialization on the Broker 😱. A message is laid out as MagicByte | SubjectId | Payload, so the broker can ✅ check if the MagicByte exists, ✅ check if the SubjectId is valid, and ✅ check if the Payload matches the schema. The more to the right, the more expensive it gets.
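
The layout being checked is the Confluent wire format: one magic byte, a 4-byte schema id (the SubjectId lookup key), then the serialized payload. A sketch of the two cheap checks; the third, validating the payload against the schema, requires deserialization and is the expensive one:

```java
// Sketch: the cheap-to-expensive check order for the Confluent wire format:
// [0] magic byte (0x0) | [1..4] schema id | [5..] serialized payload.
import java.nio.ByteBuffer;

public class WireFormatChecks {
    private static final byte MAGIC_BYTE = 0x0;

    static int checkAndExtractSchemaId(byte[] message) {
        ByteBuffer buf = ByteBuffer.wrap(message);
        if (buf.remaining() < 5 || buf.get() != MAGIC_BYTE) {
            throw new IllegalArgumentException("no magic byte"); // check 1: trivial
        }
        int schemaId = buf.getInt(); // check 2: look this id up in the registry
        // check 3 (not shown): deserialize the remainder against the schema,
        // by far the most expensive step
        return schemaId;
    }
}
```
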
  53. @duffleit [Diagram: a cluster of Nodes A, B, C sitting behind Go proxies, each with requests waiting ⏳.]

  54. @duffleit [Same diagram with the proxy layer scaled out.] Race condition: once several proxies forward requests independently, we can no longer guarantee ordering.
  55. @duffleit What we have seen 👀: Multi AZ, Multi Region, Multi Cloud; Consistency vs. Availability; Disable Autocommit!; different options to achieve transactional guarantees in Kafka; Broker Side Schema Validation; segment size; partition size ("over-partition a bit"); and 200+ more configuration properties.

  56. @duffleit [Same list.] We were able to handle them, and so can you. 💪