The Basics Cluster Node A Node B Node C Producer Consumer Consumer Consumer Group Topic Partition A Partition B Partition C Partition A' Partition A'' Partition B' Partition B'' Partition C' Partition C''
The Basics Cluster Node A Node B Node C Producer Consumer Consumer Consumer Group Topic Partition A Partition B Partition C Partition A' Partition A'' Partition B' Partition B'' Partition C' Partition C'' Producer
The Basics Cluster Node A Node B Node C Producer Consumer Consumer Consumer Group Topic Partition A Partition B Partition C Partition A' Partition A'' Partition B' Partition B'' Partition C' Partition C'' Consumer Consumer
The Basics Cluster Node A Node B Node C Producer Consumer Consumer Consumer Group Topic Partition A Partition B Partition C Partition A' Partition A'' Partition B' Partition B'' Partition C' Partition C'' Consumer Consumer The number of partitions is the limiting factor for consumer instances.
The Basics Cluster Node A Node B Node C Producer Consumer Consumer Consumer Group Topic Partition A Partition B Partition C Partition A' Partition A'' Partition B' Partition B'' Partition C' Partition C'' Consumer Consumer The number of partitions is the limiting factor for consumer instances. User: Bob User: Alice User: Tim partition selection: hash(key) % #partitions User: Bob
The Basics Cluster Node A Node B Node C Producer Consumer Consumer Consumer Group Topic Partition A Partition B Partition C Partition A' Partition A'' Partition B' Partition B'' Partition C' Partition C'' Consumer Consumer The number of partitions is the limiting factor for consumer instances. Partition D Partition D' Partition D'' User: Bob User: Alice User: Tim partition selection: hash(key) % 4 (was % 3) User: Bob User: Bob User: Bob User: Alice Topic.v2
The Basics Cluster Node A Node B Node C Producer Consumer Consumer Consumer Group Topic Partition A Partition B Partition C Partition A' Partition A'' Partition B' Partition B'' Partition C' Partition C'' Consumer Consumer The number of partitions is the limiting factor for consumer instances. Select it wisely & over-partition a bit. Partition D Partition D' Partition D'' User: Bob User: Alice User: Tim partition selection: hash(key) % 4 (was % 3) User: Bob User: Bob User: Bob User: Alice Topic.v2 Select something that can be divided by multiple numbers, e.g. 6, 12, 24, ...
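A minimal sketch of why a key such as "Bob" can move when the partition count grows from 3 to 4 (illustrative only: Kafka's default partitioner hashes the serialized key with murmur2, String.hashCode() is just a stand-in to show the effect of the modulo):

public class PartitionSelection {
    // Illustrative only: Kafka's DefaultPartitioner applies murmur2 to the key bytes;
    // String.hashCode() is a stand-in to show the effect of the modulo.
    static int selectPartition(String key, int numPartitions) {
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        for (String user : new String[]{"Bob", "Alice", "Tim"}) {
            System.out.printf("%s -> partition %d of 3, partition %d of 4%n",
                    user, selectPartition(user, 3), selectPartition(user, 4));
        }
        // Keys may map to a different partition after the change, which is why
        // a re-keyed copy of the data often ends up in a new topic ("Topic.v2").
    }
}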
The Basics Topic: UserChanges User: Bob User: Alice User: Tim User: Bob User: Bob log.cleanup.policy: compact we only keep the latest record for a specific key.
The Basics Topic: UserChanges User: Bob User: Alice User: Tim User: Bob User: Bob log.cleanup.policy: compact we only keep the latest record for a specific key. How Kafka Stores Data userchanges.active.segment 📄 userchanges.segment.1 userchanges.segment.2 log.segment.bytes = 1GB Active Compaction ⚙ Compaction ⚙
The Basics Topic: UserChanges User: Bob User: Alice User: Tim User: Bob User: Bob log.cleanup.policy: compact we only keep the latest record for a specific key. How Kafka Stores Data userchanges.active.segment 📄 userchanges.segment.1 userchanges.segment.2 log.segment.bytes = 1GB Active Compaction ⚙ Compaction ⚙ log.roll.ms = 1week Especially in GDPR-related use cases, think explicitly about segment size and roll time.
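A sketch of how such a compacted topic might be created with the Java AdminClient; the topic name follows the slides, while the bootstrap address, partition count and replication factor are assumptions, and at topic level the keys are cleanup.policy, segment.bytes and segment.ms rather than the log.* broker defaults:

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.NewTopic;
import java.util.List;
import java.util.Map;
import java.util.Properties;

public class CreateCompactedTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumption: local broker

        try (Admin admin = Admin.create(props)) {
            NewTopic userChanges = new NewTopic("userchanges", 6, (short) 3)
                    .configs(Map.of(
                            "cleanup.policy", "compact",                          // keep only the latest record per key
                            "segment.bytes", String.valueOf(1024L * 1024 * 1024), // roll the active segment at ~1GB
                            "segment.ms", String.valueOf(7L * 24 * 60 * 60 * 1000))); // or after one week
            admin.createTopics(List.of(userChanges)).all().get();
        }
    }
}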
The Basics Topic: UserChanges User: Bob User: Alice User: Tim User: Bob User: Bob log.cleanup.policy: compact Tombstone: Tim delete.retention.ms = 1day A slow consumer that starts fresh needs more than one day to read all events from the topic. Tombstone: Tim
The Basics Topic: UserChanges User: Bob User: Alice User: Tim User: Bob User: Bob log.cleanup.policy: compact Tombstone: Tim delete.retention.ms = 1day A slow consumer that starts fresh needs more than one day to read all events from the topic. User: Tim Keep delete.retention.ms in sync with the given topic retention.
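A record is deleted by producing a tombstone, i.e. the user's key with a null value; a minimal sketch (topic and key follow the slides, bootstrap address and serializers are assumptions):

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;
import java.util.Properties;

public class DeleteTim {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumption: local broker
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // A null value is a tombstone: after compaction, and once delete.retention.ms
            // has passed, all records for the key "Tim" disappear from the topic.
            producer.send(new ProducerRecord<>("userchanges", "Tim", null));
        }
    }
}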
The Basics Cluster Node C Producer Consumer Consumer Consumer Group Topic Partition A Partition B Partition C Partition A' Partition A'' Partition B' Partition B'' Partition C' Partition C'' Node B Node A Node F Node E Node D AZ 1 AZ 2
The Basics Cluster Node C Producer Consumer Consumer Consumer Group Topic Partition A Partition B Partition C Partition A' Partition A'' Partition B' Partition B'' Partition C' Partition C'' Node B Node A Node F Node E Node D AZ 1 AZ 2 Partition D Partition E Partition F Partition D' Partition D'' Partition E' Partition E'' Partition F' Partition F'' KIP-36: Rack aware replica assignment
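Under the hood, KIP-36 relies on each broker declaring its location via the broker.rack property in server.properties (for example broker.rack=AZ1 on Nodes A to C and broker.rack=AZ2 on Nodes D to F; the labels here are illustrative). During topic creation the replicas of every partition are then spread across as many racks as possible, so a full AZ outage never takes away all copies of a partition.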
Cluster Multi Region? Node C Producer Consumer Consumer Consumer Group Topic Partition A Partition B Partition C Node B Node A Node F Node E Node D Region 1 Region 2 Partition D' Partition D'' Partition E' Partition E'' Partition F' Partition F'' Partition A' Partition A'' Partition B' Partition B'' Partition C' Partition C'' Partition D Partition E Partition F Usually the latency between multiple regions is too big to span a single cluster over them
Cluster Region West Cluster Region East Multi Region? Node C Producer Consumer Consumer Consumer Group Topic Partition A Partition B Partition C Node B Node A Node F Node E Node D Region 1 Region 2 Partition D' Partition D'' Partition E' Partition E'' Partition F' Partition F''
Cluster Region West Cluster Region East Multi Region? Node C Producer Consumer Consumer Consumer Group Topic A B C Node B Node A Node F Node E Node D Region 1 Region 2 A B C
Cluster Region West Cluster Region East Multi Region? Node C Producer East Consumer Consumer Consumer Group Topic A B C Node B Node A Node F Node E Node D Region 1 Region 2 A B C Producer West
Cluster Region West — "west" Cluster Region East — "east" Multi Region? Node C Producer East Consumer Consumer Consumer Group Topic A B C Node B Node A Node F Node E Node D Region 1 Region 2 A B C Producer West Mirror Maker 2 east.C west.C *.C Ordering Guarantees?!
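With its default replication policy, MirrorMaker 2 prefixes mirrored topics with the source cluster alias (east.C, west.C), so ordering is only guaranteed per source partition. A consumer that wants the merged view can subscribe to a pattern covering the local and the mirrored topics, as in this sketch (topic name and cluster aliases follow the slide, everything else is an assumption):

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;
import java.time.Duration;
import java.util.Properties;
import java.util.regex.Pattern;

public class MergedRegionConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumption: broker of one region
        props.put("group.id", "merged-view");
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // Matches the local topic "C" as well as the mirrored "east.C" / "west.C".
            consumer.subscribe(Pattern.compile("(east\\.|west\\.)?C"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("%s-%d: %s%n", record.topic(), record.partition(), record.value());
                }
            }
        }
    }
}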
@duffleit Producer Partition max.in.flight.requests.per.connection = 5 Message A Message B Message A retry Message B Message A Legacy Solution: max.in.flight.requests.per.connection = 1 State-of-the-Art Solution: enable.idempotence = true retries = MAX_INT max.in.flight.requests.per.connection = 5 acks = all SEQ#: 1 SEQ#: 2 OutOfOrderSequenceException SEQ#: 2 If you don't want to set your retries to infinite, prefer "delivery.timeout.ms" over "retries".
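The state-of-the-art settings from the slide as Java producer configuration (bootstrap address and serializers are assumptions); delivery.timeout.ms bounds the total time spent sending and retrying, which is usually easier to reason about than a retry count:

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;
import java.util.Properties;

public class IdempotentProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumption
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);
        props.put(ProducerConfig.ACKS_CONFIG, "all");
        props.put(ProducerConfig.RETRIES_CONFIG, Integer.MAX_VALUE);
        props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, 5);
        // Bound the total send time instead of reasoning about retry counts.
        props.put(ProducerConfig.DELIVERY_TIMEOUT_MS_CONFIG, 120_000);

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // ... send records; broker-side sequence numbers now reject duplicates
            // and keep retried batches in order.
        }
    }
}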
Node A Node B Node C Topic Partition A Partition B Partition C Partition C' Partition B'' Partition A' Partition C'' Partition B' Partition A'' Producer acks = none acks = 1 acks = all min.insync.replicas = 3 @duffleit
Node A Node B Node C Topic Partition A Partition B Partition C Partition C' Partition B'' Partition A' Partition C'' Partition B' Partition A'' Producer acks = none acks = 1 acks = all min.insync.replicas = 2 @duffleit
min.insync.replicas = 2 Node A Node B Node C Topic Partition A Partition B Partition C Partition C' Partition B'' Partition A' Partition C'' Partition B' Partition A'' Producer acks = all @duffleit CAP Theorem Consistency Availability Partition Tolerance ❌ min.insync.replicas++ min.insync.replicas--
min.insync.replicas = 3 Node A Node B Node C Topic Partition A Partition B Partition C Partition C' Partition B'' Partition A' Partition C'' Partition B' Partition A'' Producer acks = all @duffleit CAP Theorem Consistency Availability Partition Tolerance ❌ min.insync.replicas++ min.insync.replicas--
min.insync.replicas = 2 Node A Node B Node C Topic Partition A Partition B Partition C Partition C' Partition B'' Partition A' Partition C'' Partition B' Partition A'' Producer acks = all @duffleit CAP Theorem Consistency Availability Partition Tolerance ❌ min.insync.replicas++ min.insync.replicas--
min.insync.replicas = 2 Node A Node B Node C Topic Partition A Partition B Partition C Partition C' Partition B'' Partition A' Partition C'' Partition B' Partition A'' Producer acks = all @duffleit CAP Theorem Consistency Availability Partition Tolerance ❌ min.insync.replicas++ min.insync.replicas-- Possible Data Loss There is no "ad-hoc fsync" by default. Can be configured via "log.flush.interval.ms"
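min.insync.replicas is a topic-level (or broker default) setting, while acks stays on the producer; a sketch of adjusting it per topic with the AdminClient (topic name, value and bootstrap address are illustrative):

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;
import java.util.List;
import java.util.Map;
import java.util.Properties;

public class TuneMinInsyncReplicas {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumption: local broker

        try (Admin admin = Admin.create(props)) {
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "payments");
            // min.insync.replicas++ favours consistency, min.insync.replicas-- favours availability.
            AlterConfigOp setMinIsr = new AlterConfigOp(
                    new ConfigEntry("min.insync.replicas", "2"), AlterConfigOp.OpType.SET);
            admin.incrementalAlterConfigs(Map.of(topic, List.of(setMinIsr))).all().get();
        }
    }
}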
@duffleit Keep in mind that rack assignment is ignored for in-sync replicas. Node C Node B Node A Node D Node E Node F AZ 1 AZ 2 replicas = 6 min.insync.replicas = 4
@duffleit Keep in mind that rack assignment is ignored for in-sync replicas. Node C Node B Node A Node D Node E Node F AZ 1 AZ 2 replicas = 5 min.insync.replicas = 4 fails if more than 1 replica is down
@duffleit Keep in mind that rack assignment is ignored for in-sync replicas. Node C Node B Node A Node D Node E Node F Node G Node H Node I AZ 1 AZ 2 AZ 3 replicas = 9 min.insync.replicas = 7
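Worked through for the layouts above: with replicas = 6 split 3 per AZ and min.insync.replicas = 4, losing a single AZ leaves only 3 in-sync replicas, so every acks = all write is rejected; with replicas = 5 and min.insync.replicas = 4, already 2 unavailable replicas (anywhere, regardless of rack) block writes. The same counting applies to the 3-AZ layout with 9 replicas, because min.insync.replicas only counts replicas and never looks at which rack they live in.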
@duffleit Onboarding UserUpdated (age: 22) Stream Global EventSourcing UserUpdated (age: 21) Advertisement User (age: 22) User (age: 22) UserUpdated (age: 22) UserUpdated (age: 23) 👻 it often breaks information hiding & data isolation.
Transactions Producer Stream Consumers Transaction to achieve Exactly-Once Semantics. Transaction to achieve Consistency between Kafka and "Non-Kafka": Outbox Pattern, Listen to Yourself, Local Eventsourcing. Transaction to achieve Atomicity between multiple Topic Operations, e.g. Balances built from multiple Messages.
Transfers Payment Service Alice -> Bob Alice -10€ Bob +10€ __transaction_state Transaction: ID __consumer_offsets payments: 1 P1 P1 P2 Transfers P3 P2 P3 C C C C isolation.level=read_committed
Transfers Payment Service Alice -> Bob Alice -10€ __transaction_state Transaction: ID __consumer_offsets payments: 1 P1 P1 P2 Transfers P3 P2 isolation.level=read_committed Service A A A Transaction: ID2
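A condensed sketch of the consume-transform-produce loop behind this diagram (topic names, group id, transactional.id and amounts are illustrative): the consumed offsets are written to __consumer_offsets inside the same transaction, and downstream consumers only see the balance updates once the transaction is committed, provided they read with isolation.level=read_committed.

import org.apache.kafka.clients.consumer.*;
import org.apache.kafka.clients.producer.*;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;
import java.time.Duration;
import java.util.*;

public class TransferProcessor {
    public static void main(String[] args) {
        Properties cProps = new Properties();
        cProps.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumption
        cProps.put(ConsumerConfig.GROUP_ID_CONFIG, "payments");
        cProps.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);           // disable autocommit!
        cProps.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, "read_committed");
        cProps.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        cProps.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        Properties pProps = new Properties();
        pProps.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumption
        pProps.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "payment-service-1");
        pProps.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        pProps.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(cProps);
             KafkaProducer<String, String> producer = new KafkaProducer<>(pProps)) {
            consumer.subscribe(List.of("transfers"));
            producer.initTransactions();
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                if (records.isEmpty()) continue;
                producer.beginTransaction();
                try {
                    Map<TopicPartition, OffsetAndMetadata> offsets = new HashMap<>();
                    for (ConsumerRecord<String, String> transfer : records) {
                        // "Alice -> Bob" becomes two balance updates, written atomically;
                        // the parsed amounts are hardcoded here for brevity.
                        producer.send(new ProducerRecord<>("balances", "Alice", "-10"));
                        producer.send(new ProducerRecord<>("balances", "Bob", "+10"));
                        offsets.put(new TopicPartition(transfer.topic(), transfer.partition()),
                                new OffsetAndMetadata(transfer.offset() + 1));
                    }
                    // Commit the consumed offsets as part of the same transaction.
                    producer.sendOffsetsToTransaction(offsets, consumer.groupMetadata());
                    producer.commitTransaction();
                } catch (Exception e) {
                    producer.abortTransaction();
                }
            }
        }
    }
}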
@duffleit Producer Consumers Topic Schema Registry Faulty Message Broker Side Validation 🤚 Deserialization on Broker 😱 MagicByte SubjectId Payload ✅ Check if MagicByte Exists. ✅ Check if SubjectId is Valid. ✅ Check if Payload Matches Schema. The more to the right, the more expensive it gets.
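These checks follow Confluent's wire format for registry-serialized records: one magic byte (0), a 4-byte schema id (what the slide calls the SubjectId), then the payload. A sketch of the two cheap header checks as they could be performed before touching the payload; the rightmost check additionally needs a registry lookup plus deserialization, which is what makes it expensive:

import java.nio.ByteBuffer;

public class SchemaHeaderCheck {
    private static final byte MAGIC_BYTE = 0x0; // Confluent wire format

    // Returns the schema id if the header looks valid, otherwise throws.
    static int checkHeader(byte[] value) {
        if (value == null || value.length < 5) {
            throw new IllegalArgumentException("record value too short for magic byte + schema id");
        }
        ByteBuffer buffer = ByteBuffer.wrap(value);
        if (buffer.get() != MAGIC_BYTE) {
            throw new IllegalArgumentException("missing magic byte"); // cheapest check
        }
        // The id would be validated against the Schema Registry next; checking that the
        // remaining payload actually matches the schema requires fetching the schema and
        // deserializing, which is the most expensive step.
        return buffer.getInt();
    }
}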
@duffleit Multi AZ, Multi Region, Multi Cloud Consistency vs. Availability Disable Autocommit! Different Options to Achieve Transactional Guarantees in Kafka Broker Side Schema Validation Segment Size Partition Count: "over-partition a bit" and 200+ more Configuration Properties. What we have seen 👀
@duffleit Multi AZ, Multi Region, Multi Cloud Consistency vs. Availability Disable Autocommit! Different Options to Achieve Transactional Guarantees in Kafka Broker Side Schema Validation Segment Size Partition Count: "over-partition a bit" and 200+ more Configuration Properties. What we have seen 👀 We were able to handle them, so are you. 💪