kafka_for_rubyists_advanced_kafka.pdf

Advanced Kafka

Tombstones
- Use a tombstone when you want to delete the messages stored under a given key in a partition (on compacted topics)
- Just send a message with that key and a null payload (a real null, do not send the “null” string)
- GDPR-friendly feature ;)
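
The tombstone rule can be illustrated with a minimal pure-Ruby sketch. It does not talk to Kafka: `Record`, `compact`, and the sample data are hypothetical, and the model only shows the end result of compaction (latest value per key, tombstoned keys dropped). A real broker compacts asynchronously and keeps the tombstone itself around for a while.

```ruby
# Simplified model of log compaction: keep the latest payload per key,
# then drop every key whose latest payload is a tombstone (nil).
# `Record` and `compact` are hypothetical helpers, not a Kafka client API.
Record = Struct.new(:key, :payload)

def compact(log)
  latest = {}
  log.each { |record| latest[record.key] = record.payload }
  latest.reject { |_key, payload| payload.nil? }
end

log = [
  Record.new("user-1", '{"email":"a@example.com"}'),
  Record.new("user-2", '{"email":"b@example.com"}'),
  Record.new("user-1", nil),    # tombstone: a real nil, so the key is deleted
  Record.new("user-2", "null")  # the string "null" is just an ordinary payload
]

compacted = compact(log)
puts compacted.inspect
```

Only `user-1` disappears here; `user-2` survives with the literal string as its value, which is why sending the “null” string does not delete anything.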

Publishing huge messages (in megabytes)
- Do you really have to do this in the first place?
- If yes, you need to adjust multiple settings:
  1. Consumer: "fetch.message.max.bytes"
  2. Broker: "replica.fetch.max.bytes"
  3. Broker: "message.max.bytes"
  4. Topic: "max.message.bytes"
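
Because these settings have to agree with each other, a small sanity check can help. The sketch below is only an illustration: the sizes are made up, `undersized_fetch_settings` is a hypothetical helper, and the rule it applies (every fetch limit should be at least as large as the largest accepted message) is a conservative assumption of this sketch rather than a documented requirement for every Kafka version.

```ruby
# Hypothetical sanity check for large-message settings.
# Values are illustrative; the "fetch limit >= accepted message size"
# rule is a conservative assumption made by this sketch.
MEGABYTE = 1024 * 1024

settings = {
  "message.max.bytes"       => 5 * MEGABYTE, # broker: largest accepted message
  "max.message.bytes"       => 5 * MEGABYTE, # topic-level override
  "replica.fetch.max.bytes" => 6 * MEGABYTE, # broker: follower fetch size
  "fetch.message.max.bytes" => 6 * MEGABYTE  # consumer fetch size
}

# Returns the names of fetch settings smaller than the largest accepted message.
def undersized_fetch_settings(settings)
  largest = [settings["message.max.bytes"], settings["max.message.bytes"]].max
  %w[replica.fetch.max.bytes fetch.message.max.bytes].select do |name|
    settings[name] < largest
  end
end

problems = undersized_fetch_settings(settings)
puts problems.empty? ? "settings look consistent" : "too small: #{problems.join(', ')}"
```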

How to choose number of partitions?
- More partitions - more throughput
- What do you expect? For example, if processing a message takes 10ms, a single consumer handles 100 messages/second, so to process 1000 messages/second you need 10 partitions (one consumer per partition)
- More partitions - increased unavailability when a broker is down (proportionally to the number of partitions)
- More partitions - higher latency, increased replication time
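
The throughput arithmetic from the example can be written down as a tiny helper. This is a back-of-the-envelope sketch only: `partitions_needed` is a hypothetical name, and it assumes one consumer per partition that processes messages strictly one at a time.

```ruby
# Back-of-the-envelope partition count.
# Assumes one consumer per partition, processing one message at a time.
def partitions_needed(processing_time_ms:, target_messages_per_second:)
  # Total processing time needed per second of traffic, divided by the
  # 1000 ms a single consumer has available each second. Rational avoids
  # floating-point rounding surprises before taking the ceiling.
  Rational(target_messages_per_second * processing_time_ms, 1000).ceil
end

puts partitions_needed(processing_time_ms: 10, target_messages_per_second: 1000)
```

Treat the result as a lower bound; the availability and latency costs listed above still argue against overshooting it by a wide margin.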

Kafka Controller
- A “normal” broker with some extra responsibilities
- The first broker that registers itself as a controller in the cluster
- Responsible for electing partition leaders

Kafka Replication
- Replication is necessary for a reasonable production setup with satisfactory availability and durability
- Replication factor (set per topic; the broker-level default is “default.replication.factor”) determines how many times a partition is replicated (on how many brokers it will exist). 3 is a reasonable default
- A replication factor of 3 (N) allows you to lose 2 (N-1) brokers while still being operational

Kafka Replication: Leader/Follower
- Leader replica: each partition has its own leader, and by default all requests (produce/consume) go through the leader
- Follower replicas: they keep themselves up to date with the leader. If the leader goes down, one of the followers will take over

In-sync replicas
- Replicas that are “up-to-date” with the leader
- “replica.lag.time.max.ms” sets how long a follower may lag behind the leader before it is no longer considered in-sync
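
A simplified model of that rule, assuming the broker tracks the last time each follower was fully caught up. `Follower`, `in_sync_followers`, and the timestamps are hypothetical; the real broker logic has more cases than this.

```ruby
# Simplified ISR membership check: a follower counts as in-sync if it was
# caught up with the leader within the last `replica_lag_time_max_ms` ms.
# `Follower` and `in_sync_followers` are illustrative, not broker code.
Follower = Struct.new(:broker_id, :last_caught_up_ms)

def in_sync_followers(followers, now_ms:, replica_lag_time_max_ms:)
  followers.select do |follower|
    now_ms - follower.last_caught_up_ms <= replica_lag_time_max_ms
  end
end

followers = [
  Follower.new(2, 9_500), # caught up 500 ms ago
  Follower.new(3, 0)      # has not caught up for 10 seconds
]

isr = in_sync_followers(followers, now_ms: 10_000, replica_lag_time_max_ms: 5_000)
puts isr.map(&:broker_id).inspect
```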

In-sync replicas
- For ensuring consistency, you might choose to require data to be committed to more than one replica
- Configurable via “min.insync.replicas” (enforced for producers using acks=all)
- When it’s set to 2 and you have 3 brokers (replication factor 3), you can lose only 1 broker. If you lose 2, it will no longer be possible to produce messages to the affected partitions
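
The interaction between replication factor, lost brokers, and “min.insync.replicas” can be sketched as a small decision function. `produce_outcome` and its return symbols are hypothetical, and the model assumes every replica on a live broker is in sync and that each broker holds exactly one replica of the partition.

```ruby
# Toy model of whether a produce request can succeed.
# Assumes each surviving replica is in sync; names are illustrative only.
def produce_outcome(replication_factor:, brokers_down:, min_insync_replicas:, acks: "all")
  in_sync = replication_factor - brokers_down
  return :partition_offline if in_sync <= 0

  # min.insync.replicas is only checked for producers using acks=all.
  if acks == "all" && in_sync < min_insync_replicas
    return :rejected_not_enough_replicas
  end

  :accepted
end

[0, 1, 2, 3].each do |down|
  outcome = produce_outcome(replication_factor: 3, brokers_down: down, min_insync_replicas: 2)
  puts "#{down} broker(s) down: #{outcome}"
end
```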

Leader election
- Clean election - when an in-sync replica is chosen as the new leader; the standard process
- Unclean election - when no in-sync replica is available (e.g. 2 brokers are down and then the last one, the leader, goes down)
- Unclean election - difficult choice, consistency vs. availability (we can either accept losing messages or keep the partition offline)
- Configurable via "unclean.leader.election.enable"
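
The clean/unclean distinction can be modeled in a few lines. This is a sketch under simplifying assumptions: `elect_leader` is a hypothetical function, brokers are plain integers, and the first eligible replica in the list wins, which is not necessarily how a real controller orders candidates.

```ruby
# Toy leader election for a single partition.
# Picks the first live in-sync replica (clean). If none exists, falls back
# to any live replica only when unclean election is enabled.
def elect_leader(replicas:, in_sync:, alive:, unclean_enabled:)
  clean = replicas.find { |broker| in_sync.include?(broker) && alive.include?(broker) }
  return { leader: clean, type: :clean } if clean

  if unclean_enabled
    candidate = replicas.find { |broker| alive.include?(broker) }
    return { leader: candidate, type: :unclean } if candidate
  end

  { leader: nil, type: :offline }
end

# Brokers 1 and 2 fell out of sync, then leader 3 went down and 1, 2 came back.
scenario = { replicas: [1, 2, 3], in_sync: [3], alive: [1, 2] }

puts elect_leader(**scenario, unclean_enabled: false).inspect
puts elect_leader(**scenario, unclean_enabled: true).inspect
```

With unclean election disabled the partition stays offline until broker 3 returns; with it enabled broker 1 takes over, and any messages only broker 3 had are lost.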

Split-brain
- A controller goes down (network partition, stop-the-world GC pause) and still thinks it’s the controller after coming back, but in the meantime a new one was elected
- An epoch number (a monotonically increasing number for controllers) is used to prevent split-brain - the highest one wins
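
Epoch-based fencing can be sketched from the receiving broker's point of view. The `Broker` class, its `handle` method, and the command strings are hypothetical; the only idea taken from the slide is that a request carrying an older controller epoch than the highest one seen is ignored.

```ruby
# Toy model of epoch fencing: a broker remembers the highest controller
# epoch it has seen and rejects commands stamped with an older one.
class Broker
  attr_reader :highest_epoch_seen, :applied

  def initialize
    @highest_epoch_seen = 0
    @applied = []
  end

  def handle(command, controller_epoch:)
    return :rejected if controller_epoch < @highest_epoch_seen

    @highest_epoch_seen = controller_epoch
    @applied << command
    :accepted
  end
end

broker = Broker.new
puts broker.handle("make broker 1 leader", controller_epoch: 1) # original controller
puts broker.handle("make broker 2 leader", controller_epoch: 2) # newly elected controller
puts broker.handle("make broker 3 leader", controller_epoch: 1) # old controller wakes up
```

The last command comes from the "zombie" controller and is dropped, so only one controller's decisions ever take effect.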

Zookeeper
- It’s not part of the Kafka “core” itself, but it’s used by Kafka
- Zookeeper - a service for maintaining shared configuration
- It’s used e.g. for electing a controller or keeping info about cluster membership (which brokers are part of the cluster)
- Planned to be removed as a dependency

Producer’s Reliability
- acks: 0/1/all - no acknowledgement / acknowledged by the leader / acknowledged by all required in-sync replicas
- Error handling - if you don’t want to lose messages, you should retry somehow
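
One way to "retry somehow" is a bounded retry loop with exponential backoff. The sketch below is client-agnostic: `DeliveryError` and `deliver_with_retries` are hypothetical names, and the block stands in for whatever produce call your Kafka client exposes. The `sleeper` parameter exists only so the delay can be swapped out, for example in tests.

```ruby
# Bounded retries with exponential backoff around a delivery attempt.
# `DeliveryError` is a stand-in for your client's delivery failure class.
class DeliveryError < StandardError; end

def deliver_with_retries(max_attempts: 5, base_delay: 0.1, sleeper: ->(seconds) { sleep(seconds) })
  attempt = 0
  begin
    attempt += 1
    yield attempt
  rescue DeliveryError
    raise if attempt >= max_attempts # give up and surface the error

    sleeper.call(base_delay * (2**(attempt - 1))) # 0.1s, 0.2s, 0.4s, ...
    retry
  end
end

calls = 0
result = deliver_with_retries(base_delay: 0.01) do |attempt|
  calls += 1
  # Simulated broker hiccup: the first two attempts fail.
  raise DeliveryError, "broker unavailable" if attempt < 3
  :delivered
end

puts "#{result} after #{calls} attempts"
```

Note that retrying can produce duplicates if the first attempt actually reached the broker, so consumers should tolerate seeing a message more than once.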

Thanks!

karol.galanciak
