
kafka_for_rubyists_advanced_kafka.pdf

karol.galanciak

January 15, 2021

Transcript

  1. Advanced Kafka

  2. Tombstones
    - When you want to delete the messages for a
    given key from a (compacted) partition
    - Just send a message with a null payload (do
    not send the string "null" though)
    - GDPR-friendly feature ;)
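A minimal sketch of producing a tombstone with the rdkafka gem (the topic name, key, and broker address are illustrative assumptions):

```ruby
# Sketch: "deleting" the value stored under a key in a compacted topic
# by producing a tombstone, i.e. a message with the same key and a nil
# payload. The topic and key below are made-up examples.

# Build the arguments for a tombstone message: same key, nil payload.
def tombstone_for(topic, key)
  { topic: topic, key: key, payload: nil }
end

msg = tombstone_for("users", "user-42")

# With a real broker you would deliver it like this (rdkafka gem):
#   producer = Rdkafka::Config.new("bootstrap.servers" => "localhost:9092").producer
#   producer.produce(**msg).wait
```

Note that `payload: nil` is not the same as `payload: "null"` - the latter is a regular four-character message and will not trigger deletion during compaction.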

  3. Publishing huge messages (in megabytes)
    - Do you really have to do this in the first place?
    - If yes, you need to adjust multiple settings:
    1. Consumer: "fetch.message.max.bytes"
    2. Broker: "replica.fetch.max.bytes"
    3. Broker: "message.max.bytes"
    4. Topic: "max.message.bytes" (per-topic override)
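Putting the settings above together, a sketch of the consumer side plus the matching broker/topic side (the 10 MB limit and the topic name are example values, not recommendations):

```ruby
# Sketch: config needed to move ~10 MB messages through Kafka.
# All sizes below are illustrative assumptions.

# Consumer side (rdkafka/librdkafka option name):
consumer_config = {
  "bootstrap.servers"       => "localhost:9092",
  "group.id"                => "big-messages-consumer",
  "fetch.message.max.bytes" => 10_485_760 # must fit the largest message
}

# Broker side (server.properties):
#   replica.fetch.max.bytes=10485760
#   message.max.bytes=10485760
#
# Per-topic override via the CLI:
#   kafka-configs.sh --alter --entity-type topics --entity-name big-messages \
#     --add-config max.message.bytes=10485760
```

All four limits have to agree: a message small enough for the producer but larger than `replica.fetch.max.bytes` would be accepted by the leader yet never replicated.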

  4. How to choose the number of partitions?
    - More partitions - more throughput
    - Start from the throughput you expect. For
    example, if processing a message takes 10 ms,
    a single partition can handle roughly 100
    messages/second, so to process 1000
    messages/second you need 10 partitions
    - More partitions - increased unavailability when
    a broker is down (proportionally to the number
    of partitions)
    - More partitions - higher latency, increased
    replication time
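The back-of-the-envelope estimate above can be sketched as:

```ruby
# Sketch: estimate the partition count from target throughput and
# per-message processing time (the numbers are the example from the slide).
def partitions_needed(target_msgs_per_sec, processing_time_ms)
  # How many messages/second a single consumer can process per partition:
  per_partition = 1000.0 / processing_time_ms
  (target_msgs_per_sec / per_partition).ceil
end

partitions_needed(1000, 10) # => 10
```

This only accounts for consumer throughput; in practice you would also leave headroom for traffic growth, since adding partitions later changes key-to-partition assignment.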

  5. Kafka Controller
    - A "normal" broker with some extra
    responsibilities
    - The first broker to register itself in the
    cluster becomes the controller
    - Responsible for electing partition leaders

  6. Kafka Replication
    - Replication is necessary for a reasonable
    production setup with satisfactory availability and
    durability
    - The replication factor (the "replication.factor"
    option) determines how many times a partition is
    replicated (on how many brokers it will exist). 3 is
    a reasonable default
    - A replication factor of 3 (N) allows you to lose 2
    (N-1) brokers while still staying operational

  7. Kafka Replication: Leader/Follower
    - Leader replica: each partition has its own
    leader; all requests (produce/consume) go
    through the leader
    - Follower replicas: they keep themselves up-to-
    date with the leader. If the leader goes down,
    one of the followers takes over

  8. In-sync replicas
    - Replicas that are "up-to-date"
    - "replica.lag.time.max.ms" configures how long
    a replica can lag behind the leader before it is
    no longer considered in-sync

  9. In-sync replicas
    - To ensure consistency, you might choose to
    require data to be committed to more than one
    replica - "min.insync.replicas"
    - When it's set to 2 and you have 3 brokers, you
    can lose only 1 broker; if you lose 2, it will no
    longer be possible to produce messages for
    the affected partitions
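The 3-broker example above boils down to a simple comparison, sketched here:

```ruby
# Sketch: whether producing with acks=all can succeed for a partition,
# given how many of its replicas are currently in sync and the
# "min.insync.replicas" setting.
def can_produce?(in_sync_replicas, min_insync_replicas)
  in_sync_replicas >= min_insync_replicas
end

# replication factor 3, min.insync.replicas=2:
can_produce?(3, 2) # => true  (all brokers up)
can_produce?(2, 2) # => true  (one broker lost)
can_produce?(1, 2) # => false (two brokers lost - producer gets a
                   #           "not enough replicas" error)
```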

  10. Leader election
    - Clean election - when an in-sync replica is
    chosen as the new leader, the standard process
    - Unclean election - when no in-sync replica exists
    (e.g. 2 brokers are down and then the last one,
    the leader, goes down)
    - Unclean election - a difficult choice, consistency
    vs. availability (we can lose messages or decide
    to keep the partition offline)
    - Configurable via "unclean.leader.election.enable"
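The clean/unclean decision described above can be sketched as (the replica IDs are made-up examples; this is a simplification of the real controller logic):

```ruby
# Sketch: prefer an in-sync replica as the new leader; fall back to any
# replica only when unclean elections are enabled, otherwise the
# partition stays offline (nil).
def elect_leader(replicas, isr, unclean_enabled: false)
  candidate = replicas.find { |r| isr.include?(r) }
  return candidate if candidate           # clean election
  unclean_enabled ? replicas.first : nil  # unclean election, or offline
end

elect_leader([1, 2, 3], [2, 3])                    # => 2   (clean)
elect_leader([1, 2, 3], [], unclean_enabled: true) # => 1   (may lose messages)
elect_leader([1, 2, 3], [])                        # => nil (partition offline)
```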

  11. Split-brain
    - A controller goes down (network partition,
    stop-the-world GC pause) and, after coming
    back, still thinks it's the controller - but in the
    meantime a new one was elected
    - An epoch number (a monotonically increasing
    number for controllers) is used to prevent
    split-brain - the highest one wins
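The epoch check amounts to fencing off stale controllers, sketched here (a simplification for illustration):

```ruby
# Sketch: brokers reject requests carrying a controller epoch older
# than the highest one they have seen, which fences off a "zombie"
# controller that comes back after a new election.
def accept_controller_request?(highest_seen_epoch, request_epoch)
  request_epoch >= highest_seen_epoch
end

accept_controller_request?(5, 5) # => true  (current controller)
accept_controller_request?(5, 4) # => false (old controller came back)
accept_controller_request?(5, 6) # => true  (newly elected controller)
```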

  12. Zookeeper
    - It's not part of Kafka's "core" itself, but it's
    used by Kafka
    - Zookeeper - a service for maintaining shared
    configuration
    - It's used e.g. for electing a controller or
    keeping info about cluster membership (which
    brokers are part of the cluster)
    - Planned to be removed as a dependency
    (KIP-500)

  13. Producer's Reliability
    - acks: 0/1/all - no acknowledgement/ack by the
    leader only/ack by all required in-sync replicas
    - Error handling - if you don't want to lose
    messages, you should retry failed deliveries
    somehow
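A minimal retry sketch (the helper name and attempt count are illustrative; with rdkafka you would also set `"acks" => "all"` in the producer config for the strongest guarantee):

```ruby
# Sketch: retry a delivery block a bounded number of times before
# giving up and re-raising, so transient broker errors don't lose
# messages. The block below simulates a delivery that fails twice.
def deliver_with_retries(max_attempts: 3)
  attempts = 0
  begin
    attempts += 1
    yield
  rescue StandardError
    retry if attempts < max_attempts
    raise
  end
end

calls = 0
deliver_with_retries { calls += 1; raise "broker unavailable" if calls < 3 }
calls # => 3 (failed twice, succeeded on the third attempt)
```

In real code the block would wrap the produce-and-wait call; you would typically also back off between attempts rather than retry immediately.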

  14. Thanks!
