developed by the Apache Software Foundation, written in Scala and Java. The project aims to provide a unified, high-throughput, low-latency platform for handling real-time data feeds. Its storage layer is essentially a "massively scalable pub/sub message queue architected as a distributed transaction log" (Wikipedia).
default)
   b. At most once
   c. Exactly once
2. Fault-tolerance vs latency
   a. No ack
   b. Acks from leader
   c. Acks from followers
3. Message order in a single partition
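The fault-tolerance vs latency options above could map onto producer settings roughly as follows. This is a minimal sketch, not the speaker's configuration: the class and method names are hypothetical, only the property keys (`acks`, `retries`, `enable.idempotence`, `max.in.flight.requests.per.connection`) are standard Kafka producer configuration names, and the values are illustrative assumptions. Plain `java.util.Properties` is used so the sketch compiles without the Kafka client on the classpath.

```java
import java.util.Properties;

// Hypothetical helper: one Properties profile per acknowledgement option.
final class ProducerAckProfiles {
    private ProducerAckProfiles() {}

    /** No ack: lowest latency, but a lost send is never noticed or retried. */
    static Properties noAck() {
        Properties p = new Properties();
        p.setProperty("acks", "0");
        p.setProperty("retries", "0");
        return p;
    }

    /** Leader ack: the send is confirmed once the partition leader has written it. */
    static Properties leaderAck() {
        Properties p = new Properties();
        p.setProperty("acks", "1");
        // Assumption: retries are wanted, so duplicates become possible.
        p.setProperty("retries", "3");
        // One in-flight request keeps retried batches from overtaking later ones
        // within a single partition.
        p.setProperty("max.in.flight.requests.per.connection", "1");
        return p;
    }

    /** Ack from all in-sync replicas: highest durability, highest latency. */
    static Properties allInSyncReplicasAck() {
        Properties p = new Properties();
        p.setProperty("acks", "all");
        // Idempotence lets the broker discard duplicates caused by retries.
        p.setProperty("enable.idempotence", "true");
        return p;
    }
}
```

Any of these profiles would still need `bootstrap.servers` and serializer settings before being passed to a real producer. Idempotence on its own only deduplicates retries within a partition, so it should not be read as a complete exactly-once setup.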
ISR
2. Starts rebalancing partitions (if automatic rebalance is enabled)
   a. Too many partitions can cause issues
3. Notifies consumers
4. Notifies producers
Separating message processing in KafkaMessageProcessor
4. Different implementations
   a. One thread for all partitions vs one thread per partition
   b. Autocommit
   c. Poll periods
   d. Batch support
   e. Rebalancing considerations
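The "one thread per partition" option above might be sketched as follows. Everything here is an assumption for illustration: the slide names `KafkaMessageProcessor` but does not show its shape, so the interface below is hypothetical, and `PerPartitionDispatcher` is an invented helper. The sketch leaves out the Kafka consumer itself and assumes a single polling thread hands each record to `dispatch`.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical shape for the processor named above; the real interface is not shown. */
interface KafkaMessageProcessor {
    void process(int partition, String message);
}

/** Routes each partition's messages to its own single-threaded executor. */
final class PerPartitionDispatcher {
    private final Map<Integer, ExecutorService> workers = new ConcurrentHashMap<>();
    private final KafkaMessageProcessor processor;

    PerPartitionDispatcher(KafkaMessageProcessor processor) {
        this.processor = processor;
    }

    /** Called by the polling thread; per-partition order is kept by the single worker. */
    void dispatch(int partition, String message) {
        workers.computeIfAbsent(partition, p -> Executors.newSingleThreadExecutor())
               .execute(() -> processor.process(partition, message));
    }

    /** Stops accepting work and waits for queued messages, e.g. before a rebalance. */
    boolean shutdownAndAwait(long timeoutMillis) {
        for (ExecutorService worker : workers.values()) {
            worker.shutdown();
        }
        boolean allDone = true;
        try {
            for (ExecutorService worker : workers.values()) {
                allDone &= worker.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return allDone;
    }
}
```

With this layout messages from different partitions are processed in parallel while each partition stays ordered. The cost is that offsets can no longer be committed blindly: with autocommit enabled, an offset may be committed before its worker has finished, which is one reason the autocommit and rebalancing items appear in the same list.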
configs, boolean isKey);
    public T deserialize(String topic, byte[] data);
    public void close();
}

public interface Serializer<T> {
    public void configure(Map<String, ?> configs, boolean isKey);
    public byte[] serialize(String topic, T data);
    public void close();
}
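A pair of implementations for these interfaces could look like the sketch below. The two interfaces are redeclared locally as stand-ins so the example compiles without the Kafka client library; in a real project the classes would implement the library's own interfaces. `Utf8StringSerializer` and `Utf8StringDeserializer` are hypothetical names chosen for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;

// Local stand-ins mirroring the interfaces shown above (assumption: same method shapes).
interface Serializer<T> {
    void configure(Map<String, ?> configs, boolean isKey);
    byte[] serialize(String topic, T data);
    void close();
}

interface Deserializer<T> {
    void configure(Map<String, ?> configs, boolean isKey);
    T deserialize(String topic, byte[] data);
    void close();
}

/** Encodes strings as UTF-8 bytes; null passes through unchanged. */
final class Utf8StringSerializer implements Serializer<String> {
    @Override
    public void configure(Map<String, ?> configs, boolean isKey) {
        // Nothing to configure in this sketch.
    }

    @Override
    public byte[] serialize(String topic, String data) {
        return data == null ? null : data.getBytes(StandardCharsets.UTF_8);
    }

    @Override
    public void close() {
        // No resources held.
    }
}

/** Decodes UTF-8 bytes back into strings; null passes through unchanged. */
final class Utf8StringDeserializer implements Deserializer<String> {
    @Override
    public void configure(Map<String, ?> configs, boolean isKey) {
        // Nothing to configure in this sketch.
    }

    @Override
    public String deserialize(String topic, byte[] data) {
        return data == null ? null : new String(data, StandardCharsets.UTF_8);
    }

    @Override
    public void close() {
        // No resources held.
    }
}
```

The `topic` argument is unused here. It is available for cases where the wire format depends on the topic, for example when a schema is looked up per topic.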
features (standard protocols, TTLs, in-memory)
   c. DLQ
   d. Extension points
2. Cons
   a. Delivery guarantees
   b. Losing messages under high load
   c. Failure-handling scenarios
   d. Throughput in transactional mode
Rich delivery guarantee models
   c. Disk persistence
2. Weaknesses
   a. Need for ZooKeeper
   b. Lack of any kind of backpressure
   c. Lack of useful features other queues have
   d. Lack of any kind of DLQ
   e. Limited number of extension points
   f. Complex internal protocols
   g. Overly smart clients