
Kafka in Production

Presented at Java Professionals Meetup #15 (Minsk, Belarus)

Andrey Panasyuk

June 01, 2017



Transcript

  1. Remote Call Types
     1. Synchronous calls
     2. Asynchronous calls
     Limitations:
     1. Peer-to-peer
     2. Retries
     3. Load balancing
     4. Durability
     5. Backpressure
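The two call styles from the slide can be sketched in plain Java with `CompletableFuture` (the class and method names here are illustrative, not from the talk):

```java
import java.util.concurrent.CompletableFuture;

public class CallTypes {
    // A synchronous call: the caller blocks until the result is ready.
    static String syncCall() {
        return "pong";
    }

    // An asynchronous call: the caller receives a future immediately and
    // keeps working; the result is produced later on another thread.
    static CompletableFuture<String> asyncCall() {
        return CompletableFuture.supplyAsync(CallTypes::syncCall);
    }

    public static void main(String[] args) {
        String direct = syncCall();                 // blocks the caller
        String deferred = asyncCall().join();       // blocks only at join()
        System.out.println(direct + " " + deferred);
    }
}
```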
  2. Apache Kafka
     Apache Kafka is an open-source stream-processing platform developed by the Apache Software Foundation, written in Scala and Java. The project aims to provide a unified, high-throughput, low-latency platform for handling real-time data feeds. Its storage layer is essentially a "massively scalable pub/sub message queue architected as a distributed transaction log". — Wikipedia
  3. Kafka. Controller
     1. One of the brokers
     2. Manages the state of partitions
     3. Manages the state of replicas
     4. Partition manipulations
     5. High availability
  4. Kafka + ZooKeeper
     1. Cluster membership
     2. Leader election
     3. Topic configuration
     4. Offsets for a Group/Topic/Partition combination
  5. Kafka. Guarantees
     1. Delivery guarantees
        a. At least once (the default)
        b. At most once
        c. Exactly once
     2. Fault tolerance vs. latency
        a. No acks
        b. Acks from the leader
        c. Acks from followers
     3. Message order within a single partition
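The fault-tolerance vs. latency trade-offs above map to the producer's `acks` setting in the standard client configuration. A minimal sketch, assuming the stock kafka-clients property names (the broker address is a placeholder):

```java
import java.util.Properties;

public class AckModes {
    // Builds producer settings for the three trade-offs on the slide:
    //   "0"   - no ack: lowest latency, messages may be silently lost
    //   "1"   - ack from the partition leader only
    //   "all" - ack only after all in-sync replicas have the record
    static Properties producerConfig(String acks) {
        Properties p = new Properties();
        p.setProperty("bootstrap.servers", "broker1:9092"); // placeholder address
        p.setProperty("acks", acks);
        return p;
    }

    public static void main(String[] args) {
        System.out.println(producerConfig("all").getProperty("acks"));
    }
}
```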
  6. Kafka. Adding a Broker
     1. Adds the new machine to the ISR
     2. Starts rebalancing partitions (if automatic rebalancing is enabled)
        a. Too many partitions can cause issues
     3. Notifies consumers
     4. Notifies producers
  7. Kafka. Failure Scenarios
     1. In-sync replicas (ISR)
     2. Leader election
     3. CAP
        a. Partition tolerance
        b. Availability
        c. Consistency*
  8. Kafka. Producer

     Properties properties = new Properties();
     properties.setProperty("bootstrap.servers", brokers);
     properties.setProperty("key.serializer",
         "org.apache.kafka.common.serialization.StringSerializer");
     properties.setProperty("value.serializer",
         "org.apache.kafka.common.serialization.StringSerializer");

     KafkaProducer<String, String> producer = new KafkaProducer<>(properties);
     ProducerRecord<String, String> data = new ProducerRecord<>("sync", userId, steps);
     producer.send(data);
  9. Kafka. Consumer

     Properties properties = new Properties();
     properties.setProperty("bootstrap.servers", brokers);
     properties.setProperty("key.deserializer",
         "org.apache.kafka.common.serialization.StringDeserializer");
     properties.setProperty("value.deserializer",
         "org.apache.kafka.common.serialization.StringDeserializer");
     properties.setProperty("group.id", groupId);

     KafkaConsumer<String, String> consumer = new KafkaConsumer<>(properties);
     consumer.subscribe(Collections.singletonList("sync"));
     while (true) {
         consumer.poll(100)
             .forEach(r -> System.out.println(r.key() + ": " + r.value()));
     }
  10. Kafka. Real-world Consumers
      1. Metrics
      2. Invalid message queue
      3. Separating message processing into a KafkaMessageProcessor
      4. Different implementations
         a. One thread for all partitions vs. one thread per partition
         b. Autocommit
         c. Poll periods
         d. Batch support
         e. Rebalancing considerations
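The "one thread per partition" option can be sketched with plain Java executors: a single-thread executor per partition keeps per-partition ordering while partitions are processed concurrently. The record type below is a stand-in for illustration, not the real Kafka `ConsumerRecord`:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class PerPartitionDispatch {
    // Stand-in for a consumed record: partition number plus payload.
    record Rec(int partition, String value) {}

    // One single-thread executor per partition: records from the same
    // partition are processed in submission order; different partitions
    // run concurrently on separate threads.
    static Map<Integer, List<String>> dispatch(List<Rec> records, int partitions) {
        ExecutorService[] workers = new ExecutorService[partitions];
        Map<Integer, List<String>> processed = new ConcurrentHashMap<>();
        for (int i = 0; i < partitions; i++) {
            workers[i] = Executors.newSingleThreadExecutor();
            processed.put(i, new ArrayList<>());
        }
        for (Rec r : records) {
            // The same partition always lands on the same worker thread.
            workers[r.partition()].submit(() -> processed.get(r.partition()).add(r.value()));
        }
        for (ExecutorService w : workers) {
            w.shutdown();
            try {
                w.awaitTermination(5, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        List<Rec> batch = List.of(new Rec(0, "a"), new Rec(1, "x"), new Rec(0, "b"));
        System.out.println(dispatch(batch, 2));
    }
}
```

The trade-off against a single consumer thread is exactly the one on the slide: per-partition threads raise throughput but make commit handling and rebalancing harder to reason about.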
  11. Kafka. Serialization

      public interface Deserializer<T> {
          public void configure(Map<String, ?> configs, boolean isKey);
          public T deserialize(String topic, byte[] data);
          public void close();
      }

      public interface Serializer<T> {
          public void configure(Map<String, ?> configs, boolean isKey);
          public byte[] serialize(String topic, T data);
          public void close();
      }
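A minimal String implementation of the two interfaces from the slide, self-contained here by restating the interfaces (in real code they come from `org.apache.kafka.common.serialization`):

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;

public class StringSerde {
    // The two interfaces as shown on the slide.
    interface Serializer<T> {
        void configure(Map<String, ?> configs, boolean isKey);
        byte[] serialize(String topic, T data);
        void close();
    }

    interface Deserializer<T> {
        void configure(Map<String, ?> configs, boolean isKey);
        T deserialize(String topic, byte[] data);
        void close();
    }

    // Minimal UTF-8 implementations; null passes through untouched.
    static class StringSerializer implements Serializer<String> {
        public void configure(Map<String, ?> configs, boolean isKey) {}
        public byte[] serialize(String topic, String data) {
            return data == null ? null : data.getBytes(StandardCharsets.UTF_8);
        }
        public void close() {}
    }

    static class StringDeserializer implements Deserializer<String> {
        public void configure(Map<String, ?> configs, boolean isKey) {}
        public String deserialize(String topic, byte[] data) {
            return data == null ? null : new String(data, StandardCharsets.UTF_8);
        }
        public void close() {}
    }

    public static void main(String[] args) {
        byte[] bytes = new StringSerializer().serialize("sync", "42 steps");
        System.out.println(new StringDeserializer().deserialize("sync", bytes));
    }
}
```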
  12. Kafka. Consumer Failure
      1. Wait for the ZooKeeper timeout
      2. The controller processes the event from ZooKeeper
      3. The controller notifies consumers
      4. Consumers select a new partition consumer
  13. Kafka. Practices
      1. Topics are created manually on production, automatically on QA environments
      2. Do not delete topics (KAFKA-1397, KAFKA-2937, KAFKA-4834, ...)
      3. IMQ (invalid message queue) implementation
      4. Use identical versions on all brokers
  14. Kafka. Tuning
      1. 20-100 brokers per cluster; hard limit of 10,000 partitions per cluster (Netflix)
      2. Increase replica.lag.time.max.ms and replica.lag.max.messages
      3. Increase num.replica.fetchers
      4. Reduce retention
      5. Increase rebalance.max.retries and rebalance.backoff.ms
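The broker-side settings above live in server.properties; a sketch with illustrative values (these are placeholders, not recommendations from the talk, and availability of each key depends on the broker version):

```properties
# Give slow followers more headroom before they drop out of the ISR
replica.lag.time.max.ms=30000
replica.lag.max.messages=8000

# More fetcher threads to speed up follower replication
num.replica.fetchers=4

# Shorter retention to bound disk usage
log.retention.hours=48
```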
  15. Monitoring and Alerting
      1. Consumer metrics
      2. Producer metrics
      3. Kafka broker metrics
      4. ZooKeeper metrics
      5. PagerDuty alerts
  16. Kafka. Extension Points
      • Storages
        ◦ Amazon S3 (Sink)
        ◦ Files (Source)
        ◦ Elasticsearch (Sink)
        ◦ HDFS (Sink)
        ◦ JDBC (Source, Sink)
        ◦ C* (Sink)
        ◦ PostgreSQL (Sink)
        ◦ Oracle/MySQL/MSSQL (Sink)
        ◦ Vertica (Source, Sink)
        ◦ Ignite (Source, Sink)
      • Protocols/Queues
        ◦ MQTT (Source)
        ◦ SQS (Source)
        ◦ JMS (Sink)
        ◦ RabbitMQ (Source)
      • Others
        ◦ Mixpanel (Sink)
  17. Alternatives. ActiveMQ
      1. Pros
         a. Simplicity
         b. Much richer feature set (standard protocols, TTLs, in-memory)
         c. DLQ
         d. Extension points
      2. Cons
         a. Delivery guarantees
         b. Losing messages under high load
         c. Failure-handling scenarios
         d. Throughput in transactional mode
  18. Alternatives. RabbitMQ
      • Pros
        ◦ Simpler to start with
        ◦ More features
          ▪ Ability to query/filter
          ▪ Federated queues
          ▪ Sophisticated routing
        ◦ Plugins
      • Cons
        ◦ Scales mostly vertically
        ◦ Assumes consumers are mostly online
        ◦ Less rich delivery guarantees
  19. Kafka. Strengths and Weaknesses
      1. Strengths
         a. Horizontal scalability
         b. Rich delivery-guarantee models
         c. Disk persistence
      2. Weaknesses
         a. Need for ZooKeeper
         b. Lack of any kind of backpressure
         c. Lack of useful features other queues have
         d. Lack of any kind of DLQ
         e. Limited number of extension points
         f. Complex internal protocols
         g. Too-smart clients