
Kafka in Production

Presented at Java Professionals Meetup #15 (Minsk, Belarus)

Andrey Panasyuk

June 01, 2017



Transcript

  1. Remote Call Types
     1. Synchronous calls
     2. Asynchronous calls
     Limitations:
     1. Peer-to-peer
     2. Retries
     3. Load balancing
     4. Durability
     5. Backpressure
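The two call styles from the slide can be sketched in plain Java with `CompletableFuture` (the class and method names here are illustrative, not from the talk):

```java
import java.util.concurrent.CompletableFuture;

public class CallTypes {
    // A synchronous call: the caller blocks until the result is ready.
    static String syncCall() {
        return "pong";
    }

    // An asynchronous call: the caller receives a future immediately and
    // keeps working; the result is produced later on another thread.
    static CompletableFuture<String> asyncCall() {
        return CompletableFuture.supplyAsync(CallTypes::syncCall);
    }

    public static void main(String[] args) {
        String direct = syncCall();                 // blocks the caller
        String deferred = asyncCall().join();       // blocks only at join()
        System.out.println(direct + " " + deferred);
    }
}
```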
  2. Apache Kafka
     Apache Kafka is an open-source stream-processing platform developed by the Apache Software Foundation, written in Scala and Java. The project aims to provide a unified, high-throughput, low-latency platform for handling real-time data feeds. Its storage layer is essentially a "massively scalable pub/sub message queue architected as a distributed transaction log". — Wikipedia
  3. Kafka. Controller
     1. One of the brokers
     2. Manages the state of partitions
     3. Manages the state of replicas
     4. Partition manipulations
     5. High availability
  4. Kafka + ZooKeeper
     1. Cluster membership
     2. Leader election
     3. Topic configuration
     4. Offsets for a Group/Topic/Partition combination
  5. Kafka. Guarantees
     1. Delivery guarantees
        a. At least once (the default)
        b. At most once
        c. Exactly once
     2. Fault tolerance vs. latency
        a. No acks
        b. Acks from the leader
        c. Acks from followers
     3. Message order within a single partition
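The fault-tolerance vs. latency trade-offs above map to the producer's `acks` setting in the standard client configuration. A minimal sketch, assuming the stock kafka-clients property names (the broker address is a placeholder):

```java
import java.util.Properties;

public class AckModes {
    // Builds producer settings for the three trade-offs on the slide:
    //   "0"   - no ack: lowest latency, messages may be silently lost
    //   "1"   - ack from the partition leader only
    //   "all" - ack only after all in-sync replicas have the record
    static Properties producerConfig(String acks) {
        Properties p = new Properties();
        p.setProperty("bootstrap.servers", "broker1:9092"); // placeholder address
        p.setProperty("acks", acks);
        return p;
    }

    public static void main(String[] args) {
        System.out.println(producerConfig("all").getProperty("acks"));
    }
}
```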
  6. Kafka. Adding a Broker
     1. Adds the new machine to the ISR
     2. Starts rebalancing partitions (if automatic rebalancing is enabled)
        a. Too many partitions can cause issues
     3. Notifies consumers
     4. Notifies producers
  7. Kafka. Failure Scenarios
     1. In-sync replicas (ISR)
     2. Leader election
     3. CAP
        a. Partition tolerance
        b. Availability
        c. Consistency*
  8. Kafka. Producer

     Properties properties = new Properties();
     properties.setProperty("bootstrap.servers", brokers);
     properties.setProperty("key.serializer",
         "org.apache.kafka.common.serialization.StringSerializer");
     properties.setProperty("value.serializer",
         "org.apache.kafka.common.serialization.StringSerializer");

     KafkaProducer<String, String> producer = new KafkaProducer<>(properties);
     ProducerRecord<String, String> data = new ProducerRecord<>("sync", userId, steps);
     producer.send(data);
  9. Kafka. Consumer

     Properties properties = new Properties();
     properties.setProperty("bootstrap.servers", brokers);
     properties.setProperty("key.deserializer",
         "org.apache.kafka.common.serialization.StringDeserializer");
     properties.setProperty("value.deserializer",
         "org.apache.kafka.common.serialization.StringDeserializer");
     properties.setProperty("group.id", groupId);

     KafkaConsumer<String, String> consumer = new KafkaConsumer<>(properties);
     consumer.subscribe(Collections.singletonList("sync"));
     while (true) {
         consumer.poll(100)
             .forEach(r -> System.out.println(r.key() + ": " + r.value()));
     }
  10. Kafka. Real-world Consumers
      1. Metrics
      2. Invalid message queue
      3. Separating message processing into a KafkaMessageProcessor
      4. Different implementations
         a. One thread for all partitions vs. one thread per partition
         b. Autocommit
         c. Poll periods
         d. Batch support
         e. Rebalancing considerations
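The "one thread per partition" option can be sketched with plain Java executors: a single-thread executor per partition keeps per-partition ordering while partitions are processed concurrently. The record type below is a stand-in for illustration, not the real Kafka `ConsumerRecord`:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class PerPartitionDispatch {
    // Stand-in for a consumed record: partition number plus payload.
    record Rec(int partition, String value) {}

    // One single-thread executor per partition: records from the same
    // partition are processed in submission order; different partitions
    // run concurrently on separate threads.
    static Map<Integer, List<String>> dispatch(List<Rec> records, int partitions) {
        ExecutorService[] workers = new ExecutorService[partitions];
        Map<Integer, List<String>> processed = new ConcurrentHashMap<>();
        for (int i = 0; i < partitions; i++) {
            workers[i] = Executors.newSingleThreadExecutor();
            processed.put(i, new ArrayList<>());
        }
        for (Rec r : records) {
            // The same partition always lands on the same worker thread.
            workers[r.partition()].submit(() -> processed.get(r.partition()).add(r.value()));
        }
        for (ExecutorService w : workers) {
            w.shutdown();
            try {
                w.awaitTermination(5, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        List<Rec> batch = List.of(new Rec(0, "a"), new Rec(1, "x"), new Rec(0, "b"));
        System.out.println(dispatch(batch, 2));
    }
}
```

The trade-off against a single consumer thread is exactly the one on the slide: per-partition threads raise throughput but make commit handling and rebalancing harder to reason about.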
  11. Kafka. Serialization

      public interface Deserializer<T> {
          public void configure(Map<String, ?> configs, boolean isKey);
          public T deserialize(String topic, byte[] data);
          public void close();
      }

      public interface Serializer<T> {
          public void configure(Map<String, ?> configs, boolean isKey);
          public byte[] serialize(String topic, T data);
          public void close();
      }
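A minimal String implementation of the two interfaces from the slide, self-contained here by restating the interfaces (in real code they come from `org.apache.kafka.common.serialization`):

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;

public class StringSerde {
    // The two interfaces as shown on the slide.
    interface Serializer<T> {
        void configure(Map<String, ?> configs, boolean isKey);
        byte[] serialize(String topic, T data);
        void close();
    }

    interface Deserializer<T> {
        void configure(Map<String, ?> configs, boolean isKey);
        T deserialize(String topic, byte[] data);
        void close();
    }

    // Minimal UTF-8 implementations; null passes through untouched.
    static class StringSerializer implements Serializer<String> {
        public void configure(Map<String, ?> configs, boolean isKey) {}
        public byte[] serialize(String topic, String data) {
            return data == null ? null : data.getBytes(StandardCharsets.UTF_8);
        }
        public void close() {}
    }

    static class StringDeserializer implements Deserializer<String> {
        public void configure(Map<String, ?> configs, boolean isKey) {}
        public String deserialize(String topic, byte[] data) {
            return data == null ? null : new String(data, StandardCharsets.UTF_8);
        }
        public void close() {}
    }

    public static void main(String[] args) {
        byte[] bytes = new StringSerializer().serialize("sync", "42 steps");
        System.out.println(new StringDeserializer().deserialize("sync", bytes));
    }
}
```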
  12. Kafka. Consumer Failure
      1. Wait for the ZooKeeper timeout
      2. The controller processes the event from ZooKeeper
      3. The controller notifies consumers
      4. Consumers select a new partition consumer
  13. Kafka. Practices
      1. Topics are created manually on production, automatically on QA environments
      2. Do not delete topics (KAFKA-1397, KAFKA-2937, KAFKA-4834, ...)
      3. IMQ (invalid message queue) implementation
      4. Use identical versions on all brokers
  14. Kafka. Tuning
      1. 20-100 brokers per cluster; hard limit of 10,000 partitions per cluster (Netflix)
      2. Increase replica.lag.time.max.ms and replica.lag.max.messages
      3. Increase num.replica.fetchers
      4. Reduce retention
      5. Increase rebalance.max.retries and rebalance.backoff.ms
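The broker-side settings above live in server.properties; a sketch with illustrative values (these are placeholders, not recommendations from the talk, and availability of each key depends on the broker version):

```properties
# Give slow followers more headroom before they drop out of the ISR
replica.lag.time.max.ms=30000
replica.lag.max.messages=8000

# More fetcher threads to speed up follower replication
num.replica.fetchers=4

# Shorter retention to bound disk usage
log.retention.hours=48
```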
  15. Monitoring and Alerting
      1. Consumer metrics
      2. Producer metrics
      3. Kafka broker metrics
      4. ZooKeeper metrics
      5. PagerDuty alerts
  16. Kafka. Extension Points
      • Storages
        ◦ Amazon S3 (Sink)
        ◦ Files (Source)
        ◦ Elasticsearch (Sink)
        ◦ HDFS (Sink)
        ◦ JDBC (Source, Sink)
        ◦ C* (Sink)
        ◦ PostgreSQL (Sink)
        ◦ Oracle/MySQL/MSSQL (Sink)
        ◦ Vertica (Source, Sink)
        ◦ Ignite (Source, Sink)
      • Protocols/Queues
        ◦ MQTT (Source)
        ◦ SQS (Source)
        ◦ JMS (Sink)
        ◦ RabbitMQ (Source)
      • Others
        ◦ Mixpanel (Sink)
  17. Alternatives. ActiveMQ
      1. Pros
         a. Simplicity
         b. Much richer feature set (standard protocols, TTLs, in-memory)
         c. DLQ
         d. Extension points
      2. Cons
         a. Delivery guarantees
         b. Losing messages under high load
         c. Failure-handling scenarios
         d. Throughput in transactional mode
  18. Alternatives. RabbitMQ
      • Pros
        ◦ Simpler to start with
        ◦ More features
          ▪ Ability to query/filter
          ▪ Federated queues
          ▪ Sophisticated routing
        ◦ Plugins
      • Cons
        ◦ Scales mostly vertically
        ◦ Assumes consumers are mostly online
        ◦ Less rich delivery guarantees
  19. Kafka. Strengths and Weaknesses
      1. Strengths
         a. Horizontal scalability
         b. Rich delivery-guarantee models
         c. Disk persistence
      2. Weaknesses
         a. Need for ZooKeeper
         b. Lack of any kind of backpressure
         c. Lack of useful features other queues have
         d. Lack of any kind of DLQ
         e. Limited number of extension points
         f. Complex internal protocols
         g. Too-smart clients