- Fundamental concepts of Kafka
- Kafka producers
- Kafka consumers and consumer group
- Consumer group flow (if we have time)
- Introduction to Kafka protocols (if we have time)
main product is an HR platform.
- It started in 2012 as a simple Ruby on Rails application developed by a few developers.
[Diagram: a single Ruby on Rails application containing Feature A, Feature B and Feature C]
The problems ...
- Now there are > 100 employees, > 30 developers, and multiple millions of dollars in funding.
- It has become a really huge system, consisting of hundreds of modules, complicated frontend stacks and 2 mobile applications.
- Finally, we started to follow the microservice path in 2017.
[Diagram: frontend stacks (React, jQuery, Backbone ?!) on top of Feature A, Feature B and Feature C, backed by Ruby on Rails, Grape, Sidekiq ...]
audits
- Each audit must go through a data pipeline:
  + Persistent storage
  + Full-text search indexing
  + Government reporting
- There is too much work in a single request
as a message queue, lets us publish and subscribe to streams of records
- Allows processing the record stream in real time
- Able to connect to external systems for importing / exporting
concept of Topic.
- Each topic has many Partitions.
- Each partition is a list of durable messages.
- When a message is sent to Kafka under a topic, it is “sharded” to one partition of that topic.
- The partition a message is assigned to is decided by the producer.
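The producer-side partition choice can be sketched as follows. This is a minimal illustration, not the actual client code: the function name `choose_partition` and the sample keys are made up for this example, and CRC32 stands in for the hash (the official Java client hashes keys with murmur2 instead).

```python
import itertools
import zlib
from typing import Iterator, Optional


def choose_partition(key: Optional[bytes], num_partitions: int,
                     round_robin: Iterator[int]) -> int:
    """Pick a partition on the producer side (illustrative only)."""
    if num_partitions <= 0:
        raise ValueError("a topic needs at least one partition")
    if key is None:
        # No key: spread messages evenly over the partitions.
        return next(round_robin) % num_partitions
    # Keyed message: the same key always maps to the same partition,
    # so all messages for one key stay in order.
    # Assumption: CRC32 is a stand-in hash, not what the Java client uses.
    return zlib.crc32(key) % num_partitions


counter = itertools.count()
for key in [b"user-a", b"user-b", b"user-a", None, None]:
    print(key, "->", choose_partition(key, 3, counter))
```

Keying by something like a user ID is what keeps one user's events inside a single partition, which matters once ordering within a partition is relied on.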
multiple machines. Each machine is called a Broker.
- Each broker can host 0, 1 or many partitions of the same topic, as well as partitions of different topics.
disk
- Kafka supports Replication to ensure high availability and fault tolerance.
- Each partition can have many replicas, depending on the replication factor.
- The replicas of a partition are placed on different brokers.
of the replicas is elected to become the new leader of the partition.
- When the failed partition comes back, it becomes a replica and fetches the missing data from the new leader.
- All of this leader-replica mechanism is coordinated through Apache ZooKeeper.
At the beginning, the producers fetch the metadata:
  + List of brokers
  + Topics of interest and their partitions and replicas
- They interact directly with the relevant brokers
- There is no centralized coordinator
Just like producers, the consumers start their operations by fetching the metadata.
- Each consumer is able to connect to multiple brokers and, by default, reads each partition from its leader.
- Each broker serves a set of partitions, from the topics the consumer is interested in, at once
concept of Consumer Group is introduced
- Each consumer belongs to a Consumer Group
- Each message is broadcast to all the groups
- Each group member exclusively handles the messages from a partition
messages from more than 1 partition.
- Guarantees that all partitions are covered
- Guarantees the message order within a partition
- The members of the group decide among themselves how to distribute the messages.
- Sometimes, Kafka is described as “dumb broker, smart consumer”
of the same group
- Want to scale? Add brokers, add partitions and add consumers
- Rule of thumb:
  + Number of consumers <= Number of partitions
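The rule of thumb follows from how partitions are divided inside a group. The sketch below is a simplified range-style assignment written for illustration; `assign_partitions` and the consumer names are hypothetical, not a client library API.

```python
from typing import Dict, List


def assign_partitions(partitions: List[int],
                      consumers: List[str]) -> Dict[str, List[int]]:
    """Give each consumer a contiguous range of partitions."""
    members = sorted(consumers)
    assignment: Dict[str, List[int]] = {m: [] for m in members}
    if not members:
        return assignment
    ordered = sorted(partitions)
    # Every member gets `base` partitions; the first `extra` members get one more.
    base, extra = divmod(len(ordered), len(members))
    start = 0
    for index, member in enumerate(members):
        count = base + (1 if index < extra else 0)
        assignment[member] = ordered[start:start + count]
        start += count
    return assignment


# Four partitions, five consumers: the last consumer has nothing to read.
print(assign_partitions([0, 1, 2, 3], ["c1", "c2", "c3", "c4", "c5"]))
```

Every partition is covered by exactly one member, and any consumer beyond the partition count ends up with an empty list, which is why adding consumers past the number of partitions does not add throughput.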
Example events published by the main app:
- User A uploads a signature
- User A agrees to the contract terms
- User A uses the signature in the contract
- Contract is marked Completed
- User A is marked Onboarded
[Diagram: the main app publishes events for User A and User B to Partitions 1, 2 and 3, which are spread with their replicas over Broker 101 and Broker 102 and read by the audit consumers]
special broker that takes care of a group, called the group coordinator
- The group coordinator is derived from the group ID rather than fixed in advance, so any broker can become the coordinator of a group
- The coordinator handles all group operations: join group, sync group, heartbeat, commit offsets, etc.
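From a client's point of view the coordinator looks arbitrary, but the brokers derive it deterministically: the group ID is hashed to a partition of the internal `__consumer_offsets` topic, and the broker leading that partition is the coordinator. The sketch below mimics that lookup under stated assumptions: the Java `String.hashCode` is re-implemented in Python, 50 is the default value of `offsets.topic.num.partitions`, and the group name and the partition-to-broker table are invented for the example.

```python
OFFSETS_TOPIC_PARTITIONS = 50  # default of offsets.topic.num.partitions


def java_string_hash(text: str) -> int:
    """Re-implementation of Java's String.hashCode (signed 32-bit result)."""
    h = 0
    data = text.encode("utf-16-be")
    for i in range(0, len(data), 2):
        unit = (data[i] << 8) | data[i + 1]   # one UTF-16 code unit
        h = (31 * h + unit) & 0xFFFFFFFF      # wrap like a Java int
    return h - 0x100000000 if h >= 0x80000000 else h


def coordinator_partition(group_id: str,
                          num_partitions: int = OFFSETS_TOPIC_PARTITIONS) -> int:
    """Partition of __consumer_offsets that owns this group."""
    h = java_string_hash(group_id)
    # abs() with the Integer.MIN_VALUE corner case mapped to 0.
    non_negative = 0 if h == -0x80000000 else abs(h)
    return non_negative % num_partitions


# Hypothetical cluster: even partitions led by broker 101, odd ones by 102.
leaders = {p: 101 if p % 2 == 0 else 102 for p in range(OFFSETS_TOPIC_PARTITIONS)}
partition = coordinator_partition("audit-service")
print("coordinator broker:", leaders[partition])
```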
[Diagram: the Audit service's consumers and the partitions hosted on Broker 101 and Broker 102]
1. Ask a bootstrap broker which broker is the group coordinator, using the Group Coordinator API. For example: broker 101 is the group coordinator.
3. The new consumer is blocked by the group coordinator. The coordinator waits for the “other” participants. Typically, it waits until all existing group members have sent a join request or a timeout expires.
4. After the group coordinator receives the join group request, the other consumers are told about the new member through an error (“Need to re-join”) in the responses to their regular requests (heartbeat, commit offset, etc.). They are required to send a join group request again. The new consumer stays blocked in the meantime.
4. When all members have joined or the timeout expires, the group coordinator releases the block and returns a response to each member.
5. A lucky member is chosen to become this generation’s group leader. Its response carries the list of group members and each member’s metadata.
6. The group leader assigns the workload to each member based on the members’ metadata. Other members don’t have to do this task.
7. All members then send a sync group request. Like the join group request, sync group is a blocking request. The leader’s request carries the member assignments.
8. Each member receives the sync group response. This response includes the member’s current assignment.
9. Finally, each consumer subscribes to the partitions it is assigned. The new consumer becomes a group member.
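The join and sync steps can be replayed with a toy, single-threaded model. Everything here is an illustrative assumption: the `GroupCoordinator` class, its method names, the member IDs and the `audits` topic are invented, blocking is replaced by an explicit `complete_join()` call, and the leader uses a plain round-robin plan.

```python
from typing import Dict, List


class GroupCoordinator:
    """Toy stand-in for the broker-side group coordinator."""

    def __init__(self) -> None:
        self.generation = 0
        self.pending: Dict[str, List[str]] = {}      # member id -> subscribed topics
        self.assignments: Dict[str, List[int]] = {}  # member id -> partitions

    def join_group(self, member_id: str, topics: List[str]) -> None:
        # A real coordinator holds the response until the rebalance completes.
        self.pending[member_id] = topics

    def complete_join(self) -> Dict[str, dict]:
        # Called once every member has (re)joined or the timeout has expired.
        self.generation += 1
        members = dict(self.pending)
        leader = next(iter(members))  # here: the first joiner
        return {
            member_id: {
                "generation": self.generation,
                "leader": leader,
                # Only the leader receives the member list and metadata.
                "members": members if member_id == leader else {},
            }
            for member_id in members
        }

    def sync_group(self, member_id: str, plan: Dict[str, List[int]]) -> None:
        # Followers send an empty plan; the leader's request carries the real one.
        if plan:
            self.assignments = plan

    def sync_response(self, member_id: str) -> List[int]:
        return self.assignments.get(member_id, [])


def leader_assign(members: Dict[str, List[str]],
                  partitions: List[int]) -> Dict[str, List[int]]:
    """Round-robin plan computed by the group leader only."""
    ids = sorted(members)
    plan: Dict[str, List[int]] = {m: [] for m in ids}
    for index, partition in enumerate(sorted(partitions)):
        plan[ids[index % len(ids)]].append(partition)
    return plan


coordinator = GroupCoordinator()
for member in ["consumer-1", "consumer-2", "consumer-3", "new-consumer"]:
    coordinator.join_group(member, ["audits"])
responses = coordinator.complete_join()
leader = responses["new-consumer"]["leader"]
plan = leader_assign(responses[leader]["members"], [1, 2, 3, 4])
for member in responses:
    coordinator.sync_group(member, plan if member == leader else {})
for member in responses:
    print(member, "->", coordinator.sync_response(member))
```

The point of the split is that the coordinator only relays data: the assignment logic lives in the client acting as leader, so it can change without touching the brokers.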
clients
- It implements its own binary protocol over TCP
- The protocol follows the request-response model
- There are about 20 APIs in the newest version
- Each API has its own version, and Kafka ensures backward compatibility
has a type
- Primitive types: int8, int16, int32, int64
- Composed types:
  + string: [size in int16][string]
  + bytes: [size in int32][bytes]
- Arrays are supported: [size in int32][e1][e2]...
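These layouts map directly onto a few lines of encoding code. The sketch below uses Python's `struct` module with big-endian (network) byte order, which is what the Kafka protocol uses; the helper names are chosen for this example only.

```python
import struct
from typing import Callable, List


def encode_int16(value: int) -> bytes:
    return struct.pack(">h", value)   # 2 bytes, big-endian, signed


def encode_int32(value: int) -> bytes:
    return struct.pack(">i", value)   # 4 bytes, big-endian, signed


def encode_string(text: str) -> bytes:
    data = text.encode("utf-8")
    return encode_int16(len(data)) + data        # [size in int16][string]


def encode_bytes(data: bytes) -> bytes:
    return encode_int32(len(data)) + data        # [size in int32][bytes]


def encode_array(items: List[str],
                 encode_item: Callable[[str], bytes]) -> bytes:
    # [size in int32][e1][e2]...
    return encode_int32(len(items)) + b"".join(encode_item(i) for i in items)


print(encode_string("ab").hex())
print(encode_array(["a", "bc"], encode_string).hex())
```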
(int16)
- Correlation Id (int32)
- ClientId (string)
Notes:
- Each API has a numeric API key
- Each API has a specific version, which defines the body’s structure
- The Correlation Id of a response is the same as the Correlation ID in the request
TopicMetadataRequest: [Number of topics (int32)][Topic 1 (string)][Topic 2 (string)]
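Putting the header and the body together, a version 0 metadata request can be framed as below. This is a sketch under assumptions taken from the public protocol documentation rather than from the slides: the Metadata API key is 3, the header is API key (int16), API version (int16), Correlation Id (int32) and ClientId (string), and each request is prefixed with its size as an int32. The function names, the `demo` client ID and the `audits` topic are hypothetical.

```python
import struct
from typing import List

METADATA_API_KEY = 3   # numeric key of the Metadata API
METADATA_VERSION = 0   # version 0 body: an array of topic names


def encode_string(text: str) -> bytes:
    data = text.encode("utf-8")
    return struct.pack(">h", len(data)) + data


def build_metadata_request(correlation_id: int, client_id: str,
                           topics: List[str]) -> bytes:
    header = struct.pack(">hhi", METADATA_API_KEY, METADATA_VERSION,
                         correlation_id) + encode_string(client_id)
    body = struct.pack(">i", len(topics)) + b"".join(
        encode_string(topic) for topic in topics)
    payload = header + body
    # Every request on the wire starts with its own size as an int32.
    return struct.pack(">i", len(payload)) + payload


def read_correlation_id(response: bytes) -> int:
    """A response starts with [size int32][correlation id int32]."""
    _size, correlation_id = struct.unpack(">ii", response[:8])
    return correlation_id


frame = build_metadata_request(correlation_id=1, client_id="demo",
                               topics=["audits"])
print(frame.hex())
```

Because a client may have several requests in flight on one connection, matching the Correlation Id of each response against the pending requests is how it knows which answer belongs to which call.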
and crazily scalable.
- It is not easy to use.
- The client libraries are just tools. They don’t solve all of our problems.
- Therefore, it is worth understanding the underlying mechanics to achieve more with Kafka.