Slide 2

Slide 2 text

Going big with Apache Kafka Nikolay Stoitsev - Sr. Software Engineer @ Uber

Slide 3

Slide 3 text

Kafka Cluster

Slide 4

Slide 4 text

Broker Broker Broker

Slide 5

Slide 5 text

Topic Message Message Message Message Message Message

Slide 6

Slide 6 text

Partition - ordered, immutable sequence of messages (Partition 0, Partition 1, Partition 2)

Slide 7

Slide 7 text

Offset Message Message Message Message Message Partition 0 Offset: 1

Slide 8

Slide 8 text

Offset Message Message Message Message Message Partition 0 Offset: 2

Slide 9

Slide 9 text

Multi Subscriber Message Message Message Message Message Offset: 5 Offset: 3 Consumer: 2 Consumer: 1
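The partition, offset and multi-subscriber slides can be illustrated with a small in-memory model. This is only a sketch, not Kafka's implementation: the `Partition` and `Consumer` classes and the batch size of 2 are hypothetical, and offsets are zero-based as in Kafka itself.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    """Append-only log: messages are never modified once written."""
    messages: list = field(default_factory=list)

    def append(self, message: str) -> int:
        """Append a message and return the offset assigned to it."""
        self.messages.append(message)
        return len(self.messages) - 1

    def read(self, offset: int, max_messages: int = 1) -> list:
        """Return up to max_messages starting at the given offset."""
        return self.messages[offset:offset + max_messages]


@dataclass
class Consumer:
    """Each consumer tracks its own position; reading does not remove data."""
    name: str
    offset: int = 0

    def poll(self, partition: Partition) -> list:
        batch = partition.read(self.offset, max_messages=2)
        self.offset += len(batch)
        return batch


partition = Partition()
for i in range(5):
    partition.append(f"message-{i}")

consumer_1 = Consumer("consumer-1")
consumer_2 = Consumer("consumer-2")
consumer_1.poll(partition)  # consumer-1 advances to offset 2
consumer_2.poll(partition)
consumer_2.poll(partition)  # consumer-2 advances to offset 4
```

Because the log is immutable and each consumer only moves its own offset, the two consumers read the same messages at different speeds without affecting each other.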

Slide 10

Slide 10 text

Broker Broker Broker P0 P0 P0 P1 P1 P1 P2 P2 P2 Partitioned and Replicated

Slide 11

Slide 11 text

Broker Broker Broker P0 P0 P0 P1 P1 P1 P2 P2 P2 Fault Tolerant

Slide 12

Slide 12 text

Broker Broker Broker P0 P0 P0 P1 P1 P1 P2 P2 P2 Producers Producer Producer

Slide 13

Slide 13 text

Broker Broker Broker P0 P0 P0 P1 P1 P1 P2 P2 P2 Producer Producer Consumer Consumer Consumer Group 1 Consumer Consumer Consumer Consumer Group 2 Consumers

Slide 14

Slide 14 text

Broker Broker P0 P0 P1 P1 P2 P2 Producer Consumer Consumer Consumer Group 1 ZooKeeper Get Broker ID Update Offset

Slide 16

Slide 16 text

System for “collecting and delivering high volumes of log data with low latency”

Slide 17

Slide 17 text

Logging Kafka ELK Hadoop

Slide 18

Slide 18 text

Publish-Subscribe, or Kafka for interservice communication

Slide 19

Slide 19 text

Good throughput

Slide 20

Slide 20 text

Built-in partitioning, replication, and fault-tolerance

Slide 21

Slide 21 text

Durability

Slide 22

Slide 22 text

Latency vs. Durability

Slide 23

Slide 23 text

One leader for every partition Follower Leader Follower Follower Producer Consumer

Slide 24

Slide 24 text

In-sync Replicas Follower Leader Follower Follower Producer Consumer

Slide 25

Slide 25 text

In-sync Replicas Follower Leader Follower Follower Producer Consumer

Slide 26

Slide 26 text

In-sync Replicas Leader Follower Follower Producer Consumer
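The in-sync replica slides can be read as follows: a message counts as committed once every replica in the ISR has it, and a follower that falls too far behind is dropped from the ISR so that commits can keep advancing. The sketch below models that idea with plain dictionaries; the replica names and offsets are made up for illustration.

```python
def high_watermark(log_end_offsets: dict, in_sync: set) -> int:
    """Offset up to which every in-sync replica has the data (exclusive)."""
    return min(log_end_offsets[replica] for replica in in_sync)


# Log end offset = offset of the next message each replica would write.
log_end_offsets = {
    "leader": 10,
    "follower-1": 10,
    "follower-2": 8,
    "follower-3": 3,
}

# All four replicas in sync: the commit point is held back by the slowest one.
all_in_sync = {"leader", "follower-1", "follower-2", "follower-3"}
hw_before = high_watermark(log_end_offsets, all_in_sync)

# follower-3 lags too far and is removed from the ISR; the commit point advances.
shrunk_isr = all_in_sync - {"follower-3"}
hw_after = high_watermark(log_end_offsets, shrunk_isr)
```

Here the commit point moves from 3 to 8 once the slow follower leaves the ISR, which is the trade-off the next slides tune: fewer replicas to wait for means lower latency but less redundancy.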

Slide 27

Slide 27 text

Tune for lower latency
● Acknowledgement after the message is persisted on the leader
● Can lose messages on leadership changes
● At-most-once semantics

Slide 28

Slide 28 text

Tune for durability
● Acknowledgement after the message is persisted on all ISR (after it is committed)
● No data loss
● At-least-once semantics
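As a rough illustration, the two tuning profiles map onto standard Kafka settings. `acks` and `retries` are producer configs and `min.insync.replicas` is a topic or broker config; the specific values below are assumptions chosen for the example, not the settings used in the talk.

```python
# Lower latency: the leader acknowledges as soon as it has written the message
# locally. No retries, so a message lost on a leadership change stays lost
# (at-most-once).
low_latency_producer = {
    "acks": "1",
    "retries": "0",
}

# Durability: the leader acknowledges only after all in-sync replicas have the
# message. Retries can produce duplicates, hence at-least-once.
durable_producer = {
    "acks": "all",
    "retries": "5",  # illustrative value
}

# Topic-level setting that makes acks=all meaningful: writes are rejected
# unless at least this many replicas are in sync.
durable_topic = {
    "min.insync.replicas": "2",  # illustrative value
}
```

These dictionaries are plain data; they would be passed to whichever Kafka client is in use.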

Slide 29

Slide 29 text

At-most-once cluster for logging

Slide 30

Slide 30 text

At-least-once cluster for message bus

Slide 31

Slide 31 text

Kafka as a message bus Kafka Upstream Downstream

Slide 32

Slide 32 text

Failure isolation

Slide 33

Slide 33 text

Message queueing

Slide 34

Slide 34 text

Event driven architecture

Slide 35

Slide 35 text

Avro

Slide 36

Slide 36 text

Kafka Upstream Downstream Schema Registry Publish Schema Fetch Schema https://docs.confluent.io/current/schema-registry/docs/index.html
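For reference, an Avro schema is a JSON document that the upstream service publishes and the downstream service fetches. The sketch below shows what such a schema could look like; the record name, namespace and fields are hypothetical, and `missing_fields` is a hand-written helper for illustration, not a call into an Avro library or the Schema Registry.

```python
import json

# Hypothetical schema for a payment event, written as Avro schema JSON.
payment_schema = {
    "type": "record",
    "name": "PaymentRequested",
    "namespace": "com.example.payments",
    "fields": [
        {"name": "order_id", "type": "string"},
        {"name": "amount_cents", "type": "long"},
        # Optional field: a union with null and a null default.
        {"name": "note", "type": ["null", "string"], "default": None},
    ],
}


def missing_fields(schema: dict, record: dict) -> list:
    """List fields without a default that the record does not supply."""
    return [
        f["name"]
        for f in schema["fields"]
        if "default" not in f and f["name"] not in record
    ]


schema_json = json.dumps(payment_schema, indent=2)
problems = missing_fields(payment_schema, {"order_id": "order-1"})
```

Agreeing on a schema like this lets producers and consumers evolve independently, for example by adding optional fields with defaults.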

Slide 37

Slide 37 text

How to make sure something is durably stored and will be processed exactly once?

Slide 39

Slide 39 text

Process 1 OS Process 2 OS

Slide 44

Slide 44 text

Idempotency

Slide 45

Slide 45 text

Idempotency + at least once delivery Process 1 Process 2 Kafka Consumer
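A minimal sketch of the idea: if the consumer remembers which message IDs it has already handled, a redelivered message has no further effect, so at-least-once delivery behaves like exactly-once processing. The `PaymentHandler` class and the message shape are hypothetical; a real system would keep the processed IDs in durable storage and update them atomically with the side effect.

```python
class PaymentHandler:
    """Applies each message at most once, keyed by its unique ID."""

    def __init__(self):
        self.processed_ids = set()
        self.charges = []

    def handle(self, message: dict) -> bool:
        """Return True if the message was applied, False if it was a duplicate."""
        if message["id"] in self.processed_ids:
            return False
        self.charges.append(message["amount"])
        self.processed_ids.add(message["id"])
        return True


handler = PaymentHandler()
deliveries = [
    {"id": "order-1", "amount": 20},
    {"id": "order-2", "amount": 35},
    {"id": "order-1", "amount": 20},  # redelivered after a retry
]
results = [handler.handle(message) for message in deliveries]
```

The third delivery is recognised as a duplicate, so only two charges are recorded.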

Slide 46

Slide 46 text

Out-of-the-box exactly-once delivery since Kafka 0.11

Slide 47

Slide 47 text

https://www.confluent.io/blog/exactly-once-semantics-are-possible-heres-how-apache-kafka-does-it/

Slide 48

Slide 48 text

Order Service Kafka Payment Consumer Payment Provider Handling Failures

Slide 49

Slide 49 text

Order Service Kafka Payment Consumer Payment Provider Clogged processing

Slide 50

Slide 50 text

Order Service Kafka Payment Consumer Payment Provider Dead Letter Queue Payment Retry 0 Payment Retry 1 Dead Letter Queue

Slide 51

Slide 51 text

Order Service Kafka Payment Consumer Payment Provider Dead Letter Queue Payment Retry 0 Payment Retry 1 Dead Letter Queue Payment Consumer Payment Consumer
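The retry and dead letter topics can be sketched as a chain: a message that fails is republished to the next topic instead of blocking the main one, and ends up in the dead letter queue after the last retry. The code below simulates this with in-memory queues; the topic names and the `charge` function are assumptions made for the example.

```python
from collections import defaultdict, deque

# Order in which a failing message moves between topics (hypothetical names).
RETRY_CHAIN = ["payments", "payment-retry-0", "payment-retry-1", "payment-dlq"]

topics = defaultdict(deque)
charged = []


def next_topic(topic: str) -> str:
    """Topic a message is moved to when processing it from `topic` fails."""
    return RETRY_CHAIN[RETRY_CHAIN.index(topic) + 1]


def charge(message: dict) -> None:
    """Stand-in for the call to the payment provider."""
    if message["card"] == "declined":
        raise RuntimeError("payment provider rejected the request")
    charged.append(message["order"])


def consume(topic: str) -> None:
    """Drain one topic; failures are forwarded instead of blocking the queue."""
    while topics[topic]:
        message = topics[topic].popleft()
        try:
            charge(message)
        except Exception:
            topics[next_topic(topic)].append(message)


topics["payments"].append({"order": "order-1", "card": "declined"})
topics["payments"].append({"order": "order-2", "card": "ok"})

# One consumer per stage; the dead letter queue itself is not consumed here.
for stage in RETRY_CHAIN[:-1]:
    consume(stage)
```

The healthy message is processed right away, while the failing one travels through both retry topics and is parked in the dead letter queue for inspection.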

Slide 52

Slide 52 text

Multi Data Center Application

Slide 53

Slide 53 text

Regional Cluster Local Producer Regional Kafka DC1 Local Producer Regional Kafka DC2

Slide 54

Slide 54 text

Aggregated Cluster Local Producer Regional Kafka DC1 Local Producer Regional Kafka DC2 Kafka Replicator Aggregated Kafka DC3

Slide 55

Slide 55 text

https://github.com/uber/uReplicator

Slide 56

Slide 56 text

https://github.com/confluentinc/kafka-rest

Slide 57

Slide 57 text

Kafka Upstream Downstream Kafka REST Proxy Kafka REST Proxy

Slide 58

Slide 58 text

How to monitor Kafka?

Slide 59

Slide 59 text

https://github.com/uber/chaperone

Slide 60

Slide 60 text

Detect data loss, lag and duplication

Slide 61

Slide 61 text

Audit Library Regional Kafka Service Kafka REST Proxy Audit Library Audit Library

Slide 62

Slide 62 text

Chaperone Service Regional Kafka Service Kafka REST Proxy Audit Library Audit Library Chaperone Service

Slide 63

Slide 63 text

Chaperone Collector Regional Kafka Service Kafka REST Proxy Audit Library Audit Library Aggregate Kafka Chaperone Service Chaperone Service Chaperone Collector DB
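One way to picture the audit flow: each tier counts the messages it sees per time window, and the counts are compared between tiers to spot loss or duplication. The sketch below assumes 10-minute windows and uses made-up timestamps; it does not use Chaperone's actual API and does not cover lag detection.

```python
from collections import Counter

WINDOW_SECONDS = 600  # assumed window size for this sketch


def window_start(timestamp: int) -> int:
    """Start of the window that the timestamp falls into."""
    return timestamp - timestamp % WINDOW_SECONDS


def audit(timestamps: list) -> Counter:
    """Count messages per window, as an audit library at one tier might."""
    return Counter(window_start(t) for t in timestamps)


def compare(upstream: Counter, downstream: Counter) -> dict:
    """Report windows where the downstream tier saw fewer or more messages."""
    findings = {}
    for window in sorted(set(upstream) | set(downstream)):
        diff = downstream[window] - upstream[window]
        if diff < 0:
            findings[window] = f"loss of {-diff} message(s)"
        elif diff > 0:
            findings[window] = f"duplication of {diff} message(s)"
    return findings


# Message timestamps (in seconds) observed at two tiers.
seen_at_proxy = [0, 10, 20, 610, 620]
seen_at_regional = [0, 10, 610, 620, 620]

findings = compare(audit(seen_at_proxy), audit(seen_at_regional))
```

In this example the first window is short by one message at the regional tier and the second window contains one extra message.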

Slide 72

Slide 72 text

Summary
● Tune for durability
● Define Avro Schemas
● Use Kafka REST Proxy
● Add idempotency checks
● Use Dead Letter Queue
● Monitor everything

Slide 73

Slide 73 text

THANK YOU! Nikolay Stoitsev Sr. Software Engineer @ Uber