Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
How Plumbr uses Kafka
Search
Nikita Salnikov-Tarnovski
February 04, 2018
Programming
0
98
How Plumbr uses Kafka
Nikita Salnikov-Tarnovski
February 04, 2018
Tweet
Share
More Decks by Nikita Salnikov-Tarnovski
See All by Nikita Salnikov-Tarnovski
Project clarity - random rant from an old engineer
nikem
0
92
Introduction to Druid
nikem
0
850
Deceived by monitoring
nikem
0
66
10% Happier
nikem
0
71
Where is my memory
nikem
0
450
Heap, off you go
nikem
0
1.2k
First steps in GC tuning
nikem
0
1.6k
I bet you have a memory leak
nikem
1
170
Plumbing Memory Leaks
nikem
1
140
Other Decks in Programming
See All in Programming
AHC051解法紹介
eijirou
0
610
decksh - a little language for decks
ajstarks
4
21k
[FEConf 2025] 모노레포 절망편, 14개 레포로 부활하기까지 걸린 1년
mmmaxkim
0
910
バイブコーディングの正体——AIエージェントはソフトウェア開発を変えるか?
stakaya
5
1k
AI OCR API on Lambdaを Datadogで可視化してみた
nealle
0
160
画像コンペでのベースラインモデルの育て方
tattaka
3
1.8k
管你要 trace 什麼、bpftrace 用下去就對了 — COSCUP 2025
shunghsiyu
0
450
エンジニアのための”最低限いい感じ”デザイン入門
shunshobon
0
120
自作OSでDOOMを動かしてみた
zakki0925224
1
1.4k
新世界の理解
koriym
0
140
Dart 参戦!!静的型付き言語界の隠れた実力者
kno3a87
0
200
「リーダーは意思決定する人」って本当?~ 学びを現場で活かす、リーダー4ヶ月目の試行錯誤 ~
marina1017
0
240
Featured
See All Featured
Scaling GitHub
holman
462
140k
Unsuck your backbone
ammeep
671
58k
Bootstrapping a Software Product
garrettdimon
PRO
307
110k
RailsConf & Balkan Ruby 2019: The Past, Present, and Future of Rails at GitHub
eileencodes
139
34k
Let's Do A Bunch of Simple Stuff to Make Websites Faster
chriscoyier
507
140k
Refactoring Trust on Your Teams (GOTO; Chicago 2020)
rmw
34
3.1k
Templates, Plugins, & Blocks: Oh My! Creating the theme that thinks of everything
marktimemedia
31
2.5k
Become a Pro
speakerdeck
PRO
29
5.5k
Improving Core Web Vitals using Speculation Rules API
sergeychernyshev
18
1.1k
How to Ace a Technical Interview
jacobian
279
23k
Side Projects
sachag
455
43k
Making Projects Easy
brettharned
117
6.3k
Transcript
Eating Kafka Nikita Salnikov-Tarnovski @iNikem
Intro to Kafka
What is Kafka • Distributed streaming platform • It lets
you publish and subscribe to streams of records • It lets you store streams of records in a fault-tolerant way.
What is Kafka • Kafka runs as a cluster on
one or more servers. • The Kafka cluster stores streams of records in categories called topics. • Each record consists of a key, a value, and a timestamp.
Four APIs http://kafka.apache.org/documentation/
Append log http://kafka.apache.org/documentation/
Brokers • Several brokers form a cluster • Coordinated with
Zookeeper • All partitions are distributed among brokers
Producers • Producer sends record to a topic • Based
on a key, partition is chosen • Leader broker is found • Wait for requested acks
Fast writes • Brokers cheat and don’t write to disk
• They write to disk cache • And let OS care about flushing to disk
Replication • Each topic can be replicated among brokers •
So for each partition there are X copies • Brokers just consume messages from leader
Consumer groups (c) Confluent
Consumer rebalance (c) Confluent
Commit • Consumer has to commit offsets he consumed •
You have to decide, when and how!
Delivery semantics • At least once • At most once
• Exactly once
Kafka Connect • Off-the-shelf solution to pipe data to or
from Kafka • E.g. DB, Elasticsearch, files, etc…
Kafka Streams • DSL and platform for writing data processing
streams • If you squint enough, very similar to Java8 streams and Fork-Join pool • But across multiple jvms and servers
Kafka in Plumbr
Kafka cluster • 5 brokers • 2x replication • 20T
data for last 90 days • Inflow ~125G per day
Data processing pipeline
Spring Cloud Stream • Greatly simplifies development of Kafka based
apps • Couple of annotations and data flows :)
Solving performance problems is hard. We don’t think it needs
to be. @JavaPlumbr/@iNikem http://plumbr.eu