Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
How Plumbr uses Kafka
Search
Nikita Salnikov-Tarnovski
February 04, 2018
Programming
0
100
How Plumbr uses Kafka
Nikita Salnikov-Tarnovski
February 04, 2018
Tweet
Share
More Decks by Nikita Salnikov-Tarnovski
See All by Nikita Salnikov-Tarnovski
Project clarity - random rant from an old engineer
nikem
0
93
Introduction to Druid
nikem
0
860
Deceived by monitoring
nikem
0
69
10% Happier
nikem
0
72
Where is my memory
nikem
0
460
Heap, off you go
nikem
0
1.2k
First steps in GC tuning
nikem
0
1.6k
I bet you have a memory leak
nikem
1
170
Plumbing Memory Leaks
nikem
1
140
Other Decks in Programming
See All in Programming
CI_CD「健康診断」のススメ。現場でのボトルネック特定から、健康診断を通じた組織的な改善手法
teamlab
PRO
0
210
Le côté obscur des IA génératives
pascallemerrer
0
140
CSC305 Lecture 06
javiergs
PRO
0
220
Cloudflare AgentsとAI SDKでAIエージェントを作ってみた
briete
0
140
[Kaigi on Rais 2025] 全問正解率3%: RubyKaigiで出題したやりがちな危険コード5選
power3812
0
110
なぜGoのジェネリクスはこの形なのか? Featherweight Goが明かす設計の核心
ryotaros
7
1.1k
AI Coding Meetup #3 - 導入セッション / ai-coding-meetup-3
izumin5210
0
3.3k
バッチ処理を「状態の記録」から「事実の記録」へ
panda728
PRO
0
150
Your Perfect Project Setup for Angular @BASTA! 2025 in Mainz
manfredsteyer
PRO
0
170
Web フロントエンドエンジニアに開かれる AI Agent プロダクト開発 - Vercel AI SDK を観察して AI Agent と仲良くなろう! #FEC余熱NIGHT
izumin5210
3
520
GraphQL×Railsアプリのデータベース負荷分散 - 月間3,000万人利用サービスを無停止で
koxya
1
1.3k
After go func(): Goroutines Through a Beginner’s Eye
97vaibhav
0
370
Featured
See All Featured
Agile that works and the tools we love
rasmusluckow
331
21k
A Modern Web Designer's Workflow
chriscoyier
697
190k
The Illustrated Children's Guide to Kubernetes
chrisshort
49
51k
The Web Performance Landscape in 2024 [PerfNow 2024]
tammyeverts
9
870
The Psychology of Web Performance [Beyond Tellerrand 2023]
tammyeverts
49
3.1k
10 Git Anti Patterns You Should be Aware of
lemiorhan
PRO
657
61k
Practical Orchestrator
shlominoach
190
11k
Building an army of robots
kneath
306
46k
Visualizing Your Data: Incorporating Mongo into Loggly Infrastructure
mongodb
48
9.7k
Chrome DevTools: State of the Union 2024 - Debugging React & Beyond
addyosmani
7
900
The Language of Interfaces
destraynor
162
25k
Rebuilding a faster, lazier Slack
samanthasiow
84
9.2k
Transcript
Eating Kafka Nikita Salnikov-Tarnovski @iNikem
Intro to Kafka
What is Kafka • Distributed streaming platform • It lets
you publish and subscribe to streams of records • It lets you store streams of records in a fault-tolerant way.
What is Kafka • Kafka runs as a cluster on
one or more servers. • The Kafka cluster stores streams of records in categories called topics. • Each record consists of a key, a value, and a timestamp.
Four APIs http://kafka.apache.org/documentation/
Append log http://kafka.apache.org/documentation/
Brokers • Several brokers form a cluster • Coordinated with
Zookeeper • All partitions are distributed among brokers
Producers • Producer sends record to a topic • Based
on a key, partition is chosen • Leader broker is found • Wait for requested acks
Fast writes • Brokers cheat and don’t write to disk
• They write to disk cache • And let OS care about flushing to disk
Replication • Each topic can be replicated among brokers •
So for each partition there are X copies • Brokers just consume messages from leader
Consumer groups (c) Confluent
Consumer rebalance (c) Confluent
Commit • Consumer has to commit offsets he consumed •
You have to decide, when and how!
Delivery semantics • At least once • At most once
• Exactly once
Kafka Connect • Off-the-shelf solution to pipe data to or
from Kafka • E.g. DB, Elasticsearch, files, etc…
Kafka Streams • DSL and platform for writing data processing
streams • If you squint enough, very similar to Java8 streams and Fork-Join pool • But across multiple jvms and servers
Kafka in Plumbr
Kafka cluster • 5 brokers • 2x replication • 20T
data for last 90 days • Inflow ~125G per day
Data processing pipeline
Spring Cloud Stream • Greatly simplifies development of Kafka based
apps • Couple of annotations and data flows :)
Solving performance problems is hard. We don’t think it needs
to be. @JavaPlumbr/@iNikem http://plumbr.eu