Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
How Plumbr uses Kafka
Search
Nikita Salnikov-Tarnovski
February 04, 2018
Programming
110
0
Share
How Plumbr uses Kafka
Nikita Salnikov-Tarnovski
February 04, 2018
More Decks by Nikita Salnikov-Tarnovski
See All by Nikita Salnikov-Tarnovski
Project clarity - random rant from an old engineer
nikem
0
98
Introduction to Druid
nikem
0
880
Deceived by monitoring
nikem
0
78
10% Happier
nikem
0
79
Where is my memory
nikem
0
480
Heap, off you go
nikem
0
1.3k
First steps in GC tuning
nikem
0
1.6k
I bet you have a memory leak
nikem
1
170
Plumbing Memory Leaks
nikem
1
150
Other Decks in Programming
See All in Programming
ふにゃっとしない名前の付け方 〜哲学で茹で上げる、コシのあるソフトウェア設計〜
shimomura
0
120
「なんか〇〇ライブラリで脆弱性あるみたいなんだけど。。。」から始める脆弱性対応 / First Steps in Vulnerability Response
mackey0225
2
130
20年以上続くプロダクトでも使い続けられる静的解析ツールを求めて
matsuo_atsushi
0
150
プラグインで拡張される Context をtype-safe にする難しさと設計判断
kazupon
1
180
実践ハーネスエンジニアリング:ステアリングループを実例から読み解く / Practical Harness Engineering: Understanding Steering Loops Through Real-World Examples
nrslib
5
5.5k
エラー処理の温故知新 / history of error handling technic
ryotanakaya
7
1.9k
When benchmarks go bad - what I learned from measuring performance wrong
hollycummins
0
390
TypeScriptだけでAIエージェントを作る フロント・エージェント・インフラのフルスタック実践
har1101
3
520
ビジネスモデルから紐解く、AI+型駆動開発
hirokiomote
0
150
なぜあなたのコードには「コシ」がないのか?〜AI時代に問う、最後まで美味しい設計と戦略〜 #phpconkagawa / phpconkagawa2026
shogogg
0
210
Cloudflare で始める Data Platform
ta93abe
0
180
サークル参加から学ぶ、小さな事業の回し方
yuzneri
0
200
Featured
See All Featured
Abbi's Birthday
coloredviolet
2
7.6k
Agile Actions for Facilitating Distributed Teams - ADO2019
mkilby
0
190
The Cult of Friendly URLs
andyhume
79
6.9k
Groundhog Day: Seeking Process in Gaming for Health
codingconduct
0
180
Ethics towards AI in product and experience design
skipperchong
2
270
A better future with KSS
kneath
240
18k
The Art of Delivering Value - GDevCon NA Keynote
reverentgeek
16
1.9k
HDC tutorial
michielstock
2
660
Exploring the Power of Turbo Streams & Action Cable | RailsConf2023
kevinliebholz
37
6.4k
JAMstack: Web Apps at Ludicrous Speed - All Things Open 2022
reverentgeek
1
440
Refactoring Trust on Your Teams (GOTO; Chicago 2020)
rmw
35
3.5k
30 Presentation Tips
portentint
PRO
1
290
Transcript
Eating Kafka Nikita Salnikov-Tarnovski @iNikem
Intro to Kafka
What is Kafka • Distributed streaming platform • It lets
you publish and subscribe to streams of records • It lets you store streams of records in a fault-tolerant way.
What is Kafka • Kafka runs as a cluster on
one or more servers. • The Kafka cluster stores streams of records in categories called topics. • Each record consists of a key, a value, and a timestamp.
Four APIs http://kafka.apache.org/documentation/
Append log http://kafka.apache.org/documentation/
Brokers • Several brokers form a cluster • Coordinated with
Zookeeper • All partitions are distributed among brokers
Producers • Producer sends record to a topic • Based
on a key, partition is chosen • Leader broker is found • Wait for requested acks
Fast writes • Brokers cheat and don’t write to disk
• They write to disk cache • And let OS care about flushing to disk
Replication • Each topic can be replicated among brokers •
So for each partition there are X copies • Brokers just consume messages from leader
Consumer groups (c) Confluent
Consumer rebalance (c) Confluent
Commit • Consumer has to commit offsets he consumed •
You have to decide, when and how!
Delivery semantics • At least once • At most once
• Exactly once
Kafka Connect • Off-the-shelf solution to pipe data to or
from Kafka • E.g. DB, Elasticsearch, files, etc…
Kafka Streams • DSL and platform for writing data processing
streams • If you squint enough, very similar to Java8 streams and Fork-Join pool • But across multiple jvms and servers
Kafka in Plumbr
Kafka cluster • 5 brokers • 2x replication • 20T
data for last 90 days • Inflow ~125G per day
Data processing pipeline
Spring Cloud Stream • Greatly simplifies development of Kafka based
apps • Couple of annotations and data flows :)
Solving performance problems is hard. We don’t think it needs
to be. @JavaPlumbr/@iNikem http://plumbr.eu