Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
How Plumbr uses Kafka
Search
Nikita Salnikov-Tarnovski
February 04, 2018
Programming
0
100
How Plumbr uses Kafka
Nikita Salnikov-Tarnovski
February 04, 2018
Tweet
Share
More Decks by Nikita Salnikov-Tarnovski
See All by Nikita Salnikov-Tarnovski
Project clarity - random rant from an old engineer
nikem
0
93
Introduction to Druid
nikem
0
860
Deceived by monitoring
nikem
0
69
10% Happier
nikem
0
72
Where is my memory
nikem
0
450
Heap, off you go
nikem
0
1.2k
First steps in GC tuning
nikem
0
1.6k
I bet you have a memory leak
nikem
1
170
Plumbing Memory Leaks
nikem
1
140
Other Decks in Programming
See All in Programming
250830 IaCの選定~AWS SAMのLambdaをECSに乗り換えたときの備忘録~
east_takumi
0
390
AI Coding Agentのセキュリティリスク:PRの自己承認とメルカリの対策
s3h
0
230
概念モデル→論理モデルで気をつけていること
sunnyone
2
270
Ruby Parser progress report 2025
yui_knk
1
450
Swift Updates - Learn Languages 2025
koher
2
490
ProxyによるWindow間RPC機構の構築
syumai
3
1.2k
How Android Uses Data Structures Behind The Scenes
l2hyunwoo
0
460
Cache Me If You Can
ryunen344
2
1.5k
OSS開発者という働き方
andpad
5
1.7k
さようなら Date。 ようこそTemporal! 3年間先行利用して得られた知見の共有
8beeeaaat
3
1.5k
パッケージ設計の黒魔術/Kyoto.go#63
lufia
3
440
1から理解するWeb Push
dora1998
7
1.9k
Featured
See All Featured
Intergalactic Javascript Robots from Outer Space
tanoku
272
27k
StorybookのUI Testing Handbookを読んだ
zakiyama
31
6.1k
Testing 201, or: Great Expectations
jmmastey
45
7.7k
Practical Orchestrator
shlominoach
190
11k
Unsuck your backbone
ammeep
671
58k
Visualization
eitanlees
148
16k
Context Engineering - Making Every Token Count
addyosmani
3
50
Large-scale JavaScript Application Architecture
addyosmani
513
110k
Code Reviewing Like a Champion
maltzj
525
40k
CoffeeScript is Beautiful & I Never Want to Write Plain JavaScript Again
sstephenson
162
15k
Distributed Sagas: A Protocol for Coordinating Microservices
caitiem20
333
22k
Performance Is Good for Brains [We Love Speed 2024]
tammyeverts
12
1.1k
Transcript
Eating Kafka Nikita Salnikov-Tarnovski @iNikem
Intro to Kafka
What is Kafka • Distributed streaming platform • It lets
you publish and subscribe to streams of records • It lets you store streams of records in a fault-tolerant way.
What is Kafka • Kafka runs as a cluster on
one or more servers. • The Kafka cluster stores streams of records in categories called topics. • Each record consists of a key, a value, and a timestamp.
Four APIs http://kafka.apache.org/documentation/
Append log http://kafka.apache.org/documentation/
Brokers • Several brokers form a cluster • Coordinated with
Zookeeper • All partitions are distributed among brokers
Producers • Producer sends record to a topic • Based
on a key, partition is chosen • Leader broker is found • Wait for requested acks
Fast writes • Brokers cheat and don’t write to disk
• They write to disk cache • And let OS care about flushing to disk
Replication • Each topic can be replicated among brokers •
So for each partition there are X copies • Brokers just consume messages from leader
Consumer groups (c) Confluent
Consumer rebalance (c) Confluent
Commit • Consumer has to commit offsets he consumed •
You have to decide, when and how!
Delivery semantics • At least once • At most once
• Exactly once
Kafka Connect • Off-the-shelf solution to pipe data to or
from Kafka • E.g. DB, Elasticsearch, files, etc…
Kafka Streams • DSL and platform for writing data processing
streams • If you squint enough, very similar to Java8 streams and Fork-Join pool • But across multiple jvms and servers
Kafka in Plumbr
Kafka cluster • 5 brokers • 2x replication • 20T
data for last 90 days • Inflow ~125G per day
Data processing pipeline
Spring Cloud Stream • Greatly simplifies development of Kafka based
apps • Couple of annotations and data flows :)
Solving performance problems is hard. We don’t think it needs
to be. @JavaPlumbr/@iNikem http://plumbr.eu