Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
How Plumbr uses Kafka
Search
Nikita Salnikov-Tarnovski
February 04, 2018
Programming
0
97
How Plumbr uses Kafka
Nikita Salnikov-Tarnovski
February 04, 2018
Tweet
Share
More Decks by Nikita Salnikov-Tarnovski
See All by Nikita Salnikov-Tarnovski
Project clarity - random rant from an old engineer
nikem
0
91
Introduction to Druid
nikem
0
850
Deceived by monitoring
nikem
0
65
10% Happier
nikem
0
70
Where is my memory
nikem
0
450
Heap, off you go
nikem
0
1.2k
First steps in GC tuning
nikem
0
1.6k
I bet you have a memory leak
nikem
1
170
Plumbing Memory Leaks
nikem
1
140
Other Decks in Programming
See All in Programming
商品比較サービス「マイベスト」における パーソナライズレコメンドの第一歩
ucchiii43
0
230
テスターからテストエンジニアへ ~新米テストエンジニアが歩んだ9ヶ月振り返り~
non0113
2
240
Reactの歴史を振り返る
tutinoko
1
150
#QiitaBash TDDで(自分の)開発がどう変わったか
ryosukedtomita
1
180
MCPで実現できる、Webサービス利用体験について
syumai
7
2.2k
ご注文の差分はこちらですか? 〜 AWS CDK のいろいろな差分検出と安全なデプロイ
konokenj
4
720
The Niche of CDK Grant オブジェクトって何者?/the-niche-of-cdk-what-isgrant-object
hassaku63
1
720
React 使いじゃなくても知っておきたい教養としての React
oukayuka
17
4.6k
AIのメモリー
watany
11
1.1k
変化を楽しむエンジニアリング ~ いままでとこれから ~
murajun1978
0
590
新しいモバイルアプリ勉強会(仮)について
uetyo
1
230
Git Sync を超える!OSS で実現する CDK Pull 型デプロイ / Deploying CDK with PipeCD in Pull-style
tkikuc
4
480
Featured
See All Featured
Producing Creativity
orderedlist
PRO
346
40k
Balancing Empowerment & Direction
lara
1
510
Build your cross-platform service in a week with App Engine
jlugia
231
18k
What's in a price? How to price your products and services
michaelherold
246
12k
Evolution of real-time – Irina Nazarova, EuRuKo, 2024
irinanazarova
8
860
JavaScript: Past, Present, and Future - NDC Porto 2020
reverentgeek
50
5.5k
Optimizing for Happiness
mojombo
379
70k
Bootstrapping a Software Product
garrettdimon
PRO
307
110k
Faster Mobile Websites
deanohume
308
31k
GitHub's CSS Performance
jonrohan
1031
460k
Large-scale JavaScript Application Architecture
addyosmani
512
110k
Dealing with People You Can't Stand - Big Design 2015
cassininazir
367
26k
Transcript
Eating Kafka Nikita Salnikov-Tarnovski @iNikem
Intro to Kafka
What is Kafka • Distributed streaming platform • It lets
you publish and subscribe to streams of records • It lets you store streams of records in a fault-tolerant way.
What is Kafka • Kafka runs as a cluster on
one or more servers. • The Kafka cluster stores streams of records in categories called topics. • Each record consists of a key, a value, and a timestamp.
Four APIs http://kafka.apache.org/documentation/
Append log http://kafka.apache.org/documentation/
Brokers • Several brokers form a cluster • Coordinated with
Zookeeper • All partitions are distributed among brokers
Producers • Producer sends record to a topic • Based
on a key, partition is chosen • Leader broker is found • Wait for requested acks
Fast writes • Brokers cheat and don’t write to disk
• They write to disk cache • And let OS care about flushing to disk
Replication • Each topic can be replicated among brokers •
So for each partition there are X copies • Brokers just consume messages from leader
Consumer groups (c) Confluent
Consumer rebalance (c) Confluent
Commit • Consumer has to commit offsets he consumed •
You have to decide, when and how!
Delivery semantics • At least once • At most once
• Exactly once
Kafka Connect • Off-the-shelf solution to pipe data to or
from Kafka • E.g. DB, Elasticsearch, files, etc…
Kafka Streams • DSL and platform for writing data processing
streams • If you squint enough, very similar to Java8 streams and Fork-Join pool • But across multiple jvms and servers
Kafka in Plumbr
Kafka cluster • 5 brokers • 2x replication • 20T
data for last 90 days • Inflow ~125G per day
Data processing pipeline
Spring Cloud Stream • Greatly simplifies development of Kafka based
apps • Couple of annotations and data flows :)
Solving performance problems is hard. We don’t think it needs
to be. @JavaPlumbr/@iNikem http://plumbr.eu