Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Kafka Will Get The Message Across, Guaranteed.
Search
David Zuelke
December 02, 2016
Programming
0
690
Kafka Will Get The Message Across, Guaranteed.
Presentation given at SymfonyCon 2016 in Berlin, Germany.
David Zuelke
December 02, 2016
Tweet
Share
More Decks by David Zuelke
See All by David Zuelke
Your next Web server will be written in... PHP
dzuelke
0
120
Getting Things Done
dzuelke
1
360
Your next Web server will be written in... PHP
dzuelke
2
230
Your next Web server will be written in... PHP
dzuelke
3
960
Kafka Will Get The Message Across, Guaranteed.
dzuelke
0
220
Heroku at BattleHack Venice 2015
dzuelke
0
110
Designing HTTP Interfaces and RESTful Web Services
dzuelke
6
1.2k
The Twelve-Factor App: Best Practices for Modern Web Applications
dzuelke
4
330
Designing HTTP Interfaces and RESTful Web Services
dzuelke
6
450
Other Decks in Programming
See All in Programming
Prepare for Jakarta EE 11 - Performance and Developer Productivity
ivargrimstad
0
470
Javaエンジニアのための Nodejs/Nuxt3入門
hidekatsu_izuno
0
280
SwiftUI Performance 不要なViewの再描画と更新を抑える
bigamitiongit
1
160
⼤規模⾔語モデルの拡張(RAG)が 終わったかも知れない件について
nearme_tech
22
15k
元気予報
suu_mire0726
0
860
Java 22 Overview
kishida
1
170
TYPO3 v13 – The road to LTS: What's new and new APIs
luisasofie_xoxo
0
180
Rubyでたのしむクリエイティブコーディング/Enjoy Creative coding with Ruby
chobishiba
1
160
educure_カリキュラム生操作マニュアル.pdf
linew_official
0
480
Site Reliability Engineering for GMO
pyama86
6
960
Ruby Pattern Matching
bkuhlmann
0
920
try!Swift Tokyo 2024 参加報告 LT
akidon0000
1
190
Featured
See All Featured
The Brand Is Dead. Long Live the Brand.
mthomps
48
28k
Exploring the Power of Turbo Streams & Action Cable | RailsConf2023
kevinliebholz
1
3.4k
Faster Mobile Websites
deanohume
297
30k
Designing the Hi-DPI Web
ddemaree
276
33k
A Modern Web Designer's Workflow
chriscoyier
688
190k
Docker and Python
trallard
33
2.7k
Building Better People: How to give real-time feedback that sticks.
wjessup
354
18k
In The Pink: A Labor of Love
frogandcode
138
21k
The Straight Up "How To Draw Better" Workshop
denniskardys
227
130k
Gamification - CAS2011
davidbonilla
76
4.6k
Build your cross-platform service in a week with App Engine
jlugia
225
17k
StorybookのUI Testing Handbookを読んだ
zakiyama
11
4.6k
Transcript
KAFKA WILL GET THE MESSAGE ACROSS. GUARANTEED. SymfonyCon 2016 Berlin,
Germany
David Zuelke
None
[email protected]
@dzuelke
KAFKA
LinkedIn
APACHE KAFKA
"uh oh, another Apache project?!"
None
KEEP CALM AND LOOK AT THE WEBSITE
None
"Basically it is a massively scalable pub/sub message queue. architected
as a distributed transaction log."
"so it's a queue?"
it's not a queue
queues are not multi-subscriber :(
"so it's a pubsub thing?"
it's not a pubsub thing
pubsub broadcasts to all subscribers :(
it's a log
None
not that kind of log
WAL
Write-Ahead Log
WRITE-AHEAD LOG
None
1 foo 2 bar 3 baz 4 hi
1 create document: "foo", data: "…" 2 update document: "foo",
data: "…" 3 create document: "bar", data: "…" 4 remove document: "foo"
None
never corrupts
sequential I/O
None
sequential I/O
every message will be read at least once, no random
access
FileChannel.transferTo (shovels data straight from e.g. disk cache to network
interface, no copying via RAM)
"HI, I AM KAFKA" "Buckle up while we process (m|b|tr)illions
of messages/s."
TOPICS
streams of records
1 2 3 4 5 6 7 …
1 2 3 4 5 6 7 8 … producer
writes consumer reads
can have many subscribers
1 2 3 4 5 6 7 8 … producer
writes consumerB reads consumerA reads
can be partitioned
P0 1 2 3 4 5 6 7 … P1
1 2 3 4 … P2 1 2 3 4 5 6 7 8 … P3 1 2 3 4 5 6 …
partitions let you scale storage!
partitions let you scale consuming!
None
all records are retained, whether consumed or not, up to
a configurable limit
PRODUCERS
byte[]
(typically JSON, XML, Avro, Thrift, Protobufs)
(typically not funny GIFs)
can choose explicit partition, or a key (which is used
for auto-partitioning)
https://github.com/edenhill/librdkafka & https://arnaud-lb.github.io/php-rdkafka/
BASIC PRODUCER $rk = new RdKafka\Producer(); $rk->addBrokers("127.0.0.1"); $topic = $rk->newTopic("test");
$topic->produce(RD_KAFKA_PARTITION_UA, 0, "Unassigned partition, let Kafka choose"); $topic->produce(RD_KAFKA_PARTITION_UA, 0, "Yay consistent hashing", $user->getId()); $topic->produce(1, 0, "This will always be sent to partition 1");
CONSUMERS
cheap
only metadata stored per consumer: offset
guaranteed to always have messages in right order (within a
partition)
can themselves produce new messages!
None
BASIC CONSUMER $conf = new RdKafka\Conf(); $conf->set('group.id', 'myConsumerGroup'); $rk =
new RdKafka\Consumer($conf); $rk->addBrokers("127.0.0.1"); $topicConf = new RdKafka\TopicConf(); $topicConf->set('auto.commit.interval.ms', 100); $topic = $rk->newTopic("test", $topicConf); $topic->consumeStart(0, RD_KAFKA_OFFSET_STORED); while (true) { $msg = $topic->consume(0, 120*10000); do_something($msg); }
AT-MOST ONCE DELIVERY $conf = new RdKafka\Conf(); $conf->set('group.id', 'myConsumerGroup'); $rk
= new RdKafka\Consumer($conf); $rk->addBrokers("127.0.0.1"); $topicConf = new RdKafka\TopicConf(); $topicConf->set('auto.commit.enable', false); $topic = $rk->newTopic("test", $topicConf); $topic->consumeStart(0, RD_KAFKA_OFFSET_STORED); while (true) { $msg = $topic->consume(0, 120*10000); $topic->offsetStore($msg->partition, $msg->offset); do_something($msg); }
AT-LEAST ONCE DELIVERY $conf = new RdKafka\Conf(); $conf->set('group.id', 'myConsumerGroup'); $rk
= new RdKafka\Consumer($conf); $rk->addBrokers("127.0.0.1"); $topicConf = new RdKafka\TopicConf(); $topicConf->set('auto.commit.enable', false); $topic = $rk->newTopic("test", $topicConf); $topic->consumeStart(0, RD_KAFKA_OFFSET_STORED); while (true) { $msg = $topic->consume(0, 120*10000); do_something($msg); $topic->offsetStore($msg->partition, $msg->offset); }
EXACTLY-ONCE DELIVERY
you cannot have exactly-once delivery
THE BYZANTINE GENERALS "together we can beat the monsters. let's
both attack at 07:00?" "confirm, we attack at 07:00" ☠
USE CASES
• LinkedIn • Yahoo • Twitter • Netflix • Square
• Spotify • Pinterest • Uber • Goldman Sachs • Tumblr • PayPal • Airbnb • Mozilla • Cisco • Etsy • Foursquare • Shopify • CloudFlare
ingest Twitter firehose and turn it into a pointless demo
;)
None
messaging, of course
track user activity
record runtime metrics
IoT
replicate information between data centers
billing!
"shock absorber" between systems to avoid overload of DBs, APIs,
etc.
in PHP: mostly producing messages; better languages exist for consuming
The End
THANK YOU FOR LISTENING! Questions? Ask me: @dzuelke &
[email protected]