Lock in $30 Savings on PRO—Offer Ends Soon! ⏳
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Kafka Will Get The Message Across, Guaranteed.
Search
David Zuelke
January 28, 2017
Programming
0
290
Kafka Will Get The Message Across, Guaranteed.
Presentation given at PHP Benelux 2017 near Antwerp, Belgium.
David Zuelke
January 28, 2017
Tweet
Share
More Decks by David Zuelke
See All by David Zuelke
Your next Web server will be written in... PHP
dzuelke
0
170
Getting Things Done
dzuelke
1
440
Your next Web server will be written in... PHP
dzuelke
2
280
Your next Web server will be written in... PHP
dzuelke
3
1.2k
Kafka Will Get The Message Across, Guaranteed.
dzuelke
0
840
Heroku at BattleHack Venice 2015
dzuelke
0
140
Designing HTTP Interfaces and RESTful Web Services
dzuelke
6
1.5k
The Twelve-Factor App: Best Practices for Modern Web Applications
dzuelke
4
520
Designing HTTP Interfaces and RESTful Web Services
dzuelke
6
500
Other Decks in Programming
See All in Programming
MAP, Jigsaw, Code Golf 振り返り会 by 関東Kaggler会|Jigsaw 15th Solution
hasibirok0
0
250
俺流レスポンシブコーディング 2025
tak_dcxi
14
8.9k
GISエンジニアから見たLINKSデータ
nokonoko1203
0
140
「コードは上から下へ読むのが一番」と思った時に、思い出してほしい話
panda728
PRO
38
26k
Cap'n Webについて
yusukebe
0
140
AIエージェントの設計で注意するべきポイント6選
har1101
2
210
dotfiles 式年遷宮 令和最新版
masawada
1
790
The Past, Present, and Future of Enterprise Java
ivargrimstad
0
160
これならできる!個人開発のすゝめ
tinykitten
PRO
0
110
Graviton と Nitro と私
maroon1st
0
100
React Native New Architecture 移行実践報告
taminif
1
160
DevFest Android in Korea 2025 - 개발자 커뮤니티를 통해 얻는 가치
wisemuji
0
150
Featured
See All Featured
Let's Do A Bunch of Simple Stuff to Make Websites Faster
chriscoyier
508
140k
The Pragmatic Product Professional
lauravandoore
37
7.1k
Fashionably flexible responsive web design (full day workshop)
malarkey
407
66k
Fight the Zombie Pattern Library - RWD Summit 2016
marcelosomers
234
17k
RailsConf 2023
tenderlove
30
1.3k
The Web Performance Landscape in 2024 [PerfNow 2024]
tammyeverts
12
980
The Art of Delivering Value - GDevCon NA Keynote
reverentgeek
16
1.8k
How GitHub (no longer) Works
holman
316
140k
Learning to Love Humans: Emotional Interface Design
aarron
274
41k
Fireside Chat
paigeccino
41
3.7k
Evolution of real-time – Irina Nazarova, EuRuKo, 2024
irinanazarova
9
1.1k
What’s in a name? Adding method to the madness
productmarketing
PRO
24
3.8k
Transcript
KAFKA WILL GET THE MESSAGE ACROSS. GUARANTEED. PHP Benelux 2017
Belgium
David Zuelke
None
[email protected]
@dzuelke
KAFKA
LinkedIn
APACHE KAFKA
"uh oh, another Apache project?!"
None
KEEP CALM AND LOOK AT THE WEBSITE
None
"Basically it is a massively scalable pub/sub message queue. architected
as a distributed transaction log."
"so it's a queue?"
it's not just a queue
queues are not multi-subscriber :(
"so it's a pubsub thing?"
it's not just a pubsub thing
pubsub broadcasts to all subscribers :(
it's a log
None
not that kind of log
WAL
Write-Ahead Log
WRITE-AHEAD LOG
None
1 foo 2 bar 3 baz 4 hi
1 create document: "foo", data: "…" 2 update document: "foo",
data: "…" 3 create document: "bar", data: "…" 4 remove document: "foo"
None
never corrupts
sequential I/O
None
sequential I/O
every message will be read at least once, no random
access
FileChannel.transferTo (shovels data straight from e.g. disk cache to network
interface, no copying via RAM)
"HI, I AM KAFKA" "Buckle up while we process (m|b|tr)illions
of messages/s."
TOPICS
streams of records
1 2 3 4 5 6 7 …
1 2 3 4 5 6 7 8 … producer
writes consumer reads
can have many subscribers
1 2 3 4 5 6 7 8 … producer
writes consumerB reads consumerA reads
can be partitioned
P0 1 2 3 4 5 6 7 … P1
1 2 3 4 … P2 1 2 3 4 5 6 7 8 … P3 1 2 3 4 5 6 …
partitions let you scale storage!
partitions let you scale consuming!
None
all records are retained, whether consumed or not, up to
a configurable limit
PRODUCERS
byte[]
(typically JSON, XML, Avro, Thrift, Protobufs)
(typically not funny GIFs)
can choose explicit partition, or a key (which is used
for auto-partitioning)
https://github.com/edenhill/librdkafka & https://arnaud-lb.github.io/php-rdkafka/
BASIC PRODUCER $rk = new RdKafka\Producer(); $rk->addBrokers("127.0.0.1"); $topic = $rk->newTopic("test");
$topic->produce(RD_KAFKA_PARTITION_UA, 0, "Unassigned partition, let Kafka choose"); $topic->produce(RD_KAFKA_PARTITION_UA, 0, "Yay consistent hashing", $user->getId()); $topic->produce(1, 0, "This will always be sent to partition 1");
CONSUMERS
cheap
only metadata stored per consumer: offset
guaranteed to always have messages in right order (within a
partition)
can themselves produce new messages! (but there is also a
Streams API for pure transformations)
None
BASIC CONSUMER $conf = new RdKafka\Conf(); $conf->set('group.id', 'myConsumerGroup'); $rk =
new RdKafka\Consumer($conf); $rk->addBrokers("127.0.0.1"); $topicConf = new RdKafka\TopicConf(); $topicConf->set('auto.commit.interval.ms', 100); $topic = $rk->newTopic("test", $topicConf); $topic->consumeStart(0, RD_KAFKA_OFFSET_STORED); while (true) { $msg = $topic->consume(0, 120*10000); do_something($msg); }
AT-MOST ONCE DELIVERY $conf = new RdKafka\Conf(); $conf->set('group.id', 'myConsumerGroup'); $rk
= new RdKafka\Consumer($conf); $rk->addBrokers("127.0.0.1"); $topicConf = new RdKafka\TopicConf(); $topicConf->set('auto.commit.enable', false); $topic = $rk->newTopic("test", $topicConf); $topic->consumeStart(0, RD_KAFKA_OFFSET_STORED); while (true) { $msg = $topic->consume(0, 120*10000); $topic->offsetStore($msg->partition, $msg->offset); do_something($msg); }
AT-LEAST ONCE DELIVERY $conf = new RdKafka\Conf(); $conf->set('group.id', 'myConsumerGroup'); $rk
= new RdKafka\Consumer($conf); $rk->addBrokers("127.0.0.1"); $topicConf = new RdKafka\TopicConf(); $topicConf->set('auto.commit.enable', false); $topic = $rk->newTopic("test", $topicConf); $topic->consumeStart(0, RD_KAFKA_OFFSET_STORED); while (true) { $msg = $topic->consume(0, 120*10000); do_something($msg); $topic->offsetStore($msg->partition, $msg->offset); }
EXACTLY-ONCE DELIVERY
you cannot have exactly-once delivery
THE BYZANTINE GENERALS "together we can beat the monsters. let's
both attack at 07:00?" "confirm, we attack at 07:00" ☠
USE CASES
• LinkedIn • Yahoo • Twitter • Netflix • Square
• Spotify • Pinterest • Uber • Goldman Sachs • Tumblr • PayPal • Airbnb • Mozilla • Cisco • Etsy • Foursquare • Shopify • CloudFlare
ingest the Twitter firehose and turn it into a pointless
demo ;)
None
messaging, of course
track user activity
record runtime metrics
aggregate logs
IoT (you could still e.g. use MQTT over the wire,
and bridge to Kafka)
replicate information between data centers (also see Connector API)
Event Sourcing broker :)
WAL / Commit Log for another system
billing!
"shock absorber" between systems to avoid overload of DBs, APIs,
etc.
in PHP: mostly producing messages; better languages exist for consuming
The End
THANK YOU FOR LISTENING! Questions? Ask me: @dzuelke &
[email protected]