Kafka Will Get The Message Across, Guaranteed.
David Zuelke
January 28, 2017
Presentation given at PHP Benelux 2017 near Antwerp, Belgium.
Transcript
KAFKA WILL GET THE MESSAGE ACROSS. GUARANTEED. PHP Benelux 2017, Belgium
David Zuelke
[email protected]
@dzuelke
KAFKA
LinkedIn (where Kafka was originally created)
APACHE KAFKA
"uh oh, another Apache project?!"
KEEP CALM AND LOOK AT THE WEBSITE
"Basically it is a massively scalable pub/sub message queue. architected
as a distributed transaction log."
"so it's a queue?"
it's not just a queue
queues are not multi-subscriber :(
"so it's a pubsub thing?"
it's not just a pubsub thing
pubsub broadcasts to all subscribers :(
it's a log
not that kind of log
WAL: Write-Ahead Log
1 foo
2 bar
3 baz
4 hi

1 create document: "foo", data: "…"
2 update document: "foo", data: "…"
3 create document: "bar", data: "…"
4 remove document: "foo"
never corrupts
sequential I/O
every message will be read at least once, no random access
FileChannel.transferTo (shovels data straight from e.g. disk cache to network interface, no copying via RAM)
"HI, I AM KAFKA" "Buckle up while we process (m|b|tr)illions
of messages/s."
TOPICS
streams of records
[diagram: a topic as an ordered sequence of records 1, 2, 3, …]
[diagram: the producer appends records at the end of the log; the consumer reads them in order at its own position]
can have many subscribers
[diagram: consumerA and consumerB each read the same log independently, at their own positions]
can be partitioned
[diagram: one topic split into partitions P0, P1, P2, P3, each an independent ordered log]
partitions let you scale storage!
partitions let you scale consuming!
all records are retained, whether consumed or not, up to a configurable limit
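That limit is a broker/topic setting rather than anything the client controls; as a rough illustration (example values only, not from the talk), the broker-wide defaults in server.properties look like this:

# server.properties (example values)
log.retention.hours=168          # keep records for 7 days...
log.retention.bytes=1073741824   # ...or drop the oldest segments once a partition exceeds ~1 GiB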
PRODUCERS
byte[]
(typically JSON, XML, Avro, Thrift, Protobufs)
(typically not funny GIFs)
can choose explicit partition, or a key (which is used for auto-partitioning)
https://github.com/edenhill/librdkafka & https://arnaud-lb.github.io/php-rdkafka/
BASIC PRODUCER

$rk = new RdKafka\Producer();
$rk->addBrokers("127.0.0.1");
$topic = $rk->newTopic("test");

$topic->produce(RD_KAFKA_PARTITION_UA, 0, "Unassigned partition, let Kafka choose");
$topic->produce(RD_KAFKA_PARTITION_UA, 0, "Yay consistent hashing", $user->getId());
$topic->produce(1, 0, "This will always be sent to partition 1");
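One caveat that is easy to miss: with php-rdkafka, produce() only hands the message to librdkafka's local buffer, so a short-lived script should drain that buffer before exiting. A small sketch, continuing from the producer above:

// make sure locally buffered messages actually reach the broker before the process exits
while ($rk->getOutQLen() > 0) {
    $rk->poll(50); // serve delivery callbacks, waiting up to 50 ms per iteration
}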
CONSUMERS
cheap
only metadata stored per consumer: offset
guaranteed to always have messages in the right order (within a partition)
can themselves produce new messages! (but there is also a Streams API for pure transformations)
BASIC CONSUMER

$conf = new RdKafka\Conf();
$conf->set('group.id', 'myConsumerGroup');
$rk = new RdKafka\Consumer($conf);
$rk->addBrokers("127.0.0.1");

$topicConf = new RdKafka\TopicConf();
$topicConf->set('auto.commit.interval.ms', 100);

$topic = $rk->newTopic("test", $topicConf);
$topic->consumeStart(0, RD_KAFKA_OFFSET_STORED);

while (true) {
    $msg = $topic->consume(0, 120*10000);
    do_something($msg);
}
AT-MOST ONCE DELIVERY

$conf = new RdKafka\Conf();
$conf->set('group.id', 'myConsumerGroup');
$rk = new RdKafka\Consumer($conf);
$rk->addBrokers("127.0.0.1");

$topicConf = new RdKafka\TopicConf();
$topicConf->set('auto.commit.enable', false);

$topic = $rk->newTopic("test", $topicConf);
$topic->consumeStart(0, RD_KAFKA_OFFSET_STORED);

while (true) {
    $msg = $topic->consume(0, 120*10000);
    $topic->offsetStore($msg->partition, $msg->offset); // offset stored before processing: a crash here means the message is never processed
    do_something($msg);
}
AT-LEAST ONCE DELIVERY

$conf = new RdKafka\Conf();
$conf->set('group.id', 'myConsumerGroup');
$rk = new RdKafka\Consumer($conf);
$rk->addBrokers("127.0.0.1");

$topicConf = new RdKafka\TopicConf();
$topicConf->set('auto.commit.enable', false);

$topic = $rk->newTopic("test", $topicConf);
$topic->consumeStart(0, RD_KAFKA_OFFSET_STORED);

while (true) {
    $msg = $topic->consume(0, 120*10000);
    do_something($msg);
    $topic->offsetStore($msg->partition, $msg->offset); // offset stored after processing: a crash before this line means the message is processed again
}
EXACTLY-ONCE DELIVERY
you cannot have exactly-once delivery
THE BYZANTINE GENERALS
"together we can beat the monsters. let's both attack at 07:00?"
"confirm, we attack at 07:00"
☠
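What you can get instead is at-least-once delivery plus idempotent processing on the consumer side. A rough sketch building on the at-least-once loop above; isProcessed() and markProcessed() are hypothetical helpers, e.g. backed by a table with a unique key on (topic, partition, offset):

while (true) {
    $msg = $topic->consume(0, 120*10000);
    if ($msg === null || $msg->err !== RD_KAFKA_RESP_ERR_NO_ERROR) {
        continue; // timeout or transient error, try again
    }
    if (!isProcessed('test', $msg->partition, $msg->offset)) {
        do_something($msg);                                   // may be attempted again after a crash...
        markProcessed('test', $msg->partition, $msg->offset); // ...but its effect is recorded only once
    }
    $topic->offsetStore($msg->partition, $msg->offset);
}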
USE CASES
• LinkedIn • Yahoo • Twitter • Netflix • Square • Spotify • Pinterest • Uber • Goldman Sachs • Tumblr • PayPal • Airbnb • Mozilla • Cisco • Etsy • Foursquare • Shopify • CloudFlare
ingest the Twitter firehose and turn it into a pointless demo ;)
messaging, of course
track user activity
record runtime metrics
aggregate logs
IoT (you could still e.g. use MQTT over the wire, and bridge to Kafka)
replicate information between data centers (also see Connector API)
Event Sourcing broker :)
WAL / Commit Log for another system
billing!
"shock absorber" between systems to avoid overload of DBs, APIs,
etc.
in PHP: mostly producing messages; better languages exist for consuming
The End
THANK YOU FOR LISTENING! Questions? Ask me: @dzuelke & [email protected]