Kafka Will Get The Message Across, Guaranteed.
David Zuelke
January 28, 2017
Presentation given at PHP Benelux 2017 near Antwerp, Belgium.
Transcript
KAFKA WILL GET THE MESSAGE ACROSS. GUARANTEED.
PHP Benelux 2017, Belgium
David Zuelke
[email protected]
@dzuelke
KAFKA
LinkedIn
APACHE KAFKA
"uh oh, another Apache project?!"
KEEP CALM AND LOOK AT THE WEBSITE
"Basically it is a massively scalable pub/sub message queue architected as a distributed transaction log."
"so it's a queue?"
it's not just a queue
queues are not multi-subscriber :(
"so it's a pubsub thing?"
it's not just a pubsub thing
pubsub broadcasts to all subscribers :(
it's a log
not that kind of log
WAL
Write-Ahead Log
WRITE-AHEAD LOG
1 foo
2 bar
3 baz
4 hi
1 create document: "foo", data: "…"
2 update document: "foo", data: "…"
3 create document: "bar", data: "…"
4 remove document: "foo"
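To make the numbered operations above concrete, here is a minimal sketch of a write-ahead log. It assumes a plain in-memory list and a hypothetical `WriteAheadLog` class (neither is from the talk): every change is appended as an entry, and the current state is rebuilt by replaying the entries in order.

```php
<?php
// Hypothetical in-memory write-ahead log; all names are illustrative only.
final class WriteAheadLog
{
    /** @var array[] ordered entries; position + 1 is the sequence number */
    private $entries = [];

    /** Append an operation and return its sequence number (starting at 1). */
    public function append(string $op, string $id, $data = null): int
    {
        $this->entries[] = ['op' => $op, 'id' => $id, 'data' => $data];
        return count($this->entries);
    }

    /** Rebuild the document store by replaying every entry in order. */
    public function replay(): array
    {
        $docs = [];
        foreach ($this->entries as $entry) {
            if ($entry['op'] === 'remove') {
                unset($docs[$entry['id']]);
            } else {
                // "create" and "update" both set the latest data
                $docs[$entry['id']] = $entry['data'];
            }
        }
        return $docs;
    }
}

$wal = new WriteAheadLog();
$wal->append('create', 'foo', 'v1');
$wal->append('update', 'foo', 'v2');
$wal->append('create', 'bar', 'v1');
$wal->append('remove', 'foo');
$state = $wal->replay();
print_r($state); // only "bar" remains
```

Nothing is ever edited in place: even the removal is just one more entry at the end of the log.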
never corrupts
sequential I/O
sequential I/O
every message will be read at least once, no random access
FileChannel.transferTo (shovels data straight from e.g. the disk cache to the network interface, with no extra copy through application memory)
"HI, I AM KAFKA" "Buckle up while we process (m|b|tr)illions of messages/s."
TOPICS
streams of records
1 2 3 4 5 6 7 …
1 2 3 4 5 6 7 8 …
producer writes, consumer reads
can have many subscribers
1 2 3 4 5 6 7 8 …
producer writes, consumerB reads, consumerA reads
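A rough sketch of why one stream can serve many subscribers: reading does not remove anything, each consumer only advances its own position. The `Topic` class below is a hypothetical in-memory stand-in, not Kafka's API, and assumes PHP 7.1 or later.

```php
<?php
// Hypothetical in-memory topic: one append-only list, one offset per consumer.
final class Topic
{
    private $records = [];
    private $offsets = []; // consumer name => index of the next record to read

    public function produce(string $payload): void
    {
        $this->records[] = $payload;
    }

    /** Return the next unread record for this consumer, or null if caught up. */
    public function consume(string $consumer): ?string
    {
        $offset = $this->offsets[$consumer] ?? 0;
        if (!isset($this->records[$offset])) {
            return null;
        }
        $this->offsets[$consumer] = $offset + 1;
        return $this->records[$offset];
    }
}

$topic = new Topic();
foreach (['r1', 'r2', 'r3'] as $record) {
    $topic->produce($record);
}

$a1 = $topic->consume('consumerA'); // consumerA starts at the beginning
$a2 = $topic->consume('consumerA');
$b1 = $topic->consume('consumerB'); // consumerB is unaffected by consumerA's reads
```

Here `consumerA` is already two records in while `consumerB` still gets the first record, because the only per-consumer state is an offset.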
can be partitioned
P0: 1 2 3 4 5 6 7 …
P1: 1 2 3 4 …
P2: 1 2 3 4 5 6 7 8 …
P3: 1 2 3 4 5 6 …
partitions let you scale storage!
partitions let you scale consuming!
all records are retained, whether consumed or not, up to a configurable limit
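Retention can be pictured with a small sketch. It assumes a simple "keep the newest N records" limit and a hypothetical `RetainedLog` class; Kafka itself expresses the limit as time or size (for example the `retention.ms` and `retention.bytes` topic settings), which this does not model.

```php
<?php
// Hypothetical retention sketch: keeps at most $maxRecords, oldest dropped first.
final class RetainedLog
{
    private $records = [];   // absolute offset => payload
    private $nextOffset = 0;
    private $maxRecords;

    public function __construct(int $maxRecords)
    {
        $this->maxRecords = $maxRecords; // assumed to be at least 1
    }

    /** Append a record and return its absolute offset. */
    public function append(string $payload): int
    {
        $offset = $this->nextOffset++;
        $this->records[$offset] = $payload;
        // Expiry depends only on the limit, never on whether anyone read the record.
        unset($this->records[$offset - $this->maxRecords]);
        return $offset;
    }

    /** Return the record at $offset, or null if it is not written yet or has expired. */
    public function read(int $offset): ?string
    {
        return $this->records[$offset] ?? null;
    }
}

$log = new RetainedLog(3);
foreach (['a', 'b', 'c', 'd', 'e'] as $payload) {
    $log->append($payload);
}
$expired = $log->read(0); // null: dropped by the limit
$kept = $log->read(2);    // "c": still retained, consumed or not
```

Offsets stay absolute, so a consumer's stored position remains meaningful after older records have expired.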
PRODUCERS
byte[]
(typically JSON, XML, Avro, Thrift, Protobufs)
(typically not funny GIFs)
can choose explicit partition, or a key (which is used for auto-partitioning)
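Key-based partitioning boils down to "hash the key, take it modulo the partition count". The function below is an illustrative sketch with a hypothetical name (`choosePartition`); the partitioner a real client uses may differ in its hash and its handling of empty keys.

```php
<?php
// Illustrative partitioner, not the client library's implementation.
function choosePartition(string $key, int $partitionCount): int
{
    // Mask to 31 bits so the result is non-negative on 32-bit and 64-bit builds.
    return (crc32($key) & 0x7fffffff) % $partitionCount;
}

$first = choosePartition('user-42', 4);
$second = choosePartition('user-42', 4); // same key, same partition
```

Because the same key always lands in the same partition, all records for one user keep their relative order.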
https://github.com/edenhill/librdkafka & https://arnaud-lb.github.io/php-rdkafka/
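BASIC PRODUCER
$rk = new RdKafka\Producer();
$rk->addBrokers("127.0.0.1");
$topic = $rk->newTopic("test");

// No partition and no key: the client picks a partition
$topic->produce(RD_KAFKA_PARTITION_UA, 0, "Unassigned partition, let Kafka choose");
// No partition, but a key: the same key always maps to the same partition
// ($user is an application object; the key must be a string)
$topic->produce(RD_KAFKA_PARTITION_UA, 0, "Yay consistent hashing", (string) $user->getId());
// Explicit partition
$topic->produce(1, 0, "This will always be sent to partition 1");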
CONSUMERS
cheap
only metadata stored per consumer: offset
guaranteed to always have messages in right order (within a partition)
can themselves produce new messages! (but there is also a Streams API for pure transformations)
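BASIC CONSUMER
$conf = new RdKafka\Conf();
$conf->set('group.id', 'myConsumerGroup');

$rk = new RdKafka\Consumer($conf);
$rk->addBrokers("127.0.0.1");

$topicConf = new RdKafka\TopicConf();
$topicConf->set('auto.commit.interval.ms', '100'); // config values are strings
$topic = $rk->newTopic("test", $topicConf);

// Read partition 0, resuming from the offset stored for this group
$topic->consumeStart(0, RD_KAFKA_OFFSET_STORED);
while (true) {
    $msg = $topic->consume(0, 120*10000); // timeout in milliseconds
    if ($msg === null || $msg->err) { continue; } // timeout or error: nothing to process
    do_something($msg);
}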
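AT-MOST ONCE DELIVERY
$conf = new RdKafka\Conf();
$conf->set('group.id', 'myConsumerGroup');

$rk = new RdKafka\Consumer($conf);
$rk->addBrokers("127.0.0.1");

$topicConf = new RdKafka\TopicConf();
$topicConf->set('auto.commit.enable', 'false'); // config values are strings; offsets are stored manually below
$topic = $rk->newTopic("test", $topicConf);

$topic->consumeStart(0, RD_KAFKA_OFFSET_STORED);
while (true) {
    $msg = $topic->consume(0, 120*10000);
    if ($msg === null || $msg->err) { continue; } // timeout or error: nothing to process
    // Store the offset BEFORE processing: a crash in do_something() loses the message
    $topic->offsetStore($msg->partition, $msg->offset);
    do_something($msg);
}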
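AT-LEAST ONCE DELIVERY
$conf = new RdKafka\Conf();
$conf->set('group.id', 'myConsumerGroup');

$rk = new RdKafka\Consumer($conf);
$rk->addBrokers("127.0.0.1");

$topicConf = new RdKafka\TopicConf();
$topicConf->set('auto.commit.enable', 'false'); // config values are strings; offsets are stored manually below
$topic = $rk->newTopic("test", $topicConf);

$topic->consumeStart(0, RD_KAFKA_OFFSET_STORED);
while (true) {
    $msg = $topic->consume(0, 120*10000);
    if ($msg === null || $msg->err) { continue; } // timeout or error: nothing to process
    do_something($msg);
    // Store the offset AFTER processing: a crash before this line redelivers the message
    $topic->offsetStore($msg->partition, $msg->offset);
}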
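The two loops above differ only in whether the offset is stored before or after processing. The effect of that ordering can be seen in a small simulation that needs no broker. Everything here is hypothetical (the `runConsumer` function, the array standing in for a partition, the simulated crash); it assumes PHP 7.1 or later.

```php
<?php
// Hypothetical simulation: $records is one partition, $committed the stored offset.
// $crashAt is the offset at which the consumer dies between its two steps (null: never).
function runConsumer(array $records, int &$committed, array &$processed, bool $commitFirst, ?int $crashAt): void
{
    while ($committed < count($records)) {
        $offset = $committed;
        $crash = ($offset === $crashAt);
        if ($commitFirst) {
            $committed = $offset + 1;          // store offset, then process
            if ($crash) { return; }            // crash before processing: record is lost
            $processed[] = $records[$offset];
        } else {
            $processed[] = $records[$offset];  // process, then store offset
            if ($crash) { return; }            // crash before storing: record is redelivered
            $committed = $offset + 1;
        }
    }
}

$records = ['a', 'b', 'c'];

// At-most-once: crash at offset 1, then restart from the stored offset.
$committed = 0;
$atMostOnce = [];
runConsumer($records, $committed, $atMostOnce, true, 1);
runConsumer($records, $committed, $atMostOnce, true, null);

// At-least-once: same crash, same restart.
$committed = 0;
$atLeastOnce = [];
runConsumer($records, $committed, $atLeastOnce, false, 1);
runConsumer($records, $committed, $atLeastOnce, false, null);
```

After the restart, `$atMostOnce` is missing "b" while `$atLeastOnce` contains "b" twice: the same crash either drops a record or duplicates it, depending only on the order of the two steps.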
EXACTLY-ONCE DELIVERY
you cannot have exactly-once delivery
THE TWO GENERALS' PROBLEM
"Together we can beat the monsters. Let's both attack at 07:00?"
"Confirm, we attack at 07:00." ☠
USE CASES
• LinkedIn
• Yahoo
• Twitter
• Netflix
• Square
• Spotify
• Pinterest
• Uber
• Goldman Sachs
• Tumblr
• PayPal
• Airbnb
• Mozilla
• Cisco
• Etsy
• Foursquare
• Shopify
• CloudFlare
ingest the Twitter firehose and turn it into a pointless demo ;)
messaging, of course
track user activity
record runtime metrics
aggregate logs
IoT (you could still e.g. use MQTT over the wire, and bridge to Kafka)
replicate information between data centers (also see Connector API)
Event Sourcing broker :)
WAL / Commit Log for another system
billing!
"shock absorber" between systems to avoid overload of DBs, APIs, etc.
in PHP: mostly producing messages; better languages exist for consuming
The End
THANK YOU FOR LISTENING! Questions? Ask me: @dzuelke & [email protected]