Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Kafka Will Get The Message Across, Guaranteed.
Search
David Zuelke
January 28, 2017
Programming
0
220
Kafka Will Get The Message Across, Guaranteed.
Presentation given at PHP Benelux 2017 near Antwerp, Belgium.
David Zuelke
January 28, 2017
Tweet
Share
More Decks by David Zuelke
See All by David Zuelke
Your next Web server will be written in... PHP
dzuelke
0
120
Getting Things Done
dzuelke
1
360
Your next Web server will be written in... PHP
dzuelke
2
230
Your next Web server will be written in... PHP
dzuelke
3
960
Kafka Will Get The Message Across, Guaranteed.
dzuelke
0
690
Heroku at BattleHack Venice 2015
dzuelke
0
110
Designing HTTP Interfaces and RESTful Web Services
dzuelke
6
1.2k
The Twelve-Factor App: Best Practices for Modern Web Applications
dzuelke
4
330
Designing HTTP Interfaces and RESTful Web Services
dzuelke
6
450
Other Decks in Programming
See All in Programming
Behind VS Code Extensions for JavaScript / TypeScript Linnting and Formatting
unvalley
5
920
Amazon SQSコンシューマー疎結合への旅 - 出張! #DevelopersIO IT技術ブログの中の人が語る勉強会 #3
quiver
0
270
新宿ダンジョンを可視化してみた
satoshi7190
2
260
はてなにおける CSS Modules、及び CSS Modules に足りないもの / CSS Modules in Hatena, and CSS Modules missing parts
mizdra
7
930
Blue/Greenデプロイの導入による 運用フローの改善
kudoas
1
380
冗長なエラーログを削減し、スタックトレースを手に入れる / Reducing Verbose Error Logs and Obtaining Stack Traces
upamune
0
740
#phpcon_odawara オープン・クローズドなテストフィクスチャを求めて / open closed test fixtures
77web
3
230
ADRを一年運用してみた/adr_after_a_year
hanhan1978
7
2.4k
雑に思考を整理する技術と効能
konifar
60
29k
Random\Randomizer クラスで日常のあれこれを解決しよう! / Random\Randomizer class solves familiar trouble
cocoeyes02
0
240
Milestoner
bkuhlmann
1
410
Ruby GitHub Packages
bkuhlmann
0
630
Featured
See All Featured
Code Review Best Practice
trishagee
55
15k
Helping Users Find Their Own Way: Creating Modern Search Experiences
danielanewman
20
1.9k
ReactJS: Keep Simple. Everything can be a component!
pedronauck
659
120k
Designing Experiences People Love
moore
136
23k
Web Components: a chance to create the future
zenorocha
305
41k
Responsive Adventures: Dirty Tricks From The Dark Corners of Front-End
smashingmag
244
20k
[Rails World 2023 - Day 1 Closing Keynote] - The Magic of Rails
eileencodes
2
1.3k
ピンチをチャンスに:未来をつくるプロダクトロードマップ #pmconf2020
aki_iinuma
78
42k
The Psychology of Web Performance [Beyond Tellerrand 2023]
tammyeverts
6
1.5k
Infographics Made Easy
chrislema
238
18k
A Modern Web Designer's Workflow
chriscoyier
689
190k
Build The Right Thing And Hit Your Dates
maggiecrowley
24
2k
Transcript
KAFKA WILL GET THE MESSAGE ACROSS. GUARANTEED. PHP Benelux 2017
Belgium
David Zuelke
None
[email protected]
@dzuelke
KAFKA
LinkedIn
APACHE KAFKA
"uh oh, another Apache project?!"
None
KEEP CALM AND LOOK AT THE WEBSITE
None
"Basically it is a massively scalable pub/sub message queue. architected
as a distributed transaction log."
"so it's a queue?"
it's not just a queue
queues are not multi-subscriber :(
"so it's a pubsub thing?"
it's not just a pubsub thing
pubsub broadcasts to all subscribers :(
it's a log
None
not that kind of log
WAL
Write-Ahead Log
WRITE-AHEAD LOG
None
1 foo 2 bar 3 baz 4 hi
1 create document: "foo", data: "…" 2 update document: "foo",
data: "…" 3 create document: "bar", data: "…" 4 remove document: "foo"
None
never corrupts
sequential I/O
None
sequential I/O
every message will be read at least once, no random
access
FileChannel.transferTo (shovels data straight from e.g. disk cache to network
interface, no copying via RAM)
"HI, I AM KAFKA" "Buckle up while we process (m|b|tr)illions
of messages/s."
TOPICS
streams of records
1 2 3 4 5 6 7 …
1 2 3 4 5 6 7 8 … producer
writes consumer reads
can have many subscribers
1 2 3 4 5 6 7 8 … producer
writes consumerB reads consumerA reads
can be partitioned
P0 1 2 3 4 5 6 7 … P1
1 2 3 4 … P2 1 2 3 4 5 6 7 8 … P3 1 2 3 4 5 6 …
partitions let you scale storage!
partitions let you scale consuming!
None
all records are retained, whether consumed or not, up to
a configurable limit
PRODUCERS
byte[]
(typically JSON, XML, Avro, Thrift, Protobufs)
(typically not funny GIFs)
can choose explicit partition, or a key (which is used
for auto-partitioning)
https://github.com/edenhill/librdkafka & https://arnaud-lb.github.io/php-rdkafka/
BASIC PRODUCER $rk = new RdKafka\Producer(); $rk->addBrokers("127.0.0.1"); $topic = $rk->newTopic("test");
$topic->produce(RD_KAFKA_PARTITION_UA, 0, "Unassigned partition, let Kafka choose"); $topic->produce(RD_KAFKA_PARTITION_UA, 0, "Yay consistent hashing", $user->getId()); $topic->produce(1, 0, "This will always be sent to partition 1");
CONSUMERS
cheap
only metadata stored per consumer: offset
guaranteed to always have messages in right order (within a
partition)
can themselves produce new messages! (but there is also a
Streams API for pure transformations)
None
BASIC CONSUMER $conf = new RdKafka\Conf(); $conf->set('group.id', 'myConsumerGroup'); $rk =
new RdKafka\Consumer($conf); $rk->addBrokers("127.0.0.1"); $topicConf = new RdKafka\TopicConf(); $topicConf->set('auto.commit.interval.ms', 100); $topic = $rk->newTopic("test", $topicConf); $topic->consumeStart(0, RD_KAFKA_OFFSET_STORED); while (true) { $msg = $topic->consume(0, 120*10000); do_something($msg); }
AT-MOST ONCE DELIVERY $conf = new RdKafka\Conf(); $conf->set('group.id', 'myConsumerGroup'); $rk
= new RdKafka\Consumer($conf); $rk->addBrokers("127.0.0.1"); $topicConf = new RdKafka\TopicConf(); $topicConf->set('auto.commit.enable', false); $topic = $rk->newTopic("test", $topicConf); $topic->consumeStart(0, RD_KAFKA_OFFSET_STORED); while (true) { $msg = $topic->consume(0, 120*10000); $topic->offsetStore($msg->partition, $msg->offset); do_something($msg); }
AT-LEAST ONCE DELIVERY $conf = new RdKafka\Conf(); $conf->set('group.id', 'myConsumerGroup'); $rk
= new RdKafka\Consumer($conf); $rk->addBrokers("127.0.0.1"); $topicConf = new RdKafka\TopicConf(); $topicConf->set('auto.commit.enable', false); $topic = $rk->newTopic("test", $topicConf); $topic->consumeStart(0, RD_KAFKA_OFFSET_STORED); while (true) { $msg = $topic->consume(0, 120*10000); do_something($msg); $topic->offsetStore($msg->partition, $msg->offset); }
EXACTLY-ONCE DELIVERY
you cannot have exactly-once delivery
THE BYZANTINE GENERALS "together we can beat the monsters. let's
both attack at 07:00?" "confirm, we attack at 07:00" ☠
USE CASES
• LinkedIn • Yahoo • Twitter • Netflix • Square
• Spotify • Pinterest • Uber • Goldman Sachs • Tumblr • PayPal • Airbnb • Mozilla • Cisco • Etsy • Foursquare • Shopify • CloudFlare
ingest the Twitter firehose and turn it into a pointless
demo ;)
None
messaging, of course
track user activity
record runtime metrics
aggregate logs
IoT (you could still e.g. use MQTT over the wire,
and bridge to Kafka)
replicate information between data centers (also see Connector API)
Event Sourcing broker :)
WAL / Commit Log for another system
billing!
"shock absorber" between systems to avoid overload of DBs, APIs,
etc.
in PHP: mostly producing messages; better languages exist for consuming
The End
THANK YOU FOR LISTENING! Questions? Ask me: @dzuelke &
[email protected]