Breaking magical barriers

Breaking “Magical Barriers”
Gerhard Lazu @Scale 2021.02

2020 - 0.05M mps (messages per second)
2021 - 1M mps
Equinix Metal c3.small x86, Debian 10, Docker 20.10.2, Erlang 23.2.3, RabbitMQ 3.9.0-alpha.466, 1 publisher, 1 consumer, 1 stream, 1 replica, 12B payload

2014
RabbitMQ Hits One Million Messages Per Second on Google Compute Engine
https://tanzu.vmware.com/content/blog/rabbitmq-hits-one-million-messages-per-second-on-google-compute-engine

2014 vs 2021
Year  Nodes  Conns.  Queues  Durable
2014  32     12,690  186     ❌
2021  1      2       1       ✅

August 2, 2019 - 300mph
https://newsroom.bugatti/en/feature-stories/bugatti-breaks-the-300-mph-barrier

Why is 300mph hard? Why is 1M mps hard?

CPU                   Max Freq.            osiris_writer  Messages
Intel® Xeon® E-2278G  5.0GHz               36.7M rps      1.4M mps
AMD EPYC™ 7601        3.2GHz (2.7GHz all)  23.6M rps      0.7M mps
Your CPU?
See ark.intel.com & amd.com for detailed CPU specifications

How fast is it on your CPU?
# 1. Run RabbitMQ
docker run -it --rm --network host pivotalrabbitmq/rabbitmq-stream
# 2. Run PerfTest (benchmark)
docker run -it --rm --network host pivotalrabbitmq/stream-perf-test
# 3. Find out your max reductions
docker exec -it [rabbitmq-server] rabbitmq-diagnostics -- observer
# rr <ENTER> 10000 <ENTER>
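
Once step 3 gives a reductions-per-second reading for osiris_writer, the two reference CPUs from the previous slide allow a very rough estimate of the message rate to expect. The sketch below assumes that throughput scales linearly with osiris_writer reductions, which the slides suggest but do not state; the function names and the 30M rps sample reading are made up for illustration.

```python
from typing import Tuple

# Reference points from the CPU slide: (osiris_writer reductions/s, messages/s).
REFERENCE = {
    "Intel Xeon E-2278G": (36.7e6, 1.4e6),
    "AMD EPYC 7601": (23.6e6, 0.7e6),
}


def reductions_per_message(reductions_per_sec: float, messages_per_sec: float) -> float:
    """Average number of reductions spent per message."""
    return reductions_per_sec / messages_per_sec


def estimate_mps(measured_reductions_per_sec: float) -> Tuple[float, float]:
    """Return a (low, high) messages-per-second estimate for a measured reading.

    ASSUMPTION: the cost per message on your CPU falls between the costs
    observed on the two reference CPUs, and throughput scales linearly.
    """
    costs = [reductions_per_message(r, m) for r, m in REFERENCE.values()]
    low = measured_reductions_per_sec / max(costs)
    high = measured_reductions_per_sec / min(costs)
    return low, high


if __name__ == "__main__":
    for cpu, (rps, mps) in REFERENCE.items():
        print(f"{cpu}: {reductions_per_message(rps, mps):.1f} reductions per message")
    low, high = estimate_mps(30e6)  # hypothetical reading, not a measurement
    print(f"30M rps -> roughly {low / 1e6:.2f}M to {high / 1e6:.2f}M mps")
```

Treat the result as a sanity check only; running PerfTest (step 2) is the actual measurement.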

Tires ~ Disks
      Write IOPS  Read IOPS  Write MB/s  Read MB/s
HDD   7.5k        7.5k       0.4k        1.2k
SSD   75k         75k        1.2k        1.2k
NVMe  1,200k      2,400k     4.6k        4.6k
https://cloud.google.com/compute/docs/disks/performance

Streams & Disks
             1 Stream  10 Streams  100 Streams
Network HDD  1.1M mps  7.6M mps    9.0M mps
Network SSD  1.1M mps  7.6M mps    9.0M mps
Local NVMe   1.1M mps  7.5M mps    8.6M mps
Disks are not the bottleneck
GCP c2-standard-16, Ubuntu 18.04, Erlang 23.2.1, RabbitMQ 3.8.10+8.g5247909.25+stream, 3 nodes, 3 replicas per stream, 1 publisher & 1 consumer per stream, 12B payload
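
A back-of-the-envelope check shows why the disk type barely matters at this payload size. The sketch below compares the payload bytes written per second with the write bandwidth figures from the "Tires ~ Disks" slide. It assumes 1 MB = 1,000,000 bytes, that every node writes every message (3 replicas on 3 nodes), and it ignores per-message framing and index overhead, so the real write volume is somewhat higher.

```python
PAYLOAD_BYTES = 12

# Write bandwidth in MB/s, taken from the "Tires ~ Disks" slide.
DISK_WRITE_MB_PER_SEC = {"HDD": 400, "SSD": 1200, "NVMe": 4600}


def payload_write_mb_per_sec(messages_per_sec: float,
                             payload_bytes: int = PAYLOAD_BYTES) -> float:
    """Payload bytes written per second, in MB/s (framing overhead ignored)."""
    return messages_per_sec * payload_bytes / 1e6


def disk_headroom(messages_per_sec: float, disk: str) -> float:
    """How many times over the disk's write bandwidth covers the payload volume."""
    return DISK_WRITE_MB_PER_SEC[disk] / payload_write_mb_per_sec(messages_per_sec)


if __name__ == "__main__":
    rate = 9.0e6  # best result on the slide: 100 streams
    print(f"payload volume: {payload_write_mb_per_sec(rate):.0f} MB/s")
    for disk in DISK_WRITE_MB_PER_SEC:
        print(f"{disk}: {disk_headroom(rate, disk):.1f}x headroom")
```

Even at 9.0M mps the 12B payloads add up to about 108 MB/s, well under the slowest disk's write bandwidth, which is consistent with the near-identical results across disk types.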

What about a different payload size?
—How realistic is 12B?
—8kB sounds more real-world
—1M mps @ 8kB translates to 8k MB/s (64Gbps)
—replicated 2x & streamed to consumers
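
The arithmetic behind these numbers can be sketched as follows, using decimal units (1 kB = 1,000 bytes) to match the 8k MB/s figure. The egress calculation is an assumption about how to read "replicated 2x & streamed to consumers": the node holding the stream sends each message to 2 replicas and to a hypothetical single consumer.

```python
from typing import Tuple


def payload_bandwidth(messages_per_sec: int, payload_bytes: int) -> Tuple[float, float]:
    """Return (MB/s, Gbps) for the given message rate and payload size."""
    bytes_per_sec = messages_per_sec * payload_bytes
    return bytes_per_sec / 1e6, bytes_per_sec * 8 / 1e9


def leader_egress_gbps(messages_per_sec: int, payload_bytes: int,
                       replica_copies: int = 2, consumers: int = 1) -> float:
    """Outbound traffic if each message goes to every replica and every consumer.

    ASSUMPTION: one outbound copy per replica and per consumer, no batching
    or compression, and no protocol overhead.
    """
    _, gbps = payload_bandwidth(messages_per_sec, payload_bytes)
    return gbps * (replica_copies + consumers)


if __name__ == "__main__":
    mb_per_sec, gbps = payload_bandwidth(1_000_000, 8_000)
    print(f"inbound: {mb_per_sec:.0f} MB/s = {gbps:.0f} Gbps")
    print(f"outbound: {leader_egress_gbps(1_000_000, 8_000):.0f} Gbps")
```

Under these assumptions a single publisher at 1M mps needs 64 Gbps inbound and three times that outbound, so at 8kB the network becomes the limit long before the message rate does.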

Aerodynamics

Classic Mirrored Queue max throughput
0.015M mps
1 publisher, 1 consumer, 12B payload, 3 replicas
GCP c2-standard-16, Ubuntu 18.04, Erlang 23.2.1, RabbitMQ 3.8.10+8.g5247909.25+stream, 3 nodes
https://rabbitmq.com/blog/category/performance-2/

Quorum Queue max throughput
0.030M mps - 2x
1 publisher, 1 consumer, 12B payload, 3 replicas
GCP c2-standard-16, Ubuntu 18.04, Erlang 23.2.1, RabbitMQ 3.8.10+8.g5247909.25+stream, 3 nodes
https://rabbitmq.com/blog/category/performance-2

Stream max throughput
1.1M mps - 36x
1 publisher, 1 consumer, 12B payload, 3 replicas
GCP c2-standard-16, Ubuntu 18.04, Erlang 23.2.1, RabbitMQ 3.8.10+8.g5247909.25+stream, 3 nodes

What made the biggest difference? Binary protocol
                 1 Stream
Binary protocol  1.113M mps (40x)
AMQP 0.9.1       0.027M mps
GCP c2-standard-16, Ubuntu 18.04, Erlang 23.2.1, RabbitMQ 3.8.10+8.g5247909.25+stream, 3 nodes, 3 replicas per stream, 1 publisher & 1 consumer per stream, 12B payload
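
The multipliers on the last few slides are easier to compare once the baseline of each is explicit. The short sketch below recomputes them from the throughput figures quoted on the slides; the only assumption is which baseline each slide uses, inferred from the numbers themselves.

```python
def speedup(new_mps: float, old_mps: float) -> float:
    """Throughput ratio between two measurements."""
    return new_mps / old_mps


# Throughput figures quoted on the slides, in messages per second.
CLASSIC_MIRRORED = 0.015e6
QUORUM = 0.030e6
STREAM = 1.1e6
STREAM_BINARY_PROTOCOL = 1.113e6
STREAM_AMQP_091 = 0.027e6

if __name__ == "__main__":
    print(f"quorum vs classic mirrored: {speedup(QUORUM, CLASSIC_MIRRORED):.1f}x")
    print(f"stream vs quorum:           {speedup(STREAM, QUORUM):.1f}x")
    print(f"stream vs classic mirrored: {speedup(STREAM, CLASSIC_MIRRORED):.1f}x")
    print(f"binary protocol vs AMQP:    {speedup(STREAM_BINARY_PROTOCOL, STREAM_AMQP_091):.1f}x")
```

Read this way, the 36x on the Stream slide is relative to quorum queues rather than classic mirrored queues, and the 40x for the binary protocol is a rounded-down figure.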

What is a Stream?
—A durable, replicated log of messages
—Much simpler data structures than a queue
—With message replay / time-travelling
—Built for large fan-outs (many consumers)
—Intended for deep backlogs (billions of messages)
—Speed is an unintended feature
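
To make the "log, not queue" idea concrete, here is a toy in-memory model. It is not how RabbitMQ implements streams: it is neither durable nor replicated, and the class and method names are invented for this sketch. It only illustrates why an append-only log with per-consumer offsets gives replay and cheap fan-out: reading never removes anything.

```python
from typing import Dict, List


class ToyStream:
    """In-memory append-only log; every consumer tracks its own offset."""

    def __init__(self) -> None:
        self._log: List[bytes] = []
        self._offsets: Dict[str, int] = {}

    def publish(self, message: bytes) -> int:
        """Append a message and return its offset in the log."""
        self._log.append(message)
        return len(self._log) - 1

    def subscribe(self, consumer: str, offset: int = 0) -> None:
        """Attach a consumer at any offset; offset 0 replays from the start."""
        self._offsets[consumer] = offset

    def poll(self, consumer: str, max_messages: int = 10) -> List[bytes]:
        """Return the next batch for this consumer without deleting anything."""
        start = self._offsets[consumer]
        batch = self._log[start:start + max_messages]
        self._offsets[consumer] = start + len(batch)
        return batch


if __name__ == "__main__":
    stream = ToyStream()
    for payload in (b"a", b"b", b"c"):
        stream.publish(payload)

    stream.subscribe("first")
    stream.subscribe("second", offset=1)  # a late joiner starting mid-log
    print(stream.poll("first"))
    print(stream.poll("second"))

    stream.subscribe("first", offset=0)   # replay: rewind the same consumer
    print(stream.poll("first"))
```

A queue has to track per-message state such as acknowledgements and removal; here the only per-consumer state is one integer, which is the sense in which the data structure is simpler.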

https://tgi.rabbitmq.com

More is coming
—Super Streams that scale better, horizontally
—Erlang v24 JIT with up to 50% more performance
—NVMe storage is becoming more common
—ARM with more instruction decoders & cores
—Better platforms for testing & benchmarking

What if you don't need a car?
Gerhard Lazu @Scale 2021.02
