Breaking magical barriers

Slide 1

Slide 1 text

Breaking “Magical Barriers”
Gerhard Lazu
@Scale 2021.02

Slide 2

Slide 2 text

2020 - 0.05M mps (messages per second)
2021 - 1M mps
Equinix Metal c3.small x86, Debian 10, Docker 20.10.2, Erlang 23.2.3, RabbitMQ 3.9.0-alpha.466, 1 publisher, 1 consumer, 1 stream, 1 replica, 12B payload

Slide 3

Slide 3 text

2014
RabbitMQ Hits One Million Messages Per Second on Google Compute Engine
https://tanzu.vmware.com/content/blog/rabbitmq-hits-one-million-messages-per-second-on-google-compute-engine

Slide 4

Slide 4 text

2014 vs 2021

Year  Nodes  Conns.  Queues  Durable
2014  32     12,690  186     ❌
2021  1      2       1       ✅

Slide 5

Slide 5 text

August 2, 2019 - 300mph
https://newsroom.bugatti/en/feature-stories/bugatti-breaks-the-300-mph-barrier

Slide 7

Slide 7 text

Why is 300mph hard?
Why is 1M mps hard?

Slide 13

Slide 13 text

CPU                   Max Freq.                  osiris_writer  Messages
Intel® Xeon® E-2278G  5.0GHz                     36.7M rps      1.4M mps
AMD EPYC™ 7601        3.2GHz (2.7GHz all cores)  23.6M rps      0.7M mps
Your CPU?

See ark.intel.com & amd.com for detailed CPU specifications
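One way to read this table is as a cost per message: dividing the osiris_writer rate by the message rate gives a rough reductions-per-message figure for each CPU. The sketch below does that arithmetic. It assumes "rps" means Erlang reductions per second, which the "max reductions" step on the next slide suggests but the slide does not spell out; the helper name is illustrative.

```python
def reductions_per_message(reductions_per_sec: float, messages_per_sec: float) -> float:
    """Rough cost of one message in Erlang reductions (illustrative helper)."""
    return reductions_per_sec / messages_per_sec


# Figures copied from the CPU table on this slide.
# Assumption: "rps" is reductions per second for the osiris_writer process.
cpus = {
    "Intel Xeon E-2278G": (36.7e6, 1.4e6),
    "AMD EPYC 7601": (23.6e6, 0.7e6),
}

for name, (rps, mps) in cpus.items():
    cost = reductions_per_message(rps, mps)
    print(f"{name}: {cost:.1f} reductions per message")
```

With these figures the result is about 26 reductions per message on the Xeon and about 34 on the EPYC. If the assumption holds, measuring the reduction rate on another CPU and dividing by a figure in that range gives a first guess at its message rate, not a prediction.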

Slide 14

Slide 14 text

How fast is it on your CPU?

# 1. Run RabbitMQ
docker run -it --rm --network host pivotalrabbitmq/rabbitmq-stream

# 2. Run PerfTest (benchmark)
docker run -it --rm --network host pivotalrabbitmq/stream-perf-test

# 3. Find out your max reductions
docker exec -it [rabbitmq-server] rabbitmq-diagnostics -- observer
# rr 10000

Slide 15

Slide 15 text

Tires

Slide 16

Slide 16 text

Tires

Slide 17

Slide 17 text

Tires ~ Disks

      Write IOPS  Read IOPS  Write MB/s  Read MB/s
HDD   7.5k        7.5k       0.4k        1.2k
SSD   75k         75k        1.2k        1.2k
NVMe  1,200k      2,400k     4.6k        4.6k

https://cloud.google.com/compute/docs/disks/performance

Slide 18

Slide 18 text

Streams & Disks

             1 Stream  10 Streams  100 Streams
Network HDD  1.1M mps  7.6M mps    9.0M mps
Network SSD  1.1M mps  7.6M mps    9.0M mps
Local NVMe   1.1M mps  7.5M mps    8.6M mps

GCP c2-standard-16, Ubuntu 18.04, Erlang 23.2.1, RabbitMQ 3.8.10+8.g5247909.25+stream, 3 nodes, 3 replicas per stream, 1 publisher & 1 consumer per stream, 12B payload

Slide 19

Slide 19 text

Streams & Disks

             1 Stream  10 Streams  100 Streams
Network HDD  1.1M mps  7.6M mps    9.0M mps
Network SSD  1.1M mps  7.6M mps    9.0M mps
Local NVMe   1.1M mps  7.5M mps    8.6M mps

Disks are not the bottleneck

GCP c2-standard-16, Ubuntu 18.04, Erlang 23.2.1, RabbitMQ 3.8.10+8.g5247909.25+stream, 3 nodes, 3 replicas per stream, 1 publisher & 1 consumer per stream, 12B payload
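A back-of-the-envelope check is consistent with the conclusion that disks are not the bottleneck: at a 12B payload, even the highest message rate in the table amounts to a small fraction of what the slowest disk can write. The sketch below assumes the write figures from the earlier "Tires ~ Disks" slide apply to the volumes used here, counts payload bytes only (no per-message framing or index overhead), and uses decimal units; the names are illustrative.

```python
PAYLOAD_BYTES = 12  # payload size used in the benchmark

# Write throughput per disk type in MB/s, from the "Tires ~ Disks" slide.
DISK_WRITE_MB_S = {"HDD": 400, "SSD": 1200, "NVMe": 4600}


def payload_mb_per_sec(messages_per_sec: float, payload_bytes: int = PAYLOAD_BYTES) -> float:
    """Payload volume in MB/s (1 MB = 1e6 bytes), ignoring framing overhead."""
    return messages_per_sec * payload_bytes / 1e6


# Highest rate in the table: 9.0M mps with 100 streams.
peak = payload_mb_per_sec(9.0e6)

for disk, limit in DISK_WRITE_MB_S.items():
    share = peak / limit
    print(f"{disk}: {peak:.0f} MB/s of payload vs {limit} MB/s write limit ({share:.0%})")
```

At 9.0M mps the payload stream is 108 MB/s, roughly a quarter of the HDD write figure, which fits with the three disk types producing near-identical results.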

Slide 20

Slide 20 text

What about a different payload size?
- How realistic is 12B?
- 8kB sounds more real-world
- 1M mps @ 8kB translates to 8k MB/s (64Gbps)
- replicated 2x & streamed to consumers
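The bandwidth figure on this slide can be reproduced with a few lines of arithmetic. The sketch below uses decimal units (8kB = 8,000 bytes). The egress estimate at the end is an assumption layered on top of the slide: it reads "replicated 2x" as two follower copies and adds a single consumer.

```python
def bandwidth(messages_per_sec: float, payload_bytes: int) -> tuple[float, float]:
    """Return (MB/s, Gbps) for a message rate and payload size, decimal units."""
    bytes_per_sec = messages_per_sec * payload_bytes
    return bytes_per_sec / 1e6, bytes_per_sec * 8 / 1e9


mb_s, gbps = bandwidth(1_000_000, 8_000)
print(f"ingress: {mb_s:,.0f} MB/s ({gbps:.0f} Gbps)")

# Assumption: the leader forwards every message to 2 replicas and 1 consumer.
copies_out = 2 + 1
print(f"leader egress: {gbps * copies_out:.0f} Gbps")
```

Under that assumption the node holding the stream leader would need about 192 Gbps outbound on top of 64 Gbps inbound, so at this payload size the network becomes the constraint well before 1M mps.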

Slide 21

Slide 21 text

Aerodynamics

Slide 25

Slide 25 text

Classic Mirrored Queue
max throughput
0.015M mps
1 publisher, 1 consumer, 12B payload, 3 replicas
GCP c2-standard-16, Ubuntu 18.04, Erlang 23.2.1, RabbitMQ 3.8.10+8.g5247909.25+stream, 3 nodes
https://rabbitmq.com/blog/category/performance-2/

Slide 30

Slide 30 text

Quorum Queue
max throughput
0.030M mps - 2x
1 publisher, 1 consumer, 12B payload, 3 replicas
GCP c2-standard-16, Ubuntu 18.04, Erlang 23.2.1, RabbitMQ 3.8.10+8.g5247909.25+stream, 3 nodes
https://rabbitmq.com/blog/category/performance-2

Slide 32

Slide 32 text

Stream
max throughput
1.1M mps - 36x
1 publisher, 1 consumer, 12B payload, 3 replicas
GCP c2-standard-16, Ubuntu 18.04, Erlang 23.2.1, RabbitMQ 3.8.10+8.g5247909.25+stream, 3 nodes
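The multipliers on the Classic Mirrored Queue, Quorum Queue and Stream slides appear to be step-over-step ratios, each compared with the previous queue type. The sketch below recomputes them from the quoted throughput figures; the reading of "36x" as "relative to Quorum Queue" is an inference from the numbers, not something the slides state.

```python
# Max throughput in messages per second, as quoted on the three slides.
throughput = {
    "Classic Mirrored Queue": 0.015e6,
    "Quorum Queue": 0.030e6,
    "Stream": 1.1e6,
}


def speedup(new: float, old: float) -> float:
    """How many times faster `new` is than `old`."""
    return new / old


names = list(throughput)
for prev, curr in zip(names, names[1:]):
    ratio = speedup(throughput[curr], throughput[prev])
    print(f"{curr} vs {prev}: {ratio:.1f}x")

overall = speedup(throughput["Stream"], throughput["Classic Mirrored Queue"])
print(f"Stream vs Classic Mirrored Queue: {overall:.1f}x")
```

The step ratios come out at 2.0x and about 36.7x, matching the slides' 2x and 36x, while the stream is roughly 73x faster than a classic mirrored queue overall.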

Slide 35

Slide 35 text

What made the biggest difference?
Binary protocol

                 1 Stream
Binary protocol  1.113M mps (40x)
AMQP 0.9.1       0.027M mps

GCP c2-standard-16, Ubuntu 18.04, Erlang 23.2.1, RabbitMQ 3.8.10+8.g5247909.25+stream, 3 nodes, 3 replicas per stream, 1 publisher & 1 consumer per stream, 12B payload

Slide 36

Slide 36 text

What is a Stream?
- A durable, replicated log of messages
- Much simpler data structures than a queue
- With message replay / time-travelling
- Built for large fan-outs (many consumers)
- Intended for deep backlogs (billions of messages)
- Speed is an unintended feature
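The log idea behind these points can be illustrated with a toy model: messages are only ever appended, reads do not remove anything, and each consumer keeps its own offset. The sketch below is an in-memory illustration under those assumptions, not how RabbitMQ implements streams; durability and replication are left out, and `ToyStream` and its methods are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class ToyStream:
    """In-memory append-only log. Illustrative only; no durability or replication."""

    _log: list[bytes] = field(default_factory=list)

    def append(self, message: bytes) -> int:
        """Append a message and return its offset in the log."""
        self._log.append(message)
        return len(self._log) - 1

    def read(self, offset: int, max_messages: int = 10) -> list[bytes]:
        """Non-destructive read starting at `offset`; messages stay in the log."""
        return self._log[offset:offset + max_messages]


stream = ToyStream()
for i in range(5):
    stream.append(f"msg-{i}".encode())

# Fan-out: each consumer tracks its own offset, so reads never interfere.
offsets = {"consumer-a": 0, "consumer-b": 3}
for name, offset in offsets.items():
    print(name, stream.read(offset))

# Replay ("time-travelling"): a consumer rewinds by reading from an older offset.
print("replay", stream.read(0, max_messages=2))
```

Because a read never mutates the log, adding consumers costs no extra bookkeeping on the write path, which is one plausible reading of "much simpler data structures than a queue".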

Slide 37

Slide 37 text

https://tgi.rabbitmq.com

Slide 39

Slide 39 text

More is coming
- Super Streams that scale better, horizontally
- Erlang v24 JIT with up to 50% more performance
- NVMe storage is becoming more common
- ARM with more instruction decoders & cores
- Better platforms for testing & benchmarking

Slide 40

Slide 40 text

What if you don't need a car?
Gerhard Lazu
@Scale 2021.02