Slide 1

Slide 1 text

Queue theory 101, Node.js edition
Avishai Ish-Shalom (@nukemberg)

Slide 2

Slide 2 text

Why are we having this talk?

Slide 3

Slide 3 text

Attack of the killer queues
They are everywhere! In your drivers, your sockets, your event loop! No one is safe.

Slide 4

Slide 4 text

The world is made of distributions
● Distributions have width
● Improbable results do happen
● Aggregate effects vs. particular effects
● A single numeric aggregate cannot capture the behavior

Slide 5

Slide 5 text

Variability/Dispersion
● How “wide” the distribution is
● Various measures: stddev, variance, IQD, MAD...
● Distributions are infinite, our systems are not ⇒ cutoffs, timeouts
● Easy to raise variation, hard to reduce it

Slide 6

Slide 6 text

Variability effects on utilization
Suppose you need to get from Jerusalem to Tel-Aviv:
● Train takes 40 minutes
● Mean delay = 5 minutes
● Delay P90 = 30 minutes
● Delay P99 = 60 minutes
How early should you leave to be in Tel-Aviv by noon? With which SLA? How much time are you wasting in total? (worked numbers below)
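A worked sketch using only the slide's numbers: planning for the P90 delay means leaving 40 + 30 = 70 minutes early, planning for the P99 delay means leaving 40 + 60 = 100 minutes early, yet on a typical day the delay is only about 5 minutes, so most of that buffer is idle time you paid for the tail.

// Sketch: buffer each SLA forces you to book, and how much of it is idle on a typical day.
const travelMin = 40;                       // nominal train ride
const meanDelayMin = 5;                     // typical delay
const delaySLA = { p90: 30, p99: 60 };      // delay percentiles from the slide

for (const [sla, delay] of Object.entries(delaySLA)) {
  const leaveEarlyBy = travelMin + delay;      // minutes before noon you must leave
  const typicalWaste = delay - meanDelayMin;   // buffer unused on an average day
  console.log(`${sla}: leave ${leaveEarlyBy} min early, ~${typicalWaste} min wasted on a typical day`);
}
// p90: leave 70 min early, ~25 min wasted on a typical day
// p99: leave 100 min early, ~55 min wasted on a typical day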

Slide 7

Slide 7 text

The curse of high variation
● Utilization is limited by high variation
● Group work latency follows high percentiles (think Map/Reduce, Fork/Join; see the fan-out sketch below)
● Customer satisfaction follows high percentiles
● Disasters follow tail behavior
● Failure demand (e.g. retries)
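To see why group work follows the high percentiles, consider a request that fans out to N parallel calls and waits for all of them. This is my own illustration, not from the slides; "slow" here is assumed to mean slower than the per-call P99 (probability 1%). The chance that at least one call lands in the tail is 1 - 0.99^N, which already exceeds 63% at N = 100.

// Sketch: probability that a fan-out request is as slow as its slowest call.
const pSlow = 0.01; // assumed: a single call exceeds its P99 with 1% probability
for (const n of [1, 10, 100, 500]) {
  const pAtLeastOneSlow = 1 - Math.pow(1 - pSlow, n);
  console.log(`fan-out ${n}: P(at least one slow call) = ${(pAtLeastOneSlow * 100).toFixed(1)}%`);
}
// fan-out 1: 1.0%, 10: 9.6%, 100: 63.4%, 500: 99.3%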

Slide 8

Slide 8 text

Variance is the engineer’s enemy

Slide 9

Slide 9 text

No content

Slide 10

Slide 10 text

Head of line blocking
● When some task takes longer, the service center is “blocked”
● Other tasks in the queue are blocked by the “head of line”
● A single slow task will cause a bunch of other tasks to wait
  ○ Bad news for high latency percentiles

Slide 11

Slide 11 text

Tasks should be independent, but...
● Shared resources have queues
  ○ Disks, CPUs, thread pools, connection pools, DB locks, sockets, the event loop…
● Event loop phases share the same service center
● Head-of-line blocking → cross-task interaction
  ○ Slow tasks raise latency of unrelated tasks
  ○ Arrival spikes
● High variance makes this worse

Slide 12

Slide 12 text

No content

Slide 13

Slide 13 text

Capacity & latency
● Queue length (and latency) rise to infinity as utilization approaches 1 (see the M/M/1 sketch below)
● Decent latency ⇒ spare capacity (over-provisioning)
● The slower the service, the higher the penalty
ρ = arrival rate / service rate = utilization
Q = queue length
http://queuemulator.gh.scylladb.com/
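A minimal sketch of that blow-up, assuming the classic M/M/1 model (a simplification; the 10 ms service time is an arbitrary choice, not from the slides): mean queue length is ρ/(1 - ρ) and mean response time is the service time divided by (1 - ρ), so both explode as ρ approaches 1.

// Sketch: M/M/1 mean queue length and response time vs. utilization.
const serviceTimeMs = 10; // assumed mean service time
for (const rho of [0.5, 0.7, 0.9, 0.95, 0.99]) {
  const queueLength = rho / (1 - rho);            // mean number of items in the system
  const responseMs = serviceTimeMs / (1 - rho);   // mean time an item spends in the system
  console.log(`ρ=${rho}: Q≈${queueLength.toFixed(1)}, latency≈${responseMs.toFixed(0)}ms`);
}
// ρ=0.5: Q≈1.0, 20ms   ρ=0.9: Q≈9.0, 100ms   ρ=0.99: Q≈99.0, 1000ms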

Slide 14

Slide 14 text

Implications
Infinite queues:
● Memory pressure / OOM
● High latency
● Stale work
Always limit queue size! Work item TTL* (bounded-queue sketch below)
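A minimal sketch of both ideas together (my own illustration, not from the talk; the size and TTL values are arbitrary): a bounded queue that rejects new work when full and silently drops items whose TTL has expired instead of executing stale work.

// Sketch: bounded work queue with a per-item TTL.
class BoundedQueue {
  constructor(maxSize = 1000, ttlMs = 2000) { // illustrative limits
    this.items = [];
    this.maxSize = maxSize;
    this.ttlMs = ttlMs;
  }
  push(task) {
    if (this.items.length >= this.maxSize) {
      return false; // reject: the caller should shed load or signal backpressure
    }
    this.items.push({ task, enqueuedAt: Date.now() });
    return true;
  }
  pop() {
    while (this.items.length > 0) {
      const { task, enqueuedAt } = this.items.shift();
      if (Date.now() - enqueuedAt <= this.ttlMs) return task;
      // else: stale item, drop it and keep looking
    }
    return undefined;
  }
}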

Slide 15

Slide 15 text

Utilization fluctuates
● A 10% fluctuation at ρ = 0.5 will hardly affect latency (~1.1x)
● A 10% fluctuation at ρ = 0.9 will kill you (~10x latency)
● Be careful when overloading resources
● During peak load we must be extra careful
● Highly varied load must be capped
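The factors behind those first two bullets fall straight out of the 1/(1 - ρ) latency term from the previous sketch: nudging ρ from 0.5 to 0.55 moves the multiplier from 2 to about 2.2 (~1.1x), while nudging ρ from 0.9 to 0.99 moves it from 10 to 100 (~10x).

// Sketch: latency multiplier 1/(1 - ρ) before and after a 10% utilization bump.
for (const rho of [0.5, 0.9]) {
  const before = 1 / (1 - rho);
  const after = 1 / (1 - rho * 1.1);
  console.log(`ρ=${rho} → ${(rho * 1.1).toFixed(2)}: latency x${(after / before).toFixed(1)}`);
}
// ρ=0.5 → 0.55: latency x1.1
// ρ=0.9 → 0.99: latency x10.0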

Slide 16

Slide 16 text

Kingman formula
● The higher the variance, the worse the latency/utilization curve gets
● This holds for variance in both service rate and arrival rate
● High variance ⇒ run at low utilization
* Oh and btw your percentile curve is worse too
Queuemulator
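For reference, Kingman's approximation for the mean wait time in a G/G/1 queue, in its standard form (τ is the mean service time; c_a and c_s are the coefficients of variation of inter-arrival and service times):

\[
\mathbb{E}[W_q] \;\approx\; \frac{\rho}{1-\rho} \cdot \frac{c_a^2 + c_s^2}{2} \cdot \tau
\]

The first factor is the utilization blow-up from the previous slides; the second shows how variability on either the arrival or the service side multiplies it.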

Slide 17

Slide 17 text

Executive summary
● High utilization → high latency
  ○ Non-linear!
● High variance → high latency
● Never use unlimited queues
● Interactive systems → short queues; batch systems → long queues
● Maintain proper utilization

Slide 18

Slide 18 text

Event loop phases

Slide 19

Slide 19 text

Event loop code execution

Slide 20

Slide 20 text

Node queueing summary
● Event loop queues are unlimited
● Easy to overload
● Blocking ⇒ high latency
● Large microtasks kill QoS
● await/.then()/process.nextTick() can still hog the event loop (starvation example below)
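A small illustration of the last bullet (my own example, not from the slides): a recursive process.nextTick() chain never lets the loop reach its timer or I/O phases, so the setTimeout callback below is starved; rewriting the same loop with setImmediate lets it fire.

// Sketch: the nextTick queue is drained completely before the loop moves on,
// so this recursion starves everything else - the timer never fires.
setTimeout(() => console.log('timer fired'), 0); // never runs

function hog() {
  process.nextTick(hog); // swap for setImmediate(hog) and the timer fires
}
hog();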

Slide 21

Slide 21 text

Coping strategies

Slide 22

Slide 22 text

Thou shalt not block!
Avoid blocking the event loop; specifically, CPU-heavy tasks.
● Immediate suspects: large JSONs, RegEx, SSR
  ○ ReDoS, JSON DoS
  ○ Size limits
  ○ Use async/stream-friendly JSON parsers (bfj, JSONStream)
  ○ Offload server-side rendering (React/Vue/Angular) to workers
● Offload heavy tasks to workers or remote processes (piscina; see the worker-pool sketch below)
● Limit loops, recursion, etc.
● Avoid sync functions
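A minimal offloading sketch, assuming piscina's documented API (new Piscina({ filename }) plus pool.run(payload)); the worker filename, payload shape, and doExpensiveThing helper are hypothetical:

// Sketch: run a CPU-heavy job in a worker pool so the event loop stays free.
const path = require('path');
const Piscina = require('piscina');

const pool = new Piscina({
  filename: path.resolve(__dirname, 'heavy-worker.js'), // exports the heavy function
});

async function handleRequest(payload) {
  // The main thread only awaits a message; the CPU work happens off the event loop.
  return pool.run(payload);
}

// heavy-worker.js (separate file):
// module.exports = (payload) => {
//   // CPU-heavy work: SSR, big JSON parse, number crunching, etc.
//   return doExpensiveThing(payload); // hypothetical helper
// };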

Slide 23

Slide 23 text

When in doubt, defer
● Split work into small microtasks
● Use setImmediate to unblock the loop
● Work will continue in the check phase
● Remember: Promise.then/await/process.nextTick() will requeue

const yieldControl = new Promise((resolve) => setImmediate(resolve))
// Do something
await yieldControl // Let other tasks run
// Do more work after waking up
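A slightly fuller sketch of the same pattern (my own illustration; processOne is a hypothetical helper): processing a large array in chunks and yielding to the event loop between chunks via setImmediate, so timers and I/O callbacks get a turn. Yielding with await Promise.resolve() or process.nextTick() would not help, since those queues are drained before the loop moves on.

// Sketch: chunked processing that yields to the event loop between chunks.
const yieldToEventLoop = () => new Promise((resolve) => setImmediate(resolve));

async function processAll(items, chunkSize = 100) {
  for (let i = 0; i < items.length; i += chunkSize) {
    const chunk = items.slice(i, i + chunkSize);
    for (const item of chunk) {
      processOne(item); // synchronous, CPU-bound work (hypothetical helper)
    }
    await yieldToEventLoop(); // resume in the check phase; timers and I/O run in between
  }
}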

Slide 24

Slide 24 text

Apply some backpressure, baby
If the upstream applies pressure on you, apply pressure backwards on the upstream!
● Load needs to be controlled to avoid overload
● How do we tell upstreams we’re overloaded?
● Blocking semantics implicitly apply backpressure (see the stream sketch below)
● Network protocols support this (TCP backpressure, HTTP 429, 503, etc.)
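Node streams expose the blocking-semantics flavor of this directly (a sketch using the standard stream API): writable.write() returns false when the internal buffer is full, and the writer should stop and wait for 'drain' instead of piling work into memory.

// Sketch: honoring backpressure when copying from a readable to a writable stream.
const { once } = require('events');

async function copy(readable, writable) {
  for await (const chunk of readable) {
    if (!writable.write(chunk)) {
      // Buffer full: the destination is pushing back, so wait for it to drain.
      await once(writable, 'drain');
    }
  }
  writable.end();
}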

Slide 25

Slide 25 text

Backpressure? But how?
● For the lazy: limit HTTP connections (express, koa)
  ○ TCP backpressure
● Limit concurrency (promise-pool, token buckets)
● Reject requests when event loop lag rises (node-toobusy; load-shedding sketch below)
  ○ HTTP backpressure: 503, 429
● When in doubt, await
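A minimal load-shedding sketch along those lines, assuming an Express app and the toobusy-js module's boolean check; the lag threshold, Retry-After value, and response body are illustrative:

// Sketch: shed load with 503 when event loop lag is high, instead of queueing.
const express = require('express');
const toobusy = require('toobusy-js'); // samples event loop lag in the background

toobusy.maxLag(70); // illustrative lag threshold in ms

const app = express();
app.use((req, res, next) => {
  if (toobusy()) {
    // Tell the upstream we are overloaded; retrying later beats queueing now.
    res.status(503).set('Retry-After', '1').send('Server too busy');
  } else {
    next();
  }
});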

Slide 26

Slide 26 text

Mind the metrics
require('perf_hooks') (monitoring sketch below)
● performance.eventLoopUtilization()
● monitorEventLoopDelay()
● Event loop lag (“hiccup”)
● GC
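A small sketch of those two APIs together (both are documented perf_hooks calls; the 20 ms histogram resolution and the 10-second logging interval are arbitrary choices):

// Sketch: sample event loop delay and utilization for your metrics pipeline.
const { monitorEventLoopDelay, performance } = require('perf_hooks');

const delay = monitorEventLoopDelay({ resolution: 20 }); // histogram, values in nanoseconds
delay.enable();
let lastELU = performance.eventLoopUtilization();

setInterval(() => {
  const elu = performance.eventLoopUtilization(lastELU); // delta since the last sample
  lastELU = performance.eventLoopUtilization();
  console.log({
    loopDelayMeanMs: delay.mean / 1e6,
    loopDelayP99Ms: delay.percentile(99) / 1e6,
    eventLoopUtilization: elu.utilization,
  });
  delay.reset();
}, 10_000);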

Slide 27

Slide 27 text

TL;DR
● Never block the event loop
● Break work into small microtasks, defer
● Event loop queueing will kill your latency
● Monitor event loop lag
● Do not overload. Use backpressure and load shedding
● Maintain proper (low) utilization
● Reduce variation wherever possible

Slide 28

Slide 28 text

Questions? @nukemberg