Bulletproof Node.js Backends (with load-testing & Artillery)

Bulletproof Node.js Backends Hassy Veldstra @hveldstra

# whoami • Node.js & DevOps engineer • Worked on some awesome Node.js projects • Open-source: Artillery, Chaos Llama, Dino etc • Stalk away! • https://github.com/hassy • https://github.com/shoreditch-ops • https://twitter.com/hveldstra

Node.js is everywhere <3 <3 <3

Node.js Backends • BFF • Web API (something that speaks HTTP/REST) • IoT backend (big peaks!) • Realtime app (WebSocket / socket.io) – a chat app or a game backend • Good old web application

Performance • Backend performance • Two options: • You find problems • Users find them • Or: • You crash your app in development • Or it crashes under load in production • Or: • You don’t need to worry if you like downtime (and getting paged) • Or if your users love every extra 100ms of response time

Load-testing The only way to ship performant backends

Load-testing Load testing is the process of putting demand on a software system or computing device and measuring its response. Load testing is performed to determine a system's behavior under both normal and anticipated peak load conditions. -Wikipedia

Load-testing • More than just stress testing • Learning tool! • Learn about the limits of your code, your dependencies, your whole stack • Example: New Relic agent for Node can reduce the performance of your app by as much as 20% under high load scenarios • Design aid • PDD: http://bit.ly/1rndEiw or https://blog.yld.io

Load-testing Checklist • Pick the right tool • Know what you’re testing

Less this

More this

Load-testing Checklist • Pick the right tool • Know what you’re testing • X-ray vision • i.e. visibility into what’s happening in your whole system, across the stack

Tools! • To load test, you need a load generator • Basically, a tool that can send a lot of requests to a server, very quickly • Many options (30+ listed on Wikipedia) • Where they differ is: • protocols they support • how flexible they are in allowing you to model the load • UI, UX, integrations

Artillery

Artillery • Supports multiple protocols (HTTP + WebSocket out of the box), extensible "engine" interface. HTTP/2 and MQTT support coming. • Has a nice CLI. Integrates well with CI servers. • Very flexible for modeling realistic load - user scenarios, arrival rates, loops • Virtual user behavior is scriptable with JS • Good performance (multicore)

Artillery – Mental Model • Virtual users, arriving at your service • Phases control arrivals • Each arrival is a new TCP socket (just like in the real world), picks a scenario, runs through it to completion or failure • Can reuse the same connection to run multiple scenarios with `loop`
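To make the mental model concrete, here is a minimal sketch of how phases translate into virtual-user arrivals. The phase fields `duration` (seconds) and `arrivalRate` (new virtual users per second) follow Artillery's script format; the helper `totalArrivals` and the numbers are invented for illustration and are not part of Artillery.

```javascript
// Hypothetical helper (not part of Artillery): estimate how many virtual
// users a list of constant-rate phases will create. Each phase runs for
// `duration` seconds and starts `arrivalRate` new virtual users per second.
function totalArrivals(phases) {
  return phases.reduce((sum, p) => sum + p.duration * p.arrivalRate, 0);
}

// Invented example: a warm-up phase followed by sustained load.
const phases = [
  { duration: 60, arrivalRate: 5 },
  { duration: 120, arrivalRate: 20 },
];

console.log(`virtual users created: ${totalArrivals(phases)}`);
```

Since each arrival opens its own TCP socket, this figure is also roughly the number of new connections the server will see over the run.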

Artillery • Demo: https://bit.ly/20mUbyf

When NOT to use Artillery • You hate CLIs • You love Java Swing GUIs from the 90s

http://bit.ly/1RYln1s (though we’re working on a cool React-based terminal UI)

When NOT to use Artillery • You hate CLIs • You love Java Swing GUIs from the 90s • You need to benchmark a web server • You love Scala • Use Gatling (but make sure other devs, ops, and QA all know & love Scala too) • s/Scala/Python/ ? • Use Locust (same: make sure everyone knows Python. Also it’s HTTP-only) • Otherwise, ARTILLERY :D • Everybody knows JS, and JSON/YAML

Load-testing: Modeling

Load-testing: Modeling • One of the most important things when load testing – being able to model the load accurately • Meaning – your load test needs to accurately simulate realistic conditions • If it doesn’t, you’re wasting your time

Load-testing: Modeling • Web server – lots of concurrent GET requests, something like wrk or loadtest is good: • wrk -c20 -d30s http://127.0.0.1/index.html • loadtest -c 20 -n 500 http://127.0.0.1/index.html • Web application or API – need scenarios, e.g.: • GET /search with some parameters • Pick one of the results • GET /products/$productId • POST /cart {productId: $productId} • For this you want JMeter or Gatling or Artillery
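The search → product → cart scenario above could be expressed as an Artillery script roughly like this. It is a sketch under assumptions: the target URL, the `q=shoes` search parameter, and the shape of the search response (a JSON body with a `results` array whose items have an `id`) are all invented for illustration.

```javascript
// Sketch of an Artillery test script built as a plain object.
// Assumed, not from the talk: target URL, query string, response shape.
const script = {
  config: {
    target: "http://127.0.0.1:3000", // hypothetical app under test
    phases: [{ duration: 60, arrivalRate: 10 }], // 10 new users/sec for 60s
  },
  scenarios: [
    {
      flow: [
        {
          get: {
            url: "/search?q=shoes",
            // Pick one of the results and remember its id for later steps
            capture: { json: "$.results[0].id", as: "productId" },
          },
        },
        { get: { url: "/products/{{ productId }}" } },
        { post: { url: "/cart", json: { productId: "{{ productId }}" } } },
      ],
    },
  ],
};

// Artillery reads JSON (or YAML), so the object can be written out as-is.
console.log(JSON.stringify(script, null, 2));
```

Saving the printed JSON to a file (say `script.json`) and running `artillery run script.json` would then send each virtual user through all three steps in order.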

Load-testing: Modeling • Web APIs / applications: • Public facing? • Behind a load-balancer? • How many TCP connections to open • Open all at once vs create new ones throughout the test • Throughput / load-distribution • For HTTP – requests • For WebSocket – messages/sec, pubsub patterns, active vs idle connections • Shape of peaks and valleys

Load-testing: Modeling • WebSocket benchmark example https://github.com/geekforbrains/websocket-tests/blob/master/artillery.json Isn’t testing what it’s supposed to test! (it’s testing TCP connection establishment speed, not WebSocket implementations)

Load-testing: Goals

Load-testing: Goals • Examples: • What is the p95 response time of this endpoint under heavy load, where heavy load = 25 RPS? • How many active users can my server sustain, where a user = a WebSocket connection with 1 published message per second which is forwarded onto 75 other users on average? • How many concurrent user sessions above our current production peak can the production system sustain?

Metrics • Response time • Changes in response time under load • And max acceptable p95 value • Throughput at capacity • In Node.js: event loop lag, memory usage, GC activity
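A minimal sketch of two of these metrics in Node.js: a nearest-rank p95 over collected response times, and event loop lag sampled with `monitorEventLoopDelay` from the built-in `perf_hooks` module (available in current Node.js releases). The sample response times, the 500ms budget, and the helper names are invented for illustration.

```javascript
const { monitorEventLoopDelay } = require("perf_hooks");

// Nearest-rank percentile: the smallest sample such that at least p% of
// all samples are less than or equal to it.
function percentile(samples, p) {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Invented response times in milliseconds
const responseTimes = [120, 95, 110, 400, 105, 98, 130, 101, 99, 2500];
const MAX_ACCEPTABLE_P95_MS = 500; // hypothetical budget
const p95 = percentile(responseTimes, 95);
console.log(`p95 = ${p95}ms, within budget: ${p95 <= MAX_ACCEPTABLE_P95_MS}`);

// Sample event loop delay and heap usage for `durationMs` milliseconds.
function sampleEventLoop(durationMs) {
  const histogram = monitorEventLoopDelay({ resolution: 20 });
  histogram.enable();
  setTimeout(() => {
    histogram.disable();
    // Histogram values are reported in nanoseconds
    const lagMs = histogram.percentile(95) / 1e6;
    const heapMb = process.memoryUsage().heapUsed / (1024 * 1024);
    console.log(`event loop delay p95: ${lagMs.toFixed(1)}ms, heap used: ${heapMb.toFixed(1)}MB`);
  }, durationMs);
}

sampleEventLoop(500);
```

A single slow outlier dominates the p95 of a small sample like this one, which is why the percentile is worth tracking alongside the average.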

Examples

Examples • React server rendering – moving from Rails to Node.js (using Rails as an API server)

Express Rails

Examples – Server Rendering • How do you measure the performance of the Express handler? (to compare with Rails) • Without network latency between client and Node.js server • Without network latency between Node.js and Rails • Without the time Rails takes to return the JSON

Examples – Server Rendering • Send requests to /foo/myview • Vary them (different combos of query params) for a realistic simulation • Have the Node.js app record how long the request to Rails took and expose that as X-Backend-Response-Time header • Use responseTime() middleware to have X-Response-Time • For every request, the delta between the two is the time we spent purely in Express code – most of which is rendering React!
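The per-request delta can be computed from the two headers along these lines. This is a sketch, not the script from the talk: it assumes both headers carry a value like `215.5ms` (the format the response-time middleware emits for X-Response-Time; using the same format for X-Backend-Response-Time is an assumption), and the helper names and sample values are invented.

```javascript
// Parse a header value such as "182.40ms" into a number of milliseconds.
function parseMs(value) {
  const ms = parseFloat(value);
  if (Number.isNaN(ms)) throw new Error(`cannot parse duration: ${value}`);
  return ms;
}

// Time spent purely in Express code = total handler time minus the time
// spent waiting on Rails. Header names are lower-cased, as Node.js
// presents them on incoming messages.
function expressTimeMs(headers) {
  return parseMs(headers["x-response-time"]) - parseMs(headers["x-backend-response-time"]);
}

// Invented sample of response headers collected during a test run
const sample = {
  "x-response-time": "215.5ms",
  "x-backend-response-time": "40.25ms",
};

console.log(`time in Express: ${expressTimeMs(sample)}ms`);
```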

Examples – Server Rendering • Full script here: https://gist.github.com/hassy/05bbfbc09ff98a9a55706e4490d3d86f

Examples – Server Rendering • Results: Rails over 1s, Node.js ~220ms p95 for the same view

Examples • React server rendering • Testing a proxy • Tested with production levels of traffic to make sure there were no memory leaks and to plan the number and type of dynos needed

Load-testing an ecommerce app in production (FTSE 500 company)

Load-testing an ecommerce app in production • Key: communication between dev, ops, and biz. • Start slowly in off-peak hours • Ramp up synthetic load to peak level during off hours • Start adding load during busy hours • Gradually increase to e.g. 2X production • Separate synthetic traffic (header, cookie) • No-op certain transactions if needed (e.g. checking out)
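One way to separate synthetic traffic and no-op a transaction is a pair of small Connect/Express-style middleware functions. This is an illustrative sketch only: the header name `x-synthetic-traffic`, its value, and both function names are hypothetical, not something the talk prescribes.

```javascript
// Hypothetical header name; use whatever your load generator sends.
const SYNTHETIC_HEADER = "x-synthetic-traffic";

// Connect/Express-style middleware: tag requests that come from the load test.
function markSynthetic(req, res, next) {
  req.isSynthetic = req.headers[SYNTHETIC_HEADER] === "1";
  next();
}

// Wrap a handler so synthetic requests skip the real side effect
// (e.g. placing an order) and receive a canned response instead.
function noopForSynthetic(realHandler, cannedBody) {
  return (req, res, next) => {
    if (req.isSynthetic) {
      res.statusCode = 200;
      res.end(JSON.stringify(cannedBody));
      return;
    }
    realHandler(req, res, next);
  };
}
```

In an Express app this might be wired up as `app.use(markSynthetic)` followed by `app.post("/checkout", noopForSynthetic(checkout, { status: "ok" }))`, where `checkout` stands in for the real handler. The same flag can be used to filter synthetic requests out of business analytics.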

Examples • Also see: • http://clarkie.io/devops/2016/04/24/load-testing-dynamodb.html

X-Ray Vision • Must have visibility into your app – Heroku dashboard, New Relic, Grafana etc • Flamegraphs • Heapdumps

What is a flamegraph?

Where do call stacks come from? • Linux: perf • SmartOS / FreeBSD / OSX: dtrace • Windows: ? :(

Flamegraphs – making one • node --perf_basic_prof myprogram.js • sudo perf record -q -F 99 -p $(pgrep node) -g -- sleep 10 • sudo chown root /tmp/perf-*.map • sudo perf script > /tmp/script • stackvis perf < /tmp/script > /tmp/flamegraph.html • (npm install -g stackvis)

Flamegraphs – making one #2 • Use the 0x package • https://www.npmjs.com/package/0x • Works on both Linux and OSX (wraps perf or dtrace) • Generates nice flamegraphs • http://davidmarkclements.github.io/0x-demo/

Let’s look at one again • http://davidmarkclements.github.io/0x-demo/

perf vs v8 profiler • Can sample as needed without restarting the process • Extremely low overhead • Reaches into native code • BONUS: Works for other languages too

Flamegraphs tldr • Flamegraphs = awesome • Flamegraphs let you see what’s “hot” on the CPU in your Node.js code

heapdump

heapdump • npm install heapdump • require('heapdump') • kill -USR2 $my_node_process

heapdump • http://blog.yld.io/2015/08/10/debugging-memory-leaks-in-node-js-a-walkthrough/

Flamegraphs + Heapdump • Used on Artillery itself – compile scenarios, analyze request.js overhead, debug a memory leak in request, debug a memory leak in handlebars • IME heapdump is something you’ll reach for more often

Summary • If you care about performance you must load-test • When you load-test, do it right • Model what actually happens in the app • Be methodical – hypothesis / question-driven • Learn how to instrument your apps • Profit!

Thanks! • https://artillery.io • Twitter: @hveldstra • http://veldstra.org
