Bulletproof Node.js Backends (with load-testing & Artillery)

Node.js London Meetup, May 2016 - Why load test, do's and don'ts of load-testing, and why Artillery.io is a great choice for load-testing, plus a quick overview of flamegraphs and heapdumps in Node.js

hassy veldstra

May 05, 2016

Transcript

  1. # whoami • Node.js & DevOps engineer • Worked on

    some awesome Node.js projects • Open-source: Artillery, Chaos Llama, Dino etc
  2. # whoami • Node.js & DevOps engineer • Worked on

    some awesome Node.js projects • Open-source: Artillery, Chaos Llama, Dino etc • Stalk away! • https://github.com/hassy • https://github.com/shoreditch-ops • https://twitter.com/hveldstra
  3. Node.js Backends • BFF • Web API (something that speaks

    HTTP/REST) • IoT backend (big peaks!)
  4. Node.js Backends • BFF • Web API (something that speaks

    HTTP/REST) • IoT backend (big peaks!) • Realtime app (WebSocket / socket.io) – a chat app or a game backend
  5. Node.js Backends • BFF • Web API (something that speaks

    HTTP/REST) • IoT backend (big peaks!) • Realtime app (WebSocket / socket.io) – a chat app or a game backend • Good old web application
  6. Performance • Backend performance • Two options: • You find

    problems • Users find them • Or: • You crash your app in development • Or it crashes under load in production
  7. Performance • Backend performance • Two options: • You find

    problems • Users find them • Or: • You crash your app in development • Or it crashes under load in production • Or: • You don’t need to worry if you like downtime (and getting paged) • Or if your users love every extra 100ms of response time
  8. Load-testing Load testing is the process of putting demand on

    a software system or computing device and measuring its response. Load testing is performed to determine a system's behavior under both normal and anticipated peak load conditions. -Wikipedia
  9. Load-testing • More than just stress testing • Learning tool!

    • Learn about the limits of your code, your dependencies, your whole stack • Example: New Relic agent for Node can reduce the performance of your app by as much as 20% under high load scenarios
  10. Load-testing • More than just stress testing • Learning tool!

    • Learn about the limits of your code, your dependencies, your whole stack • Example: New Relic agent for Node can reduce the performance of your app by as much as 20% under high load scenarios • Design aid • PDD: http://bit.ly/1rndEiw or https://blog.yld.io
  11. Load-testing Checklist • Pick the right tool • Know what

    you’re testing • X-ray vision • i.e. visibility into what’s happening in your whole system, across the stack
  12. Tools! • To load test, you need a load generator

    • Basically, a tool that can send a lot of requests to a server, very quickly
  13. Tools! • To load test, you need a load generator

    • Basically, a tool that can send a lot of requests to a server, very quickly • Many options (30+ listed on Wikipedia)
  14. Tools! • To load test, you need a load generator

    • Basically, a tool that can send a lot of requests to a server, very quickly • Many options (30+ listed on Wikipedia) • Where they differ is: • protocols they support • how flexible they are in allowing you to model the load • UI, UX, integrations
  15. Artillery • Supports multiple protocols (HTTP + WebSocket out of

    the box), extensible “engine” interface. HTTP/2 and MQTT support coming.
  16. Artillery • Supports multiple protocols (HTTP + WebSocket out of

    the box), extensible “engine” interface. HTTP/2 and MQTT support coming. • Has a nice CLI. Integrates well with CI servers.
  17. Artillery • Supports multiple protocols (HTTP + WebSocket out of

    the box), extensible “engine” interface. HTTP/2 and MQTT support coming. • Has a nice CLI. Integrates well with CI servers. • Very flexible for modeling realistic load - user scenarios, arrival rates, loops
  18. Artillery • Supports multiple protocols (HTTP + WebSocket out of

    the box), extensible “engine” interface. HTTP/2 and MQTT support coming. • Has a nice CLI. Integrates well with CI servers. • Very flexible for modeling realistic load - user scenarios, arrival rates, loops • Virtual user behavior is scriptable with JS
  19. Artillery • Supports multiple protocols (HTTP + WebSocket out of

    the box), extensible “engine” interface. HTTP/2 and MQTT support coming. • Has a nice CLI. Integrates well with CI servers. • Very flexible for modeling realistic load - user scenarios, arrival rates, loops • Virtual user behavior is scriptable with JS • Good performance (multicore)
  20. Artillery – Mental Model • Virtual users, arriving at your

    service • Phases control arrivals • Each arrival is a new TCP socket (just like in the real world), picks a scenario, runs through it to completion or failure
  21. Artillery – Mental Model • Virtual users, arriving at your

    service • Phases control arrivals • Each arrival is a new TCP socket (just like in the real world), picks a scenario, runs through it to completion or failure • Can reuse the same connection to run multiple scenarios with `loop`
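
The mental model above maps onto an Artillery test script roughly as follows. This is a minimal sketch, not taken from the talk: the target URL, the `/status` path, and all durations, rates and counts are placeholders, and the field names (`config.phases`, `arrivalRate`, `scenarios[].flow`, `loop` with `count`) are written as I understand Artillery's script format, so check them against the documentation for your version.

```javascript
// Sketch of an Artillery test script, built as a plain object and printed as JSON.
// Assumptions: target URL, path, durations, rates and loop count are placeholders.
const script = {
  config: {
    target: 'http://127.0.0.1:3000', // hypothetical service under test
    phases: [
      // Phases control arrivals: 5 new virtual users per second for 60 seconds...
      { duration: 60, arrivalRate: 5 },
      // ...then 25 new virtual users per second for 120 seconds.
      { duration: 120, arrivalRate: 25 }
    ]
  },
  scenarios: [
    {
      // Each arriving virtual user picks a scenario and runs its flow
      // to completion or failure.
      flow: [
        // `loop` repeats the nested steps for the same virtual user.
        { loop: [{ get: { url: '/status' } }], count: 10 }
      ]
    }
  ]
};

// Expected number of arrivals: sum of duration * arrivalRate over all phases.
function totalArrivals(phases) {
  return phases.reduce((sum, p) => sum + p.duration * p.arrivalRate, 0);
}

console.log(JSON.stringify(script, null, 2));
console.log('expected virtual users:', totalArrivals(script.config.phases));
```

Saving the printed JSON to a file gives something that could be passed to the Artillery CLI; the arrival count is a quick sanity check that the phases describe the load you intended.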
  22. When NOT to use Artillery • You hate CLIs •

    You love Java Swing GUIs from the 90s
  23. When NOT to use Artillery • You hate CLIs •

    You love Java Swing GUIs from the 90s • You need to benchmark a web server
  24. When NOT to use Artillery • You hate CLIs •

    You love Java Swing GUIs from the 90s • You need to benchmark a web server • You love Scala
  25. When NOT to use Artillery • You hate CLIs •

    You love Java Swing GUIs from the 90s • You need to benchmark a web server • You love Scala • Use Gatling (but make sure other devs, ops, and QA all know & love Scala too)
  26. When NOT to use Artillery • You hate CLIs •

    You love Java Swing GUIs from the 90s • You need to benchmark a web server • You love Scala • s/Scala/Python/ ?
  27. When NOT to use Artillery • You hate CLIs •

    You love Java Swing GUIs from the 90s • You need to benchmark a web server • You love Scala • s/Scala/Python/ ? • Use Locust (same: make sure everyone knows Python. Also it’s HTTP-only)
  28. When NOT to use Artillery • You hate CLIs •

    You love Java Swing GUIs from the 90s • You need to benchmark a web server • You love Scala • s/Scala/Python/ ? • Otherwise, ARTILLERY :D
  29. When NOT to use Artillery • You hate CLIs •

    You love Java Swing GUIs from the 90s • You need to benchmark a web server • You love Scala • s/Scala/Python/ ? • Otherwise, ARTILLERY :D • Everybody knows JS, and JSON/YAML
  30. Load-testing: Modeling • One of the most important things when

    load testing – being able to model the load accurately
  31. Load-testing: Modeling • One of the most important things when

    load testing – being able to model the load accurately • Meaning – your load test needs to accurately simulate realistic conditions
  32. Load-testing: Modeling • One of the most important things when

    load testing – being able to model the load accurately • Meaning – your load test needs to accurately simulate realistic conditions • If it doesn’t, you’re wasting your time
  33. Load-testing: Modeling • Web server – lots of concurrent GET

    requests, something like wrk or loadtest is good: • wrk -c20 -d30s http://127.0.0.1/index.html • loadtest -c 20 -n 500 http://127.0.0.1/index.html
  34. Load-testing: Modeling • Web server – lots of concurrent GET

    requests, something like wrk or loadtest is good: • wrk -c20 -d30s http://127.0.0.1/index.html • loadtest -c 20 -n 500 http://127.0.0.1/index.html • Web application or API – need scenarios, e.g.: • GET /search with some parameters • Pick one of the results • GET /products/$productId • POST /cart {productId: $productId}
  35. Load-testing: Modeling • Web server – lots of concurrent GET

    requests, something like wrk or loadtest is good: • wrk -c20 -d30s http://127.0.0.1/index.html • loadtest -c 20 -n 500 http://127.0.0.1/index.html • Web application or API – need scenarios, e.g.: • GET /search with some parameters • Pick one of the results • GET /products/$productId • POST /cart {productId: $productId} • For this you want JMeter or Gatling or Artillery
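
What separates a scenario from a plain GET flood is that later requests depend on earlier responses. A small sketch of that dependency, assuming a hypothetical search response shaped like `{ results: [{ id }] }` (the slides do not specify the response format); `planAfterSearch` and `pickIndex` are illustrative names, not part of any tool:

```javascript
// Given the response to GET /search, build the follow-up requests of the scenario.
// Assumption: the search response has a `results` array whose items carry an `id`.
function planAfterSearch(searchResponse, pickIndex = (n) => Math.floor(Math.random() * n)) {
  const results = (searchResponse && searchResponse.results) || [];
  if (results.length === 0) {
    return []; // nothing to pick, so this virtual user stops here
  }
  // "Pick one of the results" and carry its id into the later requests.
  const productId = results[pickIndex(results.length)].id;
  return [
    { method: 'GET', path: `/products/${productId}` },
    { method: 'POST', path: '/cart', body: { productId } }
  ];
}

// Usage with a canned response and a fixed pick (second result).
const plan = planAfterSearch({ results: [{ id: 'a1' }, { id: 'b2' }] }, () => 1);
console.log(plan);
```

A scenario-capable load generator does the same thing declaratively: it captures a value from one response and substitutes it into the requests that follow.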
  36. Load-testing: Modeling • Web APIs / applications: • Public facing?

    • Behind a load-balancer? • How many TCP connections to open • Open all at once vs create new ones throughout the test
  37. Load-testing: Modeling • Web APIs / applications: • Public facing?

    • Behind a load-balancer? • How many TCP connections to open • Open all at once vs create new ones throughout the test • Throughput / load-distribution • For HTTP – requests • For WebSocket – messages/sec, pubsub patterns, active vs idle connections
  38. Load-testing: Modeling • Web APIs / applications: • Public facing?

    • Behind a load-balancer? • How many TCP connections to open • Open all at once vs create new ones throughout the test • Throughput / load-distribution • For HTTP – requests • For WebSocket – messages/sec, pubsub patterns, active vs idle connections • Shape of peaks and valleys
  39. Load-testing: Modeling • WebSocket benchmark example https://github.com/geekforbrains/websocket-tests/blob/master/artillery.json Isn’t testing

    what it’s supposed to test! (it’s testing TCP connection establishment speed, not WebSocket implementations)
  40. Load-testing: Goals • Examples: • What is the p95 response

    time of this endpoint under heavy load, where heavy load = 25 RPS
  41. Load-testing: Goals • Examples: • What is the p95 response

    time of this endpoint under heavy load, where heavy load = 25 RPS • How many active users can my server sustain, where a user = a WebSocket connection with 1 published message per second which is forwarded onto 75 other users on average
  42. Load-testing: Goals • Examples: • What is the p95 response

    time of this endpoint under heavy load, where heavy load = 25 RPS • How many active users can my server sustain, where a user = a WebSocket connection with 1 published message per second which is forwarded onto 75 other users on average • How many concurrent user sessions above our current production peak can the production system sustain?
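
To make a goal like "p95 response time under 25 RPS" checkable, it helps to be explicit about how p95 is computed. A minimal sketch using the nearest-rank method (other tools may interpolate and report slightly different values); the latency samples are made up for illustration:

```javascript
// Nearest-rank percentile: the smallest sample such that at least p% of all
// samples are less than or equal to it.
function percentile(samples, p) {
  if (samples.length === 0) throw new Error('no samples');
  if (p <= 0 || p > 100) throw new RangeError('p must be in (0, 100]');
  const sorted = [...samples].sort((a, b) => a - b); // numeric sort on a copy
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}

// Made-up response times in milliseconds, with one slow outlier.
const latenciesMs = [12, 15, 11, 240, 14, 13, 18, 16, 17, 19];
console.log('p50:', percentile(latenciesMs, 50));
console.log('p95:', percentile(latenciesMs, 95));
```

With these samples the median looks healthy while p95 lands on the outlier, which is why the goal is phrased in terms of p95 rather than an average.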
  43. Metrics • Response time • Changes in response time under

    load • And max acceptable p95 value
  44. Metrics • Response time • Changes in response time under

    load • And max acceptable p95 value • Throughput at capacity
  45. Metrics • Response time • Changes in response time under

    load • And max acceptable p95 value • Throughput at capacity • In Node.js: event loop lag, memory usage, GC activity
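
Event loop lag and memory usage can be sampled from inside the process with nothing but timers. A rough sketch, assuming a 1-second sampling interval is acceptable; `monitorEventLoopLag` and `lagMs` are illustrative names, and a dedicated monitoring library may measure this more precisely:

```javascript
// Lag = how much later than scheduled a timer callback actually ran.
function lagMs(scheduledDelayMs, startedAtMs, firedAtMs) {
  return Math.max(0, firedAtMs - startedAtMs - scheduledDelayMs);
}

// Periodically report event loop lag and heap usage to `onSample`.
// Returns a function that stops the monitor.
function monitorEventLoopLag(intervalMs, onSample) {
  let last = Date.now();
  const timer = setInterval(() => {
    const now = Date.now();
    onSample({
      lagMs: lagMs(intervalMs, last, now),
      heapUsedBytes: process.memoryUsage().heapUsed
    });
    last = now;
  }, intervalMs);
  timer.unref(); // do not keep the process alive just for monitoring
  return () => clearInterval(timer);
}

// Usage (inside a server process):
//   const stop = monitorEventLoopLag(1000, (sample) => console.log(sample));
```

If the lag climbs while the load test ramps up, something is blocking the event loop, and a flamegraph is the natural next step.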
  46. Examples • React server rendering – moving from Rails to

    Node.js (using Rails as an API server)
  47. Examples – Server Rendering • How do you measure the

    performance of the Express handler? (to compare with Rails)
  48. Examples – Server Rendering • How do you measure the

    performance of the Express handler? (to compare with Rails) • Without network latency between client and Node.js server
  49. Examples – Server Rendering • How do you measure the

    performance of the Express handler? (to compare with Rails) • Without network latency between client and Node.js server • Without network latency between Node.js and Rails
  50. Examples – Server Rendering • How do you measure the

    performance of the Express handler? (to compare with Rails) • Without network latency between client and Node.js server • Without network latency between Node.js and Rails • Without the time Rails takes to return the JSON
  51. Examples – Server Rendering • Send requests to /foo/myview •

    Vary them (different combos of query params) for a realistic simulation
  52. Examples – Server Rendering • Send requests to /foo/myview •

    Vary them (different combos of query params) for a realistic simulation • Have the Node.js app record how long the request to Rails took and expose that as X-Backend-Response-Time header
  53. Examples – Server Rendering • Send requests to /foo/myview •

    Vary them (different combos of query params) for a realistic simulation • Have the Node.js app record how long the request to Rails took and expose that as X-Backend-Response-Time header • Use responseTime() middleware to have X-Response-Time
  54. Examples – Server Rendering • Send requests to /foo/myview •

    Vary them (different combos of query params) for a realistic simulation • Have the Node.js app record how long the request to Rails took and expose that as X-Backend-Response-Time header • Use responseTime() middleware to have X-Response-Time • For every request, the delta between the two is the time we spent purely in Express code – most of which is rendering React!
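
On the load-generator side, the delta can be computed per response from the two headers. A sketch under two assumptions: both headers carry milliseconds, and values look like `123.456ms` (as I understand the response-time middleware's default output) or a bare number; `timeInExpressMs` is an illustrative helper, not part of any library:

```javascript
// Parse a timing header such as "123.456ms" or "98" into milliseconds.
function parseMs(value) {
  const ms = parseFloat(value);
  if (Number.isNaN(ms)) {
    throw new Error(`unparseable timing header: ${value}`);
  }
  return ms;
}

// Time spent in the Node.js app itself =
//   total handler time - time spent waiting on the Rails backend.
// Header names are lower-cased, as Node.js does for incoming HTTP responses.
function timeInExpressMs(headers) {
  const total = parseMs(headers['x-response-time']);
  const backend = parseMs(headers['x-backend-response-time']);
  return total - backend;
}

// Usage with made-up header values.
console.log(timeInExpressMs({
  'x-response-time': '150.5ms',
  'x-backend-response-time': '100.25ms'
}));
```

Collecting this delta for every request and looking at its distribution isolates the rendering cost from network and backend latency.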
  55. Examples • React server rendering • Testing a proxy •

    Tested with production levels of traffic to make sure there were no memory leaks and to plan the number and type of dynos needed
  56. Examples • React server rendering • Testing a proxy •

    Load-testing an ecommerce app in production (FTSE 500 company)
  57. Examples • React server rendering • Testing a proxy •

    Load-testing an ecommerce app in production • Key: communication between dev, ops, and biz.
  58. Examples • React server rendering • Testing a proxy •

    Load-testing an ecommerce app in production • Key: communication between dev, ops, and biz. • Start slowly in off-peak hours
  59. Examples • React server rendering • Testing a proxy •

    Load-testing an ecommerce app in production • Key: communication between dev, ops, and biz. • Start slowly in off-peak hours • Ramp up synthetic load to peak level during off hours
  60. Examples • React server rendering • Testing a proxy •

    Load-testing an ecommerce app in production • Key: communication between dev, ops, and biz. • Start slowly in off-peak hours • Ramp up synthetic load to peak level during off hours • Start adding load during busy hours
  61. Examples • React server rendering • Testing a proxy •

    Load-testing an ecommerce app in production • Key: communication between dev, ops, and biz. • Start slowly in off-peak hours • Ramp up synthetic load to peak level during off hours • Start adding load during busy hours • Gradually increase to e.g. 2X production
  62. Examples • React server rendering • Testing a proxy •

    Load-testing an ecommerce app in production • Key: communication between dev, ops, and biz. • Start slowly in off-peak hours • Ramp up synthetic load to peak level during off hours • Start adding load during busy hours • Gradually increase to e.g. 2X production • Separate synthetic traffic (header, cookie)
  63. Examples • React server rendering • Testing a proxy •

    Load-testing an ecommerce app in production • Key: communication between dev, ops, and biz. • Start slowly in off-peak hours • Ramp up synthetic load to peak level during off hours • Start adding load during busy hours • Gradually increase to e.g. 2X production • Separate synthetic traffic (header, cookie) • No-op certain transactions if needed (e.g. checking out)
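
Separating synthetic traffic and no-op'ing sensitive transactions can both be done with small Express-style middleware. A sketch only: the `x-synthetic-traffic` header name and the canned response are assumptions, and a real setup would also need to make sure ordinary users cannot set that header.

```javascript
const SYNTHETIC_HEADER = 'x-synthetic-traffic'; // hypothetical header name

// Express-style middleware: tag requests sent by the load generator.
function markSynthetic(req, res, next) {
  req.isSynthetic = req.headers[SYNTHETIC_HEADER] === '1';
  next();
}

// Wrap a handler so that synthetic requests get a canned response instead of
// running the real transaction (e.g. checking out).
function noOpForSynthetic(handler, cannedBody) {
  return function (req, res, next) {
    if (req.isSynthetic) {
      res.statusCode = 200;
      res.end(JSON.stringify(cannedBody));
      return;
    }
    handler(req, res, next);
  };
}

// Usage (assuming an Express app and a real `checkout` handler):
//   app.use(markSynthetic);
//   app.post('/checkout', noOpForSynthetic(checkout, { ok: true }));
```

The same `isSynthetic` flag can be attached to logs and metrics so that load-test traffic is filtered out of business reporting.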
  64. X-Ray Vision • Must have visibility into your app –

    Heroku dashboard, New Relic, Grafana etc
  65. X-Ray Vision • Must have visibility into your app –

    Heroku dashboard, New Relic, Grafana etc • Flamegraphs
  66. X-Ray Vision • Must have visibility into your app –

    Heroku dashboard, New Relic, Grafana etc • Flamegraphs • Heapdumps
  67. Where do call stacks come from? • Linux: perf •

    SmartOS / FreeBSD / OSX: dtrace • Windows: ? :(
  68. Flamegraphs – making one • node --perf_basic_prof myprogram.js • sudo

    perf record -q -F 99 -p $(pgrep node) -g -- sleep 10 • sudo chown root /tmp/perf-*.map • sudo perf script > /tmp/script • stackvis perf < /tmp/script > /tmp/flamegraph.html • (npm install -g stackvis)
  69. Flamegraphs – making one #2 • Use the 0x package

    • https://www.npmjs.com/package/0x • Works on both Linux and OSX (wraps perf or dtrace) • Generates nice flamegraphs • http://davidmarkclements.github.io/0x-demo/
  70. perf vs v8 profiler • Can sample as needed without

    restarting the process • Extremely low overhead
  71. perf vs v8 profiler • Can sample as needed without

    restarting the process • Extremely low overhead • Reaches into native code
  72. perf vs v8 profiler • Can sample as needed without

    restarting the process • Extremely low overhead • Reaches into native code • BONUS: Works for other languages too
  73. Flamegraphs tldr • Flamegraphs = awesome • Flamegraphs let you

    see what’s “hot” on the CPU in your Node.js code
  74. Flamegraphs + Heapdump • Used on Artillery itself – compile

    scenarios, analyze request.js overhead, debug a memory leak in request, debug a memory leak in handlebars • IME heapdump is something you’ll reach for more often
  75. Summary • If you care about performance you must load-test

    • When you load-test, do it right • Model what actually happens in the app • Be methodical – hypothesis / question-driven • Learn how to instrument your apps • Profit!