
Benchmarking Pusher using Node.js

London Node.js User Group

Paweł Ledwoń

November 28, 2012

Transcript

  1. OVERVIEW
     why we do this; evolution of our approach; architecture of our framework; what we learned about Node.js; no memes
  2. FIND LIMITS; ESTIMATE HARDWARE NEEDS; CHECK HIGH-LOAD STABILITY; TEST SOFTWARE UPDATES; FIND OPTIMAL HARDWARE; GAIN MORE CONFIDENCE; GET NUMBERS FOR CUSTOMERS
  3. THIS IS HOW MOST TESTS LOOK
     1. write a shell script to open connections 2. write another script to send messages 3. run the system locally or on a benchmarking cluster 4. start generating load 5. pray for the results to be good
  4. WHAT IF SOMETHING BREAKS?
     1. first thought: what happened? 2. you probably don’t gather enough metrics 3. you might be doing too many things at once 4. you can’t be sure your script is not the bottleneck 5. altering scripts is time-consuming and error-prone 6. distributing scripts to vary the load is awkward
  5. SO MANY THINGS TO GO WRONG
     EC2 performance, Node.js issues, crappy IO, garbage collection, SPOFs, load balancers, misconfiguration, slow libraries, a bad day, logging, Ruby daemons, bugs, the kernel, …
  6. HOW CAN WE DEAL WITH ISSUES?
     unit testing ~ micro-benchmarking; integration testing ~ macro-benchmarking; we know a bit about testing software, right?
  7. MICRO-BENCHMARKS
     prove that components work; avoid dependencies; are usually quick and simple; are use-case specific; don’t test glue code; don’t catch side-effects
  8. MACRO-BENCHMARKS
     show that the whole system performs; test dependencies and side-effects; can show things you never expected; are difficult and time-consuming; need a systematic approach; make finding some issues difficult
  9. EVOLUTION
     monolithic script run manually; local micro-benchmarks and spreadsheets; distributed version of micro-benchmarks; supervised workers; automatic metric recording; more types of benchmarks; implementing the indexer; ???
  10. GOOD PRACTICES
     start with micro-benchmarks; build them up to proper integration tests; when in trouble, profile; change one variable at a time; monitor everything you can, seriously; work on your graphs, text sucks
  11. WORKERS
     do the actual work; send API requests; listen on WebSocket connections; gather client metrics; have different roles (api, socket); are independent of each other; live outside of Pusher’s cluster (see the sketch below)
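
The deck doesn’t include worker code, but a socket worker along these lines would open many WebSocket connections and report everything it observes as JSON log lines. A minimal sketch, assuming the ws package, a SOCKET_URL environment variable pointing at the system under test, and messages that carry a sentAt timestamp added by the api worker; all of these names are invented for illustration, not taken from the talk:

    // socket-worker.js: open N WebSocket connections and log client metrics
    // as one JSON object per line (to be picked up by rsyslog / the indexer)
    const WebSocket = require('ws');

    const SOCKET_URL = process.env.SOCKET_URL;                    // assumed endpoint of the system under test
    const CONNECTIONS = parseInt(process.env.CONNECTIONS, 10) || 100;

    function log(metric) {
      console.log(JSON.stringify(Object.assign({ ts: Date.now() }, metric)));
    }

    function openConnection(id) {
      const ws = new WebSocket(SOCKET_URL);

      ws.on('open', function () {
        log({ metric: 'connection_opened', connection: id });
      });

      ws.on('message', function (data) {
        // assumes the api worker embedded its send time in the payload
        const message = JSON.parse(data);
        log({ metric: 'message_latency_ms', connection: id, value: Date.now() - message.sentAt });
      });

      ws.on('error', function (err) {
        log({ metric: 'connection_error', connection: id, error: err.message });
      });
    }

    for (let i = 0; i < CONNECTIONS; i++) openConnection(i);

An api worker would look similar, but issue HTTP requests against the api hosts and log one timing metric per request.
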
  12. [architecture diagram: Pusher’s api and socket processes on one side; an api worker and several socket workers, each with a monitor, on the other]
  13. [architecture diagram: the same workers and monitors, now coordinated by a supervisor]
  14. [architecture diagram: same components as slide 13]
  15. REPORTING METRICS
     all workers report their data to logs; JSON objects, basically no schema; logs are streamed to a central place, currently using just rsyslog (see the example below)
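
As an illustration, two entries in that format could look like the lines below; the field names are invented here, since the slide’s point is that there is basically no schema:

    {"ts":1354060800123,"worker":"socket-3","benchmark":"fanout-10k","metric":"message_latency_ms","value":42}
    {"ts":1354060801200,"worker":"monitor-1","benchmark":"fanout-10k","metric":"cpu_percent","value":73}
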
  16. [architecture diagram: the supervisor, workers and monitors, all streaming their output to a central logs store]
  17. INDEXER
     reads logs in real time; extracts and aggregates data; stores indexed metrics in a db… …which temporarily is Redis; can update graphs using Pusher (see the sketch below)
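
A minimal sketch of that indexing loop, assuming the log is a file of JSON lines and a classic callback-style redis client; the file path, key layout and metric names are assumptions, and a real indexer would tail the rsyslog stream rather than read a static file:

    // indexer.js: read log entries, aggregate per-second latency, keep it in Redis
    const fs = require('fs');
    const readline = require('readline');
    const redis = require('redis');

    const db = redis.createClient();
    const rl = readline.createInterface({
      input: fs.createReadStream('/var/log/benchmark.log')
    });

    rl.on('line', function (line) {
      let entry;
      try { entry = JSON.parse(line); } catch (e) { return; }    // skip malformed lines

      if (entry.metric === 'message_latency_ms') {
        const second = Math.floor(entry.ts / 1000);
        // one Redis hash per benchmark and second: count + sum are enough to
        // derive aggregates later without re-reading gigabytes of logs
        const key = 'bench:' + entry.benchmark + ':latency:' + second;
        db.hincrby(key, 'count', 1);
        db.hincrby(key, 'sum_ms', entry.value);
        // a dashboard subscribed over Pusher could be notified about the updated second here
      }
    });
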
  18. [architecture diagram: the same setup, with the indexer consuming the central logs]
  19. BENCHMARK SPECIFICATION
     benchmark name; number of connections; number of channels; number of connections per channel; socket/api hosts and keys; number of workers of each type per host; list of tasks for each type of worker; metrics to collect on workers; monitor hosts; metrics to collect on monitors; steps to prepare for the benchmark (see the example below)
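
A specification covering those fields could be a single document; every name and value below is invented for illustration:

    // fanout-10k.js: one benchmark specification (illustrative values only)
    module.exports = {
      name: 'fanout-10k',
      connections: 10000,
      channels: 100,
      connectionsPerChannel: 100,
      socketHosts: ['socket-1.example.com'],
      apiHosts: ['api-1.example.com'],
      keys: { appId: '...', key: '...', secret: '...' },
      workersPerHost: { socket: 4, api: 1 },
      tasks: {
        socket: ['open_connections', 'subscribe', 'measure_latency'],
        api: ['send_messages']
      },
      workerMetrics: ['message_latency_ms', 'connection_errors'],
      monitorHosts: ['monitor-1.example.com'],
      monitorMetrics: ['cpu', 'memory', 'open_fds'],
      prepare: ['sync_clocks', 'warm_up_connections']
    };

A supervisor could derive from such a document how many workers of each type to start on which host, and which tasks to hand them.
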
  20. GOALS
     store benchmarks as cheaply as possible (potentially gigabytes of data per run); separate data storage from load generation; allow adding metric views without changing benchmarking code or re-running benchmarks
  21. METRIC CACHE
     represents the smallest piece of aggregated data; benchmark-wide: start/finish time; per-second: message latency; fetches existing data on initialization; three methods: update (metric specific), store (saves data in the db), is dirty? (true if modified since last store) (see the sketch below)
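
A sketch of the per-second latency variant of such a cache, assuming Redis as the temporary db; only update, store and is-dirty come from the slide, while the key layout and hash fields are assumptions:

    // latency-cache.js: smallest piece of aggregated data for one benchmark-second
    class LatencyCache {
      constructor(db, benchmark, second) {
        this.db = db;                                    // a callback-style redis client
        this.key = 'bench:' + benchmark + ':latency:' + second;
        this.count = 0;
        this.sum = 0;
        this.dirty = false;
        // fetch existing data on initialization (a real implementation
        // would wait for this fetch before accepting updates)
        db.hgetall(this.key, (err, data) => {
          if (data) {
            this.count = parseInt(data.count, 10) || 0;
            this.sum = parseInt(data.sum_ms, 10) || 0;
          }
        });
      }

      update(latencyMs) {                                // metric specific
        this.count += 1;
        this.sum += latencyMs;
        this.dirty = true;
      }

      store() {                                          // saves data in the db
        this.db.hmset(this.key, { count: this.count, sum_ms: this.sum });
        this.dirty = false;
      }

      isDirty() {                                        // true if modified since last store
        return this.dirty;
      }
    }

    module.exports = LatencyCache;
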
  22. [indexer diagram: a log entry flows through processors #1-#3 into caches #1-#4 (benchmark-wide and per-second benchmark caches), which write to the database]
  23. [indexer diagram: same components as slide 22]
  24. [indexer diagram: same components as slide 22]
  25. [indexer diagram: same components as slide 22]
  26. FAULT TOLERANCE
     what if the indexer fails in the middle of a file? the file is potentially several GB in size, so re-indexing is not an option; what if logs are being delayed?
  27. [indexer diagram: the same processors and caches, now persisting the current log offset alongside the cache state]
  28. [indexer diagram: same components as slide 27]
  29. RECOVERY
     1. start the process 2. fetch the whole state from Redis 3. restore caches 4. start reading from the saved offset (see the sketch below)
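
A sketch of those four steps, assuming the state and offset live under two Redis keys and the log is a plain file; the key names, the restoreCaches/startIndexing helpers and the byte-offset trick are all assumptions:

    // recovery.js: resume indexing after a crash without re-reading the whole log
    const fs = require('fs');
    const redis = require('redis');

    const db = redis.createClient();                              // 1. start the process

    function restoreCaches(state) { /* rebuild in-memory metric caches from the stored hash */ }
    function startIndexing(stream) { /* feed the stream to the log-entry processors */ }

    db.get('indexer:log_offset', function (err, offset) {         // 2. fetch state from Redis
      offset = parseInt(offset, 10) || 0;
      db.hgetall('indexer:caches', function (err, state) {
        restoreCaches(state || {});                               // 3. restore caches
        startIndexing(fs.createReadStream('/var/log/benchmark.log', { start: offset }));
      });                                                         // 4. read from the saved offset
    });

For this to work, the offset presumably has to be persisted together with the cache state on every store, which is what the log-offset/state box in the preceding diagrams suggests.
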
  30. DOS AND DON’TS
     dos: use short-lived objects, minimize the number of long-lived objects, profile gc (--trace_gc, etc.); don’ts: premature optimization, buffer pools on the V8 heap
  31. V8
     performance of V8 is decent; writing JS is much easier… …not to mention debugging; see Felix Geisendörfer’s talk; C++ is premature optimization
  32. 1 SECOND IS NOT GRANULAR ENOUGH
     we can send messages with a few ms of latency; aggregates hide some issues (garbage collection, specific code inefficiencies); care about percentiles, not averages
  33. SIMPLE TRICK
     1. use setInterval with a 10 ms delay 2. every tick, increment the value of the current time window 3. show on a graph like 2 slides before; visualize delays as a sub-second heatmap (see the sketch below)
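
A sketch of that trick: a 10 ms setInterval whose ticks are counted into the current 10 ms bucket of each second. The 10 ms delay comes from the slide; the bucket bookkeeping and the reporting format are assumptions:

    // tick-heatmap.js: count event-loop ticks per 10 ms window; missed ticks
    // (e.g. during a GC pause) leave gaps that show up in a sub-second heatmap
    const BUCKET_MS = 10;
    const buckets = {};                         // "second:bucket" -> observed ticks

    setInterval(function () {
      const now = Date.now();
      const second = Math.floor(now / 1000);
      const bucket = Math.floor((now % 1000) / BUCKET_MS);
      const key = second + ':' + bucket;
      buckets[key] = (buckets[key] || 0) + 1;   // increment the current time window
    }, BUCKET_MS);

    // once a second, emit the finished second as one JSON log line
    // that the indexer can turn into a row of the heatmap
    setInterval(function () {
      const finished = Math.floor(Date.now() / 1000) - 1;
      const row = [];
      for (let b = 0; b < 1000 / BUCKET_MS; b++) row.push(buckets[finished + ':' + b] || 0);
      console.log(JSON.stringify({ metric: 'tick_heatmap', second: finished, row: row }));
      for (const key in buckets) if (parseInt(key, 10) <= finished) delete buckets[key];
    }, 1000);

Windows in which the event loop was blocked, by GC or by slow synchronous code, end up with zero ticks and stand out in the resulting heatmap.
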
  34. FUTURE
     store more metrics; test for more use-cases; update graphs in real time; run benchmarks continuously; generate reports automatically; easier setup