A talk about profiling Node.js in production using Linux perf_events and FlameGraph/FlameScope, and a couple of findings.
See https://shuheikagawa.com/blog/2018/09/16/node-js-under-a-microscope/ for more details
Profiling Node.js apps on production
Hi, I'm... Shuhei Kagawa (@shuheikagawa), Software Engineer at Zalando
Microservices
Node.js servers with React SSR
Performance issue
Mysterious Gap: 500 milliseconds between API server and API client
test env / production env
Linux perf
Small overhead
JS & native
node --perf-basic-prof-only-functions
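With this flag, Node.js writes a map file (/tmp/perf-<pid>.map) that lets `perf` translate addresses of JIT-compiled JS functions into names. The sketch below shows how such a file can be read, assuming the usual "start size symbol" layout with hexadecimal numbers. The helper names and the two sample lines are hypothetical; real symbol names depend on the Node.js version.

```javascript
// Parse one line of a perf map file: "<start hex> <size hex> <symbol name>".
function parsePerfMapLine(line) {
  const match = /^([0-9a-f]+) ([0-9a-f]+) (.+)$/i.exec(line.trim());
  if (!match) return null;
  return {
    start: BigInt('0x' + match[1]),
    size: BigInt('0x' + match[2]),
    symbol: match[3],
  };
}

// Find the symbol whose [start, start + size) range contains the address.
function resolveSymbol(entries, address) {
  const hit = entries.find(
    (e) => address >= e.start && address < e.start + e.size
  );
  return hit ? hit.symbol : null;
}

// Hypothetical sample lines, for illustration only.
const sampleMap = [
  '2a8c1e04a0 1f8 LazyCompile:*render /app/server.js:42',
  '2a8c1e0700 90 LazyCompile:~parseBody /app/api.js:10',
];
const entries = sampleMap.map(parsePerfMapLine).filter(Boolean);

console.log(resolveSymbol(entries, 0x2a8c1e0500n));
```

Without the map file, `perf` can only show raw addresses for JS frames, while native frames are still resolved from the binary's own symbols.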
# Install dependencies for `perf` command
sudo apt-get install linux-tools-common
sudo apt-get install linux-tools-$(uname -r)
# Record stack traces 99 times per second for 30 seconds
sudo perf record -F 99 -p ${pid} -g -- sleep 30s
# Generate human readable stack traces
sudo perf script > stacks.${pid}.out
2,970 Stack Traces!!!
• Image from http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html
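The figure of 2,970 matches 99 samples per second times 30 seconds. Nobody reads that many stacks one by one, so flame graph tooling first folds identical stacks into counts. A minimal sketch of that folding step, assuming the stacks are already parsed into arrays of frame names (leaf first, the order `perf script` prints them); the function name and sample frames are hypothetical:

```javascript
// 99 samples per second for 30 seconds.
const expectedSamples = 99 * 30;

// Fold identical stacks into "root;...;leaf count" lines.
// Each input stack is an array of frame names, leaf first.
function foldStacks(stacks) {
  const counts = new Map();
  for (const frames of stacks) {
    const key = [...frames].reverse().join(';');
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return [...counts].map(([key, count]) => `${key} ${count}`).sort();
}

// Hypothetical sample stacks, for illustration only.
const stacks = [
  ['JSON.parse', 'handleResponse', 'main'],
  ['JSON.parse', 'handleResponse', 'main'],
  ['renderToString', 'handleRequest', 'main'],
];

console.log(expectedSamples);
console.log(foldStacks(stacks).join('\n'));
```

Each folded line becomes one tower in the flame graph, and the count determines its width.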
CPU Flame Graph by Brendan Gregg (image from https://github.com/brendangregg/FlameGraph)
CPU Flame Graph: GZIP, JSON.parse, JSON.parse, React
Nothing looks so wrong…?
FlameScope by Netflix cloud performance team: https://github.com/Netflix/flamescope
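FlameScope adds a time axis: it draws a subsecond-offset heat map with one column per second and one row per slice of that second, so short bursts of CPU activity that vanish in an aggregated flame graph show up as dark patches. A rough sketch of the bucketing, assuming sample timestamps in seconds; the function name is hypothetical and the default of 50 rows per second is an assumption:

```javascript
// Bucket sample timestamps (in seconds) into a grid:
// grid[column][row] = number of samples, one column per second,
// one row per 1/rowsPerSecond slice of that second.
function subsecondHeatmap(timestamps, rowsPerSecond = 50) {
  if (timestamps.length === 0) return [];
  let first = Infinity;
  let last = -Infinity;
  for (const t of timestamps) {
    first = Math.min(first, Math.floor(t));
    last = Math.max(last, Math.floor(t));
  }
  const grid = Array.from({ length: last - first + 1 }, () =>
    new Array(rowsPerSecond).fill(0)
  );
  for (const t of timestamps) {
    const column = Math.floor(t) - first;
    const offset = t - Math.floor(t); // position within the second
    const row = Math.min(rowsPerSecond - 1, Math.floor(offset * rowsPerSecond));
    grid[column][row] += 1;
  }
  return grid;
}

// Hypothetical timestamps, using 4 rows per second to keep the output small.
console.log(subsecondHeatmap([10.1, 10.6, 10.7, 11.9], 4));
```

A process that is busy for more than a second fills whole columns, while a periodic burst appears as a repeating patch.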
Finding 1: Metrics Collection from Histograms. Busy for ~1.5s!
Finding 1: Metrics Collection from Histograms. JSON.stringify() / JSON.parse() in a metrics library
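Both calls are synchronous, so while they run on a large object the event loop cannot serve any request. The sketch below illustrates the effect, assuming purely for illustration that the round trip copies histogram data; the slide does not say how the library uses the calls, and every name and size here is hypothetical.

```javascript
// Hypothetical stand-in for a metrics library's histogram state.
function makeHistograms(count, bucketsPerHistogram) {
  const histograms = {};
  for (let i = 0; i < count; i++) {
    histograms[`metric_${i}`] = Array.from(
      { length: bucketsPerHistogram },
      (_, bucket) => bucket * i
    );
  }
  return histograms;
}

// Deep copy via a JSON round trip. Synchronous: nothing else runs
// on the event loop until it returns.
function cloneViaJson(value) {
  return JSON.parse(JSON.stringify(value));
}

// Measure how long a synchronous function occupies the event loop.
function timeSync(fn) {
  const start = process.hrtime.bigint();
  const result = fn();
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  return { result, elapsedMs };
}

const histograms = makeHistograms(200, 1000);
const { result: copy, elapsedMs } = timeSync(() => cloneViaJson(histograms));
console.log(
  `cloned ${Object.keys(copy).length} histograms in ${elapsedMs.toFixed(1)} ms`
);
```

The elapsed time grows with the size of the data, and every request that arrives in the meantime waits for at least that long.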
Finding 2: Garbage Collection. Busy for ~400ms once in ~10s
Finding 2: Garbage Collection. Unused fallback cache was causing slow GCs
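GC pauses like these can also be watched from inside the process. A minimal sketch using `PerformanceObserver` from Node's `perf_hooks` module with the `gc` entry type; the `summarizeGc` and `watchGc` helpers and the sample durations are hypothetical, not something shown in the talk.

```javascript
const { PerformanceObserver } = require('perf_hooks');

// Summarize GC pauses from entries that carry a `duration` in milliseconds.
function summarizeGc(entries) {
  let totalMs = 0;
  let longestMs = 0;
  for (const entry of entries) {
    totalMs += entry.duration;
    longestMs = Math.max(longestMs, entry.duration);
  }
  return { count: entries.length, totalMs, longestMs };
}

// Start observing GC events and report a summary per batch.
// Returns a function that stops the observer.
function watchGc(onSummary) {
  const observer = new PerformanceObserver((list) => {
    onSummary(summarizeGc(list.getEntries()));
  });
  observer.observe({ entryTypes: ['gc'] });
  return () => observer.disconnect();
}

// Hypothetical durations, for illustration only.
console.log(summarizeGc([{ duration: 5 }, { duration: 400 }, { duration: 12 }]));
```

In a server, `const stop = watchGc(console.log)` would log a summary whenever GC entries arrive, which makes a long pause stand out without attaching `perf`.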
p99 response time: 50% ⬇
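For reference, p99 is the response time that 99% of requests stay at or below, so it reflects the slow tail rather than the typical request. A small sketch of the nearest-rank method, which is one of several common definitions; the function name is hypothetical and the input is synthetic.

```javascript
// Nearest-rank percentile: the smallest sample such that at least
// p percent of all samples are less than or equal to it.
function percentile(values, p) {
  if (values.length === 0) throw new Error('no samples');
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}

// Synthetic response times of 1..100 ms, for illustration only.
const responseTimesMs = Array.from({ length: 100 }, (_, i) => i + 1);
console.log(percentile(responseTimesMs, 99));
```

Occasional long pauses, such as a blocked event loop or a slow GC, affect only a small share of requests, which is why they show up in p99 long before they move the median.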
Be Bold
Thank you!
Links
• CPU Flame Graphs
• FlameScope
• A sample project
• How to fix wrong symbols
• Node.js under a Microscope