CodeFest 2018. Eric Proegler (Medidata Solutions) — Interpreting Performance Test Results


Watch Eric's talk:

You’ve worked hard to define, develop and execute a performance test on an application to determine its behavior under load. What’s next? The answer is definitely not to generate and send a canned report from your testing tool, or to scoop up averages. Results interpretation and reporting is where a performance tester earns their stripes.

We’ll look at some results from actual projects and together puzzle out the essential message in each. This will be an interactive session where a graph is displayed, a little context is provided, and you are asked “what do you see here?” We will form hypotheses, draw tentative conclusions, determine what further information we need to confirm them, and identify key target graphs that give us the best insight on system performance and bottlenecks.

We’ll try to codify the analytic steps we went through in the first session, and consider a CAVIAR approach for collecting and evaluating test results: Collecting, Aggregating, Visualizing, Interpreting, Analyzing, And Reporting.

Session Takeaways:
• Training in interpreting results: Data + Analysis = Information
• Examples of telling performance test graphs
• Advice on reporting: compel action with your information



April 05, 2018


  1. Interpreting Performance Test Results (Eric Proegler, Director, Test Engineering)

  2. Eric Proegler
     • Former Perf Consultant, Perf Tool Product Manager
     • 22 Years in Software, 18 in Testing
     • VP and Treasurer, AST
     • Lead Organizer, WOPR
     • Podcast Host, PerfBytes
     • @ericproegler on Twitter
  3. (graph)

  4. (graph)

  5. (graph)

  6. Summary
     • 0.5 sec is the baseline response time
     • 2.4 sec is almost five times as long (2x heuristic)
     • Linear increase with load means an exhausted resource
     • Stability under load means no crashing or obvious decay
     • Beware average/median data sources
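The "beware averages" point can be shown with a minimal sketch using made-up numbers in the spirit of the slide (0.5 s baseline, a slow tail near 2.4 s): a small fraction of slow responses barely moves the mean, while the 90th percentile exposes them.

```python
import statistics

# Hypothetical response times in seconds: mostly at the 0.5 s baseline,
# with 10% of requests near the 2.4 s "almost five times as long" mark.
samples = [0.5] * 90 + [2.4] * 10

mean = statistics.mean(samples)
p90 = statistics.quantiles(samples, n=100)[89]  # 90th percentile cut point

print(f"mean = {mean:.2f} s")  # 0.69 s: looks almost fine
print(f"p90  = {p90:.2f} s")   # well above 2 s: what tail users actually feel
```

The mean suggests the system is healthy; the percentile tells the real story, which is why summary statistics should always include scatter and percentiles, not just averages.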
  7. (graph)

  8. Summary
     • Load Test to 2,000 users, ramp over 34 mins, 70 min test
     • Node 1 CPU @100% @1,500 users; Node 2 CPU @0%
     • Bad load balancing - very common in test environments
     • Overloaded CPU on one node - two nodes likely enough
       ◦ Beware aggregation
       ◦ Incorrect information about capacity
       ◦ Bad response time info
       ◦ Errors?
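The "beware aggregation" warning above can be illustrated with a tiny sketch using the slide's hypothetical numbers: averaging CPU across nodes hides the fact that one node is saturated while the other is idle.

```python
# Hypothetical per-node CPU readings from the slide's scenario:
# Node 1 is pinned at 100% while the load balancer sends Node 2 nothing.
node1_cpu = 100.0  # % utilization, saturated
node2_cpu = 0.0    # % utilization, idle

cluster_avg = (node1_cpu + node2_cpu) / 2
cluster_max = max(node1_cpu, node2_cpu)

print(f"cluster average CPU: {cluster_avg:.0f}%")  # 50%: looks healthy
print(f"max node CPU:        {cluster_max:.0f}%")  # 100%: the real bottleneck
```

The aggregated view reports comfortable headroom; only the per-node view reveals the broken load balancing and the true capacity limit.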
  9. (graph)

  10. Summary
     • Load Test to 100 users, ramp over 45 mins, 12 hr soak test
     • Response time for most expensive transaction tracked - variable but consistent
     • Web Server CPU also steady
     • Lack of degradation suggests the system is stable / not leaky
  11. Increase in RT (graph)

  12. Summary
     • Load Test to 200 users, ramp over 45 mins, 1:45 test
     • Response time for most expensive transaction again - increase seems to track load, but returns
     • Web Server CPU steady throughout, ~2x of the soak test
     • 3rd-party calls behind the transaction causing variability
  13. (graph)

  14. Summary
     • Load Test to 20,000 users, ramp over 15 mins, 30 min test
     • Response time ~30 ms, inflection @20 mins
     • Message queue backup was the cause
  15. (graph)

  16. (graph)

  17. (graph)

  18. (graph)

  19. (graph)

  20. Session Stats, Summarized
     • 9.8M Sessions / 157M Pageviews peak month
     • 990K Sessions / 19M Pageviews peak day
     • 88K Sessions / 1.47M Pageviews peak hour
     • 8K Sessions / 25K Pageviews peak minute
  21. Extrapolating peak load across time buckets. Each row takes the peak measured at one granularity - (M)onth, (D)ay, (H)our, (m)inute - and spreads it evenly across the finer buckets; Ratio is page views per session. The coarser the measured bucket, the worse the peak-minute estimate:

                          Month           Day         Hour    Minute   Ratio
      Sessions (M)      9,800,000       326,667      13,611      227
      Page Views (M)  157,000,000     5,233,333     218,056    3,634      16
      Sessions (D)     29,700,000       990,000      41,250      688
      Page Views (D)  570,000,000    19,000,000     791,667   13,194      19
      Sessions (H)     63,360,000     2,112,000      88,000    1,467
      Page Views (H) 1,058,400,000   35,280,000   1,470,000   24,500      17
      Sessions (m)    345,600,000    11,520,000     480,000    8,000
      Page Views (m) 1,080,000,000   36,000,000   1,500,000   25,000       3
      Verdict           Comical        Wrong         Close     Good
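The table's point can be reproduced with a short sketch using the session numbers from the "Session Stats" slide: dividing a coarser peak evenly into minutes badly underestimates the true 8,000-session peak minute, and the estimate only gets close when the measured bucket is fine enough.

```python
# Peak sessions measured at each granularity (from the slide).
peak = {"month": 9_800_000, "day": 990_000, "hour": 88_000, "minute": 8_000}
minutes_in = {"month": 30 * 24 * 60, "day": 24 * 60, "hour": 60, "minute": 1}

# Spread each measured peak evenly across its minutes.
estimates = {bucket: peak[bucket] / minutes_in[bucket] for bucket in peak}

for bucket, est in estimates.items():
    print(f"from peak {bucket:6}: {est:8.0f} sessions/min")
# month -> ~227 ("comical"), day -> ~688 ("wrong"),
# hour -> ~1,467 ("close"), minute -> 8,000 ("good")
```

This is why sizing a load test from monthly or daily traffic totals is dangerous: real load is bursty, and only fine-grained peaks capture the rate the system must actually survive.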
  22. …Think “CAVIAR”: Collecting, Aggregating, Visualizing, Interpreting, Assessing, Reporting
  23. Collecting: Gather all results from the test that
     • help gain confidence in results validity
     • portray system scalability, throughput & capacity
     • provide bottleneck / resource limit diagnostics
     • help formulate hypotheses
  24. Aggregating: Summarize measurements using
     • various-sized time buckets to provide tree & forest views
     • consistent time buckets across metric types to enable accurate correlation
     • meaningful statistics: scatter, min-max range, variance, percentiles
     • multiple metrics to “triangulate”, confirming (or invalidating) hypotheses
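The aggregation step above can be sketched as follows; this is a minimal illustration with assumed inputs (a list of `(timestamp, response_time)` samples), not any particular tool's output format. Using one fixed bucket width for every metric is what makes later correlation against CPU or error graphs line up.

```python
from collections import defaultdict
import statistics

BUCKET = 60  # seconds per bucket; keep the same width across all metrics

def aggregate(samples):
    """samples: iterable of (timestamp_seconds, response_time) pairs.
    Returns {bucket_start: summary stats} for each time bucket."""
    buckets = defaultdict(list)
    for ts, rt in samples:
        buckets[int(ts // BUCKET) * BUCKET].append(rt)
    return {
        start: {
            "min": min(rts),
            "max": max(rts),
            "p90": statistics.quantiles(rts, n=10)[8],  # 90th percentile
            "count": len(rts),
        }
        for start, rts in sorted(buckets.items())
    }

# Hypothetical data: two minutes of traffic, the second minute much slower.
data = [(t, 0.5) for t in range(0, 60, 2)] + [(t, 2.4) for t in range(60, 120, 2)]
for start, stats in aggregate(data).items():
    print(start, stats)
```

Re-running the same aggregation with a larger `BUCKET` gives the "forest" view; the small bucket gives the "tree" view where the degradation in the second minute stands out.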
  25. Visualizing: Data Sensemaking. Key graphs, in order of importance:
     • Errors over load (“are the results valid?”)
     • Bandwidth throughput over load (“is the system bottlenecked?”)
     • Response time over load (“how does the system scale?”)
       ◦ Business process end-to-end
       ◦ Page level (min-avg-max-SD-90th percentile)
     • System resources (“how’s the infrastructure capacity?”)
       ◦ Server CPU over load
       ◦ JVM heap memory/GC
       ◦ DB lock contention, I/O latency
  26. Interpreting: Draw conclusions from observations and hypotheses
     • Observations: objective, quantitative statements (“I observe that…”); no evaluation at this point!
     • Correlations: correlate / triangulate graphs and data (“Comparing graph A to graph B…”); relate observations to each other
     • Hypotheses: develop from correlated observations (“It appears as though…”); test these with the extended team and achieve consensus; corroborate with other information (anecdotal observations, manual tests)
     • Conclusions: turn validated hypotheses into conclusions (“From observations a, b, c, corroborated by d, I conclude that…”)
  27. Assessing: Turn conclusions into recommendations
     • Tie conclusions back to test objectives: were the objectives met?
     • Determine remediation options at the appropriate level: business, middleware, application, infrastructure, network
     • Perform agreed-to remediation; retest as appropriate
     • Recommendations: specific and actionable at a business or technical level
       ◦ should be reviewed (and if possible, supported) by the teams that need to perform the actions (nobody likes surprises!)
       ◦ should quantify the benefit, if possible the cost, and the risk of not doing it
     • The final outcome is the collective’s judgment, not yours
  28. Reporting: Data + Analysis = INFORMATION
     • Who is your audience?
       ◦ Do they want 50 graphs and 20 tables? Make rich detail available, but don’t make people wade through it
       ◦ Summarize to one page. Summarize to three paragraphs. Summarize to 30 seconds. The 30-second version reaches more people than an email or written report
     • What will you tell them?
       ◦ What did you learn? Study your results, look for correlations.
       ◦ What 3 things will you convey? What is needed to support those 3 things?
       ◦ Discuss findings with technical team members: “What does this look like to you?” Get feedback
  29. Questions? Eric Proegler, Director, Test Engineering. @ericproegler