Non-Functional Testing and Tuning - A Scientific Approach

Kevin Rudland & Andrew Harmel-Law, Capgemini UK http://capgemini.github.io
Non-Functional Testing and Tuning: A Scientific Approach @kevinrudland / @al94781 #test_nfrs_with_science

Who Are We? • Andrew Harmel-Law • Kevin Rudland

ABSTRACT The bit you always write at the end after you’ve
done all the work which summarises your findings.

Meta-Hypothesis: H1: We can give ourselves confidence that our hyper-distributed
systems will meet our Non-Functional Requirements by applying the same method that scientists use to understand and predict the world around us: Science; specifically the Scientific Method.

INTRODUCTION (Pt. 1) The bit where we try and get
you to agree with us about the nature of the problem we need to solve.

Image from: http://www.slideshare.net/InfoQ/migrating-to-cloud-native-with-microservices (slide 66)

Have we lost, or are we in the process of
losing, predictability in the systems we build?

But Wait... “...if you are trying to break new ground
and be really innovative, that's where you have to apply first-principle thinking and try to identify the most fundamental truths in any particular arena and you reason up from there.” Elon Musk, talking about applying the methods of Physics to problems [Profiles in versatility - http://www.aps.org/publications/apsnews/201310/profiles.cfm] Musk at the 2015 Tesla Motors Annual Meeting. Photo by Steve Jurvetson - https://www.flickr.com/photos/jurvetson/18659265152/, CC BY 2.0, https://commons.wikimedia.org/w/index.php?curid=40974345

COUNTER-PROPOSAL: This proliferation is a good thing Scalability? - Up!
Resilience? - Up! Service Availability? - Up! Maintainability? - Up! Monitorability? - Up! Deployment Frequency? - Up! Time-to-change? - Down! Flexibility? - Up! Cost? - Focussed!

Rather than having lost predictability, did we ever really have
it in the first place? And haven’t we actually gained as a result?

So then how do we know we have these qualities?
(Given that we’re agreed we've fundamentally relinquished *control* of them.)

From “The Phoenix Project: A Novel About IT, DevOps, and
Helping Your Business Win” by Gene Kim, Kevin Behr & George Spafford

1. Testing simply to validate is no longer enough
2. We must test to investigate and learn
3. And the best way to do that is to hypothesise, experiment, measure and adapt… Repeatedly.
16th-century illustration of Archimedes in the bath, with Hiero's crown at bottom right By Unknown - Historia N° 767 - Novembre 2010 - page 38, Public Domain, https://commons.wikimedia.org/w/index.php?curid=12080104

Meta-Hypothesis #1: H1: We can give ourselves confidence that our
hyper-distributed systems will meet our Non-Functional Requirements by applying the same method that scientists use to understand and predict the world around us: Science; specifically the Scientific Method.

INTRODUCTION (Pt. 2) The bit after we agree what the problem
is and have come to a possible solution, but still need some details about the approach.

Tip: Acquire Knowledge “The scientific method is a body of techniques for investigating
phenomena [&] acquiring new knowledge” Ibn al-Haytham (Alhazen), considered the father of the modern scientific methodology by Unknown - http://islamicrenaissance.tumblr.com/post/38457296805/ibn-al-haytham-father-of-fiber-optics-born-in, Public Domain, https://commons.wikimedia.org/w/index.php?curid=47132269

Tip: Theories == NFRs Science is driven by theories, and
theories are detailed statements about what we believe to be true. Roger Bacon, sometimes credited as being one of the earliest European advocates of the modern scientific method By Jan Verhas - Britannica upload from an 1867 original, Public Domain, https://commons.wikimedia.org/w/index.php?curid=44315585

Example: NFR-Driven “Theories”
Example 1: “The System will be able to support peak volumes of 100,000 requests per minute. These requests are to have a response time of 0.5 seconds or less, with less than 1% errors per request reported, considering 2000 requests per second.”
Example 2: “Scaling Out - Adding additional instances of the API Service on additional servers increases the throughput near-linearly.”
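A statement like Example 2 only becomes testable once "near-linearly" is given a number. The sketch below is one way to do that, not something taken from the talk: the function names, the sample throughput figures and the 0.8 efficiency floor are all illustrative assumptions.

```python
def scaling_efficiency(throughput_by_instances):
    """Return {instance_count: efficiency} relative to the one-instance baseline.

    An efficiency of 1.0 means perfectly linear scaling.
    """
    baseline = throughput_by_instances[1]
    return {
        n: tput / (n * baseline)
        for n, tput in sorted(throughput_by_instances.items())
    }


def is_near_linear(throughput_by_instances, min_efficiency=0.8):
    """True when every instance count reaches the (assumed) efficiency floor."""
    efficiencies = scaling_efficiency(throughput_by_instances)
    return all(e >= min_efficiency for e in efficiencies.values())


# Hypothetical measurements: instance count -> requests per second.
measured = {1: 500.0, 2: 950.0, 4: 1800.0}
print(scaling_efficiency(measured))
print(is_near_linear(measured))
```

Whatever floor you pick, agree it before the run, so the theory can actually be shown to be false.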

Tip: Falsifiability "A scientific hypothesis must be falsifiable [...] otherwise,
[it] cannot be meaningfully tested" Johannes Kepler, believed by some to be the archetype of the inductive scientific genius By Unknown - Kopie eines verlorengegangenen Originals von 1610 im Benediktinerkloster in Kremsmünster, Public Domain, https://commons.wikimedia.org/w/index.php?curid=470711

Example: Hypotheses
Example 1:
H1: WHEN xxx requests per second are sent to the ABC_SERVICE THEN the average will respond successfully within 0.5 seconds and less than 1% will result in an error
H0: WHEN … THEN the average will respond in greater than 0.5 seconds OR more than 1% will result in an error

Tip: Prioritise Counterexamples "no number of positive outcomes [...] can
confirm a scientific theory, but a single counterexample is logically decisive" Karl Popper, By LSE library - http://www.flickr.com/photos/lselibrary/3833724834/in/set-72157623156680255/, No restrictions, https://commons.wikimedia.org/w/index.php?curid=9694262

never happens Prioritisation: Scalability

Prioritisation: SLAs & Resilience

Prioritisation: Ohhh, Shiny and New!

Example: A Scary Hypothesis: Tolerance of Downstream Failures
Example 3:
H1: Failure of the ABC service to connect to the XYZ service is handled without manual intervention, logged, and reported back to the ABC service within 100ms of the call to the XYZ service being made. If there are > X errors in a 10 second window, the circuit-breaker in the ABC service opens
H0: Failure of the ABC service to connect to the XYZ service causes failures which need manual intervention, which are not logged, and the circuit-breaker in the ABC service does not open, no matter how many, or how frequently, errors are seen
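To test the circuit-breaker clause of this hypothesis it helps to be precise about what "more than X errors in a 10 second window" means. The sketch below is a hypothetical stand-in written for illustration; it is not the breaker used by the ABC service, and the class name, the `max_errors` parameter and the choice to pass timestamps in explicitly (so a test run is repeatable) are all assumptions.

```python
from collections import deque


class SlidingWindowBreaker:
    """Opens once more than max_errors errors fall inside the time window."""

    def __init__(self, max_errors, window_seconds=10.0):
        self.max_errors = max_errors
        self.window_seconds = window_seconds
        self._error_times = deque()
        self.is_open = False

    def record_error(self, now):
        """Record an error at time `now` (seconds) and open if over the limit."""
        self._error_times.append(now)
        self._evict(now)
        if len(self._error_times) > self.max_errors:
            self.is_open = True

    def _evict(self, now):
        # Drop errors that are at least window_seconds older than `now`.
        while self._error_times and now - self._error_times[0] >= self.window_seconds:
            self._error_times.popleft()


# A burst of four errors within the window opens the breaker (X = 3 here).
burst = SlidingWindowBreaker(max_errors=3)
for t in (0.0, 1.0, 2.0, 3.0):
    burst.record_error(t)
print(burst.is_open)

# The same number of errors spread over a longer period does not.
spread = SlidingWindowBreaker(max_errors=3)
for t in (0.0, 5.0, 10.0, 15.0):
    spread.record_error(t)
print(spread.is_open)
```

The two scenarios at the end are the kind of paired experiment the hypothesis calls for: one that should open the breaker and one that should not.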

Tip: Apply Occam’s Razor “the simplest explanation is usually the
correct one” William of Ockham, credited with originating Occam’s Razor By self-created (Moscarlop) - Own work, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=5523066

METHOD The bit where we discuss how we’re going to
get to where we think we want to get

Tip: Be Repeatable "If an experiment cannot be repeated to
produce the same results, this implies that the original results might have been in error." Marie Curie, conducted pioneering research on radioactivity By Unknown - Christie's, [1], Public Domain, https://commons.wikimedia.org/w/index.php?curid=32735756

Tip: Experiments Typically Consist of... • Subjects • Their Environment • The Independent Variables •
(The Dependent Variables) Milgram’s Obedience Experiment Image from: https://thoughtmaybe.com/the-milgram-experiment-obedience/

Subjects Think: • How many of them? • How are
they connected? • What do they already know? Milgram’s Obedience Experiment Image from: https://thoughtmaybe.com/the-milgram-experiment-obedience/

Example: Hypothesis - v.2
GIVEN n instances of the ABC_SERVICE with the DEFAULT_CONFIGURATION and empty REQUEST_CACHES, and which are connected to n instances of the downstream XYZ_SERVICE with the DEFAULT_CONFIGURATION via EUREKA over the PROD_NETWORK
H1: WHEN xxx requests per second are sent to the ABC_SERVICE THEN the average will respond successfully within 0.5 seconds and less than 1% will result in an error
H0: WHEN xxx requests per second are sent to the ABC_SERVICE THEN the average will not respond successfully within 0.5 seconds OR more than 1% will result in an error

Tip: Sometimes, Keep Looking “Absence of evidence is not evidence
of absence.” Chien-Shiung Wu, Chinese American experimental physicist who made significant contributions in the field of nuclear physics. She is best known for conducting the Wu experiment, which contradicted the hypothetical law of conservation of parity By Smithsonian Institution from United States - Chien-shiung Wu (1912-1997), uploaded by Fæ, No restrictions, https://commons.wikimedia.org/w/index.php?curid=18877882

WARNING! Sometimes things go very well in test and this
can give you a false sense of security: Sometimes you can be live-like, but not live-like enough (and didn’t realise). What went wrong? How did we manage to predict things?

Meta-Hypothesis #2: H1: the only environment exactly like production is
production, therefore only results from testing in production really count.

Tip: Fake it “To be ecologically valid, the methods, materials
and setting of a study must approximate the real-life situation that is under investigation.” Francis Bacon, philosophical advocate and practitioner of the scientific method By Paul van Somer (1576/1578–1622) - pl.pinterest.com, Public Domain, https://commons.wikimedia.org/w/index.php?curid=19958108

Tip: Axes of Realism So, if science can do it,
what are our axes of realism? • Environment ◦ Configuration ◦ Data ◦ Platform and connectivity ◦ Monitoring • Also, are you in: ◦ a Normal-Running or a DR scenario? • And are there: ◦ events such as a fail and recover? ◦ or a slow down and speed up? Etc.

Tip: Realism Techniques • Take a copy of prod data
• Take a copy of prod requests • Close observation of prod followed by replication in the “lab”

Method: Data • Of all the environmental elements, data can
have the MOST SIGNIFICANT IMPACT • Easiest to change • This is the type of data, its mix and variety • Look at your data model – Examine: 1:n – Examine: nullable

Tip: Manipulate Independent Variables Charles R. Drew, developed improved techniques for blood storage, and
applied his expert knowledge to developing large-scale blood banks By Associated Photographic Services, Inc - Original Repository: Howard University. Moorland-Spingarn Research Center. Charles R. Drew Papers http://profiles.nlm.nih.gov/ps/retrieve/ResourceMetadata/BGBBCT, https://en.wikipedia.org/w/index.php?curid=47837720

Hard-Coded Successful Requests

A Mix of Good and Bad Requests

Example: Hypothesis - v.3.1
GIVEN n instances of the ABC_SERVICE with the DEFAULT_CONFIGURATION and empty REQUEST_CACHES, and which are connected to n instances of the downstream XYZ_SERVICE with the DEFAULT_CONFIGURATION via EUREKA over the PROD_NETWORK
H1: WHEN a mix of xxx VALID requests per second and yyy INVALID requests per second are sent to the ABC_SERVICE THEN the average will respond correctly within 0.5 seconds and less than 1% will result in an error
H0: WHEN a mix of xxx VALID requests per second and yyy INVALID requests per second are sent to the ABC_SERVICE THEN the average will not respond correctly within 0.5 seconds OR more than 1% will result in an error
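A mix of xxx VALID and yyy INVALID requests per second is easier to reproduce if the schedule is generated from a fixed seed rather than hand-written. The sketch below is a minimal illustration in Python, not the JMeter set-up from the talk; the function name, the tuple layout and the default seed are assumptions.

```python
import random


def build_request_mix(valid_per_second, invalid_per_second, seconds, seed=42):
    """Return a list of (second, kind) tuples, kind being "VALID" or "INVALID".

    Within each second the order is shuffled, but the same seed always
    produces the same schedule, so a run can be repeated exactly.
    """
    rng = random.Random(seed)
    schedule = []
    for second in range(seconds):
        batch = ["VALID"] * valid_per_second + ["INVALID"] * invalid_per_second
        rng.shuffle(batch)
        schedule.extend((second, kind) for kind in batch)
    return schedule


# Hypothetical mix: 3 valid and 1 invalid request per second for 2 seconds.
schedule = build_request_mix(valid_per_second=3, invalid_per_second=1, seconds=2)
for second, kind in schedule:
    print(second, kind)
```

Each tuple would then be turned into a real request by whatever load tool is in use.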

JMeter - Dynamic Request Gen

Dynamic Request Generation

Test Plus Background Load Mix

Example: Hypothesis - v.3.2
GIVEN n instances of the ABC_SERVICE with the DEFAULT_CONFIGURATION and empty REQUEST_CACHES, and which are connected to n instances of the downstream XYZ_SERVICE with the DEFAULT_CONFIGURATION via EUREKA over the PROD_NETWORK with BACKGROUND_LOAD
H1: WHEN a mix of xxx VALID requests per second and yyy INVALID requests per second are sent to the ABC_SERVICE THEN the average will respond correctly within 0.5 seconds and less than 1% will result in an error
H0: WHEN a mix of xxx VALID requests per second and yyy INVALID requests per second are sent to the ABC_SERVICE THEN the average will not respond correctly within 0.5 seconds OR more than 1% will result in an error

Method: Events Dear boy… An 1888 lithograph of the 1883
eruption of Krakatoa. By Lithograph: Parker & Coward, Britain; - Image published as Plate 1 in The eruption of Krakatoa, and subsequent phenomena. Report of the Krakatoa Committee of the Royal Society (London, Trubner & Co., 1888)., Public Domain, https://commons.wikimedia.org/w/index.php?curid=7696837

Method: Execute • Manipulate the independent variables: – e.g. req./sec.
– e.g. concurrent users – e.g. req. mix • One at a time
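"One at a time" can be enforced mechanically by generating the run list from a baseline, so that no run ever changes two independent variables at once. This is a sketch under assumptions: the variable names, their values and the function name are illustrative, not from the talk.

```python
def one_at_a_time_plan(baseline, variations):
    """Return run configurations that each differ from baseline in one variable.

    The first entry is the unchanged baseline itself.
    """
    plan = [dict(baseline)]
    for name, values in variations.items():
        for value in values:
            if value == baseline[name]:
                continue  # already covered by the baseline run
            run = dict(baseline)
            run[name] = value
            plan.append(run)
    return plan


# Hypothetical independent variables and the values to try for each.
baseline = {"req_per_sec": 100, "concurrent_users": 50, "invalid_ratio": 0.0}
variations = {
    "req_per_sec": [100, 200, 400],
    "concurrent_users": [50, 100],
}

plan = one_at_a_time_plan(baseline, variations)
for run in plan:
    print(run)
```

Any difference in the dependent variables between a run and the baseline can then be attributed to the single variable that changed.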

So, back at Meta-Hypothesis #2: Proven - H0: We can
proceed without testing in production

RESULTS The bit where we look at the outputs collected from
testing our hypotheses

Tip: Be Empirical "To be termed scientific, a method of
inquiry is commonly based on empirical or measurable evidence" Rosalind Franklin, had a key role (largely unacknowledged during her lifetime) in discovering the helical structure of DNA By Jewish Chronicle Archive/Heritage-Images http://www.britannica.com/EBchecked/topic-art/217394/99712/Rosalind-Franklin, Fair use, https://en.wikipedia.org/w/index.php?curid=24959067

Tip: Measure Dependent Variables Sir C. V. Raman, winner of
the 1930 Nobel Prize for Physics for his work on light scattering By Nobel Foundation - From Nobel Lectures, Physics 1922-1941, Elsevier Publishing Company, Amsterdam, 1965, Public Domain, https://commons.wikimedia.org/w/index.php?curid=4213636

DISCUSSION The bit where we try and make coherent sense
of what we found

Tip: Investigate • Tally your counts • Attribute all noted
errors • and resource utilization The Stanford Prison Experiment Image from http://www.prisonexp.org/the-story/

Tip: Stats Can Help • Why do you care? •
Means mean nothing • The impacts of (statistical) distribution
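A small worked example shows why means mean nothing on a skewed distribution. The response times below are invented for illustration, and the nearest-rank percentile helper is one common definition among several; neither comes from the talk.

```python
import math
import statistics


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least
    pct percent of all samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical response times in seconds: 90 fast requests, 10 very slow ones.
response_times = [0.1] * 90 + [4.0] * 10

print("mean:", statistics.mean(response_times))
print("p50: ", percentile(response_times, 50))
print("p95: ", percentile(response_times, 95))
```

The mean comes out just under the 0.5 second target, so an "average" hypothesis would pass, yet one request in ten takes four seconds. The 95th percentile exposes that immediately.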

Stats: Data Distribution

A Better Example

Stats: Percentiles

A Better Example (Again)

Example: Hypothesis - v.4
GIVEN n instances of the ABC_SERVICE with the DEFAULT_CONFIGURATION and empty REQUEST_CACHES, and which are connected to n instances of the downstream XYZ_SERVICE with the DEFAULT_CONFIGURATION via EUREKA over the PROD_NETWORK with BACKGROUND_LOAD
H1: WHEN a mix of xxx VALID requests per second and yyy INVALID requests per second are sent to the ABC_SERVICE THEN 95% will respond correctly within 0.5 seconds and less than 1% will result in an error
H0: WHEN a mix of xxx VALID requests per second and yyy INVALID requests per second are sent to the ABC_SERVICE THEN fewer than 95% will respond correctly within 0.5 seconds OR more than 1% will result in an error
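Once the hypothesis is phrased this precisely, the H1-or-H0 decision can be automated. The sketch below assumes each request was recorded with its response time, whether the response was correct, and whether it ended in an error; the `Sample` record and the `decide` function are hypothetical names, and the thresholds simply mirror the figures in the hypothesis.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    seconds: float  # observed response time
    correct: bool   # response matched what was expected for that request
    error: bool     # request ended in an error


def decide(samples, max_seconds=0.5, min_fraction=0.95, max_error_rate=0.01):
    """Return "H1" if both conditions of H1 hold for the samples, else "H0"."""
    total = len(samples)
    in_time = sum(1 for s in samples if s.correct and s.seconds <= max_seconds)
    errors = sum(1 for s in samples if s.error)
    h1_holds = in_time / total >= min_fraction and errors / total < max_error_rate
    return "H1" if h1_holds else "H0"


# Hypothetical run: 96 quick correct responses and 4 slow ones, no errors.
run = [Sample(0.2, True, False)] * 96 + [Sample(1.5, True, False)] * 4
print(decide(run))
```

Anything other than a clean H1 falls to H0, which keeps the result falsifiable rather than open to interpretation after the fact.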

Decide: H1 or H0? Did the test pass or fail?

CONCLUSION The bit where we sum up. (And also perhaps invite
some questions.)

Conclusion: Refine and Retest • We did not verify H1
:( • Do you need to look elsewhere? • Revisit H0 / H1 - Apply Occam’s Razor • Only change 1 thing at a time

Tip: Tune & Fix Too • Change one thing at
a time (unless it’s your DVs) • Think!: ◦ Will it have the desired effect? ◦ Will it have side-effects? ◦ Tuning the slowest won’t always win Image: cc: Kalle Hyttinen - https://www.flickr.com/photos/129560755@N03

Tip: Re-Run (again) Always re-baseline after a change / fix
has gone in - make no other changes

N.b. Your Targets Don't get carried away

ABSTRACT The bit you always write at the end after you’ve
done all the work which summarises your findings.

Meta-Hypothesis #1: H1: We can give ourselves confidence that our
hyper-distributed systems will meet our Non-Functional Requirements by applying the same method that scientists use to understand and predict the world around us: Science; specifically the Scientific Method. PROVEN

QUESTIONS? The bit everyone dreads in case they got something wrong
along the way.

cc: ptrlx - https://www.flickr.com/photos/58615912@N05 Oh, and did we mention we
(Capgemini) are hiring? (see bit.ly/cg-jvm-jobs-nfr for lots more info)

APPENDICES The bit where you dump things that didn’t fit anywhere
else.

Marginalia: Record It • Use templates • Visualise • Summarize
• Supporting evidence Marginalia Image from: http://spec.lib.miamioh.edu/home/marginalia-from-the-stacks/
