Optimising Analytical Quality Assurance

Optimising Analytical Software Quality Assurance
Prof. Dr. Stefan Wagner
Software Quality Days 2020

You can copy, share and change, film and photograph, blog, live-blog and tweet this presentation, provided that you attribute it to its author and respect the rights and licences of its parts.
Based on slides by @SMEasterbrook and @ethanwhite

Budget for quality assurance decreases
[Chart: proportion of total budget allocated to QA and testing in %, 2013 to 2019. Data labels on the chart: 23, 26, 26, 31, 35, 26, 23]
Data from: Capgemini, Microfocus. World Quality Report 2019–20

Factors that increase costs
Mean on a scale from 1 to 7, where 7 is "most important"
▪ Increased amount of developments and releases
▪ Shift to Agile/DevOps causing more test iteration cycles
▪ Increased challenges with test environments
▪ Business demands higher IT quality
▪ Increased inefficiency of test activities
Data labels on the chart: 4.85, 4.95, 5.08, 5.39, 5.7
Data from: Capgemini, Microfocus. World Quality Report 2019–20

The PAF Model
cost of quality
  conformance (control cost)
    prevention costs
    appraisal costs
  nonconformance (failure of control cost)
    internal failure
    external failure
J.M. Juran, Juran's Quality Control Handbook, 2009
D. Galin, Software Quality Assurance, 2004

Total cost of software quality
[Figure: cost plotted against quality level, with curves for total control cost, total failure of control cost and total cost of software quality. The minimal total cost marks the optimal quality level.]
D. Galin, Software Quality Assurance, 2004
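The relationship between the three curves can be written as a short worked condition. This is a sketch under assumptions that are not on the slide: q stands for the quality level, both cost curves are differentiable, the minimum lies in the interior of the range, and the symbols C_control, C_failure and q^* are introduced here only for illustration.

```latex
C_{\text{total}}(q) = C_{\text{control}}(q) + C_{\text{failure}}(q)
\qquad
\left.\frac{dC_{\text{total}}}{dq}\right|_{q = q^*} = 0
\;\Longleftrightarrow\;
\frac{dC_{\text{control}}}{dq}(q^*) = -\,\frac{dC_{\text{failure}}}{dq}(q^*)
```

In words: at the optimal quality level, the extra control cost of raising quality slightly equals the failure cost that this avoids.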

Quality economics optimisation – theory
$$E[d(t)] = u + e(t) + \sum_i \bigl(1 - \theta(i, t)\bigr)\, v(i)$$
Labels on the slide: Setup costs (u), Product, Execution costs (e(t)), Removal costs (v(i)), Effort t, Fault i detected with difficulty θ
Wagner (2008)
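A minimal sketch of how the expected direct costs could be evaluated for one fixed effort t. It reads θ(i, t) as the probability that fault i is not detected, which is an assumption about the slide's notation. The class name, method name and all numbers are hypothetical and only serve as illustration.

```java
/** Sketch of the direct-cost formula; names and numbers are illustrative assumptions. */
final class DirectCostSketch {

    /**
     * Computes u + e(t) + sum_i (1 - theta(i, t)) * v(i) for one fixed effort t.
     *
     * @param setupCost     u, the setup costs
     * @param executionCost e(t), the execution costs at the chosen effort
     * @param difficulty    theta(i, t) per fault i, assumed to be the probability of NOT detecting fault i
     * @param removalCost   v(i), the removal costs per fault i
     */
    static double expectedDirectCosts(double setupCost, double executionCost,
                                      double[] difficulty, double[] removalCost) {
        if (difficulty.length != removalCost.length) {
            throw new IllegalArgumentException("need one difficulty and one removal cost per fault");
        }
        double total = setupCost + executionCost;
        for (int i = 0; i < difficulty.length; i++) {
            // Removal costs only arise if the fault is detected, i.e. with probability 1 - theta.
            total += (1.0 - difficulty[i]) * removalCost[i];
        }
        return total;
    }

    public static void main(String[] args) {
        double[] difficulty = {0.5, 0.25};  // hypothetical values
        double[] removalCost = {4.0, 8.0};  // hypothetical values
        // 10 + 5 + 0.5 * 4 + 0.75 * 8
        System.out.println(expectedDirectCosts(10.0, 5.0, difficulty, removalCost)); // prints 23.0
    }
}
```

An optimisation would then compare such values across techniques and efforts, which requires per-fault difficulty and cost data.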

Quality economics optimisation – application
There is not enough data about QA techniques and defects to make sensible optimisations!
Many more studies needed…

Example: Static bug pattern analysis
[Diagram: defects found by testing, found by reviews and found by bug pattern tools]

Cost Type                  Project X (StdConfig)   Project Y (StdConfig)
Report analysis costs      2,240                   402
Internal removal costs     1,120                   201
Saved costs (per defect)   228                     228
Break even (average)       14.7                    2.7

S. Wagner et al. An evaluation of two bug pattern tools for Java. ICST 2008
S. Wagner et al. Comparing Bug Finding Tools with Reviews and Tests. TestCom 2005
Rutar, Almazan, Foster. A Comparison of Bug Finding Tools for Java. ISSRE'04. IEEE CS Press, 2004
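The break-even row can be approximated with a small calculation. The formula below is an assumption, not something stated on the slide: it divides the sum of report analysis and internal removal costs by the saved costs per defect. With the table's figures it gives about 14.7 for Project X, and about 2.6 for Project Y, slightly below the 2.7 on the slide, so the original calculation may use unrounded inputs or a somewhat different formula. The class and method names are hypothetical.

```java
import java.util.Locale;

/** Sketch of a break-even calculation; the formula is an assumed reading of the table. */
final class BreakEvenSketch {

    /** Number of defects a tool would have to reveal before the saved costs cover the spent costs. */
    static double breakEven(double reportAnalysisCosts, double internalRemovalCosts,
                            double savedCostsPerDefect) {
        return (reportAnalysisCosts + internalRemovalCosts) / savedCostsPerDefect;
    }

    public static void main(String[] args) {
        // Input figures taken from the cost table in the deck.
        System.out.println(String.format(Locale.ROOT, "Project X: %.1f", breakEven(2240, 1120, 228)));
        System.out.println(String.format(Locale.ROOT, "Project Y: %.1f", breakEven(402, 201, 228)));
    }
}
```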

There are many factors besides techniques
Platform, Language, Development Process, Business Domain, People, Quality Economics

Does personality influence the usage of static analysis tools?
J. Ostberg, E. Weilemann, S. Wagner. CHASE 2016
Personality trait: High agreeableness | High neuroticism
Work strategy: Small changes in cycles | Larger changes and impulsive decisions

Intermediate summary:
▪ We need to make QA more efficient.
▪ Quantitative, general optimisations not possible.
▪ Many factors influencing QA economics.
▪ Empirical studies are too few.
So what shall we do?

Empirical studies will inform us better and better! So: Do
more studies!

Optimise more locally where we understand what to optimise!

Example 1: Too trivial to test
Identify code regions with low fault risk
Idea: Inverse Defect Prediction (IDP)
(Traditional) Defect Prediction: FAULTY | NON-FAULTY
Inverse Defect Prediction: OTHER | LOW FAULT RISK
R. Niedermayr, T. Röhm, S. Wagner. PeerJ Computer Science, 2019

Association Rule Mining
▪ Identify rules in a large dataset of transactions
▪ Rules describe implications between items
  { antecedent } → { consequent }
▪ Properties of rules
  ▪ support (significance)
  ▪ confidence (precision)
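A minimal sketch of how support and confidence of a rule { antecedent } → { consequent } can be computed. It uses the common textbook definitions (support: share of transactions containing antecedent and consequent together; confidence: share of antecedent matches that also contain the consequent), which may differ in detail from the definitions in the cited study. The item names are borrowed from the rule table in this deck, but the transactions are made-up toy data and the class and method names are hypothetical.

```java
import java.util.List;
import java.util.Set;

/** Sketch of rule support and confidence over a list of transactions (sets of items). */
final class AssociationRuleSketch {

    /** Share of all transactions that contain the antecedent and the consequent together. */
    static double support(List<Set<String>> transactions,
                          Set<String> antecedent, Set<String> consequent) {
        long matchesBoth = transactions.stream()
                .filter(t -> t.containsAll(antecedent) && t.containsAll(consequent))
                .count();
        return (double) matchesBoth / transactions.size();
    }

    /** Among the transactions matching the antecedent, the share that also contains the consequent. */
    static double confidence(List<Set<String>> transactions,
                             Set<String> antecedent, Set<String> consequent) {
        long matchesAntecedent = transactions.stream()
                .filter(t -> t.containsAll(antecedent))
                .count();
        long matchesBoth = transactions.stream()
                .filter(t -> t.containsAll(antecedent) && t.containsAll(consequent))
                .count();
        return matchesAntecedent == 0 ? 0.0 : (double) matchesBoth / matchesAntecedent;
    }

    public static void main(String[] args) {
        // Toy data (hypothetical): one transaction per method, holding the items observed for it.
        List<Set<String>> methods = List.of(
                Set.of("SlocLessThan4", "NoMethodInvocations", "NotFaulty"),
                Set.of("SlocLessThan4", "NoMethodInvocations", "NotFaulty"),
                Set.of("SlocLessThan4", "NoMethodInvocations"),
                Set.of("NotFaulty"));
        Set<String> antecedent = Set.of("SlocLessThan4", "NoMethodInvocations");
        Set<String> consequent = Set.of("NotFaulty");

        System.out.println(support(methods, antecedent, consequent));    // 2 of 4 transactions
        System.out.println(confidence(methods, antecedent, consequent)); // 2 of 3 antecedent matches
    }
}
```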

#  Rule  Support  Confidence
1  { NoMethodInvocations, SlocLessThan4, NoArithmeticOperations, NoNullLiterals } → { NotFaulty }  10.43%  96.76%
2  { NoMethodInvocations, SlocLessThan4, NoArithmeticOperations } → { NotFaulty }  11.03%  96.09%
3  { NoMethodInvocations, SlocLessThan4, NoCastExpressions, NoNullLiterals } → { NotFaulty }  10.43%  95.43%
4  { NoMethodInvocations, SlocLessThan4, NoCastExpressions, NoInstantiations } → { NotFaulty }  10.13%  95.31%
5  { NoMethodInvocations, SlocLessThan4, NoCastExpressions } → { NotFaulty }  11.03%  94.85%

Proportion of low-fault-risk methods
[Figure: two panels, Methods and SLOC, each on an axis from 0% to 100%, for the projects Chart, Closure, Lang, Math, Mockito and Time]
▪ Only 0.3% contain a fault.
▪ Are 5.7 times less likely to contain a fault.

Example 2: Pseudo-tested code
Will my tests be able to tell?
Idea: Remove whole method implementation
R. Niedermayr, E. Juergens, S. Wagner. CSED, 2016

  public int computeFactorial(int n) {
-   int result = 1;
-   for (int i = n; i > 1; i--) {
-     result = result * i;
-   }
-   return result;
+   return 0;
  }
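The idea can be illustrated with a hand-written sketch: run the same test against the original method and against a copy whose body is replaced by a constant return. If the test passes both times, the method is pseudo-tested with respect to that test. This is only an illustration under simplifying assumptions (static methods, tests modelled as predicates); it is not how an actual tool applies the transformation, and the class name, helper names and both tests are hypothetical.

```java
import java.util.function.IntUnaryOperator;
import java.util.function.Predicate;

/** Sketch: detect a pseudo-tested method by stripping its body and re-running the test. */
final class PseudoTestedSketch {

    /** Original implementation, as on the slide (made static here for brevity). */
    static int computeFactorial(int n) {
        int result = 1;
        for (int i = n; i > 1; i--) {
            result = result * i;
        }
        return result;
    }

    /** Transformed version: the whole body is replaced by a constant return. */
    static int computeFactorialStripped(int n) {
        return 0;
    }

    /** True if the test passes on the original and still passes on the stripped version. */
    static boolean isPseudoTested(Predicate<IntUnaryOperator> test,
                                  IntUnaryOperator original, IntUnaryOperator stripped) {
        return test.test(original) && test.test(stripped);
    }

    public static void main(String[] args) {
        // Hypothetical weak test: executes the method (so it is covered) but checks nothing.
        Predicate<IntUnaryOperator> weakTest = f -> { f.applyAsInt(5); return true; };
        // Hypothetical strong test: checks the computed value.
        Predicate<IntUnaryOperator> strongTest = f -> f.applyAsInt(5) == 120;

        System.out.println(isPseudoTested(weakTest,
                PseudoTestedSketch::computeFactorial,
                PseudoTestedSketch::computeFactorialStripped));   // prints true
        System.out.println(isPseudoTested(strongTest,
                PseudoTestedSketch::computeFactorial,
                PseudoTestedSketch::computeFactorialStripped));   // prints false
    }
}
```

The weak test shows why code coverage alone can mislead: the method is executed, yet no test would notice if its implementation disappeared.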

Implementation of a tool to do that: https://github.com/STAMP-project/pitest-descartes

Pseudo-tested code in popular open source
Between 6% and 53%

Advantages of pseudo-tested code detection
▪ Much faster to compute than mutations
▪ No equivalent mutant problem
▪ More valid than code coverage

Conclusions
▪ We need to make QA more efficient.
▪ Quantitative, general optimisations not possible.
▪ Many factors influencing QA economics.
▪ Empirical studies are too few.
▪ Let's work on more and better studies.
▪ Find ways to optimise where we understand what to measure.

Prof. Dr. Stefan Wagner
Institute of Software Technology
phone +49 (0) 711 685-88455
WWW www.iste.uni-stuttgart.de/se
Twitter prof_wagnerst
ORCID 0000-0002-5256-8429
Slides are available at www.stefan-wagner.biz.

Pictures used in this slide deck
Max paraboloid by IkamusumeFan under CC BY-SA 4.0 (https://commons.wikimedia.org/wiki/File:Max_paraboloid.svg)
