Too trivial to test? 
An inverse view on defect prediction to identify methods with low fault risk

Presentation of the corresponding paper, published in PeerJ Computer Science, at the German-speaking Software Engineering Conference (SE 2020) in Innsbruck, Austria.

Stefan Wagner

February 28, 2020

Transcript

  1. SE 2020, Innsbruck, Austria. Stefan Wagner: Too trivial to test? An inverse view on defect prediction to identify methods with low fault risk
  2. You can copy, share and change, film and photograph, blog, live-blog and tweet this presentation, provided that you attribute it to its author and respect the rights and licences of its parts. Based on slides by @SMEasterbrook and @ethanwhite.
  3. Motivation
      • Test resources are limited → need to prioritise testing activities
      • Defect prediction overall has little practical usage
      • Cross-project defect prediction even less
  4. Idea: Inverse Defect Prediction (IDP)
      Identify code regions with low fault risk.
      (Traditional) defect prediction distinguishes FAULTY vs. NON-FAULTY; inverse defect prediction distinguishes LOW FAULT RISK vs. OTHER.
  5. (no text content on this slide)
  6. (pipeline diagram) Data preparation: compute metrics for all methods; derive faulty methods from method fixes (merging multiple occurrences); remove faulty methods to obtain the non-faulty methods
  7. (pipeline diagram) Same data-preparation pipeline as on slide 6
  8. (pipeline diagram) Pipeline extended with: Address Data Imbalance
  9. (pipeline diagram) Pipeline extended with: Compute Association Rules
  10. Association Rule Mining
      • Identify rules in a large dataset of transactions
      • Rules describe implications between items: { antecedent } → { consequent }
      • Properties of rules: support (significance) and confidence (precision)
      (a small mining sketch follows below)
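
As an illustration of the association rule mining step on the slide above, the following sketch brute-forces rules of the form { items } → { NotFaulty } over a tiny set of method transactions. The toy transactions, item names, and thresholds are assumptions for illustration, not the data or tooling used in the paper.

```python
# Illustrative sketch: brute-force mining of rules { items } -> { NotFaulty }
# over a tiny, made-up set of method transactions.
from itertools import combinations

# Each transaction: the set of items (properties) observed for one method.
transactions = [
    {"IsSetter", "NoLoops", "NoConditions", "NotFaulty"},
    {"IsGetter", "NoLoops", "NoConditions", "NotFaulty"},
    {"NoLoops"},
    {"NoConditions", "NoStringLiterals"},
]

MIN_SUPPORT = 0.3      # minimum share of transactions containing antecedent + consequent
MIN_CONFIDENCE = 0.8   # minimum precision of the rule
consequent = "NotFaulty"

# Candidate antecedent items: everything except the consequent itself.
items = sorted(set().union(*transactions) - {consequent})

rules = []
for size in range(1, len(items) + 1):
    for antecedent in combinations(items, size):
        antecedent = frozenset(antecedent)
        containing = [t for t in transactions if antecedent <= t]
        matching = [t for t in containing if consequent in t]
        support = len(matching) / len(transactions)
        confidence = len(matching) / len(containing) if containing else 0.0
        if support >= MIN_SUPPORT and confidence >= MIN_CONFIDENCE:
            rules.append((set(antecedent), support, confidence))

for antecedent, support, confidence in rules:
    print(f"{antecedent} -> {{{consequent}}}  support={support:.0%}  confidence={confidence:.0%}")
```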
  11. Transactions
      Method           Properties (Items)                                      Status
      setX(…)          { IsSetter, NoNullChecks, NoLoops, NoConditions, … }    Not faulty
      toString()       { IsToString, NoNullChecks, NoLoops, … }                Not faulty
      computeResult()  { NoStringLiterals, NoSynchronizedBlocks, … }           Faulty
      convert(…)       { NoLoops, NoAssignments, NoArrayAccesses, … }          Faulty
      getX()           { IsGetter, NoLoops, NoConditions, NoAssignments, … }   Not faulty
      getY()           { IsGetter, NoLoops, NoConditions, NoAssignments, … }   Not faulty
      checkArgs(…)     { NoAssignments, NoSynchronizedBlocks, … }              Not faulty
  12. Rules
      { NoLoops } → { Not faulty }
      { NoLoops, NoConditions, NoAssignments } → { Not faulty }
      { IsSetter, NoConditions, NoAssignments, NoStringLiterals } → { Not faulty }
      { LessThan3Sloc, NoConditions } → { Not faulty }
      …
  13. Rule: { NoLoops } → { Not faulty }
      (same transactions as on slide 11)
      Support: 4/7 = 57.1 %
      Confidence: 4/5 = 80.0 %
      (recomputed in the sketch below)
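
The support and confidence values on the slide above can be recomputed directly from the seven example transactions. The following self-contained snippet does so for the rule { NoLoops } → { Not faulty }; it is only a plain restatement of the slide's arithmetic, not the paper's implementation.

```python
# Recompute support and confidence of { NoLoops } -> { Not faulty }
# from the seven example transactions shown on slides 11 and 13.
transactions = [
    ({"IsSetter", "NoNullChecks", "NoLoops", "NoConditions"}, "Not faulty"),   # setX(...)
    ({"IsToString", "NoNullChecks", "NoLoops"}, "Not faulty"),                 # toString()
    ({"NoStringLiterals", "NoSynchronizedBlocks"}, "Faulty"),                  # computeResult()
    ({"NoLoops", "NoAssignments", "NoArrayAccesses"}, "Faulty"),               # convert(...)
    ({"IsGetter", "NoLoops", "NoConditions", "NoAssignments"}, "Not faulty"),  # getX()
    ({"IsGetter", "NoLoops", "NoConditions", "NoAssignments"}, "Not faulty"),  # getY()
    ({"NoAssignments", "NoSynchronizedBlocks"}, "Not faulty"),                 # checkArgs(...)
]

antecedent = {"NoLoops"}

# Transactions containing the antecedent, and those that also match the consequent.
with_antecedent = [(items, status) for items, status in transactions if antecedent <= items]
matching_rule = [(items, status) for items, status in with_antecedent if status == "Not faulty"]

support = len(matching_rule) / len(transactions)        # 4/7 ≈ 57.1 %
confidence = len(matching_rule) / len(with_antecedent)  # 4/5 = 80.0 %
print(f"support = {support:.1%}, confidence = {confidence:.1%}")
```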
  14. (pipeline diagram) Pipeline extended with: Select top n rules → IDP Classifier
      (a simplified application sketch follows below)
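
The slide above adds the final step: the top n mined rules form the IDP classifier. A plausible, simplified reading of how such a rule set could be applied is sketched below; a method is flagged as «low fault risk» if its item set satisfies the antecedent of any selected rule. The example rules are taken from slide 26, but the selection and matching logic here are illustrative assumptions, not the exact procedure from the paper.

```python
# Simplified sketch of applying an IDP classifier built from top-ranked rules:
# a method counts as low fault risk if it matches any rule's antecedent.
from typing import FrozenSet, List, Set

# Antecedents of example rules (see slide 26); the consequent is always "NotFaulty".
top_rules: List[FrozenSet[str]] = [
    frozenset({"NoMethodInvocations", "SlocLessThan4", "NoArithmeticOperations"}),
    frozenset({"NoMethodInvocations", "SlocLessThan4", "NoCastExpressions"}),
]

def is_low_fault_risk(method_items: Set[str]) -> bool:
    """Return True if the method's items satisfy the antecedent of any top rule."""
    return any(rule <= method_items for rule in top_rules)

# Example: a short helper method exhibiting several "No..." properties.
example_method = {
    "NoMethodInvocations", "SlocLessThan4",
    "NoArithmeticOperations", "NoCastExpressions", "NoLoops",
}
print(is_low_fault_risk(example_method))  # True
```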
  15. Research Questions
      RQ 1: How many faults do methods classified as «low fault risk» contain?
      RQ 2: How large is the fraction of the code base consisting of methods classified as «low fault risk»?
      RQ 3: Is a trained classifier for methods with low fault risk generalizable to other projects?
  16. Research Questions (repeated from slide 15)
  17. Proportion of Low-Fault-Risk Methods (Within-Project Predictions)
      Project:        Chart  Closure  Lang  Math  Mockito  Time
      Method-based:   80 %   34 %     17 %  29 %  29 %     44 %
      SLOC-based:     78 %   25 %     5 %   14 %  11 %     16 %
  18. Research Questions (repeated from slide 15)
  19. Fault-Density Reduction Factor (Within-Project Predictions)
      factor = (proportion of low-fault-risk methods among all methods) / (proportion of faulty low-fault-risk methods among all faulty methods)
      Example: 40 % of all methods are classified as «low fault risk» and 10 % of all faulty methods are low-fault-risk methods → factor = 4
      (worked through in the sketch below)
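
The example on the slide above amounts to the following small calculation; the 40 % and 10 % are the slide's illustrative numbers, not measured project results.

```python
# Fault-density reduction factor, using the illustrative numbers from the slide.
low_risk_share_of_methods = 0.40         # low-fault-risk methods / all methods
low_risk_share_of_faulty_methods = 0.10  # faulty low-fault-risk methods / all faulty methods

factor = low_risk_share_of_methods / low_risk_share_of_faulty_methods
print(factor)  # 4.0
```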
  20. Fault-Density Reduction Factor (Within-Project Predictions), bar chart per project (a threshold line is also shown)
      Project:   Chart  Closure  Lang  Math  Mockito  Time
      Methods:   1.5    2.6      3.4   3.1   3.2      4.4
      SLOC:      4.3    7.1      7     10.9  4.4      4.5
  21. Research Questions (repeated from slide 15)
  22. Proportion of Low-Fault-Risk Methods (Cross-Project Predictions)
      Project:   Chart  Closure  Lang  Math  Mockito  Time
      Methods:   18 %   22 %     27 %  23 %  25 %     32 %
      SLOC:      7 %    7 %      9 %   8 %   8 %      11 %
  23. Fault-Density Reduction Factor (Cross-Project Predictions), bar chart per project (a threshold line is also shown)
      Project:   Chart  Closure  Lang  Math  Mockito  Time
      Methods:   1.4    5.8      6.1   1.6   4.4      3.3
      SLOC:      4.2    18.3     16.3  4     13.6     8.3
  24. Fault-Density Reduction Factor (Cross-Project Predictions), same chart as slide 23 with the within-project results overlaid for comparison (legend: Methods (cross), SLOC (cross), Methods (within), SLOC (within), Threshold)
  25. Conclusion
      • Inverse Defect Prediction can successfully identify low-fault-risk methods
      • ~20–40 % of the methods (5–20 % of SLOC) are classified as «low fault risk»
      • Applicable to cross-project predictions
  26. Rules with support and confidence
      #  Rule                                                                                            Support  Confidence
      1  { NoMethodInvocations, SlocLessThan4, NoArithmeticOperations, NoNullLiterals } → { NotFaulty }  10.43 %  96.76 %
      2  { NoMethodInvocations, SlocLessThan4, NoArithmeticOperations } → { NotFaulty }                  11.03 %  96.09 %
      3  { NoMethodInvocations, SlocLessThan4, NoCastExpressions, NoNullLiterals } → { NotFaulty }       10.43 %  95.43 %
      4  { NoMethodInvocations, SlocLessThan4, NoCastExpressions, NoInstantiations } → { NotFaulty }     10.13 %  95.31 %
      5  { NoMethodInvocations, SlocLessThan4, NoCastExpressions } → { NotFaulty }                       11.03 %  94.85 %
  27. Prof. Dr. Stefan Wagner
      E-mail: [email protected]
      Phone: +49 (0) 711 685-88455
      Web: www.iste.uni-stuttgart.de/se
      Twitter: prof_wagnerst
      ORCID: 0000-0002-5256-8429
      Institute of Software Technology
      Slides are available at www.stefan-wagner.biz.
      Joint work with Rainer Niedermayr and Tobias Röhm
  28. Pictures used in this slide deck: "Trivial Pursuit" by Bram Cymet, published under CC BY-NC 2.0 (https://flic.kr/p/67AFso)