Using Evidential Reasoning to Make Qualified Predictions of Software Quality

by Neil Walkinshaw

More Decks by PROMISE'13: The 9th International Conference on Predictive Models in Software Engineering

Transcript

  1. Using Evidential Reasoning to Make Qualified Predictions of Software Quality

    Neil Walkinshaw, Department of Computer Science, University of Leicester (le.ac.uk)
  2. Motivation

    Outline: The Problem, Evidential Reasoning, Example, Conclusions

    Quality assessment is rigorous . . .
    • Lots of sources of information about program quality
      • Formal verification, testing, code reviews, design inspection, . . .
    • Lots of metrics
      • Test coverage, complexity, cohesion, churn, . . .
    • Lots of notions / formalised models of software quality
      • Standards, certifications, quality models, . . .
  3. Motivation

    Quality assessment is rigorous . . .
    • Lots of sources of information about program quality
      • Formal verification, testing, code reviews, design inspection, . . .
    • Lots of metrics
      • Test coverage, complexity, cohesion, churn, . . .
    • Lots of notions / formalised models of software quality
      • Standards, certifications, quality models, . . .
    . . . but unscientific
    • Path from evidence to conclusions is ad-hoc
    • Evidence is often ambiguous or partial
  4. What is quality?

    • Notoriously difficult to characterise
    • Commonly decomposed into a tree of (diverse) sub-factors
    • Lots of models
      • Several industry standards (e.g. ISO/IEC 9126), lots of models for specific domains
      • Wagner's interviews of 25 practitioners yielded 31 different quality models.
  5. How is Quality Assessed?

    Ad-hoc
    • Use metrics etc. to make an intuitive assessment of the quality
    • Ensure minimal criteria
      • E.g. ensure there is a minimum of 80% code coverage in the test sets.
      • No methods have a cyclomatic complexity > 10.
      • . . .
    Problems
    • Unsystematic
    • Does not quantify how good or bad the system is
    • Subjective, lots of implicit presumptions.
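The minimal-criteria style of assessment can be sketched in a few lines of Python. This is an illustration only: the `Method` type, the function name and the sample values are hypothetical, and only the two thresholds (80% coverage, cyclomatic complexity of at most 10) come from the slide.

```python
from dataclasses import dataclass


@dataclass
class Method:
    """Hypothetical record for one method and its cyclomatic complexity."""
    name: str
    cyclomatic_complexity: int


def meets_minimal_criteria(code_coverage, methods,
                           min_coverage=0.80, max_complexity=10):
    """Return True when coverage reaches the minimum and no method is too complex."""
    coverage_ok = code_coverage >= min_coverage
    complexity_ok = all(m.cyclomatic_complexity <= max_complexity for m in methods)
    return coverage_ok and complexity_ok


# Invented sample systems: one barely passes, one passes easily, one just fails.
barely = meets_minimal_criteria(0.80, [Method("parse", 10)])
comfortably = meets_minimal_criteria(0.99, [Method("parse", 2)])
failing = meets_minimal_criteria(0.79, [Method("parse", 2)])
```

The first two systems receive the same verdict although one only just meets the thresholds, which is the sense in which such checks do not quantify how good or bad the system is.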
  6. How is Quality Assessed?

    Bayesian Belief Networks (Wagner, 2009)
    [Figure: Bayesian belief network with layers labelled Quality Model, Indicators and Software artifacts. Node labels: Test quality, Test oracles, Test adequacy, Soundness, Completeness; Metric A to Metric E (each low/medium/high); Execution traces, Requirements, Oracles, Source code, Mutation analysis.]
  7. How is Quality Assessed?

    Conditional Probability Table

    Mutation score      low                     high
    Branch coverage     low     med     high    low     med     high
    Adequacy low        0.9     0.5     0.3     0.7     0.4     0.1
    Adequacy medium     0.07    0.4     0.6     0.2     0.4     0.5
    Adequacy high       0.03    0.1     0.3     0.1     0.2     0.4

    • Table provides complete mapping from inputs to probabilistic outputs
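A conditional probability table of this kind is a lookup from parent states to a distribution over the child. The Python sketch below encodes the values as they appear in the transcript; the dictionary layout and function names are assumptions made for illustration, not part of Wagner's model or of any BBN library. It also checks that each column is a probability distribution. As transcribed, the column for low mutation score and high branch coverage sums to 1.2, which may be an error on the slide or in the extraction.

```python
# Keys are (mutation_score, branch_coverage); values are the distribution
# over adequacy, copied from the table as transcribed.
CPT = {
    ("low", "low"):   {"low": 0.9, "medium": 0.07, "high": 0.03},
    ("low", "med"):   {"low": 0.5, "medium": 0.4, "high": 0.1},
    ("low", "high"):  {"low": 0.3, "medium": 0.6, "high": 0.3},
    ("high", "low"):  {"low": 0.7, "medium": 0.2, "high": 0.1},
    ("high", "med"):  {"low": 0.4, "medium": 0.4, "high": 0.2},
    ("high", "high"): {"low": 0.1, "medium": 0.5, "high": 0.4},
}


def adequacy_distribution(mutation_score, branch_coverage):
    """Look up P(adequacy | mutation score, branch coverage)."""
    return CPT[(mutation_score, branch_coverage)]


def columns_not_summing_to_one(cpt, tol=1e-9):
    """Return the parent-state combinations whose probabilities do not sum to 1."""
    return [key for key, dist in cpt.items()
            if abs(sum(dist.values()) - 1.0) > tol]


suspect_columns = columns_not_summing_to_one(CPT)
```

Every one of the six columns has to be filled in by hand, which is the effort that the limitations on the next slide refer to.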
  8. How is Quality Assessed?

    Several limitations
    • Effort
      • Generating probability tables - where do the probabilities come from?
    • Emphasis on metrics
      • Quality cannot be reduced to metrics
      • Human intuition is a vital part of quality assessment
    • Ignorance, uncertainty, and doubt
      • Cannot be captured in probability distributions
      • To be trustworthy, quality assessment must expose and highlight any uncertainty or ignorance.
  9. Evidential Reasoning: Accounting for doubt and ignorance

    • Proposed by Yang et al. in Operations Research
    • Models cause-effect relations between different factors (as with BBN)
    • Based on Dempster-Shafer theory
      • Dempster's Mathematical theory of evidence (1968)
      • Generalises probability to enable expression of ignorance / doubt
    • Does not require conditional probability tables
    • Does not take single values (e.g. metrics) as inputs
      • Inputs are `belief functions', capturing developer's subjective doubt and ignorance
      • More suited for modelling human opinion, with its inherent uncertainty
  10. Evidential Reasoning

    [Figure: quality model for test quality. Node labels: Test quality, Test oracles, Test adequacy, Soundness, Completeness. Evidence sources: Execution traces, Requirements, Oracles, Source code, Mutation analysis. Assessments come from Developers.]
  11. Evidential Reasoning

    [Figure: the test-quality model of the previous slide (Test quality, Test oracles, Test adequacy, Soundness, Completeness; evidence from Execution traces, Requirements, Oracles, Source code, Mutation analysis), with developer-supplied charts of Quality against Confidence attached to three nodes.]
  12. Evidential Reasoning

    [Figure: the test-quality model of the previous slide (Test quality, Test oracles, Test adequacy, Soundness, Completeness; evidence from Execution traces, Requirements, Oracles, Source code, Mutation analysis), with developer-supplied charts of Quality against Confidence on three nodes and a fourth chart that also shows Uncertainty.]
  13. Belief Functions: Capturing confidence and ignorance

    • Apportion our belief mass to different quality levels
    • Any unallocated mass indicates ignorance
    • Sum of belief masses must be ≤ 1
    Example - Test Adequacy
    • All we know is that:
      • Average branch coverage is 80%.
      • Program involves lots of statistical routines.
      • Code coverage is a poor indicator of adequacy, especially for data-intensive computations.
      • Uncertain if remaining 20% of branches are infeasible.
    • Whatever we decide, we're only 50% confident that it will be accurate.
  14. Belief Functions: Example - Test Adequacy

    [Figure: bar chart of belief mass (axis 0 to 0.5) per quality grade (awful, poor, indifferent, good, excellent). Bar values as extracted: 0, 0.05, 0.05, 0.3, 0.1. Unassigned mass (marked "?"): 0.5.]
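A belief function of this kind can be sketched in Python as a mapping from grades to masses, with ignorance computed as whatever is left unallocated. The function name is hypothetical, and the pairing of the extracted bar values with the grades (in the order awful to excellent) is an assumption, since the chart itself did not survive extraction.

```python
GRADES = ("awful", "poor", "indifferent", "good", "excellent")


def ignorance(belief, tol=1e-9):
    """Return the unallocated belief mass; reject invalid belief functions."""
    unknown_grades = set(belief) - set(GRADES)
    if unknown_grades:
        raise ValueError(f"unknown grades: {sorted(unknown_grades)}")
    if any(mass < 0 for mass in belief.values()):
        raise ValueError("belief masses must be non-negative")
    total = sum(belief.values())
    if total > 1.0 + tol:
        raise ValueError("belief masses must sum to at most 1")
    return max(0.0, 1.0 - total)


# Assumed grade-to-value pairing for the test adequacy example.
test_adequacy = {"awful": 0.0, "poor": 0.05, "indifferent": 0.05,
                 "good": 0.3, "excellent": 0.1}
unallocated = ignorance(test_adequacy)
```

Under that pairing the allocated masses total 0.5, leaving 0.5 unallocated, which matches the assessor being only 50% confident.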
  15. Evidential Reasoning: Aggregating Belief Functions

    • Given a quality model (a weighted tree of factors)
    • Developer provides belief functions for the leaf nodes
    • Combine belief functions and propagate belief mass to produce an aggregate belief function
    [Figure: weighted tree with node labels Test quality, Test oracles, Test adequacy, Soundness, Completeness. Edge weights as extracted: 0.5, 0.5, 0.7, 0.3. Three nodes carry charts of Quality against Confidence; a fourth chart also shows Uncertainty.]
  16. Evidential Reasoning: Aggregating Belief Functions

    • Given a quality model (a weighted tree of factors)
    • Developer provides belief functions for the leaf nodes
    • Combine belief functions and propagate belief mass to produce an aggregate belief function
    • Combination must maintain following properties:
      • Must not be assessed to a given grade if there is no supporting evidence.
      • Should be precisely assessed to a given grade if all of the evidence supports this level.
      • If all attributes are completely assessed (no doubt), the aggregate assessment should be complete too.
      • If there is any incompleteness (ignorance or doubt), then this should be reflected in the aggregate assessment.
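The slides do not spell out the combination rule. The Python sketch below follows the recursive evidential reasoning combination as commonly presented in the literature by Yang and colleagues; it is an assumption that this matches what ERTool does, and the function name and data layout are hypothetical. The unassigned mass of each factor is split into a part due to its weight (`bar`) and a part due to the assessor's ignorance (`tilde`), and factors are folded in one at a time with a normalisation for conflicting grades.

```python
def aggregate(children, n_grades):
    """Combine (weight, beliefs) pairs of sibling factors into one belief function.

    Returns (beliefs, ignorance). Sibling weights are assumed to sum to 1 and
    at least one weight must be positive.
    """
    m = [0.0] * n_grades      # mass assigned to each grade so far
    bar, tilde = 1.0, 0.0     # unassigned mass: due to weights / due to ignorance
    for weight, beliefs in children:
        m_i = [weight * b for b in beliefs]
        bar_i = 1.0 - weight
        tilde_i = weight * (1.0 - sum(beliefs))
        h, h_i = bar + tilde, bar_i + tilde_i
        # Mass that the two sides assign to different grades is conflict.
        conflict = sum(m[t] * m_i[j]
                       for t in range(n_grades)
                       for j in range(n_grades) if t != j)
        k = 1.0 / (1.0 - conflict)
        m = [k * (m[n] * m_i[n] + m[n] * h_i + h * m_i[n])
             for n in range(n_grades)]
        tilde = k * (tilde * tilde_i + bar * tilde_i + tilde * bar_i)
        bar = k * bar * bar_i
    scale = 1.0 - bar         # remove the part that only reflects the weights
    return [x / scale for x in m], tilde / scale


# Leaf values from the ERTool input slide: both coverage factors have no data.
adequacy, adequacy_doubt = aggregate(
    [(0.5, [0, 0, 0, 0, 0]), (0.5, [0, 0, 0, 0, 0])], 5)
quality_assurance, qa_doubt = aggregate(
    [(0.6, [0.25, 0.4, 0.25, 0, 0]), (0.4, adequacy)], 5)
```

In this sketch a grade with no supporting evidence keeps zero mass, unanimous and complete evidence yields a complete assessment, and any ignorance in a child remains visible in the parent, which are the four properties listed above. The numbers it produces are not claimed to reproduce ERTool's output.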
  17. Assessing the NASA CM1 Software

    What do we know?
    • CM1 is a NASA spacecraft instrument written in C.
    • Metrics data from the PROMISE repository (Shepperd et al.'s cleaned up version)
      • LOC, Halstead metrics, Cyclomatic complexity, Multiple-condition count, percentage of comments
    Lots of ignorance here . . .
    • No access to source code
    • Unfamiliar with domain
    • Unfamiliar with development procedures
    • Vulnerable metrics
  18. 1. Choose a quality model

    Maintainability
    • Analysability (0.4)
      • Size (0.4)
      • Complexity (0.6)
    • Quality Assurance (0.4)
      • Testability (0.6)
      • Adequacy (0.4)
        • Code Coverage (0.5)
        • Data Coverage (0.5)
    • Implementation quality (0.2)
      • Comment Quality (1)
  19. 2. Produce belief functions for lowest-level factors

    Size
    [Figure: plot of LOC_EXECUTABLE and HALSTEAD_LENGTH, and a bar chart of belief mass per assessment grade (awful, poor, indifferent, good, excellent).]
    Doubt / ignorance = 0.05
  20. 2. Produce belief functions for lowest-level factors

    Complexity
    [Figure: plot of CYCLOMATIC_COMPLEXITY and HALSTEAD_DIFFICULTY, and a bar chart of belief mass per assessment grade (awful, poor, indifferent, good, excellent).]
    Doubt / ignorance = 0.1
  21. 2. Produce belief functions for lowest-level factors

    Testability
    [Figure: plot of PARAMETER_COUNT and MULTIPLE_CONDITION_COUNT, and a bar chart of belief mass per assessment grade (awful, poor, indifferent, good, excellent).]
    Doubt / ignorance = 0.1
  22. 2. Produce belief functions for lowest-level factors

    Code coverage / data coverage
    NO DATA
    [Figure: bar chart of belief mass per assessment grade (awful, poor, indifferent, good, excellent).]
    Doubt / ignorance = 1
  23. 2. Produce belief functions for lowest-level factors

    Comment quality
    [Figure: plot of PERCENT_COMMENTS (other axis labelled count), and a bar chart of belief mass per assessment grade (awful, poor, indifferent, good, excellent).]
    Doubt / ignorance = 0.2
  24. 3. Aggregate the Belief functions

    [Figure: the weighted quality model chosen in step 1 (root: Maintainability), annotated with bar charts of belief mass per assessment grade (awful, poor, indifferent, good, excellent) for its factors.]
  25. ERTool

    Basic implementation
    • https://bitbucket.org/nwalkinshaw/ertool
    • Input:

      ((comment_quality:1)implementation_quality:0.2,
      (testability:0.6,(code_coverage:0.5,data_coverage:0.5)adequacy:0.4)quality_assurance:0.4,
      (complexity:0.6,size:0.4)analysability:0.4)maintainability;
      5
      comment_quality 0.05 0.05 0.1 0.3 0.3
      code_coverage 0 0 0 0 0
      data_coverage 0 0 0 0 0
      testability 0.25 0.4 0.25 0 0
      complexity 0.2 0.4 0.3 0 0
      size 0 0 0.35 0.5 0.1
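The tree part of this input resembles Newick notation, with factor weights where branch lengths would normally go. The Python sketch below reads the input exactly as shown on the slide; the parser and helper names are hypothetical and are not taken from ERTool. It checks that the child weights under every factor sum to 1 and derives each leaf's doubt as the unallocated belief mass.

```python
import re

# Tree and belief lines copied from the slide.
TREE = ("((comment_quality:1)implementation_quality:0.2,"
        "(testability:0.6,(code_coverage:0.5,data_coverage:0.5)adequacy:0.4)"
        "quality_assurance:0.4,(complexity:0.6,size:0.4)analysability:0.4)"
        "maintainability;")
BELIEFS = """\
comment_quality 0.05 0.05 0.1 0.3 0.3
code_coverage 0 0 0 0 0
data_coverage 0 0 0 0 0
testability 0.25 0.4 0.25 0 0
complexity 0.2 0.4 0.3 0 0
size 0 0 0.35 0.5 0.1
"""


def parse_tree(text):
    """Parse '(child,child)name:weight' notation into nested dicts."""
    tokens = re.findall(r"[(),;]|[^(),;\s]+", text)
    pos = 0

    def node():
        nonlocal pos
        children = []
        if tokens[pos] == "(":
            pos += 1
            children.append(node())
            while tokens[pos] == ",":
                pos += 1
                children.append(node())
            if tokens[pos] != ")":
                raise ValueError("expected ')'")
            pos += 1
        name, _, weight = tokens[pos].partition(":")
        pos += 1
        return {"name": name,
                "weight": float(weight) if weight else None,
                "children": children}

    return node()


def weights_balanced(tree, tol=1e-9):
    """True when the child weights under every internal factor sum to 1."""
    kids = tree["children"]
    if not kids:
        return True
    return (abs(sum(k["weight"] for k in kids) - 1.0) <= tol
            and all(weights_balanced(k, tol) for k in kids))


def leaf_ignorance(beliefs_text):
    """Map each leaf factor to its unallocated belief mass."""
    result = {}
    for line in beliefs_text.splitlines():
        name, *masses = line.split()
        result[name] = 1.0 - sum(float(m) for m in masses)
    return result


tree = parse_tree(TREE)
doubt = leaf_ignorance(BELIEFS)
```

The derived doubt values (0.2 for comment quality, 1 for both coverage factors, 0.1 for testability and complexity, 0.05 for size) agree with the figures stated on the belief-function slides.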
  26. Notable Features

    • Traceability
      • Reasons for assessments can be traced down to specific factors.
    • Doubt and Ignorance
      • Explicit throughout.
      • A better GUI could highlight this at every node in the tree.
      • Even allows for complete ignorance.
    • Requires a relatively small amount of input
      • A quality model of choice
      • Belief functions for the lowest-level factors
  27. Conclusions and Ongoing Work

    Conclusions
    • Evidential Reasoning presents a plausible basis for reasoning about software quality
    • Accommodates ignorance and doubt - intrinsic human factors
    • Openly available implementation
    Ongoing Work
    • Evaluation
      • Research question: Is ER applicable in an industrial context?
        • Is use of belief functions realistic?
        • Are assessors liable to admit ignorance and doubt?
        • . . .
      • Will carry out a more detailed case study.