Using Evidential Reasoning to Make Qualified Predictions of Software Quality

Neil Walkinshaw, Department of Computer Science, University of Leicester (le.ac.uk)


The Problem Evidential Reasoning Example Conclusions
Motivation
Quality assessment is rigorous ...
• Lots of sources of information about program quality
  • Formal verification, testing, code reviews, design inspection, ...
• Lots of metrics
  • Test coverage, complexity, cohesion, churn, ...
• Lots of notions / formalised models of software quality
  • Standards, certifications, quality models, ...
... but unscientific
• The path from evidence to conclusions is ad hoc
• Evidence is often ambiguous or partial

What is quality?
• Notoriously difficult to characterise
• Commonly decomposed into a tree of (diverse) sub-factors
• Lots of models
  • Several industry standards (e.g. ISO/IEC 9126), lots of models for specific domains
  • Wagner's interviews of 25 practitioners yielded 31 different quality models.

How is Quality Assessed?
Ad hoc
• Use metrics etc. to make an intuitive assessment of the quality
• Ensure minimal criteria
  • E.g. ensure there is a minimum of 80% code coverage in the test sets.
  • No methods have a cyclomatic complexity > 10.
  • ...
Problems
• Unsystematic
• Does not quantify how good or bad the system is
• Subjective, lots of implicit presumptions.
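The minimal-criteria style of assessment can be sketched as a simple gate. The two thresholds come from the slide; the `MethodMetrics` type, the function name and the sample values are hypothetical and only there for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass

# Thresholds quoted on the slide; everything else is a hypothetical illustration.
MIN_COVERAGE = 0.80
MAX_CYCLOMATIC = 10


@dataclass
class MethodMetrics:
    name: str
    cyclomatic_complexity: int


def meets_minimal_criteria(coverage: float, methods: list[MethodMetrics]) -> bool:
    """Return True only if every minimal criterion holds."""
    coverage_ok = coverage >= MIN_COVERAGE
    complexity_ok = all(m.cyclomatic_complexity <= MAX_CYCLOMATIC for m in methods)
    return coverage_ok and complexity_ok


# Two very different systems receive the same verdict.
barely_passing = meets_minimal_criteria(0.80, [MethodMetrics("parse", 10)])
comfortably_passing = meets_minimal_criteria(0.99, [MethodMetrics("parse", 2)])
failing = meets_minimal_criteria(0.79, [MethodMetrics("parse", 2)])
```

The gate returns the same answer for a system that scrapes past both thresholds and for one that clears them easily, which is the sense in which it does not quantify how good or bad the system is.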

Bayesian Belief Networks (Wagner, 2009)
[Figure: Bayesian belief network for test quality, in three layers. Quality model: Test quality, Test oracles, Test adequacy, Soundness, Completeness. Indicators: Metrics A to E, each low/medium/high. Software artifacts: Execution traces, Requirements, Oracles, Source code, Mutation analysis.]

Conditional Probability Table

Mutation score      low                   high
Branch coverage     low    med    high    low    med    high
Adequacy low        0.9    0.5    0.3     0.7    0.4    0.1
Adequacy medium     0.07   0.4    0.6     0.2    0.4    0.5
Adequacy high       0.03   0.1    0.3     0.1    0.2    0.4

• Table provides complete mapping from inputs to probabilistic outputs
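As a rough illustration of how such a table is used, the sketch below stores the slide's values as a lookup from (mutation score, branch coverage) to a distribution over adequacy. The dictionary layout and the function names are assumptions made for this example, not part of any BBN library.

```python
GRADES = ("low", "medium", "high")

# P(adequacy | mutation score, branch coverage), values copied from the slide.
# Each tuple is ordered (low, medium, high) adequacy.
CPT = {
    ("low", "low"): (0.9, 0.07, 0.03),
    ("low", "med"): (0.5, 0.4, 0.1),
    ("low", "high"): (0.3, 0.6, 0.3),
    ("high", "low"): (0.7, 0.2, 0.1),
    ("high", "med"): (0.4, 0.4, 0.2),
    ("high", "high"): (0.1, 0.5, 0.4),
}


def adequacy_distribution(mutation_score, branch_coverage):
    """Look up the adequacy distribution for one combination of inputs."""
    return dict(zip(GRADES, CPT[(mutation_score, branch_coverage)]))


def unnormalised_columns(cpt, tol=1e-9):
    """List the input combinations whose probabilities do not sum to 1."""
    return [key for key, probs in cpt.items() if abs(sum(probs) - 1.0) > tol]


dist = adequacy_distribution("high", "med")
suspect = unnormalised_columns(CPT)
```

With the values as they appear on the slide, the check flags the column for low mutation score and high branch coverage, whose entries sum to 1.2. Every cell has to be supplied by hand, so a normalisation check of this kind is a cheap safeguard.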

Several limitations
• Effort
  • Generating probability tables: where do the probabilities come from?
• Emphasis on metrics
  • Quality cannot be reduced to metrics
  • Human intuition is a vital part of quality assessment
• Ignorance, uncertainty, and doubt
  • Cannot be captured in probability distributions
  • To be trustworthy, quality assessment must expose and highlight any uncertainty or ignorance.

Evidential Reasoning
Accounting for doubt and ignorance
• Proposed by Yang et al. in Operations Research
• Models cause-effect relations between different factors (as with BBN)
• Based on Dempster-Shafer theory
  • Dempster's mathematical theory of evidence (1968)
  • Generalises probability to enable expression of ignorance / doubt
• Does not require conditional probability tables
• Does not take single values (e.g. metrics) as inputs
  • Inputs are 'belief functions', capturing the developer's subjective doubt and ignorance
  • More suited for modelling human opinion, with its inherent uncertainty

Evidential Reasoning
[Figure: evidential reasoning model for Test quality. Factors: Test quality, Test oracles, Test adequacy, Soundness, Completeness. Software artifacts: Execution traces, Requirements, Oracles, Source code, Mutation analysis. Developers supply a quality and confidence assessment for each lower-level factor; the aggregate assessment carries quality, confidence and uncertainty.]

Belief Functions
Capturing confidence and ignorance
• Apportion our belief mass to different quality levels
• Any unallocated mass indicates ignorance
• Sum of belief masses must be ≤ 1
Example - Test Adequacy
• All we know is that:
  • Average branch coverage is 80%.
  • Program involves lots of statistical routines.
  • Code coverage is a poor indicator of adequacy, especially for data-intensive computations.
  • Uncertain if remaining 20% of branches are infeasible.
• Whatever we decide, we're only 50% confident that it will be accurate.

Belief Functions
Example - Test Adequacy
[Figure: bar chart of belief mass (axis 0 to 0.5) per quality grade. Bar values as listed: 0, 0.05, 0.05, 0.3, 0.1 for awful, poor, indifferent, good, excellent; unassigned mass (?) = 0.5.]
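A belief function of this kind can be sketched as a list of masses, one per grade, with ignorance derived as whatever mass is left over. The function name `make_belief` is hypothetical, and the assignment of the slide's five values to the grades awful through excellent, in that order, is an assumption about how the chart was laid out.

```python
GRADES = ("awful", "poor", "indifferent", "good", "excellent")


def make_belief(masses):
    """Validate a belief function; return (mass per grade, unassigned mass)."""
    if len(masses) != len(GRADES):
        raise ValueError("expected one mass per grade")
    if any(m < 0 for m in masses):
        raise ValueError("belief masses must be non-negative")
    total = sum(masses)
    if total > 1.0 + 1e-9:
        raise ValueError("belief masses must sum to at most 1")
    # Whatever is not allocated to a grade is treated as ignorance.
    return dict(zip(GRADES, masses)), max(0.0, 1.0 - total)


# Test adequacy example: grade order is assumed, values are from the slide.
test_adequacy, ignorance = make_belief([0.0, 0.05, 0.05, 0.3, 0.1])
```

Here `ignorance` comes out at about 0.5, matching the statement that the assessor is only 50% confident in whatever is decided.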

Evidential reasoning
Aggregating Belief Functions
• Given a quality model (a weighted tree of factors)
• Developer provides belief functions for the leaf nodes
• Combine belief functions and propagate belief mass to produce an aggregate belief function
[Figure: weighted tree for Test quality with nodes Test oracles, Test adequacy, Soundness and Completeness; the edge weights shown are 0.5, 0.5, 0.7 and 0.3. Lower-level nodes carry quality and confidence; the root carries quality, confidence and uncertainty.]
• Combination must maintain the following properties:
  • A factor must not be assessed to a given grade if there is no supporting evidence.
  • It should be precisely assessed to a given grade if all of the evidence supports this level.
  • If all attributes are completely assessed (no doubt), the aggregate assessment should be complete too.
  • If there is any incompleteness (ignorance or doubt), then this should be reflected in the aggregate assessment.
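The combination step can be sketched with the recursive ER rule as it is usually presented in Yang and Xu's formulation. The slides do not say which variant ERTool implements, so this should be read as an illustration under that assumption; the function name `combine` and its argument layout are hypothetical.

```python
def combine(beliefs, weights):
    """Aggregate sibling belief functions with the recursive ER rule.

    beliefs: one list of per-grade belief masses per sibling (each sums to <= 1).
    weights: relative importance of each sibling, summing to 1.
    Returns (aggregate per-grade beliefs, aggregate ignorance).
    """
    n_grades = len(beliefs[0])
    # Basic probability masses of the first sibling.
    m = [weights[0] * b for b in beliefs[0]]
    m_bar = 1.0 - weights[0]                        # unassigned because weight < 1
    m_tilde = weights[0] * (1.0 - sum(beliefs[0]))  # unassigned because of ignorance
    for w, belief in zip(weights[1:], beliefs[1:]):
        n = [w * b for b in belief]
        n_bar = 1.0 - w
        n_tilde = w * (1.0 - sum(belief))
        m_h, n_h = m_bar + m_tilde, n_bar + n_tilde
        # Conflict: mass the two sources assign to different grades.
        conflict = sum(m[t] * n[j]
                       for t in range(n_grades)
                       for j in range(n_grades) if t != j)
        k = 1.0 / (1.0 - conflict)
        m = [k * (m[i] * n[i] + m_h * n[i] + m[i] * n_h) for i in range(n_grades)]
        m_tilde = k * (m_tilde * n_tilde + m_bar * n_tilde + m_tilde * n_bar)
        m_bar = k * (m_bar * n_bar)
    # Redistribute the mass that was only unassigned because of the weights.
    scale = 1.0 - m_bar
    return [x / scale for x in m], m_tilde / scale


all_good, all_good_ignorance = combine([[0, 0, 0, 1, 0], [0, 0, 0, 1, 0]], [0.5, 0.5])
mixed, mixed_ignorance = combine([[0, 0, 0.5, 0.5, 0], [0, 0, 0, 1, 0]], [0.5, 0.5])
unknown, unknown_ignorance = combine([[0] * 5, [0] * 5], [0.5, 0.5])
partial, partial_ignorance = combine([[0, 0, 0, 1, 0], [0] * 5], [0.5, 0.5])
```

The four calls exercise the properties listed above: unanimous evidence yields full belief in that grade, complete inputs yield a complete aggregate, grades with no support stay at zero, and any ignorance in an input shows up in the result.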

Assessing the NASA CM1 Software
What do we know?
• CM1 is a NASA spacecraft instrument written in C.
• Metrics data from the PROMISE repository (Shepperd et al.'s cleaned-up version)
  • LOC, Halstead metrics, cyclomatic complexity, multiple-condition count, percentage of comments
Lots of ignorance here ...
• No access to source code
• Unfamiliar with domain
• Unfamiliar with development procedures
• Vulnerable metrics

1. Choose a quality model
[Figure: weighted quality model tree]
Maintainability
  Analysability (0.4)
    Size (0.4)
    Complexity (0.6)
  Quality Assurance (0.4)
    Testability (0.6)
    Adequacy (0.4)
      Code Coverage (0.5)
      Data Coverage (0.5)
  Implementation quality (0.2)
    Comment Quality (1)
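One way to hold such a model in code is a nested mapping from factor name to a weight and its children. This is a sketch only: the weights for Code Coverage, Data Coverage and Comment Quality are read from the ERTool input shown later in the deck, and the data layout and helper names are hypothetical.

```python
# factor name -> (weight relative to siblings, children)
MODEL = {
    "maintainability": (1.0, {
        "analysability": (0.4, {
            "size": (0.4, {}),
            "complexity": (0.6, {}),
        }),
        "quality_assurance": (0.4, {
            "testability": (0.6, {}),
            "adequacy": (0.4, {
                "code_coverage": (0.5, {}),
                "data_coverage": (0.5, {}),
            }),
        }),
        "implementation_quality": (0.2, {
            "comment_quality": (1.0, {}),
        }),
    }),
}


def check_weights(children, tol=1e-9):
    """Return True if every group of siblings has weights summing to 1."""
    if not children:
        return True
    total = sum(weight for weight, _ in children.values())
    return abs(total - 1.0) <= tol and all(
        check_weights(sub) for _, sub in children.values())


def leaves(children):
    """List the lowest-level factors, which need belief functions."""
    found = []
    for name, (_, sub) in children.items():
        found.extend(leaves(sub) if sub else [name])
    return found


weights_ok = check_weights(MODEL)
leaf_factors = leaves(MODEL)
```

`leaf_factors` lists the six factors for which the assessor has to supply a belief function in the next step.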

2. Produce belief functions for lowest-level factors

Size
[Figure: plot of LOC_EXECUTABLE and HALSTEAD_LENGTH; bar chart of belief mass per assessment grade (awful, poor, indifferent, good, excellent).]
Doubt / ignorance = 0.05

Complexity
[Figure: plot of CYCLOMATIC_COMPLEXITY and HALSTEAD_DIFFICULTY; bar chart of belief mass per assessment grade.]
Doubt / ignorance = 0.1

Testability
[Figure: plot of PARAMETER_COUNT and MULTIPLE_CONDITION_COUNT; bar chart of belief mass per assessment grade.]
Doubt / ignorance = 0.1

Code coverage / data coverage
NO DATA
[Figure: bar chart of belief mass per assessment grade.]
Doubt / ignorance = 1

Comment quality
[Figure: plot of PERCENT_COMMENTS against count; bar chart of belief mass per assessment grade.]
Doubt / ignorance = 0.2

3. Aggregate the belief functions
[Figure: the quality model tree from step 1 (Maintainability and its weighted sub-factors), with a bar chart of belief mass per assessment grade (awful, poor, indifferent, good, excellent) attached to each lowest-level factor.]

ERTool
Basic implementation
• https://bitbucket.org/nwalkinshaw/ertool
• Input:

    ((comment_quality:1)implementation_quality:0.2,
    (testability:0.6,(code_coverage:0.5, data_coverage:0.5)adequacy:0.4)
    quality_assurance:0.4,(complexity:0.6,size:0.4)
    analysability:0.4)maintainability;
    5
    comment_quality 0.05 0.05 0.1 0.3 0.3
    code_coverage 0 0 0 0 0
    data_coverage 0 0 0 0 0
    testability 0.25 0.4 0.25 0 0
    complexity 0.2 0.4 0.3 0 0
    size 0 0 0.35 0.5 0.1
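The belief rows of this input can be read back with a few lines of code to recover the doubt/ignorance figure for each factor. This is a sketch, not ERTool's own parser: it assumes that the `5` is the number of grades and that each row lists the masses for awful through excellent in that order. The names `parse_beliefs` and `BELIEF_ROWS` are hypothetical.

```python
BELIEF_ROWS = """\
comment_quality 0.05 0.05 0.1 0.3 0.3
code_coverage 0 0 0 0 0
data_coverage 0 0 0 0 0
testability 0.25 0.4 0.25 0 0
complexity 0.2 0.4 0.3 0 0
size 0 0 0.35 0.5 0.1
"""


def parse_beliefs(text, n_grades=5):
    """Parse 'name m1 ... mN' rows into {name: [m1, ..., mN]}."""
    beliefs = {}
    for line in text.strip().splitlines():
        name, *values = line.split()
        masses = [float(v) for v in values]
        if len(masses) != n_grades:
            raise ValueError(f"{name}: expected {n_grades} masses")
        beliefs[name] = masses
    return beliefs


beliefs = parse_beliefs(BELIEF_ROWS)
# Ignorance is the belief mass left unallocated in each row.
ignorance = {name: round(1.0 - sum(m), 10) for name, m in beliefs.items()}
```

Under those assumptions the derived values agree with the doubt/ignorance figures quoted in step 2: 0.2 for comment quality, 1 for code and data coverage, 0.1 for testability and complexity, and 0.05 for size.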

ERTool Output

Notable Features
• Traceability
  • Reasons for assessments can be traced down to specific factors.
• Doubt and ignorance
  • Explicit throughout.
  • A better GUI could highlight this at every node in the tree.
  • Even allows for complete ignorance.
• Requires a relatively small amount of input
  • A quality model of choice
  • Belief functions for the lowest-level factors

Conclusions and Ongoing Work
Conclusions
• Evidential Reasoning presents a plausible basis for reasoning about software quality
• Accommodates ignorance and doubt, which are intrinsic human factors
• Openly available implementation
Ongoing Work
• Evaluation
  • Research question: Is ER applicable in an industrial context?
  • Is the use of belief functions realistic?
  • Are assessors liable to admit ignorance and doubt?
  • ...
• Will carry out a more detailed case study.


PROMISE'13: The 9th International Conference on Predictive Models in Software Engineering
