Paloma Oliveira
Marco Túlio Valente
Fernando P. Lima
Extracting Relative Thresholds for Source Code Metrics
APPLIED SOFTWARE ENGINEERING
RESEARCH GROUP
Motivation
Source code metrics: cohesion, complexity, size, coupling
Metrics are rarely used to measure internal quality
It is essential to establish credible thresholds.
CSMR-WCRE-2014
Our idea: Relative Thresholds
Thresholds followed by most code entities
Example:
We accept a tail (100% - p%): NOA = 20%
Relative Thresholds
Format: p% of the entities should have M ≤ k
M: a source code metric
p: minimal % of entities in each system
k: upper limit
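The three parts of a relative threshold can be made concrete with a small check. This is a minimal sketch, assuming a system "follows" a threshold when at least p% of its entities have a metric value of at most k. The function name `follows_threshold` and the NOA values are hypothetical, chosen only for illustration.

```python
from typing import Sequence


def follows_threshold(values: Sequence[float], p: float, k: float) -> bool:
    """Return True if at least p% of the entities have a metric value <= k."""
    if not values:
        raise ValueError("a system needs at least one entity")
    within = sum(1 for v in values if v <= k)
    return 100.0 * within / len(values) >= p


# Hypothetical NOA values for the ten classes of one system.
noa = [1, 2, 2, 3, 4, 5, 6, 8, 12, 30]

print(follows_threshold(noa, p=80, k=8))  # True: 8 of 10 classes have NOA <= 8
print(follows_threshold(noa, p=90, k=8))  # False: only 80% of the classes do
```

The two classes above k form the accepted tail: they are tolerated as long as they stay within 100% - p% of the entities.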
Extracting Relative Thresholds
p and k characterize a relative threshold
To calculate p and k we need:
Corpus: a set of systems
Two constants:
MIN: real design rules
TAIL: idealized design rules
Extracting Relative Thresholds
MIN: real design rules
Relative thresholds should be followed by at least MIN% of the systems in the Corpus
Extracting Relative Thresholds
TAIL: idealized design rules
The tail starts at the TAIL-th percentile of the values of a given metric; it contains the classes with very high values.
An Example
Corpus: 106 systems
MIN: 90% of the systems
TAIL: 90th percentile
Empirical Method
Functions to calculate the parameters p and k
ComplianceRate Function
Returns the % of systems in the corpus that follow the relative threshold defined by the pair [p, k]
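ComplianceRate can be sketched in a few lines. The version below is an illustration, not the authors' implementation: it assumes the corpus is a mapping from system name to the metric values of its classes, and that a system follows [p, k] when at least p% of its classes have a value of at most k. The system names and values are hypothetical.

```python
from typing import Mapping, Sequence


def follows_threshold(values: Sequence[float], p: float, k: float) -> bool:
    """True if at least p% of the entities have a metric value <= k."""
    within = sum(1 for v in values if v <= k)
    return 100.0 * within / len(values) >= p


def compliance_rate(corpus: Mapping[str, Sequence[float]], p: float, k: float) -> float:
    """Percentage of systems in the corpus that follow the pair [p, k]."""
    followers = sum(1 for values in corpus.values() if follows_threshold(values, p, k))
    return 100.0 * followers / len(corpus)


# Hypothetical two-system corpus of NOA values (ten classes each).
corpus = {
    "system_a": [1, 2, 3, 4, 5, 6, 7, 8, 9, 20],
    "system_b": [2, 2, 3, 3, 4, 10, 12, 15, 18, 40],
}

print(compliance_rate(corpus, p=80, k=8))   # 50.0: only system_a complies
print(compliance_rate(corpus, p=80, k=18))  # 100.0: a larger k admits both systems
```

Raising k or lowering p can only keep or increase the rate, which is why the rate alone cannot select a threshold.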
ComplianceRate – Example #1 - NOA
ComplianceRate [85, 17] = 100%
Maximal ComplianceRate (100%)
But relies on a high value for k (17 attributes)
ComplianceRate – Example #2 – NOA
ComplianceRate [90, 8] = 50%
Smaller ComplianceRate (half of the systems)
But uses an acceptable value for k (8 attributes)
Penalization Functions
Both examples are penalized:
High value for k
Example #1: ComplianceRate[85, 17] = 100%
Small value for ComplianceRate
Example #2: ComplianceRate[90, 8] = 50%
Penalty1 Function
To penalize ComplianceRate < MIN%
Penalty1 formalizes real design rules
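The slide text does not include the formula for Penalty1. The sketch below assumes a normalized shortfall, (MIN - ComplianceRate) / MIN, applied only when the rate is below MIN. That form is an assumption; it is used here because it reproduces the value 0.4 reported for the pair [90, 8] with MIN = 90%.

```python
def penalty1(compliance_rate: float, min_rate: float) -> float:
    """Assumed form: relative shortfall of the compliance rate below MIN, else 0."""
    if compliance_rate < min_rate:
        return (min_rate - compliance_rate) / min_rate
    return 0.0


# Values from the two NOA examples, with MIN = 90%.
print(round(penalty1(50.0, 90.0), 1))  # 0.4: half of the systems comply
print(penalty1(100.0, 90.0))           # 0.0: every system complies
```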
Penalty2 Function
Penalty2 formalizes idealized design rules.
TAIL[S]: TAIL-th percentile of the values of M in a system S
TailMedian is the median of the TAIL[S] values across the systems in the Corpus.
To penalize ComplianceRate when k > TailMedian
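A possible reading of these definitions is sketched below. Several points are assumptions: the nearest-rank percentile definition, the reading of TailMedian as the median of the per-system percentiles, and the penalty form (k - TailMedian) / TailMedian, none of which are spelled out in the slide text. The three-system corpus is hypothetical.

```python
import math
from statistics import median
from typing import Mapping, Sequence


def percentile(values: Sequence[float], q: float) -> float:
    """Nearest-rank q-th percentile (one of several common definitions)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def tail_median(corpus: Mapping[str, Sequence[float]], tail: float) -> float:
    """Median, over all systems S, of TAIL[S]."""
    return median(percentile(values, tail) for values in corpus.values())


def penalty2(k: float, tail_med: float) -> float:
    """Assumed form: relative excess of k over TailMedian, else 0."""
    if k > tail_med:
        return (k - tail_med) / tail_med
    return 0.0


# Hypothetical corpus: NOA values for ten classes in each of three systems.
corpus = {
    "system_a": [1, 2, 3, 4, 5, 6, 7, 8, 9, 20],
    "system_b": [2, 2, 3, 3, 4, 10, 12, 15, 18, 40],
    "system_c": [1, 1, 2, 2, 3, 3, 4, 5, 7, 11],
}

tm = tail_median(corpus, tail=90)   # per-system 90th percentiles: 9, 18, 7
print(tm)                           # 9
print(round(penalty2(17, tm), 1))   # 0.9: k = 17 is well above the tail median
print(penalty2(8, tm))              # 0.0: k = 8 is not above the tail median
```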
ComplianceRatePenalty
ComplianceRatePenalty is a sum of penalties
The Relative Threshold is the one with the lowest ComplianceRatePenalty
Empirical Method - ComplianceRatePenalty
ComplianceRatePenalty [p, k] = penalty1 + penalty2
ComplianceRatePenalty [85,17] = 0 + 0.9 = 0.9
ComplianceRatePenalty [90,8] = 0.4 + 0 = 0.4
The relative threshold is defined by the lowest value calculated for ComplianceRatePenalty
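The arithmetic for the two candidate pairs can be replayed as follows. This is a sketch under stated assumptions: penalty1 is taken as (MIN - ComplianceRate) / MIN and penalty2 as (k - TailMedian) / TailMedian, each applied only when positive, and TailMedian = 9 is a hypothetical value chosen because it is consistent with the 0.9 shown above. The slide text gives neither the formulas nor the actual TailMedian.

```python
def compliance_rate_penalty(rate: float, k: float, min_rate: float, tail_med: float) -> float:
    """Sum of the two penalties (assumed forms, see the lead-in)."""
    p1 = (min_rate - rate) / min_rate if rate < min_rate else 0.0
    p2 = (k - tail_med) / tail_med if k > tail_med else 0.0
    return p1 + p2


MIN = 90.0          # from the example: 90% of the systems
TAIL_MEDIAN = 9.0   # assumption, not stated in the slides

# ComplianceRate values reported for the two candidate pairs [p, k].
rates = {(85, 17): 100.0, (90, 8): 50.0}

penalties = {
    pk: compliance_rate_penalty(rate, pk[1], MIN, TAIL_MEDIAN)
    for pk, rate in rates.items()
}
for pk, value in penalties.items():
    print(pk, round(value, 1))      # (85, 17) 0.9 and (90, 8) 0.4

best = min(penalties, key=penalties.get)
print(best)                         # (90, 8): the lowest penalty wins
```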
ComplianceRatePenalty - Example - NOA
ComplianceRatePenalty = 0 in five cases:
[75,7] [75,8] [75,9] [80,8] [80,9]
Tiebreaker criteria
the highest p and the lowest k
[80,8] [80,9]
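The tiebreaker can be expressed as a sort key. The sketch below assumes the criteria are applied in order (highest p first, then lowest k among the remaining pairs); the function name `break_tie` is hypothetical.

```python
from typing import Iterable, Tuple

Pair = Tuple[int, int]  # (p, k)


def break_tie(candidates: Iterable[Pair]) -> Pair:
    """Prefer the highest p; among equal p, prefer the lowest k."""
    return min(candidates, key=lambda pk: (-pk[0], pk[1]))


# The five pairs with ComplianceRatePenalty = 0 in the NOA example.
tied = [(75, 7), (75, 8), (75, 9), (80, 8), (80, 9)]

print(break_tie(tied))  # (80, 8)
```

Under this ordering, the highest p leaves [80,8] and [80,9], and the lowest k then selects [80,8].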
Case Study
Includes:
1. Relative thresholds extraction for seven metrics
2. Relative thresholds extraction for a subcorpus
3. Historical analysis
Metrics & Systems
Metrics: NOM, LOC, FAN-OUT, RFC, WMC, PUBA/NOA, LCOM
Systems: Qualitas Corpus - 106 systems
Extracted Relative Thresholds
* Systems that do not follow the thresholds
Subcorpus: Metrics & Systems
Metrics: NOM, LOC, FAN-OUT, RFC, WMC, PUBA/NOA, LCOM
Systems: Qualitas Corpus - 26 Tools
Subcorpus: Extracted Relative Thresholds
The thresholds rely on relatively high values for k.
Historical Analysis: Metrics & Systems
Metrics: NOM, FAN-OUT, WMC, PUBA/NOA
Systems: Previous versions - COMETS
Historical analysis - Dataset
COMETS: Code Metrics Time Series Dataset
outlier
Historical analysis
Across the extracted versions, the systems did not change their status.
Conclusion
Relative Thresholds
Thresholds should be valid for most entities
But not for all entities
Case Study: Qualitas Corpus
The extracted thresholds represent a balance between real and idealized design rules.
Future Work
New metrics and new corpus;
Different contexts and programming languages
New studies on Relative Thresholds (RT)
Can we use RT to measure technical debt?
What is the impact of not following the RTs?
Are outliers really different from non-outliers?
Thank you!
APPLIED SOFTWARE ENGINEERING RESEARCH GROUP
[email protected]
http://aserg.labsoft.dcc.ufmg.br