Extracting Relative Thresholds for Source Code Metrics (CSMR-WCRE 2014)

Slide 1

Slide 1 text

Extracting Relative Thresholds for Source Code Metrics. Paloma Oliveira, Marco Túlio Valente, Fernando P. Lima. Applied Software Engineering Research Group.

Slide 2

Slide 2 text

Motivation. Source code metrics cover cohesion, complexity, size, and coupling. Metrics are rarely used to measure internal quality. It is essential to establish credible thresholds. CSMR-WCRE-2014

Slide 3

Slide 3 text

Our idea: Relative Thresholds. These are thresholds followed by most, but not all, code entities. We accept a tail of (100% - p%) of the entities; in the NOA example the tail is 20%.

Slide 4

Slide 4 text

Relative Thresholds. Format: p% of the entities should have M <= k, where M is a source code metric, p is the minimal percentage of entities in each system, and k is the upper limit.
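The format above can be read as a per-system check. The following is a minimal sketch, assuming a system is represented as a plain list of metric values (one per class); the function name `follows_threshold` and the NOA values are hypothetical and only illustrate the idea.

```python
def follows_threshold(values, p, k):
    """Return True if at least p% of the metric values are <= k."""
    if not values:
        raise ValueError("a system needs at least one measured entity")
    within = sum(1 for v in values if v <= k)
    return 100.0 * within / len(values) >= p


# Hypothetical NOA values for the ten classes of a toy system.
noa = [0, 1, 2, 2, 3, 4, 5, 8, 12, 30]

# 8 of the 10 classes have NOA <= 8, so the system follows [p, k] = [80, 8].
print(follows_threshold(noa, p=80, k=8))  # True
print(follows_threshold(noa, p=90, k=8))  # False
```

The two classes with 12 and 30 attributes form the accepted tail: they exceed k, but the system still follows the threshold.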

Slide 5

Slide 5 text

Extracting Relative Thresholds. p and k characterize a relative threshold. To calculate p and k we need a corpus (a set of systems) and two constants: MIN (real design rules) and TAIL (idealized design rules).

Slide 6

Slide 6 text

Extracting Relative Thresholds. MIN (real design rules): relative thresholds should be followed by at least MIN% of the systems in the corpus.

Slide 7

Slide 7 text

Extracting Relative Thresholds. TAIL (idealized design rules): the tail starts at the TAIL-th percentile of the values of a given metric and contains the classes with very high values.

Slide 8

Slide 8 text

An Example. Corpus: 106 systems. MIN: 90% of the systems. TAIL: 90th percentile.

Slide 9

Slide 9 text

Empirical Method: functions to calculate the parameters p and k.

Slide 10

Slide 10 text

ComplianceRate Function: returns the percentage of systems in the corpus that follow the relative threshold defined by the pair [p, k].
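A minimal sketch of this function, assuming the corpus is a dictionary that maps a system name to the list of metric values of its classes. The names `compliance_rate` and `follows_threshold`, and the four toy systems, are hypothetical.

```python
def follows_threshold(values, p, k):
    """True if at least p% of the values of one system are <= k."""
    within = sum(1 for v in values if v <= k)
    return 100.0 * within / len(values) >= p


def compliance_rate(corpus, p, k):
    """Percentage of systems in the corpus that follow the threshold [p, k]."""
    following = sum(
        1 for values in corpus.values() if follows_threshold(values, p, k)
    )
    return 100.0 * following / len(corpus)


# Toy corpus (made-up values): four systems with five classes each.
corpus = {
    "sys_a": [1, 2, 3, 4, 20],
    "sys_b": [2, 2, 5, 9, 9],
    "sys_c": [0, 1, 1, 2, 3],
    "sys_d": [6, 7, 8, 10, 40],
}

print(compliance_rate(corpus, p=80, k=8))   # 50.0 (sys_a and sys_c follow it)
print(compliance_rate(corpus, p=80, k=10))  # 100.0
```

Raising k from 8 to 10 lifts the compliance rate to 100%, which is the trade-off the next slides illustrate with NOA.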

Slide 11

Slide 11 text

ComplianceRate, Example #1 (NOA): ComplianceRate[85, 17] = 100%. This is the maximal ComplianceRate (100%), but it relies on a high value for k (17 attributes).

Slide 12

Slide 12 text

ComplianceRate, Example #2 (NOA): ComplianceRate[90, 8] = 50%. This is a smaller ComplianceRate (half of the systems), but it uses an acceptable value for k (8 attributes).

Slide 13

Slide 13 text

Penalization Functions: both examples are penalized. Example #1, ComplianceRate[85, 17] = 100%, is penalized for its high value of k. Example #2, ComplianceRate[90, 8] = 50%, is penalized for its small ComplianceRate.

Slide 14

Slide 14 text

Penalty1 Function: penalizes ComplianceRate < MIN%. Penalty1 formalizes real design rules.

Slide 15

Slide 15 text

Penalty1, Example #1 (NOA): ComplianceRate[85, 17] = 100% and MIN = 90%, so ComplianceRate >= MIN and Penalty1[85, 17] = 0.

Slide 16

Slide 16 text

Penalty1, Example #2 (NOA): ComplianceRate[90, 8] = 50% and MIN = 90%, so Penalty1[90, 8] = (90 - 50) / 90 ≈ 0.4.
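The two Penalty1 examples for NOA can be reproduced with a short sketch. The formula is inferred from the worked example (the shortfall below MIN, relative to MIN); the function and parameter names are hypothetical.

```python
def penalty1(compliance_rate, min_rate):
    """Relative shortfall of the compliance rate below MIN (both in percent)."""
    if compliance_rate < min_rate:
        return (min_rate - compliance_rate) / min_rate
    return 0.0


# Example #1: ComplianceRate[85, 17] = 100%, MIN = 90%.
print(penalty1(100, 90))            # 0.0
# Example #2: ComplianceRate[90, 8] = 50%, MIN = 90%.
print(round(penalty1(50, 90), 2))   # 0.44
```

The slides round the second value to 0.4.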

Slide 17

Slide 17 text

Penalty2 Function: penalizes ComplianceRate when k > TailMedian. Penalty2 formalizes idealized design rules. TAIL[S] is the TAIL-th percentile of the values of M in a system S. TailMedian is the median of the TAIL[S] values over the systems in the corpus.
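A sketch of how TailMedian could be computed, under two assumptions that the slides do not spell out: TailMedian is taken as the median, over all systems, of each system's TAIL-th percentile, and the percentile uses the nearest-rank definition (other percentile definitions give slightly different values). The helper names and the toy corpus are hypothetical.

```python
import math
from statistics import median


def percentile(values, q):
    """Nearest-rank q-th percentile (0 < q <= 100) of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def tail_median(corpus, tail):
    """Median over all systems of the TAIL-th percentile of each system."""
    return median([percentile(values, tail) for values in corpus.values()])


# Toy corpus (made-up values): three systems with ten classes each.
corpus = {
    "sys_a": [0, 1, 2, 2, 3, 4, 5, 8, 12, 30],
    "sys_b": [1, 1, 2, 3, 3, 4, 6, 7, 9, 15],
    "sys_c": [0, 0, 1, 1, 2, 2, 3, 5, 7, 11],
}

# The 90th percentiles are 12, 9 and 7, so their median is 9.
print(tail_median(corpus, tail=90))  # 9
```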

Slide 18

Slide 18 text

Penalty2, Example #1 (NOA): TailMedian = 9 and ComplianceRate[85, 17] = 100%. Since k = 17 > TailMedian, Penalty2[17] = (17 - 9) / 9 ≈ 0.9.

Slide 19

Slide 19 text

Penalty2, Example #2 (NOA): TailMedian = 9 and ComplianceRate[90, 8] = 50%. Since k = 8 < TailMedian, Penalty2[8] = 0.
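The two Penalty2 examples for NOA can be checked with a short sketch. The formula is inferred from the worked example (the excess of k over TailMedian, relative to TailMedian); the function name is hypothetical, and TailMedian is assumed to be greater than zero.

```python
def penalty2(k, tail_median):
    """Relative excess of k over TailMedian; zero when k <= TailMedian."""
    if k > tail_median:
        return (k - tail_median) / tail_median
    return 0.0


# Example #1: k = 17, TailMedian = 9.
print(round(penalty2(17, 9), 2))  # 0.89
# Example #2: k = 8, TailMedian = 9.
print(penalty2(8, 9))             # 0.0
```

The slides round the first value to 0.9.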

Slide 20

Slide 20 text

ComplianceRatePenalty: the sum of the two penalties. The relative threshold is the pair [p, k] with the lowest ComplianceRatePenalty.

Slide 21

Slide 21 text

Empirical Method, ComplianceRatePenalty: ComplianceRatePenalty[p, k] = Penalty1[p, k] + Penalty2[k]. ComplianceRatePenalty[85, 17] = 0 + 0.9 = 0.9. ComplianceRatePenalty[90, 8] = 0.4 + 0 = 0.4. The relative threshold is defined by the lowest value calculated for ComplianceRatePenalty.
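The sums above can be reproduced with a small sketch that takes the already computed compliance rate as input. The penalty formulas are inferred from the worked examples on the earlier slides, and all function and parameter names are hypothetical.

```python
def penalty1(compliance_rate, min_rate):
    """Relative shortfall of the compliance rate below MIN."""
    if compliance_rate < min_rate:
        return (min_rate - compliance_rate) / min_rate
    return 0.0


def penalty2(k, tail_median):
    """Relative excess of k over TailMedian."""
    if k > tail_median:
        return (k - tail_median) / tail_median
    return 0.0


def compliance_rate_penalty(compliance_rate, k, min_rate, tail_median):
    """Sum of the two penalties for one candidate pair [p, k]."""
    return penalty1(compliance_rate, min_rate) + penalty2(k, tail_median)


# NOA examples with MIN = 90% and TailMedian = 9.
print(round(compliance_rate_penalty(100, 17, 90, 9), 2))  # 0.89 for [85, 17]
print(round(compliance_rate_penalty(50, 8, 90, 9), 2))    # 0.44 for [90, 8]
```

The slides report these values rounded to one decimal place (0.9 and 0.4).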

Slide 22

Slide 22 text

ComplianceRatePenalty, Example (NOA): ComplianceRatePenalty = 0 in five cases: [75, 7], [75, 8], [75, 9], [80, 8], [80, 9]. Tiebreaker criteria: first the highest p, which leaves [80, 8] and [80, 9], and then the lowest k.
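The selection and tiebreak can be expressed as a single ordering: lowest penalty first, then highest p, then lowest k. The sketch below assumes the penalties are already available in a dictionary keyed by (p, k); the function name `select_threshold` is hypothetical, and the entries are the values reported on the slides.

```python
def select_threshold(penalties):
    """Pick the (p, k) pair with the lowest penalty; break ties by the
    highest p and then by the lowest k."""
    return min(penalties, key=lambda pk: (penalties[pk], -pk[0], pk[1]))


# Penalties taken from the NOA slides (five ties at zero, two penalized pairs).
penalties = {
    (75, 7): 0.0,
    (75, 8): 0.0,
    (75, 9): 0.0,
    (80, 8): 0.0,
    (80, 9): 0.0,
    (85, 17): 0.9,
    (90, 8): 0.4,
}

print(select_threshold(penalties))  # (80, 8)
```

Because ties are detected by exact equality, penalties computed in floating point may need rounding before they are compared.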

Slide 23

Slide 23 text

Empirical Method: the relative threshold for the NOA metric is [p, k] = [80, 8].
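Putting the pieces together, the whole method can be sketched as a search over candidate pairs. This is an illustrative sketch, not the authors' implementation: the candidate grids for p and k, the nearest-rank percentile, the reading of TailMedian as the median of the per-system percentiles, and the toy corpus are all assumptions, and every name is hypothetical.

```python
import math
from statistics import median


def percentile(values, q):
    """Nearest-rank q-th percentile of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def compliance_rate(corpus, p, k):
    """Percentage of systems where at least p% of the values are <= k."""
    following = sum(
        1 for values in corpus
        if 100.0 * sum(v <= k for v in values) / len(values) >= p
    )
    return 100.0 * following / len(corpus)


def extract_threshold(corpus, min_rate, tail, p_candidates, k_candidates):
    """Return the (p, k) pair with the lowest ComplianceRatePenalty."""
    # Assumes the resulting TailMedian is greater than zero.
    tail_median = median([percentile(values, tail) for values in corpus])
    best = None
    for p in p_candidates:
        for k in k_candidates:
            rate = compliance_rate(corpus, p, k)
            pen1 = (min_rate - rate) / min_rate if rate < min_rate else 0.0
            pen2 = (k - tail_median) / tail_median if k > tail_median else 0.0
            # Lowest penalty first, then highest p, then lowest k.
            key = (pen1 + pen2, -p, k)
            if best is None or key < best[0]:
                best = (key, (p, k))
    return best[1]


# Toy corpus (made-up values): three systems with ten classes each.
corpus = [
    [0, 1, 2, 2, 3, 4, 5, 8, 12, 30],
    [1, 1, 2, 3, 3, 4, 6, 7, 9, 15],
    [0, 0, 1, 1, 2, 2, 3, 5, 7, 11],
]

print(extract_threshold(corpus, min_rate=90, tail=90,
                        p_candidates=range(50, 100, 10),
                        k_candidates=range(1, 31)))  # (80, 8)
```

On this toy data the search happens to return (80, 8); the match with the NOA result on the slide is a coincidence of the made-up values.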

Slide 24

Slide 24 text

Case Study

Slide 25

Slide 25 text

Case Study. Includes: 1. Relative threshold extraction for seven metrics. 2. Relative threshold extraction for a subcorpus. 3. Historical analysis.

Slide 26

Slide 26 text

Metrics & Systems. Metrics: NOM, LOC, FAN-OUT, RFC, WMC, PUBA/NOA, LCOM. Systems: Qualitas Corpus, 106 systems.

Slide 27

Slide 27 text

Extracted Relative Thresholds. (* marks systems that do not follow the thresholds.)

Slide 28

Slide 28 text

Subcorpus: Metrics & Systems. Metrics: NOM, LOC, FAN-OUT, RFC, WMC, PUBA/NOA, LCOM. Systems: Qualitas Corpus, 26 tools.

Slide 29

Slide 29 text

Subcorpus: Extracted Relative Thresholds. The thresholds rely on relatively high values for k.

Slide 30

Slide 30 text

Historical Analysis: Metrics & Systems. Metrics: NOM, FAN-OUT, WMC, PUBA/NOA. Systems: previous versions from COMETS.

Slide 31

Slide 31 text

Historical Analysis: Dataset. COMETS: Code Metrics Time Series Dataset. (Figure annotation: outlier.)

Slide 32

Slide 32 text

Historical Analysis: across the extracted versions, the systems did not change their status.

Slide 33

Slide 33 text

Conclusion. Relative thresholds: thresholds should be valid for most entities, but not for all entities. Case study (Qualitas Corpus): the extracted thresholds represent a balance between real and idealized design rules.

Slide 34

Slide 34 text

Future Work. New metrics and a new corpus; different contexts and programming languages. New studies on Relative Thresholds (RT): Can we use RT to measure technical debt? What is the impact of not following the RTs? Are outliers really different from non-outliers?

Slide 35

Slide 35 text

Thank you! Applied Software Engineering Research Group / [email protected] http://aserg.labsoft.dcc.ufmg.br