Validating Metric Thresholds with Developers: an Early Result (ICSME 2015, ERA Track, presented by A. Bergel)

Thresholds are essential for promoting source code metrics as an effective instrument to control the internal quality of software applications. However, little is known about the relation between software quality as identified by metric thresholds and as perceived by real developers. In this paper, we report the first results of a study designed to validate a technique that extracts relative metric thresholds from benchmark data. We use this technique to extract thresholds from a benchmark of 79 Pharo/Smalltalk applications, which are validated with five experts and 25 developers. Our preliminary results indicate that good quality applications—as cited by experts—respect metric thresholds. In contrast, we observed that noncompliant applications are not largely viewed as requiring more effort to maintain than other applications.

ASERG, DCC, UFMG

October 05, 2015

Transcript

  1. Validating Metric Thresholds with Developers: An Early Result
    Paloma Oliveira - @palomaifmg
    Marco Tulio Valente - @mtov
    Alexandre Bergel - @alexbergel
    Alexander Serebrenik - @aserebrenik
    Federal Institute of Minas Gerais
    University of Chile
    Applied Software Engineering Research Group

  4. Coming back to software metrics
    ▪ It is difficult to give meaning to software metric values
    ▪ Establishing credible thresholds is essential

  5. Relative Thresholds
    ▪ Must be followed by most source code entities.
    ▪ Example:
    Oliveira et al. CSMR-WCRE, 2014

  6. Relative Thresholds
    ▪ Format: p% of the entities should have M ≤ k
    ▪ p: minimal percentage of entities in each system
    ▪ M: source code metric
    ▪ k: upper limit
    ▪ Relative thresholds are extracted from a corpus
    Oliveira et al. CSMR-WCRE, 2014
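
A relative threshold can be read as a check on one system: at least p% of its entities must have a metric value M of at most k. The sketch below illustrates that reading only. The function names and the sample NOM values are hypothetical and do not come from the slides.

```python
from typing import Sequence


def compliance_rate(values: Sequence[float], k: float) -> float:
    """Percentage of entities whose metric value is at most k."""
    if not values:
        raise ValueError("a system needs at least one entity")
    within = sum(1 for v in values if v <= k)
    return 100.0 * within / len(values)


def complies(values: Sequence[float], p: float, k: float) -> bool:
    """True when at least p% of the entities have a metric value <= k."""
    return compliance_rate(values, k) >= p


# Hypothetical NOM values for the eight classes of a made-up system.
nom_per_class = [3, 7, 12, 18, 25, 31, 40, 9]

print(compliance_rate(nom_per_class, k=29))  # 6 of 8 classes have NOM <= 29
print(complies(nom_per_class, p=75, k=29))
```

Keeping the rate separate from the yes/no decision makes it easy to see how far a system is from a threshold, not just whether it passes.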

  7. This paper
    ▪ Evaluate relative thresholds against developers' perceptions
    ▪ Step 1: define the corpus
    ▪ Step 2: compute the thresholds
    ▪ Step 3: ask 5 experts to indicate well-written and poorly written
    applications
    ▪ Step 4: contrast the results of Steps 2 and 3
    ▪ Step 5: ask 25 maintainers of noncompliant applications
    how they perceive the maintenance effort
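
Step 2 can be pictured with a deliberately naive sketch. It is not the extraction procedure of Oliveira et al. (CSMR-WCRE, 2014). It assumes p is fixed in advance and simply picks the smallest k that a chosen share of the corpus systems satisfies. The names `smallest_k` and `naive_threshold`, the `share` parameter, and the toy corpus are all hypothetical.

```python
import math


def smallest_k(values, p):
    """Smallest k such that at least p% of the values are <= k."""
    ordered = sorted(values)
    needed = math.ceil(p * len(ordered) / 100)  # values that must fall within k
    return ordered[max(needed, 1) - 1]


def naive_threshold(corpus, p, share):
    """Smallest k for which at least `share`% of the systems comply with (p, k).

    A system complies with k exactly when k is at least its own smallest_k,
    so the answer is the `share`-th percentile of the per-system values.
    """
    per_system_k = [smallest_k(values, p) for values in corpus.values()]
    return smallest_k(per_system_k, share)


# Hypothetical corpus: system name -> metric value of each class.
corpus = {
    "A": [1, 2, 3, 4],
    "B": [2, 2, 5, 9],
    "C": [1, 1, 1, 30],
    "D": [3, 4, 6, 7],
}

print(naive_threshold(corpus, p=75, share=75))  # 5: systems A, B and C comply
```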

  8. Relative Thresholds for Pharo
    ▪ Corpus: 79 Pharo applications
    ▪ Metrics: NOA, NOM, FAN-OUT, and WMC
    Metric     p (%)   k
    NOA        75      5
    NOM        75      29
    FAN-OUT    80      9
    WMC        75      46
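
As an illustration, the four thresholds in the table can be applied to one system to list the metrics it violates. The (p, k) pairs are taken from the table. The function name, the dictionary layout, and the toy system with five classes are assumptions made for this sketch.

```python
# (p, k) pairs from the table: at least p% of the classes must have metric <= k.
PHARO_THRESHOLDS = {
    "NOA": (75, 5),
    "NOM": (75, 29),
    "FAN-OUT": (80, 9),
    "WMC": (75, 46),
}


def violated_thresholds(system_metrics):
    """Return the metrics whose relative threshold the system does not meet."""
    violated = []
    for metric, (p, k) in PHARO_THRESHOLDS.items():
        values = system_metrics[metric]
        within = sum(1 for v in values if v <= k)
        if 100.0 * within / len(values) < p:
            violated.append(metric)
    return violated


# Hypothetical system with five classes (one value per class and metric).
toy_system = {
    "NOA": [1, 2, 3, 4, 8],         # 4 of 5 classes within k=5  (80%)
    "NOM": [10, 20, 35, 50, 60],    # 2 of 5 classes within k=29 (40%)
    "FAN-OUT": [2, 4, 6, 8, 20],    # 4 of 5 classes within k=9  (80%)
    "WMC": [20, 30, 47, 60, 90],    # 2 of 5 classes within k=46 (40%)
}

print(violated_thresholds(toy_system))  # ['NOM', 'WMC']
```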

  9. Noncompliant systems
    Noncompliant NOA NOM FAN-OUT WMC
    Collections x x
    CommandShell x x x
    Files x x
    Graphics x x x x
    Kernel x x
    Manifest x x x
    Morphic x x
    Shout x x x x
    Tool x x x x

  10. RQ #1: Well-written applications
    ▪ Do applications perceived as well-written by experts
    respect the derived relative thresholds?
    System         Description                      Voted by
    PetitParser    Parser framework                 Expert #1
    PharoLauncher  Platform to manage Pharo images  Expert #2
    Pillar         Markup language and tools        Expert #2
    Roassal        Visualization engine             Expert #3
    Seaside        Web framework                    Expert #4
    SystemLogger   Log framework                    Expert #5
    Zinc           HTTP framework                   Expert #5
    ▪ Well-written applications respect the relative thresholds
    ▪ with the exception of FAN-OUT

  11. RQ #1: Well-written applications
    ▪ High FAN-OUT
    ▪ Presence of extensive inheritance hierarchies with
    many overridden methods

  12. RQ #2: Poorly-written applications
    ▪ Morphic does not respect the relative thresholds
    “Morphic is an old system and there is no test and sparse
    documentation”. Expert #2
    ▪ Metacello respects the relative thresholds
    ▪ It was cited as poorly written due to the complexity of its domain.
    System     Description                    Voted by
    Metacello  Versioning system              Expert #4
    Morphic    Graphical interface framework  Experts #2 and #4

  13. RQ #3: Noncompliant applications
    ▪ Four (out of nine) noncompliant applications are harder
    to maintain
    ▪ “Graphics is a sum of patches over patches without a
    clear direction on design, with tons of duplicates and
    several design errors/conflicts. So is a pain to introduce
    any change there.” Graphics Maintainer
    ▪ Noncompliant applications are not largely viewed as
    requiring more effort to maintain than other applications

  14. Conclusion
    ▪ Well-designed applications respect the thresholds.
    ▪ Developers usually have difficulty identifying poorly designed
    applications.
    ▪ Noncompliant applications are not largely viewed as
    requiring more effort to maintain.