Generation of Testing Metrics by Using Cluster Analysis of Bug Reports

Exactpro
November 07, 2019

Anna Gromova, Iosif Itkin and Sergey Pavlov

International Conference on Software Testing, Machine Learning and Complex Process Analysis (TMPA-2019)
7-9 November 2019, Tbilisi

Video: https://youtu.be/jJepN0LctqM

TMPA Conference website: https://tmpaconf.org/
TMPA Conference on Facebook: https://www.facebook.com/groups/tmpaconf/

Transcript

  1. 7-9 November, Tbilisi
    Ivane Javakhishvili Tbilisi State University
    Generation of Testing Metrics by Using
    Cluster Analysis of Bug Reports
    Anna Gromova, Iosif Itkin, Sergey Pavlov
    Exactpro

  2. Testing Metrics
    Testing metrics measure the defect management process:
    ● time to fix,
    ● which defects get reopened,
    ● which defects get rejected,
    ● which defects get fixed,
    ● etc.
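
As a rough illustration of the metrics listed on this slide, the sketch below computes a mean time to fix and the shares of reopened and rejected defects. The records, field names and helper functions are hypothetical and are not taken from the talk.

```python
from datetime import date
from statistics import mean

# Hypothetical defect records; the field names are illustrative only.
defects = [
    {"created": date(2019, 1, 1), "resolved": date(2019, 1, 11),
     "reopened": False, "resolution": "Done"},
    {"created": date(2019, 1, 5), "resolved": date(2019, 1, 7),
     "reopened": True, "resolution": "Done"},
    {"created": date(2019, 2, 1), "resolved": date(2019, 2, 2),
     "reopened": False, "resolution": "Rejected"},
]


def time_to_fix_days(defect):
    """Days between creation and resolution of one defect."""
    return (defect["resolved"] - defect["created"]).days


def summarize(records):
    """Aggregate a few simple defect-management metrics."""
    n = len(records)
    return {
        "mean_time_to_fix_days": mean(time_to_fix_days(d) for d in records),
        "reopened_share": sum(d["reopened"] for d in records) / n,
        "rejected_share": sum(d["resolution"] == "Rejected" for d in records) / n,
    }


metrics = summarize(defects)
print(metrics)
```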

  3. The Problem of Corresponding Metrics

  4. Contribution
    ● An approach to generating testing metrics via clustering of defect reports;
    ● An enhanced set of defect report attributes;
    ● An empirical comparison of different clustering algorithms for this task.

  5. Related Work
    ● Metrics prediction is an important task in defect management.
    ● It is not yet clear which testing metrics suit which software development projects.
    ● Clustering makes it possible to reveal duplicate and related bugs.

  6. Defect Attributes
    Attribute            | Data type   | Values
    Priority             | Ordinal     | “Blocker”, “Critical”, “Major”, “Minor”, “Optional”, “Trivial”
    Resolution           | Categorical | “Cannot Reproduce”, “Done”, “Duplicate Issue”, “Explained”, “Migrated to another IS”, “Out of Date”, “Rejected”, “Resolved at Apache”, “Unresolved”, “Won’t Fix”
    Was_reopened         | Boolean     | {0;1}
    Time to resolve      | Numeric     |
    Count of comments    | Numeric     |
    Count of attachments | Numeric     |
    Area_i               | Boolean     | {0;1}

    $D = \{d_1, d_2, \ldots, d_n\}$, where $n$ is the number of defects in the project.
    $d_j = \{\text{Priority}, \text{Resolution}, \text{was\_reopened}, \text{Time to resolve}, \text{Count of attachments}, \text{Count of comments}, \text{Area}_1, \ldots, \text{Area}_k\}$, where $k$ is the number of defined areas of testing.
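
The attribute set above can be turned into a numeric vector before clustering. The sketch below is one possible encoding, not the authors' code. It assumes that the priority values are listed from highest to lowest, that Resolution is one-hot encoded, and it uses three hypothetical area names (k = 3).

```python
# Values as listed on the slide; the ordering of priorities from highest
# to lowest is an assumption.
PRIORITIES = ["Blocker", "Critical", "Major", "Minor", "Optional", "Trivial"]
RESOLUTIONS = [
    "Cannot Reproduce", "Done", "Duplicate Issue", "Explained",
    "Migrated to another IS", "Out of Date", "Rejected",
    "Resolved at Apache", "Unresolved", "Won’t Fix",
]
AREAS = ["Area_1", "Area_2", "Area_3"]  # hypothetical areas of testing


def encode_defect(defect):
    """Encode one defect d_j as a flat list of numbers."""
    # Ordinal rank: Trivial = 0 ... Blocker = 5.
    priority_rank = len(PRIORITIES) - 1 - PRIORITIES.index(defect["priority"])
    # One 0/1 flag per resolution value (one-hot encoding, an assumption).
    resolution_flags = [int(defect["resolution"] == r) for r in RESOLUTIONS]
    # One 0/1 flag per area of testing.
    area_flags = [int(area in defect["areas"]) for area in AREAS]
    return (
        [priority_rank]
        + resolution_flags
        + [
            int(defect["was_reopened"]),
            defect["time_to_resolve"],
            defect["attachments"],
            defect["comments"],
        ]
        + area_flags
    )


example_defect = {
    "priority": "Major",
    "resolution": "Done",
    "was_reopened": True,
    "time_to_resolve": 12,
    "attachments": 1,
    "comments": 4,
    "areas": {"Area_2"},
}
vector = encode_defect(example_defect)
print(vector)
```

In practice the numeric attributes would usually be scaled before distance-based clustering; the slides do not say whether or how this was done.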

  7. Dataset Information
                                            | AS7                        | SOA                           | AP6
    Project name                            | JBOSS Application Server 7 | JBoss Enterprise SOA Platform | JBoss Enterprise Application Platform 6
    Number of defects                       | 2944                       | 2298                          | 1066
    Number of areas of testing              | 15                         | 10                            | 15
    Number of reopened defects              | 253                        | 510                           | 408
    Time to resolve (min / max / mean)      | 0 / 1792 / 39.535          | 0 / 1354 / 118.827            | 0 / 609 / 112.282
    Count of attachments (min / max / mean) | 0 / 15 / 0.215             | 0 / 13 / 0.327                | 0 / 17 / 0.280
    Count of comments (min / max / mean)    | 0 / 83 / 2.909             | 1 / 34 / 4.535                | 1 / 126 / 7.280

  8. Distribution of Defect Reports by the “Resolution” Attribute
    [Figure: one panel per project (AS7, SOA, AP6)]

  9. New Variables
    "Resolution_Out of Date_new":
    ● "Resolution_Out of Date",
    ● "Resolution_Deferred",
    ● "Resolution_Partially Completed".
    "Resolution_Wont Fix_new":
    ● "Resolution_Won’t Fix",
    ● "Resolution_Cannot Reproduce",
    ● "Resolution_Incomplete Description".
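
The grouping of resolution flags into the two new variables could look roughly like this. The sketch assumes that each resolution is stored as a 0/1 flag and that a new variable is set when any of its source flags is set (a logical OR). The slide does not state the exact rule, and the function name is hypothetical.

```python
# New variable -> source flags, as listed on the slide.
MERGED_RESOLUTIONS = {
    "Resolution_Out of Date_new": [
        "Resolution_Out of Date",
        "Resolution_Deferred",
        "Resolution_Partially Completed",
    ],
    "Resolution_Wont Fix_new": [
        "Resolution_Won’t Fix",
        "Resolution_Cannot Reproduce",
        "Resolution_Incomplete Description",
    ],
}


def merge_resolution_flags(row):
    """Replace the source flags of a defect row with the merged variables."""
    merged = dict(row)
    for new_name, old_names in MERGED_RESOLUTIONS.items():
        # Remove every source flag (missing flags count as 0) ...
        old_values = [merged.pop(old, 0) for old in old_names]
        # ... and set the new flag if any of them was set (assumed OR rule).
        merged[new_name] = int(any(old_values))
    return merged


example = {
    "Resolution_Done": 0,
    "Resolution_Deferred": 1,
    "Resolution_Cannot Reproduce": 0,
}
merged_example = merge_resolution_flags(example)
print(merged_example)
```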

  10. Distribution of Defect Reports by the “Area of Testing” Attribute
    [Figure: one panel per project (AS7, SOA, AP6)]

  11. Clustering
    Algorithms:
    ● k-means
    ● EM
    Validity criteria:
    ● Silhouette index
    ● Akaike's Information Criterion (AIC) / Bayesian Information Criterion (BIC)
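
To make the k-means / silhouette pairing concrete, here is a minimal pure-Python sketch on toy 2-D points. It is not the implementation used in the talk: the data, the seeding strategy and the function names are assumptions, and a real analysis would more likely rely on an established library.

```python
import math
import random


def kmeans(points, k, iterations=100, seed=0):
    """Plain Lloyd's k-means; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids: k random points
    labels = []
    for _ in range(iterations):
        # Assignment step: the nearest centroid wins.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


def silhouette(points, labels):
    """Mean silhouette index over all points."""
    scores = []
    for i, p in enumerate(points):
        distances = {}
        for j, q in enumerate(points):
            if j != i:
                distances.setdefault(labels[j], []).append(math.dist(p, q))
        own = distances.pop(labels[i], None)
        if not own or not distances:
            scores.append(0.0)  # singleton cluster or only one cluster
            continue
        a = sum(own) / len(own)  # mean distance inside the own cluster
        b = min(sum(d) / len(d) for d in distances.values())  # nearest other cluster
        scores.append((b - a) / (max(a, b) or 1.0))
    return sum(scores) / len(scores)


# Toy data: two well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
scores = {}
for k in (2, 3):
    labels, _ = kmeans(points, k)
    scores[k] = silhouette(points, labels)
print(scores)
```

A higher silhouette index means tighter, better separated clusters, so the candidate k with the highest score is the usual choice.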

  12. Results of the Validity Criteria for k-means
    [Figure: one panel per project (AS7, SOA, AP6)]

  13. Results of the Validity Criteria for EM
    [Figure: one panel per project (AS7, SOA, AP6)]
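
For EM, the number of clusters is typically chosen by fitting a mixture model for several candidate values and comparing AIC = 2p - 2 ln L and BIC = p ln n - 2 ln L, where p is the number of free parameters, n the number of samples and L the maximised likelihood (lower is better). The sketch below does this for a one-dimensional Gaussian mixture on toy data. It is a deliberate simplification: the defect vectors in the talk are multi-dimensional, and the data, initialisation and function names here are assumptions.

```python
import math


def normal_pdf(x, mu, var):
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_gmm_1d(xs, k, iterations=200, min_var=1e-3):
    """EM for a 1-D Gaussian mixture.

    Returns (weights, means, variances, log_likelihood).
    """
    n = len(xs)
    ordered = sorted(xs)
    # Deterministic start: means at evenly spaced quantiles, shared variance.
    means = [ordered[int((i + 0.5) * n / k)] for i in range(k)]
    centre = sum(xs) / n
    spread = max(sum((x - centre) ** 2 for x in xs) / n, min_var)
    variances = [spread] * k
    weights = [1.0 / k] * k
    for _ in range(iterations):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in xs:
            dens = [w * normal_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means and variances.
        for c in range(k):
            nc = sum(r[c] for r in resp)
            weights[c] = nc / n
            means[c] = sum(r[c] * x for r, x in zip(resp, xs)) / nc
            var = sum(r[c] * (x - means[c]) ** 2 for r, x in zip(resp, xs)) / nc
            variances[c] = max(var, min_var)  # guard against collapse
    log_likelihood = sum(
        math.log(sum(w * normal_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)))
        for x in xs
    )
    return weights, means, variances, log_likelihood


def aic_bic(log_likelihood, n_params, n_samples):
    aic = 2 * n_params - 2 * log_likelihood
    bic = n_params * math.log(n_samples) - 2 * log_likelihood
    return aic, bic


# Toy data: two groups around 1.0 and 5.0.
xs = [1.0, 1.2, 0.8, 1.1, 0.9, 5.0, 5.2, 4.8, 5.1, 4.9]
results = {}
for k in (1, 2, 3):
    _, _, _, ll = fit_gmm_1d(xs, k)
    # Free parameters: (k - 1) weights + k means + k variances.
    results[k] = aic_bic(ll, 3 * k - 1, len(xs))
print(results)
```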

  14. Results of k-means Clustering
    Types of bugs:
    ● “inexpensive-to-resolve”:
      TTR (time to resolve), count of comments, count of attachments < mean
    ● “longest-to-resolve”:
      TTR >= mean, Minor, “Out of Date” / “Migrated to another IS”
    ● “problematic”:
      TTR, count of comments, count of attachments >= mean, Major / Blocker, “Rejected” / “Cannot Reproduce”, “Won’t Fix”

  15. Results of EM Clustering
    Types of bugs:
    ● “inexpensive-to-resolve”:
      TTR (time to resolve), count of comments, count of attachments < mean
    ● “longest-to-resolve”:
      TTR >= mean, Minor, “Out of Date” / “Migrated to another IS”
    ● “expensive-to-resolve”:
      TTR, count of comments, count of attachments >= mean
    ● “invalid”:
      TTR, count of comments, count of attachments
    ● “underestimated”:
      TTR >= mean, Major / Critical / Blocker, “Done”, was reopened
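
The cluster descriptions on these slides compare attribute values with a mean. One way to produce such a profile is sketched below, assuming the comparison is between the cluster mean and the project-wide mean of each attribute. The field names, the toy rows and the naming rule are hypothetical; in the talk the interpretation also draws on priority and resolution values, which this sketch leaves out.

```python
from statistics import mean

FEATURES = ("ttr", "comments", "attachments")  # hypothetical field names


def profile_cluster(cluster_rows, all_rows):
    """Mark each attribute as '>= mean' or '< mean' relative to the project."""
    profile = {}
    for name in FEATURES:
        project_mean = mean(row[name] for row in all_rows)
        cluster_mean = mean(row[name] for row in cluster_rows)
        profile[name] = ">= mean" if cluster_mean >= project_mean else "< mean"
    return profile


def suggest_type(profile):
    """Very rough naming rule; anything mixed is left to a human."""
    marks = set(profile.values())
    if marks == {">= mean"}:
        return "expensive-to-resolve"
    if marks == {"< mean"}:
        return "inexpensive-to-resolve"
    return "needs manual interpretation"


# Toy rows with a cluster label already assigned by some clustering step.
rows = [
    {"ttr": 2, "comments": 1, "attachments": 0, "cluster": 0},
    {"ttr": 4, "comments": 2, "attachments": 0, "cluster": 0},
    {"ttr": 90, "comments": 12, "attachments": 3, "cluster": 1},
    {"ttr": 120, "comments": 9, "attachments": 1, "cluster": 1},
]
suggestions = {}
for cluster_id in sorted({row["cluster"] for row in rows}):
    members = [row for row in rows if row["cluster"] == cluster_id]
    suggestions[cluster_id] = suggest_type(profile_cluster(members, rows))
print(suggestions)
```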

  16. Determined Testing Metrics
    ● The number of “underestimated” defects
      Prevent Critical/Blocker bugs from persisting in the system
    ● The number of “expensive-to-resolve” / “inexpensive-to-resolve” defects
      Change the release decision
    ● The number of “invalid” defects
      Evaluate the quality of testing
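
Once every defect carries the name of its cluster, the metrics on this slide reduce to simple counts. A minimal sketch with hypothetical labels:

```python
from collections import Counter

# Hypothetical cluster names assigned to six defects.
defect_types = [
    "underestimated",
    "invalid",
    "inexpensive-to-resolve",
    "inexpensive-to-resolve",
    "expensive-to-resolve",
    "underestimated",
]

counts = Counter(defect_types)
testing_metrics = {
    "underestimated": counts["underestimated"],
    "expensive-to-resolve": counts["expensive-to-resolve"],
    "inexpensive-to-resolve": counts["inexpensive-to-resolve"],
    "invalid": counts["invalid"],
}
print(testing_metrics)
```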

  17. Excluded Testing Metrics
    ● The number of “longest-to-resolve” defects (aka outdated)
    ● The number of “accidentally reopened” defects

  18. Generating Metrics via Cluster Interpretation

  19. Conclusion
    ● Cluster analysis helps to understand the nature of software defects and to determine new testing metrics or improve existing ones;
    ● The EM algorithm gives a detailed picture of the project;
    ● The enhanced attribute set makes it possible to examine the clusters in detail.