Slide 1

Slide 1 text

Generation of Testing Metrics by Using Cluster Analysis of Bug Reports
Anna Gromova, Iosif Itkin, Sergey Pavlov
Exactpro
7-9 November, Tbilisi, Ivane Javakhishvili Tbilisi State University

Slide 2

Slide 2 text

Testing Metrics
Testing metrics are measurements of the defect management process, for example:
● time to fix,
● which defects get reopened,
● which defects get rejected,
● which defects get fixed,
● etc.

Slide 3

Slide 3 text

The Problem of Corresponding Metrics

Slide 4

Slide 4 text

Contribution
● An approach to generating testing metrics via clustering of defect reports;
● An enhanced set of defect report attributes;
● An empirical comparison of different clustering algorithms for the considered task.

Slide 5

Slide 5 text

Related Work
● Metrics prediction is an important task in defect management.
● It is not yet obvious which testing metrics suit which software development projects.
● Clustering makes it possible to reveal duplicates and related bugs.

Slide 6

Slide 6 text

Defect Attributes

Attribute (data type): values
● Priority (ordinal): "Blocker", "Critical", "Major", "Minor", "Optional", "Trivial"
● Resolution (categorical): "Cannot Reproduce", "Done", "Duplicate Issue", "Explained", "Migrated to another IS", "Out of Date", "Rejected", "Resolved at Apache", "Unresolved", "Won't Fix"
● Was_reopened (Boolean): {0; 1}
● Time to resolve (numeric)
● Count of comments (numeric)
● Count of attachments (numeric)
● Area_i (Boolean): {0; 1}

D = {d_1, d_2, ..., d_n}, where n is the number of defects in the project.
d_j = {Priority, Resolution, Was_reopened, Time to resolve, Count of attachments, Count of comments, Area_1, ..., Area_k}, where k is the number of the defined areas of testing.
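The feature vector d_j above can be assembled directly from an issue-tracker export. Below is a minimal sketch in Python/pandas, assuming hypothetical column names ("priority", "resolution", etc.) and toy data: priority gets an explicit ordinal encoding, resolution is one-hot encoded, and the areas of testing become boolean indicators.

```python
import pandas as pd

# Hypothetical raw defect reports (a simplified issue-tracker export).
raw = pd.DataFrame({
    "priority":     ["Major", "Minor", "Blocker"],
    "resolution":   ["Done", "Out of Date", "Rejected"],
    "was_reopened": [0, 1, 0],
    "time_to_resolve": [12.0, 3.5, 240.0],
    "count_of_comments": [4, 0, 17],
    "count_of_attachments": [1, 0, 3],
    "areas": [["security"], ["ui"], ["security", "performance"]],
})

# Priority is ordinal, so encode it with an explicit rank, not one-hot.
priority_order = ["Trivial", "Optional", "Minor", "Major", "Critical", "Blocker"]
raw["priority"] = raw["priority"].map(priority_order.index)

# Resolution is categorical: one-hot encode it into "Resolution_*" columns.
features = pd.get_dummies(raw.drop(columns=["areas"]),
                          columns=["resolution"], prefix="Resolution")

# Area_1 .. Area_k become boolean indicators, one per defined area of testing.
for area in sorted({a for row in raw["areas"] for a in row}):
    features[f"Area_{area}"] = raw["areas"].apply(lambda xs: int(area in xs))

print(features)
```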

Slide 7

Slide 7 text

Dataset Information
Projects: AS7 = JBoss Application Server 7, SOA = JBoss Enterprise SOA Platform, AP6 = JBoss Enterprise Application Platform 6.
● Number of defects: AS7 2944, SOA 2298, AP6 1066
● Number of areas of testing: AS7 15, SOA 10, AP6 15
● Number of reopened defects: AS7 253, SOA 510, AP6 408
● Time to resolve (min / max / mean): AS7 0 / 1792 / 39.535, SOA 0 / 1354 / 118.827, AP6 0 / 609 / 112.282
● Count of attachments (min / max / mean): AS7 0 / 15 / 0.215, SOA 0 / 13 / 0.327, AP6 0 / 17 / 0.280
● Count of comments (min / max / mean): AS7 0 / 83 / 2.909, SOA 1 / 34 / 4.535, AP6 1 / 126 / 7.280

Slide 8

Slide 8 text

Distribution of Defect Reports by the "Resolution" Attribute
[Charts for AS7, SOA, and AP6]

Slide 9

Slide 9 text

New Variables

"Resolution_Out of Date_new" merges:
● "Resolution_Out of Date",
● "Resolution_Deferred",
● "Resolution_Partially Completed".

"Resolution_Wont Fix_new" merges:
● "Resolution_Won't Fix",
● "Resolution_Cannot Reproduce",
● "Resolution_Incomplete Description".
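Continuing the sketch above, the merged "new" variables can be produced by OR-ing the corresponding one-hot resolution columns. The merge_resolutions helper is hypothetical; columns absent from a given project are simply skipped.

```python
# Hypothetical helper: OR several one-hot resolution columns into one merged
# variable; source columns missing from a given project are skipped.
def merge_resolutions(features, new_name, source_columns):
    present = [c for c in source_columns if c in features.columns]
    features[new_name] = features[present].max(axis=1) if present else 0
    return features.drop(columns=present)

features = merge_resolutions(features, "Resolution_Out of Date_new",
                             ["Resolution_Out of Date", "Resolution_Deferred",
                              "Resolution_Partially Completed"])
features = merge_resolutions(features, "Resolution_Wont Fix_new",
                             ["Resolution_Won't Fix", "Resolution_Cannot Reproduce",
                              "Resolution_Incomplete Description"])
```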

Slide 10

Slide 10 text

Distribution of Defect Reports by the "Area of Testing" Attribute
[Charts for AS7, SOA, and AP6]

Slide 11

Slide 11 text

Clustering
Algorithms:
● k-means
● EM
Validity criteria:
● Silhouette index
● Akaike's Information Criterion (AIC) / Bayesian Information Criterion (BIC)
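A minimal sketch of this setup with scikit-learn, where X is a synthetic stand-in for the standardized per-project feature matrix: k-means is validated with the silhouette index, while EM (GaussianMixture) is validated with AIC/BIC, matching the criteria listed above.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Stand-in for the standardized defect feature matrix built from d_1..d_n;
# replace with the real per-project data (AS7, SOA, AP6).
X = rng.normal(size=(500, 12))

for k in range(2, 9):
    # k-means, scored with the silhouette index (higher is better).
    km_labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    sil = silhouette_score(X, km_labels)
    # EM (Gaussian mixture), scored with AIC/BIC (lower is better).
    gm = GaussianMixture(n_components=k, random_state=0).fit(X)
    print(f"k={k}: silhouette={sil:.3f}  AIC={gm.aic(X):.1f}  BIC={gm.bic(X):.1f}")
```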

Slide 12

Slide 12 text

Results of the Validity Criteria for k-means
[Plots for AS7, SOA, and AP6]

Slide 13

Slide 13 text

Results of the Validity Criteria for EM
[Plots for AS7, SOA, and AP6]

Slide 14

Slide 14 text

Results of k-means Clustering

Types of bugs:
● "inexpensive-to-resolve": TTR, count of comments, and count of attachments <= mean; Minor; "Out of date" / "Migrated to another IS"
● "problematic": TTR, count of comments, and count of attachments >= mean; Major / Blocker; "Rejected" / "Cannot reproduce" / "Won't Fix"
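A hedged sketch of the interpretation step this labeling implies: cluster the numeric attributes, then compare each cluster's mean TTR, comment count, and attachment count against the project-wide mean. The data here is synthetic and stands in for the real defect reports.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Synthetic stand-in for the numeric defect attributes; replace with real data.
df = pd.DataFrame({
    "time_to_resolve": rng.exponential(40.0, 300),
    "count_of_comments": rng.poisson(3.0, 300),
    "count_of_attachments": rng.poisson(0.3, 300),
})
df["cluster"] = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(df)

# Ratio of each cluster's mean to the project-wide mean: rows well below 1
# suggest an "inexpensive-to-resolve" cluster, rows at or above 1 a
# "problematic" one (to be confirmed against Priority and Resolution).
relative = df.groupby("cluster").mean() / df.drop(columns="cluster").mean()
print(relative.round(2))
```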

Slide 15

Slide 15 text

Results of EM Clustering

Types of bugs:
● "inexpensive-to-resolve": TTR, count of comments, and count of attachments <= mean; Minor; "Out of date" / "Migrated to another IS"
● "expensive-to-resolve": TTR, count of comments, and count of attachments >= mean
● "invalid": TTR, count of comments, and count of attachments <= mean
● "underestimated": Major / Critical / Blocker; "Done"; was reopened

Slide 16

Slide 16 text

Determined Testing Metrics
● The number of "underestimated" defects: helps prevent Critical/Blocker bugs persisting in the system
● The number of "expensive-to-resolve" / "inexpensive-to-resolve" defects: can change the release decision
● The number of "invalid" defects: evaluates the quality of testing
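Continuing the sketch above, once the clusters have been interpreted, each metric reduces to counting the defects in the corresponding cluster. The cluster-to-label mapping below is an assumed outcome of the manual interpretation step, not a result from the slides.

```python
# Assumed outcome of the manual interpretation step: cluster 0 looked
# "inexpensive-to-resolve", cluster 1 "expensive-to-resolve".
cluster_label = {0: "inexpensive-to-resolve", 1: "expensive-to-resolve"}

# The metric value is the count (or share) of defects per interpreted cluster.
counts = df["cluster"].map(cluster_label).value_counts()
print(counts)                       # absolute numbers of defects per type
print((counts / len(df)).round(3))  # shares, e.g. to inform a release decision
```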

Slide 17

Slide 17 text

Excluded Testing Metrics
● The number of "longest-to-resolve" defects (a.k.a. outdated)
● The number of "accidentally reopened" defects

Slide 18

Slide 18 text

Generating Metrics via Cluster Interpretation

Slide 19

Slide 19 text

Conclusion
● Cluster analysis helps to understand the nature of software defects and to determine new testing metrics or improve existing ones;
● The EM algorithm gives a more detailed picture of the project;
● Using the enhanced attribute set helps to examine the clusters in detail.