Assessing Modularity using Co-Change Clusters (Modularity 2014) - Best Paper Award

The traditional modular structure defined by the package hierarchy suffers from the dominant decomposition problem, and it is widely accepted that alternative forms of modularization are necessary to increase developers' productivity. In this paper, we propose an alternative way to understand and assess package modularity based on co-change clusters, which are sets of classes that are highly inter-related by co-change relations. We evaluate how co-change clusters relate to the package decomposition of three real-world systems. The results show that the projection of co-change clusters onto packages follows a different pattern in each system. Therefore, we claim that modular views based on co-change clusters can improve developers' understanding of how well modularized their systems are, considering that modularity is the ability to confine changes and evolve components in parallel.

ASERG, DCC, UFMG

April 23, 2014

Transcript

  1. Assessing Modularity using Co-Change Clusters
    Luciana Lourdes Silva, Marco Túlio Valente, and Marcelo Maia
    Federal University of Minas Gerais, Brazil
    http://aserg.labsoft.dcc.ufmg.br/
    ACM Modularity, Lugano, Switzerland, April 22-25, 2014
  2. Standard Approach to Assess Modularity
    • Typical cohesion and coupling metrics
    • Some effects of coupling cannot be captured by structural coupling
    • Evolutionary view
  3. Our Approach
    • New approach for assessing modularity
    • Extract co-change graph
      – Co-change clusters
    • Distribution maps
      – Understand and assess package modularity
    • Similarity analysis
  4. Pre-Processing Tasks
    #1 – not associated with maintenance issues
    #2 – not changing classes
    #3 – merge commits
    #4 – tangled changes
    #5 – highly scattered commits
  5. #1 – Not Associated with Maintenance Issues
    Commit #10: changes two different maintenance tasks
    Commit #20: contains an unfinished task
    Commit #21: the rest of the previous task
  6. #2 – Not Changing Classes
    • There are commits that only change artifacts like:
      – Configuration files;
      – Documentation;
      – Script files.
    • We also eliminate unit testing classes
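The commit filters listed in slides 4 to 6 can be sketched roughly as follows. This is a minimal illustration, not the authors' tooling: the `Commit` record, the `.java` and test-path heuristics, and the `MAX_PACKAGES` threshold are all assumptions made for the example, and the detection of tangled changes (#4) is left out because the slides do not say how it is done.

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_PACKAGES = 10  # hypothetical cut-off for "highly scattered" commits


@dataclass
class Commit:
    """Hypothetical commit record; the field names are illustrative only."""
    id: str
    files: List[str]
    issue_id: Optional[str] = None  # linked maintenance issue, if any
    is_merge: bool = False


def changed_classes(commit: Commit) -> List[str]:
    """Source classes touched by the commit, ignoring unit test classes.

    Assumes Java sources and a naming heuristic for tests.
    """
    return [
        f for f in commit.files
        if f.endswith(".java")
        and "/test/" not in f
        and not f.endswith("Test.java")
    ]


def package_of(path: str) -> str:
    """Directory part of a path, used here as a stand-in for the package."""
    return path.rsplit("/", 1)[0] if "/" in path else ""


def keep_commit(commit: Commit, max_packages: int = MAX_PACKAGES) -> bool:
    """Return True if the commit survives filters #1, #2, #3 and #5."""
    if commit.issue_id is None:      # #1: not associated with a maintenance issue
        return False
    if commit.is_merge:              # #3: merge commit
        return False
    classes = changed_classes(commit)
    if not classes:                  # #2: no class changed (config, docs, scripts, tests)
        return False
    packages = {package_of(c) for c in classes}
    if len(packages) > max_packages:  # #5: highly scattered commit
        return False
    return True
```

A commit that only edits documentation or build scripts is dropped by the `changed_classes` check, which mirrors the intent of slide 6.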
  7. Co-Change Graph
    [Figure: co-change graph. Node labels: Class #3, Class #6, Class #7, Class #5, Class #2, Class #4, Class #1, Class #7. Edge weights: 1, 10, 20, 6, 8, 1, 1, 12, 9]
    Min_Weight = 2 co-changes
  8. Co-Change Graph Post-Processing
    [Figure: co-change graph. Node labels: Class #3, Class #6, Class #7, Class #5, Class #2, Class #4, Class #1, Class #7. Edge weights: 1, 10, 20, 6, 8, 1, 1, 12, 9]
  9. Co-Change Graph Post-Processing
    [Figure: co-change graph. Node labels: Class #3, Class #6, Class #7, Class #5, Class #2, Class #4, Class #1, Class #7. Edge weights: 10, 20, 6, 8, 12, 9]
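The graph construction and post-processing shown in slides 7 to 9 can be sketched as below. The sketch assumes that an edge weight counts the commits in which two classes changed together and that post-processing drops edges below `Min_Weight`, which is consistent with the weight-1 edges disappearing in the last figure. The function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_co_change_graph(commits):
    """Count, for every pair of classes, the commits that changed both.

    `commits` is an iterable of collections of class names, one per commit.
    Returns a Counter mapping (class_a, class_b) pairs to co-change counts,
    with each pair stored in sorted order so the graph is undirected.
    """
    weights = Counter()
    for classes in commits:
        for a, b in combinations(sorted(set(classes)), 2):
            weights[(a, b)] += 1
    return weights


def prune(weights, min_weight=2):
    """Keep only edges with at least `min_weight` co-changes."""
    return {edge: w for edge, w in weights.items() if w >= min_weight}


# Toy history: A and B change together twice, B and C only once.
history = [["A", "B"], ["B", "A"], ["B", "C"]]
graph = build_co_change_graph(history)
pruned = prune(graph, min_weight=2)  # only the A-B edge remains
```

Storing each pair in sorted order avoids counting `(A, B)` and `(B, A)` as different edges.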
  10. Modularity Analysis
    • Goal:
      – How can co-change clusters be used to assess the quality of package decompositions?
    • We rely on distribution maps:
      – Focus
      – Spread
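The slides name focus and spread without defining them. The sketch below uses the definitions commonly given for distribution maps, which is an assumption here: spread is the number of packages a cluster touches, and focus sums, over the packages, the product of the fraction of the package covered by the cluster and the fraction of the cluster lying in that package. Package and class names are made up.

```python
def touch(a, b):
    """Fraction of the classes in `b` that also belong to `a`."""
    return len(a & b) / len(b) if b else 0.0


def spread(cluster, packages):
    """Number of packages containing at least one class of the cluster."""
    return sum(1 for classes in packages.values() if cluster & classes)


def focus(cluster, packages):
    """Close to 1 when the cluster dominates the packages it touches."""
    return sum(
        touch(cluster, classes) * touch(classes, cluster)
        for classes in packages.values()
    )


# Hypothetical decomposition: two packages, five classes.
packages = {
    "p1": {"A", "B"},
    "p2": {"C", "D", "E"},
}

encapsulated = {"A", "B"}   # covers all of p1 and nothing else
crosscutting = {"A", "C"}   # one class from each package

assert spread(encapsulated, packages) == 1
assert focus(encapsulated, packages) == 1.0
assert spread(crosscutting, packages) == 2
```

Under these definitions, a cluster with spread 1 and focus 1 coincides with a single package, while a cluster with high spread and low focus cuts across many packages.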
  11. Summing Up
    Well Encapsulated Clusters
      – Geronimo: 45%
    Crosscutting Clusters
      – Lucene: 24.5%
      – JDT Core: 42%
    Clusters Partially Encapsulated
      – Suggests a possible ripple effect
      – Geronimo: 4 clusters, Lucene: 5 clusters
  12. Similarity Analysis
    • Goal:
      – Improve the understanding of the clusters' meaning.
    • We rely on:
      – Pre-processing text
      – LSA (Latent Semantic Analysis)
      – Cosine similarity
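The last step of this pipeline, cosine similarity between two issue texts, can be illustrated as follows. This sketch compares raw term-frequency vectors and deliberately omits the LSA projection that the slides mention, so it shows only the similarity measure, not the full method. The tokenizer is a simplistic stand-in for the pre-processing step.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens (simplified pre-processing)."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the term-frequency vectors of two texts."""
    a = Counter(tokenize(text_a))
    b = Counter(tokenize(text_b))
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # at least one text has no usable tokens
    return dot / (norm_a * norm_b)


# Made-up issue summaries, for illustration only.
related = cosine_similarity("index writer fails on commit",
                            "commit fails in index writer")
unrelated = cosine_similarity("index writer fails", "update documentation")
```

In an LSA setting the same cosine would be computed on the reduced-dimension document vectors rather than on raw term counts.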
  13. Semantic Similarity Analysis
    • Spearman Correlation Test
    Geronimo: the higher the #similarIssues in a cluster, the higher the focus
    Lucene: the higher the #similarIssues in a cluster, the lower the spread
  14. Semantic Similarity Analysis
    • Spearman Correlation Test
    JDT Core: induces different properties in the clusters, in both spread and focus.
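For readers unfamiliar with the test, Spearman's coefficient is the Pearson correlation computed on ranks. The sketch below is a plain-Python illustration of that definition, with tied values receiving their average rank. The sample numbers are invented and do not come from the study.

```python
import math


def ranks(values):
    """Ranks starting at 1; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation between two equally long sequences."""
    return pearson(ranks(x), ranks(y))


# Invented per-cluster values: number of similar issues and focus.
similar_issues = [3, 8, 15, 21]
cluster_focus = [0.2, 0.5, 0.6, 0.9]
rho = spearman(similar_issues, cluster_focus)  # both rise together, so rho is close to 1
```

A positive coefficient corresponds to statements such as "the higher the #similarIssues, the higher the focus", and a negative one to "the lower the spread".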
  15. Summary
    • Method to extract an alternative view
    • Co-change graphs are sparse
    • But co-change clusters are dense
      – High internal similarity concerning co-changes
      – Semantic similarity concerning their issue reports
      – Co-change cluster patterns
  16. Further Work
    • Consider co-change at a finer granularity level
    • Include non-source code artifacts
    • Can clusters be used as an alternative to the Package Explorer?