Refactoring Graphs: Assessing Refactoring over Time

Refactoring is an essential activity during software evolution. Frequently, practitioners rely on such transformations to improve source code maintainability and quality. As a consequence, this process may produce new source code entities or change the structure of existing ones. Sometimes, the transformations are atomic, i.e., performed in a single commit. In other cases, they generate sequences of modifications performed over time. To study and reason about refactorings over time, in this paper, we propose a novel concept called refactoring graphs and provide an algorithm to build such graphs. Then, we investigate the history of 10 popular open-source Java-based projects. After eliminating trivial graphs, we characterize a large sample of 1,150 refactoring graphs, providing quantitative data on their size, commits, age, refactoring composition, and developers. We conclude by discussing applications and implications of refactoring graphs, for example, to improve code comprehension, detect refactoring patterns, and support software evolution studies.

ASERG, DCC, UFMG

February 19, 2020

Transcript

  1. Motivation: Refactoring is an essential activity during software evolution. Topics: refactoring engines, motivation, benefits and challenges, refactoring over time.
  7. Example of Refactoring Subgraph: method A() from class Foo, class Foo{ A(){…} }. Two developers: Alice and Bob.
  8. Example of Refactoring Subgraph: class Foo{ A(){…} } --MOVE--> class Bar{ A(){…} }. Alice moved method A() from class Foo to Bar.
  9. Example of Refactoring Subgraph: class Foo{ A(){…} } --MOVE--> class Bar{ A(){…} } --RENAME--> class Bar{ B(){…} }. Six days later, Bob renamed method A() to B().
  10. Example of Refactoring Subgraph: these operations create a refactoring subgraph over time.
  11. Example of Refactoring Subgraph: the refactoring subgraph contains three vertices.
  12. Example of Refactoring Subgraph: the refactoring subgraph contains two edges (MOVE and RENAME).
  13. Example of Refactoring Subgraph: the edge represents the refactoring operation.
  14. Example of Refactoring Subgraph: method A() from class Foo and package util is written util.Foo#A().
  15. Example of Refactoring Subgraph: a refactoring subgraph can include refactorings performed by one or more developers.
  16. Example of Refactoring Subgraph: the subgraph contains refactorings performed by two authors.
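
The Alice and Bob example above can be written down as a small data structure: methods are vertices, and each refactoring is a directed edge that records the operation and its author. The sketch below is only an illustration under stated assumptions, not the authors' implementation. The class and method names (`RefactoringSubgraph`, `addRefactoring`, and so on) are hypothetical, and class Bar is assumed to live in package `util`, like Foo.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of the slides' example; all names are assumptions.
class RefactoringSubgraph {
    // A directed edge: one refactoring operation applied to a method.
    static final class Edge {
        final String from;
        final String to;
        final String operation;
        final String author;

        Edge(String from, String to, String operation, String author) {
            this.from = from;
            this.to = to;
            this.operation = operation;
            this.author = author;
        }
    }

    private final List<Edge> edges = new ArrayList<>();

    void addRefactoring(String from, String to, String operation, String author) {
        edges.add(new Edge(from, to, operation, author));
    }

    // Vertices are the distinct method signatures touched by any edge.
    Set<String> vertices() {
        Set<String> result = new LinkedHashSet<>();
        for (Edge e : edges) {
            result.add(e.from);
            result.add(e.to);
        }
        return result;
    }

    int edgeCount() {
        return edges.size();
    }

    // Distinct developers who performed at least one refactoring.
    Set<String> authors() {
        Set<String> result = new LinkedHashSet<>();
        for (Edge e : edges) {
            result.add(e.author);
        }
        return result;
    }

    // The example from the slides: Alice moves A(), then Bob renames it to B().
    // Assumption: Bar is in package util.
    static RefactoringSubgraph example() {
        RefactoringSubgraph g = new RefactoringSubgraph();
        g.addRefactoring("util.Foo#A()", "util.Bar#A()", "MOVE", "Alice");
        g.addRefactoring("util.Bar#A()", "util.Bar#B()", "RENAME", "Bob");
        return g;
    }
}
```

Calling `RefactoringSubgraph.example()` yields a subgraph with three vertices, two edges, and two authors, matching the counts stated on the slides.
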
  17. RefDiff: a multi-language refactoring detection tool that identifies refactorings between two versions of a git-based project. Reference: RefDiff 2.0: A Multi-language Refactoring Detection Tool, TSE, 2020.
  18. RefDiff: we center on eight refactorings at the method level: Rename, Extract, Move, Extract and Move, Rename and Move, Push Down, Inline, Pull Up.
  19. Rename Method: util.Foo#a() to util.Foo#b(). Move Method: util.Foo#m() to util.Bar#m() (change in method’s class). Move and Rename Method: util.Foo#a() to util.Bar#b().
  20. Move Method: util.Foo#m() to util.Bar#m(). Rename Method: util.Foo#a() to util.Foo#b(). Move and Rename Method: util.Foo#a() to util.Bar#b() (change in method’s name and class).
  21. Push Down Method: util.SuperFoo#m() to util.SubFoo1#m(), util.SubFoo2#m(), and util.SubFooi#m(). Pull Up Method: util.SubFoo1#m(), util.SubFoo2#m(), and util.SubFooi#m() to util.SuperFoo#m().
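
The Rename, Move, and Move and Rename cases above differ only in which part of the qualified signature changes. The following sketch shows that distinction for signatures written as `package.Class#method()`. The helper `classify` is hypothetical and is not part of RefDiff. It only compares names, so it cannot tell Move from Pull Up or Push Down, which also depend on the class hierarchy, and it assumes every signature contains a `#`.

```java
// Hypothetical helper for illustration; not RefDiff's API or detection logic.
class SignatureChange {
    // Compares two signatures of the form "util.Foo#a()" and names the change.
    // Assumption: both arguments contain exactly one '#' separator.
    static String classify(String before, String after) {
        String[] b = before.split("#", 2); // [class part, method part]
        String[] a = after.split("#", 2);
        boolean classChanged = !b[0].equals(a[0]);
        boolean nameChanged = !b[1].equals(a[1]);
        if (classChanged && nameChanged) {
            return "MOVE_AND_RENAME";
        }
        if (classChanged) {
            return "MOVE";
        }
        if (nameChanged) {
            return "RENAME";
        }
        return "NONE";
    }
}
```

For the signatures on the slides, `classify("util.Foo#a()", "util.Foo#b()")` gives `RENAME`, `classify("util.Foo#m()", "util.Bar#m()")` gives `MOVE`, and `classify("util.Foo#a()", "util.Bar#b()")` gives `MOVE_AND_RENAME`.
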
  22. Dataset: 10 popular Java projects in terms of stars on GitHub (100+ Java files, 1K+ commits).
  23. Building Refactoring Graphs: we implement a set of scripts to build refactoring graphs (input, scripts, output).
  24. Building Refactoring Graphs (algorithm): creation of a directed edge representing this refactoring.
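
One way to picture the construction step is sketched below: every detected refactoring adds a directed edge between two method signatures, and the refactoring subgraphs are then read off as groups of methods connected by those edges. This is a minimal sketch, not the authors' scripts. Grouping by connectivity while ignoring edge direction is an assumption here, and the names `RefactoringGraphBuilder`, `addRefactoring`, and `subgraphs` are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; assumes subgraphs are connected groups of methods.
class RefactoringGraphBuilder {
    // Directed edges stored as {from, to, operation}.
    private final List<String[]> edges = new ArrayList<>();
    // Undirected adjacency, used only to find connected groups of methods.
    private final Map<String, Set<String>> neighbors = new LinkedHashMap<>();

    // Creates a directed edge representing one refactoring.
    void addRefactoring(String from, String to, String operation) {
        edges.add(new String[] {from, to, operation});
        neighbors.computeIfAbsent(from, k -> new HashSet<>()).add(to);
        neighbors.computeIfAbsent(to, k -> new HashSet<>()).add(from);
    }

    int edgeCount() {
        return edges.size();
    }

    // Returns the vertex set of each subgraph, found by breadth-first search.
    List<Set<String>> subgraphs() {
        List<Set<String>> result = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        for (String start : neighbors.keySet()) {
            if (!seen.add(start)) {
                continue; // already assigned to an earlier subgraph
            }
            Set<String> component = new HashSet<>();
            Deque<String> queue = new ArrayDeque<>();
            queue.add(start);
            while (!queue.isEmpty()) {
                String v = queue.poll();
                component.add(v);
                for (String n : neighbors.get(v)) {
                    if (seen.add(n)) {
                        queue.add(n);
                    }
                }
            }
            result.add(component);
        }
        return result;
    }
}
```

Adding the MOVE and RENAME edges from the Foo and Bar example plus an unrelated MOVE of `util.Foo#m()` produces two subgraphs: one with three vertices and one with two.
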
  25. Number of vertices by refactoring subgraph: the most frequent cases are subgraphs with three vertices (639 occurrences, 56%).
  26. Number of edges by refactoring subgraph: the most frequent cases are subgraphs with two edges (772 occurrences, 67%).
  27. Refactoring subgraph from MPAndroidChart (edges: RENAME, EXTRACT, EXTRACT AND MOVE, EXTRACT AND MOVE): a developer renamed drawYLegend() to drawYLabels().
  28. Refactoring subgraph from MPAndroidChart: 13 days later, the same developer extracted a new method.
  29. Refactoring subgraph from MPAndroidChart: two days later, the developer made new extractions to another class.
  30. Refactoring subgraph from MPAndroidChart: three commits (C1, C2, C3).
  31. Refactoring subgraph from Elasticsearch (edges: MOVE, MOVE, EXTRACT, EXTRACT, EXTRACT): in commit C1, a developer moved two methods to another class.
  32. Refactoring subgraph from Elasticsearch: three months later, in commit C2, a second developer extracted duplicated code from three methods.
  33. Refactoring subgraph from Elasticsearch: two authors were responsible for this refactoring subgraph.
  34. Refactoring subgraph from Spring Framework (edges: RENAME, RENAME): in commit C1, a developer renamed method before(...) to filterBefore(...).
  35. Refactoring subgraph from Spring Framework: six days later, the same developer reverted the operation, renaming filterBefore(...) to before(...).
  36. Refactoring subgraph from Spring Framework: a single developer was responsible for this refactoring subgraph.
  37. Two groups. Homogeneous: subgraphs with a single refactoring operation. Heterogeneous: subgraphs with two or more distinct refactoring operations.
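
The two groups can be told apart by counting the distinct operation types among a subgraph's edges. The check below is a small sketch of that rule. The class and method names are hypothetical, and operations are represented as plain strings for brevity.

```java
import java.util.HashSet;
import java.util.List;

// Hypothetical helper illustrating the homogeneous/heterogeneous split.
class SubgraphKind {
    // True when every edge carries the same refactoring operation.
    // An empty list has no operation at all, so it is not homogeneous.
    static boolean isHomogeneous(List<String> operations) {
        return new HashSet<>(operations).size() == 1;
    }

    // True when the edges carry two or more distinct operations.
    static boolean isHeterogeneous(List<String> operations) {
        return new HashSet<>(operations).size() >= 2;
    }
}
```

A subgraph whose edges are four EXTRACT operations is homogeneous, while one that mixes RENAME, MOVE, and EXTRACT edges is heterogeneous.
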
  38. Homogeneous refactoring subgraph from Facebook Fresco (edges: EXTRACT, EXTRACT, EXTRACT, EXTRACT): a developer extracted method fetchDecodedImage(...).
  39. Homogeneous refactoring subgraph from Facebook Fresco: years later, a second developer made two new extract operations.
  40. Refactoring subgraph from Square Okhttp (edges: RENAME, RENAME, RENAME, MOVE, EXTRACT, EXTRACT, EXTRACT): in commit C1, a developer renamed three methods.
  41. Refactoring subgraph from Square Okhttp: in commit C2, a second developer extracted method checkDuration(...)
  42. Refactoring subgraph from Square Okhttp (commits C2 and C3): … moving to a new class named Util.
  43. Refactoring subgraph from Square Okhttp: seven refactoring operations.
  44. Refactoring subgraph from Square Okhttp: three commits (C1, C2, C3).
  45. Large refactoring subgraph from Square Okhttp: a developer performed 21 extract method operations to move this duplicated code to a single method. public int readInt() throws IOException { require(4, Deadline.NONE); return buffer.readInt(); }
  46. Refactoring-aware Software Evolution: refactoring graphs are a key data structure to improve the results of current software evolution tools.
  47. Example: Git Blame. Bob creates a method to calculate the area of a square. class Math{ + float squareArea(float l){ + return l * l; + } }
  48. Example: Git Blame. Git-blame shows Bob as the creator.
  49. Example: Git Blame. Bob introduces a bug in a second commit. class Math{ float squareArea(float l){ + return l * l * l; } }
  50. Example: Git Blame. Git-blame shows Bob as responsible for the last change (the bug).
  51. Example: Git Blame. Alice moves the method to a utility class. class Math{ - float squareArea(float l){ - return l * l * l; - } } class Utility{ + float squareArea(float l){ + return l * l * l; + } }
  52. Example: Git Blame. Git-blame shows Alice as the creator of method squareArea.
  53. Example: Git Blame. Three versions: class Math{ float squareArea(float l){ return l * l; } }, then class Math{ float squareArea(float l){ return l * l * l; } }, then class Utility{ float squareArea(float l){ return l * l * l; } }. Bob is the real creator of squareArea().
  54. Example: Git Blame. Bob is responsible for the bug.
  55. Example: Git Blame. Git-blame may miss relevant data due to refactoring operations.
  56. Example: Git Blame. A refactoring history can improve existing tools and techniques.
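
To make the git-blame example concrete, the sketch below walks refactoring edges backwards from a method's current signature to its oldest known one, then reports who introduced that original. It is a simplified illustration, not a real blame tool and not the authors' implementation. All names are hypothetical, the signature format `Math#squareArea(float)` is an assumption, and each method is assumed to have at most one predecessor, which does not hold for operations such as Extract.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a refactoring-aware lookup; all names are assumptions.
class RefactoringAwareBlame {
    // Maps a method signature to the signature it was refactored from.
    private final Map<String, String> refactoredFrom = new HashMap<>();
    // Maps a signature to the author of the commit where it first appeared,
    // which is what a line-based blame would report.
    private final Map<String, String> introducedBy = new HashMap<>();

    void recordIntroduction(String method, String author) {
        introducedBy.put(method, author);
    }

    void recordRefactoring(String from, String to) {
        refactoredFrom.put(to, from);
    }

    // Follows refactoring edges backwards to the oldest known signature.
    // The step counter bounds the walk so a rename that is later reverted
    // (a cycle) cannot loop forever.
    String originalCreator(String method) {
        String current = method;
        int steps = 0;
        while (refactoredFrom.containsKey(current) && steps++ < refactoredFrom.size()) {
            current = refactoredFrom.get(current);
        }
        return introducedBy.get(current);
    }
}
```

With Bob recorded as introducing `Math#squareArea(float)`, Alice recorded for `Utility#squareArea(float)`, and one Move edge between them, `originalCreator("Utility#squareArea(float)")` returns Bob rather than Alice.
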
  57. Refactoring subgraphs (RQ1 to RQ5): are small; have up to three commits; are often heterogeneous; are mostly created by a single developer; span from a few days to months.