Slide 1

Slide 1 text

Refactoring Graphs: Assessing Refactoring over Time Aline Brito, Andre Hora, Marco Tulio Valente IEEE SANER 2020

Slide 2

Slide 2 text

Motivation Refactoring is an essential activity during software evolution 2 Refactoring engines Motivation Benefits and challenges Refactoring over time

Slide 8

Slide 8 text

“Refactoring takes time...” Fowler, 1999 8

Slide 9

Slide 9 text

Refactoring Graph 9

Slide 10

Slide 10 text

A refactoring graph is a set of disconnected subgraphs 10 Refactoring Graph Software history

Slide 11

Slide 11 text

Example of Refactoring Subgraph 11

Slide 12

Slide 12 text

Example of Refactoring Subgraph class Foo{ A(){…} } Method A() from class Foo 12

Slide 13

Slide 13 text

Example of Refactoring Subgraph class Foo{ A(){…} } Method A() from class Foo Alice Bob Two developers 13

Slide 14

Slide 14 text

Example of Refactoring Subgraph class Foo{ A(){…} } class Bar{ A(){…} } MOVE Alice moved method A() from class Foo to Bar 14

Slide 15

Slide 15 text

Example of Refactoring Subgraph class Foo{ A(){…} } class Bar{ A(){…} } class Bar{ B(){…} } MOVE RENAME Six days later, Bob renamed method A() to B() 15

Slide 16

Slide 16 text

Example of Refactoring Subgraph class Foo{ A(){…} } class Bar{ A(){…} } class Bar{ B(){…} } MOVE These operations create a refactoring subgraph over time RENAME 16

Slide 17

Slide 17 text

Example of Refactoring Subgraph class Foo{ A(){…} } class Bar{ A(){…} } class Bar{ B(){…} } MOVE The refactoring subgraph contains three vertices RENAME 17

Slide 18

Slide 18 text

Example of Refactoring Subgraph MOVE RENAME The refactoring subgraph contains two edges class Foo{ A(){…} } class Bar{ A(){…} } class Bar{ B(){…} } 18

Slide 19

Slide 19 text

Example of Refactoring Subgraph class Foo{ A(){…} } class Bar{ A(){…} } class Bar{ B(){…} } MOVE The edge represents the refactoring operation RENAME 19

Slide 20

Slide 20 text

Example of Refactoring Subgraph MOVE util.Foo#A() util.Bar#A() util.Bar#B() The vertices are the full signature of methods RENAME 20

Slide 21

Slide 21 text

Example of Refactoring Subgraph class Bar{ A(){…} } class Bar{ B(){…} } MOVE Method A() from class Foo and package util util.Foo#A() RENAME 21

Slide 22

Slide 22 text

Example of Refactoring Subgraph class Foo{ A(){…} } class Bar{ A(){…} } class Bar{ B(){…} } MOVE A refactoring subgraph can include refactorings performed by one or more developers RENAME 22

Slide 23

Slide 23 text

Example of Refactoring Subgraph class Foo{ A(){…} } class Bar{ A(){…} } class Bar{ B(){…} } MOVE The subgraph contains refactorings performed by two authors RENAME 23
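The Foo/Bar example can be written down as a small data structure. The following is a minimal Java sketch under stated assumptions, not the authors' implementation: the class name `ExampleSubgraph`, the `Edge` fields, and the string labels are hypothetical and chosen only to mirror the slides (vertices are full method signatures, edges are refactoring operations, each edge has an author).

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical model of the Foo/Bar example; all names are illustrative. */
final class ExampleSubgraph {

    /** A directed edge: one refactoring operation between two method signatures. */
    static final class Edge {
        final String from;      // full signature before the refactoring
        final String to;        // full signature after the refactoring
        final String operation; // e.g. "MOVE" or "RENAME"
        final String author;    // developer who performed the operation

        Edge(String from, String to, String operation, String author) {
            this.from = from;
            this.to = to;
            this.operation = operation;
            this.author = author;
        }
    }

    final List<Edge> edges = new ArrayList<>();

    void add(String from, String to, String operation, String author) {
        edges.add(new Edge(from, to, operation, author));
    }

    /** Vertices are the distinct method signatures touched by any edge. */
    Set<String> vertices() {
        Set<String> result = new LinkedHashSet<>();
        for (Edge e : edges) {
            result.add(e.from);
            result.add(e.to);
        }
        return result;
    }

    /** Distinct developers who contributed at least one edge. */
    Set<String> authors() {
        Set<String> result = new LinkedHashSet<>();
        for (Edge e : edges) {
            result.add(e.author);
        }
        return result;
    }

    /** The subgraph from the example: Alice moves A(), then Bob renames it to B(). */
    static ExampleSubgraph fromSlides() {
        ExampleSubgraph g = new ExampleSubgraph();
        g.add("util.Foo#A()", "util.Bar#A()", "MOVE", "Alice");
        g.add("util.Bar#A()", "util.Bar#B()", "RENAME", "Bob");
        return g;
    }
}
```

Calling `ExampleSubgraph.fromSlides()` and inspecting `vertices()`, `edges`, and `authors()` gives the three vertices, two edges, and two authors described in the example.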

Slide 24

Slide 24 text

Outline 1. RefDiff Tool 2. Dataset 3. Building refactoring graphs 4. Results 24

Slide 25

Slide 25 text

RefDiff Tool 25

Slide 26

Slide 26 text

RefDiff A multi-language refactoring detection tool Refactorings between two versions of a git-based project RefDiff 2.0: A Multi-language Refactoring Detection Tool TSE, 2020 26

Slide 27

Slide 27 text

RefDiff Rename Extract Move Extract and Move Rename and Move Push Down Inline Pull Up We focus on eight refactorings at the method level 27

Slide 28

Slide 28 text

Rename Method util.Foo#a() util.Foo#b() util.Foo#m() util.Bar#m() util.Foo#a() util.Bar#b() Move and Rename Method Move Method Most trivial operations 28

Slide 29

Slide 29 text

Rename Method util.Foo#a() util.Foo#b() util.Foo#m() util.Bar#m() util.Foo#a() util.Bar#b() Move and Rename Method Move Method Change in method’s name 29

Slide 30

Slide 30 text

Rename Method util.Foo#a() util.Foo#b() util.Foo#a() util.Bar#b() Move and Rename Method Change in method’s class util.Foo#m() util.Bar#m() Move Method 30

Slide 31

Slide 31 text

util.Foo#m() util.Bar#m() Move Method Change in method’s name and class Rename Method util.Foo#a() util.Foo#b() util.Foo#a() util.Bar#b() Move and Rename Method 31

Slide 32

Slide 32 text

Push Down Method util.SubFoo2#m() util.SuperFoo#m() util.SubFoo1#m() util.SubFooi#m() Pull up Method util.SubFoo2#m() util.SuperFoo#m() util.SubFoo1#m() util.SubFooi#m() 32

Slide 33

Slide 33 text

Push Down Method util.SubFoo2#m() util.SuperFoo#m() util.SubFoo1#m() util.SubFooi#m() Moving a method from a superclass to one or more subclasses 33

Slide 34

Slide 34 text

Pull up Method util.SubFoo2#m() util.SuperFoo#m() util.SubFoo1#m() util.SubFooi#m() Moving one or more methods from subclasses to a superclass 34

Slide 35

Slide 35 text

Extract Method util.Foo#m2() util.Foo#m() util.Foo#m1() util.Foo#mi() util.Foo#m2() util.Foo#m() util.Foo#m1() util.Foo#mi() 35

Slide 36

Slide 36 text

Extract Method util.Foo#m2() util.Foo#m() util.Foo#m1() util.Foo#mi() Extracting multiple methods from a single method 36

Slide 37

Slide 37 text

Extract Method util.Foo#m2() util.Foo#m() util.Foo#m1() util.Foo#mi() Extracting a single method from multiple methods 37

Slide 38

Slide 38 text

Inline Method util.Bar2#m2() util.Foo#m() util.Bar1#m1() util.Bari#mi() Extract and Move Method util.Foo#m2() util.Bar#m() util.Foo#m1() util.Foo#mi() 38

Slide 39

Slide 39 text

Extract and Move Method util.Foo#m2() util.Bar#m() util.Foo#m1() util.Foo#mi() Extracting a method to a distinct class 39

Slide 40

Slide 40 text

Inline Method util.Bar2#m2() util.Foo#m() util.Bar1#m1() util.Bari#mi() Removal of trivial elements and replacement of the respective calls 40

Slide 41

Slide 41 text

Dataset 41

Slide 42

Slide 42 text

Dataset 10 popular Java projects in terms of stars on GitHub 42

Slide 44

Slide 44 text

Dataset + 100 Java files + 1K commits 10 popular Java projects in terms of stars on GitHub 44

Slide 45

Slide 45 text

Building Refactoring Graphs 45

Slide 46

Slide 46 text

Building Refactoring Graphs 46 Scripts INPUT OUTPUT We implement a set of scripts to build refactoring graphs

Slide 47

Slide 47 text

Building Refactoring Graphs 47 Algorithm INPUT OUTPUT The input comprises a list of refactorings

Slide 48

Slide 48 text

Building Refactoring Graphs 48 INPUT OUTPUT Identification of each refactoring and the two methods involved Algorithm

Slide 49

Slide 49 text

Building Refactoring Graphs 49 INPUT OUTPUT Creation of a directed edge representing this refactoring Algorithm

Slide 50

Slide 50 text

Building Refactoring Graphs 50 Algorithm INPUT OUTPUT The output includes sets of refactoring subgraphs in text format
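The steps above (take a list of refactorings, identify the two methods involved in each, add a directed edge, and output the resulting subgraphs) can be sketched as follows. This is an illustrative Java sketch, not the scripts used in the study: the `Refactoring` class, the `subgraphs` method, and the use of union-find to group edges into disconnected subgraphs are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical builder: groups detected refactorings into disconnected subgraphs. */
final class RefactoringGraphBuilder {

    /** One detected refactoring: an operation linking two method signatures. */
    static final class Refactoring {
        final String before;
        final String after;
        final String operation;

        Refactoring(String before, String after, String operation) {
            this.before = before;
            this.after = after;
            this.operation = operation;
        }

        @Override
        public String toString() {
            return before + " --" + operation + "--> " + after;
        }
    }

    /** Union-find lookup with path compression; unknown vertices become their own root. */
    private static String find(Map<String, String> parent, String v) {
        parent.putIfAbsent(v, v);
        String p = parent.get(v);
        if (!p.equals(v)) {
            p = find(parent, p);
            parent.put(v, p);
        }
        return p;
    }

    /** Returns one list of edges per subgraph, in order of first appearance. */
    static List<List<Refactoring>> subgraphs(List<Refactoring> refactorings) {
        Map<String, String> parent = new HashMap<>();
        // Each refactoring is a directed edge; for grouping, direction is ignored.
        for (Refactoring r : refactorings) {
            String rootBefore = find(parent, r.before);
            String rootAfter = find(parent, r.after);
            parent.put(rootBefore, rootAfter);
        }
        // Collect the edges of each connected component.
        Map<String, List<Refactoring>> groups = new LinkedHashMap<>();
        for (Refactoring r : refactorings) {
            String root = find(parent, r.before);
            groups.computeIfAbsent(root, k -> new ArrayList<>()).add(r);
        }
        return new ArrayList<>(groups.values());
    }
}
```

Printing each returned list gives a simple text representation of a subgraph, one edge per line. Union-find is only one way to find the disconnected subgraphs; a depth-first traversal over an adjacency list would work equally well.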

Slide 51

Slide 51 text

Building Refactoring Graphs 51 We detected a total of 8,926 refactoring subgraphs

Slide 52

Slide 52 text

Building Refactoring Graphs 52 We assess 1,150 (13%) refactoring subgraphs with more than one commit

Slide 53

Slide 53 text

Results 53

Slide 54

Slide 54 text

RQ1: What is the size of refactoring subgraphs? 54

Slide 55

Slide 55 text

Number of vertices by refactoring subgraph 55

Slide 56

Slide 56 text

Number of vertices by refactoring subgraph Number of vertices ranges from two to four (85%) 56

Slide 57

Slide 57 text

Number of vertices by refactoring subgraph The most frequent cases are subgraphs with three vertices (639 occurrences, 56%) 57

Slide 58

Slide 58 text

Number of edges by refactoring subgraph 58

Slide 59

Slide 59 text

Number of edges by refactoring subgraph Number of edges ranges between two and three (83%) 59

Slide 60

Slide 60 text

Number of edges by refactoring subgraph The most frequent cases are subgraphs with two edges (772 occurrences, 67%) 60

Slide 61

Slide 61 text

Refactoring subgraph from MPAndroidChart RENAME EXTRACT EXTRACT AND MOVE EXTRACT AND MOVE 61

Slide 62

Slide 62 text

Refactoring subgraph from MPAndroidChart RENAME EXTRACT EXTRACT AND MOVE EXTRACT AND MOVE A developer renamed drawYLegend() to drawYLabels() 62

Slide 63

Slide 63 text

Refactoring subgraph from MPAndroidChart EXTRACT EXTRACT AND MOVE EXTRACT AND MOVE RENAME 13 days later The same developer extracted a new method 63

Slide 64

Slide 64 text

Refactoring subgraph from MPAndroidChart RENAME EXTRACT EXTRACT AND MOVE EXTRACT AND MOVE Two days later The developer made new extractions to another class 64

Slide 65

Slide 65 text

Refactoring subgraph from MPAndroidChart RENAME EXTRACT EXTRACT AND MOVE EXTRACT AND MOVE Only one developer 65

Slide 66

Slide 66 text

Refactoring subgraph from MPAndroidChart RENAME EXTRACT EXTRACT AND MOVE EXTRACT AND MOVE Five vertices 66

Slide 67

Slide 67 text

Refactoring subgraph from MPAndroidChart RENAME EXTRACT EXTRACT AND MOVE EXTRACT AND MOVE Four edges 67

Slide 68

Slide 68 text

Refactoring subgraph from MPAndroidChart RENAME EXTRACT EXTRACT AND MOVE EXTRACT AND MOVE Three commits Commit C1 Commit C2 Commit C3 68

Slide 69

Slide 69 text

RQ2: How many commits are there in refactoring subgraphs? 69

Slide 70

Slide 70 text

Number of commits by refactoring subgraph 70

Slide 71

Slide 71 text

Number of commits by refactoring subgraph Most refactoring subgraphs are created in two or three commits (95%) 71

Slide 72

Slide 72 text

Number of commits by refactoring subgraph The most frequent case has two commits (81%) 72

Slide 73

Slide 73 text

Refactoring subgraph from Elasticsearch MOVE MOVE EXTRACT EXTRACT EXTRACT Commit C1 Commit C2 73

Slide 74

Slide 74 text

Refactoring subgraph from Elasticsearch MOVE MOVE EXTRACT EXTRACT EXTRACT Commit C1 A developer moved two methods to another class 74

Slide 75

Slide 75 text

Refactoring subgraph from Elasticsearch MOVE MOVE EXTRACT EXTRACT EXTRACT A second developer extracted duplicated code from three methods Three months later Commit C2 75

Slide 76

Slide 76 text

Refactoring subgraph from Elasticsearch MOVE MOVE EXTRACT EXTRACT EXTRACT Two of these methods are the ones moved earlier 76

Slide 77

Slide 77 text

Refactoring subgraph from Elasticsearch MOVE MOVE EXTRACT EXTRACT EXTRACT Two authors were responsible for this refactoring subgraph 77

Slide 78

Slide 78 text

Refactoring subgraph from Elasticsearch MOVE MOVE EXTRACT EXTRACT EXTRACT Commit C1 Commit C2 Two commits 78

Slide 79

Slide 79 text

RQ3: What is the age of refactoring subgraphs? 79

Slide 80

Slide 80 text

Age of the refactoring subgraphs Most recent commit Oldest commit Age 80
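The age definition (most recent commit minus oldest commit) is a one-line computation. A minimal Java sketch, assuming each subgraph is available as a list of commit timestamps; the class and method names are hypothetical:

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper: age of a refactoring subgraph from its commit timestamps. */
final class SubgraphAge {

    /** Age = timestamp of the most recent commit minus timestamp of the oldest commit. */
    static Duration age(List<Instant> commitTimes) {
        if (commitTimes.isEmpty()) {
            throw new IllegalArgumentException("a subgraph needs at least one commit");
        }
        Instant oldest = Collections.min(commitTimes);
        Instant newest = Collections.max(commitTimes);
        return Duration.between(oldest, newest);
    }
}
```

For example, two commits made six days apart yield `age(...).toDays() == 6`. A subgraph built from a single commit has age zero under this definition.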

Slide 81

Slide 81 text

Age of the refactoring subgraphs 81

Slide 82

Slide 82 text

Age of the refactoring subgraphs 67% of the refactoring subgraphs are more than one month old 82

Slide 83

Slide 83 text

Age of the refactoring subgraphs Some subgraphs are only a few days old 83

Slide 84

Slide 84 text

Age of the refactoring subgraphs Most subgraphs are weeks or even months old 84

Slide 85

Slide 85 text

Refactoring subgraph from Spring Framework RENAME RENAME 85

Slide 86

Slide 86 text

Refactoring subgraph from Spring Framework RENAME RENAME Commit C1 A developer renamed method before(...) to filterBefore(...) 86

Slide 87

Slide 87 text

Refactoring subgraph from Spring Framework RENAME RENAME Six days later The same developer reverted the operation, renaming filterBefore(...) to before(...) 87

Slide 88

Slide 88 text

Refactoring subgraph from Spring Framework RENAME RENAME A single developer was responsible for this refactoring subgraph 88

Slide 89

Slide 89 text

Refactoring subgraph from Spring Framework RENAME RENAME Two commits Commit C1 Commit C2 89

Slide 90

Slide 90 text

RQ4: Which refactorings compose the refactoring subgraphs? 90

Slide 91

Slide 91 text

Frequency of refactoring operations 91

Slide 92

Slide 92 text

Frequency of refactoring operations Most common refactoring operations include rename (21%), extract and move (19%), and extract (17%) 92

Slide 93

Slide 93 text

Frequency of refactoring operations We detected only 83 occurrences of move and rename operations 93

Slide 94

Slide 94 text

Frequency of refactoring operations There are also few inheritance-based refactorings 94

Slide 95

Slide 95 text

Homogeneous: Subgraphs with a single refactoring operation Heterogeneous: Subgraphs with two or more distinct refactoring operations Two groups: 95
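The two groups can be told apart by counting the distinct operation types on a subgraph's edges. A minimal Java sketch, assuming the edge labels are available as strings; the class and method names are hypothetical:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical classifier for homogeneous vs heterogeneous subgraphs. */
final class SubgraphKind {

    /** Distinct refactoring types appearing on the edges of one subgraph. */
    static Set<String> distinctOperations(List<String> edgeOperations) {
        return new HashSet<>(edgeOperations);
    }

    /** Homogeneous: every edge carries the same refactoring type. */
    static boolean isHomogeneous(List<String> edgeOperations) {
        return distinctOperations(edgeOperations).size() == 1;
    }

    /** Heterogeneous: two or more distinct refactoring types. */
    static boolean isHeterogeneous(List<String> edgeOperations) {
        return distinctOperations(edgeOperations).size() >= 2;
    }
}
```

Under this sketch, a subgraph with four EXTRACT edges is homogeneous, while one with RENAME, EXTRACT, and two EXTRACT AND MOVE edges is heterogeneous with three distinct types.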

Slide 96

Slide 96 text

Heterogeneous vs Homogeneous subgraphs 96

Slide 97

Slide 97 text

Heterogeneous vs Homogeneous subgraphs Most refactoring subgraphs include more than one refactoring type (72%) 97

Slide 98

Slide 98 text

Number of distinct refactoring operations Most heterogeneous subgraphs include two distinct refactoring types (84%) 98

Slide 99

Slide 99 text

Homogeneous refactoring subgraph from Facebook Fresco EXTRACT EXTRACT EXTRACT EXTRACT 99

Slide 100

Slide 100 text

Homogeneous refactoring subgraph from Facebook Fresco EXTRACT EXTRACT EXTRACT EXTRACT Four extract method operations 100

Slide 101

Slide 101 text

Homogeneous refactoring subgraph from Facebook Fresco EXTRACT EXTRACT EXTRACT EXTRACT A developer extracted method fetchDecodedImage(...) 101

Slide 102

Slide 102 text

Homogeneous refactoring subgraph from Facebook Fresco EXTRACT EXTRACT EXTRACT years later EXTRACT A second developer made two new extract operations 102

Slide 103

Slide 103 text

Homogeneous refactoring subgraph from Facebook Fresco EXTRACT EXTRACT EXTRACT EXTRACT Two developers 103

Slide 104

Slide 104 text

Homogeneous refactoring subgraph from Facebook Fresco EXTRACT EXTRACT EXTRACT EXTRACT Commit C1 Commit C2 Commit C3 Three commits 104

Slide 105

Slide 105 text

RQ5: Are the refactoring subgraphs created by the same or multiple developers? 105

Slide 106

Slide 106 text

Subgraphs performed by a single developer Subgraphs created by multiple developers Two groups: 106

Slide 107

Slide 107 text

Developers of refactoring subgraphs 107

Slide 108

Slide 108 text

Developers of refactoring subgraphs Most refactoring subgraphs are created by a single developer (60%) 108

Slide 109

Slide 109 text

Refactoring subgraph from Square Okhttp RENAME RENAME RENAME MOVE EXTRACT EXTRACT EXTRACT 109

Slide 110

Slide 110 text

Refactoring subgraph from Square Okhttp RENAME RENAME RENAME MOVE EXTRACT EXTRACT EXTRACT Commit C1 A developer renamed three methods 110

Slide 111

Slide 111 text

Refactoring subgraph from Square Okhttp RENAME RENAME RENAME MOVE EXTRACT EXTRACT EXTRACT Commit C2 A second developer extracted method checkDuration(...) 111

Slide 112

Slide 112 text

Refactoring subgraph from Square Okhttp MOVE EXTRACT EXTRACT EXTRACT Commit C2 Commit C3 … moving to a new class named Util RENAME RENAME RENAME 112

Slide 113

Slide 113 text

Refactoring subgraph from Square Okhttp RENAME RENAME RENAME MOVE EXTRACT EXTRACT EXTRACT Two developers 113

Slide 114

Slide 114 text

Refactoring subgraph from Square Okhttp RENAME RENAME RENAME MOVE EXTRACT EXTRACT EXTRACT Seven refactoring operations 114

Slide 115

Slide 115 text

Refactoring subgraph from Square Okhttp RENAME RENAME RENAME MOVE EXTRACT EXTRACT EXTRACT Commit C3 Three commits Commit C2 Commit C1 115

Slide 116

Slide 116 text

Large Subgraph Example 116

Slide 117

Slide 117 text

Large refactoring subgraph from Square Okhttp 37 vertices 117

Slide 118

Slide 118 text

Large refactoring subgraph from Square Okhttp Push down and move method operations 118

Slide 119

Slide 119 text

Large refactoring subgraph from Square Okhttp 24 extract and move method operations 119

Slide 120

Slide 120 text

Large refactoring subgraph from Square Okhttp 21 extract and move operations to a single method 120

Slide 121

Slide 121 text

Large refactoring subgraph from Square Okhttp public int readInt() throws IOException { require(4, Deadline.NONE); return buffer.readInt(); } A developer performed 21 extract method operations to move this duplicated code to a single method 121

Slide 122

Slide 122 text

Implications and Conclusions 122

Slide 123

Slide 123 text

Refactoring-aware Software Evolution Refactoring graphs are a key data structure to improve the results of current software evolution tools 123

Slide 124

Slide 124 text

Example: Git Blame Show the last author that changed each line of a file 124

Slide 125

Slide 125 text

Example: Git Blame Bob creates a method to calculate the area of a square class Math{ + float squareArea(float l){ + return l * l; + } } 125

Slide 126

Slide 126 text

Example: Git Blame Git-blame shows Bob as the creator Bob class Math{ + float squareArea(float l){ + return l * l; + } } 126

Slide 127

Slide 127 text

Example: Git Blame Bob introduces a bug in a second commit class Math{ float squareArea(float l){ + return l * l * l; } } 127

Slide 128

Slide 128 text

Example: Git Blame Git-blame shows Bob as responsible for the last change (bug) Bob class Math{ float squareArea(float l){ + return l * l * l; } } 128

Slide 129

Slide 129 text

Example: Git Blame Alice moves the method to a utility class class Math{ - float squareArea(float l){ - return l * l * l; - } } class Utility{ + float squareArea(float l){ + return l * l * l; + } } 129

Slide 130

Slide 130 text

Example: Git Blame Git-blame shows Alice as the creator of method squareArea Alice class Utility{ + float squareArea(float l){ + return l * l * l; + } } 130

Slide 131

Slide 131 text

Example: Git Blame class Math{ float squareArea(float l){ return l * l; } } class Math{ float squareArea(float l){ return l * l * l; } } class Utility{ float squareArea(float l){ return l * l * l; } } Bob is the real creator of squareArea() 131

Slide 132

Slide 132 text

Example: Git Blame class Math{ float squareArea(float l){ return l * l; } } class Math{ float squareArea(float l){ return l * l * l; } } class Utility{ float squareArea(float l){ return l * l * l; } } Bob is responsible for the bug 132

Slide 133

Slide 133 text

Example: Git Blame class Math{ float squareArea(float l){ return l * l; } } class Math{ float squareArea(float l){ return l * l * l; } } class Utility{ float squareArea(float l){ return l * l * l; } } git-blame may miss relevant data due to refactoring operations 133

Slide 134

Slide 134 text

Example: Git Blame class Math{ float squareArea(float l){ return l * l; } } class Math{ float squareArea(float l){ return l * l * l; } } class Utility{ float squareArea(float l){ return l * l * l; } } A refactoring history can improve existing tools and techniques 134
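One way a refactoring history could help a blame-style tool is by following refactoring edges backwards, so that the history of a moved or renamed method is looked up under its earlier signatures as well. The Java sketch below only illustrates that idea and is not part of git or of the authors' tooling: the class name, the signature strings, and the simplification that each method has at most one predecessor are assumptions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical lookup of a method's earlier signatures via refactoring edges. */
final class RefactoringAwareHistory {

    // Maps a signature to the signature it had before a refactoring.
    // Simplification: at most one predecessor per signature.
    private final Map<String, String> predecessor = new HashMap<>();

    void addRefactoring(String before, String after) {
        predecessor.put(after, before);
    }

    /** Returns the given signature followed by all earlier ones, newest first. */
    List<String> lineage(String signature) {
        List<String> chain = new ArrayList<>();
        Set<String> seen = new HashSet<>(); // guards against cycles such as a reverted rename
        String current = signature;
        while (current != null && seen.add(current)) {
            chain.add(current);
            current = predecessor.get(current);
        }
        return chain;
    }
}
```

With the edge `Math#squareArea(float)` to `Utility#squareArea(float)` registered, `lineage("Utility#squareArea(float)")` also returns the `Math` signature, so a tool could continue the blame lookup in class Math and attribute the method and the bug to Bob rather than Alice.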

Slide 135

Slide 135 text

Future Studies Other popular programming languages and ecosystems Refactoring graphs at the class and package level 135

Slide 136

Slide 136 text

Refactoring subgraphs... … are small (RQ1) … have up to three commits (RQ2) … span from a few days to months (RQ3) … are often heterogeneous (RQ4) … are mostly created by a single developer (RQ5) 136
