Slide 1

Slide 1 text

Identification of Extract Method Refactoring Opportunities
IEEE TCSE Most Influential Paper
February 27, 2019
Nikolaos Tsantalis
Alexander Chatzigeorgiou

Slide 2

Slide 2 text

Year 2008
What are we doing next?
• Something challenging
• Learn new things

Slide 3

Slide 3 text

Year 2008
Let’s try to solve the Extract Method problem

Slide 4

Slide 4 text

@SSR2001
The programmer:
• indicates a variable of interest at a specific program point
• selects a suitable method from among the candidates created by the tool
• names the selected method

Slide 5

Slide 5 text

@SSR2001
An elegant solution, both theoretically and practically

Slide 6

Slide 6 text

Regions: R(B1), R(B2), R(B3)

public String statement() {
    double totalAmount = 0;
    int renterPoints = 0;
    Enumeration rentals = _rentals.elements();
    String result = "Rental Record for " + getName() + "\n";
    while (rentals.hasMoreElements()) {
        Rental each = (Rental) rentals.nextElement();
        double thisAmount = each.getCharge();
        if (each.getMovie().getPriceCode() == Movie.NEW_RELEASE && each.getDaysRented() > 1)
            renterPoints += 2;
        else
            renterPoints++;
        result += "\t" + each.getMovie().getTitle() + "\t" + String.valueOf(thisAmount) + "\n";
        totalAmount += thisAmount;
    }
    result += "Amount owed is " + String.valueOf(totalAmount) + "\n";
    result += "You earned " + String.valueOf(renterPoints) + " frequent renter points";
    return result;
}

Slide 7

Slide 7 text

Region: R(B2)

public String statement() {
    double totalAmount = 0;
    int renterPoints = 0;
    Enumeration rentals = _rentals.elements();
    String result = "Rental Record for " + getName() + "\n";
    while (rentals.hasMoreElements()) {
        Rental each = (Rental) rentals.nextElement();
        double thisAmount = each.getCharge();
        if (each.getMovie().getPriceCode() == Movie.NEW_RELEASE && each.getDaysRented() > 1)
            renterPoints += 2;
        else
            renterPoints++;
        result += "\t" + each.getMovie().getTitle() + "\t" + String.valueOf(thisAmount) + "\n";
        totalAmount += thisAmount;
    }
    result += "Amount owed is " + String.valueOf(totalAmount) + "\n";
    result += "You earned " + String.valueOf(renterPoints) + " frequent renter points";
    return result;
}

Slide 9

Slide 9 text

public String statement() {
    double totalAmount = 0;
    int renterPoints = 0;
    Enumeration rentals = _rentals.elements();
    String result = "Rental Record for " + getName() + "\n";
    renterPoints = renterPoints(renterPoints, rentals);
    while (rentals.hasMoreElements()) {
        Rental each = (Rental) rentals.nextElement();
        double thisAmount = each.getCharge();
        result += "\t" + each.getMovie().getTitle() + "\t" + String.valueOf(thisAmount) + "\n";
        totalAmount += thisAmount;
    }
    return result;
}

int renterPoints(int renterPoints, Enumeration rentals) {
    while (rentals.hasMoreElements()) {
        Rental each = (Rental) rentals.nextElement();
        if (each.getMovie().getPriceCode() == Movie.NEW_RELEASE && each.getDaysRented() > 1)
            renterPoints += 2;
        else
            renterPoints++;
    }
    return renterPoints;
}

Slide 10

Slide 10 text

Limitations
1. Does not support extraction opportunities for objects, only for variables of primitive type

Composite variables

Slide 12

Slide 12 text

Enumeration rentals = _rentals.elements();

public Enumeration<E> elements() {
    return new Enumeration<E>() {
        int count = 0;

        // count is read
        public boolean hasMoreElements() {
            return count < elementCount;
        }

        // count is read & modified
        public E nextElement() {
            synchronized (Vector.this) {
                if (count < elementCount) {
                    return elementData(count++);
                }
            }
            throw new NoSuchElementException("Vector Enumeration");
        }
    };
}

Slide 13

Slide 13 text

Enumeration rentals = _rentals.elements();
while (rentals.hasMoreElements()) {                  // R: rentals.count
    Rental each = (Rental) rentals.nextElement();    // R+W: rentals.count
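The effect of tracking rentals.count as a composite variable can be sketched as follows. This is a minimal, flow-insensitive model for illustration only: the class CompositeVariableSketch, the Stmt type, and the helper names are hypothetical and are not JDeodorant's data structures.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical model for illustration; not JDeodorant's actual data structures.
class CompositeVariableSketch {

    // A statement with the variables it uses (reads) and defines (writes).
    static final class Stmt {
        final Set<String> uses;
        final Set<String> defs;

        Stmt(Set<String> uses, Set<String> defs) {
            this.uses = uses;
            this.defs = defs;
        }
    }

    static Set<String> vars(String... names) {
        return new HashSet<>(Arrays.asList(names));
    }

    // True if `later` reads something that `earlier` writes
    // (a flow-insensitive simplification of data dependence).
    static boolean dependsOn(Stmt later, Stmt earlier) {
        return !Collections.disjoint(later.uses, earlier.defs);
    }

    // while (rentals.hasMoreElements()): reads rentals.count
    static Stmt hasMoreElements(boolean composite) {
        return composite
                ? new Stmt(vars("rentals", "rentals.count"), vars())
                : new Stmt(vars("rentals"), vars());
    }

    // Rental each = (Rental) rentals.nextElement(): reads and writes rentals.count
    static Stmt nextElement(boolean composite) {
        return composite
                ? new Stmt(vars("rentals", "rentals.count"), vars("each", "rentals.count"))
                : new Stmt(vars("rentals"), vars("each"));
    }

    public static void main(String[] args) {
        // With plain variables the loop-carried dependence is invisible.
        System.out.println(dependsOn(hasMoreElements(false), nextElement(false)));
        // With composite variables the write to rentals.count is detected.
        System.out.println(dependsOn(hasMoreElements(true), nextElement(true)));
    }
}
```

With plain variables only, nextElement() appears to define nothing that the loop condition reads. Once rentals.count is tracked, the dependence between the two statements becomes visible.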

Slide 14

Slide 14 text

Limitations
2. Does not handle behavior preservation issues

Behavior preservation rules

Slide 16

Slide 16 text

Enumeration rentals = _rentals.elements();
renterPoints = renterPoints(renterPoints, rentals);
while (rentals.hasMoreElements()) {                  // rentals has no more elements

int renterPoints(int renterPoints, Enumeration rentals) {
    while (rentals.hasMoreElements()) {
        Rental each = (Rental) rentals.nextElement();    // all rentals elements are enumerated

Slide 17

Slide 17 text

Rule #1
A duplicated statement should not modify the state of an object
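A small runnable sketch of why this rule matters. It is a simplified stand-in for the statement() example: each rental is reduced to an Integer holding the days rented, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Enumeration;
import java.util.Vector;

// Hypothetical demo; rentals are simplified to Integer "days rented" values.
class DuplicatedStatementDemo {

    // Extracted slice: it duplicates the loop header and the nextElement() call.
    static int renterPoints(int renterPoints, Enumeration<Integer> rentals) {
        while (rentals.hasMoreElements()) {
            int daysRented = rentals.nextElement(); // advances the enumeration's cursor
            renterPoints += (daysRented > 1) ? 2 : 1;
        }
        return renterPoints;
    }

    // Remaining method: counts how many rentals the original loop still sees
    // after the extracted method has been called on the same enumeration.
    static int rentalsSeenByOriginalLoop(Vector<Integer> data) {
        Enumeration<Integer> rentals = data.elements();
        renterPoints(0, rentals);              // consumes every element
        int seen = 0;
        while (rentals.hasMoreElements()) {    // already exhausted
            rentals.nextElement();
            seen++;
        }
        return seen;
    }

    public static void main(String[] args) {
        Vector<Integer> data = new Vector<>(Arrays.asList(1, 3, 2));
        System.out.println(rentalsSeenByOriginalLoop(data));
    }
}
```

Because the duplicated nextElement() call modifies the state of rentals, the loop left behind in the original method never executes its body, so the extraction does not preserve behavior.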

Slide 18

Slide 18 text

Limitations
3. Does not guarantee that the complete computation of the variable indicated by the user will be extracted

Union of block-based slices

Slide 19

Slide 19 text

Slicing criteria for variables

public void translate(double dx, double dy) {
    if (getParent() == null) {
        dy = TOP_GAPY - getBounds().getY();
    } else {
        double y = getBounds().getY() + dy;
        y = Math.max(y, getParent().getBounds().getMinY() - topHeight/2);
        y = Math.min(y, getParent().getBounds().getMaxY() - topHeight/2);
        dy = y - getBounds().getY();
    }
    super.translate(dx, dy);
}

Slide 20

Slide 20 text

Slicing criteria for objects

GeneralPath area = new GeneralPath();
double x = domainAxis.valueToJava2D(polygon[0], data, domainEdge);
double y = rangeAxis.valueToJava2D(polygon[1], data, rangeEdge);
if (orientation == PlotOrientation.HORIZONTAL) {
    area.moveTo((float) y, (float) x);
    for (int i = 2; i < polygon.length; i += 2) {
        x = domainAxis.valueToJava2D(polygon[i], data, domainEdge);
        y = rangeAxis.valueToJava2D(polygon[i + 1], data, rangeEdge);
        area.lineTo((float) y, (float) x);
    }
    area.closePath();
} else if (orientation == PlotOrientation.VERTICAL) {
    area.moveTo((float) x, (float) y);
    for (int i = 2; i < polygon.length; i += 2) {
        x = domainAxis.valueToJava2D(polygon[i], data, domainEdge);
        y = rangeAxis.valueToJava2D(polygon[i + 1], data, rangeEdge);
        area.lineTo((float) x, (float) y);
    }
    area.closePath();
}

Slide 21

Slide 21 text

Tool support
• 17K installations
• 40K recommended refactorings
• 20K Extract Method refactorings

Slide 22

Slide 22 text

Evaluation (1)
• JFreeChart v1.0
• Independent expert assessed 64 recommendations
• 57 (89%) implemented a distinct functionality
• 27 (42%) resolved a code smell
   • 15 eliminated duplicated code
   • 11 decomposed complex methods
   • 1 was part of a method suffering from Feature Envy

Slide 23

Slide 23 text

Evaluation (1)
• Overlap +0.29
• Tightness +0.19
• Coverage ~

Slide 24

Slide 24 text

Evaluation (2)
• Two developers manually identified refactoring opportunities in their own projects
• We compared their findings with JDeodorant
   • TP: An opportunity identified by both human and tool
   • FP: An opportunity identified only by the tool
   • FN: An opportunity identified only by the human
• Average Precision: 51%
• Average Recall: 69%
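For reference, precision and recall follow from the TP/FP/FN definitions above as TP/(TP+FP) and TP/(TP+FN). A minimal sketch is shown below; the counts used in main are made up for illustration and are not the study's data.

```java
// Hypothetical helper; the counts in main are invented, not taken from the study.
class PrecisionRecall {

    // Fraction of tool-identified opportunities that the human also identified.
    static double precision(int tp, int fp) {
        return (tp + fp == 0) ? 0.0 : (double) tp / (tp + fp);
    }

    // Fraction of human-identified opportunities that the tool also identified.
    static double recall(int tp, int fn) {
        return (tp + fn == 0) ? 0.0 : (double) tp / (tp + fn);
    }

    public static void main(String[] args) {
        int tp = 9, fp = 9, fn = 4; // illustrative counts only
        System.out.printf("precision=%.2f recall=%.2f%n", precision(tp, fp), recall(tp, fn));
    }
}
```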

Slide 25

Slide 25 text

How did this work affect my research and career?

Slide 27

Slide 27 text

Assessing the Refactorability of Software Clones
• First approach to automatically assess whether a pair of clones can be safely refactored
• Optimize statement matching of the clones
   • Maximize the number of matched statements
   • Minimize the number of differences between them
• List of preconditions to ensure that the differences can be safely parameterized

Slide 28

Slide 28 text

Large-scale Empirical Study
• Studied 1M+ clones detected by 4 clone detectors in 9 open-source projects
• 94% of the clones are either Type-2 or Type-3
• Out of those, only 14% could be safely refactored using regular parameters

Slide 30

Slide 30 text

Clone Refactoring with Lambda Expressions
• First approach to assess the applicability of Lambda expressions for clone refactoring
• Studied 46K+ Type-2 & Type-3 clones that were not refactorable with regular parameters
• 58% of these clones could be safely refactored with Lambda expressions
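A toy illustration of the idea, assuming two clones that differ only in an expression depending on a loop-local variable, so the difference cannot be passed as a regular value parameter. The class and method names are hypothetical and the example is not taken from the studied projects.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.IntBinaryOperator;

// Hypothetical example of unifying two clones with a lambda parameter.
class LambdaCloneRefactoring {

    // Clone 1: accumulates with addition.
    static int sumAll(List<Integer> values) {
        int acc = 0;
        for (int v : values) {
            acc = acc + v;
        }
        return acc;
    }

    // Clone 2: identical structure, but the accumulation expression differs.
    static int maxAll(List<Integer> values) {
        int acc = 0;
        for (int v : values) {
            acc = Math.max(acc, v);
        }
        return acc;
    }

    // Unified method: the differing expression is passed in as a lambda.
    static int fold(List<Integer> values, IntBinaryOperator op) {
        int acc = 0;
        for (int v : values) {
            acc = op.applyAsInt(acc, v);
        }
        return acc;
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(3, 1, 4);
        System.out.println(fold(values, Integer::sum) == sumAll(values));
        System.out.println(fold(values, Math::max) == maxAll(values));
    }
}
```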

Slide 31

Slide 31 text

How did this work affect other researchers?

Slide 32

Slide 32 text

@22nd International Conference on Program Comprehension (ICPC 2014)

Slide 33

Slide 33 text

public String execute() throws Exception {
    logger.info("Starting execute()");
    Session sess = HibernateUtil.getSessionFactory().openSession();
    Transaction t = sess.beginTransaction();
    Criteria criteria = sess.createCriteria(User.class);
    criteria.add(Restrictions.idEq(this.user.getUsername()));
    criteria.add(Restrictions.eq("password", this.user.getPassword()));
    User user = (User) criteria.uniqueResult();
    t.commit();
    sess.close();
    if (user != null) {
        ActionContext.getContext().getSession().put(AUTHENTICATED_USER, user);
        logger.info("Finishing execute() -- Success");
        return SUCCESS;
    }
    this.addActionError(this.getText("login.failure"));
    logger.info("Finishing execute() -- Failure");
    return INPUT;
}

Types Session, Transaction, Criteria are interfaces

Slide 34

Slide 34 text

Generation of candidates for extraction
Generate all possible combinations of statements:
• Syntactical preconditions:
   • ONLY consecutive statements
   • Control statements should include all their child statements
• Behavioral preconditions:
   • A single variable must be assigned, if it is returned
• Quality preconditions:
   • Filter out candidates that are very small or very large
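The syntactical and quality preconditions above can be sketched as an enumeration of consecutive statement ranges within one block. This is an assumption-laden illustration, not the tool's implementation: a control statement is treated as a single top-level statement (so its children always travel with it), behavioral preconditions are not modeled, and minSize/maxSize are hypothetical parameter names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: enumerates candidate ranges of consecutive statements.
class CandidateGeneration {

    // Returns inclusive [start, end] index pairs over the top-level statements
    // of one block, keeping only ranges whose size is within [minSize, maxSize].
    static List<int[]> candidates(int blockSize, int minSize, int maxSize) {
        List<int[]> result = new ArrayList<>();
        int min = Math.max(1, minSize); // a candidate needs at least one statement
        for (int start = 0; start < blockSize; start++) {
            for (int end = start + min - 1;
                 end < blockSize && end - start + 1 <= maxSize;
                 end++) {
                result.add(new int[] {start, end});
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (int[] c : candidates(4, 2, 3)) {
            System.out.println(c[0] + ".." + c[1]);
        }
    }
}
```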

Slide 35

Slide 35 text

Ranking of candidates
Extract the dependencies of:
• The code fragment to be extracted (Dep)
• The remaining statements (Dep’)
Dependencies are:
• Variables
• Types
• Packages
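One way to turn the two dependency sets into a ranking score is a set distance: the less Dep and Dep’ overlap, the more self-contained the extracted fragment is. The slide does not name the measure used, so the sketch below substitutes the Jaccard distance purely as an illustrative assumption; the class name is hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; Jaccard distance is an assumed stand-in measure.
class DependencyDistance {

    // 1 - |Dep ∩ Dep'| / |Dep ∪ Dep'|; higher means the fragment shares
    // fewer dependencies with the statements left behind.
    static double jaccardDistance(Set<String> dep, Set<String> depPrime) {
        Set<String> union = new HashSet<>(dep);
        union.addAll(depPrime);
        if (union.isEmpty()) {
            return 0.0;
        }
        Set<String> intersection = new HashSet<>(dep);
        intersection.retainAll(depPrime);
        return 1.0 - (double) intersection.size() / union.size();
    }

    public static void main(String[] args) {
        // Illustrative type dependencies loosely based on the execute() example.
        Set<String> dep = new HashSet<>(Arrays.asList("Session", "Transaction", "Criteria", "User"));
        Set<String> depPrime = new HashSet<>(Arrays.asList("ActionContext", "User"));
        System.out.println(jaccardDistance(dep, depPrime));
    }
}
```

The same computation would be repeated for variables, types, and packages, and the results combined; how they are combined is left open here.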

Slide 36

Slide 36 text

Oracle for Evaluation
1. Retrieve all methods with N>3 statements
2. Retrieve all invoked methods
   • Check inline method refactoring preconditions
3. Inline the methods passing the criteria
4. Add to the oracle the Extract Method refactoring that reverts the inlining

Slide 37

Slide 37 text

Oracle for Evaluation
• Scale up the evaluation to more cases
• Unknown if these methods were actually extracted

Slide 38

Slide 38 text

@IEEE Transactions on Software Engineering vol. 43, no. 10, October 2017

Slide 39

Slide 39 text

Coherent statements
1. accessing the same variable (attributes, local variables and method parameters)
2. calling a method for the same object
3. calling the same method for a different object of the same type

Slide 40

Slide 40 text

Accessible variables and method calls per statement

Slide 41

Slide 41 text

Clusters of coherent statements with distance = 1

Slide 42

Slide 42 text

Clusters of coherent statements with distance = 2

Slide 43

Slide 43 text

Extract Method opportunities
• In each iteration, the distance is increased by 1, until we reach the maximum, i.e., method size
• Each formed cluster is a candidate refactoring opportunity
• Remove candidates
   • Duplicate clusters
   • Clusters violating preconditions
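The iteration above can be sketched roughly as follows. This is a simplified reading, not the published algorithm: two statements are treated as coherent when their sets of accessed variables or calls intersect, a cluster is the range spanned by coherent statements at most `step` lines apart, overlapping ranges are merged, and precondition checks are omitted. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified sketch of step-based clustering of coherent statements.
class CoherentClusters {

    static Set<String> set(String... names) {
        return new HashSet<>(Arrays.asList(names));
    }

    // Simplification: coherent means the two statements share an accessed element.
    static boolean coherent(Set<String> a, Set<String> b) {
        return !Collections.disjoint(a, b);
    }

    // Clusters (inclusive [start, end] statement ranges) for one step distance.
    static List<int[]> clustersForStep(List<Set<String>> accesses, int step) {
        List<int[]> merged = new ArrayList<>();
        for (int i = 0; i < accesses.size(); i++) {
            int end = -1; // furthest coherent statement within `step` of i
            for (int j = i + 1; j < accesses.size() && j - i <= step; j++) {
                if (coherent(accesses.get(i), accesses.get(j))) {
                    end = j;
                }
            }
            if (end < 0) {
                continue;
            }
            int[] last = merged.isEmpty() ? null : merged.get(merged.size() - 1);
            if (last != null && i <= last[1]) {
                last[1] = Math.max(last[1], end); // overlaps the previous range: extend it
            } else {
                merged.add(new int[] {i, end});
            }
        }
        return merged;
    }

    // Increase the step from 1 up to the method size and keep distinct clusters.
    static Set<String> allCandidates(List<Set<String>> accesses) {
        Set<String> distinct = new LinkedHashSet<>();
        for (int step = 1; step <= accesses.size(); step++) {
            for (int[] c : clustersForStep(accesses, step)) {
                distinct.add(c[0] + "-" + c[1]);
            }
        }
        return distinct;
    }

    public static void main(String[] args) {
        List<Set<String>> accesses =
                Arrays.asList(set("a"), set("b"), set("a"), set("c"), set("c"));
        System.out.println(allCandidates(accesses));
    }
}
```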

Slide 44

Slide 44 text

Evaluation

Tool        Tolerance  Recall  Precision  F-Measure
SEMI        1%         38.0%   12.9%      19.2%
SEMI        2%         47.0%   14.6%      22.3%
SEMI        3%         55.5%   18.8%      28.1%
JDeodorant  1%         14.8%   17.4%      16.0%
JDeodorant  2%         18.4%   21.1%      19.7%
JDeodorant  3%         23.8%   28.0%      25.7%
JExtract    1%         52.2%   12.6%      20.4%
JExtract    2%         59.3%   13.1%      21.5%
JExtract    3%         61.9%   15.0%      24.2%
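The F-Measure column is the harmonic mean of precision and recall. A minimal sketch of the computation is below (class name hypothetical); recomputing from the rounded percentages shown in the table may differ from the reported values in the last digit.

```java
// Hypothetical helper for the harmonic mean of precision and recall.
class FMeasure {

    // Inputs and result use the same unit (e.g. percentages).
    static double fMeasure(double precision, double recall) {
        if (precision + recall == 0.0) {
            return 0.0;
        }
        return 2.0 * precision * recall / (precision + recall);
    }

    public static void main(String[] args) {
        // JDeodorant at 1% tolerance, using the rounded values from the table.
        System.out.printf("%.1f%n", fMeasure(17.4, 14.8));
    }
}
```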

Slide 45

Slide 45 text

@IEEE 28th International Symposium on Software Reliability Engineering (ISSRE 2017)

Slide 46

Slide 46 text

Feature Extraction
1. Structural features: number of control flow statements, variables, method invocations, packages, ratio of total lines of code
2. Functional features:
   a. Find program elements (invocations, packages, variable accesses, type accesses) only/mainly used in extracted code (likely functionality)
   b. Calculate the ratio of statements involved in this functionality within the extracted code (cohesion)
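The functional features in 2a and 2b can be sketched as below. This is a simplified reading under stated assumptions: only elements used exclusively in the extracted code are counted (the "mainly used" case is ignored), each statement is represented by the set of program elements it touches, and all names are hypothetical rather than GEMS's own.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the "functionality" and "cohesion" features.
class FunctionalFeatureSketch {

    static Set<String> set(String... names) {
        return new HashSet<>(Arrays.asList(names));
    }

    // 2a (simplified): elements used in the extracted code and nowhere else.
    static Set<String> exclusiveElements(List<Set<String>> extracted,
                                         List<Set<String>> remaining) {
        Set<String> result = new HashSet<>();
        for (Set<String> stmt : extracted) {
            result.addAll(stmt);
        }
        for (Set<String> stmt : remaining) {
            result.removeAll(stmt);
        }
        return result;
    }

    // 2b: ratio of extracted statements that touch at least one such element.
    static double cohesion(List<Set<String>> extracted, List<Set<String>> remaining) {
        if (extracted.isEmpty()) {
            return 0.0;
        }
        Set<String> functionality = exclusiveElements(extracted, remaining);
        int involved = 0;
        for (Set<String> stmt : extracted) {
            if (!Collections.disjoint(stmt, functionality)) {
                involved++;
            }
        }
        return (double) involved / extracted.size();
    }

    public static void main(String[] args) {
        List<Set<String>> extracted =
                Arrays.asList(set("sess"), set("sess", "criteria"), set("logger"));
        List<Set<String>> remaining = Arrays.asList(set("logger"), set("user"));
        System.out.println(cohesion(extracted, remaining));
    }
}
```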

Slide 47

Slide 47 text

Training GEMS
1. 267 Extract Method refactorings from open-source projects that were confirmed by the developers who applied them
2. 5598 inlined methods that were invoked only once in the source code of the projects

Slide 48

Slide 48 text

Evaluation

Slide 49

Slide 49 text

Overall comparison
1. JDeodorant is still the only tool that can recommend Extract Method refactoring opportunities for non-consecutive statements
2. JDeodorant still has very competitive precision
3. Machine learning beats everything, but needs lots of training data

Slide 50

Slide 50 text

Lessons learned: how to have impact
1. Find a challenging topic with practical impact
   • The right topic can build your career up to tenure
2. Don’t be afraid to build on previous ideas
   • As long as the increments are not trivial
3. Tools + data
   • Maintain your tools & support your users
   • Research that cannot be reproduced is not science
4. Listen to your reviewers, despite the pain
   • Get inspired by their feedback and raise the bar