Refactoring Mining - The key to unlock software evolution
Keynote at the International Workshop on Refactoring
co-located with the 36th IEEE/ACM International Conference on Automated Software Engineering (ASE '21)
November 14, 2021
is ignored
• Method body includes only method calls and field accesses
• Most refactorings detected based on signature matching
• Only precision is provided
• Evaluation included only 3 systems
2010): "The precision and recall on open source projects were 0.74 and 0.96 respectively." "Since these programs did not document refactorings, we created a set of correct refactorings by running REF-FINDER with a similarity threshold (σ=0.65) and manually verified them. We then measured a recall by comparing this set with the results found using a higher threshold (σ=0.85)"
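This recall methodology is circular: the "oracle" is the same tool run at a lower similarity threshold, so refactorings the tool never surfaces at any threshold cannot count against it. A minimal sketch of the bias, with a hypothetical `detect` function and toy scores (none of this is Ref-Finder's actual code):

```python
# Hypothetical detector: keeps candidates whose similarity score
# meets the given threshold.
def detect(candidates, threshold):
    return {ref for ref, score in candidates.items() if score >= threshold}

# Toy candidate refactorings with similarity scores. Refactorings the tool
# misses entirely never appear here, so they never lower the recall.
candidates = {"r1": 0.90, "r2": 0.88, "r3": 0.70, "r4": 0.66}

oracle = detect(candidates, 0.65)   # manually verified low-threshold run
found = detect(candidates, 0.85)    # high-threshold run under evaluation
recall = len(found & oracle) / len(oracle)
print(f"recall = {recall:.2f}")     # recall = 0.50
```

The computed recall only measures how the two thresholds relate, not how many real refactorings the tool finds.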
checking out and parsing Git commits
• Infrastructure for monitoring GitHub projects
• Automatic generation of emails to contact developers
• A web app for thematic analysis
and August 7th, 2015
• Sent 465 emails and received 195 responses (42%)
• +27 commits with a description explaining the reasons
• Compiled a catalogue of 44 distinct motivations for 12 well-known refactoring types
is not discussed or considered that RefFinder has poor recall (0.24 [31]). The authors did a good job of combating the low precision by manually inspecting results, but the low recall is not discussed or dealt with."
thresholds need calibration for projects with different characteristics
2. Dependence on built versions
• only 38% of the change history can be successfully compiled [Tufano et al., 2017]
3. Unreliable oracles for evaluating precision/recall
• Incomplete (refactorings found in release notes or commit messages)
• Biased (applying a single tool with two different similarity thresholds)
• Artificial (seeded refactorings)
(538 commits from 185 open source projects)
• We executed both available tools (RefactoringMiner and RefDiff)
• Converted the output of the tools to the same format
• We manually validated all refactoring instances (4,108 unique instances, of which 3,188 were true positives and 920 were false positives)
• The validation process was labor-intensive and involved 3 validators for a period of 3 months (i.e., 9 person-months)
• To compute recall, we considered the union of the true positives reported by both tools as the ground truth
Matin Mansouri, Laleh Eshkevari, Davood Mazinanian
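With validated instances in hand, each tool's precision is computed against the manual validation, and recall against the union of both tools' true positives. A minimal sketch with toy sets (not the study's actual 4,108 instances):

```python
# Toy reported sets: "fp*" entries are false positives.
tool_a = {"r1", "r2", "r3", "fp1"}
tool_b = {"r2", "r3", "r4", "fp2"}
validated_true = {"r1", "r2", "r3", "r4"}  # confirmed by manual validation

# Ground truth = union of each tool's validated true positives.
ground_truth = (tool_a | tool_b) & validated_true

def precision(reported):
    return len(reported & validated_true) / len(reported)

def recall(reported):
    return len(reported & ground_truth) / len(ground_truth)

print(f"A: precision {precision(tool_a):.2f}, recall {recall(tool_a):.2f}")
print(f"B: precision {precision(tool_b):.2f}, recall {recall(tool_b):.2f}")
```

Note that this union-based ground truth is a lower bound: refactorings missed by both tools are invisible to it, so the reported recall is relative, not absolute.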
• Time pressure
• It was urgent to establish RefactoringMiner with a publication
• ICSE deadline: August 25, 2017
• Sophia’s birth: August 15, 2017
• Despite working over 1 year on this paper, there was still room for improvement
• The entire team graduated after this work
refactoring types detected based on AST node replacements
• Nested refactoring detection
• Refactoring inference
• Improved the matching of method calls to method declarations with argument type inference
• RefactoringMiner 1.0 and 2.0
• RefDiff 0.1.1, 1.0, 2.0
• GumTreeDiff 2.1.2
• Converted the output of the tools to the same format
• We validated 5,830 new unique refactoring instances, of which 4,038 were true positives and 1,792 were false positives
• 7,226 true positives in total for 40 different refactoring types (72% of true instances are detected by two or more tools)
Ameya Ketkar
true positives in the oracle
• Precision: 99.7%
• Recall: 97%
• Most tested and reliable version (200K commits without exception)
• Independent studies confirm it has the best precision
usable tool
• Minimum 5 years of research and development
• Extensive testing
• Detailed documentation (README with API code snippets)
• Supporting users (200+ issues resolved)
• Stable project leader
• Great team
Independent studies revealed that Ref-Finder had low precision/recall
• 35% precision, 24% recall [Soares et al., JSS 2013]
• 27% precision [Kádár et al., PROMISE 2016]
• Ref-Finder paper claimed 74% precision, 96% recall
• Collected refactoring info at release level (between two versions)
• Coarse-grained analysis led to strong and unsafe assumptions
moves
• File renames/moves
• Uses thresholds when comparing program elements, calibrated on a training set of 100 methods
• Fails to match methods from which a significant part of the body has been extracted into new methods, as it uses a 75% body-similarity threshold to match modified methods
• Ultra-fast: less than 2 seconds to fetch the entire change history of a method
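The fixed-threshold failure mode can be sketched in a few lines. This is an illustrative toy matcher with made-up statement tokens, not any tool's real similarity function:

```python
# Toy body similarity: overlap of statement tokens, normalized by the
# larger body. Real tools use far more sophisticated matching.
def similarity(before, after):
    common = len(set(before) & set(after))
    return common / max(len(before), len(after))

before = ["s1", "s2", "s3", "s4", "s5", "s6", "s7", "s8"]  # original body
after = ["s1", "s2", "callExtracted()"]  # 6 statements moved to a new method

sim = similarity(before, after)
print(f"similarity = {sim:.2f}")  # 0.25 — far below a 0.75 threshold,
                                  # so the method pair is wrongly unmatched
```

A large Extract Method legitimately removes most of the body, so any fixed high similarity threshold will reject exactly the pairs a refactoring detector most needs to match.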
a way that is not computationally expensive?
Mehran Jodavi
Solution: Partial and incremental commit analysis based on the location of the tracked program element
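The idea behind partial analysis can be sketched as follows: skip any commit that does not touch the file containing the tracked element, so only a small fraction of the history is ever parsed. This is a toy model with hypothetical commit data, not the tool's actual implementation:

```python
# Each commit is modeled as (commit_id, set_of_changed_files).
def relevant_commits(commits, tracked_file):
    # Only commits touching the tracked element's file need analysis;
    # all others are skipped without any parsing.
    return [cid for cid, files in commits if tracked_file in files]

commits = [
    ("c1", {"src/A.java"}),
    ("c2", {"src/B.java", "src/C.java"}),
    ("c3", {"src/A.java", "README.md"}),
]
print(relevant_commits(commits, "src/A.java"))  # ['c1', 'c3'] — c2 is skipped
```

The incremental part then only re-resolves the element's location when a relevant commit renames or moves it, carrying that location forward to filter the next commits.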
between the AST nodes
• Soft links: repetitive change patterns
• Refactoring links: edits in different locations originating from the same refactoring
• Cosmetic links: changes in comments & reformatting
detection rules for refactoring motivations
• We optimized the rules on the FSE ’16 “Why We Refactor” dataset
• We validated their accuracy on the TOSEM 2020 dataset
• Precision: 98.4%, Recall: 93.5%
• We collected the motivations for 346K Extract Method instances found in 132,897 commits of 325 open-source repositories
Sadegh Aalizadeh
J. Pantiuchina, F. Zampetti, S. Scalabrino, V. Piantadosi, R. Oliveto, G. Bavota, and M. Di Penta, "Why Developers Refactor Source Code: A Mining-based Study," ACM Transactions on Software Engineering and Methodology, Volume 29, Issue 4, Article 29, September 2020.