Slide 1

Slide 1 text

Application of hierarchical parameterized templates for automated software error correction
Artyom Aleksyuk, Vladimir Itsykson
Nov 12, 2015
Peter the Great St. Petersburg Polytechnic University

Slide 2

Slide 2 text

introduction
• Wide use of software systems
• Important areas
• Validation and verification of software
• Static analysis
• Why not try to fix found bugs?

Slide 3

Slide 3 text

existing approaches and tools
• IntelliJ IDEA - Structural Search and Replace
  • Uses templates to describe replacements
  • Tightly coupled with IDEA UI and code model
• AutoFix-E: Automated Debugging of Programs with Contracts
• Juzi: A Tool for Repairing Complex Data Structures
  • Corrects data structures using symbolic execution
• GenProg - Genetic programming for code repair
  • A very promising tool and approach
  • Requires a lot of unit tests
• Grail, Axis, AFix
  • Dedicated to repairing multithreaded programs

Slide 4

Slide 4 text

task
The main task is to develop an automated system that fixes code with the help of a static analyzer. The designed system consists of:
• Static analyzer interface module
• Code modification module
• Set of corrections

Slide 5

Slide 5 text

requirements
The developed system must meet the following requirements:
• It should work with minimal user involvement
• Modifications should be correct, i.e. the system shouldn’t alter code logic in any way and should make only those modifications that are described in the template
• It should be universal
• Code formatting and comments should be preserved
• The system should support the latest versions of the programming language
• It should be extensible

Slide 6

Slide 6 text

static analyzer
FindBugs was chosen as the static analyzer.
• Easy to exchange information about warnings
• Mostly signature-based
The system must use templates to describe code replacements.

Slide 7

Slide 7 text

code modification approaches
• Manual AST modification (for example, using the JavaParser library)
  • The most universal approach
  • Low extensibility - requires writing new code for each new correction
• DSL for code modification?
• Template-based code modification technology (D.A. Timofeev, master’s thesis, 2010)
  • Uses templates to describe code modifications
  • Templates are written in a language based on Java
  • Allows using variables ("selectors") in templates
  • Supports Java 1.5 (JRE/JDK 1.5 were introduced in 2004!)
  • Doesn’t keep code formatting and comments
  • Sometimes behaves incorrectly (just buggy :( )

Slide 8

Slide 8 text

difficulties
A badly behaving automatic software repair system can skip a required code region, modify inappropriate code, or even make a wrong correction. General reasons for that:
• Static analyzer mistake
• Static analyzer interface bottleneck
• Incorrect template match
• Improper modification
Ways to overcome the last problem:
• Code review
• Unit testing
• Other suitable verification and validation methods

Slide 9

Slide 9 text

architecture
[Diagram: processing pipeline. Nodes: FindBugs report parsing; Warnings list; "Before" template; "After" template; Source code; "Before" parse tree; "After" parse tree; Source code parse tree; Difference; "Before" template matches in code; Changes applied; Source code]

Slide 10

Slide 10 text

bugs examples
1. Absence of an explicit default encoding when reading text files
2. String comparison via the == operator
3. Absence of a null check in the equals() method
4. Absence of an argument type check in the equals() method
5. Usage of constructors for wrapper classes
6. toString() method call on an array
7. Usage of approximate values of mathematical constants
8. JVM termination via a System.exit() call when handling errors
9. Null return in the toString() method
10. Array comparison using the equals() method
11. Checking the compareTo() return value for equality with a specific constant
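Items 10 and 11 are easy to misread, so here is a minimal Java sketch of the corrected forms. The class and method names are hypothetical and chosen only for illustration; they are not taken from FixMyCode or from its templates.

```java
import java.util.Arrays;

final class ComparisonPitfalls {
    /** Item 10: arrays inherit Object.equals(), which compares references. */
    static boolean sameContents(int[] a, int[] b) {
        return Arrays.equals(a, b);
    }

    /** Item 11: compareTo() only promises a sign, not the value 1 or -1. */
    static boolean isGreater(String a, String b) {
        return a.compareTo(b) > 0;
    }

    public static void main(String[] args) {
        int[] x = {1, 2};
        int[] y = {1, 2};
        System.out.println(x.equals(y));          // false: different objects
        System.out.println(sameContents(x, y));   // true: same elements
        System.out.println("c".compareTo("a") == 1); // false: the result is 2
        System.out.println(isGreater("c", "a"));  // true
    }
}
```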

Slide 11

Slide 11 text

replacement templates
Template language = Java + selectors. Selectors are written as #identifier expressions.
Example: string comparison using ==.
Before: #a == #b
After: #b.equals(#a)
Absence of a null pointer check.
Before: boolean equals(Object obj) { #expr; }
After: boolean equals(Object obj) { if (obj == null) { return false; } #expr; }
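To show what the string comparison template changes at run time, here is a minimal Java sketch. The class and method names are hypothetical; the two method bodies correspond to the "before" and "after" forms with #a and #b bound to the parameters.

```java
final class StringComparisonDemo {
    /** Form matched by the "before" template: compares object references. */
    static boolean sameReference(String a, String b) {
        return a == b;
    }

    /** Form produced by the "after" template: compares string contents. */
    static boolean sameContent(String a, String b) {
        return b.equals(a);
    }

    public static void main(String[] args) {
        String literal = "bug";
        // new String(...) creates a distinct object with equal contents.
        String copy = new String("bug");
        System.out.println(sameReference(literal, copy)); // false
        System.out.println(sameContent(literal, copy));   // true
    }
}
```

One behavioral difference is worth keeping in mind during review: the rewritten form throws a NullPointerException when the value bound to #b is null, a case the == form tolerates.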

Slide 12

Slide 12 text

queries
Ability to specify requirements for selectors:
1. Type of tree node
2. Range of values
3. Quantity of caught nodes
4. Complex queries via XPath
Example:
[before]
#array.toString()
[after]
Arrays.toString(#array)
[query]
array is Identifier
array quantity 1
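The effect of the toString() template on a matched call can be illustrated with a minimal Java sketch. The class and method names are hypothetical; only Arrays.toString comes from the template itself.

```java
import java.util.Arrays;

final class ArrayToStringDemo {
    /** "Before" form: Object.toString() prints a type tag and a hash code. */
    static String before(int[] array) {
        return array.toString();
    }

    /** "After" form: Arrays.toString() prints the elements. */
    static String after(int[] array) {
        return Arrays.toString(array);
    }

    public static void main(String[] args) {
        int[] values = {1, 2, 3};
        System.out.println(before(values)); // starts with "[I@", rest varies per run
        System.out.println(after(values));  // [1, 2, 3]
    }
}
```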

Slide 13

Slide 13 text

development
• A FindBugs report is just an XML document, read using the standard Java DOM parser
• Each template consists of three or four .INI-like sections: [before], [after], [type] and optionally [query]. Each template can fix multiple bug types, and each bug type can be fixed by multiple templates.
• Improved template matching code
• Selector queries
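Reading the report with the standard DOM parser might look like the minimal Java sketch below. It is not the FixMyCode implementation: the Warning class, the parse method and the sample report are hypothetical, and the element and attribute names (BugInstance, type, SourceLine, sourcepath, start) are an assumption about the FindBugs XML output that should be checked against a real report.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class FindBugsReportSketch {
    /** Hypothetical value class holding what a fixer needs from one warning. */
    static final class Warning {
        final String type;
        final String sourcePath;
        final int startLine;

        Warning(String type, String sourcePath, int startLine) {
            this.type = type;
            this.sourcePath = sourcePath;
            this.startLine = startLine;
        }
    }

    /** Abbreviated, hand-written sample; a real report carries more elements. */
    static final String SAMPLE =
        "<BugCollection>"
        + "<BugInstance type=\"ES_COMPARING_STRINGS_WITH_EQ\">"
        + "<SourceLine sourcepath=\"demo/Foo.java\" start=\"12\" end=\"12\"/>"
        + "</BugInstance>"
        + "</BugCollection>";

    static List<Warning> parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            List<Warning> warnings = new ArrayList<>();
            NodeList bugs = doc.getElementsByTagName("BugInstance");
            for (int i = 0; i < bugs.getLength(); i++) {
                Element bug = (Element) bugs.item(i);
                // Simplification: take the first SourceLine found under the warning.
                NodeList lines = bug.getElementsByTagName("SourceLine");
                if (lines.getLength() == 0) {
                    continue; // no location, nothing to fix
                }
                Element line = (Element) lines.item(0);
                warnings.add(new Warning(
                    bug.getAttribute("type"),
                    line.getAttribute("sourcepath"),
                    Integer.parseInt(line.getAttribute("start"))));
            }
            return warnings;
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot parse report", e);
        }
    }

    public static void main(String[] args) {
        for (Warning w : parse(SAMPLE)) {
            System.out.println(w.type + " at " + w.sourcePath + ":" + w.startLine);
        }
    }
}
```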

Slide 14

Slide 14 text

improved template matching code
Pattern matching in trees
Additional complexity because of selectors
Each selector can include any number of nodes
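To make the extra complexity concrete, here is a minimal Java sketch of tree matching in which a selector among siblings may capture any number of consecutive nodes. It is an illustrative backtracking matcher under simplifying assumptions, not the algorithm used in FixMyCode: the Node class is hypothetical, labels starting with # stand for selectors, and repeated selector names are not checked for consistency.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class SelectorMatchSketch {
    /** Hypothetical tree node: a label plus ordered children. */
    static final class Node {
        final String label;
        final List<Node> children;

        Node(String label, Node... children) {
            this.label = label;
            this.children = Arrays.asList(children);
        }

        boolean isSelector() {
            return label.startsWith("#");
        }
    }

    /** Matches one template node against one code node; fills bindings on success. */
    static boolean match(Node template, Node code, Map<String, List<Node>> bindings) {
        if (template.isSelector()) {
            bindings.put(template.label, Arrays.asList(code));
            return true;
        }
        return template.label.equals(code.label)
            && matchSiblings(template.children, 0, code.children, 0, bindings);
    }

    /** Matches template siblings from index ti against code siblings from index ci. */
    static boolean matchSiblings(List<Node> t, int ti, List<Node> c, int ci,
                                 Map<String, List<Node>> bindings) {
        if (ti == t.size()) {
            return ci == c.size(); // all code siblings must be consumed
        }
        Node head = t.get(ti);
        if (head.isSelector()) {
            // Try every run length, shortest first, and backtrack on failure.
            for (int end = ci; end <= c.size(); end++) {
                Map<String, List<Node>> attempt = new HashMap<>(bindings);
                attempt.put(head.label, new ArrayList<>(c.subList(ci, end)));
                if (matchSiblings(t, ti + 1, c, end, attempt)) {
                    bindings.putAll(attempt);
                    return true;
                }
            }
            return false;
        }
        if (ci == c.size()) {
            return false;
        }
        // Work on a copy so a failed attempt leaves the caller's bindings untouched.
        Map<String, List<Node>> attempt = new HashMap<>(bindings);
        if (match(head, c.get(ci), attempt) && matchSiblings(t, ti + 1, c, ci + 1, attempt)) {
            bindings.putAll(attempt);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        Node template = new Node("block", new Node("first"), new Node("#rest"));
        Node code = new Node("block", new Node("first"), new Node("s1"), new Node("s2"));
        Map<String, List<Node>> bindings = new HashMap<>();
        System.out.println(match(template, code, bindings)); // true
        System.out.println(bindings.get("#rest").size());    // 2
    }
}
```

With two adjacent selectors the split between them is ambiguous; this sketch simply takes the shortest run for the first one, which is why a real matcher needs additional constraints on what a selector may capture.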

Slide 15

Slide 15 text

improved template matching code
[Figure: two trees, one with nodes 1 to 8 and one with nodes 1a, 2a, 3a, 4a, 7a, 8a]

Slide 16

Slide 16 text

development
• Ported to ANTLRv4
• Grammar written from scratch, now based on Java 7
• Selectors can be used nearly everywhere
• Transition from AST to CST (parse tree)
• New way to transform the internal representation back to source code (allows formatting and comments to be preserved)

Slide 17

Slide 17 text

ci integration
Shell script designed to be run as a Jenkins build step:
• Launch FindBugs and fetch a report from it
• Run FixMyCode
• Commit changes
A new branch is created each time. Developers should review the modifications and merge them.

Slide 18

Slide 18 text

ci integration

Slide 19

Slide 19 text

testing
Trying to fix bugs in a popular, widely used project. JGraphT library:
• Maintained code base
• Uses Java 7 features
• Has plenty of unit tests (439)
• Medium-sized project (27K SLOC)
Results:
• 46 bugs found
• 14 errors were fixed
• 8 errors can’t be fixed because of a FindBugs error
• Other bugs need an appropriate replacement template

Slide 20

Slide 20 text

testing
Examples. Inefficient usage of wrapper classes:
buckets.get(degree[nb]).remove(new Integer(nb));
Replacement:
buckets.get(degree[nb]).remove(Integer.valueOf(nb));
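Why the replacement is safe can be shown with a minimal Java sketch. The class and method names are hypothetical and the list stands in for one JGraphT bucket; the point is that Integer.valueOf still yields an Integer, so the call keeps resolving to remove(Object) rather than remove(int index), while small values come from a cache instead of a fresh allocation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class WrapperDemo {
    /** Removes the element equal to value, not the element at index value. */
    static List<Integer> removeValue(List<Integer> source, int value) {
        List<Integer> copy = new ArrayList<>(source);
        copy.remove(Integer.valueOf(value)); // resolves to remove(Object)
        return copy;
    }

    public static void main(String[] args) {
        List<Integer> bucket = Arrays.asList(10, 20, 30);
        System.out.println(removeValue(bucket, 20)); // [10, 30]
        // Values in -128..127 are cached, so both calls return the same object.
        System.out.println(Integer.valueOf(100) == Integer.valueOf(100)); // true
    }
}
```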

Slide 21

Slide 21 text

testing
Absence of a null pointer and argument type check.
Before:
@Override
public boolean equals(Object obj) {
    LabelsEdge otherEdge = (LabelsEdge) obj;
    if ((this.source == otherEdge.source)
        && (this.target == otherEdge.target)) {
        return true;
    } else {
        return false;
    }
}
After:
@Override
public boolean equals(Object obj) {
    if (obj == null) { return false; }
    if (!obj.getClass().isInstance(this)) { return false; }
    LabelsEdge otherEdge = (LabelsEdge) obj;
    if ((this.source == otherEdge.source)
        && (this.target == otherEdge.target)) {
        return true;
    } else {
        return false;
    }
}
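For comparison, here is a minimal Java sketch of the conventional shape of such an equals() method. It is not the tool's output: LabelsEdgeSketch is a hypothetical stand-in for the JGraphT class, and the type check uses getClass() equality instead of the generated isInstance call. The generated check lets through an argument whose class is a superclass of the receiver's class (for example a plain Object), in which case the cast would still fail.

```java
final class LabelsEdgeSketch {
    private final Object source;
    private final Object target;

    LabelsEdgeSketch(Object source, Object target) {
        this.source = source;
        this.target = target;
    }

    @Override
    public boolean equals(Object obj) {
        if (obj == null) {
            return false;
        }
        if (getClass() != obj.getClass()) {
            return false; // rejects unrelated types and plain Object alike
        }
        LabelsEdgeSketch other = (LabelsEdgeSketch) obj;
        // Identity comparison of endpoints, as in the slide's code.
        return source == other.source && target == other.target;
    }

    @Override
    public int hashCode() {
        // Kept consistent with the identity-based equals() above.
        return 31 * System.identityHashCode(source) + System.identityHashCode(target);
    }

    public static void main(String[] args) {
        Object s = new Object();
        Object t = new Object();
        LabelsEdgeSketch edge = new LabelsEdgeSketch(s, t);
        System.out.println(edge.equals(null));                       // false
        System.out.println(edge.equals(new Object()));               // false
        System.out.println(edge.equals(new LabelsEdgeSketch(s, t))); // true
    }
}
```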

Slide 22

Slide 22 text

recap
• An extensible system that works nearly automatically was developed. Source code can be fetched from https://bitbucket.org/h31/fixmycode
• The template grammar was updated and extended
• A set of replacement templates was written
• The developed system can be used to maintain code quality within Continuous Integration
• It can also be used to modernize legacy code

Slide 23

Slide 23 text

future direction of development
• First of all, make it a production-grade project (documentation, code quality, stability)
• More powerful query types
• Support for other static analyzers (Java PathFinder, etc.)
• Extending the tool to related tasks: performance improvement, security enhancement

Slide 24

Slide 24 text

thank you for your attention!