Slide 1

Slide 1 text

Towards an Example Repository
Advanced Software Tools Seminar, TAU
31 May 2010
Alexey Zagalsky, Ohad Barzilay, Prof. Amiram Yehudai

Slide 2

Slide 2 text

About Myself
• Computer Science M.Sc. student
• Working under the supervision of Prof. Amiram Yehudai
• This talk is based on a work in progress
• Happy to receive suggestions

Slide 3

Slide 3 text

Overview
• Example Embedding
• Example Repository
  ◦ Requirements for a repository
• Case Study
• Discussion

Slide 4

Slide 4 text

EXAMPLE EMBEDDING

Slide 5

Slide 5 text

Example Embedding
• Example Embedding (EE) is the notion of using an already existing code fragment within a new context
• Empirical SE research characterized EE as a software activity
  ◦ Ohad Barzilay, Orit Hazzan, Amiram Yehudai, “Characterizing Example Embedding as a Software Activity”, 2009
• Motivation derived from Refactoring
• Using examples is not new

Slide 6

Slide 6 text

Software Activity
• We define a software activity as a collection of fine-grained techniques, which together assemble an abstract key notion in software development
[Diagram of software activities: Analysis, Programming, Debugging, Coding, Refactoring, Search, Example Embedding, Code Comprehension, Testing, Design]
Taken from “Example Embedding by Ohad Barzilay” slides

Slide 7

Slide 7 text

Example Embedding Eco-System
[Diagram: Software Practices, Software Process, Training, Activity Catalogue, Software Tools, Organizational Support]

Slide 8

Slide 8 text

EXAMPLE REPOSITORY

Slide 9

Slide 9 text

Top Down Approach
[Diagram: Define, Research, Implementation]

Slide 10

Slide 10 text

Support Software Development Eco-System
• The repository has to be more than just a collection of examples
• It has to be part of the Example Embedding software activity
• For the repository to support the Example Embedding eco-system, we must define the requirements for a repository

Slide 11

Slide 11 text

Requirements
• Support Software Development Eco-System
• Integrate into IDE
• Searching
• Run-able in IDE
• Automatic Embedding
• New & Existing Examples
• Repository Framework
• Simple to Use

Slide 12

Slide 12 text

Integrate into IDE
• The repository should be an integral part of the development process
• The developer will “stay in focus”
• Integrating into the IDE will allow:
  ◦ Automatic Embedding
  ◦ Run-able in IDE

Slide 13

Slide 13 text

Searching
• Properties:
  ◦ Name
  ◦ Rank / Score / Rating
  ◦ Tags
  ◦ Author / Source of the Example
  ◦ Input / Output type / Signature
  ◦ Programming Language
  ◦ Security
  ◦ Example View History
• These properties make it possible to:
  ◦ Filter the search results
  ◦ Use an advanced search algorithm based on these properties, not only on keyword similarity
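The property-based search above can be sketched roughly as follows. This is a minimal illustration, not a proposed design: the `Example` fields, the `search` function, and the ranking rule (keyword hits first, then rating) are all assumptions made for the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Example:
    """One repository entry; fields mirror some of the listed properties (hypothetical)."""
    name: str
    language: str
    rating: float = 0.0
    tags: set = field(default_factory=set)
    author: str = ""
    signature: str = ""   # e.g. "str -> int"
    views: int = 0        # stand-in for the example view history


def search(examples, keywords, language=None, required_tags=(), min_rating=0.0):
    """Filter by properties first, then rank by keyword hits and rating."""
    wanted = {k.lower() for k in keywords}
    results = []
    for ex in examples:
        if language is not None and ex.language != language:
            continue
        if ex.rating < min_rating or not set(required_tags) <= ex.tags:
            continue
        words = set(ex.name.lower().split()) | {t.lower() for t in ex.tags}
        hits = len(wanted & words)
        if hits:
            results.append((hits, ex.rating, ex.name, ex))
    # More keyword hits first, then higher rating; the name breaks ties.
    results.sort(key=lambda r: (-r[0], -r[1], r[2]))
    return [r[3] for r in results]


repo = [
    Example("read file lines", "Java", rating=4.5, tags={"io", "file"}),
    Example("read file lines", "Python", rating=4.8, tags={"io", "file"}),
    Example("parse json file", "Java", rating=3.9, tags={"json", "file"}),
    Example("sort list", "Java", rating=4.9, tags={"collections"}),
]
found = search(repo, ["read", "file"], language="Java", min_rating=4.0)
```

Filtering before ranking keeps the keyword match from being the only signal: an example in the wrong language or below the rating threshold never reaches the ranking step.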

Slide 14

Slide 14 text

Advanced Searching
• Emily Hill, Lori Pollock, K. Vijay-Shanker, “Automatically Capturing Source Code Context of NL-Queries for Software Maintenance and Reuse”, ICSE’09

Slide 15

Slide 15 text

Run-able in IDE
• The developer will be able to:
  ◦ Run the example before he embeds it
  ◦ See if the example meets his requirements
• This could be done using Slices

Slide 16

Slide 16 text

Automatic Embedding
• The developer should be able to choose an example from the search results and embed it into his code automatically
• Allow the developer to choose where he wants to embed the example
• Similar to how refactoring works
• Automatic Embedding will improve code quality by following Embedding Patterns

Slide 17

Slide 17 text

New & Existing Examples
• Well-defined example structure
• Allow creating and submitting new examples directly from the IDE
• Support conversion of existing examples
  ◦ SourceForge.net
• Conversion will contribute to the acceptance of the repository framework
• Automated tool

Slide 18

Slide 18 text

Repository Framework
• Choose the source of the example framework
  ◦ Public repository
  ◦ Proprietary repository
• Software companies could use their own examples using the same framework
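One way to read this requirement is that public and proprietary repositories sit behind a single interface, so the client does not care which one it queries. The sketch below assumes that reading; `ExampleSource`, `InMemorySource`, and `find_everywhere` are hypothetical names, and the in-memory backend only stands in for a real server.

```python
from abc import ABC, abstractmethod


class ExampleSource(ABC):
    """Common interface every repository backend implements (hypothetical)."""

    @abstractmethod
    def find(self, keyword):
        """Return the names of examples matching the keyword."""


class InMemorySource(ExampleSource):
    """Stand-in backend; a real one would talk to a public or in-house server."""

    def __init__(self, label, examples):
        self.label = label
        self.examples = examples

    def find(self, keyword):
        return [name for name in self.examples if keyword in name]


def find_everywhere(sources, keyword):
    """Query each chosen source through the same interface."""
    return [(src.label, name) for src in sources for name in src.find(keyword)]


public = InMemorySource("public", ["read file", "sort list"])
in_house = InMemorySource("proprietary", ["read billing file"])
hits = find_everywhere([public, in_house], "file")
```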

Slide 19

Slide 19 text

Simple to Use
• Should not require learning any query languages
• It has to be intuitive and based on human aspects
• Should be similar to internet search engines

Slide 20

Slide 20 text

CASE STUDY

Slide 21

Slide 21 text

Candidates for the repository
• CodeGenie
  ◦ Sourcerer
• Strathcona Tool
• Refactory
• StackOverflow
• Let’s examine each one of them using the requirements we defined earlier

Slide 22

Slide 22 text

CodeGenie

Slide 23

Slide 23 text

CodeGenie
• Using test cases to search and reuse source code
  ◦ Otavio Lemos, Sushil Bajracharya, Joel Ossher, Ricardo Morla, Paulo Masiero, Pierre Baldi, Cristina Lopes, “Using Test-Cases to Search and Reuse Source Code”, 2007
• Based on the Sourcerer search engine
  ◦ Search is based on keywords, structural properties, and relations among program elements
• Code Search
• Repository Access
• Slicing

Slide 24

Slide 24 text

Test Driven Code Search
• The same way that test cases can be used to define a software feature in TDD, they can also be used to describe a desired feature in a code search
1. Construct test cases
2. Search for code and integrate
3. Refactor
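Steps 1 and 2 above can be illustrated with a toy sketch in which the test cases act as the query and candidates are kept only if they pass. The candidate functions are hypothetical stand-ins for search results, and `passing` is an assumed helper; this is not how CodeGenie is implemented.

```python
def feature_checks(candidate):
    """Step 1: the desired feature, written as executable checks."""
    return candidate("level") and candidate("Noon") and not candidate("slice")


# Hypothetical search results; a real tool would fetch these from a code repository.
def candidate_a(text):
    return text == text[::-1]            # case-sensitive, so it fails on "Noon"


def candidate_b(text):
    lowered = text.lower()
    return lowered == lowered[::-1]


def candidate_c(text):
    return len(text) % 2 == 0            # unrelated code that matched by keyword


def passing(candidates, checks):
    """Step 2: keep only the candidates that satisfy the test cases."""
    kept = []
    for fn in candidates:
        try:
            if checks(fn):
                kept.append(fn)
        except Exception:
            pass  # a candidate that crashes does not satisfy the query
    return kept


selected = passing([candidate_a, candidate_b, candidate_c], feature_checks)
```

Only `candidate_b` survives here; step 3 (refactoring the integrated code) is left to the developer.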

Slide 25

Slide 25 text

Test Driven Code Search (TDCS)

Slide 26

Slide 26 text

TDCS – Weaving / Unweaving
• The developer can explore the results by weaving, testing, and unweaving them
• To do that, the code services side must provide a program slicing service, which supplies self-contained code pieces related to the desired feature, and a repository access service

Slide 27

Slide 27 text

Demo
• http://sourcerer.ics.uci.edu/codegenie/#demo

Slide 28

Slide 28 text

Full Details
• Eclipse
• JUnit (at least one method)
• Extract information:
  ◦ Extract the interface of the missing method, its name, and the name of the class it belongs to, by analyzing the compiler errors present in the test cases
  ◦ The AST is explored to extract the return type and argument types
• Weaving:
  ◦ Merge-by-name strategy, as used in Hyper/J
  ◦ The merging is done by a union operation on the structures of the classes
  ◦ Uses Java annotations to track the woven structures
• Sourcerer
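The merge-by-name union described under Weaving can be sketched on a simplified model where a class is just a mapping from member names to bodies. The `weave` and `unweave` functions and the dictionary model are assumptions for illustration, and a plain set of names replaces the Java annotations the tool uses for tracking.

```python
def weave(target, found):
    """Merge classes with the same name by taking the union of their members.

    Both arguments map class name -> {member name -> body}. Members already
    present in the target are kept. Returns the merged structure and the set
    of woven items: (class, member) pairs, plus (class, None) for new classes.
    """
    merged = {cls: dict(members) for cls, members in target.items()}
    woven = set()
    for cls, members in found.items():
        if cls not in merged:
            merged[cls] = {}
            woven.add((cls, None))   # the whole class is new
        for name, body in members.items():
            if name not in merged[cls]:
                merged[cls][name] = body
                woven.add((cls, name))
    return merged, woven


def unweave(merged, woven):
    """Remove exactly the classes and members that weave() added."""
    return {
        cls: {n: b for n, b in members.items() if (cls, n) not in woven}
        for cls, members in merged.items()
        if (cls, None) not in woven
    }


project = {"Roman": {"toString": "original body"}}
code_slice = {
    "Roman": {"toInt": "found body", "toString": "other body"},
    "Util": {"digits": "helper body"},
}
merged, woven = weave(project, code_slice)
restored = unweave(merged, woven)
```

Recording what was added is what makes unweaving cheap: the developer can try a result, run the tests, and back it out without touching code that was already there.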

Slide 29

Slide 29 text

Their Results

Slide 30

Slide 30 text

Case Study: CodeGenie
• Integrate into IDE
  ◦ The tool is integrated in the IDE
• Searching
  ◦ Allows searching, but only by test cases
  ◦ No option for filtering by any of the parameters
• Run-able in IDE
  ◦ Allows running the code from the IDE using slices
• Automatic Embedding
  ◦ Allows weaving, but it is not real code embedding
• New & Existing Examples
  ◦ Doesn’t allow adding examples from the IDE
• Repository Framework
  ◦ Only allows access to the code that Sourcerer has access to (SourceForge.net)
• Simple to Use
  ◦ Only searchable by using test cases
  ◦ Not intuitive

Slide 31

Slide 31 text

Strathcona Tool

Slide 32

Slide 32 text

Using Structural Context to Recommend Examples
• Strathcona tool (Eclipse plug-in)
  ◦ Reid Holmes, Gail C. Murphy, “Using Structural Context to Recommend Source Code Examples”, 2005
• Locating code in an example repository based on heuristically matching the structure of the code under development to the example code
  ◦ Structural context is extracted automatically
  ◦ The repository is extracted automatically from existing applications

Slide 33

Slide 33 text

Strathcona
1. Populate the repository
2. A structural context description is generated and sent to the server
3. Heuristically match the structure description
4. Return the ten best structural matches to the developer

Slide 34

Slide 34 text

Strathcona

Slide 35

Slide 35 text

Strathcona

Slide 36

Slide 36 text

Strathcona
• Heuristics
  ◦ Inheritance Heuristic – same parents
  ◦ Calls Heuristic
    ▪ Basic Calls – call the same targets
    ▪ Calls Best Fit – best ratio of matched to unmatched call targets
    ▪ Calls with Inheritance – with at least one parent
  ◦ Uses Heuristic
    ▪ Basic Uses – use the same types
    ▪ Uses with Inheritance – with at least one parent
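A rough sketch of how the inheritance and calls heuristics might score candidates is given below. The dictionary layout, the way the scores are combined, and the `+ 1` in the ratio are assumptions for illustration only; the slides do not give Strathcona's actual formulas.

```python
def inheritance_match(query, example):
    """Inheritance heuristic: the example has the same parents as the query."""
    return bool(query["parents"]) and query["parents"] == example["parents"]


def basic_calls(query, example):
    """Basic calls: number of query call targets the example also calls."""
    return len(query["calls"] & example["calls"])


def calls_best_fit(query, example):
    """Calls best fit: ratio of matched to unmatched call targets."""
    matched = len(query["calls"] & example["calls"])
    unmatched = len(example["calls"] - query["calls"])
    return matched / (unmatched + 1)   # +1 avoids division by zero (assumption)


def recommend(query, repository, limit=10):
    """Return the names of the best structural matches, best first."""
    scored = []
    for name, example in repository.items():
        score = basic_calls(query, example) + calls_best_fit(query, example)
        if inheritance_match(query, example):
            score += 1
        if score > 0:
            scored.append((-score, name))
    return [name for _, name in sorted(scored)[:limit]]


# Hypothetical structural context of the code being edited.
query = {"parents": {"ViewPart"}, "calls": {"getStatusLine", "setMessage"}}
repository = {
    "StatusView": {"parents": {"ViewPart"},
                   "calls": {"getStatusLine", "setMessage", "update"}},
    "LogDialog": {"parents": {"Dialog"}, "calls": {"setMessage", "open", "close"}},
    "Parser": {"parents": {"Object"}, "calls": {"read"}},
}
matches = recommend(query, repository)
```

The default `limit=10` mirrors the "ten best structural matches" returned to the developer.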

Slide 37

Slide 37 text

Results

Slide 38

Slide 38 text

Limitations
• Search relies entirely on the structure of the code being edited
  ◦ The ability to return useful examples depends on the quality of the seed code used in the query
• Searches only the extracted code in the repository
• More limitations:
  ◦ Repository code must be parseable by the Eclipse compiler
  ◦ Repository code should represent good usage of the framework

Slide 39

Slide 39 text

Case Study: Strathcona
• Integrate into IDE
  ◦ The tool is integrated in the IDE
• Searching
  ◦ Allows searching, but only by the code being edited
  ◦ No option for filtering by any of the parameters
• Run-able in IDE
  ◦ Returns the source code of the example
• Automatic Embedding
  ◦ Only returns the source code of the example, without embedding
• New & Existing Examples
  ◦ Based only on the source code extracted into the repository
• Repository Framework
  ◦ Supports mostly a local repository
  ◦ Requires improvements to support a public repository
• Simple to Use
  ◦ Only searchable by using the edited code as a seed
  ◦ Not intuitive

Slide 40

Slide 40 text

StackOverflow

Slide 41

Slide 41 text

StackOverflow
• Stack Overflow is a collaboratively edited question and answer site for programmers
• http://stackoverflow.com/

Slide 42

Slide 42 text

Case Study: StackOverflow
• Integrate into IDE
  ◦ The repository is only available through a web browser
• Searching
  ◦ Allows keyword searching and supports most of the properties described earlier
• Run-able in IDE
  ◦ Doesn’t allow running the code from the IDE
• Automatic Embedding
  ◦ The developer has to manually insert the code into his project
• New & Existing Examples
  ◦ The repository only supports adding new examples
  ◦ A form must be filled in manually to add new examples
• Repository Framework
  ◦ Only allows access to the code in the repository itself
• Simple to Use
  ◦ The interface is very simple and intuitive, since it is similar to other popular search engines

Slide 43

Slide 43 text

Refactory.org

Slide 44

Slide 44 text

Refactory.org
• A wiki for useful code snippets
• http://www.refactory.org/

Slide 45

Slide 45 text

Case Study: Refactory.org
• Integrate into IDE
  ◦ The repository is only available through a web browser
• Searching
  ◦ Allows keyword searching, but the repository has only a small number of code examples
• Run-able in IDE
  ◦ Doesn’t allow running the code from the IDE
• Automatic Embedding
  ◦ The developer has to manually insert the code into his project
• New & Existing Examples
  ◦ The repository only supports adding new examples
  ◦ A form must be filled in manually to add new examples
• Repository Framework
  ◦ Only allows access to the code in the repository itself
• Simple to Use
  ◦ The interface is very simple and intuitive, since it is similar to other popular search engines

Slide 46

Slide 46 text

Discussion
• Requirements
• Focus point

Slide 47

Slide 47 text

QUESTIONS?

Slide 48

Slide 48 text

THANK YOU!