Towards an example repository

This presentation was given as my M.Sc. research proposal at the Advanced Software Tools research seminar, Tel Aviv University, May 31st 2010.

Alexey Zagalsky

May 31, 2010

Transcript

  1. 1.

    Towards an Example Repository
    Advanced Software Tools Seminar, TAU
    31 May 2010
    Alexey Zagalsky, Ohad Barzilay, Prof. Amiram Yehudai
  2. 2.

    About Myself
    • Computer Science M.Sc. student
    • Working under the supervision of Prof. Amiram Yehudai
    • This talk is based on a work in progress
    • Happy to receive suggestions
  3. 5.

    Example Embedding
    • Example Embedding (EE) is the notion of using an already existing code fragment within a new context
    • Empirical SE research characterized EE as a software activity
      ◦ Ohad Barzilay, Orit Hazzan, Amiram Yehudai, “Characterizing Example Embedding as a Software Activity”, 2009
    • Motivation derived from Refactoring
    • Using examples is not new
  4. 6.

    Software Activity
    • We define a software activity as a collection of fine-grained techniques, which together assemble an abstract key notion in software development
    [Diagram labels: Analysis, Programming, Debugging, Coding, Refactoring, Search, Example Embedding, Code Comprehension, Testing, Design]
    Taken from “Example Embedding by Ohad Barzilay” slides
  5. 10.

    Support Software Development Eco-System
    • The repository has to be more than just a collection of examples
    • It has to be part of the Example Embedding software activity
    • For the repository to support the example embedding eco-system, we must define its requirements
  6. 11.

    Requirements: Support Software Development Eco-System
    • Integrate into IDE
    • Searching
    • Run-able in IDE
    • Automatic Embedding
    • New & Existing Examples
    • Repository Framework
    • Simple to Use
  7. 12.

    Integrate into IDE
    • The repository should be an integral part of the development process
    • The developer will “stay in focus”
    • Integrating into the IDE will allow:
      ◦ Automatic Embedding
      ◦ Run-able in IDE
  8. 13.

    Searching
    • Properties:
      ◦ Name
      ◦ Rank / Score / Rating
      ◦ Tags
      ◦ Author / Source of the Example
      ◦ Input / Output type / Signature
      ◦ Programming Language
      ◦ Security
      ◦ Example View History
    • These properties make it possible to:
      ◦ Filter the search results
      ◦ Use an advanced search algorithm based on these properties, not only on keyword similarity
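The property-based search described on this slide can be sketched as follows. This is a minimal illustration, not the proposed repository's design: the `Example` fields, the `search` function and its scoring rule (keyword overlap first, rating as tie-breaker) are all assumptions made for the sketch, and only a few of the listed properties are modelled.

```python
from dataclasses import dataclass, field


@dataclass
class Example:
    """Metadata for one repository example (field names are hypothetical)."""
    name: str
    language: str
    rating: float                       # e.g. a 0.0 .. 5.0 community score
    tags: set = field(default_factory=set)
    author: str = ""
    signature: str = ""                 # input / output types


def search(examples, keywords, language=None, min_rating=0.0, required_tags=()):
    """Filter by properties first, then rank by keyword overlap and rating."""
    wanted = {k.lower() for k in keywords}
    hits = []
    for ex in examples:
        # Property filters: language, minimum rating, mandatory tags.
        if language is not None and ex.language != language:
            continue
        if ex.rating < min_rating:
            continue
        if not set(required_tags) <= ex.tags:
            continue
        # Keyword match against the words of the name and the tags.
        words = set(ex.name.lower().split()) | {t.lower() for t in ex.tags}
        overlap = len(wanted & words)
        if overlap == 0:
            continue
        hits.append((overlap, ex.rating, ex))
    # Assumed ranking: more matching keywords first, higher rating breaks ties.
    hits.sort(key=lambda h: (h[0], h[1]), reverse=True)
    return [ex for _, _, ex in hits]


repo = [
    Example("read file lines", "Java", 4.5, {"io", "file"}),
    Example("read file lines", "Python", 3.0, {"io", "file"}),
    Example("parse xml file", "Java", 4.0, {"xml", "file"}),
    Example("sort list", "Java", 5.0, {"collections"}),
]
results = search(repo, ["read", "file"], language="Java", min_rating=3.5)
```

Here the language and rating filters remove the Python entry before any keyword matching happens, which is the point of the slide: the properties narrow the result set independently of keyword similarity.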
  9. 14.

    Advanced Searching
    • Emily Hill, Lori Pollock, K. Vijay-Shanker, “Automatically Capturing Source Code Context of NL-Queries for Software Maintenance and Reuse”, ICSE’09
  10. 15.

    Run-able in IDE
    • The developer will be able to:
      ◦ Run the example before embedding it
      ◦ See whether the example meets the requirements
    • This could be done using Slices
  11. 16.

    Automatic Embedding
    • The developer should be able to choose an example from the search results and embed it into the code automatically
    • Allow the developer to choose where to embed the example
    • Similar to how refactoring works
    • Automatic Embedding will improve code quality by following Embedding Patterns
  12. 17.

    New & Existing Examples
    • Well-defined example structure
    • Allow creating and submitting new examples directly from the IDE
    • Support conversion of existing examples
      ◦ SourceForge.net
    • Conversion will contribute to the acceptance of the repository framework
    • Automated tool
  13. 18.

    Repository Framework
    • Choose the source of the example framework
      ◦ Public repository
      ◦ Proprietary repository
    • Software companies could use their own examples using the same framework
  14. 19.

    Simple to Use
    • Should not require learning any query languages
    • It has to be intuitive and based on human aspects
    • Should be similar to internet search engines
  15. 21.

    Candidates for the repository
    • CodeGenie
      ◦ Sourcerer
    • Strathcona Tool
    • Refactory
    • StackOverflow
    • Let’s examine each of them using the requirements we defined earlier
  16. 22.
  17. 23.

    CodeGenie
    • Using test cases to search and reuse source code
    • Otavio Lemos, Sushil Bajracharya, Joel Ossher, Ricardo Morla, Paulo Masiero, Pierre Baldi, Cristina Lopes, “Using Test-Cases to Search and Reuse Source Code”, 2007
    • Based on the Sourcerer search engine
      ◦ Search is based on keywords, structural properties and relations among program elements
    • Code Search
    • Repository Access
    • Slicing
  18. 24.

    Test Driven Code Search
    • In the same way that test cases can be used to define a software feature in TDD, they can also be used to describe a desired feature in a code search
      1. Construct test cases
      2. Search for code and integrate
      3. Refactor
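The idea of using a test case as the search query can be sketched as below. This is a toy illustration in Python rather than the Java/JUnit setting of the tool: the `candidates` dictionary stands in for results returned by a code search engine, and all names are hypothetical.

```python
def desired_feature_test(candidate):
    """Test case written before any implementation exists: reverse a string."""
    assert candidate("abc") == "cba"
    assert candidate("") == ""


# Hypothetical search results, standing in for code found by a search engine.
candidates = {
    "upper": lambda s: s.upper(),
    "reverse_slice": lambda s: s[::-1],
    "reverse_join": lambda s: "".join(reversed(s)),
    "first_char": lambda s: s[0],
}


def passing_candidates(test, found):
    """Run the test against each candidate and keep the names that pass."""
    passed = []
    for name, func in found.items():
        try:
            test(func)
        except Exception:
            continue  # failed assertion or runtime error: discard candidate
        passed.append(name)
    return passed


matches = passing_candidates(desired_feature_test, candidates)
```

The test plays two roles here: it describes the desired feature (step 1) and it decides which search results are worth integrating (step 2). Refactoring the chosen candidate (step 3) is left to the developer.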
  19. 26.

    TDCS: Weaving / Unweaving
    • The developer can explore the results by weaving them in, testing them and unweaving them
    • To do that, the code services side must provide a program slicing service, which supplies self-contained code pieces related to the desired feature, and a repository access service
  20. 28.

    Full Details
    • Eclipse
    • JUnit (at least one method)
    • Extract information:
      ◦ Extract the interface of the missing method, and the names of the missing method and of the class it belongs to, by analyzing compiler errors present in the test cases
      ◦ The AST is explored to extract the return type and argument types
    • Weaving:
      ◦ Merge-by-name strategy, as used in Hyper/J
      ◦ The merging is done by a union operation on the structures of the classes
      ◦ Uses Java annotations to track the woven structures
    • Sourcerer
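The weaving step above (merge by name, union of class structures, tracking of what was woven) can be sketched roughly as follows. This is a simplified model, not CodeGenie's implementation: classes are reduced to sets of member names, the `woven` record plays the role that Java annotations play in the tool, and the function names are assumptions of the sketch.

```python
def weave(project, code_slice):
    """Merge `code_slice` into `project` by class name.

    Both arguments map a class name to a set of member names.
    Returns the merged structure and a record of what was woven in.
    """
    merged = {cls: set(members) for cls, members in project.items()}
    woven = {}
    for cls, members in code_slice.items():
        is_new_class = cls not in merged
        target = merged.setdefault(cls, set())
        added = set(members) - target   # members with the same name are shared
        target |= added                 # union of the two class structures
        woven[cls] = (is_new_class, added)
    return merged, woven


def unweave(merged, woven):
    """Undo a weave by removing exactly the recorded classes and members."""
    restored = {}
    for cls, members in merged.items():
        is_new_class, added = woven.get(cls, (False, set()))
        if is_new_class:
            continue                    # class came entirely from the slice
        restored[cls] = members - added
    return restored


project = {"Calculator": {"add"}}
code_slice = {"Calculator": {"add", "multiply"}, "MathUtil": {"gcd"}}
merged, woven = weave(project, code_slice)
restored = unweave(merged, woven)
```

Recording the woven members separately is what makes unweaving possible: without it, a union cannot be reversed once the structures are merged.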
  21. 30.

    Case Study
    • Integrate into IDE
      ◦ The tool is integrated in the IDE
    • Searching
      ◦ Allows searching but only by use cases
      ◦ No option for filtering by any of the parameters
    • Run-able in IDE
      ◦ Allows running the code from the IDE using slices
    • Automatic Embedding
      ◦ Allows weaving, but it is not real code embedding
    • New & Existing Examples
      ◦ Doesn’t allow adding examples from the IDE
    • Repository Framework
      ◦ Only allows access to the code Sourcerer has access to (SourceForge.net)
    • Simple to Use
      ◦ Only searchable by using test cases
      ◦ Not intuitive
  22. 32.

    Using Structural Context to Recommend Examples
    • Strathcona tool (Eclipse plug-in)
    • Reid Holmes, Gail C. Murphy, “Using Structural Context to Recommend Source Code Examples”, 2005
    • Locating code in an example repository based on heuristically matching the structure of the code under development to the example code
      ◦ Structural context is extracted automatically
      ◦ The repository is extracted automatically from existing applications
  23. 33.

    Strathcona
    1. Populate the repository
    2. Structural context description is generated and sent to the server
    3. Heuristically match the structure description
    4. Return the ten best structural matches to the developer
  24. 36.

    Strathcona
    • Heuristics
      ◦ Inheritance Heuristic: same parents
      ◦ Calls Heuristic
        ▪ Basic Calls: call the same targets
        ▪ Calls Best Fit: best ratio of matched to unmatched call targets
        ▪ Calls with Inheritance: with at least one parent
      ◦ Uses Heuristic
        ▪ Basic Uses: use the same types
        ▪ Uses with Inheritance: with at least one parent
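A rough sketch of how such heuristics could rank repository examples against the code being edited is given below. It models only the inheritance, basic calls and basic uses heuristics; the `Context` type, the subset-based matching and the scoring (number of heuristics that match) are simplifying assumptions of this sketch, not a description of Strathcona's actual algorithm. The type and method names in the sample data are purely illustrative.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Context:
    """Structural context of one class (a simplified, hypothetical model)."""
    name: str
    parents: frozenset = field(default_factory=frozenset)
    calls: frozenset = field(default_factory=frozenset)
    uses: frozenset = field(default_factory=frozenset)


def inheritance_match(query, example):
    """Inheritance heuristic: the example has the same parents as the query."""
    return bool(query.parents) and query.parents == example.parents


def basic_calls_match(query, example):
    """Basic calls: the example calls every target the query calls."""
    return bool(query.calls) and query.calls <= example.calls


def basic_uses_match(query, example):
    """Basic uses: the example uses every type the query uses."""
    return bool(query.uses) and query.uses <= example.uses


HEURISTICS = (inheritance_match, basic_calls_match, basic_uses_match)


def recommend(query, repository, limit=10):
    """Return the names of the best matches, at most `limit` of them."""
    scored = []
    for example in repository:
        score = sum(h(query, example) for h in HEURISTICS)
        if score:
            scored.append((score, example))
    # Assumed ranking: examples matched by more heuristics come first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [example.name for _, example in scored[:limit]]


query = Context(
    "MyView",
    parents=frozenset({"ViewPart"}),
    calls=frozenset({"IStatusLineManager.setMessage"}),
    uses=frozenset({"IStatusLineManager"}),
)
repository = [
    Context("PlainView", parents=frozenset({"ViewPart"})),
    Context(
        "StatusView",
        parents=frozenset({"ViewPart"}),
        calls=frozenset({"IStatusLineManager.setMessage", "Viewer.refresh"}),
        uses=frozenset({"IStatusLineManager", "Viewer"}),
    ),
    Context("Unrelated", parents=frozenset({"Action"}),
            calls=frozenset({"Shell.open"}), uses=frozenset({"Shell"})),
]
top = recommend(query, repository)
```

The default `limit=10` mirrors the "ten best structural matches" step; the query itself is built from the code under development, so no keywords are needed.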
  25. 37.
  26. 38.

    Limitations
    • Search relies entirely on the structure of the code being edited
      ◦ The ability to return useful examples depends on the quality of the seed code used in the query
    • Searches only the extracted code in the repository
    • More limitations:
      ◦ Repository code must be parsable by the Eclipse compiler
      ◦ Repository code should represent good usage of the framework
  27. 39.

    Case Study
    • Integrate into IDE
      ◦ The tool is integrated in the IDE
    • Searching
      ◦ Allows searching but only by the code being edited
      ◦ No option for filtering by any of the parameters
    • Run-able in IDE
      ◦ Returns the source code of the example
    • Automatic Embedding
      ◦ Only returns the source code of the example, without embedding
    • New & Existing Examples
      ◦ Based only on the source code extracted into the repository
    • Repository Framework
      ◦ Supports mostly a local repository
      ◦ Requires improvements to support a public repository
    • Simple to Use
      ◦ Only searchable by using the edited code as a seed
      ◦ Not intuitive
  28. 41.

    StackOverflow
    • Stack Overflow is a collaboratively edited question and answer site for programmers
    • http://stackoverflow.com/
  29. 42.

    Case Study
    • Integrate into IDE
      ◦ The repository is only available using a web browser
    • Searching
      ◦ Allows keyword searching and supports most of the properties described earlier
    • Run-able in IDE
      ◦ Doesn’t allow running the code from the IDE
    • Automatic Embedding
      ◦ The developer has to manually insert the code into the project
    • New & Existing Examples
      ◦ The repository only supports adding new examples
      ◦ New examples are added by manually filling in a form
    • Repository Framework
      ◦ Only allows access to the code on the repository itself
    • Simple to Use
      ◦ The interface is very simple and intuitive, since it is similar to other popular search engines
  30. 45.

    Case Study
    • Integrate into IDE
      ◦ The repository is only available using a web browser
    • Searching
      ◦ Allows keyword searching, but the repository has only a small number of code examples
    • Run-able in IDE
      ◦ Doesn’t allow running the code from the IDE
    • Automatic Embedding
      ◦ The developer has to manually insert the code into the project
    • New & Existing Examples
      ◦ The repository only supports adding new examples
      ◦ New examples are added by manually filling in a form
    • Repository Framework
      ◦ Only allows access to the code on the repository itself
    • Simple to Use
      ◦ The interface is very simple and intuitive, since it is similar to other popular search engines