Class Name Recommendation based on Graph Embedding of Program Elements

kuri8ive
December 05, 2019

This is the presentation for the paper, "Class Name Recommendation based on Graph Embedding of Program Elements" at the 26th Asia-Pacific Software Engineering Conference (APSEC2019).
I'm the first author of this paper.


Transcript

  1.  APSEC2019 Day 3 – December 5th, 2019
    Class Name Recommendation based on Graph Embedding of Program Elements
    Shintaro Kurimoto, Yasuhiro Hayase, Hiroshi Yonai, Hiroyoshi Ito and Hiroyuki Kitagawa
  2.  Contents
    Introduction; Proposed approach; Experiments; Conclusion

  3.  Introduction

  4.  Introduction | Program Elements & Identifier Names
    Program Elements: program components representing properties or behaviors, such as classes, methods, and fields
    Identifier Names: names given to uniquely identify each program element
  5.  Introduction | Appropriate Identifier Naming
    Important:
    The quality of identifier names greatly affects program comprehension [Takang et al., 1996]
    Appropriate names enable developers to spend less time on program comprehension [Lawrie et al., 2006]
    Example names on the slide: UserAccountInfo, ClassABC
  6.  Introduction | Appropriate Identifier Naming
    Difficulties:
    Naming conventions lack the empirical knowledge necessary for good naming [Tichy, 1997]
    Appropriate identifier naming requires domain knowledge of the software [Deißenbock et al., 2005]
  7.  Introduction | Situation of Class Name Recommendation
    class Foo { fieldA method method }
    What is a good name for this class…?
    Recommended: 1. Libsystemd 2. SystemdPlugin ・・・
  8.  Introduction | Studies on Class Name Recommendation
    [Fukuda et al., 2015] recommended class names by association rule mining
      The content of a class is beneficial for recommending class names
      Limitation: low accuracy
    [Allamanis et al., 2015] recommended identifier names by embedding with Skip-gram
      Embedding is beneficial for recommending identifier names
      Limitation: recommendation is not available before a program element is used somewhere
  9.  Introduction | Goal
    Propose an approach that can recommend class names
    1. before a class is used
    2. with high accuracy
  10.  Introduction | Summary
    1. Appropriate identifier naming is important but difficult
    2. Previous approaches fall short either in the situations they cover or in accuracy
  11.  Proposed approach

  12.  Proposed | Key Idea
    [Yonai et al., 2019] (the proposed approach basically extends this work)
      Recommend a method name by embedding a method-call graph (slide example: connect, ?, write, newline)
      Recommendation is available before a method is used
    [Dong et al., 2017] (the extension is inspired by this work)
      Introduced heterogeneous graph embedding
      Higher accuracy than homogeneous graph embedding
    Proposed approach
      Embed a graph that represents the relationships between classes, methods, and fields
  13.  Proposed | Overview
    Train: (1) Extract relationships from the code corpus, (2) Learn embedding (model)
    Recommend: (3) Obtain embedding of a target class, (4) Recommend class names
  14.  Proposed | (1) Extract Relationships between Program Elements
    Node types: Class, Method, Field
    Relationships: type, return type, access, call, extend, possess, possess
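
The relationship labels on slide 14 can be held as typed edges in a small heterogeneous graph. The following is a minimal Python sketch, not the authors' implementation. The slide lists only the labels, so the endpoint types assigned to each label (for example, that `call` links a method to a method), the split of `possess` into two keys, and the names `Node`, `Edge`, `RELATION_SCHEMA`, `make_edge`, and `neighbors_by_relation` are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass

# Assumed mapping from relation label to (source kind, target kind).
# The slide gives only the labels; these endpoints are illustrative guesses.
RELATION_SCHEMA = {
    "extend": ("class", "class"),
    "possess_method": ("class", "method"),
    "possess_field": ("class", "field"),
    "call": ("method", "method"),
    "access": ("method", "field"),
    "return_type": ("method", "class"),
    "type": ("field", "class"),
}


@dataclass(frozen=True)
class Node:
    kind: str  # "class", "method", or "field"
    name: str


@dataclass(frozen=True)
class Edge:
    source: Node
    relation: str
    target: Node


def make_edge(source, relation, target):
    """Create an edge after checking it against the assumed schema."""
    expected = RELATION_SCHEMA[relation]
    if (source.kind, target.kind) != expected:
        raise ValueError(f"{relation} expects {expected}")
    return Edge(source, relation, target)


def neighbors_by_relation(edges):
    """Group each node's outgoing neighbors by relation label."""
    index = defaultdict(lambda: defaultdict(list))
    for edge in edges:
        index[edge.source][edge.relation].append(edge.target)
    return index


# Hypothetical program elements, named after the recommendation example slides.
provider = Node("class", "JdbcRecordSetProvider")
base = Node("class", "ConnectorRecordSetProvider")
get_record_set = Node("method", "getRecordSet")

edges = [
    make_edge(provider, "extend", base),
    make_edge(provider, "possess_method", get_record_set),
]
index = neighbors_by_relation(edges)
```

Keeping the relation label on each edge is what distinguishes this from a homogeneous graph, where every edge would be treated alike.
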
  15–22.  Proposed | (2) Learning Embedding (identical text on all eight slides)
    Move the embedding of a target class nearer to the weighted sum of the embeddings of the program elements that have a relationship with the target
    Legend: Target class, Class, Method, Field, Weighted sum of program elements related to the target class
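
The idea of moving a target class's embedding toward a weighted sum of related embeddings can be illustrated with a single interpolation step. This is only a sketch under stated assumptions: the slides do not give the loss function, the weights, or the learning rate, so the plain interpolation update, the `step` parameter, the weight values, and the function names below are all hypothetical and should not be read as the paper's training procedure.

```python
def weighted_sum(vectors, weights):
    """Return sum_i weights[i] * vectors[i] for equal-length vectors."""
    dim = len(vectors[0])
    total = [0.0] * dim
    for vec, w in zip(vectors, weights):
        for i in range(dim):
            total[i] += w * vec[i]
    return total


def move_toward(target, goal, step=0.5):
    """Move `target` a fraction `step` of the way toward `goal`."""
    return [t + step * (g - t) for t, g in zip(target, goal)]


def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Hypothetical 2-dimensional embeddings of elements related to the target
# class (for example one class, one method, and one field).
related = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = [0.5, 0.25, 0.25]  # assumed weights; the slides do not specify them
target = [-1.0, -1.0]

goal = weighted_sum(related, weights)
updated = move_toward(target, goal)
```

After the step, `updated` lies closer to `goal` than `target` did, which is the behavior the slide text describes.
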
  23.  Proposed | Novelty
    [Yonai et al., 2019]: considers 1 relationship (method-method only), using homogeneous embedding
    Proposed approach: considers 7 relationships among classes, methods, and fields, using heterogeneous embedding
  24.  Proposed | (3) Obtaining the Embedding of a Target Class
    1. Given code including a target class
    2. Obtain the embedding of the target class from its owned methods and fields by procedure (2)
    Legend: Class, Method, Field, Target class
  25.  Proposed | (4) Class Name Recommendation
    1. Calculate the cosine similarity between the target class and all candidate classes
    2. Recommend based on the similarity
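
Ranking candidates by cosine similarity, as in step (4), might look like the sketch below. The function names, the zero-vector convention, and the embedding values are assumptions for illustration; only the class names are taken from the example slides, and their vectors here are made up.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def recommend(target_vec, candidates, top_k=10):
    """Return up to top_k candidate names, most similar first."""
    ranked = sorted(
        candidates.items(),
        key=lambda item: cosine_similarity(target_vec, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:top_k]]


# Hypothetical candidate embeddings (the vectors are invented).
candidates = {
    "JdbcConnector": [0.9, 0.1],
    "TestingDatabase": [0.1, 0.9],
    "ViewResolutionTests": [-1.0, 0.0],
}
top = recommend([1.0, 0.0], candidates, top_k=2)
```

Cosine similarity ignores vector length, so candidates are ranked by direction in the embedding space rather than by magnitude.
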
  26.  Proposed | Summary
    1. Our approach considers more relationships among program elements
    2. The core of realizing it is applying heterogeneous graph embedding
  27.  Experiments

  28.  Experiments
    1. Recommendation before a class is used
    2. Recommendation after a class is used
    3. Where does the proposed approach work well?
  29.  Experiments | Dataset
    20 large Java projects, such as ElasticSearch and Clojure [Allamanis et al., 2015]
    The quality of identifier names is assured because many developers have maintained these projects for years
    Table columns: Nodes, Edges, Candidate Class, Target Class (values not preserved in the transcript)
  30.  Experiments | 1. Recommendation before a class is used
    Goal: evaluate whether the proposed approach recommends class names
      - before a class is used
      - with high accuracy
    Task: class name recommendation (before a class is used)
    Criterion: the ratio of recommended names that partially match the actual name within the top 10
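
One possible reading of this criterion is sketched below: a target counts as a success if any of its top-10 recommended names partially matches its actual name, and the reported value is the fraction of successful targets. The slides do not define "partial match", so treating it as "shares at least one camel-case word" is an assumption, as are the function names and the sample data.

```python
import re


def split_camel_case(name):
    """Split an identifier such as 'JdbcRecordSet' into lowercase words."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [part.lower() for part in parts]


def is_partial_match(recommended, actual):
    """Assumed definition: the two names share at least one word."""
    return bool(set(split_camel_case(recommended)) & set(split_camel_case(actual)))


def success_ratio(recommendations, actual_names, top_k=10):
    """Fraction of targets whose top_k list contains a partial match."""
    if not actual_names:
        return 0.0
    hits = 0
    for target, actual in actual_names.items():
        top = recommendations.get(target, [])[:top_k]
        if any(is_partial_match(name, actual) for name in top):
            hits += 1
    return hits / len(actual_names)


# Hypothetical recommendation lists keyed by made-up target identifiers.
recommendations = {
    "t1": ["JdbcConnector", "TestingDatabase"],
    "t2": ["ViewResolutionTests"],
}
actual_names = {
    "t1": "JdbcRecordSetProvider",
    "t2": "ConnectorColumnHandle",
}
ratio = success_ratio(recommendations, actual_names)
```

Here `t1` succeeds because `JdbcConnector` shares the word `jdbc` with the actual name, while `t2` has no shared word, so one of two targets succeeds.
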
  31.  Experiments | 1. Recommendation before a class is used
    [Chart: the ratio of successful recommendation]
    1.2x higher than [Fukuda et al., 2015] -> recommends before a class is used and with high accuracy
  32.  Experiments | 2. Recommendation after a class is used
    Goal: evaluate how well the proposed approach recommends class names after a class is used
    Task: class name recommendation (after a class is used)
    Criterion: the ratio of recommended names that partially match the actual name within the top 10
  33.  Experiments | 2. Recommendation after a class is used
    [Chart: the ratio of successful recommendation]
    1.7x higher than [Fukuda et al., 2015] -> higher accuracy after a class is used
  34–36.  Experiments | Recommendation example
    Target class (shown as ??? on slide 34, revealed as JdbcRecordSetProvider on slides 35 and 36), shown with: extends ConnectorRecordSetProvider, getRecordSet()
    Class names shown: JdbcConnector, JdbcRecordSinkProvider, TestJdbcRecordSet, JdbcMetadata, TestJdbcClient, TestingDatabase, TestJdbcMetadata, TestJdbcRecordSetProvider, ConnectorColumnHandle, ViewResolutionTests
  37.  Experiments | 3. Where does the proposed approach work well?
    Goal: evaluate where the proposed approach works well
    Task: class name recommendation (before and after a class is used)
    Criterion: the ratio of recommended names that partially match the actual name within the top 10
  38.  Experiments | 3. Where does the proposed approach work well?
    The proposed approach works well where a target class has relations with all types of program elements
  39.  Experiments | Summary
    1. Our approach can recommend class names before a class is used and with high accuracy
    2. Our approach works better (1) after a class is used and (2) where a class has relations with all types of program elements
  40.  Conclusion

  41.  Conclusion
    Goal: propose an approach that recommends class names (1) before a class is used and (2) with high accuracy
    Accomplishment: proposed a class name recommendation approach based on graph embedding of program elements
    Future work:
      - Step-by-step recommendation
      - Combining with another approach such as rule mining
    The code and notebook are available: https://github.com/kuri8ive/apsec2019class