Class Name Recommendation based on Graph Embedding of Program Elements

kuri8ive
December 05, 2019

This is the presentation for the paper, "Class Name Recommendation based on Graph Embedding of Program Elements" at the 26th Asia-Pacific Software Engineering Conference (APSEC2019).
I'm the first author of this paper.

Transcript

  1. APSEC2019 Day 3 – December 5th, 2019
    Class Name Recommendation based on
    Graph Embedding of Program Elements
    Shintaro Kurimoto, Yasuhiro Hayase, Hiroshi Yonai,
    Hiroyoshi Ito and Hiroyuki Kitagawa


  2. Contents
    • Introduction
    • Proposed approach
    • Experiments
    • Conclusion


  3. Introduction


  4. Introduction | Program Elements & Identifier Names
    Program Elements
    Program components representing properties or behaviors,
    such as classes, methods, and fields
    Identifier Names
    Names given to uniquely identify each program element


  5. Introduction | Appropriate Identifier Naming
    Important
    The quality of identifier names greatly affects
    program comprehension [Takang et al., 1996]
    Appropriate names enable developers to spend less time
    on program comprehension [Lawrie et al., 2006]
    Example names in the figure: UserAccountInfo, ClassABC


  6. Introduction | Appropriate Identifier Naming
    Difficulties
    Naming conventions lack the empirical knowledge
    necessary for good naming [Tichy, 1997]
    Appropriate identifier naming requires domain knowledge
    of the software [Deißenbock et al., 2005]


  7. Introduction | Situation of Class Name Recommendation
    class Foo {
      fieldA
      method
      method
    }
    What is a good name
    for this class…?
    Recommended
    1. Libsystemd
    2. SystemdPlugin


  8. Introduction | Studies on Class Name Recommendation
    [Fukuda et al., 2015]
    Recommended class names by association rule mining
    The content of a class is beneficial for recommending class names
    Low accuracy
    [Allamanis et al., 2015]
    Recommended identifier names by embedding with Skip-gram
    Embedding is beneficial for recommending identifier names
    Recommendation is not available
    before a program element is used somewhere


  9. Introduction | Goal
    Propose an approach
    that can recommend class names
    1. before a class is used
    2. with high accuracy


  10. Introduction | Summary
    1. Appropriate identifier naming is
    important but difficult
    2. Previous approaches fall short
    in either applicable situations or accuracy


  11. Proposed approach


  12. Proposed | Key Idea
    Embed a graph that represents the relationships
    between classes, methods, and fields
    Basically extends this work: [Yonai et al., 2019]
    Recommends a method name by embedding a method-call graph
    (figure nodes: connect, ?, write, newline)
    Recommendation is available before a method is used
    Extension inspired by this work: [Dong et al., 2017]
    Introduced heterogeneous graph embedding
    Higher accuracy than homogeneous embedding

  13. Proposed | Overview
    Train
    (1) Extract relationships
    (2) Learn embedding
    Recommend
    (3) Obtain embedding of a target class
    (4) Recommend class names
    Figure labels: Code Corpus, Model, Recommended


  14. Proposed | (1) Extract Relationships between Program Elements
    Nodes: Class, Method, Field
    Edge labels: type, return type, access, call, extend, possess, possess
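A minimal sketch of how the extracted relationships could be stored as a heterogeneous graph. The slide gives only the node types and edge labels; which element types each label connects is an assumption here, and `Foo`, `Bar`, `fieldA`, `methodA`, and `methodB` are hypothetical names for illustration.

```python
from collections import defaultdict

# Hypothetical typed edges: (source, relation, target).
# Node ids carry their element type as a prefix. The endpoint types chosen
# for each relation are assumptions, not taken from the slides.
EDGES = [
    ("class:Foo", "extend", "class:Bar"),
    ("class:Foo", "possess", "method:methodA"),
    ("class:Foo", "possess", "field:fieldA"),
    ("method:methodA", "call", "method:methodB"),
    ("method:methodA", "access", "field:fieldA"),
    ("method:methodA", "return type", "class:Bar"),
    ("field:fieldA", "type", "class:Bar"),
]


def node_type(node):
    """Return the element type ('class', 'method', or 'field') of a node id."""
    return node.split(":", 1)[0]


def build_graph(edges):
    """Build an adjacency list: node -> list of (relation, neighbor)."""
    graph = defaultdict(list)
    for source, relation, target in edges:
        graph[source].append((relation, target))
    return dict(graph)


def relation_kinds(edges):
    """Collect the distinct (source type, relation, target type) triples."""
    return {(node_type(s), r, node_type(t)) for s, r, t in edges}


graph = build_graph(EDGES)
kinds = relation_kinds(EDGES)
print(len(kinds), "relationship kinds")
for relation, neighbor in graph["class:Foo"]:
    print("class:Foo", relation, neighbor)
```

Keeping the element type on every node is what makes the graph heterogeneous: the two `possess` edges are distinguished only by whether they end at a method or a field.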


  15. Proposed | (2) Learning Embedding
    Move the embedding of a target class nearer
    to the weighted sum of the embeddings of program elements
    that have a relationship with the target
    Legend: Target class, Class, Method, Field,
    weighted sum of program elements related to the target class
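The update described on this slide can be illustrated geometrically. The sketch below is not the training objective from the paper, which the slides do not state; it only shows one step that pulls a target vector toward a weighted sum of related vectors. The weights, the learning rate, and the two-dimensional vectors are assumptions for illustration.

```python
def weighted_sum(vectors, weights):
    """Weighted sum of equally sized vectors."""
    dim = len(vectors[0])
    return [sum(w * v[i] for v, w in zip(vectors, weights)) for i in range(dim)]


def move_toward(target, context, learning_rate):
    """Move `target` a fraction `learning_rate` of the way toward `context`."""
    return [t + learning_rate * (c - t) for t, c in zip(target, context)]


def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Hypothetical embeddings of elements related to the target class.
related = [[1.0, 0.0], [0.0, 1.0]]
weights = [0.5, 0.5]        # assumed relation weights
target = [0.0, 0.0]         # assumed initial target embedding

context = weighted_sum(related, weights)
updated = move_toward(target, context, learning_rate=0.1)

print("before:", squared_distance(target, context))
print("after: ", squared_distance(updated, context))
```

Repeating the step shrinks the distance further, which matches the sequence of frames the slide animates.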


  23. Proposed | Novelty
    [Yonai et al., 2019]
    Considers 1 relationship
    (method-method only)
    with homogeneous embedding
    Proposed approach
    Considers 7 relationships
    among classes, methods, and fields
    with heterogeneous embedding


  24. Proposed | (3) Obtaining the Embedding of a Target Class
    1. Given code that includes a target class
    2. Obtain the embedding of the target class
    from its owned methods and fields by procedure (2)
    Legend: Target class, Class, Method, Field
    Figure label: Code including a target class


  25. Proposed | (4) Class Name Recommendation
    1. Calculate the cosine similarity between the target class
    and all candidate classes
    2. Recommend names based on the similarity
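The two steps above can be sketched as a similarity ranking. The candidate names are taken from the recommendation example later in the deck, but every vector below is made up for illustration, and `recommend` is a hypothetical helper rather than code from the authors' repository.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(target_vec, candidates, top_k=10):
    """Return up to `top_k` candidate names, most similar to the target first."""
    ranked = sorted(
        candidates.items(),
        key=lambda item: cosine_similarity(target_vec, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:top_k]]


# Made-up 2-dimensional embeddings, for illustration only.
candidates = {
    "TestingDatabase": [0.1, 0.9],
    "JdbcMetadata": [0.7, 0.4],
    "JdbcRecordSetProvider": [0.9, 0.1],
}
target_vec = [1.0, 0.2]  # assumed embedding of the unnamed target class

print(recommend(target_vec, candidates, top_k=2))
```

Cosine similarity ignores vector length, so a candidate is ranked by the direction of its embedding rather than by how many relationships contributed to it.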


  26. Proposed | Summary
    1. Our approach considers
    more relationships between program elements
    2. The key to realizing this is
    heterogeneous graph embedding


  27. Experiments
    1. Recommendation before a class is used
    2. Recommendation after a class is used
    3. Where does the proposed approach work well?



  28. Experiments | Dataset
    20 large Java projects, such as ElasticSearch and Clojure
    [Allamanis et al., 2015]
    The quality of identifier names is assured
    because many developers have maintained the projects for years
    Table columns: [Edges], [Candidate Class], [Target Class]


  29. Experiments | 1. Recommendation before a class is used
    Goal
    Evaluate whether the proposed approach recommends…
    • before a class is used
    • with high accuracy
    Task
    Class name recommendation (before a class is used)
    Criterion
    The ratio of recommended names within the top 10
    that partially match the actual name
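One possible reading of this criterion, sketched under stated assumptions: a recommendation counts as successful when at least one of its top-10 names shares a camel-case subtoken with the actual class name, and the reported value is the fraction of successful recommendations. The slides do not define "partial match", so the subtoken rule and the helper names below are assumptions, and the sample results are invented.

```python
import re


def subtokens(name):
    """Split a camel-case identifier into lowercase subtokens."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return {part.lower() for part in parts}


def partially_matches(recommended, actual):
    """Assumed rule: the two names share at least one subtoken."""
    return bool(subtokens(recommended) & subtokens(actual))


def success_ratio(results, top_k=10):
    """Fraction of targets with a partial match among their top_k names.

    `results` is a list of (actual_name, ranked_recommendations) pairs.
    """
    if not results:
        return 0.0
    hits = sum(
        1
        for actual, ranked in results
        if any(partially_matches(name, actual) for name in ranked[:top_k])
    )
    return hits / len(results)


# Invented evaluation results, for illustration only.
results = [
    ("JdbcRecordSetProvider", ["JdbcConnector", "TestingDatabase"]),
    ("Foo", ["Bar", "Baz"]),
]
print(success_ratio(results))
```

A stricter rule, such as requiring an exact match, would slot into `partially_matches` without changing the rest of the computation.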


  30. Experiments | 1. Recommendation before a class is used
    Chart: the ratio of successful recommendations
    1.2x higher than [Fukuda et al., 2015]
    -> recommends before a class is used and with high accuracy


  31. Experiments | 2. Recommendation after a class is used
    Goal
    Evaluate how well the proposed approach recommends…
    • after a class is used
    Task
    Class name recommendation (after a class is used)
    Criterion
    The ratio of recommended names within the top 10
    that partially match the actual name


  32. Experiments | 2. Recommendation after a class is used
    Chart: the ratio of successful recommendations
    1.7x higher than [Fukuda et al., 2015]
    -> higher accuracy after a class is used


  33. Experiments | Recommendation example
    JdbcConnector
    JdbcRecordSinkProvider
    TestJdbcRecordSet
    JdbcMetadata
    TestJdbcClient
    TestingDatabase
    TestJdbcMetadata
    TestJdbcRecordSetProvider
    ConnectorColumnHandle
    ViewResolutionTests
    ??? extends ConnectorRecordSetProvider
    getRecordSet()


  34. Experiments | Recommendation example
    JdbcConnector
    JdbcRecordSinkProvider
    TestJdbcRecordSet
    JdbcMetadata
    TestJdbcClient
    TestingDatabase
    TestJdbcMetadata
    TestJdbcRecordSetProvider
    ConnectorColumnHandle
    ViewResolutionTests
    JdbcRecordSetProvider extends ConnectorRecordSetProvider
    getRecordSet()


  36. Experiments | 3. Where does the proposed approach work well?
    Goal
    Evaluate
    • where the proposed approach works well
    Task
    Class name recommendation
    (before & after a class is used)
    Criterion
    The ratio of recommended names within the top 10
    that partially match the actual name


  37. Experiments | 3. Where does the proposed approach work well?
    The proposed approach works well
    when a target class has relations with
    all types of program elements


  38. Experiments | Summary
    1. Our approach can recommend
    before a class is used and with high accuracy
    2. Our approach would work better
    (1) after a class is used
    (2) where a class has relations
    with all types of program elements



  39. Conclusion
    Goal
    Propose an approach that recommends class names
    (1) before a class is used (2) with high accuracy
    Accomplishment
    Proposed a class name recommendation approach
    based on graph embedding of program elements
    Future work
    • Step-by-step recommendation
    • Combining with another approach such as rule mining
    The code and notebook are available: https://github.com/kuri8ive/apsec2019class