NLP + SE = ❤️

We present an overview of the current opportunities and challenges of applying NLP techniques to solve software engineering problems.

Georgios Gousios

October 24, 2019

Transcript

  1. NLP + SE = ❤
    Georgios Gousios

    TU Delft

  2. function setTimeout(callBack, delay) {...};
    browserSingleton.startPoller(100,
      function(delay, fn) {
        setTimeout(delay, fn);
      }
    );
    Can you spot the bug?

  5. Finding bugs
    function setTimeout(callBack, delay) {...};
    browserSingleton.startPoller(100,
      function(delay, fn) {
        setTimeout(delay, fn);
      });
    Callout (x2): Name that denotes a function
    Callout (x2): Order of application

  6. “Source code is bimodal: it combines a formal,
    algorithmic channel and a natural language channel
    of identifiers and comments.”
    Earl Barr, UCL

  9. Finding bugs
    function setTimeout(callBack, delay) {...};
    browserSingleton.startPoller(100,
      function(delay, fn) {
        setTimeout(delay, fn);
      });
    Callout (x2): Natural language channel
    Callout (x2): Code semantics channel

  10. Static Analysis
    • Static analysis only captures the semantics channel

    • Bug detection and other forms of static analysis are pattern
    matching on increasingly precise semantics

    • Most static bug detectors find a subset of bugs (Habib and
    Pradel, ASE 2018)

    • Humans need to identify the patterns

    • As the semantics relax, static analysis becomes unsound

    • Almost impossible for dynamic languages (“stringly typed”)

  12. function setTimeout(callBack: a -> b, delay: int) {…};
    browserSingleton.startPoller(100,
      function(delay, fn) {
        setTimeout(delay, fn);
      }
    );
    The compiler can only help if
    humans add semantic information
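
To make the point concrete, here is a minimal Python sketch of what annotations buy you. `set_timeout` and `check_call` are hypothetical names invented for this illustration, and `check_call` is only a runtime stand-in for what a static checker would do before the program runs.

```python
import collections.abc
from typing import Callable, get_origin, get_type_hints


def set_timeout(callback: Callable[[], None], delay: int) -> None:
    """Annotated stand-in for the slide's setTimeout (hypothetical)."""


def check_call(func, *args):
    """Return mismatches between positional arguments and annotations."""
    hints = get_type_hints(func)
    hints.pop("return", None)  # only parameters are checked here
    problems = []
    for (name, hint), value in zip(hints.items(), args):
        if get_origin(hint) is collections.abc.Callable:
            ok = callable(value)
        else:
            ok = isinstance(value, hint)
        if not ok:
            problems.append(
                f"{name}: expected {hint}, got {type(value).__name__}"
            )
    return problems


def fn():
    pass


swapped = check_call(set_timeout, 100, fn)   # arguments in the wrong order
correct = check_call(set_timeout, fn, 100)   # arguments in the right order
```

With the annotations in place the swapped call is reported for both parameters. Without them, the only remaining hint is the natural language in the names.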

  13. How can NLP help?
    NLP approaches to software analysis aim to exploit the natural
    language information channel to help with tasks such as:

    • Bug finding

    • Type annotations

    • Inconsistencies

    • Source code summarisation

    • …

  14. The Naturalness hypothesis
    “Software is a form of human communication; software
    corpora have similar statistical properties to natural
    language corpora; and these properties can be exploited
    to build better software engineering tools.”
    Hindle et al. On the naturalness of software. ICSE 2012

  16. Naturalness showcased
    Hindle et al. On the naturalness of software. ICSE 2012
    Code n-grams are less “surprising”
    to a language model than English
    We can train language models to
    predict next tokens better in code
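
A rough sketch of how "surprise" can be measured: train an n-gram model and compute the cross-entropy (average bits per token) it assigns to a token sequence. The bigram order, the add-one smoothing, and the toy token stream below are assumptions made for illustration, not the setup used by Hindle et al.

```python
import math
from collections import Counter


def train_bigram(tokens):
    """Count contexts and bigrams from a flat list of tokens."""
    contexts = Counter(tokens[:-1])
    bigrams = Counter(zip(tokens, tokens[1:]))
    vocab = set(tokens)
    return contexts, bigrams, vocab


def cross_entropy(model, tokens):
    """Average bits per token under an add-one smoothed bigram model."""
    contexts, bigrams, vocab = model
    v = len(vocab) + 1  # one extra slot reserves mass for unseen tokens
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        p = (bigrams[(prev, cur)] + 1) / (contexts[prev] + v)
        total -= math.log2(p)
    return total / (len(tokens) - 1)


# Toy "corpus": one repetitive loop header, repeated ten times.
line = "for i in range ( n ) :".split()
model = train_bigram(line * 10)

familiar = cross_entropy(model, line)                   # order seen in training
unfamiliar = cross_entropy(model, list(reversed(line)))  # order never seen
```

The familiar sequence gets a lower cross-entropy than the reversed one. The same measure, applied to real corpora, is what lets code and English be compared.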

  17. Finding bugs
    Pradel and Sen. DeepBugs: A Learning Approach
    to Name-based Bug Detection. OOPSLA 2018
    Training models to distinguish correct from buggy code
    Figure labels: Buggy code, Correct code (inputs); Buggy, Correct (predictions)

  18. Finding bugs
    Pradel and Sen. DeepBugs: A Learning Approach
    to Name-based Bug Detection. OOPSLA 2018
    How to produce buggy code?
    • Swap function arguments

    foo(a, b) -> foo(b, a)
    • Replace binary operators

    i <= length -> i % length
    • Replace binary operand

    i <= length -> i <= foo
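
The first mutation, swapping function arguments, can be sketched with Python's `ast` module. This is only an analogue written for illustration: DeepBugs itself targets JavaScript, and the class and function names below are hypothetical. `ast.unparse` requires Python 3.9 or later.

```python
import ast


class SwapCallArguments(ast.NodeTransformer):
    """Swap the arguments of every call that has exactly two positional ones."""

    def visit_Call(self, node):
        self.generic_visit(node)  # mutate nested calls first
        if len(node.args) == 2 and not node.keywords:
            node.args = [node.args[1], node.args[0]]
        return node


def make_buggy(source: str) -> str:
    """Return a copy of `source` with swapped call arguments."""
    tree = SwapCallArguments().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


correct = "foo(a, b)"
buggy = make_buggy(correct)

# Label 0 = presumed correct, label 1 = artificially seeded bug.
training_pairs = [(correct, 0), (buggy, 1)]
```

Each original call and its mutated twin form one negative and one positive training example, so no hand-labelled bug dataset is needed.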

  19. Finding bugs
    Pradel and Sen. DeepBugs: A Learning Approach
    to Name-based Bug Detection. OOPSLA 2018
    Training on 150k JavaScript files
    Bug pattern              Accuracy
    Swapped arguments        94%
    Wrong binary operator    92%
    Wrong binary operand     89%

  21. Predicting types
    def bigger_number(a, b):
        if a > b:
            return a
        else:
            return b
    Python 2.7 code
    No Python 2.7 after 2019!

  23. Predicting types
    def bigger_number(a: ???, b: ???) -> ???:
        if a > b:
            return a
        else:
            return b
    Python 3.5+ code. Can you guess the types?
    How can we automatically annotate JavaScript/Python code with types?

  31. Predicting types
    def is_bigger(a: int, b: int) -> bool:
        """
        Returns True if number a is bigger than b, else False
        """
        return a > b
    Learning from existing code annotations
    Figure labels: embedding, sequence learning, concat (+), prediction
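
Learning from existing annotations starts with harvesting them. The sketch below shows one way to pull the natural-language inputs (names, docstring) and the type labels out of annotated Python source. The record layout and the name `extract_examples` are assumptions for illustration, and the sample function uses Python's built-in `bool` as its return type.

```python
import ast

SOURCE = '''
def is_bigger(a: int, b: int) -> bool:
    """
    Returns True if number a is bigger than b, else False
    """
    return a > b
'''


def extract_examples(source: str):
    """Collect one training record per function with a return annotation."""
    examples = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef) and node.returns is not None:
            examples.append({
                "name": node.name,
                "docstring": ast.get_docstring(node) or "",
                "arg_types": {
                    arg.arg: ast.unparse(arg.annotation)
                    for arg in node.args.args
                    if arg.annotation is not None
                },
                "return_type": ast.unparse(node.returns),
            })
    return examples


examples = extract_examples(SOURCE)
```

The name and docstring would feed the embedding and sequence-learning stages, while `arg_types` and `return_type` serve as the labels to predict.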

  32. Predicting types
    Results on 500 partially annotated GitHub projects
                     Top-1                 Top-3
                     precision   recall    precision   recall
    return type      .67         .62       .76         .70
    argument type    .61         .57       .77         .70
    combined         .65         .59       .80         .71

  38. Finding inconsistencies
    How to find inconsistent function/variable names?
    Liu et al. Learning to Spot and Refactor
    Inconsistent Method Names. ICSE 2019
    Methods with similar names should have similar bodies

  40. Finding inconsistencies
    Liu et al. Learning to Spot and Refactor
    Inconsistent Method Names. ICSE 2019
    1. Build embeddings of function names and body vectors
    2. For each function body:
       a. Find functions close to it in vector space
       b. Check their respective name distance
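
The two steps above can be sketched as follows. The vectors are hand-made toy embeddings, and the flagging rule (no neighbour with name similarity above a threshold) is an assumption for illustration, not the exact procedure of Liu et al.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def suspicious_names(functions, k=2, threshold=0.5):
    """functions maps name -> (name_vector, body_vector).

    A name is flagged when none of the k functions with the most similar
    bodies has a similar name.
    """
    flagged = []
    for name, (name_vec, body_vec) in functions.items():
        others = [n for n in functions if n != name]
        nearest = sorted(
            others,
            key=lambda n: cosine(body_vec, functions[n][1]),
            reverse=True,
        )[:k]
        name_sims = [cosine(name_vec, functions[n][0]) for n in nearest]
        if name_sims and max(name_sims) < threshold:
            flagged.append(name)
    return flagged


# Toy embeddings: "remove" has an add-like body but an unrelated name.
functions = {
    "add": ([1.0, 0.0], [1.0, 0.0]),
    "sum_values": ([0.9, 0.1], [0.95, 0.05]),
    "plus": ([0.8, 0.2], [0.9, 0.1]),
    "remove": ([0.0, 1.0], [0.98, 0.02]),
}
flagged = suspicious_names(functions)
```

In a real system both embedding spaces would be learned from a large corpus. The neighbours' names then double as refactoring suggestions for the flagged method.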

  43. Code summarization
    Wan et al. Improving automatic source code
    summarization via deep reinforcement learning. ASE 2018
    def add(a, b):
        return a + b
    """
    Adds two numbers
    """
    def ???(a, b):
        return a + b
    add

  45. Code summarization
    Wan et al. Improving automatic source code
    summarization via deep reinforcement learning. ASE 2018
    Figure labels: NL channel, Semantics channel
    Use critic network to re-adjust model weights
    BLEU score of 0.35
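
For readers unfamiliar with the metric, BLEU compares a generated summary with a reference using clipped n-gram precision and a brevity penalty. The sketch below is a simplified sentence-level version limited to bigrams (`max_n=2` is an assumption to keep the toy example non-zero). Reported BLEU scores are usually computed over a whole corpus with up to 4-grams, so this will not reproduce the figure on the slide.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=2):
    """Simplified sentence-level BLEU against a single reference."""
    log_precisions = []
    for n in range(1, max_n + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
        total = sum(cand.values())
        if overlap == 0 or total == 0:
            return 0.0
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty punishes candidates shorter than the reference.
    if len(candidate) > len(reference):
        bp = 1.0
    else:
        bp = math.exp(1 - len(reference) / len(candidate))
    return bp * math.exp(sum(log_precisions) / max_n)


reference = "adds two numbers".split()
exact = bleu("adds two numbers".split(), reference)
close = bleu("adds two values".split(), reference)
unrelated = bleu("foo".split(), reference)
```

An exact match scores 1, a summary sharing no n-grams with the reference scores 0, and near misses land in between.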

  47. Main challenges
    • Developers like to invent names

    • Code vocabularies are 10x the size of NL ones

    • Compression techniques (e.g. BPE) to the rescue

    • How to feed code to a network without losing info from either the
    NL or the semantics channel?

    • Code2Vec, TreeLSTMs, GGNNs,…

    • Keeping up with evolution

    • Making tools — not just research papers
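
As an illustration of the vocabulary point, byte-pair encoding (BPE) repeatedly merges the most frequent adjacent symbol pair, so invented identifiers decompose into a bounded set of reusable subword units. The following is a minimal character-level sketch. The identifier list and the function name `learn_bpe` are made up for the example.

```python
from collections import Counter


def learn_bpe(words, num_merges):
    """Learn up to `num_merges` BPE merges from a list of identifiers."""
    corpus = Counter(tuple(word) for word in words)  # start from characters
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols, freq in corpus.items():
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        new_corpus = Counter()
        for symbols, freq in corpus.items():
            merged, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_corpus[tuple(merged)] += freq
        corpus = new_corpus
    return merges, corpus


identifiers = ["getName", "getValue", "setName", "setValue"]
merges, segmented = learn_bpe(identifiers, num_merges=3)
```

After a few merges, frequent fragments become single symbols while rare identifiers remain representable as sequences of smaller pieces.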

  48. ML4SE course!
    • Student presentations of course projects, including poster
    sessions

    • Oct 30, 13:45 - 17:00, Pulse-Hall 7
