
NLP + SE = ❤️


We present an overview of the current opportunities and challenges of applying NLP techniques to solve software engineering problems.

Georgios Gousios

October 24, 2019


Transcript

  1. Finding bugs
     function setTimeout(callBack, delay) {...};
     browserSingleton.startPoller(100, function(delay, fn) { setTimeout(delay, fn); });
     Annotations: “Name that denotes a function” (×2)
  2. Finding bugs
     function setTimeout(callBack, delay) {...};
     browserSingleton.startPoller(100, function(delay, fn) { setTimeout(delay, fn); });
     Annotations: “Name that denotes a function” (×2), “Order of application” (×2)
  3. “Source code is bimodal: it combines a formal, algorithmic channel and a natural language channel of identifiers and comments.” (Earl Barr, UCL)
  4. Finding bugs
     function setTimeout(callBack, delay) {...};
     browserSingleton.startPoller(100, function(delay, fn) { setTimeout(delay, fn); });
     Annotations: “Natural language channel” (×2)
  5. Finding bugs
     function setTimeout(callBack, delay) {...};
     browserSingleton.startPoller(100, function(delay, fn) { setTimeout(delay, fn); });
     Annotations: “Natural language channel” (×2), “Code semantics channel” (×2)
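To make the “natural language channel” idea concrete, here is a toy, purely lexical check (not the technique of any paper cited in this deck): compare the argument names at a call site against the declared parameter names, and flag the call if swapping the first two arguments would match the parameter names noticeably better. The helper names and the 0.2 margin are invented for this sketch.

    # Toy name-based check: flag call sites whose argument names look
    # "swapped" relative to the declared parameter names.
    from difflib import SequenceMatcher

    def name_similarity(a: str, b: str) -> float:
        """Cheap lexical similarity between two identifiers (0..1)."""
        return SequenceMatcher(None, a.lower(), b.lower()).ratio()

    def looks_swapped(param_names: list, arg_names: list, margin: float = 0.2) -> bool:
        """True if reversing the first two arguments matches the declared
        parameter names noticeably better than the current order."""
        if len(param_names) < 2 or len(arg_names) < 2:
            return False
        as_is = (name_similarity(param_names[0], arg_names[0])
                 + name_similarity(param_names[1], arg_names[1]))
        swapped = (name_similarity(param_names[0], arg_names[1])
                   + name_similarity(param_names[1], arg_names[0]))
        return swapped > as_is + margin

    # The setTimeout example above: parameters (callBack, delay), call site
    # passes (delay, fn); "delay" matches the second slot far better.
    print(looks_swapped(["callBack", "delay"], ["delay", "fn"]))  # True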
  6. Static Analysis
     • Static analysis only captures the semantics channel
     • Bug detection and other forms of static analysis are pattern matching on increasingly precise semantics
     • Most static bug detectors find a subset of bugs (Habib and Pradel, ASE 2018)
     • Humans need to identify the patterns
     • As the semantics relax, static analysis becomes unsound
     • Almost impossible for dynamic languages (“stringly typed”)
  7. function setTimeout(callBack: a -> b, delay: int) {…};
     browserSingleton.startPoller(100, function(delay, fn) { setTimeout(delay, fn); });
     The compiler can only help if humans add semantic information
  8. How can NLP help?
     NLP approaches to software analysis aim to exploit the natural language information channel to help with tasks such as:
     • Bug finding
     • Type annotations
     • Inconsistencies
     • Source code summarisation
     • …
  9. The Naturalness hypothesis
     “Software is a form of human communication; software corpora have similar statistical properties to natural language corpora; and these properties can be exploited to build better software engineering tools.”
     Hindle et al. On the naturalness of software. ICSE 2012
  10. Naturalness showcased
      Hindle et al. On the naturalness of software. ICSE 2012
      Code n-grams are less “surprising” to a language model than English
  11. Naturalness showcased
      Hindle et al. On the naturalness of software. ICSE 2012
      Code n-grams are less “surprising” to a language model than English
      We can train language models to predict next tokens better in code
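What “surprising” means here, in a nutshell: train an n-gram language model and measure the average number of bits needed to predict the next token (cross-entropy); repetitive, locally predictable token streams score lower. The bigram model, add-one smoothing, and the two toy token streams below are invented for illustration and are not the corpora or models of Hindle et al.

    # Minimal bigram language model with add-one smoothing, to illustrate
    # "surprisal" (per-token cross-entropy) over token streams.
    import math
    from collections import Counter

    def train_bigram(tokens):
        unigrams = Counter(tokens)
        bigrams = Counter(zip(tokens, tokens[1:]))
        vocab = len(unigrams)
        def prob(prev, cur):
            # Add-one (Laplace) smoothed P(cur | prev)
            return (bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab)
        return prob

    def cross_entropy(prob, tokens):
        """Average negative log2 probability per token (lower = less surprising)."""
        logs = [-math.log2(prob(p, c)) for p, c in zip(tokens, tokens[1:])]
        return sum(logs) / len(logs)

    # Repetitive "code-like" token stream vs. a more varied "English-like" one.
    code = ("for i in range ( n ) : total = total + x [ i ] ; "
            "for j in range ( n ) : total = total + y [ j ] ;").split()
    english = ("the quick brown fox jumps over the lazy dog while rain "
               "falls softly on ancient quiet rooftops nearby").split()

    print(cross_entropy(train_bigram(code), code))        # lower entropy
    print(cross_entropy(train_bigram(english), english))  # higher entropy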
  12. Finding bugs
      Pradel and Shen. DeepBugs: A Learning Approach to Name-based Bug Detection. OOPSLA 2018
      Training models to distinguish correct from buggy code
      [Diagram labels: “Buggy code”, “Correct code”, “Buggy”, “Correct”]
  13. Finding bugs
      Pradel and Shen. DeepBugs: A Learning Approach to Name-based Bug Detection. OOPSLA 2018
      How to produce buggy code?
      • Swap function arguments: foo(a, b) -> foo(b, a) (see the sketch below)
      • Replace binary operators: i <= length -> i % length
      • Replace binary operands: i <= length -> i <= foo
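A rough sketch of the first mutation (swapped arguments), transposed to Python ASTs purely for illustration; DeepBugs itself mutates JavaScript and learns over embeddings of the involved identifier names.

    # Sketch of DeepBugs-style negative-example generation (illustrative only).
    # For every call with at least two arguments, emit a "buggy" variant with
    # the first two arguments swapped; the original call stays "correct".
    # Note: ast.unparse needs Python 3.9+.
    import ast
    import copy

    def make_swapped_variants(source: str):
        tree = ast.parse(source)
        examples = []  # (label, code) pairs
        for node in ast.walk(tree):
            if isinstance(node, ast.Call) and len(node.args) >= 2:
                buggy_call = copy.deepcopy(node)
                buggy_call.args[0], buggy_call.args[1] = buggy_call.args[1], buggy_call.args[0]
                examples.append(("correct", ast.unparse(node)))
                examples.append(("buggy", ast.unparse(buggy_call)))
        return examples

    code = "set_timeout(tick, delay)\nstart_poller(100, poll)\n"
    for label, snippet in make_swapped_variants(code):
        print(label, snippet)
    # correct set_timeout(tick, delay)
    # buggy   set_timeout(delay, tick)
    # correct start_poller(100, poll)
    # buggy   start_poller(poll, 100)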
  14. Finding bugs
      Pradel and Shen. DeepBugs: A Learning Approach to Name-based Bug Detection. OOPSLA 2018
      Training on 150k JavaScript files
      Accuracy: swapped arguments 94%, wrong binary operator 92%, wrong binary operand 89%
  15. Predicting types
      def bigger_number(a, b):
          if a > b:
              return a
          else:
              return b
      Python 2.7 code. No Python 2.7 after 2019!
  16. Predicting types
      def bigger_number(a: ???, b: ???) -> ???:
          if a > b:
              return a
          else:
              return b
      Python 3.5+ code. Can you guess the types?
  17. Predicting types
      def bigger_number(a: ???, b: ???) -> ???:
          if a > b:
              return a
          else:
              return b
      Python 3.5+ code. Can you guess the types?
      How can we automatically annotate JavaScript/Python code with types?
  18. Predicting types
      def is_bigger(a: int, b: int) -> bool:
          """Returns True if number a is bigger than b, else False"""
          return a > b
      Learning from existing code annotations
  19.–20. Predicting types
      (same code as slide 18)
      Learning from existing code annotations: embedding
  21.–23. Predicting types
      (same code as slide 18)
      Learning from existing code annotations: embedding, sequence learning
  24. Predicting types
      (same code as slide 18)
      Learning from existing code annotations: embedding, sequence learning, concat (+)
  25. Predicting types
      (same code as slide 18)
      Learning from existing code annotations: embedding, sequence learning, concat (+), prediction
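Putting the annotated build-up of slides 18–25 together: embed the natural-language inputs (identifier tokens, docstring tokens), summarise each with a sequence learner, concatenate the two vectors, and predict a type. Below is a minimal Keras-style sketch of that shape; every size, vocabulary, and layer choice is invented, and this is not the authors' implementation.

    # Sketch of the slide 18-25 pipeline: embed parameter-name tokens and
    # docstring tokens, run each through an RNN, concatenate, and predict a
    # type from a fixed type vocabulary.
    from tensorflow import keras
    from tensorflow.keras import layers

    NAME_VOCAB, DOC_VOCAB, TYPE_VOCAB = 10_000, 20_000, 500  # assumed vocab sizes
    NAME_LEN, DOC_LEN = 8, 40                                 # assumed max lengths

    name_in = keras.Input(shape=(NAME_LEN,), name="identifier_tokens")
    doc_in = keras.Input(shape=(DOC_LEN,), name="docstring_tokens")

    # Embedding: map each token id to a dense vector
    name_emb = layers.Embedding(NAME_VOCAB, 64, mask_zero=True)(name_in)
    doc_emb = layers.Embedding(DOC_VOCAB, 64, mask_zero=True)(doc_in)

    # Sequence learning: summarise each token sequence with an LSTM
    name_vec = layers.LSTM(128)(name_emb)
    doc_vec = layers.LSTM(128)(doc_emb)

    # Concat + prediction: join both views and classify into a type vocabulary
    merged = layers.Concatenate()([name_vec, doc_vec])
    type_out = layers.Dense(TYPE_VOCAB, activation="softmax", name="predicted_type")(merged)

    model = keras.Model(inputs=[name_in, doc_in], outputs=type_out)
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    model.summary()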
  26. Predicting types
      Results on 500 partially annotated GitHub projects

                          return type   argument type   combined
      Top-1   precision       .67            .61           .65
              recall          .62            .57           .59
      Top-3   precision       .76            .77           .80
              recall          .70            .70           .71
  27.–31. Finding inconsistencies
      How to find inconsistent function/variable names?
      Liu et al. Learning to Spot and Refactor Inconsistent Method Names. ICSE 2019
  32. Finding inconsistencies
      How to find inconsistent function/variable names?
      Liu et al. Learning to Spot and Refactor Inconsistent Method Names. ICSE 2019
      Methods with similar names should have similar bodies
  33. Finding inconsistencies
      Liu et al. Learning to Spot and Refactor Inconsistent Method Names. ICSE 2019
      1. Build embeddings of function names and body vectors
      2. For each function body:
         1. Find functions close to it in vector space
         2. Check their respective name distance
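A compact sketch of that two-step procedure, assuming the name and body embeddings are already computed (producing them is the actual contribution of the paper and is elided here); the thresholds and variable names are invented.

    # Sketch of the slide 33 procedure: flag a method whose body is close to
    # other bodies in vector space but whose name is far from those methods'
    # names. Embeddings are assumed given; thresholds are arbitrary.
    import numpy as np

    def cosine(u, v):
        return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-9))

    def suspicious_names(body_vecs, name_vecs, k=3, body_sim=0.8, name_sim=0.5):
        """body_vecs / name_vecs: dict of method id -> np.ndarray embedding."""
        flagged = []
        for m, bv in body_vecs.items():
            # 1. Find the k methods whose bodies are closest in vector space
            neighbours = sorted(
                (other for other in body_vecs if other != m),
                key=lambda o: cosine(bv, body_vecs[o]),
                reverse=True,
            )[:k]
            close = [o for o in neighbours if cosine(bv, body_vecs[o]) >= body_sim]
            if not close:
                continue
            # 2. Check whether this method's *name* is also close to theirs
            avg_name_sim = np.mean([cosine(name_vecs[m], name_vecs[o]) for o in close])
            if avg_name_sim < name_sim:
                flagged.append((m, close, float(avg_name_sim)))
        return flagged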
  34. Code summarization
      Wan et al. Improving automatic source code summarization via deep reinforcement learning. ASE 2018
      def add(a, b): return a + b
      def ???(a, b): return a + b
  35. Code summarization
      Wan et al. Improving automatic source code summarization via deep reinforcement learning. ASE 2018
      def add(a, b): return a + b
      """ Adds two numbers """
      def ???(a, b): return a + b
  36. Code summarization
      Wan et al. Improving automatic source code summarization via deep reinforcement learning. ASE 2018
      def add(a, b): return a + b
      """ Adds two numbers """
      def ???(a, b): return a + b → add
  37. Code summarization
      Wan et al. Improving automatic source code summarization via deep reinforcement learning. ASE 2018
      Use a critic network to re-adjust model weights
      BLEU score of 0.35
  38.–39. Code summarization
      Wan et al. Improving automatic source code summarization via deep reinforcement learning. ASE 2018
      NL channel / Semantics channel
      Use a critic network to re-adjust model weights
      BLEU score of 0.35
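For context, BLEU measures n-gram overlap between the generated summary and a reference summary (e.g. the original docstring). A toy sentence-level example using NLTK; the sentences are invented, and the paper reports corpus-level scores.

    # Toy illustration of the BLEU metric used to evaluate generated summaries.
    from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

    reference = ["adds", "two", "numbers", "and", "returns", "the", "result"]
    generated = ["adds", "two", "given", "numbers"]

    smooth = SmoothingFunction().method1  # avoid zero scores on short sentences
    score = sentence_bleu([reference], generated, smoothing_function=smooth)
    print(f"BLEU: {score:.2f}")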
  40. Main challenges
      • Developers like to invent names
      • Code vocabularies are 10x the size of NL ones
        • Compression techniques (e.g. BPE) to the rescue (sketched below)
      • How to feed code to a network without losing info from either the NL or the semantics channel?
        • Code2Vec, TreeLSTMs, GGNNs, …
      • Keeping up with evolution
      • Making tools, not just research papers
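A minimal sketch of byte-pair encoding over identifier-like “words”, as referenced in the BPE bullet above: repeatedly merge the most frequent adjacent symbol pair so that common subwords (get, set, name, value) become single vocabulary entries. The corpus and merge count are invented; real code language models use much larger corpora and tuned merge counts.

    # Minimal byte-pair-encoding sketch for shrinking a code token vocabulary.
    from collections import Counter

    def bpe_merges(words, num_merges=10):
        # Each word is a tuple of symbols, starting from single characters.
        vocab = Counter(tuple(w) for w in words)
        merges = []
        for _ in range(num_merges):
            pairs = Counter()
            for symbols, freq in vocab.items():
                for a, b in zip(symbols, symbols[1:]):
                    pairs[(a, b)] += freq
            if not pairs:
                break
            best = max(pairs, key=pairs.get)
            merges.append(best)
            new_vocab = Counter()
            for symbols, freq in vocab.items():
                merged, i = [], 0
                while i < len(symbols):
                    if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                        merged.append(symbols[i] + symbols[i + 1])
                        i += 2
                    else:
                        merged.append(symbols[i])
                        i += 1
                new_vocab[tuple(merged)] += freq
            vocab = new_vocab
        return merges, vocab

    # Identifier-like corpus: frequent subwords end up as single symbols.
    corpus = ["getname", "setname", "getvalue", "setvalue", "getname", "getvalue"]
    merges, vocab = bpe_merges(corpus, num_merges=8)
    print(merges)
    print(list(vocab))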
  41. ML4SE course!
      • Student presentations of course projects, including poster sessions
      • Oct 30, 13:45 - 17:00, Pulse-Hall 7