Slide 1

Slide 1 text

API CODE RECOMMENDATION USING STATISTICAL LEARNING FROM FINE-GRAINED CHANGES
Lily Mast, Eli Rademacher, Tien Nguyen, Danny Dig, Anh Nguyen, Michael Hilton, Mihai Codoban, Hoan Nguyen

Slide 2

Slide 2 text

STATE OF THE PRACTICE 2

Slide 3

Slide 3 text

STATE OF THE PRACTICE System.out. 2

Slide 4

Slide 4 text

STATE OF THE PRACTICE System.out. append(char c) append(CharSequence c append(CharSequence c checkError() close() flush() format(locale l, Stri format(String format, 2

Slide 5

Slide 5 text

STATE OF THE PRACTICE
System.out.
 append(char c) : PrintStream
 append(CharSequence csq) : PrintStream
 append(CharSequence csq, int start, … ) : PrintStream
 checkError() : boolean
 close() : void
 flush() : void
 format(Locale l, String format, …) : PrintStream
 format(String format, Object… args) : PrintStream
 print(boolean b) : void
 print(char c) : void
 print(char[] s) : void
 print(double d) : void
 print(float f) : void
 print(int i) : void
 print(long l) : void
 print(Object obj) : void
 print(String s) : void
 printf(Locale l, String format, …) : PrintStream
 printf(String format, Object… args) : PrintStream
 println() : void
 println(boolean x) : void
 println(char x) : void
 println(char[] x) : void
 println(double x) : void
 println(float x) : void
 println(int x) : void
 println(long x) : void
 println(Object x) : void
 println(String x) : void
 write(byte[] buf, int off, int len) : void
 write(int b) : void

Slide 6

Slide 6 text

(PREVIOUS) STATE OF THE ART: LEARN FROM STATIC CONTEXT
 Text t;
 t = new Text();
 t.setText("hello world");

Slide 7

Slide 7 text

(PREVIOUS) STATE OF THE ART: LEARN FROM STATIC CONTEXT
 Text t;
 t = new Text();
 t.setText("hello world");
Recommend: setText(…)
 because it often co-occurs with: new Text()

Slide 8

Slide 8 text

(PREVIOUS) STATE OF THE ART: LEARN FROM STATIC CONTEXT
 Text t;
 t = new Text();
 t.setText("hello world");
Recommend: setText(…)
 because it often co-occurs with: new Text()
Frequently co-occurring terms: [Bruch et al. FSE 09]

Slide 9

Slide 9 text

(PREVIOUS) STATE OF THE ART: LEARN FROM STATIC CONTEXT
 Text t;
 t = new Text();
 t.setText("hello world");
Recommend: setText(…)
 because it often co-occurs with: new Text()
Frequently co-occurring terms: [Bruch et al. FSE 09]
+ Order of terms: [Reiss ICSE 09]

Slide 10

Slide 10 text

(PREVIOUS) STATE OF THE ART: LEARN FROM STATIC CONTEXT
 Text t;
 t = new Text();
 t.setText("hello world");
Recommend: setText(…)
 because it often co-occurs with: new Text()
Frequently co-occurring terms: [Bruch et al. FSE 09]
+ Order of terms: [Reiss ICSE 09]
+ Program dependencies: [Nguyen ICSE 12]
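The co-occurrence idea on this slide can be sketched roughly as follows. This is a minimal illustration, not the method of any of the cited papers: the class name CoOccurrenceRecommender, its methods, and the representation of a snippet as a set of API-call strings are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: rank candidate API calls by how often they
// co-occurred with the calls already present in the static context.
class CoOccurrenceRecommender {
    // counts.get(a).get(b) = number of training snippets containing both a and b
    private final Map<String, Map<String, Integer>> counts = new HashMap<>();

    // Record one training snippet, given as the set of API calls it uses.
    void train(Set<String> snippet) {
        for (String a : snippet) {
            for (String b : snippet) {
                if (!a.equals(b)) {
                    counts.computeIfAbsent(a, k -> new HashMap<>()).merge(b, 1, Integer::sum);
                }
            }
        }
    }

    // Return up to k calls that co-occurred most often with the context calls.
    List<String> recommend(Set<String> context, int k) {
        Map<String, Integer> score = new HashMap<>();
        for (String c : context) {
            Map<String, Integer> partners = counts.getOrDefault(c, Map.of());
            for (Map.Entry<String, Integer> e : partners.entrySet()) {
                if (!context.contains(e.getKey())) {
                    score.merge(e.getKey(), e.getValue(), Integer::sum);
                }
            }
        }
        List<String> ranked = new ArrayList<>(score.keySet());
        // Highest score first; ties broken alphabetically so output is deterministic.
        ranked.sort((a, b) -> {
            int byScore = Integer.compare(score.get(b), score.get(a));
            return byScore != 0 ? byScore : a.compareTo(b);
        });
        return new ArrayList<>(ranked.subList(0, Math.min(k, ranked.size())));
    }
}
```

Trained on snippets where new Text() mostly appears alongside setText, a query with the context {new Text()} would rank setText first, which mirrors the recommendation shown on the slide.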

Slide 11

Slide 11 text


 
CHALLENGE: IRRELEVANT TOKENS CLOSEST TO REC POINT.
 + for (Task t: tasks) {
     t.execute();
 + }

Slide 12

Slide 12 text


 
CHALLENGE: IRRELEVANT TOKENS CLOSEST TO REC POINT.
 + for (Task t: tasks) {
     t.execute();
 + }
 + Set results = new HashSet<>();
 + results._

Slide 13

Slide 13 text


 
CHALLENGE: IRRELEVANT TOKENS CLOSEST TO REC POINT.
 + for (Task t: tasks) {
     t.execute();
 + }
 + Set results = new HashSet<>();
 + results._

Slide 14

Slide 14 text

NATURALNESS OF CODE CHANGES
Our key insight: changes are regular
 Changes co-occur together
 We leverage dynamic context provided by regular changes

Slide 15

Slide 15 text

KEY ADVANCES
Approach | Previous | Ours
Mining Source | Static Code Context (code snapshots) | Static Code Context (code snapshots) + Dynamic Code Context (code changes)
Recommendation Approach | Association Mining | Association Mining + Statistical Inference of Changes

 + Set results = new HashSet<>();
   for (Task t: tasks) {
     t.execute();
 +   results.add(t.getResults());
   }

 + Set results = new HashSet<>();
   for (Task t: tasks) {
     t.execute();
 +   results._
   }

 + Set results = new HashSet<>();
 + results.add(t.getResults());

Slide 16

Slide 16 text

APIREC 7

Slide 17

Slide 17 text

LEARNING REGULAR CHANGES USING STATISTICAL APPROACH
 + Set results = new HashSet<>();
   for (Task t: tasks) {
     t.execute();
 +   results.add(t.getResults());
   }

Slide 18

Slide 18 text

LEARNING REGULAR CHANGES USING STATISTICAL APPROACH
 + Set results = new HashSet<>();
   for (Task t: tasks) {
     t.execute();
 +   results.add(t.getResults());
   }
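One simple way to picture "learning regular changes" is to count how often two changes appear in the same commit. The sketch below does only that. It is an assumption-laden illustration rather than APIREC's actual learning procedure: the class ChangeCoOccurrence, the string encoding of a change, and the ratio it computes are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: estimate how regularly one change accompanies another
// by counting co-occurrences of changes within the same commit.
class ChangeCoOccurrence {
    private final Map<String, Integer> single = new HashMap<>(); // commits containing a change
    private final Map<String, Integer> pair = new HashMap<>();   // commits containing both changes

    // Build a lookup key for an ordered pair of changes (assumes labels do not contain " -> ").
    private static String key(String given, String target) {
        return given + " -> " + target;
    }

    // Record the set of changes made in one commit.
    void addCommit(Set<String> changes) {
        for (String a : changes) {
            single.merge(a, 1, Integer::sum);
            for (String b : changes) {
                if (!a.equals(b)) {
                    pair.merge(key(a, b), 1, Integer::sum);
                }
            }
        }
    }

    // Fraction of commits containing `given` that also contain `target`.
    double ratio(String target, String given) {
        int n = single.getOrDefault(given, 0);
        if (n == 0) {
            return 0.0;
        }
        return (double) pair.getOrDefault(key(given, target), 0) / n;
    }
}
```

With enough commits, a change such as adding results.add(...) would show a high ratio given that a HashSet declaration was added, which is the kind of regularity the slide refers to.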

Slide 20

Slide 20 text

BASIS OF CONSENSUS
 Set results = new HashSet<>();
 for (Task t: tasks) {
   t.execute();
   results.add(t.getResults());
 }

 Set r = new HashSet<>();
 for (int i=0; i …

 … h = new HashSet<>();
 for (Task t: tasks) {
   t.execute();
   h.add(t.getResults());
 }

 Set col = new HashSet<>();
 for (int i=0; i …

 … results = new HashSet<>();
 for (Task t: tasks) {
   t.execute();
   results.add(t.getResults());
 }

 Set results = new HashSet<>();
 while (tasks.length > 0) {
   tasks[i].calculate();
   results.add(t.getResults());
   i++;
 }

Slide 21

Slide 21 text

CHANGE INFERENCE MODEL 10

Slide 22

Slide 22 text

CHANGE INFERENCE MODEL
$\mathrm{Score}(c,(DC,SC)) =$
c = current change (,, )

Slide 23

Slide 23 text


CHANGE INFERENCE MODEL
$\mathrm{Score}(c,(DC,SC)) = w_{SC} \times \mathrm{Score}(c,SC)$
c = current change (,, )
$\mathrm{Score}(c,SC)$ = impact of Static Context T on predicting c

Slide 24

Slide 24 text


CHANGE INFERENCE MODEL
$\mathrm{Score}(c,(DC,SC)) = w_{SC} \times \mathrm{Score}(c,SC) + w_{DC} \times \mathrm{Score}(c,DC)$
c = current change (,, )
$\mathrm{Score}(c,SC)$ = impact of Static Context T on predicting c
$\mathrm{Score}(c,DC)$ = impact of Dynamic Context on predicting c

Slide 25

Slide 25 text


CHANGE INFERENCE MODEL
$\mathrm{Score}(c,(DC,SC)) = w_{SC} \times \mathrm{Score}(c,SC) + w_{DC} \times \mathrm{Score}(c,DC)$
c = current change (,, )
$\mathrm{Score}(c,SC)$ = impact of Static Context T on predicting c
$\mathrm{Score}(c,DC)$ = impact of Dynamic Context on predicting c
$w_{SC}$ = weight of impact of context

Slide 26

Slide 26 text


CHANGE INFERENCE MODEL
$\mathrm{Score}(c,(DC,SC)) = w_{SC} \times \mathrm{Score}(c,SC) + w_{DC} \times \mathrm{Score}(c,DC)$
c = current change (,, )
$\mathrm{Score}(c,SC)$ = impact of Static Context T on predicting c
$\mathrm{Score}(c,DC)$ = impact of Dynamic Context on predicting c
$w_{SC}$ = weight of impact of context
$w_{DC}$ = weight of impact of change
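The weighted combination above can be sketched in a few lines. This is only an illustration of the formula as written on the slide: the class ChangeScorer, the method names, and the example weights and per-context scores are assumptions, and how the individual scores and weights are actually obtained is not shown here.

```java
import java.util.Map;

// Hypothetical sketch of Score(c,(DC,SC)) = wSC * Score(c,SC) + wDC * Score(c,DC).
class ChangeScorer {
    // Combine the static-context and dynamic-context scores of one candidate change.
    static double combinedScore(double scoreSC, double scoreDC, double wSC, double wDC) {
        return wSC * scoreSC + wDC * scoreDC;
    }

    // candidates maps a change label to {Score(c,SC), Score(c,DC)};
    // returns the label with the highest combined score, or null if empty.
    static String best(Map<String, double[]> candidates, double wSC, double wDC) {
        String bestLabel = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, double[]> e : candidates.entrySet()) {
            double s = combinedScore(e.getValue()[0], e.getValue()[1], wSC, wDC);
            if (s > bestScore) {
                bestScore = s;
                bestLabel = e.getKey();
            }
        }
        return bestLabel;
    }
}
```

With made-up scores add = {0.3, 0.9} and clear = {0.6, 0.1}, equal weights favor add, while putting all weight on the static context favors clear. That shift is the effect the two weights are meant to control.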

Slide 27

Slide 27 text

RESEARCH QUESTIONS: 11

Slide 28

Slide 28 text

RESEARCH QUESTIONS:
➤ RQ1 Accuracy: How often does APIREC recommend the correct API?

Slide 29

Slide 29 text

RESEARCH QUESTIONS:
➤ RQ1 Accuracy: How often does APIREC recommend the correct API?
➤ RQ2 Sensitivity: How does context size impact APIREC’s accuracy?

Slide 30

Slide 30 text

RESEARCH QUESTIONS:
➤ RQ1 Accuracy: How often does APIREC recommend the correct API?
➤ RQ2 Sensitivity: How does context size impact APIREC’s accuracy?
➤ RQ3 Running Time: What is the running time of APIREC?

Slide 31

Slide 31 text

CORPORA
 | Large Corpus | Community Corpus
Projects | 50 | 8
Total Source Files | 48,699 | 8,561
Total Commits | 113,103 | 18,233
Total AST nodes changed | 43,538,386 | 4,487,479

Slide 32

Slide 32 text


SIMULATE USER BY RE-PLAYING CHANGES FROM COMMITS
 for (Task t: tasks) {
   t.execute();
 }
Next Change:
Recommendation: 1 2 3 4 5

Slide 33

Slide 33 text


SIMULATE USER BY RE-PLAYING CHANGES FROM COMMITS
 for (Task t: tasks) {
   t.execute();
 }
Next Change: Set
Recommendation: 1 2 3 4 5

Slide 34

Slide 34 text


SIMULATE USER BY RE-PLAYING CHANGES FROM COMMITS
 Set
 for (Task t: tasks) {
   t.execute();
 }
Next Change: TaskResult
Recommendation: 1 2 3 4 5

Slide 35

Slide 35 text


SIMULATE USER BY RE-PLAYING CHANGES FROM COMMITS
 Set
 for (Task t: tasks) {
   t.execute();
 }
Next Change: results
Recommendation: 1 2 3 4 5

Slide 36

Slide 36 text


SIMULATE USER BY RE-PLAYING CHANGES FROM COMMITS
 Set results
 for (Task t: tasks) {
   t.execute();
 }
Next Change: =
Recommendation: 1 2 3 4 5

Slide 37

Slide 37 text


SIMULATE USER BY RE-PLAYING CHANGES FROM COMMITS
 Set results =
 for (Task t: tasks) {
   t.execute();
 }
Next Change: new
Recommendation: 1 2 3 4 5

Slide 38

Slide 38 text


SIMULATE USER BY RE-PLAYING CHANGES FROM COMMITS
 Set results = new
 for (Task t: tasks) {
   t.execute();
 }
Next Change: HashSet<>();
Recommendation: 1 2 3 4 5

Slide 39

Slide 39 text


SIMULATE USER BY RE-PLAYING CHANGES FROM COMMITS
 Set results = new HashSet<>();
 for (Task t: tasks) {
   t.execute();
 }
Next Change: results.
Recommendation: 1 2 3 4 5

Slide 40

Slide 40 text


SIMULATE USER BY RE-PLAYING CHANGES FROM COMMITS
 Set results = new HashSet<>();
 results.
 for (Task t: tasks) {
   t.execute();
 }
Next Change: add()
Recommendation: 1 2 3 4 5

Slide 41

Slide 41 text


SIMULATE USER BY RE-PLAYING CHANGES FROM COMMITS
 Set results = new HashSet<>();
 results.
 for (Task t: tasks) {
   t.execute();
 }
Next Change: add()
Recommendation: 1. add  2. clear  3. size  4. remove  5. contains

Slide 42

Slide 42 text


SIMULATE USER BY RE-PLAYING CHANGES FROM COMMITS
 Set results = new HashSet<>();
 results.
 for (Task t: tasks) {
   t.execute();
 }
Next Change: add()
Recommendation: 1. add  2. clear  3. size  4. remove  5. contains
Top - 1

Slide 43

Slide 43 text


SIMULATE USER BY RE-PLAYING CHANGES FROM COMMITS
 Set results = new HashSet<>();
 results.
 for (Task t: tasks) {
   t.execute();
 }
Next Change: add()
Recommendation: 1. remove  2. clear  3. size  4. contains  5. add

Slide 44

Slide 44 text


SIMULATE USER BY RE-PLAYING CHANGES FROM COMMITS
 Set results = new HashSet<>();
 results.
 for (Task t: tasks) {
   t.execute();
 }
Next Change: add()
Recommendation: 1. remove  2. clear  3. size  4. contains  5. add
Top - 5
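The Top-1 and Top-5 labels in this replay can be read as a top-k accuracy measure: a recommendation counts as a hit when the actual next change appears among the first k suggestions. A minimal sketch follows, assuming each replayed step is reduced to the actual change plus a ranked list of suggestions. The class TopKAccuracy and its method are illustrative names, not part of APIREC.

```java
import java.util.List;

// Hypothetical sketch: fraction of replayed steps whose actual next change
// appears among the first k ranked recommendations.
class TopKAccuracy {
    // actual.get(i) is the real next change at step i;
    // ranked.get(i) is the ranked recommendation list produced for that step.
    static double topK(List<String> actual, List<List<String>> ranked, int k) {
        if (actual.isEmpty()) {
            return 0.0;
        }
        int hits = 0;
        for (int i = 0; i < actual.size(); i++) {
            List<String> recs = ranked.get(i);
            // Only the first k suggestions are considered.
            List<String> head = recs.subList(0, Math.min(k, recs.size()));
            if (head.contains(actual.get(i))) {
                hits++;
            }
        }
        return (double) hits / actual.size();
    }
}
```

For two steps whose actual change is add, one ranked first and one ranked fifth, top-1 accuracy would be 0.5 and top-5 accuracy 1.0.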

Slide 45

Slide 45 text

EVALUATION SETUP 14

Slide 46

Slide 46 text

EVALUATION SETUP
Community Edition: APIREC trained on the Large Corpus and tested on the Community Corpus.

Slide 47

Slide 47 text

EVALUATION SETUP
Community Edition: APIREC trained on the Large Corpus and tested on the Community Corpus.
Project Edition: APIREC trained on the first 90% of a single project's commits and tested on the remaining 10%.

Slide 48

Slide 48 text

EVALUATION SETUP
Community Edition: APIREC trained on the Large Corpus and tested on the Community Corpus.
Project Edition: APIREC trained on the first 90% of a single project's commits and tested on the remaining 10%.
User Edition: APIREC trained on the first 90% of a single user's commits in one project and tested on the remaining 10%.
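The 90%/10% setup for the Project and User editions amounts to a chronological split: earlier commits train the model and later ones test it. A minimal sketch, assuming the commits are already sorted oldest first. The class ChronologicalSplit and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: split a chronologically ordered commit list into a
// training prefix and a testing suffix.
class ChronologicalSplit {
    // Index of the first test commit when trainPercent of the commits are used for training.
    static int cutIndex(int total, int trainPercent) {
        return total * trainPercent / 100; // integer arithmetic avoids rounding surprises
    }

    // The earliest trainPercent of the commits.
    static <T> List<T> train(List<T> commits, int trainPercent) {
        return new ArrayList<>(commits.subList(0, cutIndex(commits.size(), trainPercent)));
    }

    // The remaining, most recent commits.
    static <T> List<T> test(List<T> commits, int trainPercent) {
        return new ArrayList<>(commits.subList(cutIndex(commits.size(), trainPercent), commits.size()));
    }
}
```

Splitting by time rather than at random keeps the test commits strictly later than anything the model has seen, which matches how a recommender would be used in practice.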

Slide 49

Slide 49 text

ACCURACY: RELATED WORK
[Bar chart: how often the correct answer is in the Top-X suggestions (Top-1, Top-5, Top-10; axis 0% to 80%) for sequence based, set based, graph based, and APIRec. Bar labels: 78%, 74%, 55%; 77%, 64%, 29%; 69%, 61%, 26%; 40%, 34%, 22%.]

Slide 50

Slide 50 text

ACCURACY: RELATED WORK
APIREC is more accurate than previous work
[Bar chart: how often the correct answer is in the Top-X suggestions (Top-1, Top-5, Top-10; axis 0% to 80%) for sequence based, set based, graph based, and APIRec. Bar labels: 78%, 74%, 55%; 77%, 64%, 29%; 69%, 61%, 26%; 40%, 34%, 22%.]

Slide 51

Slide 51 text

STATIC CONTEXT PROVIDES CONSTANT IMPACT
[Line chart: accuracy (0% to 80%) vs. number of tokens in static context (1, 5, 10, 15, 20, 30, 40); series: Code Context - Top 1, Code Context - Top 5]

Slide 52

Slide 52 text

DYNAMIC CONTEXT SIZE PROVIDES INCREASING IMPACT
[Line chart: accuracy (0% to 80%) vs. number of tokens in dynamic context (1, 5, 10, 15, 20, 30, 40); series: Change Context - Top 1, Change Context - Top 5]

Slide 53

Slide 53 text

ACCURACY: DIFFERENT EDITIONS
[Bar chart: accuracy (0% to 80%) at Top 1, Top 5, Top 10 for User (trained on 1 user in 1 project), Project (trained on one project), and Community (trained on 50 projects). Bar labels: 78%, 74%, 55%; 41%, 34%, 17%; 59%, 53%, 29%.]

Slide 54

Slide 54 text

IMPLICATIONS 19

Slide 55

Slide 55 text

IMPLICATIONS
Changes are valuable
 Using fine-grained changes can provide significant improvement over previous, change-agnostic approaches.

Slide 56

Slide 56 text

IMPLICATIONS
Changes are valuable
 Using fine-grained changes can provide significant improvement over previous, change-agnostic approaches.
Changes are personal
 A developer's own history predicts future changes better than the history of the entire project.

Slide 57

Slide 57 text

IMPLICATIONS
Changes are valuable
 Using fine-grained changes can provide significant improvement over previous, change-agnostic approaches.
Changes are personal
 A developer's own history predicts future changes better than the history of the entire project.
Changes are untapped
 Fine-grained changes provide a wealth of data that is currently underused.
