Workload-Aware Reviewer Recommendation using a Multi-objective Search-Based Approach

Slide 1

Slide 1 text

Workload-Aware Reviewer Recommendation using a Multi-objective Search-Based Approach
Wisam Haitham Abbood Al-Zubaidi, Patanamon (Pick) Thongtanunam, Hoa Khanh Dam, Chakkrit (Kla) Tantithamthavorn, Aditya Ghose
[email protected] @patanamon

Slide 60

Slide 60 text

Several Reviewer Recommendation Approaches have been Developed to Improve the Code Review Process
Expertise/Experience-based Approaches: finding reviewers who reviewed many similar patches in the past [Balachandran ICSE2013, Thongtanunam et al SANER2015, Zanjani et al TSE2016, Xia et al ICSME2016]
Exp. + Past Collaboration Approaches: finding reviewers who often worked with the author in the past [Yu et al ICSME2014, Ouni et al IST2017]
! Requesting only experts or active reviewers for a review could potentially burden them
Invited reviewers often consider their workload when accepting new invitations [Ruangwan et al EMSE 2019]
At Google, review tasks are assigned in a round-robin manner [Sadowski et al. ICSE 2018]

WLRRec: Workload-aware Reviewer Recommendation
Input: a new patch, searched with NSGA-II
Experience & Activeness, Past Collaboration: Obj 1: Maximize the chance of participating in a review
Workload: Obj 2: Minimize the skewness of the reviewing workload distribution

Our WLRRec outperforms the single-objective approaches
[Bar charts: %Gain of WLRRec vs GA-Obj1 and vs GA-Obj2 in Precision, Recall, and F-Measure]
WLRRec has 88%-142% higher precision and 111%-178% higher recall than GA-Obj1
WLRRec has 55%-101% higher precision and 96%-138% higher recall than GA-Obj2

Our WLRRec with NSGA-II is better than the other two multi-objective approaches
[Bar charts: %Gain of WLRRec with NSGA-II vs MOCell and vs SPEA2 in Precision, Recall, F-Measure, and Hypervolume]
NSGA-II has 31%-95% higher F-measure and 21%-31% higher hypervolume than MOCell
NSGA-II has 19%-95% higher F-measure and 29%-47% higher hypervolume than SPEA2

Our WLRRec outperforms the four alternative approaches
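
The precision, recall, F-measure, and %Gain figures reported for WLRRec can be read with the help of a small sketch. It assumes set-based precision and recall per patch and assumes that %Gain means the relative improvement (ours - baseline) / baseline; the slides do not spell out either definition. The reviewer names and scores are made up for illustration.

```python
def precision_recall_f1(recommended, actual):
    """Set-based precision, recall, and F1 for the reviewers of one patch."""
    recommended, actual = set(recommended), set(actual)
    hits = len(recommended & actual)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(actual) if actual else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


def percent_gain(ours, baseline):
    """Relative improvement of `ours` over `baseline`, in percent (assumed definition)."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (ours - baseline) / baseline * 100.0


# Hypothetical data: two recommended reviewers, three actual reviewers.
p, r, f1 = precision_recall_f1(["alice", "bob"], ["bob", "carol", "dave"])

# Hypothetical scores: doubling a baseline precision corresponds to a 100% gain.
gain = percent_gain(0.30, 0.15)
```

Under this reading, a gain above 100% means the score is more than double the baseline's.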

Slide 61

Slide 61 text

Our work highlights the potential of leveraging a multi-objective algorithm that considers review workload and other important information to find reviewers.
[email protected] @patanamon http://patanamon.com
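
To illustrate what weighing participation likelihood against review workload can look like, here is a minimal sketch of the two objectives and a non-dominated (Pareto) filter. It is not the authors' implementation and it is not NSGA-II: it checks every candidate exhaustively. The entropy-based skewness measure, the `likelihood` scores, and the `open_reviews` counts are all assumptions made for illustration.

```python
import math


def participation_score(candidates, likelihood):
    """Obj 1 (maximize): mean assumed likelihood that the chosen reviewers participate."""
    return sum(likelihood[r] for r in candidates) / len(candidates)


def workload_skewness(candidates, open_reviews):
    """Obj 2 (minimize): 1 - normalized Shannon entropy of the workload after
    assigning the new patch to `candidates`; 0 means a perfectly even workload."""
    loads = [n + (1 if r in candidates else 0) for r, n in open_reviews.items()]
    total = sum(loads)
    if total == 0 or len(loads) < 2:
        return 0.0
    entropy = -sum((l / total) * math.log(l / total) for l in loads if l > 0)
    return 1.0 - entropy / math.log(len(loads))


def dominates(a, b):
    """True if (obj1, obj2) pair `a` is no worse than `b` on both objectives
    and strictly better on at least one (obj1 is maximized, obj2 minimized)."""
    return a[0] >= b[0] and a[1] <= b[1] and (a[0] > b[0] or a[1] < b[1])


def pareto_front(solutions, evaluate):
    """Keep the solutions that no other solution dominates."""
    scored = [(s, evaluate(s)) for s in solutions]
    return [s for s, fs in scored if not any(dominates(fo, fs) for _, fo in scored)]


# Hypothetical data: participation likelihoods and currently open reviews.
likelihood = {"alice": 0.9, "bob": 0.6, "carol": 0.4, "dave": 0.3}
open_reviews = {"alice": 8, "bob": 2, "carol": 1, "dave": 6}

# Candidate solutions: assign the new patch to a single reviewer.
solutions = [("alice",), ("bob",), ("carol",), ("dave",)]
front = pareto_front(
    solutions,
    lambda s: (participation_score(s, likelihood), workload_skewness(s, open_reviews)),
)
```

In this made-up example, the busiest and most likely reviewer stays on the front next to less loaded alternatives, while a reviewer who is both unlikely to participate and already busy is filtered out. A search algorithm such as NSGA-II explores the same trade-off without enumerating every candidate set.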
