Workload-Aware Reviewer Recommendation using a Multi-objective Search-Based Approach
This paper, presented at PROMISE 2020, proposes a new technique to recommend reviewers based on reviewer experience, activeness, past collaboration, and reviewing workload. The presentation video is here: https://youtu.be/zG4y_eXQXXU
Code Review: A method to improve the overall quality of a patch through manual examination [… Storey ICSE2011]
Diagram: an Author submits a patch to a code review tool (e.g., Gerrit), where Reviewers examine it
Reviewer, identifying a defect: "Shouldn't console.log() call the toString() method (where appropriate) on objects?"
Reviewer, suggesting a solution: "I think it's better to do var s = "{}"; console.log(s)"
A patch tends to be less defective when it was reviewed and discussed extensively by many reviewers [Thongtanunam et al. MSR2015; Kononenko et al. ICSME2015]
Finding suitable reviewers is not a trivial task [Thongtanunam et al. SANER2015]
Review Process
Expertise/Experience-based Approaches: Finding reviewers who reviewed many similar patches in the past [Balachandran ICSE2013; Thongtanunam et al. SANER2015; Zanjani et al. TSE2016; Xia et al. ICSME2016]
Exp. + Past Collaboration Approaches: Finding reviewers who often worked with the author in the past [Yu et al. ICSME2014; Ouni et al. IST2017]
! Requesting only experts or active reviewers for a review could potentially burden them
Invited reviewers often consider their workload when accepting new invitations [Ruangwan et al. EMSE 2019]
At Google, review tasks are assigned in a round-robin manner [Sadowski et al. ICSE 2018]
A new patch
Measure Reviewer Metrics: Experience & Activeness, Past Collaboration, Workload
Obj 1: Maximize the chance of participating in a review (Experience & Activeness, Past Collaboration)
Obj 2: Minimize the skewness of the workload (Workload)
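A minimal sketch of how the two objectives on this slide might be evaluated for one candidate set of reviewers. The weighted-sum score, the entropy-based skewness measure, and every name and value below are assumptions made for illustration, not the formulation used in the paper.

```python
import math

def participation_score(selected, experience, collaboration, w_exp=0.5, w_collab=0.5):
    """Objective 1 (maximize): hypothetical weighted sum over the selected reviewers."""
    return sum(w_exp * experience[r] + w_collab * collaboration[r] for r in selected)

def workload_skewness(workload):
    """Objective 2 (minimize): 0.0 when reviews are spread evenly,
    1.0 when all of them sit on a single reviewer (1 - normalized entropy)."""
    total = sum(workload.values())
    if total == 0 or len(workload) < 2:
        return 0.0
    entropy = -sum((w / total) * math.log(w / total)
                   for w in workload.values() if w > 0)
    return 1.0 - entropy / math.log(len(workload))

def evaluate(selected, experience, collaboration, open_reviews):
    """Return (objective 1, objective 2) after adding the new patch to each selected reviewer."""
    workload = dict(open_reviews)
    for r in selected:
        workload[r] = workload.get(r, 0) + 1
    return (participation_score(selected, experience, collaboration),
            workload_skewness(workload))

# Made-up metric values for three hypothetical reviewers.
experience = {"alice": 0.9, "bob": 0.4, "carol": 0.2}
collaboration = {"alice": 0.8, "bob": 0.1, "carol": 0.3}
open_reviews = {"alice": 6, "bob": 1, "carol": 1}

score_a, skew_a = evaluate(["alice"], experience, collaboration, open_reviews)
score_c, skew_c = evaluate(["carol"], experience, collaboration, open_reviews)
```

With these made-up numbers, picking the busy expert scores higher on objective 1 but also makes the workload more skewed, which is the trade-off a multi-objective search has to balance.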
The Knee Point Approach selects the solution that is closest to the reference point
Plot: solutions S1, S2, S3, S4 placed by Objective 1 (Maximize chance of participating in a review) and Objective 2 (Minimize skewness of the workload distribution), with the reference point and the distances Dist(S2), Dist(S3), Dist(S4) marked
Measure the distance between each solution and the reference point
Select S3 as it is the closest to the reference point
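The selection step can be sketched as follows. This is only an illustration: the slide does not say how the reference point or the distance is defined, so the sketch assumes min-max normalized objectives, an ideal reference point (best value of both objectives), and Euclidean distance. The objective values for S1 to S4 are invented.

```python
import math

def knee_point(solutions):
    """solutions maps a name to (obj1 to maximize, obj2 to minimize).
    Returns the name closest to the assumed ideal reference point, plus all distances."""
    o1 = [v[0] for v in solutions.values()]
    o2 = [v[1] for v in solutions.values()]

    def normalize(x, lo, hi):
        # Min-max scaling to [0, 1]; a constant objective maps to 0.0.
        return 0.0 if hi == lo else (x - lo) / (hi - lo)

    reference = (1.0, 0.0)  # best obj1 and best obj2 after normalization
    distances = {}
    for name, (a, b) in solutions.items():
        na = normalize(a, min(o1), max(o1))
        nb = normalize(b, min(o2), max(o2))
        distances[name] = math.hypot(reference[0] - na, reference[1] - nb)
    best = min(distances, key=distances.get)
    return best, distances

# Invented Pareto front, chosen so that the outcome mirrors the slide.
front = {"S1": (0.2, 0.05), "S2": (0.5, 0.15), "S3": (0.8, 0.25), "S4": (0.9, 0.60)}
best, distances = knee_point(front)
```

Here `best` is `"S3"`: it gives up a little of objective 1 compared with S4 while keeping the workload far less skewed.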
Our WLRRec outperforms the single-objective approaches
Figure: %Gain of WLRRec vs GA-Obj1 and %Gain of WLRRec vs GA-Obj2 (Precision, Recall, F-Measure)
Considering multiple objectives at the same time allows us to better find reviewers
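For readers unfamiliar with the measures on this slide, here is a small sketch of how precision, recall, and F-measure could be computed for one recommendation, and how a percentage gain over a baseline could be derived. The slide does not define %Gain, so the relative-difference formula below is an assumption, as are the function names and sample reviewers.

```python
def precision_recall_f1(recommended, actual):
    """Compare the recommended reviewers with those who actually reviewed the patch."""
    rec, act = set(recommended), set(actual)
    hits = len(rec & act)
    precision = hits / len(rec) if rec else 0.0
    recall = hits / len(act) if act else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

def pct_gain(ours, baseline):
    """Assumed definition: relative improvement over the baseline, in percent."""
    return (ours - baseline) / baseline * 100.0

# One of two recommended reviewers is among the three actual reviewers.
p, r, f = precision_recall_f1(["a", "b"], ["b", "c", "d"])
gain = pct_gain(0.3, 0.2)
```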
Our WLRRec with NSGA-II is better than other two multi-objective approaches
Figure: %Gain of WLRRec with NSGA-II vs MOCell and %Gain of WLRRec with NSGA-II vs SPEA2 (Precision, Recall, F-Measure, Hypervolume)
The NSGA-II algorithm leveraged by our WLRRec is an appropriate multi-objective approach to find solutions in this problem domain
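Hypervolume, one of the measures compared on this slide, summarizes how much of the objective space a set of solutions dominates. The sketch below computes it for two objectives that are both minimized, so an objective that is maximized (such as Obj 1) would need to be negated first. The reference point and sample fronts are made up, and this is not necessarily the exact procedure used in the paper.

```python
def hypervolume_2d(front, ref):
    """Area dominated by `front` and bounded by `ref`, with both objectives minimized.
    Points that do not strictly improve on `ref` in both objectives are ignored."""
    points = sorted(p for p in front if p[0] < ref[0] and p[1] < ref[1])
    area = 0.0
    best_y = ref[1]
    for x, y in points:
        # Sweeping left to right, only points that lower the second objective add area.
        if y < best_y:
            area += (ref[0] - x) * (best_y - y)
            best_y = y
    return area

front_a = [(1, 3), (2, 2), (3, 1)]
front_b = [(2, 3), (3, 2)]
hv_a = hypervolume_2d(front_a, ref=(4, 4))
hv_b = hypervolume_2d(front_b, ref=(4, 4))
```

A larger value means the front covers more of the space, so in this toy case `front_a` would be preferred over `front_b`.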
Code Review Process
Expertise/Experience-based Approaches: Finding reviewers who reviewed many similar patches in the past [Balachandran ICSE2013; Thongtanunam et al. SANER2015; Zanjani et al. TSE2016; Xia et al. ICSME2016]
Exp. + Past Collaboration Approaches: Finding reviewers who often worked with the author in the past [Yu et al. ICSME2014; Ouni et al. IST2017]
! Requesting only experts or active reviewers for a review could potentially burden them
Invited reviewers often consider their workload when accepting new invitations [Ruangwan et al. EMSE 2019]
At Google, review tasks are assigned in a round-robin manner [Sadowski et al. ICSE 2018]
WLRRec: Workload-aware Reviewer Recommendation (NSGA-II)
A new patch; reviewer metrics: Experience & Activeness, Past Collaboration, Workload
Obj 1: Maximize the chance of participating in a review
Obj 2: Minimize the skewness of the reviewing workload distribution
Our WLRRec outperforms the single-objective approaches
Figure: %Gain of WLRRec vs GA-Obj1 and %Gain of WLRRec vs GA-Obj2 (Precision, Recall, F-Measure)
WLRRec achieves 88%-142% higher precision and 111%-178% higher recall than GA-Obj1
WLRRec achieves 55%-101% higher precision and 96%-138% higher recall than GA-Obj2
Our WLRRec with NSGA-II is better than other two multi-objective approaches
Figure: %Gain of WLRRec with NSGA-II vs MOCell and %Gain of WLRRec with NSGA-II vs SPEA2 (Precision, Recall, F-Measure, Hypervolume)
NSGA-II achieves 31%-95% higher F-measure and 21%-31% higher hypervolume than MOCell
NSGA-II achieves 19%-95% higher F-measure and 29%-47% higher hypervolume than SPEA2
Our WLRRec outperforms the four alternative approaches
Our work highlights the potential of leveraging a multi-objective algorithm that considers review workload and other important information to find reviewers
@patanamon http://patanamon.com