
Workload-Aware Reviewer Recommendation using a Multi-objective Search-Based Approach


This paper was presented at PROMISE 2020. It proposes a new technique to recommend reviewers based on reviewer experience, activeness, past collaboration, and reviewing workload. The presentation video is available at https://youtu.be/zG4y_eXQXXU

Patanamon (Pick) Thongtanunam

November 05, 2020


Transcript

  1. Workload-Aware Reviewer Recommendation
    using a Multi-objective Search-Based Approach
    Wisam Haitham Abbood Al-Zubaidi, Patanamon (Pick) Thongtanunam,
    Hoa Khanh Dam, Chakkrit (Kla) Tantithamthavorn, and Aditya Ghose
    [email protected] @patanamon

  2. Code Review: A method to improve the overall
    quality of a patch through manual examination
    The author uploads a patch to a code review tool (e.g., Gerrit).
    Reviewers then examine the patch, identifying defects ("Shouldn't
    console.log() call the toString() method (where appropriate) on
    objects?") and suggesting solutions ("I think it's better to do
    var s = "{}"; console.log(s)").
    Effective code review requires active participation
    [Balachandran ICSE 2013; Rigby and Storey ICSE 2011].
    A patch tends to be less defective when it was reviewed and discussed
    extensively by many reviewers
    [Thongtanunam et al. MSR 2015; Kononenko et al. ICSME 2015].
    However, finding suitable reviewers is not a trivial task
    [Thongtanunam et al. SANER 2015].

  3. Several Reviewer Recommendation Approaches Have Been
    Developed to Improve the Code Review Process
    Expertise/experience-based approaches find reviewers who reviewed many
    similar patches in the past [Balachandran ICSE 2013; Thongtanunam et al.
    SANER 2015; Zanjani et al. TSE 2016; Xia et al. ICSME 2016].
    Experience + past-collaboration approaches find reviewers who often
    worked with the author in the past [Yu et al. ICSME 2014; Ouni et al.
    IST 2017].
    However, requesting only experts or active reviewers for a review could
    potentially burden them. Invited reviewers often consider their workload
    when accepting new invitations [Ruangwan et al. EMSE 2019], and at
    Google, review tasks are assigned in a round-robin manner
    [Sadowski et al. ICSE 2018].

  4. WLRRec: Workload-aware Reviewer Recommendation
    Given a new patch, WLRRec measures reviewer metrics (Experience &
    Activeness, Past Collaboration, and Workload) and applies a
    multi-objective evolutionary search (NSGA-II) with two objectives:
    Obj 1: Maximize the chance of participating in a review.
    Obj 2: Minimize the skewness of the workload.
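A multi-objective search of this kind ranks candidate reviewer sets by Pareto dominance over the two objectives. As a minimal sketch (not the paper's implementation), the following pure-Python code filters illustrative candidate scores down to the Pareto front, assuming Obj 1 is maximized and Obj 2 is minimized:

```python
# Illustrative sketch of Pareto-dominance filtering, the core ranking idea
# behind NSGA-II-style search. Scores below are made-up example values.

def dominates(a, b):
    """a dominates b: no worse on both objectives, strictly better on one.
    Tuples are (obj1, obj2); obj1 is maximized, obj2 is minimized."""
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    better = a[0] > b[0] or a[1] < b[1]
    return no_worse and better

def pareto_front(solutions):
    """Keep the solutions that no other solution dominates."""
    return [s for s in solutions
            if not any(dominates(t, s) for t in solutions if t != s)]

candidates = [(0.9, 0.9), (0.8, 0.5), (0.5, 0.2), (0.9, 0.5), (0.4, 0.6)]
print(pareto_front(candidates))  # [(0.5, 0.2), (0.9, 0.5)]
```

NSGA-II additionally sorts dominated solutions into successive fronts and uses crowding distance to keep the front spread out; this sketch shows only the first front.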

  5. WLRRec Uses 4+1 Key Reviewer Metrics
    Experience & Activeness: Code Ownership (%commits authored),
    Reviewing Experience (%patches reviewed), and Review Participation
    Rate (%invitations accepted)
    Past Collaboration: Familiarity with the Patch Author
    (co-reviewing frequency)
    Workload: Remaining Reviews (#pending review requests)

  6. These metrics feed the two fitness functions.
    Fitness function for Obj 1 (weighted summation): identify reviewers
    with maximum experience, activeness, and past collaboration, using the
    Experience & Activeness and Past Collaboration metrics.
    Fitness function for Obj 2 (Shannon's entropy): identify reviewers
    with a minimally skewed workload, using the Workload metric.

  7. WLRRec identifies reviewers with maximum experience,
    activeness, and past collaboration (Obj 1)
    Example (fitness function for Obj. 1): for each candidate reviewer
    (Pick, Hoa, Kla, Aditya), the Code Ownership (CO), Reviewing
    Experience (RE), Review Participation (RP), and Familiarity with the
    Patch Author (FP) metrics are combined into a weighted sum (e.g.,
    ScorePick for Pick). For a solution candidate that selects Pick and
    Kla, the Objective 1 score is ScorePick + ScoreKla.
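A minimal sketch of this weighted-summation fitness, with hypothetical metric values and equal weights (the weights and values below are illustrative assumptions, not the paper's):

```python
# Sketch of the Obj. 1 fitness: combine each selected reviewer's metrics
# (CO, RE, RP, FP) with a weighted sum, then sum over the solution.
# Equal weights and the metric values are illustrative assumptions.

WEIGHTS = {"CO": 0.25, "RE": 0.25, "RP": 0.25, "FP": 0.25}

def reviewer_score(m):
    """Weighted sum of one reviewer's metric values."""
    return sum(WEIGHTS[name] * value for name, value in m.items())

def objective1(solution, all_metrics):
    """solution: names of selected reviewers; all_metrics: name -> metrics."""
    return sum(reviewer_score(all_metrics[r]) for r in solution)

metrics = {
    "Pick": {"CO": 0.6, "RE": 0.8, "RP": 0.9, "FP": 0.5},
    "Kla":  {"CO": 0.4, "RE": 0.6, "RP": 0.7, "FP": 0.3},
}
# ScorePick + ScoreKla for the solution candidate {Pick, Kla}
print(round(objective1(["Pick", "Kla"], metrics), 2))  # 1.2
```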

  8. WLRRec identifies reviewers with a minimally skewed
    workload (Obj 2)
    Example (fitness function for Obj. 2): given the #pending review
    requests of the reviewers in a solution candidate and the total
    workload, the Objective 2 score is Shannon's entropy:
    (1/log2(4)) * ((5/10)log2(5/10) + 2*(1/10)log2(1/10)
    + (2/10)log2(2/10)) = -0.81
    The lower the score, the less skewed the workload (i.e., the better
    the distribution of workload).
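The worked example can be reproduced directly. This sketch assumes the normalized form shown on the slide (sum of p * log2(p) divided by log2 of the number of reviewers, so a score near -1 means an evenly spread workload):

```python
import math

def obj2_score(shares):
    """Obj. 2 fitness as on the slide: (1/log2 n) * sum(p * log2 p) over
    the reviewers' shares of the total workload. Closer to -1 means a more
    even (less skewed) workload distribution."""
    n = len(shares)
    return sum(p * math.log2(p) for p in shares) / math.log2(n)

# The slide's example: four reviewers holding 5/10, 1/10, 1/10, and 2/10
# of the total pending workload.
score = obj2_score([5/10, 1/10, 1/10, 2/10])
print(round(score, 2))  # -0.81
```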

  9. WLRRec selects the solution that is closest to the
    reference point (the Knee Point approach)
    From the Pareto optimal solutions of selected reviewers generated by
    NSGA-II (S1, S2, S3, S4, plotted against Objective 1: maximize the
    chance of participating in a review, and Objective 2: minimize the
    skewness of the workload distribution), WLRRec measures the distance
    between each solution and the reference point, then selects S3 as it
    has the closest distance.
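A hedged sketch of this selection step: the reference point below is an assumption (the best observed value on each objective, i.e., the ideal corner of the front); the paper's exact construction of the reference point may differ.

```python
import math

# Sketch of knee-point selection: among Pareto-optimal solutions, pick the
# one with the smallest Euclidean distance to a reference point. The
# reference point here (best value seen per objective) is an assumption.

def select_solution(front):
    """front: list of (obj1, obj2); obj1 is maximized, obj2 is minimized."""
    ref = (max(s[0] for s in front), min(s[1] for s in front))
    return min(front, key=lambda s: math.dist(s, ref))

# Illustrative front: extremes are strong on one objective only, while the
# middle "knee" solution balances both and sits closest to the reference.
front = [(0.2, 0.1), (0.5, 0.3), (0.8, 0.4), (0.95, 0.9)]
print(select_solution(front))  # (0.8, 0.4)
```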

  10. How well can our WLRRec (a multi-objective approach)
    recommend reviewers for a newly-submitted patch?
    Datasets: 36K patches / 2K reviewers; 65K patches / 1.2K reviewers;
    108K patches / 3.7K reviewers; 19K patches / 410 reviewers.
    Investigation:
    Single-objective vs. multi-objective: a Genetic Algorithm (GA) with
    Obj 1 only (maximize the chance of participating in a review) and a
    GA with Obj 2 only (minimize the skewed workload).
    NSGA-II vs. other multi-objective algorithms: the Multi-Objective
    Cellular Genetic Algorithm (MOCell) and the Strength-based
    Evolutionary Algorithm (SPEA2).
    Performance measures: Precision, Recall, F-Measure, and Hypervolume.
    %Gain = (WLRRec_pm - Y_pm) / Y_pm, where pm is a performance measure
    and Y is an alternative approach.
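The %Gain measure is plain relative improvement over the alternative approach; with illustrative numbers (the 0.48 and 0.20 below are made-up, not results from the paper):

```python
# %Gain = (WLRRec_pm - Y_pm) / Y_pm, expressed as a percentage.
# pm is a performance measure (e.g., precision); Y an alternative approach.

def pct_gain(wlrrec_pm, y_pm):
    return (wlrrec_pm - y_pm) / y_pm * 100

# Hypothetical example: WLRRec precision 0.48 vs. an alternative's 0.20.
print(round(pct_gain(0.48, 0.20)))  # 140 -> 140% higher precision
```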

  11. Our WLRRec outperforms the single-objective approaches
    [Bar charts: %Gain of WLRRec vs. GA-Obj1 and vs. GA-Obj2 on
    Precision, Recall, and F-Measure.]
    WLRRec achieves 88%-142% higher precision and 111%-178% higher recall
    than GA-Obj1, and 55%-101% higher precision and 96%-138% higher
    recall than GA-Obj2.
    Considering multiple objectives at the same time allows us to better
    find reviewers.

  12. Our WLRRec with NSGA-II is better than the other two
    multi-objective approaches
    [Bar charts: %Gain of WLRRec with NSGA-II vs. MOCell and vs. SPEA2 on
    Precision, Recall, F-Measure, and Hypervolume.]
    WLRRec achieves 31%-95% higher F-measure and 21%-31% higher
    hypervolume than MOCell, and 19%-95% higher F-measure and 29%-47%
    higher hypervolume than SPEA2.
    The NSGA-II algorithm leveraged by our WLRRec is an appropriate
    multi-objective approach to find solutions in this problem domain.

  13. Summary
    Several reviewer recommendation approaches have been developed to
    improve the code review process, but requesting only experts or
    active reviewers for a review could potentially burden them.
    WLRRec (Workload-aware Reviewer Recommendation) applies NSGA-II to a
    new patch using Experience & Activeness, Past Collaboration, and
    Workload metrics, with Obj 1: maximize the chance of participating in
    a review, and Obj 2: minimize the skewness of the reviewing workload
    distribution.
    Our WLRRec outperforms the four alternative approaches (GA-Obj1,
    GA-Obj2, MOCell, and SPEA2).
    Our work highlights the potential of leveraging multi-objective
    algorithms that consider reviewing workload and other important
    information to find reviewers.
    [email protected]
    @patanamon
    http://patanamon.com