
Source Code Curation Tooling for the Code Forager

This talk introduces the notion of Source Code Curation, along with a set of tools that implement it. Source Code Curation is a blend of filtering, refinement, and validation activities. It can help programmers during code foraging by indicating which source code is more likely to be useful and which is not.

Huascar Sanchez

October 15, 2015

Transcript

  1. Source code curation tooling for the Code Forager

    Huascar Sanchez, hsanchez@cs.ucsc.edu. Defense @ UCSC, October 29, 2015
  2. This talk in one slide

    Code foraging is a form of reuse practiced by many programmers. Despite advances in search technology, code foraging is still restricted by the questionable quality of online source code. We can build tools that help programmers address this questionable quality.
  3. None
  4. Just like junkyard scavenging… (SCC - SRI - 09 18, 2015)

    This process is laborious and challenging:
    • It involves multiple rounds of specific steps (Marchionini, 2006): browsing, screening, filtering, and retooling.
    • It deals with source code of inherently questionable quality (Gysin and Kuhn, 2010): not guaranteed to work, to be good, or to be trustworthy.
  5. Quality

    Quality means fitness for use, and is relative to a specific programming task (Gryna and Juran, 2001).
    Quality dimensions (Dandashi, 2002):
    1. Accuracy
    2. Adaptability
    3. Completeness
    4. Understandability
    Source code with questionable quality is source code that lacks some of these characteristics.
  6. Quality matters

    Uncertainty over the quality of source code can have negative effects on task effectiveness (Mackay, 1991). Effective foraging for online source code requires addressing the questionable quality of code upfront.
  7. Curation equals Quality minus Junk

    (In general) Curation is used to determine what’s useful, what’s junk, and what’s not. Curation blends filtering, refinement, and validation activities (Krysa, 2006; Stonebraker et al., 2013). Curation can help address code foraging’s challenging nature. Curation applied to source code equals Source Code Curation.
  8. Source Code Curation (Sanchez et al., 2015)

    Source Code Curation covers the act of
    • discovering a code snippet of interest,
    • cleaning and transforming (refining) it,
    • presenting it in a meaningful and organized way.
    Its goal is to improve the quality of online source code, all before consumption.
  9. Thesis

    Source Code Curation can (1) help programmers deal with the inherently questionable quality of online source code upfront, and (2) facilitate code understanding.
    Key ideas:
    1. Source code quality greatly impacts code foraging.
    2. Quality is described by a series of quality dimensions.
    3. Source code curation can improve these dimensions.
  10. How can we implement this notion?

    Programmers are curious (Brandt et al., 2009). Solutions to code foraging’s challenging nature must not impede such natural impulses, which are an integral part of their learning experience (Kuhn and DeLine, 2012). Approach: build intuitive tools to support the curation of Java code examples on StackOverflow.
  11. The Vesperin System and the Code Examples Multistager

    • The Vesperin System: system for curating Java code examples on StackOverflow.
    • Code Examples Multistager: “Multistaging to Understand: Distilling code examples essence” (Sanchez et al., 2015), paper under review.
    [Figure: Multistager pipeline diagram. Labels: Source, Text, capacity, Text Pack, JSON Pack, Violette, Kiwi, AST, Multi-stage, Multistage (JSON), error path (0:warning, …, n:warning), ok path. Code stages: 1: swap, 2: partition, 3: randomizedPartition, 4: quicksort.]
    Code example shown in the figure (cut off on the slide):
      import java.util.Random;
      public class Quicksort {
        private static Random rand = new Random();
        public static void quicksort(int[] arr, int left, int right) {
          if (left < right) {
            int pivot = randomizedPartition(arr, left, right);
            quicksort(arr, left, pivot);
            quicksort(arr, pivot + 1, right);
          }
        }
        private static int randomizedPartition(int[] arr, int left, int right) {
          int swapIndex = left + rand.nextInt(right - left) + 1;
          swap(arr, left, swapIndex);
          return partition(arr, left, right);
        }
        private static int partition(int[] arr, int left, int right) {
          int pivot = arr[left];
          int i = left - 1;
          int j = right + 1;
          while (true) { …
  12. Vesperin

    Research idea: Allow programmers to experiment with code modification ideas in the Web page of the Q&A system (in-place).
    Hypothesis: Intuitively experimenting with code modification ideas hands-on can (1) help programmers deal with code of questionable quality upfront and (2) facilitate code understanding.
  13. None
  14. None
  15. None
  16. None

  17. [Figure: architecture diagram. Labels: (1) browser plug-in (scratch space, Vesperin page); (2) RESTful service; arrows: curation request, updated code.]

    Source object: ID (PK), Description (Text), Content (Text), Notes (Array).
  18. (1) Q&A page

    Scratch space
    • The space where all in-place code modifications are made, via direct editing or via semi-automated code transformations.
    Vesperin page
    • reDOMed Q&A page
    Drafts management
    • Drafts are snapshots of changed code for future recoveries.
    • Adds error tolerance to the curation process (Olsen, 2009)
    Vesperin actions
    • Make curation requests
    • Add notes in context
    • Check code syntax
    • Mark drafts
    Notes (in context)
  19. (2) Curation request

    JSON requests (cut off on the slide): {“rename”: { “what”: “method”, “where”: [1, 6], “source”: { “name”: “..”, “content”: …
    JSON replies (cut off on the slide): {“draft”: { “before”: {}, “after”: { “name”: “..”, “content”: …
    Diagram labels: RESTful API, Incremental Java parser, Publisher & Renderer, mongo db, twitter, HTML page.
    Java code transformer:
    • Codepacking
    • Delete code member
    • Rename code member
    • Code cleanup
    • Clip fragment
    • Create new method
  20. Vesperin in the lab
  21. User Study

    Considered the following research questions:
    • How will programmers use Vesperin?
    • Were the provided facilities sufficient?
    • Will programmers be able to better understand unfamiliar code examples via curation? Will Vesperin add value?
  22. Study setup (Babbie, 2015)

    • 15 participants, 3 tasks, 60 minutes
    • One-group pretest-posttest design
      • Participants are studied before and after the experimental manipulation (Babbie, 2015)
    • Variables
      • Independent variable: Vesperin system
      • Dependent variables: perception and experience
  23. Results: Participants

    Background experience. 40% of the participants visit StackOverflow multiple times a day. Moreover, nearly 70% of the participants were extremely familiar with Java and refactoring.
    Figure 4.4: Summary of participants’ background information. (a) Programming Experience. (b) StackOverflow Visit Frequency. (c) Level of Java Familiarity. (d) Level of Refactoring Familiarity.
  24. Task 1

    Using client/server certificates for two way authentication SSL socket on Android
  25. Task 2

    How to add a push notification in my own android app
  26. Task 3

    Stop the Twitter stream and return List of status with twitter4j
  27. Procedure and manipulation (Babbie, 2015)

    • Give Vesperin demo
    • Pretest: measurement of observation, e.g., “Could such a system allow you to better understand code examples?”
    • Intervention: application of the independent variable (use Vesperin)
    • Posttest: measurement of observation, e.g., “Did Vesperin allow you to better understand code examples?”
    • Final interview
  28. User experiences

    Obtained via 4 sources: observations, automated user interaction logging, pretest & posttest, and final interview
  29. How was Vesperin used?

    Used in a hybrid comprehension strategy:
    • Mixed bottom-up and top-down strategies
    • Used to explore control flow relationships
    • Search and replace, followed by annotation and cleanup
    • Syntax checking often influenced curation
  30. How was Vesperin used?

    Editing activity surged early on, then subsided over time. This can be explained by looking at the assumptions of dual-process theories (Chaiken and Eagly, 1989).
    [Chart: number of edits (0 to 50) over time in minutes (2 to 20).]
  31. Was the set of facilities sufficient?

    Its facilities were necessary, but not sufficient.
    Unable to handle specific code examples:
    • Multiple orthogonal classes on a single scratch space
    • Multiple scratch spaces needed to work in concert
    Workarounds were used to address limitations:
    • Used static nested classes
    • Combined content of all scratch spaces into a single scratch space
    Limited aid for identifying code examples’ core parts.
  32. How useful is Vesperin?

    Participants came in with high expectations, and left satisfied (added value and better understanding, as predicted).
    [Charts: (a) Better Understanding and (b) Added Value, pretest vs. posttest. Horizontal axes: 5-point Likert scale, ranging from strongly disagree (“- -”) to strongly agree (“++”). Vertical axes: number of participants.]
  33. Overview of Results

    • Used in a hybrid comprehension strategy
    • Helped better understand source code: “It’s much easier to understand the code after its curation.”
    • Its facilities were necessary, but not sufficient
    • Limited aid for identifying prime sets of behavior
  34. The Vesperin System and the Code Examples Multistager

    • The Vesperin System: system for curating Java code examples on StackOverflow.
    • Code Examples Multistager: “Multistaging to Understand: Distilling Code Examples Essence” (Sanchez et al., 2015) (paper under review).
    [Figure: the same Multistager pipeline diagram and Quicksort example as on slide 11.]
  35. Distilling the Essence of Code Examples (MTU - ICPC’16 - 05 16, 2016)

    Problem: Understanding unfamiliar code during code foraging is laborious and challenging.
    • Much of the information contained within code is either peripheral or obscured by other elements.
    • Tool support is lacking for locating the essential sections of a code example and then aiding their understanding.
    Solution: Deliver a method (and its tool) for discovering these essential sections and revealing only their relevant details.
  36. Multistage Representation of Code

    Code example:
      …
      public final class RandGaussianDistrib {
        private static final Random R…
        static double uniform(){ return R.nextInt(n); }
        static double uniform(double a, double b) { return a + uniform() * (b - a); }
        static double gaussian(){
          double r, x, y;
          do {
            x = uniform(-1.0, 1.0);
            y = uniform(-1.0, 1.0);
            r = (x*x) + (y*y);
          } while(r >= 1 || r == 0);
          return x * Math.sqrt(-2 * Math.log(r) / r);
        }
      }
    1: Code stage
      …
      public final class RandGaussianDistrib {
        private static final Random R…
        static double uniform(){ return R.nextInt(n); }
      }
    2: Code stage
      …
      public final class RandGaussianDistrib {
        private static final Random R…
        static double uniform(){ return R.nextInt(n); }
        static double uniform(double a, double b){ return a + uniform() * (b - a); }
      }
    3: Code stage
      …
      public final class RandGaussianDistrib {
        private static final Random R…
        static double uniform(){…}
        static double uniform(double a, double b) {…}
        static double gaussian(){
          double r, x, y;
          do {…} while (r >= 1 || r == 0);
          return x * Math.sqrt(-2 * Math.log(r) / r);
        }
      }
  37. Multistage representation of code

    Code stages 1, 2, and 3 of the RandGaussianDistrib example (as listed on slide 36) serve as a roadmap for steering understanding.
  38. Implementation

  39. Multistaging Implementation

    [Figure: the browser plug-in’s Stage action sends a multistaging request (source code, capacity) to the RESTful service, which runs MethodStaging w/Reduction and replies with (S, H*), e.g., hi ∈ hidden code H*. Code stages shown for the code example: 1: Get Pivot Index, 2: Select, 3: Main.]

    Paper excerpts shown on the slide:

    Fig. 6. One application of MethodStaging against the SmallestNum code example. (a) Code stage 1. (b) Code stage 2. (c) Code stage 3.

    Fig. 7. GetBindingsIn subroutine.
      function GETBINDINGSIN(m)
        V, S, R ← {}
        W ← {target node types}
        S ← S ∪ {m}
        while S is not empty do
          u ← pop S
          if u ∉ V then
            V ← V ∪ {u}
            for each child node w in u do
              if w ∈ W then
                R ← R ∪ {binding of w}
              end if
              S ← S ∪ {w}
            end for
          end if
        end while
        return R // Set of bindings in m
      end function

    Fig. 8. ReconstructSourceCode subroutine.
      function RECONSTRUCTSOURCECODE(p, d)
        // deletes declaration nodes n ∈ {p \ d} from AST p
        p′ ← {n | n ∈ p and n ∉ {p \ d}}
        return source code for p′
      end function

    B. MethodStaging with Reduction

    Programmers dealing with large code stages are often confronted with the consequent information overload problem. We can reduce this problem by automatically reducing them. The rationale is that reduced code stages can be easily digested by programmers wishing a quick overview of their operation. We make reduction decisions in MethodStaging based on examples’ source code structure. Our approach is consistent with how human abstractors approach inspecting unfamiliar … Equation 1.

    The usage score of a code block is representative of the demand of its elements throughout the code example. The usage frequency of each element in a code block is the number of times this element appears in a code stage. As a result, we use code blocks’ usage score to show the blocks with a higher demand and hide those with a lesser demand.

      $\mathrm{UsageScore}(b) = \frac{\sum_{elem \in b} \mathrm{UsageFreq}(elem)}{\mathrm{TotalChildren}(b)}$    (1)

    For example, given a nested code block at line 11 in Figure 4c, we first collect its children: temp, list, left, and right. Second, we compute each child’s usage frequency: 2, 7, 10, and 9. Lastly, we put it all together and calculate the nested code block’s usage score: (2 + 7 + 10 + 9)/4 = 7.

    We cast the problem of reducing large code stages as an instance of the Precedence Constrained Knapsack Problem or PCKP [17]. This problem is specified herein.

    Problem 3.2: Code Stage Reduction. Given a set of code blocks B (with weight wb and profit pb per block b ∈ B), a Knapsack capacity W, a precedence order O ⊆ B × B, and a set of constraints C, find H* such that H* = B \ X*, where wb = number of lines of code in b, pb = UsageScore(b), X* = arg max {∑ b ∈ B pb}, and X* satisfies the constraints in C. The constraints in C include: ∑ bj ∈ B wbj ≤ W, where bi → bj (bi precedes bj) ∈ O, and i, j = 1, …, |B|.

    Similar to Samphaiboon et al. [17], we solve this problem by using dynamic programming. Our solution generalizes the code stage reduction problem, also taking into account a precedence relation between code blocks in a code stage. We build a Directed Acyclic Graph (DAG) to represent such a relation, where nodes correspond to code blocks in a one-to-one fashion. This relation is expressed as a composition relation between code blocks. For instance, a code block k …
  40. Multistager Architecture

  41. Multistaging in Action
  42. Multistaging to Understand

    Exploring the code stages suggests a form of code inspection called Multistaging to Understand (MTU). By adopting MTU, programmers can
    • inspect a few generated code stages,
    • mentally abstract their functionality, and then
    • combine the gained knowledge to understand the main functionality.
    MTU shares similarities with code reading by stepwise abstraction (Linger et al., 1979).
  43. The Multistaging Problem

    Given the AST of a code example, with a set of n method declarations D = D1 ∪ D2 … ∪ Dn, compute a set of interconnected code stages {S | S ⊆ D × D}, sorted in ascending order by LOC, s.t. each code stage s ∈ S ∪ {sØ} builds upon, and in relation to, preceding code stages.
    Where:
    • sØ is the null code stage (sØ’s preceding code stage is sØ)
    • si < sj means si precedes sj, with i, j = 1 … |S|
  44. Solution: MethodStaging Algorithm

    Algorithm: MethodStaging(p /*AST*/, sØ)
      Stages = {sØ}
      for each method m in p do
        d = {} // declarations set
        for each binding b in GetBindingsIn(m) do // e.g., b = (select, method)
          d = d ∪ {getDeclarationNode(b)}
        end for
        s = source code for {n | n ∈ p ∧ n ∉ {p \ d}}
        Stages = Stages ∪ {s}
      end for
      return sortAscending(Stages)
    end Algorithm
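The MethodStaging loop can be sketched in Java over a simplified model, purely as an illustration. Everything here is an assumption rather than the author's implementation: the class name `MethodStagingSketch`, the `uses` map (each declaration mapped to the declarations it references, standing in for AST bindings), following bindings transitively, using the number of declarations as a stand-in for LOC, and omitting the null stage sØ.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class MethodStagingSketch {
    /** Worklist traversal (in the spirit of GetBindingsIn): m plus every declaration reachable from it. */
    static Set<String> declarationsFor(String m, Map<String, List<String>> uses) {
        Set<String> visited = new TreeSet<>();
        Deque<String> stack = new ArrayDeque<>();
        stack.push(m);
        while (!stack.isEmpty()) {
            String u = stack.pop();
            if (visited.add(u)) { // true only the first time u is seen
                for (String w : uses.getOrDefault(u, List.of())) {
                    stack.push(w);
                }
            }
        }
        return visited;
    }

    /** One stage per method, sorted ascending by size (declaration count stands in for LOC). */
    static List<Set<String>> stages(List<String> methods, Map<String, List<String>> uses) {
        List<Set<String>> result = new ArrayList<>();
        for (String m : methods) {
            result.add(declarationsFor(m, uses));
        }
        result.sort(Comparator.comparingInt((Set<String> s) -> s.size()));
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical dependency model of the RandGaussianDistrib example.
        Map<String, List<String>> uses = new HashMap<>();
        uses.put("uniform()", List.of("R"));
        uses.put("uniform(a,b)", List.of("uniform()"));
        uses.put("gaussian()", List.of("uniform(a,b)"));
        List<String> methods = List.of("gaussian()", "uniform()", "uniform(a,b)");
        for (Set<String> stage : stages(methods, uses)) {
            System.out.println(stage);
        }
    }
}
```

Under these assumptions the three stages grow from `uniform()` to `gaussian()`, mirroring the three code stages shown for RandGaussianDistrib earlier in the deck.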
  45. Reflections on MethodStaging

    MethodStaging provides an effective divide and conquer approach for code understanding.
    One caveat: It can produce large code stages.
    • Large code stages (code stages with long methods) can hinder MethodStaging’s effectiveness.
    • Long methods tend to increase programmers’ cognitive overhead more than small methods (Mantyla et al., 2003)
    Solution: MethodStaging w/Reduction (via code folding)
  46. MethodStaging w/Reduction Basics

    Reduction in MethodStaging shows the code blocks (X*) with a high usage score in each code stage s and hides (i.e., folds) the ones with a low usage score (H*), where X* ∪ H* ∈ s.
    The usage frequency of an element in a code block b ⊆ s is the number of times it appears in s.
      $\mathrm{UsageScore}(b) = \frac{\sum_{elem \in b} \mathrm{UsageFreq}(elem)}{\mathrm{TotalChildren}(b)}$
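As a small illustration of the usage score, the following Java sketch reproduces the worked example given in the deck's paper excerpt (children temp, list, left, right with usage frequencies 2, 7, 10, and 9). The class and method names are hypothetical, and treating an empty block as scoring zero is an assumption.

```java
import java.util.List;
import java.util.Map;

class UsageScoreSketch {
    /** Sum of the children's usage frequencies divided by the number of children. */
    static double usageScore(List<String> children, Map<String, Integer> usageFreq) {
        if (children.isEmpty()) {
            return 0.0; // assumption: an empty block scores zero
        }
        int total = 0;
        for (String child : children) {
            total += usageFreq.getOrDefault(child, 0);
        }
        return (double) total / children.size();
    }

    public static void main(String[] args) {
        Map<String, Integer> freq = Map.of("temp", 2, "list", 7, "left", 10, "right", 9);
        // (2 + 7 + 10 + 9) / 4 = 7.0
        System.out.println(usageScore(List.of("temp", "list", "left", "right"), freq));
    }
}
```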
  47. MethodStaging w/Reduction (Formulated as a Precedence-Constrained Knapsack Problem)

    Given a set of code blocks B (with a weight wb and profit pb per b ∈ B), a Knapsack capacity W, a precedence order O ⊆ B × B (modeled as a DAG), and a set of constraints C, find the set H*, such that H* = B \ X*, wb = LOC(b), pb = UsageScore(b), X* = arg max {∑ b ∈ B pb}, and X* satisfies the constraints in C.
    Where C includes:
    • ∑ b’ ∈ B wb’ ≤ W
    • ∃ bi → bj (bi precedes bj) ∈ O, i, j = 1 … |B|
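To make the formulation concrete, here is a minimal Java sketch that finds X* and H* by exhaustive search over subsets. This is deliberately not the dynamic programming solution used in the talk; it only illustrates the constraints, and it is practical only for a handful of blocks. The class name, the bitmask encoding of the precedence order (`requires[j]` lists the blocks that must be shown before block j), and the sample numbers are all assumptions.

```java
class StageReductionSketch {
    /**
     * Returns a bitmask of the blocks to show (X*): the highest-profit subset whose
     * total weight fits the capacity and that respects the precedence order.
     */
    static int bestVisibleSet(int[] weight, double[] profit, int[] requires, int capacity) {
        int n = weight.length; // assumption: fewer than 31 blocks
        int bestMask = 0;
        double bestProfit = 0.0;
        for (int mask = 0; mask < (1 << n); mask++) {
            int totalWeight = 0;
            double totalProfit = 0.0;
            boolean closed = true;
            for (int j = 0; j < n; j++) {
                if ((mask & (1 << j)) == 0) {
                    continue; // block j is not in this candidate set
                }
                if ((mask & requires[j]) != requires[j]) {
                    closed = false; // a predecessor of block j is missing
                    break;
                }
                totalWeight += weight[j];
                totalProfit += profit[j];
            }
            if (closed && totalWeight <= capacity && totalProfit > bestProfit) {
                bestProfit = totalProfit;
                bestMask = mask;
            }
        }
        return bestMask;
    }

    /** H* = B \ X*: every block that is not shown gets folded. */
    static int hiddenSet(int visibleMask, int n) {
        return ~visibleMask & ((1 << n) - 1);
    }

    public static void main(String[] args) {
        int[] weight = {3, 2, 4, 1};            // hypothetical LOC per block
        double[] profit = {1.0, 5.0, 4.0, 2.0}; // hypothetical usage scores
        int[] requires = {0, 0b0001, 0b0001, 0b0100};
        int visible = bestVisibleSet(weight, profit, requires, 6);
        System.out.println("X* = " + Integer.toBinaryString(visible));
        System.out.println("H* = " + Integer.toBinaryString(hiddenSet(visible, weight.length)));
    }
}
```

With these sample numbers, blocks 0 and 1 are shown and blocks 2 and 3 are folded: block 2 does not fit within the capacity together with its predecessor, and block 3 cannot be shown without block 2.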
  48. MethodStaging w/Reduction

    Input: AST node p, and declarations d ∈ p
    Output: A tuple consisting of the reconstructed source code and H*
    Function ReconstructSourceCode(p, d)
      // delete nodes {p \ d} from AST
      let p′ ← JDT.deleteAstNodes(p, {p \ d})
      let DAG(p′) ← traverse p′ and then get built DAG
      let H* ← computes B(p′) \ X*(p′) using DAG(p′) and a capacity of 15 LOC
      return (JDT.getSourceCode(p′), H*)
    end
    Figure 11: Pseudocode for updated ReconstructSourceCode. This subroutine returns a tuple comprising the reconstructed source code and the code elements to hide.
  49. Generating the set H*

    1. Build a DAG from the example’s AST
    • bi → bj: bi precedes bj
    • wb and pb are calculated
    • wb = wb-original − (wc + wd)
    2. Solve X* using Dynamic Programming
      $X^*[k, w] = \begin{cases} X^*[k-1, w] & \text{if } w_k > w \\ \max\left(X^*[k-1, w],\; X^*[k-1, w - w_k] + p_k\right) & \text{if } w_k \le w \land (k-1 \to k) \end{cases}$
    3. Generate H*:
    • H* = B \ X*
    [Figure: three copies of a DAG over code blocks B1 to B12, showing the full set B (nodes annotated with wb/pb), the selected set X*, and the hidden set H*.]
  50. MTU Evaluation

  51. MTU in the Lab

    We consider the following question: Does MTU make the understanding of unfamiliar code examples easier during code foraging?
    Where easier means:
    • High comprehension accuracy
    • Short reviewing time
  52. Experimental Setup (Babbie, 2015)

    • 12 participants, 2 groups, 3 tasks, 120 minutes
    • Crossed factorial design with 2 factors
      • Between-subjects: Comprehension strategy
      • Within-subjects: Size of code examples
    • Variables
      • Response accuracy & reviewing time
  53. Response Accuracy

    Open-ended questions addressing five comprehension abstractions (Pennington, 1987):
    • Function: Describe the overall functionality.
    • Control flow: Describe execution sequence using pseudo code.
    • Data flow: Describe when a data object gets updated.
    • Operations: Describe data object’s need in an execution sequence.
    • State: Describe data object’s composition at point of execution.
    Rating scheme (Du Bois, 2005) to score answers: Correct (10 pts), Almost Correct (8 pts), Right Idea (5 pts), and Wrong (0 pts)
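The rating scheme maps directly onto a small scoring helper. The Java sketch below is only an illustration of how an average response accuracy could be computed from rated answers; the class name, the enum, and the sample ratings are hypothetical and are not study data.

```java
import java.util.List;

class ResponseAccuracySketch {
    /** Rating scheme from the slide: points awarded per answer. */
    enum Rating {
        CORRECT(10), ALMOST_CORRECT(8), RIGHT_IDEA(5), WRONG(0);

        final int points;

        Rating(int points) {
            this.points = points;
        }
    }

    /** Mean number of points over a list of rated answers. */
    static double averageAccuracy(List<Rating> ratings) {
        if (ratings.isEmpty()) {
            return 0.0; // assumption: no answers means a score of zero
        }
        int total = 0;
        for (Rating r : ratings) {
            total += r.points;
        }
        return (double) total / ratings.size();
    }

    public static void main(String[] args) {
        // Hypothetical ratings: (10 + 8 + 5 + 0) / 4 = 5.75
        List<Rating> ratings = List.of(Rating.CORRECT, Rating.ALMOST_CORRECT, Rating.RIGHT_IDEA, Rating.WRONG);
        System.out.println(averageAccuracy(ratings));
    }
}
```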
  54. Reviewing time

    Collected reviewing times from two sources:
    • the browser plug-in’s time tracker
    • Upwork’s time tracker
  55. Results

  56. Average Response Accuracy

    Significant differences in average response accuracy, favoring the treatment group (MTU) over the control group (RTU).

                    Short [35, 70)           Medium [70, 140)         Long [140, 200]
                    MTU    RTU    p-value    MTU    RTU    p-value    MTU    RTU    p-value
    Function        6.83   3.33   0.0037     7.17   3.83   0.0509     7.67   5.00   0.0534
    Control flow    8.50   6.83   0.0525     7.17   4.33   0.1984     8.17   4.33   0.0204
    Data flow       8.67   6.17   0.0462     5.33   3.00   0.2308     8.50   6.00   0.1199
    State           8.67   7.00   0.0873     7.67   5.67   0.1594     9.00   6.50   0.0971
    Operations      7.33   3.33   0.0595     7.83   4.83   0.0609     6.50   3.00   0.0549

    Unaccounted factor: Delocalization (Letovsky et al., 1986). Delocalization led to many wrong answers, which caused high p-values.
    Note: Rating scheme for scoring accuracy of answers: Correct (10 points), Almost Correct (8 points), Right Idea (5 points), and Wrong (0 points).
  57. Average Reviewing Time

    Significant speed improvements of the treatment group (MTU) over the control group (RTU).

                             Short [35, 70)           Medium [70, 140)         Long [140, 200]
                             MTU    RTU    p-value    MTU    RTU    p-value    MTU    RTU    p-value
    Reviewing time (secs)    475    745    0.0995     655    1022   0.0446     465    912    0.0284

    Note: Reviewing times obtained from two sources: Upwork’s time tracker and Violette’s time tracker.
  58. Summarizing

    • MTU helps facilitate quick and accurate understanding when most of the code is localized.
    • MTU provides minor benefits when code is partially or fully delocalized.
    • MTU provides consistent speed improvements regardless of delocalization.
  59. Epilogue

  60. Source code curation & tools

    Introduced a new paradigm (and its tools) for addressing code foraging’s challenges: Vesperin & Code Examples Multistager.
    Our results confirmed the thesis statement:
    • Addressed the questionable quality of online code.
    • Facilitated quick and accurate understanding of online code.
    We only scratched the surface …
  61. Looking ahead

    Craft solutions by remixing curated code:
    • Semi-automated resolution of delocalized code
    • Identification of the best chain of code stages to reuse
    • Crowdsourcing program synthesis
  62. Thank you

    Huascar Sanchez, hsanchez@cs.ucsc.edu