Slide 1

Slide 1 text

Conceptualizing Knowledge Curation in Software Developer Communities: A Socio-Technical Perspective Alexey Zagalsky, Nov. 2017 Towards a Thesis

Slide 2

Slide 2 text

Disclaimer This is pre-synthesized, raw, and probably doesn’t make much sense (yet). I’ve been lucky to collaborate with many talented people. The work I describe next has been done in collaboration with: Margaret-Anne Storey, Daniel M. German, Carlos Gómez Teshima, Germán Poo-Caamaño, Leif Singer, Fernando Figueira Filho, Maryi Arciniegas-Mendez, Carlene Lebeuf, and Bin Lin. Images in these slides are used for educational purposes only, and I claim no credit for any of them. Images are copyright of their respective owners. 2

Slide 3

Slide 3 text

“Our modern society runs on software. But the tools we use to build software are buckling under increased demand.” Nadia Eghbal, Roads and Bridges: The Unseen Labor Behind Our Digital Infrastructure “Software is eating the world” - Marc Andreessen 3

Slide 4

Slide 4 text

“No century in recorded history has experienced so many social transformations and such radical ones as the twentieth century.” - P.F. Drucker, 2001 4

Slide 5

Slide 5 text

“The WILD WEST of communication channels” 5

Slide 6

Slide 6 text

I wanted to know, How social channels and tools affect software development “I had leaned and climbed forward like Alice through the looking-glass. I had no idea just how deep the rabbit hole would go.” 6

Slide 7

Slide 7 text

The Role of Social Media in Software Development: A Socio-Technical Perspective Part I "The medium is the message" - Marshall McLuhan 7

Slide 8

Slide 8 text

8

Slide 9

Slide 9 text

The Role of Social Media: We conducted a survey with 1,449 developers on GitHub. 9

Slide 10

Slide 10 text

10

Slide 11

Slide 11 text

“Wait, but Slack is meant for team communications, but nobody told you the team has to be a certain size, you can literally build a community as big as 10,000 users or more” - anonymous 11

Slide 12

Slide 12 text

We asked about challenges developers face when using these communication channels 12

Slide 13

Slide 13 text

13

Slide 14

Slide 14 text

Explored an Emerging Channel Developers have recently adopted a new and versatile channel—development chatbots “The Most Important Startup’s Hardest Worker Isn’t a Person” - [wired] 14

Slide 15

Slide 15 text

Developer chatbot roles: code bots, test bots, DevOps bots, support bots, documenting bots, and entertainment bots. Developer bots enhance efficiency and effectiveness: they automate tasks, help developers stay in flow, improve decision making, support team cognition, and regulate individual and team tasks and goals. 15

Slide 16

Slide 16 text

Chatbots help mitigate collaboration friction points
Friction in Team Interactions: understanding team members’ roles and expertise; adhering to team procedures and agreements; understanding and working towards team goals; coordinating team activities; managing trust and team cooperation.
Friction in Individuals’ Interactions with Technology: distracting and interrupting technologies; maintaining awareness of new technologies; understanding channel affordances.
Friction in Team’s Interaction with Technology: information fragmentation and overload; adopting and understanding tool usage in the team’s context; maintaining awareness of project activities; inadequate collaboration tooling; miscommunication on text-based channels. 16

Slide 17

Slide 17 text

“ChatOps is a collaboration model that connects people, tools, process, and automation into a transparent workflow. This flow connects the work needed, the work happening, and the work done in a persistent location staffed by the people, bots, and related tools.” - Sean Regan, Atlassian 17

Slide 18

Slide 18 text

Software Development as a Knowledge Building Process Part II “The limits of my language define the limits of my world.” - Ludwig Wittgenstein 18

Slide 19

Slide 19 text

“Knowledge work is when individuals use their cognitive abilities, technical know-how, interactions with others, and individual creativity to achieve work outcomes.” [Winslow and Bramer, 1994] “Knowledge workers are said to be involved in defining the scope of their work, being self-managed, searching for new ways of doing things, continuously learning and teaching others, and emphasizing both the quality and quantity of the outcomes.” [Drucker, 1999] 19

Slide 20

Slide 20 text

“Given the complexity of knowledge work, most researchers now agree that this form of work is seldom performed as a solitary endeavor.” [McDermott, 2005] “Perhaps more than any other form of work, knowledge work has pointed to the need for individuals to collaborate together, rather than work alone.” [Woolley, 2009] Social Context of Knowledge Work 20

Slide 21

Slide 21 text

Knowledge Work and Social Media: Social and communication channels provide the means for managing knowledge. Software can help automate many routine activities in the workplace. Social and communication channels can serve as the content of the work itself. 21

Slide 22

Slide 22 text

“Software developers are at the cutting edge of knowledge work. In many ways, they’re the prototype of the future knowledge worker; they’re pushing the boundaries of twenty-first century knowledge work. Modern knowledge work is enabled by and dependent on information technology-technologies that are created by software developers and used by legions of knowledge workers worldwide.” [Allan Kelly, 2014] 22

Slide 23

Slide 23 text

What is Knowledge? What is not? “Knowledge happens when information meets experience, values, contextual understanding about the specific situations, application, intuition and beliefs.” Tanmay Vora “A process or a competent goal-oriented activity rather than as an observable and transferable resource” Billet, 1998 23

Slide 24

Slide 24 text

Software is built with the tacit knowledge in the developers’ heads and the externalized (explicit) knowledge embodied in the development tools, channels, and project artifacts [Naur 1985]. Naur considers programming a “theory building process” and stresses the importance of tacit knowledge. Tacit knowledge can be further decomposed into procedural knowledge (e.g., practiced skills) and declarative knowledge (e.g., facts) [Robillard 1999]. “Knowledge is created out of a dialogue between tacit and explicit knowledge” [Nonaka, 1991, 1994]: Tacit - Tacit (e.g., apprenticeship); Explicit - Explicit; Tacit - Explicit (e.g., learning craft skills); Explicit - Tacit (e.g., internalization of new knowledge). 24

Slide 25

Slide 25 text

Wasko and Faraj [2000] distinguish different types of knowledge: (1) knowledge embedded in people (tacit knowledge); (2) knowledge as object (externalized knowledge); (3) knowledge socially generated, maintained, and exchanged within emergent communities of practice (knowledge as public good). We added a fourth type: knowledge about people and social networks [Storey et al. 2014]. 25

Slide 26

Slide 26 text

26

Slide 27

Slide 27 text

[Wagstrom et al. 2011] 27

Slide 28

Slide 28 text

Mental model A 28

Slide 29

Slide 29 text

Mental model B 29

Slide 30

Slide 30 text

Current mental model (after many iterations): Activities (e.g., coding); Actors (contributors, stakeholders); Assemblages & Communities of Practice (teams, organization); Processes & Practices (e.g., Agile); Tools & Channels (e.g., IDE, face-to-face); Artifacts (code, documentation, Q&A, history). 30

Slide 31

Slide 31 text

Activity theory applied to software engineering [Tell and Babar, 2012] 31

Slide 32

Slide 32 text

Software development is a knowledge-building process that is characterized by (1) knowledge activities and actions and (2) stakeholder roles, and is enabled by (3) socially enhanced tools and communication channels. 32

Slide 33

Slide 33 text

Reinhardt et al. 2011: Acquisition, Analyze, Authoring, Co-authoring, Dissemination, Expert search, Feedback, Information organization, Information search, Learning, Monitoring, Networking, Service search. [Tell and Babar, 2012] 33

Slide 34

Slide 34 text

Knowledge Activity Typology for Software Development: Acquisition; Authoring; Co-authoring (communicating and coordinating with others); Dissemination (of either content or activities); Feedback; Information organization and curation; Learning; Monitoring; Networking; Searching (information, services, or experts). 34

Slide 35

Slide 35 text

[Ford et al. , 2017] 35

Slide 36

Slide 36 text

Knowledge Curation Part III 36

Slide 37

Slide 37 text

I wanted to know, How is knowledge constructed and curated in a developer community? “In software development, the main difference between social media artifacts and traditional artifacts is that the former can be freely configured by everybody participating in the development, whereas the latter can only be configured by a ‘gatekeeper’. ” C. Treude, Thesis, 2012 37

Slide 38

Slide 38 text

A Socio-Technical Perspective Groups and communities are the primary unit of analysis 38

Slide 39

Slide 39 text

R is an increasingly popular open source programming language. The R community plays an important role in knowledge creation and diffusion. Two particular communication channels for Q&A are Stack Overflow and the R-help mailing list. 39

Slide 40

Slide 40 text

Stack Overflow vs. Mailing Lists: Since 2010, there has been a decrease in the number of messages on R-help and an increase on Stack Overflow [Vasilescu 2014]. Projects that migrated from mailing lists to Stack Overflow showed improvements [Squire 2015]. 40

Slide 41

Slide 41 text

41

Slide 42

Slide 42 text

Questions: How-to; Set up; Bug / Error / Exception; Discrepancy; Decision help; Conceptual / Guidance; Code reviewing; Other; Non-functional; Future reference.
Answers: Redirecting; Clue / Suggestion / Hint; Tutorial; Source code; Alternative; Explanation; Announcement; Benchmark; Opinion.
Updates: Announcement; Expansion; Background; Correction; Explanation; Solution.
Flags: Off topic / Opinion; Too localized; Not an answer; Repeated question; Unclear.
Comments: Clarification; Complement / Criticism; Expansion; Correction / Alternative; External reference. 42

Slide 43

Slide 43 text

Questions: How-to; Set up; Bug / Error / Exception; Discrepancy; Decision help; Conceptual / Guidance; Code reviewing; Other; Non-functional; Future reference.
Answers: Redirecting; Clue / Suggestion / Hint; Tutorial; Source code; Alternative; Explanation; Announcement; Benchmark; Opinion.
Updates: Announcement; Expansion; Background; Correction; Explanation; Solution.
Flags: Off topic / Opinion; Too localized; Not an answer; Repeated question; Unclear.
Comments: Clarification; Complement / Criticism; Expansion; Correction / Alternative; External reference.
SO % / RH %: 20.20% / 15.03%; 13.01% / 2.59%; 24.54% / 17.62%; 5.33% / 18.13%; 4.09% / 16.93%; 25.15% / 17.44%; 0.99% / 5.70%; 0.62% / 0.52%; 6.07% / 6.04%. 43

Slide 44

Slide 44 text

Questions: How-to; Set up; Bug / Error / Exception; Discrepancy; Decision help; Conceptual / Guidance; Code reviewing; Other; Non-functional; Future reference.
Answers: Redirecting; Clue / Suggestion / Hint; Tutorial; Source code; Alternative; Explanation; Announcement; Benchmark; Opinion.
Updates: Announcement; Expansion; Background; Correction; Explanation; Solution.
Flags: Off-topic / Opinion; Too localized; Not an answer; Repeated question; Unclear.
Comments: Clarification; Complement / Criticism; Expansion; Correction / Alternative; External reference.
SO % / RH %: 4.40% / 1.12%; 12.07% / 23.08%; 49.10% / 0.81%; 18.92% / 33.60%; 13.54% / 38.46%; 1.96% / 2.83%. 44

Slide 45

Slide 45 text

Interestingly, we found that both channels are used by the R community and both support Q&A knowledge; however, there are important differences between the two channels. 45

Slide 46

Slide 46 text

Participatory Knowledge Construction Crowd Knowledge Construction 46

Slide 47

Slide 47 text

Community Participation Patterns 47

Slide 48

Slide 48 text

48

Slide 49

Slide 49 text

We explored three potential reasons for the decrease in questions with a positive score: 1. We found the proportion of questions marked as duplicates is increasing, but the overall number is only 3% of all questions. 2. Then we counted the number of questions with a negative score, but this only accounts for 2.9% of all questions. 3. We found that 29.2% of all posts have a score equal to zero. A small proportion of these questions (3%) had a zero score after being voted up and down. 49
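The breakdown on slide 49 can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the analysis script used in the study: the field names ("score", "up_votes", "down_votes", "is_duplicate") and the toy records are assumptions made up for the example.

```python
from collections import Counter


def score_breakdown(questions):
    """Summarize question scores as shares of all questions.

    Each question is a dict with the hypothetical keys
    "score", "up_votes", "down_votes", and "is_duplicate".
    """
    total = len(questions)
    if total == 0:
        raise ValueError("need at least one question")

    # Bucket every question by the sign of its score.
    buckets = Counter()
    for q in questions:
        if q["score"] > 0:
            buckets["positive"] += 1
        elif q["score"] < 0:
            buckets["negative"] += 1
        else:
            buckets["zero"] += 1

    shares = {name: buckets[name] / total
              for name in ("positive", "zero", "negative")}

    # Share of all questions that were marked as duplicates.
    shares["duplicate"] = sum(q["is_duplicate"] for q in questions) / total

    # Among zero-score questions, the share that did receive votes
    # (up and down votes cancelling out) rather than no votes at all.
    voted_zero = sum(
        1 for q in questions
        if q["score"] == 0 and q["up_votes"] > 0 and q["down_votes"] > 0
    )
    shares["zero_after_votes"] = (
        voted_zero / buckets["zero"] if buckets["zero"] else 0.0
    )
    return shares


# Toy data for illustration only; not taken from the study.
sample = [
    {"score": 3, "up_votes": 3, "down_votes": 0, "is_duplicate": False},
    {"score": 0, "up_votes": 0, "down_votes": 0, "is_duplicate": False},
    {"score": 0, "up_votes": 1, "down_votes": 1, "is_duplicate": False},
    {"score": -1, "up_votes": 0, "down_votes": 1, "is_duplicate": True},
]

for name, share in score_breakdown(sample).items():
    print(f"{name}: {share:.1%}")
```

Note that "zero_after_votes" is computed relative to zero-score questions only, mirroring the wording "a small proportion of these questions"; the other shares are relative to all questions.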

Slide 50

Slide 50 text

50

Slide 51

Slide 51 text

51

Slide 52

Slide 52 text

52

Slide 53

Slide 53 text

I wanted to know, What role does knowledge moderation play in Stack Overflow? 53

Slide 54

Slide 54 text

https://stackoverflow.com/users?tab=moderators 54

Slide 55

Slide 55 text

55

Slide 56

Slide 56 text

56

Slide 57

Slide 57 text

“Exception handling” (by elected group of moderators) Crowd-moderation (by community members) https://stackoverflow.blog/2009/05/18/a-theory-of-moderation/ 57

Slide 58

Slide 58 text

58

Slide 59

Slide 59 text

[Yuqing Ren and Robert E. Kraut] 59

Slide 60

Slide 60 text

Bounded Context Social Media
Open world (I come from this side): Stack Overflow (Q&A), microblogging (Twitter), GitHub, blogs.
Bounded contexts (e.g., Amazon, IBM; pushed by companies): Q&A, Yammer, HipChat / Slack.
Q&A in bounded contexts: size, culture, factors, success or failure?
Is it transferable from open to bounded? Can it be mixed? 60

Slide 61

Slide 61 text

Their goal is to bridge a gap. Potential pitfalls: fragmentation of knowledge; norms and rules; moderation and community caretakers; gamification and effort vs. value. 61

Slide 62

Slide 62 text

Implications Part IV 62

Slide 63

Slide 63 text

Better understanding of social media impact on software development. (Towards) a knowledge framework: knowledge sharing, knowledge productivity, knowledge maps & knowledge flow. Insights on knowledge curation within developer communities. 63

Slide 64

Slide 64 text

Published Work 64

Slide 65

Slide 65 text

The Role of Social Media:
How Social and Communication Channels Shape and Challenge a Participatory Culture in Software Development (TSE 2016)
The (R) Evolution of Social Media in Software Engineering (FOSE ICSE 2014)
Disrupting Developer Productivity One Bot at a Time (VaR FSE 2016)
How Software Developers Mitigate Collaboration Friction with Chatbots (CSCW workshop 2017)
Software Bots (IEEE Software 2018)
Why Developers Are Slacking Off: Understanding How Software Teams use Slack (CSCW 2016 poster)
Knowledge Curation:
How the R Community Creates and Curates Knowledge: An Extended Study of Stack Overflow and Mailing Lists (EMSE 2017)
How the R Community Creates and Curates Knowledge: A Comparative Study of Stack Overflow and Mailing Lists (MSR 2016) 65

Slide 66

Slide 66 text

Collaboration and Regulation:
Using the Model of Regulation to Understand Software Development Collaboration Practices and Tool Support (CSCW 2017)
Regulation as an Enabler for Collaborative Software Development (CHASE 2015)
Participatory Platforms for Education:
Student Experiences Using GitHub in Software Engineering Courses: A Case Study (SEET ICSE 2016)
The Emergence of GitHub as a Collaborative Platform for Education (CSCW 2015)
Research Methods for Software Engineering:
Selecting Research Methods for Studying a Participatory Culture in Software Development: Keynote (EASE 2015)
Methodology Matters: Is There a Method Choice Bias in Software Engineering? (under review for NIER 2018)
A Structured Travelogue Approach for Communicating Qualitative Research in Software Engineering (rejected, will be resubmitted to EMSE) 66

Slide 67

Slide 67 text

No content

Slide 68

Slide 68 text

Fin [from “The illustrated guide to a Ph.D.”] 68