Towards a Thesis

Conceptualizing Knowledge Curation in Software Developer Communities: A Socio-Technical Perspective

Alexey Zagalsky

November 24, 2017

Transcript

  1. Conceptualizing Knowledge Curation
    in Software Developer Communities:
    A Socio-Technical Perspective
    Alexey Zagalsky, Nov. 2017
    Towards a Thesis

  2. Disclaimer
    This is pre-synthesized, raw, and probably doesn’t
    make much sense (yet).
    I’ve been lucky to collaborate with many talented
    people. The work I describe next has been done in
    collaboration with:
    Margaret-Anne Storey, Daniel M. German, Carlos
    Gómez Teshima, Germán Poo-Caamaño, Leif
    Singer, Fernando Figueira Filho, Maryi
    Arciniegas-Mendez, Carlene Lebeuf, and Bin Lin.
    Images used in these slides are used for educational purposes only, and I claim no credit for any of the
    images. Images are copyright of their respective owners.

  3. “Our modern society runs on software. But the tools we
    use to build software are buckling under increased
    demand.”
    Nadia Eghbal, Roads and Bridges: The Unseen Labor Behind Our Digital Infrastructure
    “Software is eating the world” - Marc Andreessen

  4. “No century in recorded history has experienced so many
    social transformations and such radical ones as the
    twentieth century.”
    - P.F. Drucker, 2001

  5. “The WILD WEST of communication channels”

  6. I wanted to know,
    How social channels and tools affect
    software development
    “I had leaned and climbed forward like Alice through the looking-glass.
    I had no idea just how deep the rabbit hole would go.”

  7. The Role of Social Media in
    Software Development:
    A Socio-Technical Perspective
    Part I
    "The medium is the message" - Marshall McLuhan

  8. 8

  9. We conducted a survey with 1,449 developers on GitHub
    The Role of Social Media

  10. 10

  11. “Wait, but Slack is meant for team communications, but
    nobody told you the team has to be a certain size, you can
    literally build a community as big as 10,000 users or more”
    - anonymous

  12. We asked about challenges developers face when using
    these communication channels

  13. 13

  14. Explored an Emerging Channel
    Developers have recently adopted a new and versatile
    channel—development chatbots
    “The Most Important Startup’s Hardest Worker Isn’t a Person” - [wired]

  15. Developer chatbot roles:
    Code bots
    Test bots
    DevOps bots
    Support bots
    Documenting bots
    Entertainment bots
    Developer bots enhance efficiency and effectiveness by
    automating tasks, help developers stay in flow,
    improve decision making, support team cognition,
    and regulate individual and team tasks and goals.

  16. Chatbots help mitigate collaboration friction points
    Friction in Team Interactions
    Understanding team members’ roles and expertise
    Adhering to team procedures and agreements
    Understanding and working towards team goals
    Coordinating team activities
    Managing trust and team cooperation
    Friction in Individuals’ interactions with Technology
    Distracting and interrupting technologies
    Maintaining awareness of new technologies
    Understanding channel affordances
    Friction in Team’s interaction with Technology
    Information fragmentation and overload
    Adopting and understanding tool usage in the team’s context
    Maintaining awareness of project activities
    Inadequate collaboration tooling
    Miscommunication on text-based channels

  17. “ChatOps is a collaboration model that connects people,
    tools, process, and automation into a transparent
    workflow. This flow connects the work needed, the work
    happening, and the work done in a persistent location
    staffed by the people, bots, and related tools.”
    - Sean Regan, Atlassian

  18. Software Development as a
    Knowledge Building Process
    Part II
    “The limits of my language define the limits of my world.”
    - Ludwig Wittgenstein

  19. “Knowledge work is when individuals use their cognitive abilities,
    technical know-how, interactions with others, and individual creativity to
    achieve work outcomes.”
    [Winslow and Bramer, 1994]
    “Knowledge workers are said to be involved in defining the scope of their
    work, being self-managed, searching for new ways of doing things,
    continuously learning and teaching others, and emphasizing both the quality
    and quantity of the outcomes.”
    [Drucker, 1999]

  20. “Given the complexity of knowledge work, most
    researchers now agree that this form of work is seldom
    performed as a solitary endeavor.”
    [McDermott, 2005]
    “Perhaps more than any other form of work, knowledge
    work has pointed to the need for individuals to collaborate
    together, rather than work alone.”
    [Woolley, 2009]
    Social Context of Knowledge Work

  21. Social and communication channels provide the means
    for managing knowledge
    Software can help automate many routine activities in
    the workplace
    Social and communication channels can serve as the
    content of the work itself
    Knowledge Work and Social Media

  22. “Software developers are at the cutting edge of knowledge
    work. In many ways, they’re the prototype of the future
    knowledge worker; they’re pushing the boundaries of
    twenty-first century knowledge work.
    Modern knowledge work is enabled by and dependent on
    information technology - technologies that are created by
    software developers and used by legions of knowledge
    workers worldwide.”
    [Allan Kelly, 2014]

  23. What is Knowledge? What is not?
    “Knowledge happens when information meets experience, values,
    contextual understanding about the specific situations, application,
    intuition and beliefs.”
    Tanmay Vora
    “A process or a competent goal-oriented activity rather than as an
    observable and transferable resource”
    Billet, 1998

  24. Software is built with the tacit knowledge in the developers’ heads, and
    the externalized knowledge (explicit) embodied in the development
    tools, channels, and project artifacts. [Naur 1985]
    Naur considers programming as a “theory building process” and he
    stresses the importance of tacit knowledge.
    Tacit knowledge can be further decomposed into procedural (e.g.,
    practiced skills) and declarative knowledge (e.g., facts) [Robillard 1999]
    “Knowledge is created out of a dialogue between tacit and explicit
    knowledge” [Nonaka, 1991, 1994]
    Tacit - Tacit (e.g., apprenticeship)
    Explicit - Explicit
    Tacit - Explicit (e.g., learning craft skills)
    Explicit - Tacit (e.g., internalization of new knowledge)

  25. Wasko and Faraj [2000]
    distinguish different types of
    knowledge:
    Knowledge embedded in people (Tacit knowledge)
    Knowledge as object (Externalized knowledge)
    Knowledge socially generated, maintained, and exchanged within
    emergent communities of practice (Knowledge as public good)
    We added a fourth type, knowledge about people and
    social networks [Storey et al. 2014]

  26. 26

  27. [Wagstrom et al. 2011]

  28. Mental model A

  29. Mental model B

  30. Current mental model (after many iterations)
    Actors: Contributors, Stakeholders
    Assemblages & Communities of Practice: Teams, Organization
    Processes & Practices: Agile
    Activities: Coding
    Tools & Channels: IDE, face-to-face
    Artifacts: Code, Documentation, Q&A, History

  31. Activity theory applied to software engineering
    [Tell and Babar, 2012]

  32. Software development is a knowledge building process
    characterized by (1) knowledge activities and actions and
    (2) stakeholder roles, and enabled by (3) socially enhanced
    tools and communication channels.

  33. Reinhardt et al. 2011
    Acquisition
    Analyze
    Authoring
    Co-authoring
    Dissemination
    Expert Search
    Feedback
    Information organization
    Information search
    Learning
    Monitoring
    Networking
    Service search
    [Tell and Babar, 2012]

  34. Acquisition
    Authoring
    Co-Authoring (communicating and coordinating with others)
    Dissemination (can be either of content or activities)
    Feedback
    Information Organization and Curation
    Learning
    Monitoring
    Networking
    Searching (information, services, or experts)
    Knowledge Activity Typology for Soft. Dev.

  35. [Ford et al., 2017]

  36. Knowledge Curation
    Part III

  37. I wanted to know,
    How is knowledge constructed and
    curated in a developer community?
    “In software development, the main difference between social media
    artifacts and traditional artifacts is that the former can be freely
    configured by everybody participating in the development, whereas the
    latter can only be configured by a ‘gatekeeper’. ”
    C. Treude, Thesis, 2012

  38. A Socio-Technical Perspective
    Groups and communities are the primary unit of analysis

  39. R is an increasingly popular open source programming
    language
    The R community plays an important role in
    knowledge creation and diffusion
    Two particular communication channels for Q&A are
    Stack Overflow and the R-help mailing list

  40. Stack Overflow vs. Mailing Lists
    Since 2010, there has been a decrease in the number of
    messages on R-help and an increase on Stack Overflow
    [Vasilescu 2014]
    Projects that migrated from mailing lists to Stack
    Overflow showed improvements [Squire 2015]

  41. 41

  42. Questions: How-to, Set up, Bug / Error / Exception, Discrepancy, Decision help,
    Conceptual / Guidance, Code reviewing, Other, Non-functional, Future reference
    Answers: Redirecting, Clue / Suggestion / Hint, Tutorial, Source code, Alternative,
    Explanation, Announcement, Benchmark, Opinion
    Updates: Announcement, Expansion, Background, Correction, Explanation, Solution
    Flags: Off topic / Opinion, Too localized, Not an answer, Repeated question, Unclear
    Comments: Clarification, Complement / Criticism, Expansion, Correction / Alternative,
    External reference

  43. Questions: How-to, Set up, Bug / Error / Exception, Discrepancy, Decision help,
    Conceptual / Guidance, Code reviewing, Other, Non-functional, Future reference
    Answers: Redirecting, Clue / Suggestion / Hint, Tutorial, Source code, Alternative,
    Explanation, Announcement, Benchmark, Opinion
    Updates: Announcement, Expansion, Background, Correction, Explanation, Solution
    Flags: Off topic / Opinion, Too localized, Not an answer, Repeated question, Unclear
    Comments: Clarification, Complement / Criticism, Expansion, Correction / Alternative,
    External reference
    SO %     RH %
    20.20%   15.03%
    13.01%   2.59%
    24.54%   17.62%
    5.33%    18.13%
    4.09%    16.93%
    25.15%   17.44%
    0.99%    5.70%
    0.62%    0.52%
    6.07%    6.04%
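
One way to summarize how far apart the two percentage columns are is the total variation distance, half the sum of the absolute row-wise differences. The sketch below is illustrative only: it assumes each column is a distribution over the same categories in the same row order, and it leaves the rows unlabeled because the slide text does not pair the values with category names. The function and variable names are hypothetical.

```python
# Percentages copied from the slide, in the order listed.
# Assumption: row i of both columns refers to the same (unlabeled) category.
so = [20.20, 13.01, 24.54, 5.33, 4.09, 25.15, 0.99, 0.62, 6.07]
rh = [15.03, 2.59, 17.62, 18.13, 16.93, 17.44, 5.70, 0.52, 6.04]


def total_variation(p, q):
    """Half the sum of absolute differences between two equal-length distributions."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of rows")
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


# Distance in percentage points (0 = identical columns, 100 = disjoint).
tvd = total_variation(so, rh)

# Zero-based index of the row where the two channels differ the most.
largest_gap_row = max(range(len(so)), key=lambda i: abs(so[i] - rh[i]))
```

A single number like this hides which categories drive the difference, so the per-row gaps are usually worth reading alongside it.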

  44. Questions: How-to, Set up, Bug / Error / Exception, Discrepancy, Decision help,
    Conceptual / Guidance, Code reviewing, Other, Non-functional, Future reference
    Answers: Redirecting, Clue / Suggestion / Hint, Tutorial, Source code, Alternative,
    Explanation, Announcement, Benchmark, Opinion
    Updates: Announcement, Expansion, Background, Correction, Explanation, Solution
    Flags: Off-topic / Opinion, Too localized, Not an answer, Repeated question, Unclear
    Comments: Clarification, Complement / Criticism, Expansion, Correction / Alternative,
    External reference
    SO %     RH %
    4.40%    1.12%
    12.07%   23.08%
    49.10%   0.81%
    18.92%   33.60%
    13.54%   38.46%
    1.96%    2.83%

  45. Interestingly, we found that both channels are used by
    the R community and both support Q&A knowledge;
    however, there are important differences between the
    two channels

  46. Participatory
    Knowledge Construction
    Crowd
    Knowledge Construction

  47. Community Participation Patterns

  48. 48

  49. We explored three potential reasons for the decrease in
    questions with a positive score:
    1. We found the proportion of questions marked as duplicates is
    increasing, but the overall number is only 3% of all questions.
    2. Then we counted the number of questions with a negative
    score, but this only accounts for 2.9% of all questions.
    3. We found that 29.2% of all posts have a score equal to zero. A
    small proportion of these questions (3%) had a zero score after
    being voted up and down.
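
The breakdown above amounts to bucketing questions by the sign of their score and reporting each bucket's share. A minimal sketch of that tally follows, assuming the scores are available as a plain list of integers; the sample values are made up for illustration and are not data from the study, and `score_breakdown` is a hypothetical helper name.

```python
from collections import Counter


def score_breakdown(scores):
    """Return the percentage of questions with positive, zero, and negative scores."""
    if not scores:
        raise ValueError("scores must not be empty")
    counts = Counter(
        "positive" if s > 0 else "negative" if s < 0 else "zero" for s in scores
    )
    total = len(scores)
    # Counter returns 0 for a bucket that never occurred, so every key is present.
    return {
        bucket: round(100 * counts[bucket] / total, 1)
        for bucket in ("positive", "zero", "negative")
    }


# Hypothetical scores for ten questions (not data from the study).
example_scores = [3, 0, 1, -1, 0, 2, 0, 5, 1, 0]
breakdown = score_breakdown(example_scores)
```

Note that a final score of zero does not distinguish a question nobody voted on from one that was voted up and down equally; separating those cases needs the individual vote records, not just the score.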

  50. 50

  51. 51

  52. 52

  53. I wanted to know,
    What role does knowledge
    moderation play in Stack Overflow?

  54. https://stackoverflow.com/users?tab=moderators

  55. 55

  56. 56

  57. “Exception handling”
    (by elected group of moderators)
    Crowd-moderation
    (by community members)
    https://stackoverflow.blog/2009/05/18/a-theory-of-moderation/

  58. 58

  59. [Yuqing Ren and Robert E. Kraut]

  60. Bounded Context Social Media
    Open World
    Stack Overflow (Q&A)
    Microblogging (Twitter)
    GitHub
    Blogs
    Bounded Contexts
    (e.g. Amazon, IBM)
    Q&A:
    ● size
    ● culture
    ● factors
    ● success or failure?
    Yammer
    Hipchat / Slack
    Is it transferable
    from open to
    bounded?
    Can it be mixed?
    I come from
    this side
    Pushed by companies

  61. Their goal is to bridge a gap
    Potential pitfalls:
    Fragmentation of knowledge
    Norms and rules
    Moderation and community caretakers
    Gamification and effort-vs.-value

  62. Implications
    Part IV

  63. Better understanding of social media impact on
    software development
    (Towards) A knowledge framework
    Knowledge sharing
    Knowledge productivity
    Knowledge maps & knowledge flow
    Insights on knowledge curation within developer
    communities

  64. Published Work

  65. The Role of Social Media
    How Social and Communication Channels Shape and Challenge a
    Participatory Culture in Software Development (TSE 2016)
    The (R) Evolution of Social Media in Software Engineering (FOSE ICSE 2014)
    Disrupting Developer Productivity One Bot at a Time (VaR FSE 2016)
    How Software Developers Mitigate Collaboration Friction with Chatbots
    (CSCW workshop 2017)
    Software Bots (IEEE Software 2018)
    Why Developers Are Slacking Off: Understanding How Software Teams
    Use Slack (CSCW 2016 poster)
    Knowledge Curation
    How the R Community Creates and Curates Knowledge: An Extended
    Study of Stack Overflow and Mailing Lists (EMSE 2017)
    How the R Community Creates and Curates Knowledge: A Comparative
    Study of Stack Overflow and Mailing Lists (MSR 2016)

  66. Collaboration and Regulation
    Using the Model of Regulation to Understand Software Development Collaboration Practices
    and Tool Support (CSCW 2017)
    Regulation as an Enabler for Collaborative Software Development (CHASE 2015)
    Participatory Platforms for Education
    Student Experiences Using GitHub in Software Engineering Courses: A Case Study (SEET ICSE
    2016)
    The Emergence of GitHub as a Collaborative Platform for Education (CSCW 2015)
    Research Methods for Software Engineering
    Selecting Research Methods for Studying a Participatory Culture in Software Development:
    Keynote (EASE 2015)
    Methodology Matters: Is There a Method Choice Bias in Software Engineering? (under review
    for NIER 2018)
    A Structured Travelogue Approach for Communicating Qualitative Research in Software
    Engineering (rejected, will be resubmitted to EMSE)

  67.

  68. Fin
    [from “The illustrated guide to a Ph.D.”]