Questions for Data Scientists in Software Engineering: A Replication

Conducted at ING, a software-defined enterprise providing banking solutions, this study presents 171 questions that software engineers at ING would like data scientists to answer. It replicates a similar study at Microsoft. We found that the core software development challenges (relating to code, developer, and customer) remain the same. The subtle differences we observed relate to the two companies' contexts and the time gap between the two studies.

Ayushi Rastogi

October 28, 2020

Transcript

  1. Impact of MS study • Chart values: 49%, 35%, 12%, 4% • Chart labels: Plain reference, Example, Partly answers, Answers
  2. Impact of MS study • Chart values: 34%, 25%, 20%, 15%, 5% • Chart labels: Analytics; Testing, quality; Process; Culture; Mobile apps
  3. Design • Replication of MS study • Survey 1: find data science problems • Survey 2: rank questions on relevance • Comparison of MS and ING study • Reflections for research and industry
  4. Top essential questions • Developer: Role of team happiness and pleasure at work on productivity and performance of DevOps teams • Best practices: How to build for reusability and scalability? • Bug: Compare effort spent on fixing bugs and vulnerabilities to writing correct software from the start
  5. Top unwise questions • Customer and requirement: Software solution in one language, for all persons and interests • Best practices: Convert PL1 code to Cobol with readability • Developer productivity: Compare performance of one department against another
  6. Different from MS study • Compare broader themes • Word count • Map categories • Compare questions across themes
  7. Context and time matter • Focus: MS on customer; ING on engineering team • Type of questions: Cloud-related questions at MS; deployment-related at ING • Relevance of questions: MS questions on adoption of agile and automated testing; ING on functional aspects • Eye on future: MS questions on legacy code; not at ING
  8. Reflections on Replication • Policy influences data availability and design. • Companies have incomparable information. • A study independent of the prior study is hard. • Free from corporate influence vs. self-censorship.
  9. Software Engineering • What can be answered? What should be answered? • Use new list of questions • Ask questions relevant to software-defined enterprises • Replicate often and in different contexts • Collect data for hard questions
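
Slide 6 lists word count as one of the ways the ING questions were compared with the MS questions. The deck does not show how that comparison was done, so the following is only a minimal sketch of one possible approach, assuming simple lowercase tokenization. The function names and the sample questions are hypothetical placeholders, not data from either study.

```python
import re
from collections import Counter


def word_counts(questions):
    """Count lowercase alphabetic tokens across a list of question strings."""
    counts = Counter()
    for question in questions:
        counts.update(re.findall(r"[a-z]+", question.lower()))
    return counts


def compare_counts(counts_a, counts_b):
    """Map each word seen in either corpus to its (count_a, count_b) pair."""
    words = sorted(set(counts_a) | set(counts_b))
    return {word: (counts_a[word], counts_b[word]) for word in words}


# Hypothetical placeholder questions, NOT taken from the MS or ING surveys.
ms_questions = [
    "How do customers use the cloud service?",
    "Does legacy code slow down testing?",
]
ing_questions = [
    "How does deployment frequency affect team productivity?",
    "Does team happiness affect productivity?",
]

comparison = compare_counts(word_counts(ms_questions), word_counts(ing_questions))

# Print only the words whose frequency differs between the two sets.
for word, (ms_count, ing_count) in comparison.items():
    if ms_count != ing_count:
        print(f"{word}: MS={ms_count}, ING={ing_count}")
```

A real comparison would likely also need stop-word removal and normalization by the number of questions per study, since raw counts favor the larger question set.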