Scalability, Practicability (and Promotion) in SE Research

ASERG, DCC, UFMG
September 17, 2018

GitHub is the world’s largest collection of open source software, with around 27 million users and 80 million repositories. These numbers make GitHub an invaluable source of data for large-scale empirical software engineering research. In this talk, we describe recent research conducted in our group, based on GitHub data. For example, we are using GitHub to predict the popularity of open source projects, to understand the motivations behind refactoring, to characterize the evolution of APIs, and to reveal key characteristics of open source development teams. In the talk, we also plan to discuss the strategies we are using to make our research results known to practitioners.

Transcript

  1. Scalability, Practicability (and Promotion) in SE Research. Marco Tulio Valente, ASERG, DCC, UFMG, BR. @mtov. SBCARS, September 2018

  2. My take on SE research

  3. SE research is still relevant, after 50 years

  4. but we should focus on scalable solutions to practical problems

  5. Counterexample #1: solution to a very simple context or system

  6. Counterexample #2: solution to a problem developers will have in 10 years

  7. We need scalability & practicability

  8. Scalability & Practicability ➜ Data

  9. "The world's most valuable resource is no longer oil, but data" (The Economist, May 2017)

  10. For the first time in 50 years, we have lots of data

  11. This talk's story is about using as much data as possible to shed light on modern software engineering problems

  12. Of course, we didn't discover GitHub alone; almost everyone is doing the same

  13. Data ➜ Quantitative & Qualitative

  14. Part I: Large Scale Surveys (Surveys = Qualitative Studies)

  15. "Why" surveys • Why do we refactor? • Why do we break APIs? • Why do open source projects fail? • Why do we star GitHub projects?

  16. Why do we refactor? FSE 2016, with Danilo and Tsantalis

  17. Why do we really refactor, in practice?

  18. Why do we really refactor? • Danilo tracked (using a tool) refactorings in ◦ 748 Java projects, 61 days ◦ 1,411 refactorings, 185 projects, by 465 devs ◦ 195 answers (42%), right after refactoring
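
[ For readers curious about the mechanics: the sketch below shows how such a study can consume RefactoringMiner programmatically to mine refactorings from a repository's history. It is a minimal sketch following the API names in the tool's README; the repository URL and local path are placeholders, and the actual study ran a continuous monitoring infrastructure around this core. ]

```java
import java.util.List;

import org.eclipse.jgit.lib.Repository;
import org.refactoringminer.api.GitHistoryRefactoringMiner;
import org.refactoringminer.api.GitService;
import org.refactoringminer.api.Refactoring;
import org.refactoringminer.api.RefactoringHandler;
import org.refactoringminer.rm1.GitHistoryRefactoringMinerImpl;
import org.refactoringminer.util.GitServiceImpl;

public class TrackRefactorings {
    public static void main(String[] args) throws Exception {
        GitService gitService = new GitServiceImpl();
        // Clone (or reuse) the project under study; URL and path are placeholders.
        Repository repo = gitService.cloneIfNotExists(
                "tmp/some-project", "https://github.com/some-org/some-project.git");

        GitHistoryRefactoringMiner miner = new GitHistoryRefactoringMinerImpl();
        // Walk the history of the main branch, reporting refactorings per commit.
        miner.detectAll(repo, "master", new RefactoringHandler() {
            @Override
            public void handle(String commitId, List<Refactoring> refactorings) {
                for (Refactoring r : refactorings) {
                    // In the study, each detection triggered a firehouse e-mail
                    // to the author of the commit.
                    System.out.println(commitId + ": " + r);
                }
            }
        });
    }
}
```
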
  19. FSE 2016: 44 real reasons for refactoring

  20. Key Finding: Refactoring is driven by the need to add new features and fix bugs, and much less by code smell resolution

  21. Why do we break APIs? SANER 2018, with Aline, Laerte, and Andre

  22. Why do we break APIs? • Aline tracked (using a tool) breaking changes (BCs) in ◦ 400 Java libraries & frameworks ◦ 116 days ◦ 282 possible BCs, by 102 developers ◦ 56 answers (55%), right after the BCs
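
[ To make "breaking change" concrete: a BC is any change that can invalidate client code, e.g., removing or changing a public method. The sketch below is not the detector used in the study; it only illustrates the core check, diffing the public method signatures of two versions of a class loaded side by side. Real detectors also cover fields, types, and inheritance, and work on the version history rather than via reflection. ]

```java
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Collectors;

public class BreakingChangeSketch {

    // The public method signatures (name + parameter types) of a class.
    static Set<String> publicSignatures(Class<?> c) {
        return Arrays.stream(c.getDeclaredMethods())
                .filter(m -> Modifier.isPublic(m.getModifiers()))
                .map(m -> m.getName() + Arrays.toString(m.getParameterTypes()))
                .collect(Collectors.toSet());
    }

    // Signatures present in the old version but missing in the new one are
    // potential BCs (removed or incompatibly changed methods).
    static Set<String> potentialBreakingChanges(Class<?> oldVersion, Class<?> newVersion) {
        Set<String> removed = new HashSet<>(publicSignatures(oldVersion));
        removed.removeAll(publicSignatures(newVersion));
        return removed;
    }
}
```
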
  23. Key Finding: We break APIs to implement new features (32%), to simplify the APIs (29%), and to improve maintainability (24%)

  24. [ firehouse interviews = surveys/interviews conducted right after the event of interest ]

  25. Why do open source projects fail? FSE 2017, with Jailton

  26. fail ➜ become deprecated

  27. Why do open source projects fail? • Jailton asked this question to the maintainers of ◦ 408 projects without commits for one year ◦ 118 answers (29%)

  28. Why do open source projects fail?
      Reason                       Projects
      Usurped by competitor        27
      Obsolete                     20
      Lack of time                 18
      Lack of interest             18
      Outdated technologies        14
      Low maintainability          7
      Conflicts among developers   3
      Legal problems               2
      Acquisition                  1

  29. Why do we star GitHub projects? JSS, to appear, with Hudson

  30. (image-only slide)

  31. Interesting: developers care about this metric!

  32. (image-only slide)

  33. Why do we star GitHub projects? • Hudson asked this question to 4,370 GitHub users ◦ right after they starred a popular repository ◦ 791 answers (19%)
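
[ Targeting users "right after they starred" is feasible because GitHub's REST API exposes stargazers with timestamps. A minimal sketch with Java 11's HttpClient; the repository name is a placeholder, and the star+json media type is what adds the starred_at field (authentication, pagination, and error handling omitted). ]

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RecentStargazers {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        // With this media type, each entry carries a starred_at timestamp,
        // so a firehouse survey can target users who starred very recently.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://api.github.com/repos/some-org/some-repo/stargazers"))
                .header("Accept", "application/vnd.github.v3.star+json")
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // JSON array of {starred_at, user}
    }
}
```
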
  34. Key Finding #1: We star repositories to show appreciation (~50%) and to bookmark projects (~50%)

  35. Key Finding #2: 3 out of 4 devs consider stars before contributing to or using GitHub projects

  36. Why do we need "why-surveys"? Where's the practicability?

  37. (1) Surveys require building tools (2) Surveys are used to evaluate tools (3) Surveys motivate building tools (4) Surveys contribute to public datasets

  38. (1) Surveys require building tools

  39. Refactoring Detection Tools • RefactoringMiner (1.0) (Tsantalis, CASCON 2013) • RefactoringMiner (1.1) (Tsantalis, Danilo, MT, FSE 2016) • RefDiff (1.0, new tool) (Danilo, MT, MSR 2017) • RefactoringMiner (2.0) (Tsantalis et al., ICSE 2018) • RefDiff (2.0) (??)

  40. Refactoring Detection Tools • RefactoringMiner (1.0) (Tsantalis, CASCON 2013) • RefactoringMiner (1.1) (Tsantalis, Danilo, MT, FSE 2016) • RefDiff (1.0, new tool) (Danilo, MT, MSR 2017) • RefactoringMiner (2.0) (Tsantalis et al., ICSE 2018) • RefDiff (2.0) (??) ⇒ refactoring-aware tools (code reviews, MSR, etc.) [ see Andre, Romain & MT, ICSE18 ]

  41. (2) Surveys are used to evaluate tools

  42. Truck (or Bus) Factor

  43. Truck (or Bus) Factor: the minimum number of developers that, if hit by a truck (or bus), would put a project at serious risk

  44. TF reveals the concentration of knowledge in software projects
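
[ In essence, the estimation attributes each file to its main author(s) via a degree-of-authorship measure, then greedily removes the most prolific authors until more than half of the files are orphaned; the number of removed developers is the TF. A simplified sketch of that greedy loop, assuming the file-to-authors map is precomputed: ]

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class TruckFactorSketch {

    // fileAuthors maps each file to the developers considered its authors
    // (in the actual study, authorship comes from a degree-of-authorship model).
    static int truckFactor(Map<String, Set<String>> fileAuthors) {
        // Deep copy, since the author sets are mutated below.
        Map<String, Set<String>> remaining = new HashMap<>();
        fileAuthors.forEach((file, devs) -> remaining.put(file, new HashSet<>(devs)));

        int totalFiles = fileAuthors.size();
        int removedDevs = 0;

        while (true) {
            // Files with no remaining author are "orphaned".
            long orphaned = remaining.values().stream().filter(Set::isEmpty).count();
            if (orphaned > totalFiles / 2.0) {
                return removedDevs; // project at serious risk: TF reached
            }
            // Count how many files each remaining developer authors.
            Map<String, Integer> coverage = new HashMap<>();
            for (Set<String> devs : remaining.values()) {
                for (String dev : devs) {
                    coverage.merge(dev, 1, Integer::sum);
                }
            }
            if (coverage.isEmpty()) {
                return removedDevs; // no developers left to remove
            }
            // Greedily "hit by a truck" the developer covering the most files.
            String topDev = coverage.entrySet().stream()
                    .max(Map.Entry.comparingByValue()).get().getKey();
            remaining.values().forEach(devs -> devs.remove(topDev));
            removedDevs++;
        }
    }
}
```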

  45. Interesting: (1) developers care about this metric

  46. Interesting: (2) TF is also valuable for closed-source projects

  47. ICPC 2016, with Guilherme, Leonardo, and Andre

  48. Evaluation with GitHub Projects: 133 projects; 65% have TF ≤ 2

  49. Validation with Developers • 114 projects • 62 answers (54%) • 84% agree/partially agree

  50. (3) Surveys motivate building tools

  51. Example: gittrends.io

  52. gittrends.io

  53. gittrends.io

  54. Growth Patterns

  55. (4) Surveys contribute to public datasets

  56. Example: Refactoring instances: https://github.com/aserg-ufmg/why-we-refactor

  57. Ethical Considerations

  58. Slide used by the authors in their ESEM presentation

  59. "I get emails like this every week ... This problem

    [is] worse than spam, since Google at least filters out spam for me". 59
  60. Recommendations for Large Scale Surveys

  61. Based on our experience/lessons learned:
      1. Questions should focus on practical and prevailing problems
      2. Questions should focus on recent events
      3. Questions should be sent by e-mail
      4. E-mails should have 2-3 short and clear questions
      5. Avoid sending thousands of e-mails
      6. Never send two e-mails to the same person (even in distinct studies)
      7. Never identify the participants (names, e-mails, projects, etc.)

  62. In our experience, there is a strong correlation between practical value and response rate: practical value ➜ response rates of at least 20%

  63. Part II: Large Scale Quantitative Studies (no e-mails to developers)

  64. Example 1: Quantitative Analysis of Breaking Changes

  65. 501,645 library changes; 28% are breaking changes. SANER 2017, with Laerte, Aline, and Andre

  66. Example 2: Quantitative Analysis of Deprecation Messages

  67. API Deprecation Messages

  68. % of API elements deprecated with replacement messages. SANER 2016 & JSS 2018, with Gleison and Andre

  69. % of API elements deprecated with replacement messages ⇒ automatic documentation. SANER 2016 & JSS 2018, with Gleison and Andre
  70. Part III: Research Promotion (no e-mails to developers)

  71. Key to making our results known to developers (i.e., to transferring knowledge)

  72. Promotion Channels • http://aserg.labsoft.dcc.ufmg.br • https://github.com/aserg-ufmg • @mtov • https://medium.com/@aserg.ufmg • https://arxiv.org • https://speakerdeck.com/aserg_ufmg

  73. [ there are other ways to transfer knowledge, e.g., open tools = important, but very hard ]

  74. Lots of (mostly positive) feedback!

  75. "An astonishing paper that may explain why it's so difficult to patch"

  76. Hacker News Effect

  77. PeerJ Preprint (2015): https://peerj.com/preprints/1233

  78. "Oh, nice to hear from you! I heard a lot about (and read) your group's truck factor paper. Cool work!" (answer received in another survey, not related to TFs)

  79. Best citations ever ...

  80. Best citations ever: dev conferences ... Heather Miller's talk, Lambda Days 2018

  81. Best citations ever … or tech reports: Nadia Eghbal, Ford Foundation

  82. Scalability & Practicability (& Promotion). Thanks!