Bridging the Two Solitudes
What is this talk about?
Let's use our data analysis tools to analyze our
Let's make smarter decisions based on what we
Teach others what we've learned and help them
reproduce our experiments
“...an important aspect of Haskell's
power lies in the compactness of the
code we write.”
“Compared to working in
popular traditional languages,
when we develop in Haskell
we often write much less code,”
“...in substantially less time,
and with fewer bugs.”
“My impression is that the
Haskell shrinking factor
averages around four, but
obviously it varies a lot.”
This isn't “someone is
wrong on the internet”
This is “someone could be
more right on the internet”
Python is more readable than other languages.
Unit testing helps us maintain quality.
We don't do unit testing; it takes too much time for
too little benefit.
Python is good for beginners.
C++ is better for beginners.
Happy programmers make better programmers.
Most patches don't get many reviewers.
You should keep your patches small.
Most code review is just nitpicking.
“Many eyeballs make all bugs shallow.”
Facts with no
Scientists get this
Can we do better?
We do benchmarks to test performance.
We usability test our UIs.
We use analytics on our websites.
Why don't we analyze ourselves?
What We Actually Know About Software
Development, and Why We Believe It's True
Offices: doors open or closed?
Pair programming: yea or nay?
Modern code review
Failure prediction using organizational structure
Some interesting papers
What can we learn about code review?
Rigby* and Bird, 2013 @ FSE
“Convergent Contemporary Software Peer Review Practices”
* my supervisor last summer
First Responses on Reviews
Number of Reviewers
Hindle et al., 2006
Academia is hard.
Let's write code!
People read it
People reproduced it
People fixed it
People improved it
That was fun
Gousios, G., Pinzger, M., and van Deursen, A., 2013
80% of pull requests are merged in < 3 days,
while 30% are merged in < 1 hour.
70% of all pull requests are merged.
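Numbers like these are easy to recompute for your own project. A minimal sketch, assuming you already have (opened, merged) timestamp pairs for each pull request; the function name and the sample data are made up for illustration and are not from the paper:

```python
from datetime import datetime, timedelta


def fraction_merged_within(pull_requests, limit):
    """Share of merged pull requests that were merged in less than `limit`.

    `pull_requests` is a list of (opened, merged) datetimes; `merged` is
    None for pull requests that were never merged, and those are skipped.
    """
    durations = [merged - opened
                 for opened, merged in pull_requests
                 if merged is not None]
    if not durations:
        return 0.0
    return sum(1 for d in durations if d < limit) / len(durations)


# Hypothetical sample data, not real measurements.
t0 = datetime(2013, 1, 1, 9, 0)
sample = [
    (t0, t0 + timedelta(minutes=30)),
    (t0, t0 + timedelta(hours=5)),
    (t0, t0 + timedelta(days=2)),
    (t0, t0 + timedelta(days=10)),
    (t0, None),  # never merged
]

within_hour = fraction_merged_within(sample, timedelta(hours=1))
within_3_days = fraction_merged_within(sample, timedelta(days=3))
print(within_hour, within_3_days)
```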
To merge or not to merge:
1. How active the area affected by the pull request
has been recently
2. The size of the project
3. The number of files changed by the pull request
How do people use git?
Tavish's Git Workflow
Julia's Git Workflow
Mozilla Baloo: Tracking contributor data
What time of day do most bugs get checked in?
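A rough first pass at this question, as a sketch: count commits per hour of the day. It assumes the timestamps come from something like `git log --pretty=format:%aI` (author date in strict ISO 8601), and it counts all commits, not just buggy ones; deciding which commits introduced a bug is a separate problem that this sketch does not attempt.

```python
from collections import Counter
from datetime import datetime


def commits_per_hour(iso_timestamps):
    """Count commits by the hour (0-23) in the author's local time zone."""
    counts = Counter()
    for stamp in iso_timestamps:
        stamp = stamp.strip()
        if stamp:
            counts[datetime.fromisoformat(stamp).hour] += 1
    return counts


# Hypothetical timestamps standing in for real `git log` output.
sample = """2014-04-11T14:03:22-04:00
2014-04-11T14:47:10-04:00
2014-04-12T23:15:00+02:00
""".splitlines()

hist = commits_per_hour(sample)
for hour, count in sorted(hist.items()):
    print(f"{hour:02d}:00  {'#' * count}")
```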
git2json: Tools to pull out data from VCS
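To give a flavour of the idea (this is not git2json's actual code or interface), here is a minimal sketch that turns tab-separated log output into JSON records. It assumes the log was produced with `git log --pretty=format:%H%x09%an%x09%aI%x09%s`; the field names and sample lines are invented for illustration.

```python
import json

# Assumed field order, matching the --pretty format string above.
FIELDS = ("commit", "author", "date", "subject")


def log_to_records(log_text):
    """Parse tab-separated `git log` output into a list of dicts."""
    records = []
    for line in log_text.splitlines():
        if not line.strip():
            continue
        # Split on the first three tabs only, so a tab inside the
        # subject line does not break the record apart.
        parts = line.split("\t", len(FIELDS) - 1)
        if len(parts) != len(FIELDS):
            continue  # skip malformed lines rather than guess
        records.append(dict(zip(FIELDS, parts)))
    return records


# Hypothetical log lines, not from a real repository.
sample_log = (
    "a1b2c3\tAlice\t2014-04-11T14:03:22-04:00\tFix off-by-one in parser\n"
    "d4e5f6\tBob\t2014-04-12T23:15:00+02:00\tAdd tests for parser\n"
)

records = log_to_records(sample_log)
print(json.dumps(records, indent=2))
```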
gitcoach by Mike Hoye: “You modified file X?
Consider also modifying file Y.”
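One simple way to get suggestions of this kind is to count how often files change together. The sketch below is an assumption about how such a tool could work, not a description of gitcoach's algorithm; the function names, the 0.5 cutoff, and the commit history are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_cochange(commits):
    """Count how often each file, and each pair of files, appears in a commit."""
    file_counts = Counter()
    pair_counts = Counter()
    for files in commits:
        unique = sorted(set(files))
        file_counts.update(unique)
        for a, b in combinations(unique, 2):
            pair_counts[(a, b)] += 1
    return file_counts, pair_counts


def suggest(modified, file_counts, pair_counts, min_ratio=0.5):
    """Files that changed in at least `min_ratio` of the commits touching `modified`."""
    suggestions = []
    for (a, b), together in pair_counts.items():
        if modified not in (a, b):
            continue
        other = b if a == modified else a
        ratio = together / file_counts[modified]
        if ratio >= min_ratio:
            suggestions.append((other, ratio))
    # Strongest association first; ties broken by file name.
    return sorted(suggestions, key=lambda item: (-item[1], item[0]))


# Hypothetical history: each entry is the list of files in one commit.
history = [
    ["parser.py", "test_parser.py"],
    ["parser.py", "test_parser.py", "README"],
    ["parser.py"],
    ["README"],
]

counts, pairs = build_cochange(history)
result = suggest("parser.py", counts, pairs)
for name, ratio in result:
    print(f"You modified parser.py? Consider also modifying {name} ({ratio:.0%}).")
```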
Idea: PR-lint: suggest improvements to pull
requests based on Gousios et al.
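A very rough sketch of what a first version might look like: compute the three factors from the merge study for a given pull request, then warn on one of them. Everything here is an assumption for illustration: the function names, measuring project size as a file count, and the threshold of 10 files are not taken from the paper.

```python
def pr_features(changed_files, recent_commits, project_file_count):
    """Compute the three factors for one pull request.

    `recent_commits` is a list of file lists, one per recent commit.
    """
    changed = set(changed_files)
    recent_touches = sum(1 for files in recent_commits if changed & set(files))
    return {
        "files_changed": len(changed),
        "recent_commits_in_area": recent_touches,
        "project_files": project_file_count,  # stand-in for project size
    }


def lint(features, max_files=10):
    """Return human-readable notes; `max_files` is an arbitrary threshold."""
    notes = []
    if features["files_changed"] > max_files:
        notes.append(
            f"This pull request changes {features['files_changed']} files; "
            "consider splitting it."
        )
    return notes


# Hypothetical pull request and recent history.
features = pr_features(
    changed_files=["a.py", "b.py"],
    recent_commits=[["a.py"], ["c.py"], ["b.py", "d.py"]],
    project_file_count=40,
)
print(features)
print(lint(features, max_files=1))
```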
Study how you write software
Teach your friends something cool
And back it up with evidence
Share what you learn with others
Build tools to make this easier
Send me emails!