a software application that provides information items estimated to be valuable for a software engineering task in a given context." [Robillard, Walker, Zimmermann, 2009]
cover the Ohloh universe with respect to seven dimensions (language, size, contributors, churn, commits, age, activity). Each point in the graph means that x projects can cover y percent of the universe.
Meiyappan Nagappan, Thomas Zimmermann, Christian Bird: Diversity in software engineering research. ESEC/SIGSOFT FSE 2013: 466-476
people projects knowledge
and systematic reasoning to make decisions.
Definition by Thomas H. Davenport, Jeanne G. Harris: Analytics at Work – Smarter Decisions, Better Results
Software analytics is analytics on software data.
Assessing the value of branches with what-if analysis. SIGSOFT FSE 2012: 45
Emad Shihab, Christian Bird, Thomas Zimmermann: The effect of branching strategies on software quality. ESEM 2012: 301-310
Christian Bird, Thomas Zimmermann, Alex Teterev: A theory of branches as goals and virtual teams. CHASE 2011: 53-56
understand problems with branching
• Mine source control for relationship of teams and branches
• Simulate benefits and cost of alternative branch structures
Actions/Tools:
• Alert stakeholders about possible conflicts
• Recommend branch structure (delete, create, fold branches)
• Perform semi-automatic branch refactoring
of branches by file similarity and developer similarity. Dark areas mean many branch pairs in that area.
Same files, but different team means potential problems
Axis labels: Different Files / Same Files; Different Teams / Same Teams
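The two axes of the plot can be illustrated with a set-overlap measure. The sketch below assumes Jaccard similarity, a hypothetical `Branch` record, and illustrative thresholds; the slides do not state which similarity measure or cut-offs were actually used.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Branch:
    # Hypothetical record: files edited and developers active on one branch.
    name: str
    files: frozenset
    developers: frozenset


def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b|; defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def branch_pair_similarities(branches):
    """Yield (name1, name2, file_similarity, developer_similarity) for every branch pair."""
    for b1, b2 in combinations(branches, 2):
        yield (
            b1.name,
            b2.name,
            jaccard(b1.files, b2.files),
            jaccard(b1.developers, b2.developers),
        )


def risky_pairs(branches, file_min=0.5, dev_max=0.2):
    """Pairs that touch largely the same files but are worked on by largely different people.

    The thresholds are illustrative assumptions, not values from the study.
    """
    return [
        (n1, n2)
        for n1, n2, file_sim, dev_sim in branch_pair_similarities(branches)
        if file_sim >= file_min and dev_sim <= dev_max
    ]


branches = [
    Branch("feature-a", frozenset({"ui.c", "net.c"}), frozenset({"alice", "bob"})),
    Branch("feature-b", frozenset({"ui.c", "net.c", "db.c"}), frozenset({"carol"})),
    Branch("feature-c", frozenset({"docs.md"}), frozenset({"alice"})),
]
print(risky_pairs(branches))
```

Here only the pair `feature-a` / `feature-b` is flagged: the two branches share most of their files and none of their developers.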
to assess cost and benefit of individual branches
• Cost: Average Delay Increase per Edit. How much delay does a branch introduce into development?
• Cost: Integrations per Edit on a Branch. What is the ratio of integrations to edits within a branch?
• Benefit: Provided Isolation per Edit. How many conflicts does a branch prevent per edit?
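As a rough illustration, the three measures can be read as simple per-edit ratios. The sketch below assumes a hypothetical `BranchStats` record with totals already mined from source control; the field names and the exact way delay and prevented conflicts are measured are assumptions, not the definitions used in the study.

```python
from dataclasses import dataclass


@dataclass
class BranchStats:
    # Hypothetical per-branch totals; how they are mined is out of scope here.
    name: str
    edits: int                # edits made on the branch
    integrations: int         # integrations (merges) involving the branch
    delay_days: float         # total extra delay attributed to the branch
    conflicts_prevented: int  # conflicts avoided thanks to the branch's isolation


def per_edit_metrics(stats):
    """Return the two cost measures and the benefit measure, each normalised per edit."""
    if stats.edits <= 0:
        raise ValueError("per-edit metrics are undefined for a branch without edits")
    return {
        "delay_per_edit": stats.delay_days / stats.edits,
        "integrations_per_edit": stats.integrations / stats.edits,
        "isolation_per_edit": stats.conflicts_prevented / stats.edits,
    }


example = BranchStats("feature-a", edits=200, integrations=10,
                      delay_days=50.0, conflicts_prevented=4)
metrics = per_edit_metrics(example)
print(metrics)
```

Normalising by edits makes branches of very different sizes comparable, which is presumably why all three measures are stated per edit.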
Each dot is a branch.
Green dots are branches with high benefit and low cost.
Red dots are branches with high cost but low benefit.
If the high-cost, low-benefit branches had been removed, each change would have saved 8.9 days of delay and introduced only 0.04 additional conflicts.
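The colouring of the dots could be reproduced by thresholding cost and benefit. The sketch below assumes the medians as cut-offs and uses made-up scores; the slides do not say how "high" and "low" were actually determined.

```python
from statistics import median


def classify_branches(scores):
    """Label each branch 'green', 'red', or 'other'.

    scores maps a branch name to a (cost, benefit) pair. Median cut-offs
    are an assumption made for this sketch, not the study's method.
    """
    cost_cut = median(cost for cost, _ in scores.values())
    benefit_cut = median(benefit for _, benefit in scores.values())
    labels = {}
    for name, (cost, benefit) in scores.items():
        if benefit > benefit_cut and cost < cost_cut:
            labels[name] = "green"   # high benefit, low cost
        elif cost > cost_cut and benefit < benefit_cut:
            labels[name] = "red"     # high cost, low benefit
        else:
            labels[name] = "other"
    return labels


# Hypothetical (cost, benefit) scores for three branches.
scores = {"main-dev": (1.0, 0.9), "feature-x": (9.0, 0.1), "release": (4.0, 0.5)}
labels = classify_branches(scores)
print(labels)
```

Branches labelled `red` would be the candidates for removal in a what-if analysis of the kind described above.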
Worthwhile
How do users typically use my application? 80.0% 99.2%
What parts of a software product are most used and/or loved by customers? 72.0% 98.5%
How effective are the quality gates we run at checkin? 62.4% 96.6%
How can we improve collaboration and sharing between teams? 54.5% 96.4%
What are the best key performance indicators (KPIs) for monitoring services? 53.2% 93.6%
What is the impact of a code change or requirements change to the project and its tests? 52.1% 94.0%
What is the impact of tools on productivity? 50.5% 97.2%
How do I avoid reinventing the wheel by sharing and/or searching for code? 50.0% 90.9%
What are the common patterns of execution in my application? 48.7% 96.6%
How well does test coverage correspond to actual code usage by our customers? 48.7% 92.0%
recommendations
  – What analysis method to use and when?
• How to understand results from data?
• How to measure success/insight?
• Provide tools to transform manual empirical analysis into reusable analysis