the future of mining software engineering data. FoSER 2010: 161-166
[Software Intelligence] offers software practitioners (not just developers) up-to-date and pertinent information to support their daily decision-making processes. […]

Raymond P. L. Buse, Thomas Zimmermann: Analytics for software development. FoSER 2010: 77-80
The idea of analytics is to leverage potentially large amounts of data into real and actionable insights.

Dongmei Zhang, Yingnong Dang, Jian-Guang Lou, Shi Han, Haidong Zhang, and Tao Xie: Software Analytics as a Learning Case in Practice: Approaches and Experiences. MALETS 2011
Software analytics is to enable software practitioners¹ to perform data exploration and analysis in order to obtain insightful and actionable information for data-driven tasks around software and services.
¹ Software practitioners typically include software developers, testers, usability engineers, and managers, etc.

Raymond P. L. Buse, Thomas Zimmermann: Information needs for software development analytics. ICSE 2012: 987-996
Software development analytics […] empower[s] software development teams to independently gain and share insight from their data without relying on a separate entity.

Tim Menzies, Thomas Zimmermann: Software Analytics: So What? IEEE Software 30(4): 31-37 (2013)
Software analytics is analytics on software data for managers and software engineers with the aim of empowering software development individuals and teams to gain and share insight from their data to make better decisions.

Dongmei Zhang, Shi Han, Yingnong Dang, Jian-Guang Lou, Haidong Zhang, Tao Xie: Software Analytics in Practice. IEEE Software 30(5): 30-37 (2013)
With software analytics, software practitioners explore and analyze data to obtain insightful, actionable information for tasks regarding software development, systems, and users.
Principles for Industrial Data Mining. Tim Menzies, Christian Bird, Thomas Zimmermann, Wolfram Schulte, and Ekrem Kocaguneli. In MALETS 2011: Proceedings of the International Workshop on Machine Learning Technologies in Software Engineering
People aren't always analysis experts.
• Be concise. People have little time.
• Measure many artifacts with many indicators.
• Identify important/unusual items automatically.
• Relate activity to features/areas.
• Focus on past & present over future.
• Recognize that developers and managers have different needs.
Information Needs for Software Development Analytics. Ray Buse, Thomas Zimmermann. ICSE 2012 SEIP Track
|  | Description | Insight | Relevant Techniques |
|---|---|---|---|
| Summarization | Search for important or unusual factors associated with a time range. | Characterize events, understand why they happened. | Topic analysis, NLP |
| Alerts (& Correlations) | Continuous search for unusual changes or relationships in variables. | Notice important events. | Statistics, repeated measures |
| Forecasting | Search for and predict unusual events in the future based on current trends. | Anticipate events. | Extrapolation, statistics |
| Trends | How is an artifact changing? | Understand the direction of the project. | Regression analysis |
| Overlays | What artifacts account for current activity? | Understand the relationships between artifacts. | Cluster analysis, repository mining |
| Goals | How are features/artifacts changing in the context of completion or some other goal? | Assistance for planning. | Root-cause analysis |
| Modeling | Compares the abstract history of similar artifacts. Identify important factors in history. | Learn from previous projects. | Machine learning |
| Benchmarking | Identify vectors of similarity/difference across artifacts. | Assistance for resource allocation and many other decisions. | Statistics |
| Simulation | Simulate changes based on other artifact models. | Assistance for general decisions. | What-if? analysis |
Not every question is “wise”

| ID | Question | Category | Essential | Worthwhile+ | Unwise |
|---|---|---|---|---|---|
|  | Which individual measures correlate with employee productivity (e.g., employee age, tenure, engineering skills, education, promotion velocity, IQ)? | PROD | 7.3% | 44.5% | 25.5% |
| Q71 | Which coding measures correlate with employee productivity (e.g., lines of code, time it takes to build the software, a particular tool set, pair programming, number of hours of coding per day, language)? | PROD | 15.6% | 56.9% | 22.0% |
| Q75 | What metrics can be used to compare employees? | PROD | 19.4% | 67.6% | 21.3% |
| Q70 | How can we measure the productivity of a Microsoft employee? | PROD | 19.1% | 70.9% | 20.9% |
| Q6 | Is the number of bugs a good measure of developer effectiveness? | BUG | 16.4% | 54.3% | 17.2% |
| Q128 | Can I generate 100% test coverage? | TP | 15.3% | 44.1% | 14.4% |
| Q113 | Who should be in charge of creating and maintaining a consistent company-wide software process and tool chain? | PROC | 21.9% | 55.3% | 12.3% |
| Q112 | What are the benefits of a consistent, company-wide software process and tool chain? | PROC | 25.2% | 78.3% | 10.4% |
| Q34 | When are code comments worth the effort to write them? | DP | 7.9% | 41.2% | 9.6% |
| Q24 | How much time and money does it cost to add customer input into your design? | CR | 15.9% | 68.2% | 8.3% |

Analyze This! 145 Questions for Data Scientists in Software Engineering. Andrew Begel, Thomas Zimmermann.
from historic data
• Predict defects for the same project
• Hundreds of prediction models exist
• Models work fairly well with precision and recall of up to 80%.

| Predictor | Precision | Recall |
|---|---|---|
| Pre-Release Bugs | 73.80% | 62.90% |
| Test Coverage | 83.80% | 54.40% |
| Dependencies | 74.40% | 69.90% |
| Code Complexity | 79.30% | 66.00% |
| Code Churn | 78.60% | 79.90% |
| Org. Structure | 86.20% | 84.00% |

From: N. Nagappan, B. Murphy, and V. Basili. The influence of organizational structure on software quality. ICSE 2008.
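As a reminder of what the two columns measure, here is a minimal sketch of how precision and recall are computed for a defect predictor. The file names and the helper `precision_recall` are hypothetical and only illustrate the definitions; they are not taken from the cited study.

```python
def precision_recall(predicted, actual):
    """Compare the set of components a model flags as defect-prone
    with the set that actually had defects."""
    true_positives = len(predicted & actual)
    # Precision: share of flagged components that really were defective.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of defective components that the model flagged.
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical toy data, not from the study.
predicted = {"a.dll", "b.dll", "c.dll", "d.dll"}
actual = {"a.dll", "b.dll", "c.dll", "e.dll", "f.dll"}

p, r = precision_recall(predicted, actual)
print(f"precision={p:.2f} recall={r:.2f}")
```

In this toy case three of the four flagged components are truly defective, and three of the five defective components are found.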
not enough data to train prediction models or the data is of poor quality
New projects have no data yet
Can such projects use models from other projects? (= cross-project prediction)
Thomas Zimmermann, Nachiappan Nagappan, Harald Gall, Emanuel Giger, Brendan Murphy: Cross-project defect prediction: a large scale experiment on data vs. domain vs. process. ESEC/SIGSOFT FSE 2009: 91-100
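The idea of cross-project prediction can be sketched as follows: fit a model on one project's history, then score it on a different project. This is only an illustration under strong assumptions. The single-metric threshold learner (`train_stump`), the churn values, and the two toy projects are all hypothetical stand-ins, not the models or data used in the cited experiment.

```python
def train_stump(rows):
    """rows: list of (metric_value, is_defective).
    Returns the threshold with the best training accuracy when a
    component is predicted defective if metric_value >= threshold."""
    best_threshold, best_correct = None, -1
    for threshold in sorted({value for value, _ in rows}):
        correct = sum((value >= threshold) == label for value, label in rows)
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold


def evaluate(threshold, rows):
    """Return (precision, recall) of the threshold rule on rows."""
    tp = sum(1 for v, y in rows if v >= threshold and y)
    fp = sum(1 for v, y in rows if v >= threshold and not y)
    fn = sum(1 for v, y in rows if v < threshold and y)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical (code churn, had defect) pairs for two projects.
project_a = [(5, False), (12, False), (30, True), (45, True), (8, False), (60, True)]
project_b = [(3, False), (25, True), (40, True), (10, True), (7, False), (35, False)]

threshold = train_stump(project_a)          # learned on project A only
within = evaluate(threshold, project_a)     # same-project score
cross = evaluate(threshold, project_b)      # cross-project score
print("within-project:", within)
print("cross-project: ", cross)
```

With this toy data the rule fits project A perfectly but transfers poorly to project B, which is the kind of gap the cross-project question is about.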
each player after each Team Slayer match
µ ranges between 0 and 10, although 50% fall between 2.5 and 3.5
Initially µ = 3 for each player, stabilizing after a couple dozen matches
TrueSkill in Team Slayer
We looked at the cohort of players who started in the release week, with a complete set of gameplay for those players up to 7 months later (over 3 million players)
70 Person Survey about Player Experience
a population of players. For our Halo study, we selected a cohort of 3.2 million Halo Reach players on Xbox Live who started playing the game in its first week of release.
Step 2: If necessary, sample the population of players and ensure that the sample is representative. In our study we used the complete population of players in this cohort, and our dataset had every match played by that population.
Step 3: Divide the population into groups and plot the development of the dependent variable over time. For example, when plotting the players’ skill in the charts, we took the median skill at every point along the x-axis for each group in order to reduce the bias that would otherwise occur when using the mean.
Step 4: Convert the time series into a symbolic representation to correlate with other factors, for example retention.
Repeat steps 1–4 as needed for any other dependent variables of interest.
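Steps 3 and 4 can be sketched roughly as below. This is a minimal illustration, not the study's pipeline: the group names, skill values, and the up/down/flat encoding in `symbolize` are assumptions chosen for brevity, and the symbolic representation actually used in the study may differ.

```python
from collections import defaultdict
from statistics import median


def median_series(players):
    """players: list of (group, [skill after game 1, game 2, ...]).
    Returns {group: [median skill at each game index]} (Step 3)."""
    by_group = defaultdict(list)
    for group, skills in players:
        by_group[group].append(skills)
    result = {}
    for group, series_list in by_group.items():
        # Only use game indices that every player in the group reached.
        length = min(len(s) for s in series_list)
        result[group] = [median(s[i] for s in series_list) for i in range(length)]
    return result


def symbolize(series, segments=2, tolerance=0.05):
    """Encode a series as one letter per segment (Step 4):
    U = up, D = down, F = flat, judged by the change across the segment."""
    n = len(series)
    bounds = [round(i * (n - 1) / segments) for i in range(segments + 1)]
    symbols = []
    for start, end in zip(bounds, bounds[1:]):
        change = series[end] - series[start]
        symbols.append("U" if change > tolerance else "D" if change < -tolerance else "F")
    return "".join(symbols)


# Hypothetical skill histories (mu after each game) for two groups.
players = [
    ("casual", [3.0, 3.1, 3.2, 3.3, 3.4]),
    ("casual", [3.0, 3.0, 3.1, 3.1, 3.2]),
    ("casual", [3.0, 2.9, 3.0, 3.2, 3.3]),
    ("heavy", [3.0, 3.2, 3.1, 2.9, 2.8]),
]

medians = median_series(players)
patterns = {group: symbolize(series) for group, series in medians.items()}
print(patterns)
```

The resulting pattern strings can then be counted or joined against another per-player factor such as total games played.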
[Chart: median skill (mu) by Games Played So Far, one line per group]
Groups:
0 - 2 games / week [N=59164]
2 - 4 games / week [N=101448]
4 - 8 games / week [N=226161]
8 - 16 games / week [N=363832]
16 - 32 games / week [N=319579]
32 - 64 games / week [N=420258]
64 - 128 games / week [N=415793]
128 - 256 games / week [N=245725]
256+ games / week [N=115010]
Median skill typically increases slowly over time
Players who play 4–8 games per week do best
But players who play more overall eventually surpass those who play 4–8 games per week (not shown in chart)
3 Change in Skill Following a Break
[Chart: change in skill (Δmu) by Days of Break; series: Next Game, 2 Games Later, 3 Games Later, 4 Games Later, 5 Games Later, 10 Games Later]
Median skill slightly increases after each game played without breaks
Breaks of 1–2 days correlate with tiny drops in skill
Longer breaks correlate with larger skill drops, but not linearly
On average, it takes 8–10 games to regain skill lost after 30-day breaks
6 Skill Changes and Retention
pattern is steady improvement of skill
Next most common pattern is a steady decline in skill
Improving players actually end up playing fewer games than players with declining skill

Pattern (chart images not preserved), ordered by frequency:

| Frequency | Total Games |
|---|---|
| 61791 | 217 |
| 45814 | 252 |
| 36320 | 257 |
| 27290 | 219 |
| 22759 | 216 |
| 22452 | 253 |
| 20659 | 260 |
| 20633 | 222 |
| 19858 | 247 |
| 19292 | 216 |
| 17573 | 219 |
| 17454 | 245 |
| 17389 | 260 |
| 15670 | 215 |
| 13692 | 236 |
| 12516 | 239 |
k-means clustering to cluster players in the sample along the social features.
2. Analyze the cluster centroids to understand the differences in social behavior across clusters.
3. Run a survival analysis to observe trends in retention across clusters.
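The three steps above can be sketched as below, using only the standard library. Everything concrete here is an assumption for illustration: the two social features (friend count, share of matches played in a party), the toy player data, the deterministic k-means initialization, and the choice of a Kaplan-Meier estimator for the survival analysis, which the slide does not name.

```python
def kmeans(points, k, iterations=20):
    """Plain k-means. Uses the first k points as initial centroids so the
    sketch is deterministic; real analyses usually use random restarts."""
    centroids = [list(p) for p in points[:k]]
    labels = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        labels = []
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            label = distances.index(min(distances))
            labels.append(label)
            clusters[label].append(p)
        new_centroids = [
            [sum(dim) / len(cluster) for dim in zip(*cluster)] if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids
    return labels, centroids


def kaplan_meier(durations, churned):
    """durations: days each player was observed.
    churned: True if the player stopped playing, False if still active
    when observation ended (censored). Returns [(day, survival), ...]."""
    event_times = sorted({d for d, e in zip(durations, churned) if e})
    survival, curve = 1.0, []
    for t in event_times:
        at_risk = sum(1 for d in durations if d >= t)
        events = sum(1 for d, e in zip(durations, churned) if d == t and e)
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical players: (friend count, share of matches played in a party).
features = [(1, 0.1), (20, 0.8), (2, 0.2), (25, 0.9), (0, 0.0), (22, 0.7)]
days_played = [10, 90, 15, 120, 10, 60]
churned = [True, False, True, False, True, True]

labels, centroids = kmeans(features, k=2)   # step 1: cluster players
print("centroids:", centroids)              # step 2: inspect centroids

curves = {}                                 # step 3: retention per cluster
for cluster in sorted(set(labels)):
    members = [i for i, label in enumerate(labels) if label == cluster]
    curves[cluster] = kaplan_meier(
        [days_played[i] for i in members], [churned[i] for i in members]
    )
    print(cluster, curves[cluster])
```

In practice the features would usually be normalized before clustering, since otherwise the friend count dominates the distance.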