Slide 1

Software Analytics = Sharing Information
Thomas Zimmermann, Microsoft Research, USA

Slide 2


Slide 3

40 percent of major decisions are based not on facts, but on the manager's gut.
Accenture survey among 254 US managers in industry.
http://newsroom.accenture.com/article_display.cfm?article_id=4777

Slide 4

Analytics is the use of analysis, data, and systematic reasoning to make decisions.
Definition by Thomas H. Davenport and Jeanne G. Harris, Analytics at Work: Smarter Decisions, Better Results.

Slide 5

Web analytics (slide by Ray Buse)

Slide 6

Game analytics: Halo heat maps; free-to-play

Slide 7


Slide 8


Slide 9

FREE for ESEM attendees: http://qmags.com/ISW/free-esem

Slide 10

History of software analytics
Tim Menzies, Thomas Zimmermann: Software Analytics: So What? IEEE Software 30(4): 31-37 (2013)

Slide 11

The many names:
- software intelligence
- software analytics
- software development analytics
- analytics for software development
- empirical software engineering
- mining software repositories

Slide 12

The many definitions

Slide 13

Ahmed E. Hassan, Tao Xie: Software intelligence: the future of mining software engineering data. FoSER 2010: 161-166
"[Software Intelligence] offers software practitioners (not just developers) up-to-date and pertinent information to support their daily decision-making processes. […]"

Raymond P. L. Buse, Thomas Zimmermann: Analytics for software development. FoSER 2010: 77-80
"The idea of analytics is to leverage potentially large amounts of data into real and actionable insights."

Dongmei Zhang, Yingnong Dang, Jian-Guang Lou, Shi Han, Haidong Zhang, Tao Xie: Software Analytics as a Learning Case in Practice: Approaches and Experiences. MALETS 2011
"Software analytics is to enable software practitioners to perform data exploration and analysis in order to obtain insightful and actionable information for data-driven tasks around software and services." (Software practitioners typically include software developers, testers, usability engineers, and managers.)

Raymond P. L. Buse, Thomas Zimmermann: Information needs for software development analytics. ICSE 2012: 987-996
"Software development analytics […] empower[s] software development teams to independently gain and share insight from their data without relying on a separate entity."

Tim Menzies, Thomas Zimmermann: Software Analytics: So What? IEEE Software 30(4): 31-37 (2013)
"Software analytics is analytics on software data for managers and software engineers with the aim of empowering software development individuals and teams to gain and share insight from their data to make better decisions."

Dongmei Zhang, Shi Han, Yingnong Dang, Jian-Guang Lou, Haidong Zhang, Tao Xie: Software Analytics in Practice. IEEE Software 30(5): 30-37 (2013)
"With software analytics, software practitioners explore and analyze data to obtain insightful, actionable information for tasks regarding software development, systems, and users."

Slide 14

The many definitions

Slide 15

Trinity of software analytics
Dongmei Zhang, Shi Han, Yingnong Dang, Jian-Guang Lou, Haidong Zhang, Tao Xie: Software Analytics in Practice. IEEE Software 30(5): 30-37, September/October 2013.
MSR Asia Software Analytics group: http://research.microsoft.com/en-us/groups/sa/

Slide 16

Inductive engineering
The Inductive Software Engineering Manifesto: Principles for Industrial Data Mining. Tim Menzies, Christian Bird, Thomas Zimmermann, Wolfram Schulte, Ekrem Kocaguneli. MALETS 2011: Proceedings of the International Workshop on Machine Learning Technologies in Software Engineering.

Slide 17

Guidelines for analytics:
- Be easy to use. People aren't always analysis experts.
- Be concise. People have little time.
- Measure many artifacts with many indicators.
- Identify important/unusual items automatically.
- Relate activity to features/areas.
- Focus on past and present over future.
- Recognize that developers and managers have different needs.
Information Needs for Software Development Analytics. Ray Buse, Thomas Zimmermann. ICSE 2012 SEIP Track.

Slide 18

Information Needs for Software Development Analytics. Ray Buse, Thomas Zimmermann. ICSE 2012 SEIP Track.

| Category | Description | Insight | Relevant Techniques |
| Summarization | Search for important or unusual factors associated with a time range. | Characterize events, understand why they happened. | Topic analysis, NLP |
| Alerts (& Correlations) | Continuous search for unusual changes or relationships in variables. | Notice important events. | Statistics, repeated measures |
| Forecasting | Search for and predict unusual events in the future based on current trends. | Anticipate events. | Extrapolation, statistics |
| Trends | How is an artifact changing? | Understand the direction of the project. | Regression analysis |
| Overlays | What artifacts account for current activity? | Understand the relationships between artifacts. | Cluster analysis, repository mining |
| Goals | How are features/artifacts changing in the context of completion or some other goal? | Assistance for planning. | Root-cause analysis |
| Modeling | Compares the abstract history of similar artifacts; identifies important factors in history. | Learn from previous projects. | Machine learning |
| Benchmarking | Identify vectors of similarity/difference across artifacts. | Assistance for resource allocation and many other decisions. | Statistics |
| Simulation | Simulate changes based on other artifact models. | Assistance for general decisions. | What-if analysis |

Slide 19

Smart analytics

Slide 20


Slide 21


Slide 22

Jack Bauer

Slide 23

Chloe O'Brian

Slide 24


Slide 25

All he needed was a paper clip

Slide 26

smart analytics is actionable

Slide 27

smart analytics is real time

Slide 28

Scene from the movie WarGames (1983).

Slide 29

smart analytics is diversity

Slide 30

Researcher, Developer, Tester, Dev Lead, Test Lead, Manager
[Diagram: stakeholders, tools, questions]

Slide 31

Measurements, Surveys, Benchmarking, Qualitative Analysis, Clustering, Prediction, What-if Analysis, Segmenting, Multivariate Analysis, Interviews
[Diagram: stakeholders, tools, questions]

Slide 32

Build tools for frequent questions. Use data scientists for infrequent questions.
[Chart axes: Frequency, Questions]
[Diagram: stakeholders, tools, questions]

Slide 33

Analyze This! 145 Questions for Data Scientists in Software Engineering. Andrew Begel, Thomas Zimmermann.

| # | Question | Category | Essential | Worthwhile+ | Unwise |
| Q27 | How do users typically use my application? | DP | 80.0% | 99.2% | 0.8% |
| Q18 | What parts of a software product are most used and/or loved by customers? | CR | 72.0% | 98.5% | 0.0% |
| Q50 | How effective are the quality gates we run at checkin? | DP | 62.4% | 96.6% | 0.8% |
| Q115 | How can we improve collaboration and sharing between teams? | TC | 54.5% | 96.4% | 0.0% |
| Q86 | What are best key performance indicators (KPIs) for monitoring services? | SVC | 53.2% | 93.6% | 0.9% |
| Q40 | What is the impact of a code change or requirements change to the project and tests? | DP | 52.1% | 94.0% | 0.0% |
| Q74 | What is the impact of tools on productivity? | PROD | 50.5% | 97.2% | 0.9% |
| Q84 | How do I avoid reinventing the wheel by sharing and/or searching for code? | RSC | 50.0% | 90.9% | 0.9% |
| Q28 | What are the common patterns of execution in my application? | DP | 48.7% | 96.6% | 0.8% |
| Q66 | How well does test coverage correspond to actual code usage by our customers? | EQ | 48.7% | 92.0% | 0.0% |
| Q42 | What tools can help us measure and estimate the risk associated with code changes? | DP | 47.8% | 92.2% | 0.0% |
| Q59 | What are effective metrics for ship quality? | EQ | 47.8% | 96.5% | 1.7% |
| Q100 | How much do design changes cost us and how can we reduce their risk? | SL | 46.6% | 94.8% | 0.8% |
| Q19 | What are the best ways to change a product's features without losing customers? | CR | 46.2% | 92.3% | 1.5% |
| Q131 | Which test strategies find the most impactful bugs (e.g., assertions, in-circuit testing, A/B testing)? | TP | 44.5% | 91.8% | 0.9% |
| Q83 | When should I write code from scratch vs. reuse legacy code? | RSC | 44.5% | 84.5% | 3.6% |
| Q1 | What is the impact and/or cost of finding bugs at a certain stage in the development cycle? | BUG | 43.1% | 87.9% | 2.5% |
| Q92 | What is the tradeoff between releasing more features or releasing more often? | SVC | 42.5% | 79.6% | 0.0% |
| Q2 | What kinds of mistakes do developers make in their software? Which ones are the most common? | BUG | 41.7% | 98.3% | 0.0% |
| Q25 | How important is a particular requirement? | CR | 41.7% | 87.4% | 2.3% |
| Q60 | How should we use metrics to help us decide when a feature is good enough to release (or poor enough to cancel)? | EQ | 41.1% | 90.2% | 3.5% |
| Q17 | What is the best way to collect customer feedback? | CR | 39.8% | 93.0% | 1.5% |
| Q3 | In what places in their software code do developers make the most mistakes? | BUG | 35.0% | 94.0% | 0.0% |

"What kinds of problems happen because there is too much software process?"

Slide 34

- Customer
- Practices and processes
- Product quality
Analyze This! 145 Questions for Data Scientists in Software Engineering. Andrew Begel, Thomas Zimmermann.

Slide 35

Not every question is "wise".

| # | Question | Category | Essential | Worthwhile+ | Unwise |
| Q72 | Which individual measures correlate with employee productivity (e.g., employee age, tenure, engineering skills, education, promotion velocity, IQ)? | PROD | 7.3% | 44.5% | 25.5% |
| Q71 | Which coding measures correlate with employee productivity (e.g., lines of code, time it takes to build the software, a particular tool set, pair programming, number of hours of coding per day, language)? | PROD | 15.6% | 56.9% | 22.0% |
| Q75 | What metrics can be used to compare employees? | PROD | 19.4% | 67.6% | 21.3% |
| Q70 | How can we measure the productivity of a Microsoft employee? | PROD | 19.1% | 70.9% | 20.9% |
| Q6 | Is the number of bugs a good measure of developer effectiveness? | BUG | 16.4% | 54.3% | 17.2% |
| Q128 | Can I generate 100% test coverage? | TP | 15.3% | 44.1% | 14.4% |
| Q113 | Who should be in charge of creating and maintaining a consistent company-wide software process and tool chain? | PROC | 21.9% | 55.3% | 12.3% |
| Q112 | What are the benefits of a consistent, company-wide software process and tool chain? | PROC | 25.2% | 78.3% | 10.4% |
| Q34 | When are code comments worth the effort to write them? | DP | 7.9% | 41.2% | 9.6% |
| Q24 | How much time and money does it cost to add customer input into your design? | CR | 15.9% | 68.2% | 8.3% |

Analyze This! 145 Questions for Data Scientists in Software Engineering. Andrew Begel, Thomas Zimmermann.

Slide 36

smart analytics is people

Slide 37

The Decider, The Brain, The Innovator

Slide 38

The Researcher (PROMISE 2011, Banff, Canada)

Slide 39

smart analytics is sharing

Slide 40

Sharing Insights
Sharing Methods
Sharing Models
Sharing Data

Slide 41

Sharing Data

Slide 42


Slide 43


Slide 44

Sharing Models

Slide 45

Defect prediction
- Learn a prediction model from historic data.
- Predict defects for the same project.
- Hundreds of prediction models exist.
- Models work fairly well, with precision and recall of up to 80%. (A sketch of this setup follows below.)

| Predictor | Precision | Recall |
| Pre-Release Bugs | 73.80% | 62.90% |
| Test Coverage | 83.80% | 54.40% |
| Dependencies | 74.40% | 69.90% |
| Code Complexity | 79.30% | 66.00% |
| Code Churn | 78.60% | 79.90% |
| Org. Structure | 86.20% | 84.00% |

From: N. Nagappan, B. Murphy, and V. Basili. The influence of organizational structure on software quality. ICSE 2008.
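To make the within-project setup concrete, here is a minimal sketch (not the paper's implementation), assuming a hypothetical modules.csv with per-module metric columns and a 0/1 "defective" label; the feature names are placeholders:

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split

# Hypothetical per-module feature table with a 0/1 "defective" label.
data = pd.read_csv("modules.csv")
features = ["churn", "complexity", "dependencies", "pre_release_bugs"]
X, y = data[features], data["defective"]

# Train on one part of the project's history, evaluate on the rest.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)
predicted = model.predict(X_test)
print("precision:", precision_score(y_test, predicted))
print("recall:", recall_score(y_test, predicted))
```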

Slide 46

Why cross-project prediction?
- Some projects do not have enough data to train prediction models, or the data is of poor quality.
- New projects have no data yet.
- Can such projects use models from other projects? (= cross-project prediction)
Thomas Zimmermann, Nachiappan Nagappan, Harald Gall, Emanuel Giger, Brendan Murphy: Cross-project defect prediction: a large scale experiment on data vs. domain vs. process. ESEC/SIGSOFT FSE 2009: 91-100

Slide 47

A first experiment: Firefox and Internet Explorer
Firefox can predict defects in IE (precision = 0.76, recall = 0.88).
But IE cannot predict Firefox (precision = 0.54, recall = 0.04). Why?
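A minimal sketch of one such cross-project run, assuming hypothetical per-module CSV exports for the two browsers; the feature names are placeholders, and the asymmetry of the two directions is the whole point:

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score

features = ["churn", "complexity", "dependencies"]  # hypothetical metrics

def cross_project(train_csv, test_csv):
    """Train a defect model on one project, evaluate it on another."""
    train, test = pd.read_csv(train_csv), pd.read_csv(test_csv)
    model = LogisticRegression(max_iter=1000)
    model.fit(train[features], train["defective"])
    pred = model.predict(test[features])
    return (precision_score(test["defective"], pred),
            recall_score(test["defective"], pred))

# Direction matters: the FSE 2009 study found these runs asymmetric
# (its success criterion: precision, recall, and accuracy all > 0.75).
print(cross_project("firefox.csv", "ie.csv"))
print(cross_project("ie.csv", "firefox.csv"))
```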

Slide 48

622 experiments later: only 3.4% successful

Slide 49


Slide 50

Sharing models
Sharing models does not always work. In what situations does sharing models work?

Slide 51

Sharing Insights
Sharing Methods

Slide 52

Skill in Halo Reach
Jeff Huang, Thomas Zimmermann, Nachiappan Nagappan, Charles Harrison, Bruce C. Phillips: Mastering the art of war: how patterns of gameplay influence skill in Halo. CHI 2013: 695-704

Slide 53

How do patterns of play affect players' skill in Halo Reach?
1. General Statistics
2. Play Intensity
3. Skill after Breaks
4. Skill before Breaks
5. Skill and Other Titles
6. Skill Changes and Retention
7. Mastery and Demographics
8. Predicting Skill

Slide 54

The Cohort of Players
We looked at the cohort of players who started in the release week, with the complete set of gameplay for those players up to 7 months later (over 3 million players), plus a 70-person survey about player experience.

TrueSkill in Team Slayer
- The mean skill value µ for each player after each Team Slayer match.
- µ ranges between 0 and 10, although 50% fall between 2.5 and 3.5.
- Initially µ = 3 for each player, stabilizing after a couple dozen matches. (A sketch of the update follows below.)
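For readers unfamiliar with TrueSkill, here is a minimal sketch of how µ moves after a match, using the open-source trueskill package; the environment parameters are illustrative guesses at a 0-10 scale with µ starting at 3, not Halo Reach's actual configuration:

```python
import trueskill

# Illustrative parameters only; the game's real TrueSkill configuration
# is not given in this deck.
env = trueskill.TrueSkill(mu=3.0, sigma=1.0, beta=0.5, tau=0.01,
                          draw_probability=0.02)
player, opponent = env.create_rating(), env.create_rating()

# After a win, the winner's mu rises and sigma (uncertainty) shrinks;
# after a couple dozen matches, sigma is small and mu stabilizes.
player, opponent = trueskill.rate_1vs1(player, opponent, env=env)
print(round(player.mu, 3), round(player.sigma, 3))
```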

Slide 55

Analysis of Skill Data
Step 1: Select a population of players. For our Halo study, we selected a cohort of 3.2 million Halo Reach players on Xbox Live who started playing the game in its first week of release.
Step 2: If necessary, sample the population of players and ensure that the sample is representative. In our study we used the complete population of players in this cohort, and our dataset had every match played by that population.
Step 3: Divide the population into groups and plot the development of the dependent variable over time. For example, when plotting the players' skill in the charts, we took the median skill at every point along the x-axis for each group in order to reduce the bias that would otherwise occur when using the mean. (A sketch of this step follows below.)
Step 4: Convert the time series into a symbolic representation to correlate with other factors, for example retention.
Repeat steps 1-4 as needed for any other dependent variables of interest.
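A minimal sketch of step 3, assuming a hypothetical matches.csv with one row per match and placeholder columns player, game_index, mu, and games_per_week:

```python
import pandas as pd
import matplotlib.pyplot as plt

matches = pd.read_csv("matches.csv")

# Bucket players into intensity groups, mirroring the deck's
# games-per-week bands.
bins = [0, 2, 4, 8, 16, 32, 64, 128, 256, float("inf")]
matches["group"] = pd.cut(matches["games_per_week"], bins)

# Median (not mean) mu at every game index, one line per group.
medians = (matches.groupby(["group", "game_index"], observed=True)["mu"]
                  .median()
                  .unstack(level=0))
medians.plot(xlabel="Games Played So Far", ylabel="median mu")
plt.show()
```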

Slide 56

2. Play Intensity
Telegraph operators gradually increase typing speed over time.

Slide 57

2. Play Intensity
[Chart: median mu (2.1-3.1) vs. Games Played So Far (0-100)]
Median skill typically increases slowly over time.

Slide 58

2. Play Intensity (Games per Week)
[Chart: median mu vs. Games Played So Far, one line per group: 0-2 games/week (N=59164), 2-4 (N=101448), 4-8 (N=226161), 8-16 (N=363832), 16-32 (N=319579), 32-64 (N=420258), 64-128 (N=415793), 128-256 (N=245725), 256+ (N=115010)]
Median skill typically increases slowly over time.
Players who play 4-8 games per week do best.
But players who play more overall eventually surpass those who play 4-8 games per week (not shown in chart).

Slide 59

3. Change in Skill Following a Break
"In the most drastic scenario, you can lose up to 80 percent of your fitness level in as few as two weeks [of taking a break]…"

Slide 60

3. Change in Skill Following a Break
[Chart: Δmu vs. Days of Break, for the next game through 10 games later]
- Median skill slightly increases after each game played without breaks.
- Breaks of 1-2 days correlate with tiny drops in skill.
- Longer breaks correlate with larger skill drops, but not linearly.
- On average, it takes 8-10 games to regain skill lost after a 30-day break.
(A sketch of this computation follows below.)
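One plausible way to compute the chart's quantities, again with hypothetical column names; this is a sketch of the idea, not the paper's pipeline:

```python
import pandas as pd

# Hypothetical per-match log with a parseable timestamp per match.
m = pd.read_csv("matches.csv", parse_dates=["timestamp"])
m = m.sort_values(["player", "timestamp"])

# Skill change from the previous match, and the break that preceded it.
m["delta_mu"] = m.groupby("player")["mu"].diff()
m["break_days"] = m.groupby("player")["timestamp"].diff().dt.days

# Median skill change per break length (next game after the break).
print(m.groupby("break_days")["delta_mu"].median().head(50))
```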

Slide 61

6. Skill Changes and Retention
SAX (Symbolic Aggregate approXimation) discretizes time series into a symbolic representation. (A sketch follows below.)
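A minimal SAX sketch (not the paper's implementation): z-normalize the skill series, compress it with piecewise aggregate approximation (PAA), then map segment means to letters via Gaussian breakpoints; values shown are for a 4-letter alphabet:

```python
import numpy as np

def sax(series, n_segments=4, alphabet="abcd"):
    x = np.asarray(series, dtype=float)
    x = (x - x.mean()) / (x.std() + 1e-12)       # z-normalize
    segments = np.array_split(x, n_segments)     # PAA compression
    means = np.array([seg.mean() for seg in segments])
    # Breakpoints splitting the standard normal into 4 equiprobable bins.
    breakpoints = np.array([-0.67, 0.0, 0.67])
    return "".join(alphabet[np.searchsorted(breakpoints, m)] for m in means)

# A steadily improving skill curve maps to an ascending word.
print(sax([2.5, 2.6, 2.8, 2.9, 3.0, 3.1, 3.2, 3.4]))  # -> "abcd"
```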

Slide 62

6. Skill Changes and Retention
Time series of skill measured for the first 100 games.
- Most common pattern is steady improvement of skill.
- Next most common pattern is a steady decline in skill.

[Pattern glyphs shown as images on the slide]
| Frequency | Total Games |
| 61791 | 217 |
| 45814 | 252 |
| 36320 | 257 |
| 27290 | 219 |
| 22759 | 216 |
| 22452 | 253 |
| 20659 | 260 |
| 20633 | 222 |
| 19858 | 247 |
| 19292 | 216 |
| 17573 | 219 |
| 17454 | 245 |
| 17389 | 260 |
| 15670 | 215 |
| 13692 | 236 |
| 12516 | 239 |

Slide 63

6. Skill Changes and Retention
Time series of skill measured for the first 100 games (same pattern table as the previous slide).
- Most common pattern is steady improvement of skill.
- Next most common pattern is a steady decline in skill.
- Improving players actually end up playing fewer games than players with declining skill.

Slide 64

Social behavior in a shooter game
Sauvik Das, Thomas Zimmermann, Nachiappan Nagappan, Bruce Phillips, Chuck Harrison: Revival Actions in a Shooter Game. DESVIG 2013 Workshop

Slide 65

Impact of social behavior on retention
AAA title; random sample of 26,000 players with ~1,000,000 sessions of gameplay data.

Slide 66

Players who revive other players

| Dimension | Characteristic | Change |
| Engagement | Session count | +297.44% |
| Skill | Kills | +100.21% |
| Skill | Was revived | -54.55% |
| Skill | Deaths | -12.44% |
| Success | Likelihood to win match | +18.88% |
| Social | Gave weapon | +486.14% |

Slide 67

A simple social model: Player-instigated, Team-instigated, With-Enemy

Slide 68

Analysis pattern: Cluster + Contrast
1. Use k-means clustering to cluster players in the sample along the social features.
2. Analyze the cluster centroids to understand the differences in social behavior across clusters.
3. Run a survival analysis to observe trends in retention across clusters.
(A sketch of this pattern follows below.)
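A minimal sketch of the pattern under stated assumptions: a hypothetical players.csv with per-player social feature columns and retention fields, scikit-learn for the clustering, and the lifelines package for the survival step (one of several possible choices):

```python
import pandas as pd
from lifelines import KaplanMeierFitter
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

players = pd.read_csv("players.csv")
social = ["revives_given", "weapons_given", "assists", "messages_sent"]

# 1. Cluster players along the (standardized) social features.
X = StandardScaler().fit_transform(players[social])
players["cluster"] = KMeans(n_clusters=4, n_init=10,
                            random_state=0).fit_predict(X)

# 2. Contrast: inspect per-cluster means of the social behaviors.
print(players.groupby("cluster")[social].mean())

# 3. Survival analysis of retention, one curve per cluster.
km = KaplanMeierFitter()
for cluster, grp in players.groupby("cluster"):
    km.fit(grp["days_played"], event_observed=grp["churned"],
           label=f"cluster {cluster}")
    print(cluster, km.median_survival_time_)
```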

Slide 69

Call to Action

Slide 70

Book "Analyzing Software Data": http://menzies.us/asd
Proposals due October 15.

Slide 71

Data Analysis Patterns: http://dapse.unbox.org/

Slide 72


Slide 73

Smart analytics is: actionable, real time, diversity, people, sharing.
Usage analytics. Analytics for Xbox games.
Sharing Insights. Sharing Methods. Sharing Models. Sharing Data.

Slide 74

Thank you!