Xen Project
• Working for Citrix; member of the group that develops XenServer
• Chairman of the Xen Project Advisory Board

The Xen Project
• Develops the Xen Project Hypervisor
• A Linux Foundation Collaborative Project
• Used by the biggest cloud providers (AWS, …)
• Lots of commercial Xen variants
Bitergia
• Co-Founder of Bitergia, focused on open analytics: open source tools to analyse open source projects
• Currently working on: the Xen Project Dashboard, OpenStack and OPNFV quarterly reports, gender diversity analysis
• Developer of the Metrics Grimoire and VizGrimoire analysis toolchains
Growth of Development Activity
• Traffic and development activity grew by more than 100% in 5 years
• 2015, getting a sense of scale:
  – 240 developers from 95 orgs, 12K commits
  – 50 core developers from 18 orgs, 4K commits in core hypervisor repos
• Held events in Asia several times
• Architecture/Design Reviews: avoid late disagreements on architecture/design
• Governance: retried governance changes, which mostly failed
• Seek help: use statistical analysis
• Series where each patch is a new thread (e.g. not using git-send)
• Cross-posted messages (e.g. from LKML)
• Versions: [PATCH vX Y/Z] (a regex sketch follows this list)
  – Not always regular: need to use heuristics / regular expressions
  – Missing versions (e.g. version starts at v5)
• Patch number (Y of Z): [PATCH vX Y/Z]
  – Not always regular: need to use heuristics / regular expressions
• Matching (mail) threads and commits
  – Issues with commit timestamps
  – Some patches share the same subject line
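A minimal sketch of subject-line parsing, assuming subjects roughly follow the [PATCH vX Y/Z] convention described above. The regex, defaults, and function name are illustrative assumptions, not the toolchain's actual implementation; real traffic (missing versions, RFC markers, re-sends) needs more heuristics than this.

```python
import re

# Illustrative pattern for "[PATCH vX Y/Z]"-style subjects (an assumption,
# not the actual Metrics Grimoire parser). Version and series position are
# both optional, since many subjects omit them.
SUBJECT_RE = re.compile(
    r"\[\s*"
    r"(?:RFC\s+)?PATCH"                      # optional RFC marker
    r"(?:\s+v(?P<version>\d+))?"             # optional version, e.g. "v5"
    r"(?:\s+(?P<num>\d+)/(?P<total>\d+))?"   # optional "Y/Z" series position
    r"\s*\]",
    re.IGNORECASE,
)

def parse_subject(subject):
    """Return (version, num, total), defaulting to 1, or None if not a patch."""
    m = SUBJECT_RE.search(subject)
    if not m:
        return None
    version = int(m.group("version") or 1)
    num = int(m.group("num") or 1)
    total = int(m.group("total") or 1)
    return version, num, total

# Example: parse_subject("[PATCH v3 2/7] x86/mm: fix page handling") -> (3, 2, 7)
```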
• Reviewing a series of 6/7 patches takes X days; a series 3 times larger takes 10 × X days
• Complexity increases code review time: the bigger the series, the worse
• Solution? Break patches into smaller series?
  – Not always possible for complex features (the code can't be reviewed in isolation)
• Code contributions are becoming more complex and touch more areas of code
Only reviews posted to xen-devel@ are included in the study.
• Issue: cross-posting of Linux/QEMU/… patches to xen-devel@ (a detection sketch follows)
• Matching against different non-Xen repositories has not yet been implemented
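One way to spot cross-posted patches is to look for other well-known lists alongside xen-devel in the To/Cc headers. This is a hedged sketch: the list addresses below are illustrative assumptions, not the study's actual filter.

```python
import email
import email.utils

# Illustrative addresses of other lists that Xen patches are commonly
# cross-posted to (assumed for this sketch).
OTHER_LISTS = {
    "linux-kernel@vger.kernel.org",  # LKML
    "qemu-devel@nongnu.org",
}

def is_cross_posted(raw_message: bytes) -> bool:
    """True if the mail's To/Cc headers include a non-Xen list."""
    msg = email.message_from_bytes(raw_message)
    pairs = email.utils.getaddresses(msg.get_all("To", []) + msg.get_all("Cc", []))
    recipients = {addr.lower() for _, addr in pairs}
    return bool(recipients & OTHER_LISTS)
```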
• UC 1: Identify Reviewers (people and orgs)
• UC 2: Identify Imbalances between Reviewers and Contributors (freeloading; see the sketch after this list)
• UC 3: Identify Post-ACK comments (an indirect measure of "unnecessary conflict")
Performance Use Cases: Spot Issues Early
• UC 4: Identify delays due to a large number of revisions (quality, conflict, communication)
• UC 5: Identify delays due to large Patch Series (complexity, coordination)
Backlog Use Cases: Optimize Process and Focus
• UC 6: Merged and not merged (did something get missed?)
• UC 7: Identify nearly completed Patch Series (focus by % ACKed)
• UC 8: Identify Hot/Warm/Tepid/Cold/Freezing/Dead Patch Series (focus by activity)
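As one concrete example, UC 2 (freeloading) can be approximated by a per-organisation ratio of reviews given to patches contributed. This is a minimal sketch: the data model (per-org `reviews`/`patches` counts) and the threshold interpretation are assumptions, not the dashboard's actual metric.

```python
def review_contribution_ratio(orgs):
    """orgs: dict mapping org name -> {'reviews': int, 'patches': int}.

    Returns reviews-per-patch per org; values well below 1.0 suggest an org
    consumes more review bandwidth than it provides (possible freeloading).
    """
    ratios = {}
    for org, counts in orgs.items():
        patches = counts["patches"]
        # Orgs that only review get an infinite ratio rather than an error.
        ratios[org] = counts["reviews"] / patches if patches else float("inf")
    return ratios

# Example with made-up numbers:
print(review_contribution_ratio({
    "OrgA": {"reviews": 120, "patches": 100},  # ~1.2: roughly balanced
    "OrgB": {"reviews": 5, "patches": 80},     # ~0.06: possible freeloading
}))
```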
UC 7: Identify nearly completed Patch Series (% ACKed)
Intention:
• Allow reviewers to focus on nearly complete series, getting patches completed more quickly
• Can be used effectively with advanced search queries on ACK info
A sketch of a "% ACKed" score follows.
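A minimal sketch of the "% ACKed" completeness score, assuming each patch in a series carries a boolean acked flag derived from Acked-by/Reviewed-by tags in the thread (an assumed data model, not the dashboard's real schema).

```python
def percent_acked(series):
    """series: list of dicts like {'subject': str, 'acked': bool}."""
    if not series:
        return 0.0
    acked = sum(1 for patch in series if patch["acked"])
    return 100.0 * acked / len(series)

# A series at 2/3 ACKed scores ~66.7% and is a candidate for reviewer focus.
print(percent_acked([
    {"subject": "[PATCH v2 1/3] ...", "acked": True},
    {"subject": "[PATCH v2 2/3] ...", "acked": True},
    {"subject": "[PATCH v2 3/3] ...", "acked": False},
]))
```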
UC 8: Identify freezing/dead reviews (stale code reviews), filtered by time and release
Intention:
• Hot and Warm reviews are those with lots of activity in people's inboxes
• Allow us and contributors to easily spot reviews that got forgotten
• Release filters help manage what will be in a release
A sketch of the activity buckets follows.
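A minimal sketch of the Hot/Warm/Tepid/Cold/Freezing/Dead buckets, classifying a series by days since its last activity. The day thresholds are illustrative assumptions; the actual dashboard's buckets may be defined differently.

```python
from datetime import datetime, timezone

# Assumed upper bounds in days for each bucket; anything older is "Dead".
BUCKETS = [
    (7, "Hot"),
    (14, "Warm"),
    (30, "Tepid"),
    (90, "Cold"),
    (180, "Freezing"),
]

def activity_bucket(last_activity, now=None):
    """Classify a series by the age of its most recent message (tz-aware datetimes)."""
    now = now or datetime.now(timezone.utc)
    age_days = (now - last_activity).days
    for limit, label in BUCKETS:
        if age_days <= limit:
            return label
    return "Dead"

# Example: a series last touched 45 days ago falls in the "Cold" bucket.
```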
Improve Accuracy
• Better thread reconstruction (e.g. use patchwork for threads, etc.)
• Handle cross-posting of Linux/QEMU/… patches to xen-devel@
Improve Usefulness
• Provide data that fits into the workflow (e.g. git repos, commit IDs, …)
• Focus the data: are the dashboards too noisy?
Iterate, Iterate, Iterate
• Can easily add new views and panels required by groups of users
• Make the review process more tools-friendly? Minor changes are OK; significant ones won't fly
• Took significant effort over a 6-month period
• Data analysis is no silver bullet: not everyone believes the data, and perception is not always a good friend
• Learned lots about the review process; in fact, we may make changes
• A good starting point for other communities using an e-mail based review workflow