Authorship dashboard for the Linux Kernel
Jesus M. Gonzalez-Barahona
jgb @ bitergia.com @jgbarah
Bitergia
http://speakerdeck.com/jgbarah
LinuxCon North America
Toronto (Canada), August 22-24, 2016
Jesus M. Gonzalez-Barahona (Bitergia) Authorship dashboard for Linux Aug 2016 1 / 32
Slide 2
Structure of the presentation
1 A bit of context
2 The Linux development history dashboard
3 Exploring the Linux history (some examples)
4 How to build your own dashboard
5 Final remarks
Slide 3
A bit of context
Slide 4
Me and my circumstances
Jesus M. Gonzalez-Barahona:
Researcher for many years at
Uni. Rey Juan Carlos
Understanding free, open
source software development
Now collaborating with, and
co-founder of Bitergia
Slide 5
The company
The software development analytics company
dashboards
reports
consultancy
...
http://bitergia.com
Slide 6
The Linux development
history dashboard
Slide 7
Having data is not the same as understanding data
https://en.wikipedia.org/wiki/Charles_Joseph_Minard
Slide 8
What’s in the dashboard
All commits in the current Linux (Torvalds’) git repo
(git log)
Git and Demographics
All lines in the current master HEAD
with details about when they were introduced
(git blame)
Git Blame, Git Blame (Charts), and Git Blame (Files)
http://linux.biterg.io
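The per-line data described on this slide can be approximated with `git blame --line-porcelain`, which prints author metadata for every line of a file. The following is a minimal sketch of a parser for that output, not the dashboard's actual code; the function name and the record keys are illustrative assumptions.

```python
import re

# A porcelain entry starts with "<40-hex sha> <orig line> <final line> [<count>]".
HEADER = re.compile(r"^[0-9a-f]{40} \d+ \d+")

def parse_line_porcelain(text):
    """Turn `git blame --line-porcelain` output into one dict per source line."""
    records = []
    current = {}
    for raw in text.splitlines():
        if raw.startswith("\t"):
            # A tab-prefixed line carries the source text and closes the entry.
            current["content"] = raw[1:]
            records.append(current)
            current = {}
        elif HEADER.match(raw):
            current["commit"] = raw.split()[0]
        elif raw.startswith("author "):
            current["author"] = raw[len("author "):]
        elif raw.startswith("author-time "):
            current["author_time"] = int(raw.split()[1])
        elif raw.startswith("author-tz "):
            current["author_tz"] = raw.split()[1]
    return records
```

Each record then holds the commit, author, authorship timestamp, and time zone of one surviving line, which is the kind of information the Git Blame panels aggregate.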
Slide 9
Git Blame panel
Slide 10
Git Blame panel
Slide 11
Git Blame (Charts) panel
Slide 12
Git Blame (Files) panel
Slide 13
Git Blame (Files) panel (2)
Slide 14
How to use the dashboard
Click almost anywhere to apply the corresponding
filter
E.g., click on an author to filter their activity
Interact with filters (green / red buttons that appear
on the top)
Select dates on the top right
Change layout by dragging & resizing widgets
Use the icon below the date selector to share
(includes current filters and layout)
Panels are customized Kibana dashboards:
https://www.elastic.co/guide/en/kibana/4.4/dashboard.html
Slide 15
Exploring the Linux history
(some examples)
Slide 16
How old is the code?
Age of lines (date of authorship, “.c” files)
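An age-of-lines chart like this one boils down to bucketing the authorship timestamp of every surviving line. The sketch below does that by UTC year; the function name is hypothetical, and the dashboard itself computes its buckets in Kibana rather than in Python.

```python
from collections import Counter
from datetime import datetime, timezone

def lines_per_year(author_times):
    """Count surviving lines by the UTC year in which they were authored.

    author_times: iterable of Unix timestamps (seconds), one per line.
    """
    years = (
        datetime.fromtimestamp(ts, tz=timezone.utc).year
        for ts in author_times
    )
    return Counter(years)
```

Feeding it the `author-time` values reported by git blame for all ".c" files would give the raw counts behind a histogram of this kind.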
Slide 17
How old is the code? (files by last change)
Files by last change (date of authorship)
Slide 18
How old is the code? (files by first change)
Files by first remaining change (date of authorship, “.c” files)
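Per-file views such as "first remaining change" and "last change" can be derived from the same per-line data by taking the oldest and newest authorship timestamp among the lines that survive in each file. A minimal sketch, assuming the input is a sequence of hypothetical `(path, author_time)` pairs:

```python
def file_change_span(line_records):
    """Map each path to (oldest, newest) authorship timestamp of its lines.

    line_records: iterable of (path, author_time) pairs, one per surviving
    line, where author_time is a Unix timestamp in seconds.
    """
    spans = {}
    for path, author_time in line_records:
        first, last = spans.get(path, (author_time, author_time))
        spans[path] = (min(first, author_time), max(last, author_time))
    return spans
```

The first element of each pair corresponds to the oldest code still present in the file, the second to its most recent surviving change.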
Slide 19
How old is the code? drivers/net
Age of lines (date of authorship, “.c” files)
From top left, clockwise: Wireless, USB, IrDA, Ethernet
Slide 20
Code “owned”
“The land belongs
to its workers”
Emiliano Zapata
Slide 21
Code “owned” (authors of remaining code)
Top authors, by number of lines (since 2002)
Slide 22
Code “owned” (authors of remaining code)
Top authors, by number of snippets (since 2002)
About 200 authors account for 50% of snippets
(Snippet: piece of a file changed in a single commit)
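A figure such as "about 200 authors account for 50% of snippets" is a cumulative count over authors ranked by contribution. The sketch below shows one way to compute it. It assumes a snippet can be approximated as a run of consecutive blamed lines attributed to the same commit, which is an interpretation of the definition above rather than the dashboard's confirmed method; the function names are hypothetical.

```python
from collections import Counter
from itertools import groupby

def snippets_per_author(blamed_lines):
    """Count snippets per author for one file.

    blamed_lines: (commit, author) pairs in line order. Each run of
    consecutive identical pairs is treated as one snippet (an assumption).
    """
    counts = Counter()
    for (_commit, author), _run in groupby(blamed_lines):
        counts[author] += 1
    return counts

def authors_to_cover(counts, fraction=0.5):
    """Smallest number of top authors whose combined count reaches fraction."""
    target = fraction * sum(counts.values())
    covered = 0
    for rank, (_author, n) in enumerate(counts.most_common(), start=1):
        covered += n
        if covered >= target:
            return rank
    return 0
```

Summing the per-file counters over the whole tree and calling `authors_to_cover` with the default fraction would yield the number quoted on the slide, under the stated assumption.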
Slide 23
Where do developers work?
Number of lines by time zone
(lines remaining in current kernel, by time of authorship)
Top: lines from 2009, bottom: 2016
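Git records the author's UTC offset with every commit (for example `+0200`), so a lines-by-time-zone view only needs those offsets tallied per surviving line. A minimal sketch, with hypothetical function names:

```python
from collections import Counter

def tz_offset_hours(tz):
    """Convert a git offset such as '+0530' or '-0800' to hours east of UTC."""
    sign = -1 if tz.startswith("-") else 1
    hours, minutes = int(tz[1:3]), int(tz[3:5])
    return sign * (hours + minutes / 60)

def lines_by_offset(author_tzs):
    """Count lines per UTC offset, given one offset string per line."""
    return Counter(tz_offset_hours(tz) for tz in author_tzs)
```

Restricting the input to lines authored in a given year gives a per-year comparison like the one on this slide. Note that an offset identifies a time zone, not a country, and it reflects the author's machine settings.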
Slide 24
How to build your own
dashboard
Slide 25
Ingredients
A git repository with all Linux history
(going back far enough for git blame output to be meaningful)
Some scripts based on GrimoireLab
to analyze it and produce data for the dashboard
Python to run those scripts
Elasticsearch to store the data
Kibiter / Kibana to produce the dashboard
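To illustrate how analysis results might reach Elasticsearch, the sketch below builds a request body in the newline-delimited JSON format accepted by the Elasticsearch bulk API. The index name, document fields, and function name are assumptions for illustration, not the ones used by the dashboard's scripts; older Elasticsearch releases also expect a `_type` in each action line.

```python
import json

def bulk_body(index, docs):
    """Build an NDJSON body for the Elasticsearch bulk API.

    docs: iterable of (doc_id, document_dict) pairs. Each document is
    preceded by an action line, and every line ends with a newline.
    """
    lines = []
    for doc_id, doc in docs:
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(doc))
    return "".join(line + "\n" for line in lines)
```

The resulting string would be sent with an HTTP POST to the `_bulk` endpoint of the Elasticsearch instance, with the content type `application/x-ndjson`.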
Slide 26
Reconstructing Linux development history
From Dave Jones: 0.01 to 2.4.0
From Thomas Gleixner: 2.4.0 to 2.6.12-rc2
From Torvalds’ Linux repo: 2.6.12 to now
All put together by Yoann Padioleau
and later Rob Landley
https://landley.net/kdocs/fullhist/
Available for your enjoyment (updated up to now)
https://github.com/history-repos/linux/
Slide 27
Analyzing the repository with Git Blame
Git Blame backend for Perceval
Still not merged upstream, clone from
http://github.com/jgbarah/perceval (gitblame branch)
Ad-hoc script to produce Kibana indexes
https://github.com/jgbarah/blameanalysis.git
Assuming Linux git repo in ~/linux:
blame_analysis.py --repodir ~/linux --store linux-store \
--processed linux-processed --uploaded linux-uploaded \
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux \
--es_url elasticsearch_url
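For readers who want to see what a blame pass over a repository involves, the sketch below drives plain git from Python. It is not what `blame_analysis.py` or the Perceval backend do internally; the function names are hypothetical, and only standard git commands (`git ls-files -z`, `git blame --line-porcelain`) are used.

```python
import subprocess

def blame_command(path, rev="HEAD"):
    """Command line that blames one file at a given revision."""
    return ["git", "blame", "--line-porcelain", rev, "--", path]

def tracked_files(repodir):
    """List files tracked in the repository (NUL-separated for safety)."""
    out = subprocess.run(
        ["git", "ls-files", "-z"],
        cwd=repodir, check=True, capture_output=True,
    ).stdout
    return [p.decode("utf-8", "surrogateescape") for p in out.split(b"\0") if p]

def blame_file(repodir, path, rev="HEAD"):
    """Return the raw porcelain blame output for one file as text."""
    result = subprocess.run(
        blame_command(path, rev),
        cwd=repodir, check=True, capture_output=True,
    )
    return result.stdout.decode("utf-8", "replace")
```

With the repository in `~/linux`, one would expand the path first (`os.path.expanduser("~/linux")`) and then call `blame_file` for each entry returned by `tracked_files`. On a tree the size of the kernel this takes a long time, so storing intermediate results is worthwhile.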
Slide 28
Producing the dashboard
Starting from a standard Bitergia Analytics Dashboard
for the Linux git repository
(built with GrimoireLab tools)
http://grimoirelab.github.io
Some new panels (Kibana dashboards) for git blame
Git Blame
Git Blame (Charts)
Git Blame (Files)
http://linux.biterg.io
Slide 29
Final remarks
Slide 30
A complete dashboard for Linux code history
25 years of Linux development history
This is what the current kernel is made of
Play with the data, explore it!
Our contribution to
Linux's 25th anniversary
Dashboard: http://linux.biterg.io
Slides: http://speakerdeck.com/jgbarah
Slide 31
License
© 2016 Bitergia
Some rights reserved.
This presentation is distributed under the
“Attribution-ShareAlike 3.0” license, by Creative Commons,
available at
http://creativecommons.org/licenses/by-sa/3.0/
Slide 32
Credits
“Napoleon’s Russian campaign of 1812”
Original by Charles Minard
License: Public domain
https://en.wikipedia.org/wiki/Charles_Joseph_Minard#/media/File:Minard.png
“Emiliano Zapata”
License: Public Domain