GitHub Insights: Understanding Open Source

Talk given at OSCON 2016


Georgios Gousios

May 19, 2016


  1. GitHub Insights: Understanding Open Source. @jeffmcaffer (Microsoft), Georgios Gousios (Delft University of Technology, TU Delft), Kevin Lewis (Microsoft)
  2. Snapshot overview

  3. Inspire confidence

  4. How open is a project?

  5. Commits (core vs community)

  6. Commits (origin)

  7. Comments (core vs community)

  8. PR lifelines

  9. Are we using git in a distributed way?

  10. How many devs are there per country?

  11. Insights

  12. Business insights

  13. Research insights

  14. Cross-domain insights

  15. Operational insights

  16. Approach: Data for the masses

  17. GitHub by the numbers (Mid 2016)

  18. Approach

  19. How does it work?

  20. Example event (condensed)

  21. Entities

  22. GHTorrent architecture [diagram: GitHub API, event retrieval, event and commit queues (evt.commit, evt.fork), data retrieval for projects and commits, mirroring cluster]
  23. GHTorrent by the numbers

  24. Using the data: You can do it too!

  25. Using the data: Hosted

  26. Using the data: Download

  27. Using the data: Self-service

  28. Using the data: Azure Data Lake

  29. Resources: @gousiosg, @jeffmcaffer, @kelewis