PyData ATL Lightning Talk
Will McGinnis
June 17, 2016
https://www.github.com/wdm0006/git-pandas
Transcript
Analyzing Git/GitHub data with git-pandas (or: vanity metrics at scale)
Who am I?
• Will McGinnis
• Write code at Predikto (we’re hiring)
• www.predikto.com
• twitter.com/willmcginnis
• github.com/wdm0006
What is git-pandas?
• Open source library: https://github.com/wdm0006/git-pandas
• Represents git data as pandas dataframes
• Abstracts groups of repos into an object called a ProjectDirectory
• Does some common analysis tasks for you
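To make "git data as a table" concrete, here is a minimal sketch that turns commit log text into one record per commit. It uses only the Python standard library and is not git-pandas code; the `SAMPLE_LOG` text, the `|` delimiter, and the `log_to_rows` helper are all assumptions made for illustration. A list of dicts like this is the shape a pandas dataframe could be built from.

```python
from datetime import datetime, timezone

# Hypothetical sample text, shaped like the output of
# `git log --pretty=format:"%H|%an|%at|%s"` (hash, author, unix time, subject).
SAMPLE_LOG = """\
a1b2c3|alice|1466121600|Add parser
d4e5f6|bob|1466125200|Fix typo
"""


def log_to_rows(log_text):
    """Parse delimited log text into a list of row dicts, one per commit."""
    rows = []
    for line in log_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Split at most 3 times so a "|" inside the subject is preserved.
        sha, author, ts, message = line.split("|", 3)
        rows.append({
            "sha": sha,
            "author": author,
            "date": datetime.fromtimestamp(int(ts), tz=timezone.utc),
            "message": message,
        })
    return rows


rows = log_to_rows(SAMPLE_LOG)
for row in rows:
    print(row["date"].isoformat(), row["author"], row["message"])
```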
Org-wide Punchcards
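A punchcard counts commits per (weekday, hour) cell, and an org-wide one sums those counts across repositories. The sketch below shows that idea with the standard library only; it is not the git-pandas implementation, and the `org_punchcard` helper, the repository names, and the timestamps are hypothetical.

```python
from collections import Counter
from datetime import datetime, timezone


def org_punchcard(commits_by_repo):
    """Count commits per (weekday, hour) across all repos.

    commits_by_repo maps a repo name to a list of unix timestamps.
    Weekday follows datetime.weekday(): Monday is 0, Sunday is 6.
    Times are bucketed in UTC (an assumption made for this sketch).
    """
    counts = Counter()
    for timestamps in commits_by_repo.values():
        for ts in timestamps:
            dt = datetime.fromtimestamp(ts, tz=timezone.utc)
            counts[(dt.weekday(), dt.hour)] += 1
    return counts


# Hypothetical commit times for two repos.
commits = {
    "repo-a": [1466121600, 1466123400],  # Friday 00:00 and 00:30 UTC
    "repo-b": [1466208000],              # Saturday 00:00 UTC
}
card = org_punchcard(commits)
```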
Cumulative Blame
Estimate Code Quality
• “file owner”
• metric for refactors
• how long will an owner’s file go without being refactored?
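The slide does not define "file owner" or the refactor metric, so the sketch below makes two assumptions: the owner is the author with the most lines in the current blame, and "time without a refactor" is approximated by the gaps in days between consecutive changes to the file. The `file_owner` and `days_between_changes` helpers and the sample data are hypothetical.

```python
from collections import Counter
from datetime import date


def file_owner(blame_authors):
    """Return the author with the most blamed lines, or None if empty.

    blame_authors holds one author name per line of the file.
    """
    if not blame_authors:
        return None
    return Counter(blame_authors).most_common(1)[0][0]


def days_between_changes(change_dates):
    """Return the gaps, in days, between consecutive changes to a file."""
    ordered = sorted(change_dates)
    return [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]


# Hypothetical blame output and change history for one file.
blame = ["alice", "alice", "bob", "alice", "carol"]
owner = file_owner(blame)
gaps = days_between_changes([date(2016, 6, 17), date(2016, 6, 1), date(2016, 5, 2)])
```

Long gaps on files with a single dominant owner would be one possible signal to look at, though the talk's actual metric may differ.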
GitNOC
Other things
• Bus factors (for files, repos and orgs)
• Development time estimation
• Rate of change metrics (risk)
• Basic datasets (commit history, file changes, branches, tags, etc.)
• File owners
• File details
• and more: https://github.com/wdm0006/git-pandas
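One common way to define a bus factor is the smallest number of authors who together account for at least half of the blamed lines. The sketch below implements that definition as an assumption; the slides do not say how git-pandas computes it, and the `bus_factor` helper, the 0.5 threshold, and the line counts are illustrative only.

```python
def bus_factor(lines_by_author, threshold=0.5):
    """Smallest number of authors covering at least `threshold` of all lines.

    lines_by_author maps an author name to their blamed line count.
    Returns 0 when there are no lines at all.
    """
    total = sum(lines_by_author.values())
    if total == 0:
        return 0
    covered = 0
    # Take the largest contributors first so the count is minimal.
    for n, lines in enumerate(sorted(lines_by_author.values(), reverse=True), start=1):
        covered += lines
        if covered / total >= threshold:
            return n
    return len(lines_by_author)


# Hypothetical blame totals for two repos.
concentrated = bus_factor({"alice": 60, "bob": 30, "carol": 10})
spread_out = bus_factor({"alice": 30, "bob": 30, "carol": 40})
```

The same function works at file, repo, or org level, depending on which blame totals are passed in.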