PyData ATL Lightning Talk
Will McGinnis
June 17, 2016
https://www.github.com/wdm0006/git-pandas
Transcript
Analyzing Git/GitHub data with git-pandas (or: vanity metrics at scale)
Who am I?
• Will McGinnis
• Write code at Predikto (we’re hiring)
• www.predikto.com
• twitter.com/willmcginnis
• github.com/wdm0006
What is git-pandas?
• Open source library: https://github.com/wdm0006/git-pandas
• Represents git data as pandas dataframes
• Abstracts groups of repos into an object called a ProjectDirectory
• Does some common analysis tasks for you
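To make "git data as a table" concrete, here is a minimal stdlib-only sketch that turns `git log` text into one row per commit. It is not how git-pandas does it; the `parse_log` helper, the `|` separator, and the sample commits are all made up for illustration.

```python
from datetime import datetime

# Sample text in the shape produced by:
#   git log --pretty=format:'%H|%an|%aI|%s'
# The hashes, authors, and messages below are invented for this sketch.
SAMPLE_LOG = """\
a1b2c3|Alice|2016-06-01T09:15:00+00:00|Add parser
d4e5f6|Bob|2016-06-01T14:40:00+00:00|Fix typo in README
0789ab|Alice|2016-06-02T11:05:00+00:00|Handle empty input | edge case
"""


def parse_log(text):
    """Return one dict per commit: sha, author, date, message."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Split on the first three separators only, so a "|" inside
        # the commit message is preserved.
        sha, author, date, message = line.split("|", 3)
        rows.append({
            "sha": sha,
            "author": author,
            "date": datetime.fromisoformat(date),
            "message": message,
        })
    return rows


rows = parse_log(SAMPLE_LOG)
```

A list of uniform dicts like `rows` is the kind of structure that loads directly into a dataframe, which is where grouping and aggregation become one-liners.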
Org-wide Punchcards
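A punchcard is a count of commits per (day of week, hour of day) cell. A rough sketch of the org-wide version, assuming commit timestamps have already been collected per repo; `org_punchcard` and the repo names are hypothetical, and this is not the git-pandas implementation.

```python
from collections import Counter
from datetime import datetime


def org_punchcard(commits_by_repo):
    """Sum commits per (weekday, hour) cell across all repos.

    commits_by_repo maps a repo name to a list of commit datetimes.
    Weekday follows datetime.weekday(): 0 is Monday, 6 is Sunday.
    """
    card = Counter()
    for times in commits_by_repo.values():
        card.update((t.weekday(), t.hour) for t in times)
    return card


# Made-up timestamps for two hypothetical repos.
commits_by_repo = {
    "repo-a": [datetime(2016, 6, 13, 9, 5), datetime(2016, 6, 17, 22, 30)],
    "repo-b": [datetime(2016, 6, 13, 9, 50)],
}
card = org_punchcard(commits_by_repo)
```

Each key in `card` is one dot on the punchcard plot, and its count sets the dot size.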
Cumulative Blame
Estimate Code Quality
• “file owner”
• metric for refactors
• how long will an owner’s file go without being refactored?
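The slide does not define "file owner". One plausible reading, assumed here, is the author with the most lines attributed to them by `git blame` in that file. A minimal sketch under that assumption, with a hypothetical `file_owners` helper and invented blame counts:

```python
def file_owners(blame):
    """Map each file to the author holding the most blamed lines.

    blame has the shape {filename: {author: line_count}}.
    Files with no blamed lines are skipped. On a tie, the author
    listed first in the inner dict wins.
    """
    return {
        filename: max(counts, key=counts.get)
        for filename, counts in blame.items()
        if counts
    }


# Invented line counts for illustration only.
blame = {
    "parser.py": {"alice": 120, "bob": 30},
    "cli.py": {"alice": 10, "bob": 45, "carol": 5},
    "empty.py": {},
}
owners = file_owners(blame)
```

With an owner per file at each revision, a change of owner can serve as a proxy for a refactor, which is one way to approach the question on the slide.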
GitNOC
Other things
• Bus factors (for files, repos and orgs)
• Development time estimation
• Rate of change metrics (risk)
• Basic datasets (commit history, file changes, branches, tags, etc.)
• File owners
• File details
• and more: https://github.com/wdm0006/git-pandas
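As an illustration of the first item, here is a sketch of a bus factor calculation. The definition is an assumption, since the slide gives none: the smallest number of contributors who together account for at least half of the blamed lines. The `bus_factor` helper, the 0.5 threshold, and the line counts are hypothetical.

```python
def bus_factor(lines_by_author, threshold=0.5):
    """Smallest number of authors covering `threshold` of all lines."""
    total = sum(lines_by_author.values())
    if total == 0:
        return 0  # nothing to own
    covered = 0
    # Take the largest contributors first.
    ranked = sorted(lines_by_author.values(), reverse=True)
    for count, lines in enumerate(ranked, start=1):
        covered += lines
        if covered >= threshold * total:
            return count
    return len(ranked)


# Invented line counts for illustration only.
concentrated = bus_factor({"alice": 60, "bob": 30, "carol": 10})
spread_out = bus_factor({"alice": 30, "bob": 30, "carol": 40})
```

The same function works at file, repo, or org level; only the scope of the line counts passed in changes.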