Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
PyData ATL Lightning Talk
Search
Sponsored
·
Ship Features Fearlessly
Turn features on and off without deploys. Used by thousands of Ruby developers.
→
Will McGinnis
June 17, 2016
Programming
0
300
PyData ATL Lightning Talk
https://www.github.com/wdm0006/git-pandas
Will McGinnis
June 17, 2016
Tweet
Share
More Decks by Will McGinnis
See All by Will McGinnis
I made a model, now what?
wdm0006
0
260
Encoding categorical variables with categorical encoders
wdm0006
0
450
Other Decks in Programming
See All in Programming
React 19でつくる「気持ちいいUI」- 楽観的UIのすすめ
himorishige
11
7.4k
組織で育むオブザーバビリティ
ryota_hnk
0
180
CSC307 Lecture 03
javiergs
PRO
1
490
HTTPプロトコル正しく理解していますか? 〜かわいい猫と共に学ぼう。ฅ^•ω•^ฅ ニャ〜
hekuchan
2
690
ノイジーネイバー問題を解決する 公平なキューイング
occhi
0
110
Architectural Extensions
denyspoltorak
0
290
副作用をどこに置くか問題:オブジェクト指向で整理する設計判断ツリー
koxya
1
610
AI巻き込み型コードレビューのススメ
nealle
2
420
OCaml 5でモダンな並列プログラミングを Enjoyしよう!
haochenx
0
140
CSC307 Lecture 08
javiergs
PRO
0
670
コマンドとリード間の連携に対する脅威分析フレームワーク
pandayumi
1
460
ぼくの開発環境2026
yuzneri
0
240
Featured
See All Featured
Visualizing Your Data: Incorporating Mongo into Loggly Infrastructure
mongodb
49
9.9k
Future Trends and Review - Lecture 12 - Web Technologies (1019888BNR)
signer
PRO
0
3.2k
How to Align SEO within the Product Triangle To Get Buy-In & Support - #RIMC
aleyda
1
1.4k
Become a Pro
speakerdeck
PRO
31
5.8k
Data-driven link building: lessons from a $708K investment (BrightonSEO talk)
szymonslowik
1
920
BBQ
matthewcrist
89
10k
Ethics towards AI in product and experience design
skipperchong
2
200
Discover your Explorer Soul
emna__ayadi
2
1.1k
Bash Introduction
62gerente
615
210k
Prompt Engineering for Job Search
mfonobong
0
160
HDC tutorial
michielstock
1
390
Agile Actions for Facilitating Distributed Teams - ADO2019
mkilby
0
120
Transcript
Analyzing Git/ Github data with git-pandas (or: vanity metrics at
scale)
Who am I? • Will McGinnis • Write code at
Predikto (we’re hiring) • www.predikto.com • twitter.com/willmcginnis • github.com/wdm0006
What is git-pandas? • Open source library: https://github.com/wdm0006/git-pandas • Represents
git data as pandas dataframes • Abstracts groups of repos into an object called a ProjectDirectory • Does some common analysis tasks for you
Org-wide Punchcards
Cumulative Blame
Estimate Code Quality • “file owner” • metric for refactors
• how long will an owner’s file go without being refactored?
GitNOC
None
None
Other things • Bus factors (for files, repos and orgs)
• Development time estimation • Rate of change metrics (risk) • Basic datasets (commit history, file changes, branches, tags, etc.) • File owners • File details • and more: https://github.com/wdm0006/git- pandas