PyData ATL Lightning Talk
Will McGinnis
June 17, 2016
https://www.github.com/wdm0006/git-pandas
Transcript
Analyzing Git/GitHub data with git-pandas (or: vanity metrics at scale)
Who am I?
• Will McGinnis
• Write code at Predikto (we’re hiring)
• www.predikto.com
• twitter.com/willmcginnis
• github.com/wdm0006
What is git-pandas?
• Open source library: https://github.com/wdm0006/git-pandas
• Represents git data as pandas dataframes
• Abstracts groups of repos into an object called a ProjectDirectory
• Does some common analysis tasks for you
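To make the "git data as dataframes" idea concrete, here is a minimal sketch that turns text in the shape of `git log --pretty=format:"%H|%an|%at|%s"` output into one row per commit. It uses only the standard library and is not git-pandas's own code: the function name `parse_git_log`, the column names, and the sample log text are all assumptions made up for illustration.

```python
from datetime import datetime, timezone

# Made-up sample in the shape of `git log --pretty=format:"%H|%an|%at|%s"`.
SAMPLE_LOG = """\
a1b2c3|Alice|1466121600|Add punchcard
d4e5f6|Bob|1466208000|Fix blame | handle renames
"""


def parse_git_log(text):
    """Turn pipe-delimited git log text into a list of row dicts (hypothetical helper)."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split at most 3 times so a "|" inside the commit subject is preserved.
        sha, author, timestamp, message = line.split("|", 3)
        rows.append({
            "sha": sha,
            "author": author,
            "date": datetime.fromtimestamp(int(timestamp), tz=timezone.utc),
            "message": message,
        })
    return rows


rows = parse_git_log(SAMPLE_LOG)
```

A list of dicts like `rows` can be handed to `pandas.DataFrame(rows)` to get one column per key, which is roughly the tabular shape the slide describes.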
Org-wide Punchcards
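A punchcard counts commits per weekday and hour. The sketch below shows that aggregation with the standard library only; it is an illustration of the idea rather than how git-pandas computes it, and the function name `punchcard` and the commit timestamps are made up.

```python
from collections import Counter
from datetime import datetime, timezone


def punchcard(timestamps):
    """Count commits per (weekday, hour) cell in UTC; Monday is weekday 0."""
    counts = Counter()
    for ts in timestamps:
        moment = datetime.fromtimestamp(ts, tz=timezone.utc)
        counts[(moment.weekday(), moment.hour)] += 1
    return counts


# Made-up commit times (Unix seconds, UTC), for illustration only.
commit_times = [1466121600, 1466123400, 1466208000]
card = punchcard(commit_times)
```

For an org-wide view, the timestamps from every repository would simply be pooled into one list before counting.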
Cumulative Blame
Estimate Code Quality
• “file owner”
• metric for refactors
• how long will an owner’s file go without being refactored?
GitNOC
Other things
• Bus factors (for files, repos and orgs)
• Development time estimation
• Rate of change metrics (risk)
• Basic datasets (commit history, file changes, branches, tags, etc.)
• File owners
• File details
• and more: https://github.com/wdm0006/git-pandas
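As a rough illustration of two items above, the sketch below computes a bus factor and a file owner from per-author line counts, such as those a blame would produce. The definitions are assumptions: bus factor is taken here as the smallest number of authors whose lines together reach a given share (50% by default), and the owner as the author with the most lines. git-pandas may define both differently. The function names and the counts are hypothetical.

```python
def bus_factor(lines_by_author, threshold=0.5):
    """Smallest number of authors whose combined lines reach `threshold` of the total."""
    total = sum(lines_by_author.values())
    if total == 0:
        return 0
    covered = 0
    # Walk authors from largest to smallest contribution.
    ranked = sorted(lines_by_author.values(), reverse=True)
    for n_authors, lines in enumerate(ranked, start=1):
        covered += lines
        if covered / total >= threshold:
            return n_authors
    return len(ranked)


def file_owner(lines_by_author):
    """Author with the most lines, or None when there are no lines."""
    if not lines_by_author:
        return None
    return max(lines_by_author, key=lines_by_author.get)


# Made-up blame counts (lines currently attributed to each author).
blame = {"alice": 600, "bob": 300, "carol": 100}
factor = bus_factor(blame)
owner = file_owner(blame)
```

The same function works at file, repo, or org level; only the scope of the line counts passed in changes.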