PyData ATL Lightning Talk
Will McGinnis
June 17, 2016
https://www.github.com/wdm0006/git-pandas
Transcript
Analyzing Git/GitHub data with git-pandas (or: vanity metrics at scale)
Who am I?
• Will McGinnis
• Write code at Predikto (we’re hiring)
• www.predikto.com
• twitter.com/willmcginnis
• github.com/wdm0006
What is git-pandas?
• Open source library: https://github.com/wdm0006/git-pandas
• Represents git data as pandas dataframes
• Abstracts groups of repos into an object called a ProjectDirectory
• Does some common analysis tasks for you
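To make the "git data as dataframes" idea concrete, here is a minimal sketch that does not use git-pandas itself. It turns text in the shape of `git log --pretty=format:'%h|%an|%at'` into a list of row dictionaries, which is the kind of tabular structure a dataframe would be built from. The sample log text, the `parse_log` helper, and the column names are all assumptions made for illustration.

```python
from datetime import datetime, timezone

# Made-up sample in the shape of `git log --pretty=format:'%h|%an|%at'`
# (abbreviated hash | author name | author date as a unix timestamp).
SAMPLE_LOG = """\
a1b2c3|alice|1466150400
d4e5f6|bob|1466154000
0a1b2c|alice|1466236800
"""

def parse_log(text):
    """Turn pipe-delimited log lines into row dicts (hypothetical helper)."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split from both ends so an author name containing "|" survives.
        sha, rest = line.split("|", 1)
        author, ts = rest.rsplit("|", 1)
        rows.append({
            "sha": sha,
            "author": author,
            "date": datetime.fromtimestamp(int(ts), tz=timezone.utc),
        })
    return rows

rows = parse_log(SAMPLE_LOG)
```

A list of dictionaries like `rows` is one of the inputs the pandas `DataFrame` constructor accepts, so the step from here to a dataframe is short.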
Org-wide Punchcards
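A punchcard is, roughly, a count of commits per (weekday, hour) cell. The sketch below shows that aggregation with the standard library only, assuming the commit timestamps from several repos have already been pooled into one list. The timestamps and the `punchcard` function here are made up for illustration and are not the git-pandas API.

```python
from collections import Counter
from datetime import datetime, timezone

# Made-up commit times (unix seconds, UTC), pooled across several repos.
COMMIT_TIMES = [1466150400, 1466154000, 1466157000, 1466236800]

def punchcard(timestamps):
    """Count commits per (weekday, hour) cell; Monday is weekday 0."""
    counts = Counter()
    for ts in timestamps:
        dt = datetime.fromtimestamp(ts, tz=timezone.utc)
        counts[(dt.weekday(), dt.hour)] += 1
    return counts

card = punchcard(COMMIT_TIMES)
```

Each key in `card` is one cell of the punchcard grid, and the value is what the size of the dot would encode.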
Cumulative Blame
Estimate Code Quality
• “file owner”
• metric for refactors
• how long will an owner’s file go without being refactored?
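One plausible reading of "file owner" is the author with the most surviving lines in a file's blame. The sketch below works under that assumption. The blame summary is made-up data, `file_owners` is a hypothetical helper, and the slides do not confirm that git-pandas defines ownership exactly this way.

```python
# Made-up blame summary: file path -> {author: surviving line count}.
BLAME = {
    "core.py": {"alice": 120, "bob": 30},
    "utils.py": {"bob": 45, "carol": 44},
}

def file_owners(blame):
    """Pick, for each file, the author with the most blamed lines."""
    return {path: max(lines, key=lines.get) for path, lines in blame.items()}

owners = file_owners(BLAME)
```

With owners assigned, the refactor metric on this slide could be approached by tracking how long each file keeps the same owner, though that step is not sketched here.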
GitNOC
Other things
• Bus factors (for files, repos and orgs)
• Development time estimation
• Rate of change metrics (risk)
• Basic datasets (commit history, file changes, branches, tags, etc.)
• File owners
• File details
• and more: https://github.com/wdm0006/git-pandas
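As an illustration of the bus factor item above, a common definition is the smallest number of contributors who together account for at least half of the code. The sketch below assumes that definition and a per-author line count taken from blame. The `bus_factor` function and its `threshold` parameter are hypothetical and may differ from how git-pandas computes the figure.

```python
def bus_factor(lines_by_author, threshold=0.5):
    """Smallest number of authors covering `threshold` of all lines."""
    total = sum(lines_by_author.values())
    if total == 0:
        return 0
    covered = 0
    # Take the largest contributors first until the threshold is reached.
    ranked = sorted(lines_by_author.values(), reverse=True)
    for n, count in enumerate(ranked, start=1):
        covered += count
        if covered >= threshold * total:
            return n
    return len(ranked)

# Made-up line counts for a single repo.
repo_factor = bus_factor({"alice": 40, "bob": 35, "carol": 25})
```

The same function applies at file, repo, or org level; only the scope of the line counts passed in changes.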