PyData ATL Lightning Talk
Will McGinnis
June 17, 2016
https://www.github.com/wdm0006/git-pandas
Transcript
Analyzing Git/GitHub data with git-pandas (or: vanity metrics at scale)
Who am I?
• Will McGinnis
• Write code at Predikto (we’re hiring)
• www.predikto.com
• twitter.com/willmcginnis
• github.com/wdm0006
What is git-pandas?
• Open source library: https://github.com/wdm0006/git-pandas
• Represents git data as pandas dataframes
• Abstracts groups of repos into an object called a ProjectDirectory
• Does some common analysis tasks for you
Org-wide Punchcards
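The idea behind a punchcard can be sketched with the standard library alone: bucket commit timestamps by weekday and hour, pooled across every repository in an organization. This is only an illustration of the concept, not git-pandas's own implementation; the `punchcard` function and the sample timestamps are hypothetical.

```python
from collections import Counter
from datetime import datetime


def punchcard(commit_times):
    """Count commits per (weekday, hour) bucket; Monday is weekday 0."""
    counts = Counter()
    for ts in commit_times:
        counts[(ts.weekday(), ts.hour)] += 1
    return counts


# Hypothetical commit timestamps pooled from several repositories.
commits = [
    datetime(2016, 6, 13, 9, 15),   # Monday morning
    datetime(2016, 6, 13, 9, 50),   # Monday morning, same hour
    datetime(2016, 6, 14, 22, 5),   # Tuesday night
    datetime(2016, 6, 18, 14, 30),  # Saturday afternoon
]

card = punchcard(commits)
for (weekday, hour), n in sorted(card.items()):
    print(weekday, hour, n)
```

Each bucket would become one dot on the punchcard plot, sized by its count.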
Cumulative Blame
Estimate Code Quality
• “file owner”
• metric for refactors
• how long will an owner’s file go without being refactored?
GitNOC
Other things
• Bus factors (for files, repos and orgs)
• Development time estimation
• Rate of change metrics (risk)
• Basic datasets (commit history, file changes, branches, tags, etc.)
• File owners
• File details
• and more: https://github.com/wdm0006/git-pandas
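As a rough illustration of the bus factor item above, here is a minimal sketch assuming one common definition: the smallest number of authors who together account for at least half of the currently blamed lines. The definition, the `bus_factor` function, its `threshold` parameter, and the line counts are all assumptions for illustration and may differ from what git-pandas computes.

```python
def bus_factor(lines_by_author, threshold=0.5):
    """Smallest number of authors covering `threshold` of all blamed lines.

    `lines_by_author` maps an author name to the number of lines
    attributed to them (hypothetical blame output).
    """
    total = sum(lines_by_author.values())
    if total == 0:
        return 0
    covered = 0
    # Walk authors from the largest contributor to the smallest.
    ranked = sorted(lines_by_author.values(), reverse=True)
    for count, lines in enumerate(ranked, start=1):
        covered += lines
        if covered >= threshold * total:
            return count
    return len(ranked)


# Hypothetical blame totals for two repositories.
concentrated = {"alice": 600, "bob": 250, "carol": 150}
spread_out = {"alice": 300, "bob": 300, "carol": 250, "dave": 150}

print(bus_factor(concentrated))
print(bus_factor(spread_out))
```

The same function works at file, repository, or organization level, depending on which blame totals are passed in.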