Analytics for Developers
Trent Hauck
May 05, 2013
A talk I gave at Kansas City Developer Conference 2013.
Transcript
Analytics for Developers and Developing for Analytics
About Me
• 2006-2011: Educated: Accounting & Finance
• 2011-Present: Reeducated: Marketing & Operations
• Twitter: @trent_hauck
• Work: @AlightAnalytics
• Other: Contribute (now and then) to Pandas & StatsModels
Two Parts: Analytics (more), Development (less)
Why should you care?
“In God we trust; all others must bring data.”
To do analytics you need x
Where x is data collection...
Site Analytics Should be a 1st Class Citizen of Development
Collect More Than You Need Now
Now Some GA Code
<script type="text/javascript">
  var _gaq = _gaq || [];
  _gaq.push(['_setAccount', 'UA-31465642-1']);
  _gaq.push(['_setDomainName', 'trenthauck.com']);
  _gaq.push(['_setAllowLinker', true]);
  _gaq.push(['_trackPageview']);
  (function() {
    var ga = document.createElement('script');
    ga.type = 'text/javascript';
    ga.async = true;
    ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
    var s = document.getElementsByTagName('script')[0];
    s.parentNode.insertBefore(ga, s);
  })();
</script>
Next Steps
• Events: _gaq.push(['_trackEvent', 'Cat', 'Act', 'Label']);
• Custom Variables: _gaq.push(['_setCustomVar', 1, 'key', 'value', 1]);
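The two calls above can be wrapped so that application code never builds the command arrays by hand. This is only a sketch: `trackEvent` and `setCustomVar` are hypothetical helper names, not part of ga.js, and the category, action, and variable values are made up for illustration.

```javascript
// The classic ga.js command queue. Before ga.js loads it is a plain array;
// once loaded, the library takes over and processes queued commands.
var _gaq = _gaq || [];

// Hypothetical helper: queue an event with a category, an action,
// and an optional label.
function trackEvent(category, action, label) {
  var command = ['_trackEvent', category, action];
  if (label !== undefined) {
    command.push(label);
  }
  _gaq.push(command);
  return command;
}

// Hypothetical helper: queue a custom variable for slot `index`.
// In classic ga.js the scope is 1 (visitor), 2 (session), or 3 (page).
function setCustomVar(index, name, value, scope) {
  var command = ['_setCustomVar', index, name, value, scope];
  _gaq.push(command);
  return command;
}

// Custom variables ride along with the next tracking call,
// so set them before the event is pushed.
setCustomVar(1, 'plan', 'free', 1);
trackEvent('Signup', 'Click', 'Header button');
```

Keeping the helpers this thin makes it cheap to add tracking calls everywhere, which fits the "collect more than you need now" advice.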
Where x is data analysis...
Differences in Data
• Small Data == Math Problem
• Big Data == Engineering Problem
The Math Problem
Descriptive Stats (please compute these)
• Max, Min
• Quartiles
• Mean
• Variance
• Mode
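A minimal sketch of these statistics in plain JavaScript follows. The `pageviews` numbers are invented for illustration, the quartiles use linear interpolation between order statistics (one of several common conventions), and the variance is the sample variance.

```javascript
// Sort a copy numerically so the caller's array is left untouched.
function sorted(values) {
  return values.slice().sort(function (a, b) { return a - b; });
}

// Quantile by linear interpolation between neighbouring order statistics.
function quantile(values, q) {
  var xs = sorted(values);
  var pos = (xs.length - 1) * q;
  var lo = Math.floor(pos);
  var hi = Math.ceil(pos);
  return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo);
}

function mean(values) {
  var sum = values.reduce(function (acc, x) { return acc + x; }, 0);
  return sum / values.length;
}

// Sample variance (divides by n - 1).
function variance(values) {
  var m = mean(values);
  var ss = values.reduce(function (acc, x) { return acc + (x - m) * (x - m); }, 0);
  return ss / (values.length - 1);
}

// Most frequent value; on a tie, the value that reached the top count first wins.
function mode(values) {
  var counts = {};
  var best = values[0];
  values.forEach(function (x) {
    counts[x] = (counts[x] || 0) + 1;
    if (counts[x] > counts[best]) { best = x; }
  });
  return best;
}

// Hypothetical daily pageviews for one week.
var pageviews = [120, 95, 130, 95, 160, 110, 95];

var summary = {
  min: Math.min.apply(null, pageviews),
  max: Math.max.apply(null, pageviews),
  q1: quantile(pageviews, 0.25),
  median: quantile(pageviews, 0.5),
  q3: quantile(pageviews, 0.75),
  mean: mean(pageviews),
  variance: variance(pageviews),
  mode: mode(pageviews)
};
console.log(summary);
```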
Web Stats are Easy
• A user converts or not... what are the chances of that? p or q (= 1 - p)
• 3 users convert or not... what are the chances of that? p^3 or (p^2)q or p(q^2) or q^3
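The terms above are the probabilities of one specific sequence of outcomes. If only the number of conversions matters, not which user converted, each term is multiplied by the number of orderings, which gives the binomial distribution. A small sketch, assuming p = 0.1 purely for illustration:

```javascript
// Number of ways to choose k items out of n.
function choose(n, k) {
  var result = 1;
  for (var i = 1; i <= k; i++) {
    result = result * (n - k + i) / i;
  }
  return result;
}

// Probability that exactly k of n independent users convert,
// each converting with probability p.
function binomialPmf(k, n, p) {
  return choose(n, k) * Math.pow(p, k) * Math.pow(1 - p, n - k);
}

var p = 0.1; // assumed conversion probability
var distribution = [0, 1, 2, 3].map(function (k) {
  return binomialPmf(k, 3, p);
});
console.log(distribution); // roughly q^3, 3p(q^2), 3(p^2)q, p^3
```

The four probabilities sum to 1, which is a quick sanity check on any hand-rolled version.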
Hypothetical Worlds: Trials = 100, Size = 100, p = 0.1
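One way to read this slide is as a simulation: 100 hypothetical worlds, each with 100 visitors who convert with probability 0.1. The sketch below follows that reading. The seeded generator (mulberry32) and the seed value are my own choices so that runs are repeatable; they are not from the talk.

```javascript
// Small seeded PRNG (mulberry32) returning floats in [0, 1).
// Math.random would also work but cannot be seeded.
function mulberry32(seed) {
  var a = seed >>> 0;
  return function () {
    a = (a + 0x6D2B79F5) >>> 0;
    var t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Run `trials` experiments of `size` visitors each and return the
// observed conversion rate of every experiment.
function simulate(trials, size, p, random) {
  var rates = [];
  for (var t = 0; t < trials; t++) {
    var conversions = 0;
    for (var i = 0; i < size; i++) {
      if (random() < p) { conversions++; }
    }
    rates.push(conversions / size);
  }
  return rates;
}

var rates = simulate(100, 100, 0.1, mulberry32(2013));
var average = rates.reduce(function (acc, r) { return acc + r; }, 0) / rates.length;
console.log('lowest rate :', Math.min.apply(null, rates));
console.log('highest rate:', Math.max.apply(null, rates));
console.log('average rate:', average.toFixed(3));
```

The spread between the lowest and highest rate shows how much a single sample of 100 visitors can wander from the true p.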
Back to real world. Stats: p-bar = 0.08, SE = 0.027
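The standard error quoted here is consistent with the usual formula for a proportion, sqrt(p-bar * (1 - p-bar) / n), if n is taken to be the sample size of 100 from the previous slide. That n is an assumption on my part; the slide does not state it.

```javascript
// Standard error of an observed proportion pBar from a sample of size n.
function standardError(pBar, n) {
  return Math.sqrt(pBar * (1 - pBar) / n);
}

var se = standardError(0.08, 100); // n = 100 is assumed
console.log(se.toFixed(3)); // prints "0.027"
```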
So then AB Testing (500 Trials)
         A              B
p        0.1            0.2
SE       0.01           0.01
95% CI   0.1 +/- 0.02   0.2 +/- 0.02
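The comparison can be sketched with the normal approximation for a proportion. I assume 500 trials per variant and z = 1.96 for a 95% interval. Computed this way, the standard errors come out somewhat larger than the rounded figures on the slide, but the conclusion is the same: the two intervals do not overlap.

```javascript
// Normal-approximation confidence interval for a proportion p
// observed over n trials, using critical value z.
function proportionInterval(p, n, z) {
  var se = Math.sqrt(p * (1 - p) / n);
  return { se: se, low: p - z * se, high: p + z * se };
}

var Z_95 = 1.96; // critical value for a two-sided 95% interval

var a = proportionInterval(0.1, 500, Z_95); // variant A, n assumed
var b = proportionInterval(0.2, 500, Z_95); // variant B, n assumed

// If the intervals do not overlap, B's higher rate is unlikely to be noise.
var overlap = a.high >= b.low;

console.log('A:', a.low.toFixed(3), 'to', a.high.toFixed(3));
console.log('B:', b.low.toFixed(3), 'to', b.high.toFixed(3));
console.log('intervals overlap:', overlap);
```

Non-overlapping intervals are a conservative check; a dedicated two-proportion test would be the stricter follow-up.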
The Engineering Problem
Build Data Pipelines
• Repeatable Flows of Data
• Handles Initial Analysis For You
• Literate Programming
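A pipeline in this sense can be as small as a list of named steps applied in order, so the same flow runs identically on every new batch of data. The sketch below is one possible shape, not the speaker's setup; the record format `{ page, converted }` and all step names are hypothetical.

```javascript
// Build a function that runs each step in order, feeding the output
// of one step into the next.
function pipeline(steps) {
  return function (input) {
    return steps.reduce(function (data, step) { return step(data); }, input);
  };
}

// Step 1: drop records that lack a page path.
function dropMalformed(rows) {
  return rows.filter(function (r) { return typeof r.page === 'string'; });
}

// Step 2: count visits and conversions per page.
function countByPage(rows) {
  var byPage = {};
  rows.forEach(function (r) {
    var stats = byPage[r.page] || (byPage[r.page] = { visits: 0, conversions: 0 });
    stats.visits += 1;
    if (r.converted) { stats.conversions += 1; }
  });
  return byPage;
}

// Step 3: the "initial analysis": a conversion rate per page.
function addRates(byPage) {
  Object.keys(byPage).forEach(function (page) {
    byPage[page].rate = byPage[page].conversions / byPage[page].visits;
  });
  return byPage;
}

var summarize = pipeline([dropMalformed, countByPage, addRates]);

var report = summarize([
  { page: '/pricing', converted: true },
  { page: '/pricing', converted: false },
  { page: '/blog', converted: false },
  { converted: true } // malformed: no page, dropped by step 1
]);
console.log(report);
```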
Programming For Data Analysis
• Scripting good for Discovery
• Larger Jobs need Types
• Mapping high dimensional space to lower dimensional space... then add
Where x is visualization...
Visualization Types
• Distributions
• Comparisons
• Time Series
• Other (Match Domain)
Distributions
• Single Variable: Histograms
• Multiple Variables: Scatter plot
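For the single-variable case, a histogram is just a count of values per equal-width bin. A rough sketch with a text rendering follows; the `pagesPerVisit` data and the bin count are invented for illustration.

```javascript
// Count values into `bins` equal-width buckets spanning [min, max].
// The maximum value is placed in the last bucket.
function histogram(values, bins) {
  var min = Math.min.apply(null, values);
  var max = Math.max.apply(null, values);
  var width = (max - min) / bins || 1; // guard against all-equal input
  var counts = [];
  for (var b = 0; b < bins; b++) { counts.push(0); }
  values.forEach(function (x) {
    var i = Math.min(Math.floor((x - min) / width), bins - 1);
    counts[i] += 1;
  });
  return { min: min, width: width, counts: counts };
}

// Hypothetical pages viewed per visit.
var pagesPerVisit = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9];
var hist = histogram(pagesPerVisit, 4);

// Render one row of '#' marks per bin.
hist.counts.forEach(function (count, i) {
  var start = hist.min + i * hist.width;
  var end = start + hist.width;
  console.log(start + ' to ' + end + ': ' + '#'.repeat(count));
});
```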
Comparisons: Categorical Variables
Time Series: X Axis is Time
Match Domain with Analysis
Where x is storytelling...
Storytelling
3 Temporal Stages
1. What happened
2. What is happening
3. What will happen (Plus a tease)
Start With the Simple Stuff
Friday   Saturday   Sunday
40º      42º        (Why do I live in KC)º
Build Up to Complex Idea
Thanks... Questions?