Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
MongoDB for Analytics - John Nunemaker, Ordered...
Search
Sponsored
·
Your Podcast. Everywhere. Effortlessly.
Share. Educate. Inspire. Entertain. You do you. We'll handle the rest.
→
mongodb
November 01, 2011
Technology
7.7k
9
Share
MongoDB for Analytics - John Nunemaker, Ordered List
MongoChicago 2011
mongodb
November 01, 2011
More Decks by mongodb
See All by mongodb
NoSQL Now! 2012
mongodb
18
3.4k
MongoDB 2.2 At the Silicon Valley MongoDB User Group
mongodb
9
1.4k
Turning off the LAMP Hunter Loftis, Skookum Digital Works
mongodb
2
1.6k
Mobilize Your MongoDB! Developing iPhone and Android Apps in the Cloud Grant Shipley, Red Hat
mongodb
0
560
Beanstalk Data - MongoDB In Production Chris Siefken, CTO Beanstalk Data
mongodb
0
560
New LINQ support in C#/.NET driver Robert Stam, 10gen
mongodb
9
41k
Welcome and Keynote Aaron Heckman, 10gen
mongodb
0
540
Webinar Introduction to MongoDB's Java Driver
mongodb
1
1.3k
Webinar Intro to Schema Design
mongodb
4
1.8k
Other Decks in Technology
See All in Technology
AIを「創る」と「使う」の循環 — HRテックが実践するリアルなAI組織実装
taketo957
0
1.1k
マーケットプレイス版Oracle WebCenter Content For OCI
oracle4engineer
PRO
5
1.8k
AIプラットフォームを運用し続けるための可観測性
tanimuyk
4
1.1k
Spring AI × MCP 入門〜AIエージェントへのツール公開、境界設計から始める最小構成 〜
yuyamiyamoto
0
210
製造業のクラウド活用最適解〜AI,DXを加速するデータ基盤の作り方〜
hamadakoji
0
320
AI-DLCを活用した高品質・安全なAI駆動開発実践 / AI Driven Development
yoshidashingo
1
320
ポスター発表&デモと総括 / Poster Presentations & Demonstrations and Summary
ks91
PRO
0
190
Oracle AI Database@AWS:サービス概要のご紹介
oracle4engineer
PRO
4
2.8k
イベントストーミングとKiroの仕様駆動開発で実現する要件の認識合わせプロセス
syobochim
7
1.1k
速さだけじゃない! VoidZero ツールが移行先に選ばれる理由
mizdra
PRO
6
730
long-running-tasks
cipepser
3
460
Platform Engineering as a Product: Criteria for Improvement and Multi-Tenant Design
kumorn5s
0
490
Featured
See All Featured
How to Create Impact in a Changing Tech Landscape [PerfNow 2023]
tammyeverts
55
3.4k
Efficient Content Optimization with Google Search Console & Apps Script
katarinadahlin
PRO
1
590
How to Talk to Developers About Accessibility
jct
2
220
Lightning Talk: Beautiful Slides for Beginners
inesmontani
PRO
2
570
Writing Fast Ruby
sferik
630
63k
Have SEOs Ruined the Internet? - User Awareness of SEO in 2025
akashhashmi
0
360
Building Applications with DynamoDB
mza
96
7.1k
Build The Right Thing And Hit Your Dates
maggiecrowley
39
3.2k
The Success of Rails: Ensuring Growth for the Next 100 Years
eileencodes
47
8.2k
HDC tutorial
michielstock
2
690
We Analyzed 250 Million AI Search Results: Here's What I Found
joshbly
1
1.3k
Being A Developer After 40
akosma
91
590k
Transcript
Ordered List John Nunemaker MongoChi 2011 October 18, 2011 MongoDB
for Analytics A loving conversation with @jnunemaker
Background As presented through interpretive dance
None
None
None
~1 month Of evenings and weekends
~4 dog years Since public launch
~6 tiny servers 2 web, 2 app, 2 db
~1-2 Million Page views per day
None
None
Implementation Imma show you how we do what we do
baby
Doing It Live No aggregate querying
get('/track.gif') do Hit.record(...) TrackGif end
class Hit def record site.atomic_update(site_updates) Resolution.record(self) Technology.record(self) Location.record(self) Referrer.record(self) Content.record(self)
Search.record(self) Notification.record(self) View.record(self) end end
class Resolution def record(hit) query = {'_id' => "..."} update
= {'$inc' => {}} update['$inc']["sx.#{hit.screenx}"] = 1 update['$inc']["bx.#{hit.browserx}"] = 1 update['$inc']["by.#{hit.browsery}"] = 1 collection(hit.created_on) .update(query, update, :upsert => true) end end end
Pros
Pros Space
Pros Space RAM
Pros Space RAM Reads
Pros Space RAM Reads Live
Cons
Cons Writes
Cons Writes Constraints
Cons Writes Constraints More Forethought
Cons Writes Constraints More Forethought No raw data
Time Frame Minute, hour, month, day, year, forever?
# of Variations One document vs many
Single Document Per Time Frame
None
{ "t" => 336381, "u" => 158951, "2011" => {
"02" => { "18" => { "t" => 9, "u" => 6 } } } }
{ '$inc' => { 't' => 1, 'u' => 1,
'2011.02.18.t' => 1, '2011.02.18.u' => 1, } }
Single Document For all ranges in time frame
None
{ "_id" =>"...:10", "bx" => { "320" => 85, "480"
=> 318, "800" => 1938, "1024" => 5033, "1280" => 6288, "1440" => 2323, "1600" => 3817, "2000" => 137 }, "by" => { "480" => 2205, "600" => 7359,
"600" => 7359, "768" => 4515, "900" => 3833, "1024"
=> 2026 }, "sx" => { "320" => 191, "480" => 179, "800" => 195, "1024" => 1059, "1280" => 5861, "1440" => 3533, "1600" => 7675, "2000" => 1279 } }
{ '$inc' => { 'sx.1440' => 1, 'bx.1280' => 1,
'by.768' => 1, } }
Many Documents Search terms, content, referrers...
None
[ { "_id" => "<oid>:<hash>", "t" => "ruby class variables",
"sid" => BSON::ObjectId('<oid>'), "v" => 352 }, { "_id" => "<oid>:<hash>", "t" => "ruby unless", "sid" => BSON::ObjectId('<oid>'), "v" => 347 }, ]
Writes {'_id' => "#{site_id}:#{hash}"}
Reads [['sid', 1], ['v', -1]]
Growth The best laid plans of mice and men
Partition Hot Data Currently using collections for time frames
Bigger, Faster Server More CPU, RAM, Disk Space
Users Sites Content Referrers Terms Engines Resolutions Locations Users Sites
Content Referrers Terms Engines Resolutions Locations
Partition by Function Spread writes across a few servers
Users Sites Content Referrers Terms Engines Resolutions Locations
Partition by Server Spread writes across a ton of servers,
way down the road, not worried yet
Ordered List Thank you!
[email protected]
John Nunemaker MongoChi 2011 October
18, 2011 @jnunemaker