Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
MongoDB for Analytics - John Nunemaker, Ordered...
Search
mongodb
November 01, 2011
Technology
7.7k
9
Share
Embed
Copy iframe code
Copy JS code
Copy link
Start on current slide
MongoDB for Analytics - John Nunemaker, Ordered List
MongoChicago 2011
mongodb
November 01, 2011
More Decks by mongodb
See All by mongodb
NoSQL Now! 2012
mongodb
18
3.4k
MongoDB 2.2 At the Silicon Valley MongoDB User Group
mongodb
9
1.4k
Turning off the LAMP Hunter Loftis, Skookum Digital Works
mongodb
2
1.6k
Mobilize Your MongoDB! Developing iPhone and Android Apps in the Cloud Grant Shipley, Red Hat
mongodb
0
570
Beanstalk Data - MongoDB In Production Chris Siefken, CTO Beanstalk Data
mongodb
0
570
New LINQ support in C#/.NET driver Robert Stam, 10gen
mongodb
9
41k
Welcome and Keynote Aaron Heckman, 10gen
mongodb
0
550
Webinar Introduction to MongoDB's Java Driver
mongodb
1
1.3k
Webinar Intro to Schema Design
mongodb
4
1.8k
Other Decks in Technology
See All in Technology
作って終わりにしない タイミーのセマンティックレイヤー育成の現在地
chanyou0311
3
2.1k
社内 AI エージェント Synapse と セマンティックレイヤーの育て方
hiroakis
2
1.6k
データサイエンスを価値につなげるプロジェクト設計 〜 DS一年目が現場で得た気づき 〜
ysd113
1
150
「速く作る」から「正しく作る」へ ─ 生成AI時代の開発フロー改革の ロードマップと実行 ─
starfish719
0
9.8k
失敗を資産に変えるClaude Code
shinyasaita
0
300
白金鉱業Meetup_Vol.24_「AIエージェントは分けるほど良い」は本当か? / Is it true that “the more you divide AI agents, the better”?
brainpadpr
1
270
10倍の生産性を実現するAI駆動並列エージェントのすべて
kumaiu
4
1.3k
生成 AI × MCP で切り拓く次世代 SRE!自律型運用への挑戦と開発者体験の進化
_awache
0
190
現地で盛り上がった WWDC26 Keynote
zozotech
PRO
1
170
AIソロプレナー時代に2ヶ月で20人増員した事業創造会社の開発組織の話
miyatakoji
0
570
フロンティアAIのゲート化と地政学リスク
nagatsu
0
110
Disciplined Vibes: Scaling AI-Assisted Engineering
sheharyar
0
130
Featured
See All Featured
Gemini Prompt Engineering: Practical Techniques for Tangible AI Outcomes
mfonobong
2
430
Impact Scores and Hybrid Strategies: The future of link building
tamaranovitovic
0
300
How GitHub (no longer) Works
holman
316
150k
Mind Mapping
helmedeiros
PRO
1
240
Bioeconomy Workshop: Dr. Julius Ecuru, Opportunities for a Bioeconomy in West Africa
akademiya2063
PRO
1
140
Site-Speed That Sticks
csswizardry
13
1.2k
The Straight Up "How To Draw Better" Workshop
denniskardys
239
140k
A Modern Web Designer's Workflow
chriscoyier
698
190k
Leading Effective Engineering Teams in the AI Era
addyosmani
9
2k
Ruling the World: When Life Gets Gamed
codingconduct
0
250
The Language of Interfaces
destraynor
162
27k
Why Our Code Smells
bkeepers
PRO
340
58k
Transcript
Ordered List John Nunemaker MongoChi 2011 October 18, 2011 MongoDB
for Analytics A loving conversation with @jnunemaker
Background As presented through interpretive dance
None
None
None
~1 month Of evenings and weekends
~4 dog years Since public launch
~6 tiny servers 2 web, 2 app, 2 db
~1-2 Million Page views per day
None
None
Implementation Imma show you how we do what we do
baby
Doing It Live No aggregate querying
get('/track.gif') do Hit.record(...) TrackGif end
class Hit def record site.atomic_update(site_updates) Resolution.record(self) Technology.record(self) Location.record(self) Referrer.record(self) Content.record(self)
Search.record(self) Notification.record(self) View.record(self) end end
class Resolution def record(hit) query = {'_id' => "..."} update
= {'$inc' => {}} update['$inc']["sx.#{hit.screenx}"] = 1 update['$inc']["bx.#{hit.browserx}"] = 1 update['$inc']["by.#{hit.browsery}"] = 1 collection(hit.created_on) .update(query, update, :upsert => true) end end end
Pros
Pros Space
Pros Space RAM
Pros Space RAM Reads
Pros Space RAM Reads Live
Cons
Cons Writes
Cons Writes Constraints
Cons Writes Constraints More Forethought
Cons Writes Constraints More Forethought No raw data
Time Frame Minute, hour, month, day, year, forever?
# of Variations One document vs many
Single Document Per Time Frame
None
{ "t" => 336381, "u" => 158951, "2011" => {
"02" => { "18" => { "t" => 9, "u" => 6 } } } }
{ '$inc' => { 't' => 1, 'u' => 1,
'2011.02.18.t' => 1, '2011.02.18.u' => 1, } }
Single Document For all ranges in time frame
None
{ "_id" =>"...:10", "bx" => { "320" => 85, "480"
=> 318, "800" => 1938, "1024" => 5033, "1280" => 6288, "1440" => 2323, "1600" => 3817, "2000" => 137 }, "by" => { "480" => 2205, "600" => 7359,
"600" => 7359, "768" => 4515, "900" => 3833, "1024"
=> 2026 }, "sx" => { "320" => 191, "480" => 179, "800" => 195, "1024" => 1059, "1280" => 5861, "1440" => 3533, "1600" => 7675, "2000" => 1279 } }
{ '$inc' => { 'sx.1440' => 1, 'bx.1280' => 1,
'by.768' => 1, } }
Many Documents Search terms, content, referrers...
None
[ { "_id" => "<oid>:<hash>", "t" => "ruby class variables",
"sid" => BSON::ObjectId('<oid>'), "v" => 352 }, { "_id" => "<oid>:<hash>", "t" => "ruby unless", "sid" => BSON::ObjectId('<oid>'), "v" => 347 }, ]
Writes {'_id' => "#{site_id}:#{hash}"}
Reads [['sid', 1], ['v', -1]]
Growth The best laid plans of mice and men
Partition Hot Data Currently using collections for time frames
Bigger, Faster Server More CPU, RAM, Disk Space
Users Sites Content Referrers Terms Engines Resolutions Locations Users Sites
Content Referrers Terms Engines Resolutions Locations
Partition by Function Spread writes across a few servers
Users Sites Content Referrers Terms Engines Resolutions Locations
Partition by Server Spread writes across a ton of servers,
way down the road, not worried yet
Ordered List Thank you!
[email protected]
John Nunemaker MongoChi 2011 October
18, 2011 @jnunemaker