GitHub
John Nunemaker
MongoChicago 2012
November 12, 2012
MongoDB for Analytics
A loving conversation with @jnunemaker
Slide 2
Background
How hernias can be good for you
Slide 5
1 month
Of evenings and weekends
Slide 6
18 months
Since public launch
Slide 7
10-15 Million
Page views per day
Slide 8
2.7 Billion
Page views to date
Slide 9
13 tiny servers
2 web, 6 app, 3 db, 2 queue
Slide 10
requests/sec
Slide 11
ops/sec
Slide 12
cpu %
Slide 13
lock %
Slide 14
Implementation
How we do what we do
Slide 15
Doing It (mostly) Live
No aggregate querying
Slide 18
get('/track.gif') do
  track_service.record(...)
  TrackGif
end
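A minimal sketch of what a tracking-pixel endpoint like the one above might do, written as a plain Rack-style callable so it runs without Sinatra. Everything here is an assumption for illustration: `TRACK_GIF` being a 1x1 transparent GIF, `FakeTrackService`, `build_track_app`, the `site` parameter, and the idea that hit attributes arrive in the query string. The slide elides the real arguments.

```ruby
require 'uri'

# Assumption: the slide's TrackGif is the bytes of a 1x1 transparent GIF.
# unpack1('m') decodes base64 using only core Ruby.
TRACK_GIF = 'R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7'
            .unpack1('m').freeze

# Hypothetical stand-in for the slide's track_service: it only collects hits.
class FakeTrackService
  attr_reader :recorded

  def initialize
    @recorded = []
  end

  def record(attrs)
    @recorded << attrs
  end
end

# Builds a Rack-compatible app (any object responding to #call(env)).
def build_track_app(track_service)
  lambda do |env|
    # Decode the query string into a plain Hash of hit attributes.
    attrs = URI.decode_www_form(env.fetch('QUERY_STRING', '')).to_h
    track_service.record(attrs)

    headers = {
      'content-type'  => 'image/gif',
      'cache-control' => 'no-store' # ask browsers not to cache the pixel
    }
    [200, headers, [TRACK_GIF]]
  end
end

service = FakeTrackService.new
app = build_track_app(service)
status, headers, body = app.call('QUERY_STRING' => 'site=abc&screenx=1280')
```

The handler does no database work itself. It hands the attributes off and returns the image, which keeps the request path short.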
Slide 19
class TrackService
  def record(attrs)
    message = MessagePack.pack(attrs)
    @client.set(@queue, message)
  end
end
Slide 20
class TrackProcessor
  def run
    loop { process }
  end

  def process
    record @client.get(@queue)
  end

  def record(message)
    attrs = MessagePack.unpack(message)
    Hit.record(attrs)
  end
end
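The TrackService and TrackProcessor slides together describe a producer/consumer pipeline. The sketch below wires the two ends together in one process so the flow can be followed end to end. It is not the production setup. `InMemoryQueueClient` is a hypothetical stand-in for the Kestrel client, JSON is swapped in for MessagePack so only the standard library is needed, `drain` replaces the infinite `loop`, and the `'hits'` queue name is made up.

```ruby
require 'json'

# Stand-in for the queue client: same set/get shape as on the slides,
# backed by one in-process Queue per queue name.
class InMemoryQueueClient
  def initialize
    @queues = Hash.new { |hash, name| hash[name] = Queue.new }
  end

  def set(queue, message)
    @queues[queue] << message
  end

  # Non-blocking read; returns nil when the queue is empty.
  def get(queue)
    @queues[queue].pop(true)
  rescue ThreadError
    nil
  end
end

class TrackService
  def initialize(client, queue)
    @client = client
    @queue = queue
  end

  def record(attrs)
    @client.set(@queue, JSON.generate(attrs)) # JSON in place of MessagePack
  end
end

class TrackProcessor
  attr_reader :processed

  def initialize(client, queue)
    @client = client
    @queue = queue
    @processed = []
  end

  # Drains the queue once instead of looping forever, so the sketch ends.
  def drain
    while (message = @client.get(@queue))
      record(message)
    end
  end

  def record(message)
    attrs = JSON.parse(message)
    @processed << attrs # the slides call Hit.record(attrs) at this point
  end
end

client = InMemoryQueueClient.new
TrackService.new(client, 'hits').record({ 'site' => 'abc', 'screenx' => 1280 })
processor = TrackProcessor.new(client, 'hits')
processor.drain
```

Putting a queue between the web request and the database writes lets the endpoint respond quickly even when the processor falls behind.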
Slide 21
http://bit.ly/rt-kestrel
Slide 22
class Hit
  def record
    site.atomic_update(site_updates)
    Resolution.record(self)
    Technology.record(self)
    Location.record(self)
    Referrer.record(self)
    Content.record(self)
    Search.record(self)
    Notification.record(self)
    View.record(self)
  end
end
Slide 23
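class Resolution
  # Class-level method, since Hit#record calls Resolution.record(self).
  def self.record(hit)
    query  = {'_id' => "..."}
    update = {'$inc' => {}}
    # One counter per observed screen width, browser width, and browser height.
    update['$inc']["sx.#{hit.screenx}"] = 1
    update['$inc']["bx.#{hit.browserx}"] = 1
    update['$inc']["by.#{hit.browsery}"] = 1
    # Upsert: the first hit creates the document, later hits increment it.
    collection(hit.created_on)
      .update(query, update, :upsert => true)
  end
end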
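To make the effect of the `$inc` upsert above concrete, here is a small in-memory emulation in plain Ruby. It does not talk to MongoDB. `resolution_update` and `apply_upsert!` are hypothetical helpers, and the `_id` value is invented because the slide elides the real one. The sketch only mirrors two behaviours of an upserted `$inc`: dotted keys address nested fields, and missing documents or fields start from zero.

```ruby
# Builds the same kind of $inc document as the slide, from plain values.
def resolution_update(screenx:, browserx:, browsery:)
  {
    '$inc' => {
      "sx.#{screenx}" => 1,
      "bx.#{browserx}" => 1,
      "by.#{browsery}" => 1
    }
  }
end

# Applies an $inc update to a Hash acting as the collection, creating the
# document and any nested fields on first use (the upsert behaviour).
def apply_upsert!(collection, id, update)
  doc = (collection[id] ||= { '_id' => id })
  update.fetch('$inc').each do |path, amount|
    *parents, leaf = path.split('.')
    target = parents.reduce(doc) { |node, key| node[key] ||= {} }
    target[leaf] = target.fetch(leaf, 0) + amount
  end
  doc
end

collection = {}
id = 'site-1:2012-11-12' # hypothetical _id; the slide elides the real one

# Three hits from one screen size, then one from another.
small = resolution_update(screenx: 1280, browserx: 1200, browsery: 700)
large = resolution_update(screenx: 1920, browserx: 1200, browsery: 900)
3.times { apply_upsert!(collection, id, small) }
apply_upsert!(collection, id, large)

doc = collection[id]
# doc['sx'] is now { '1280' => 3, '1920' => 1 }
```

Each distinct value becomes a key with a running count, so a report reads one document. The raw hits themselves are not kept.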
Slide 28
Pros
Space
RAM
Reads
Live
Slide 33
Cons
Writes
Constraints
More Forethought
No raw data