mongodb + ex.fm @ MongoPGH 2012

mongo + ex.fm Lucas Hrabovsky CTO #MongoPGH

ex.fm turns websites into CDs

browser extensions

_id and indexes
•  Bad Ideas
   – ObjectId("4fb284…")
   – Big Compound Indexes
   – Long, Variable Width Strings Miss Indexes
•  Good Ideas
   – Make _id mean something
   – Fixed Width Hashes
   – Use _id as a compound index
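A minimal sketch of the "Good Ideas" list, assuming Node.js and MD5 as the fixed-width hash (the slides do not say which hash ex.fm used). The helper names `fixedWidthKey` and `makeId` and the example URLs are hypothetical. The point is only that a long, variable-width string collapses to a key of constant width, and that appending a timestamp makes the `_id` carry meaning.

```javascript
const crypto = require('crypto');

// Hypothetical helper: hash a variable-length string (for example a site URL)
// down to a fixed-width, 32-character hex key. MD5 is an assumption here.
function fixedWidthKey(value) {
  return crypto.createHash('md5').update(value).digest('hex');
}

// Hypothetical helper: build an _id of the form "<hash>-<unix seconds>",
// so the _id alone identifies both the owner and the creation time.
function makeId(siteUrl, date) {
  const seconds = Math.floor(date.getTime() / 1000);
  return fixedWidthKey(siteUrl) + '-' + seconds;
}

const when = new Date(1337029114000);
const shortId = makeId('http://a.example/', when);
const longId = makeId('http://a.example/' + 'x'.repeat(500), when);

// Both ids have the same width no matter how long the input URL was.
console.log(shortId.length === longId.length);
```

Because every key has the same width and a shared prefix per owner, a single index on `_id` can stand in for a separate compound index on (owner, created).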

activity feeds: first attempt
db.user.feed.find({'username': 'lucas', 'verb': 'love'}).sort({'created': -1})
{"_id": "201109122304-lucas-dan-c7dede43…", "username": "lucas", "created": 201109122304, "actor": "dan", "verb": "love"}
Working just fine for 4MM documents, but getting slow…

new version of activity feeds
db.user.feed.find({'vid': /^lucas-/}).sort({'vid': -1})
{"_id": "201109122304-lucas-dan-c7dede43…", "uid": "lucas-201109122304", "vid": "lucas-love-201109122304", "actor": "dan"}
Fast for all 3 use cases!
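The composite keys above can be sketched without a database. This is an in-memory stand-in, not the ex.fm implementation: `feedKeys` and `findByPrefix` are hypothetical helpers, and the extra `follow` entry and the actor `amy` are made-up sample data. It shows how one string field answers "filter by user (and verb), newest first" with a single prefix match and a single sort.

```javascript
// Hypothetical helper: derive the composite string keys shown on the slide
// from a plain activity record.
function feedKeys(entry) {
  return {
    uid: entry.username + '-' + entry.created,
    vid: entry.username + '-' + entry.verb + '-' + entry.created,
  };
}

// In-memory stand-in for find({field: /^prefix/}).sort({field: -1}).
function findByPrefix(docs, field, prefix) {
  return docs
    .filter((doc) => doc[field].startsWith(prefix))
    .sort((a, b) => (a[field] < b[field] ? 1 : a[field] > b[field] ? -1 : 0));
}

// Sample data; only the first entry comes from the slide.
const entries = [
  { username: 'lucas', created: 201109122304, actor: 'dan', verb: 'love' },
  { username: 'lucas', created: 201109122310, actor: 'dan', verb: 'follow' },
  { username: 'lucas', created: 201109122315, actor: 'amy', verb: 'love' },
  { username: 'dan', created: 201109122320, actor: 'lucas', verb: 'love' },
];
const docs = entries.map((e) => Object.assign({ actor: e.actor }, feedKeys(e)));

// Everything lucas's feed holds for one verb, newest first.
const lucasLoves = findByPrefix(docs, 'vid', 'lucas-love-');
// Everything in lucas's feed regardless of verb, newest first.
const lucasAll = findByPrefix(docs, 'uid', 'lucas-');

console.log(lucasLoves.map((d) => d.vid));
console.log(lucasAll.map((d) => d.uid));
```

In MongoDB, a case-sensitive regex anchored with `^` can be served as a range scan on an index over that field, which is why the prefix form stays fast where the multi-field query plus sort did not.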

removing indexes pays off
Don’t need to buy more/bigger machines!

sites! sites! sites!

padding factor
•  Variable document size
•  Allocate for the latest and fattest
•  Document moves
•  Can be very inefficient
•  More RAM!
•  Pre-allocate to prevent moves

unbounded embedded lists
•  Useful for followers, favorites
•  Good for a few things, bad for lots
•  Constantly bumping up padding factor
•  Lots of document moves

a metaphor
•  You run a coffee shop and can buy only one size of cup. Which size do you buy?
•  On average, each customer has only one cup
•  Heavy drinkers have hundreds of cups
credit: Macintex macintex.deviantart.com

bucketing!
•  Split list across multiple documents
•  Median number of items = bucket size
•  Pre-allocate
•  Easy seeking and traversal
•  Much faster
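The bucketing idea can be sketched in plain JavaScript. Everything named here is an assumption for illustration: the `"<owner>-<bucket number>"` `_id` format, the helper names, and the sample list lengths. Pre-allocation is left out because it depends on the storage engine.

```javascript
// Median of a non-empty list of numbers; used here to pick a bucket size.
function median(values) {
  const sorted = values.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2
    ? sorted[mid]
    : Math.ceil((sorted[mid - 1] + sorted[mid]) / 2);
}

// Split one owner's list into bucket documents of at most bucketSize items.
function toBuckets(owner, items, bucketSize) {
  if (bucketSize < 1) throw new Error('bucketSize must be at least 1');
  const buckets = [];
  for (let i = 0; i < items.length; i += bucketSize) {
    buckets.push({
      _id: owner + '-' + (i / bucketSize + 1),
      items: items.slice(i, i + bucketSize),
    });
  }
  return buckets;
}

// Seek straight to the bucket that holds the item at a 0-based position.
function bucketIdFor(owner, position, bucketSize) {
  return owner + '-' + (Math.floor(position / bucketSize) + 1);
}

// Made-up list lengths: most owners have a few items, one has hundreds.
const listLengths = [1, 2, 3, 4, 250];
const bucketSize = median(listLengths);
const buckets = toBuckets('site1', [10, 11, 12, 13, 14, 15, 16], bucketSize);

console.log(bucketSize, buckets.map((b) => b._id));
console.log(bucketIdFor('site1', 4, bucketSize));
```

Sizing buckets by the median rather than the maximum means the typical owner fits in one bucket, while heavy owners spill into more buckets instead of forcing every document to be padded for the worst case.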

hey charts!
[Chart: storage layout for site.meta 1, site.songs 1, site.songs 2, site.meta 2. Legend: allocated and unused; allocated and full of data.]

same charts when using bucketing
[Chart: storage layout for site.meta 1, site.songs 1-1, site.songs 1-2, site.songs 2-1 through site.songs 2-6, site.meta 2. Legend: allocated and unused; allocated and full of data.]

doesn’t work for everything…
•  Picking right bucket size
•  Defragging
•  Random insertion
   – Easy for things you don’t much care about the order of
   – More difficult if you’re going to insert and change the order later

micro documents
db.site.songs.find({_id: /^bfc25de08d964a8a41226c6016dd7753-/}).sort({_id: -1})
{ "_id" : "bfc25de08d964a8a41226c6016dd7753-1337029114", "s" : 18436532 }
{ "_id" : "bfc25de08d964a8a41226c6016dd7753-1337029113", "s" : 18804590 }
{ "_id" : "bfc25de08d964a8a41226c6016dd7753-1337029112", "s" : 18804591 }
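The prefix query on `_id` amounts to a range over sorted strings, which the following in-memory sketch makes explicit. `prefixRange` and `findSongs` are hypothetical helpers, and the all-zero site hash is made-up data added to show that other sites fall outside the range; the three `bfc25de0…` documents are the ones from the slide.

```javascript
// Hypothetical helper: the half-open string range [lower, upper) that covers
// every key starting with `prefix`. Assumes a non-empty ASCII prefix.
function prefixRange(prefix) {
  const last = prefix.charCodeAt(prefix.length - 1);
  return {
    lower: prefix,
    upper: prefix.slice(0, -1) + String.fromCharCode(last + 1),
  };
}

// In-memory stand-in for find({_id: /^<siteHash>-/}).sort({_id: -1}).
function findSongs(docs, siteHash) {
  const { lower, upper } = prefixRange(siteHash + '-');
  return docs
    .filter((doc) => doc._id >= lower && doc._id < upper)
    .sort((a, b) => (a._id < b._id ? 1 : a._id > b._id ? -1 : 0));
}

const site = 'bfc25de08d964a8a41226c6016dd7753';
const docs = [
  { _id: site + '-1337029112', s: 18804591 },
  { _id: '00000000000000000000000000000000-1337029115', s: 1 }, // made-up site
  { _id: site + '-1337029114', s: 18436532 },
  { _id: site + '-1337029113', s: 18804590 },
];

const songs = findSongs(docs, site);
console.log(songs.map((d) => d.s));
```

Sorting the strings in descending order yields newest first only while the timestamps have the same number of digits, which holds for 10-digit Unix seconds.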

paying it back
•  Bent mongoengine to make this easy
•  Follow github.com/exfm
•  Also added tooling to
   – Trace all queries
   – Aggregate tracing by request middleware
   – Raise exceptions when queries miss an index
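One way the last tooling item could work is to inspect a query's explain output and throw when it shows a full collection scan. The sketch below is not the ex.fm tooling (which was built on mongoengine, in Python); `hasCollectionScan` and `assertUsesIndex` are hypothetical names, and the explain documents are hand-written, heavily abbreviated samples. It checks both the older explain shape, where a top-level `cursor` of `"BasicCursor"` indicates no index, and the newer shape, where a `COLLSCAN` stage appears in `queryPlanner.winningPlan`.

```javascript
// Walk a query-plan tree and report whether any stage is a collection scan.
function hasCollectionScan(stage) {
  if (!stage || typeof stage !== 'object') return false;
  if (stage.stage === 'COLLSCAN') return true;
  const children = (stage.inputStages || []).concat(
    stage.inputStage ? [stage.inputStage] : []
  );
  return children.some(hasCollectionScan);
}

// Throw if the explain document shows that the query missed an index.
function assertUsesIndex(explain) {
  // Older explain shape: "BasicCursor" means no index was used.
  const legacyScan = explain.cursor === 'BasicCursor';
  // Newer explain shape: look for COLLSCAN anywhere in the winning plan.
  const winningPlan = explain.queryPlanner && explain.queryPlanner.winningPlan;
  if (legacyScan || hasCollectionScan(winningPlan)) {
    throw new Error('query missed an index');
  }
}

// Hand-written sample explain documents (abbreviated).
const indexed = { cursor: 'BtreeCursor vid_1' };
const unindexed = { cursor: 'BasicCursor' };
const nestedScan = {
  queryPlanner: {
    winningPlan: { stage: 'SORT', inputStage: { stage: 'COLLSCAN' } },
  },
};

assertUsesIndex(indexed); // passes silently
for (const sample of [unindexed, nestedScan]) {
  try {
    assertUsesIndex(sample);
  } catch (err) {
    console.log(err.message);
  }
}
```

Running a check like this in development and tests surfaces unindexed queries early, before they show up as slow requests in production.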

thanks! github.com/exfm
