Practical Tips from 2 Years of Growing on MongoDB

Juan Patten – Rafflecopter

I’m J.R.
www.Rafflecopter.com
@rafflecopter
@runningskull

What We Talkin’ Bout?
• Practical lessons learned from 3 iterations of schema design
• A few clever(?) tricks
• Our approach to no-downtime schema migration

Mongo Is...
“Web-scale for dummies!”
“Schema-free, man. w00t!”
structured data == schema! You still have to think!

Rafflecopter: Users, Raffles, Entries, Entrants

Schema #1
Raffle = {
  _id: <string>
  [ ... ]
  entries: [ {...}, {...}, ... ]
}
LESSONS
• Tough to access
• Padding Factor
• Max Document Size

Schema #2
• “Denormalize!”
• Each entry in own document
• _id is UUID()
• Indexes for each access pattern
• Documents (almost) never grow
• Cache results of expensive queries

Schema #2 – Lessons
• Complex queries for simple things
• Indexes gigantic - killed performance
• Old data “lingered”
• _id unused

Schema #3 – Goals
• Minimal indexes
• One document per entrant per raffle
• _id derivable from known data
• Old data should “expire” naturally

ObjectID’s – A (not so) Secret Weapon
24-character hex string → 12-byte binary value
new ObjectId() or new ObjectId("47cc67093475061e3d95369d")
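
To make the string-versus-binary size difference concrete, here is a small Node.js sketch that needs no MongoDB driver. The helper name `objectIdHexToBytes` is made up for illustration; it only shows that the hex form shown on the slide decodes to 12 raw bytes.

```javascript
// Hypothetical helper: convert the hex form of an ObjectId into the
// 12 raw bytes that are actually stored.
function objectIdHexToBytes(hex) {
  // An ObjectId's hex form is exactly 24 hex characters (2 per byte).
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected a 24-character hex string");
  }
  return Buffer.from(hex, "hex");
}

const bytes = objectIdHexToBytes("47cc67093475061e3d95369d");
console.log(bytes.length); // 12
```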

ObjectID’s – A (not so) Secret Weapon
B-TREES

ObjectID’s – A (not so) Secret Weapon
_id = UUID() [diagram: RAM]

ObjectID’s – A (not so) Secret Weapon
_id = new ObjectId() [diagram: RAM]

ObjectID’s – A (not so) Secret Weapon
_id = new ObjectId()
• timestamp (seconds since epoch)
• “misc” ID info (machine_id | pid | incr): not derivable later!
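
The layout above can be checked by hand. This is a minimal sketch, assuming only that the first 4 bytes of an ObjectId are a big-endian seconds-since-epoch timestamp; the function name `splitObjectId` is hypothetical.

```javascript
// Hypothetical helper: split an ObjectId's 24-character hex form into
// its timestamp and the remaining "misc" bytes.
function splitObjectId(hex) {
  // First 4 bytes (8 hex characters): seconds since the Unix epoch.
  const seconds = parseInt(hex.slice(0, 8), 16);
  return {
    createdAt: new Date(seconds * 1000),
    // Remaining 8 bytes (16 hex characters): the "misc" ID info.
    misc: hex.slice(8),
  };
}

const parts = splitObjectId("47cc67093475061e3d95369d");
console.log(parts.createdAt.toISOString(), parts.misc);
```

The timestamp half can be recomputed from data you already know; the "misc" half of a default ObjectId cannot.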

ObjectID’s – A (not so) Secret Weapon
_id = new ObjectId(X)
X = “1234567890abcdefabcdef”
• timestamp (raffle_date_created)
• “misc” ID info: md5(entrant_id | raffle_id | salt)
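
A rough sketch of how such an X could be built in Node.js follows. The function name, the argument names, the `|` separator, and the choice to keep only the first 8 bytes of the md5 digest (so the result fits in 12 bytes) are all assumptions; the slides do not say how the hash was truncated.

```javascript
const crypto = require("crypto");

// Hypothetical helper: build a deterministic 24-character hex _id.
//   bytes 0-3  : raffle creation time, in seconds since the epoch
//   bytes 4-11 : first 8 bytes of md5(entrant_id | raffle_id | salt)
// Truncating the md5 to 8 bytes is an assumption, not the author's stated method.
function derivedIdHex(raffleCreatedAt, entrantId, raffleId, salt) {
  const seconds = Math.floor(raffleCreatedAt.getTime() / 1000);
  const timestampHex = seconds.toString(16).padStart(8, "0");
  const hashHex = crypto
    .createHash("md5")
    .update(`${entrantId}|${raffleId}|${salt}`)
    .digest("hex")
    .slice(0, 16);
  return timestampHex + hashHex;
}

const raffleCreated = new Date("2012-06-01T00:00:00Z");
const id = derivedIdHex(raffleCreated, "entrant-42", "raffle-7", "example-salt");
console.log(id.length); // 24
```

Because the same inputs always give the same hex string, the application can rebuild the `_id` and pass it to `new ObjectId(...)` for a direct lookup instead of querying a secondary index.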

Schema #3 – Results
✓ Minimal indexes (5 large → 2 small)
✓ One document per entrant per raffle
✓ _id derivable from known data
✓ Old data should “expire” naturally
• write lock % cut by 9x
• page faults cut by 10x
• open cursor size cut by 5x

Schema Takeaways
• access patterns == indexes. design the schema around them
• save queries and index space by deriving ID’s from known data
• denormalize in moderation

Painless Schema Migration
{
  [ ... ]
  _legacy: [ 'flag_a', 'flag_b', 'flag_c' ]
}
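
The slide shows only the `_legacy` array, so the following is one plausible reading rather than the author's actual code: each flag marks an upgrade the document still needs, and the application applies those upgrades when it reads the document, then drops the flags. The `upgrades` table, the fields it sets, and `upgradeDocument` are all invented for illustration.

```javascript
// Hypothetical upgrade table: one function per legacy flag.
// Flag names come from the slide; the upgrade bodies are placeholders.
const upgrades = {
  flag_a: (doc) => ({ ...doc, upgraded_a: true }),
  flag_b: (doc) => ({ ...doc, upgraded_b: true }),
  flag_c: (doc) => ({ ...doc, upgraded_c: true }),
};

// Apply every pending upgrade in order, then remove the _legacy marker.
// Documents without _legacy are returned with their fields unchanged.
function upgradeDocument(doc) {
  let current = doc;
  for (const flag of doc._legacy || []) {
    const upgrade = upgrades[flag];
    if (!upgrade) throw new Error(`no upgrade registered for ${flag}`);
    current = upgrade(current);
  }
  const { _legacy, ...upgraded } = current;
  return upgraded;
}

const oldDoc = { _id: "abc", _legacy: ["flag_a", "flag_c"] };
const newDoc = upgradeDocument(oldDoc);
console.log(newDoc);
```

In this reading, old and new documents can coexist in the same collection, which is what would let the migration run without downtime.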

Thanks!
@runningskull
JuanPatten.com
Need cheap, effective marketing? Try giveaways! www.Rafflecopter.com
