relevancy: one-to-one personalization across email, web, mobile • Original idea: API-based transactional email • 3 engineers two years ago, now ~65 employees • Some Clients: Fab, Huffington Post, OpenSky, Patch, Thrillist, Refinery 29, Totsy, Business Insider, Savored, NY Observer, College Humor, Oscar De La Renta, Tippr, NY Post, American Media, Flavorpill, Codecademy, Ahalife, GroupCommerce, BustedTees, Lifebooker, BET, Newsweek/Daily Beast, turntable.fm
purposes (e.g. messages vs user profiles) • Largest logical collections are partitioned at the application level • Made sense for us as our data is naturally partitioned by customer
Makes it easy to store flexible JSON-based customer input (many now use Mongo themselves) • Good performance • Encourages a scalable approach • We know it well
--fork --rest --replSet main1 • Don’t ever run without replication • Don’t ever kill -9 • Don’t run without writing to a log • Run behind a firewall • Use journaling • Default oplog size seems fine
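Putting the advice above together, the invocation might look something like this (a hedged sketch: `main1` is the replica set name from the slide, the log path is illustrative, and flag availability varies by server version):

```shell
# Fork to the background, enable the REST status page, join replica set
# "main1", append to a log file, and enable journaling.
mongod --fork --rest --replSet main1 \
       --logpath /var/log/mongodb/mongod.log --logappend \
       --journal
```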
Systems will surprise you • Production systems are complex • Graph everything you can, so you can see when something you did changed the pattern • Alerts when something’s wrong
load average • faults/sec: if this starts to creep up, you may be nearing/exceeding your working set • number of connections: could be a driver or network connectivity problem • replication lag: usually load on the primary • dataSize and indexSize
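A sketch of checking a couple of those counters against a `db.serverStatus()`-style document (`extra_info.page_faults` and `connections.current` are real serverStatus fields; the thresholds and function name here are made up for illustration):

```python
def check_metrics(status, max_fault_delta=100, max_connections=1000):
    """Return warning strings for a db.serverStatus()-style dict.

    Note: extra_info.page_faults is a cumulative counter; real monitoring
    would diff successive samples to get faults/sec. Thresholds are
    illustrative only -- tune them against your own graphs.
    """
    warnings = []
    faults = status.get("extra_info", {}).get("page_faults", 0)
    if faults > max_fault_delta:
        warnings.append("page faults high: may be exceeding working set")
    conns = status.get("connections", {}).get("current", 0)
    if conns > max_connections:
        warnings.append("connection count high: driver or network problem?")
    return warnings

sample = {"extra_info": {"page_faults": 250},
          "connections": {"current": 42}}
print(check_metrics(sample))
```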
• MMS is a great tool for diagnosing issues • But also Graphite / StatsD / Nagios • And don’t forget the log • explain() can shed light on pathological queries
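The main thing explain() reveals is a full collection scan. A sketch of spotting one in an explain document, assuming the legacy format where an unindexed query reports `"cursor": "BasicCursor"` (newer servers report a `COLLSCAN` stage under `queryPlanner.winningPlan` instead; field names are version-dependent):

```python
def is_table_scan(explain_doc):
    """True if an explain() document indicates a full collection scan.

    Checks both the legacy format ("cursor": "BasicCursor") and the
    newer queryPlanner format ("COLLSCAN" winning stage).
    """
    if explain_doc.get("cursor") == "BasicCursor":
        return True
    stage = (explain_doc.get("queryPlanner", {})
                        .get("winningPlan", {})
                        .get("stage"))
    return stage == "COLLSCAN"

print(is_table_scan({"cursor": "BasicCursor", "nscanned": 100000}))  # True
```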
through one thin wrapper class which we wrote • If you use someone else’s lib or ORM, make sure you could (if you had to) do stuff like: • set timeouts or enable failfast retry • add a $hint for all instances of a query • queue writes elsewhere temporarily • ensure all writes are “safe” for a collection
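A minimal sketch of such a wrapper (class and method names are invented for illustration; the point is that every query funnels through one place where hints, timeouts, and retry policy can be changed at once):

```python
import time

class MongoWrapper:
    """Hypothetical wrapper: all reads go through here, so policies like
    $hint injection and failfast retry live in one place."""

    def __init__(self, collection, retries=2, hints=None):
        self.collection = collection
        self.retries = retries          # failfast: few, quick retries
        self.hints = hints or {}        # query field-set -> forced index

    def find(self, query):
        last_err = None
        for _ in range(self.retries + 1):
            try:
                cursor = self.collection.find(query)
                hint = self.hints.get(frozenset(query))  # keyed by field names
                return cursor.hint(hint) if hint else cursor
            except IOError as err:      # stand-in for a transient network error
                last_err = err
                time.sleep(0.01)        # brief backoff, then retry
        raise last_err
```

With your own wrapper, forcing an index on every instance of a query shape is one dict entry; with a third-party ORM, check that you could do the equivalent.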
somewhere and tried later? (What if the queue fails?) • What if a queued write is failing indefinitely? • If a read fails, can you timeout quickly and try again on a different node? (failfast retry) • In some cases we might not care, in some cases lives might depend on it
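One way to answer the "failing indefinitely" question is to cap retries explicitly. A sketch, assuming a hypothetical in-process fallback queue (a real deployment would persist the queue somewhere durable, per the "what if the queue fails?" point):

```python
import collections

class WriteQueue:
    """Hypothetical fallback queue: writes that fail are queued and
    retried a bounded number of times, so nothing retries forever."""

    def __init__(self, insert_fn, max_attempts=3):
        self.insert_fn = insert_fn
        self.max_attempts = max_attempts
        self.pending = collections.deque()
        self.dead = []                       # gave up: needs human attention

    def write(self, doc):
        try:
            self.insert_fn(doc)
        except IOError:
            self.pending.append((doc, 1))    # queue it, attempt count = 1

    def flush(self):
        for _ in range(len(self.pending)):
            doc, attempts = self.pending.popleft()
            try:
                self.insert_fn(doc)
            except IOError:
                if attempts >= self.max_attempts:
                    self.dead.append(doc)    # stop retrying indefinitely
                else:
                    self.pending.append((doc, attempts + 1))
```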
cumbersome, we never use them • Referencing by MongoId can mean doing extra lookups • Build human-readable references to save you doing lookups and manual joins • Just be mindful of space tradeoffs for readability
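A sketch of composing such a reference (function and field names are made up; the idea is that the _id itself names the parent, so recovering it needs no lookup):

```python
def message_id(blast_id, seq):
    """Compose a readable _id embedding the parent blast id.
    Zero-padding keeps ids a fixed width so they sort sensibly."""
    return "%d.%08d" % (blast_id, seq)

mid = message_id(32423, 341)
print(mid)                      # 32423.00000341
blast = int(mid.split(".")[0])  # parent id recoverable with no lookup
```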
design • If you can ask the question at all, you might want to err on the side of embedding • Don’t embed if the embedding could get huge • Don’t feel too bad about denormalizing by embedding AND storing in a top-level collection
• Does not scale up indefinitely; will hit the 16MB document limit • Hard to create references to an embedded object • Limited ability to index-sort the embedded objects • Really huge objects are cumbersome and carry deserialization overhead
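The "embed AND store top-level" pattern can be sketched with plain documents (collection and field names are illustrative; dicts stand in for collections):

```python
# Denormalize by writing the comment twice: embedded in its parent post
# for fast reads, and in a top-level collection so it can be referenced,
# indexed, and sorted on its own.

posts = {}      # stand-ins for MongoDB collections
comments = {}

def add_comment(post_id, comment_id, text):
    comment = {"_id": comment_id, "post_id": post_id, "text": text}
    comments[comment_id] = comment                      # top-level copy
    post = posts.setdefault(post_id, {"_id": post_id, "comments": []})
    post["comments"].append(comment)                    # embedded copy

add_comment("p1", "c1", "nice post")
print(posts["p1"]["comments"][0]["text"])   # nice post
print(comments["c1"]["post_id"])            # p1
```

The cost is double writes and the space tradeoff already noted; the win is that neither read path needs a join.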
32423.00000341 • Need all messages in blast 32423: • db.message.blast.find( { _id: /^32423\./ } ); • (Yeah, I know the \. is ugly. Don’t use a dot if you do this.)
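The same anchored-prefix trick, demonstrated with Python's `re` (an anchored regex with a literal prefix is what lets MongoDB use the _id index; the dot must be escaped, which is why the slide suggests a different separator):

```python
import re

# Anchored at ^ with a literal prefix; the escaped \. keeps "324230..."
# from matching blast 32423.
blast_prefix = re.compile(r"^32423\.")

ids = ["32423.00000341", "32423.00000342", "324230.00000001"]
print([i for i in ids if blast_prefix.match(i)])
# ['32423.00000341', '32423.00000342']
```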