MongoNYC 2012: Rock Solid MongoDB Ops: Running MongoDB Like a Pro

ROCK-SOLID MONGO OPS
Running MongoDB like a Pro

Who am I?
Todd O. Dampier, @t0dampier
• CTO for mongolab.com
• In 3 cloud providers
• Many hosts,

Four operational essentials
① Stay up.
② Stay fast.
③ Take good care of your data.
④ Always know

The world needs your application ’round the clock!
1. Stay up.

Replica Sets –

Part of staying up is knowing how to survive the election process.
• Understand the dynamics of failover!
  ◦ It’s not magic; there are rules & gotchas.
  ◦ Vulnerable to false positives in the real world
    ▪ network flaps, high load → failover
  ◦ Do what you can to minimize false positives
    ▪ Synchronize your clocks (ntp)
    ▪ Use version ≥ 2.0.3

Graceful failure starts at the client.
• Configure driver for a cluster connection.
• Anticipate failovers; where appropriate…
  ◦ catch exceptions,
  ◦ use retry loops, &
  ◦ set timeouts
• Is eventual consistency ok?
• If master goes down, are lost writes ok?
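The "catch exceptions, use retry loops" advice can be sketched as a small wrapper. This is a minimal illustration, not driver code: `TransientError` and `with_retries` are hypothetical names, and in a real application the caught exception would be the driver's own reconnect error (for example pymongo's `AutoReconnect`). The backoff values are arbitrary assumptions.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical stand-in for the driver's failover/connection exception."""


def with_retries(op, attempts=5, base_delay=0.1, max_delay=2.0, sleep=time.sleep):
    """Run op(); on TransientError, back off and retry up to `attempts` times."""
    for attempt in range(1, attempts + 1):
        try:
            return op()
        except TransientError:
            if attempt == attempts:
                raise  # out of retries: let the caller decide what to do
            # Exponential backoff with jitter gives an election time to finish.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay * random.uniform(0.5, 1.0))
```

A retry loop like this is only safe for operations that can be repeated: retrying a non-idempotent write after an ambiguous failure may apply it twice. Timeouts are set on the driver connection itself and are not shown here.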

Replica sets are great for planned changes, too.

[Diagram: replica set of mongod (SECONDARY), mongod (SECONDARY), old mongod (SECONDARY ➠ gone), and new mongod (PRIMARY), with replication arrows]
…then take the old master offline.

No one likes slow software.
2. Stay fast.

Be sure you have the right indexes.
• At scale, indexes mean the difference between fast, slow, and toast.
  ◦ Many page faults per query can kill the server.
  ◦ Even with entire working set in RAM, scanning a collection ⇒ O(n) more cycles per query.
• But don’t overcompensate.
  ◦ Each index increases insert latency and memory footprint.
  ◦ Nonselective indexes are worse than useless.
    ▪ e.g., indexing on a field with values ∈ { 0, 1 }

What are “the right indexes”?
• Learn to think about indexes & queries.
  ◦ http://mongodb.org/display/DOCS/Indexes
• Discover missed index opportunities.
  ◦ egrep 'nscanned:\w{5,}' mongodb.log
  ◦ Use profiler to dissect slow queries: http://bit.ly/mlabprof
    ▪ “slow”? egrep '\w{5,}ms$' mongodb.log
• Sometimes it’s better to fix the query, application logic, and/or schema design.
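The two egrep one-liners can be combined into a single pass over the log. The sketch below reuses the slide's patterns in Python's `re` module, which treats `\w` and `$` approximately the same way for this purpose; the function name `scan` is an assumption, and nothing here depends on a particular mongod log layout beyond what the patterns themselves match.

```python
import re

# Same patterns as the egrep commands on the slide.
NSCANNED = re.compile(r"nscanned:\w{5,}")  # 5+ characters after "nscanned:", i.e. 10,000+ scanned
SLOW = re.compile(r"\w{5,}ms$")            # line ends in a 5+ character millisecond count


def scan(lines):
    """Return (missed_index_candidates, slow_queries) from an iterable of log lines."""
    missed, slow = [], []
    for line in lines:
        line = line.rstrip("\n")
        if NSCANNED.search(line):
            missed.append(line)
        if SLOW.search(line):
            slow.append(line)
    return missed, slow
```

Typical use would be `with open("mongodb.log") as f: missed, slow = scan(f)`. A line that lands in both lists is usually the first place to look for a missing index.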

Understand MongoDB concurrency.
• The One Global Write Lock: TOGWL™
  ◦ lots of write cycles → this can ruin your day.
  ◦ build indexes offline or in the background!
  ◦ B-tree rebalancing: the silent killer.
• Holding lock + no indexes → very bad
  ◦ e.g., findAndModify with poor/no index
• Troubleshooting: mongostat 5
  ◦ large #s in “faults” col → see “index” slides
  ◦ large #s in “wq|rq” col → who’s got the lock?

Q: When is a write not a write?
A: When it does not get written (enough).
3. Take good care of your data.

Embrace single-node durability.
• Use mongod journaling feature.
  ◦ Hard crash will leave databases intact.
  ◦ Allows one to snapshot files without locking server.
  ◦ On by default in 2.0; use --journal in 1.8
• Tip ☞ Keep 3 pre-allocated 1GB journal files on the spindle for a quicker restart.
• Tip ☞ In non-production setting, restart without journaling for any big, disposable data load.
  ◦ e.g., mongoimport, full resync, etc.
  ◦ to do this in 2.0, use --nojournal

Be disciplined about backups.
• Backup from a (hidden) SECONDARY;
  ◦ PRIMARY has enough load already.
• Approaches[1]:
  1. fsync, lock, cp
  2. mongodump
     ▪ when in doubt, --forceTableScan
     ▪ --oplog → point-in-time for whole server
  3. point-in-time fs snapshot (EBS or LVM)
• Store in a safe place (e.g., S3)
• Consider frequency & retention
  ◦ e.g., keep 5 dailies and 3 weeklies
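A retention rule such as "keep 5 dailies and 3 weeklies" is easy to get subtly wrong when pruning by hand, so here is one possible reading of it as code. The function name is hypothetical, and the definition of a "weekly" (the newest backup in each of the most recent ISO weeks) is an assumption; the slide does not specify one.

```python
from datetime import date, timedelta


def backups_to_keep(backup_dates, dailies=5, weeklies=3):
    """Return the set of dates to retain; everything else may be pruned."""
    ordered = sorted(set(backup_dates), reverse=True)  # newest first
    keep = set(ordered[:dailies])                      # the most recent N dailies

    # Assumption: a "weekly" is the newest backup within each ISO week.
    seen_weeks = []
    for d in ordered:
        week = tuple(d.isocalendar())[:2]              # (ISO year, ISO week)
        if week not in seen_weeks:
            seen_weeks.append(week)
            if len(seen_weeks) <= weeklies:
                keep.add(d)
    return keep


if __name__ == "__main__":
    # Thirty consecutive daily backups, newest on 2012-05-23.
    newest = date(2012, 5, 23)
    history = [newest - timedelta(days=i) for i in range(30)]
    for d in sorted(backups_to_keep(history)):
        print(d.isoformat())
```

Because the newest weeklies often coincide with the newest dailies, the kept set can be smaller than dailies + weeklies; that overlap is intentional in this sketch.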

Think through replica set reads.
• “slaveOk” (now ReadPreference.secondary)
  ◦ Reads from replicas can boost performance
  ◦ Setting means “secondary if at all possible”: the primary won’t contribute to read throughput if any secondaries are available.
• “Eventual consistency”
  ◦ data from previous writes may not be there yet.

Think through replica set writes.
• Every mutation must hold TOGWL™.
• Durability: mutations not guaranteed to persist until they reside on the disks of a majority of nodes.
• In the event of a failover, is there anything to be concerned about?
  ◦ Let’s look at an example …

The reality is: slaves lag behind master’s ops.
[Diagram: mongod (PRIMARY) holds client inserts 1 to 5 and exchanges heartbeats with two mongod (SECONDARY) nodes, which are still replicating data and hold only some of those inserts]

[Diagram: both mongod (SECONDARY) nodes report “no heartbeat!” from mongod (PRIMARY); the client sees “10278 dbclient error communicating with server”; the secondaries still lack some of inserts 1 to 5. Election time!]
Master can become unreachable before slave replicates all data…

[Diagram: one SECONDARY wins the election (“I won!”) and becomes mongod (PRIMARY); it accepts new client inserts 6 and 7 while the old PRIMARY, still unreachable, holds inserts 1 to 5]
A client who retries a failed insert will take his business to the newly elected master.

[Diagram: the old master rejoins as mongod (RECOVERING); its un-replicated inserts 4 and 5 go to rollback/t.bson while it replicates data from the new mongod (PRIMARY)]
To come back online as a slave, the old master must roll back un-replicated inserts.
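The failover sequence in these diagrams can be replayed as a toy model. This is a deliberately simplified sketch: ops are plain integers, the "oplog" is a Python list, and the function names are made up for illustration. A real mongod compares oplog entries and does considerably more during recovery.

```python
def common_prefix_len(a, b):
    """Length of the shared leading run of two op lists."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def rejoin(old_primary_ops, new_primary_ops):
    """Return (rolled_back, resynced) for an old primary rejoining as a secondary."""
    n = common_prefix_len(old_primary_ops, new_primary_ops)
    rolled_back = old_primary_ops[n:]  # never replicated: set aside, not applied
    resynced = list(new_primary_ops)   # old primary now mirrors the new primary
    return rolled_back, resynced


if __name__ == "__main__":
    # Old primary took inserts 1..5; the elected secondary had only 1..3,
    # then accepted the client's retried inserts 6 and 7.
    rolled_back, resynced = rejoin([1, 2, 3, 4, 5], [1, 2, 3, 6, 7])
    print("rolled back:", rolled_back)
    print("after resync:", resynced)
```

The point of the model is that anything past the common prefix on the old primary is simply not part of the set's history any more, even though a client was told those writes succeeded.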

[Diagram: mongod (PRIMARY) and two mongod (SECONDARY) nodes exchanging heartbeats and replicating ops again, with client i/u/d ops arriving at the PRIMARY]
Not just INSERT ops! UPDATE and DELETE ops may be caught unsync’ed at failover time – no rollback file for these.
um, okay … so what do I do about that data?

Can distributed consistency problems be avoided?
• Yes (mostly). Client must cope.
• For reads: slaveOk not okay
• For writes: Set w > ( N / 2.0 )
  ◦ w: “majority” does this automagically in 2.0
• But cluster will be less available & slower.
• CAP theorem (q.v.) does apply to you as well.
  ◦ For thus have the wise men blogged.
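The rule w > N / 2.0 reduces to a one-line integer calculation. The helper names below are hypothetical, and N is assumed to be the number of members that count toward the majority; the command document has the same shape as the getLastError form used elsewhere in this deck.

```python
def majority_w(n_members):
    """Smallest integer w satisfying w > n_members / 2.0."""
    if n_members < 1:
        raise ValueError("a replica set needs at least one member")
    return n_members // 2 + 1


def get_last_error_cmd(n_members):
    """Build a getLastError command document that waits for a majority."""
    return {"getLastError": 1, "w": majority_w(n_members)}
```

For a three-member set this gives w = 2, and for a five-member set w = 3. Note that a four-member set also needs w = 3, so adding an even member raises the write cost without adding failure tolerance.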

So “write concern” ⇔ high-value ops
• { getLastError : 1, w : 2 } ⇒ deliver to 2 nodes before returning
• For all but the 1st node, “delivered” is in the TCP/IP sense;
• the written op isn’t on a node’s disk until the next journal “group commit”.
• Durable from there.
[Diagram: Client Application with MongoDB Driver sends R/W to mongod (PRIMARY) and slaveOk reads to two mongod (SECONDARY) nodes; heartbeats and replication between the nodes]

You can still sleep at night…

Monitoring & alerting – WHAT?
• Instrument / measure / probe
• Collect / store
• Exhibit / ops dashboards
• Threshold critical measures
• Alarm / notify if crossed
  ◦ control noise: “capacitance” & “de-bouncing”
• Escalate / Resolve – workflows
• Track / analyze / report
• Enable/disable: surprisingly big PITA
• Monitor proactively → grow panic-free
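"De-bouncing" an alarm means requiring a condition to persist before changing state, so a single noisy sample neither pages anyone nor clears a real incident. The class below is one minimal way to express that idea; its name, parameters, and default counts are assumptions for illustration and are not taken from any monitoring product.

```python
class DebouncedAlarm:
    """Trip after `trip` consecutive breaches; clear after `clear` consecutive OK samples."""

    def __init__(self, threshold, trip=3, clear=3):
        self.threshold = threshold
        self.trip = trip
        self.clear = clear
        self.alarming = False
        self._streak = 0  # consecutive samples that disagree with the current state

    def observe(self, value):
        """Feed one sample; return True only when the alarm state changes."""
        breach = value > self.threshold
        if breach != self.alarming:
            self._streak += 1
        else:
            self._streak = 0
        needed = self.clear if self.alarming else self.trip
        if self._streak >= needed:
            self.alarming = not self.alarming
            self._streak = 0
            return True
        return False
```

Fed with, say, replication lag in seconds, `observe` returns True only on the sample that trips or clears the alarm, which is the moment to notify. Larger `trip` and `clear` counts trade detection speed for less noise.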

Monitoring & alerting – HOW?
• Monitoring systems
  ◦ MMS by 10gen
  ◦ Munin / plugins
  ◦ Cacti; Zabbix; &c.
• Measures
  ◦ Page faults
  ◦ Lock % (TOGWL)
    ▪ wq, rq
  ◦ Disk throughput
  ◦ Replication lag
  ◦ (many others)
• Alerting systems
  ◦ Nagios
  ◦ Site24x7
  ◦ PagerDuty
• Thresholds
  ◦ “warn”
  ◦ “critical”
  ◦ “DOWN”
• Actions: SMS, email

Monitoring & alerting – OMG.

Wait .. is that all? And then … ?

Many more aspects to consider…
• Choice of “machine”
• Mass storage
• Configuration tweaks
• Availability /

Resources online from a great community!
• “Operations: Understanding MongoDB & Keeping it Happy”, Brendan McAdams (10gen, Inc., @rit), Monday, October 10, 2011
  http://www.10gen.com/presentations/mongomunich-2011/operational-mongodb
• “Learning by doing - running a mongoDB, the hard way”, Sandro Grundmann, 10gen Mongo Munich, 10.10.2011
  http://www.10gen.com/presentations/mongomunich-2011/learning-by-doing-running-a-mongodb-the-hard-way

Questions? or, you could just enjoy this clip-art kitten…

THANK YOU. for @mongolab, I have been @t0dampier.
