
ChefConf 2013: Managing multiple MongoDB clusters with Chef


Charity Majors

April 26, 2013

Transcript

  1. Managing multiple MongoDB clusters with Chef

     • Replica sets
     • Multiple clusters
     • Sharding
     • Arbiters
     • Provisioning
     • Snapshots
     • Fragmentation
     • Monitoring
     • State of the mongo cookbook
     • Glossary of tools
  2. Basic replica set

     This configuration can survive any single node or Availability Zone outage.
  3. How do I chef that?

     • Grab the mongodb and aws cookbooks.
     • Make a wrapper cookbook (mongodb_parse) with a recipe to construct our default replica set.
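     A minimal sketch of what such a wrapper recipe might contain, assuming the edelight cookbook's
     mongodb::replicaset recipe; the recipe name and the dbpath default below are illustrative, not
     taken from the slides:

     # mongodb_parse/recipes/default_replset.rb -- sketch of the wrapper recipe
     node.default['mongodb']['dbpath'] = '/var/lib/mongodb'  # assumed attribute name
     include_recipe 'aws'                  # Opscode AWS cookbook, used later for EBS volumes
     include_recipe 'mongodb::replicaset'  # edelight recipe: configures mongod as a replica set member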
  4. Now make a role for your cluster that uses your default replica set recipe...
     and launch some nodes.
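     Roughly what such a role might look like in the Ruby role DSL; the role, recipe, and
     cluster_name attribute names are illustrative:

     # roles/mongodb_mydb.rb -- sketch of a cluster role
     name 'mongodb_mydb'
     description 'Replica set members for the mydb cluster'
     run_list 'recipe[mongodb_parse::default_replset]'
     default_attributes(
       'mongodb' => { 'cluster_name' => 'mydb' }  # assumed attribute used to name/discover the cluster
     )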
  5. Initiate the replica set in the mongo shell, and you're set!
     You now have one empty mongo cluster with no data.
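     If you later want Chef to handle this step too, a guarded execute resource is one way to do it;
     this is a sketch, not part of the slides' workflow:

     # Run rs.initiate() exactly once, only if the replica set is not yet initiated
     execute 'rs_initiate' do
       command %q{mongo --quiet --eval 'rs.initiate()'}
       not_if  %q{mongo --quiet --eval 'rs.status().ok' | grep -q 1}
     end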
  6. Ok. But what if I have data?

     Backup options:
     • snapshot
     • mongodump
     Provisioning options:
     • snapshot
     • secondary sync
     • mongorestore (in theory)
     Snapshots are great and you should use them.
  7. Provisioning with initial sync

     • Compacts and repairs your collections and databases.
     • Hard on your primary: does a full table scan of all data.
     • On > 2.2.0 you can sync from a secondary by button-mashing rs.syncFrom("host:port") on startup.
     • Or use iptables to block the secondary from viewing the primary (all versions).
     • Not riskless: it resets the padding factor for all collections to 1.
  8. Configuring RAID for EBS

     • Decide how many volumes to RAID (this is hard to change later!).
     • Include mongodb::raid_data in your replica set recipe.
     • Set a role attribute for mongodb[:vols] to indicate how many volumes to RAID.
     • Set a role attribute for mongodb[:volsize] to indicate the size of the volumes.
     • Set the mongodb[:use_piops] attribute if you want to use PIOPS volumes.
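     As a concrete sketch, the role excerpt might look like this (the values and the GB unit for
     volsize are assumptions, not from the slides):

     # Excerpt from the cluster role -- RAID attributes consumed by mongodb::raid_data
     default_attributes(
       'mongodb' => {
         'vols'      => 4,     # number of EBS volumes in the array -- hard to change later
         'volsize'   => 200,   # size of each volume (GB assumed)
         'use_piops' => true   # use EBS Provisioned IOPS volumes
       }
     )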
  9. Adding snapshots

     • Pick a node to be your snapshot host.
     • Specify that node as the mongodb[:backup_host] in your cluster role.
     • Get the volume ids that are mounted on /var/lib/mongodb and add them to your cluster role attributes as backups[:mongo_volumes].
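     Continuing the same role excerpt (the hostname and volume ids are placeholders):

     # Excerpt from the cluster role -- snapshot source and the volumes to snapshot
     default_attributes(
       'mongodb' => { 'backup_host' => 'mydb-snap-1.example.com' },
       'backups' => {
         'mongo_volumes' => ['vol-aaaa1111', 'vol-bbbb2222', 'vol-cccc3333', 'vol-dddd4444']
       }
     )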
  10. mongodb::backups

     How it works:
     • Installs a cron job on the backup host.
     • The cron job does an ec2-consistent-snapshot of the volumes specified in backups[:mongo_volumes].
     • Locks mongo during the snapshot.
     • Tags a "daily" snapshot once a day, so you can easily prune hourly snapshot sets while keeping RAID array sets coherent.
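     The installed job is roughly shaped like the following cron resource; the schedule and command
     line are simplified, and the real recipe also handles the mongod lock and the daily tagging
     described above:

     # Sketch of the snapshot cron job on the backup host
     cron 'mongodb-ebs-snapshot' do
       minute  '0'    # hourly
       user    'root'
       command "ec2-consistent-snapshot #{node['backups']['mongo_volumes'].join(' ')}"
       only_if { node['mongodb']['backup_host'] == node['fqdn'] }
     end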
  11. Provisioning nodes from snapshot

     How it works:
     • Checks to see if the mongo $dbpath is mounted.
     • If not, grabs the latest completed set of snapshots for the volumes in backups[:mongo_volumes].
     • Provisions and attaches a new volume from each snapshot.
     • Assembles the RAID array and mounts it on $dbpath.
     To reprovision, just delete the aws attributes from the node, detach the old volumes, and re-run chef-client.
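     In Chef terms the logic reads roughly like the sketch below, using the aws cookbook's
     aws_ebs_volume resource plus the core mdadm and mount resources; the snapshot lookup helper,
     device names, and filesystem are all illustrative:

     # Sketch: rebuild the RAID array from the latest snapshot set (AWS credentials omitted)
     raid_devices = []
     node['backups']['mongo_volumes'].each_with_index do |vol, i|
       dev = "/dev/xvd#{('f'.ord + i).chr}"      # /dev/xvdf, /dev/xvdg, ...
       raid_devices << dev
       aws_ebs_volume "mongodb-data-#{i}" do
         snapshot_id latest_snapshot_for(vol)    # hypothetical helper that queries EC2
         device      dev
         action      [:create, :attach]
       end
     end

     mdadm '/dev/md0' do
       devices raid_devices
       level   10
       action  [:create, :assemble]
     end

     mount '/var/lib/mongodb' do
       device '/dev/md0'
       fstype 'xfs'
       action [:mount]
     end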
  12. Multiple MongoDB clusters

     Why use multiple clusters instead of just sharding?
     • Different types of data (e.g. separate application data from analytics)
     • Different performance characteristics or hardware requirements
     • When collection-level sharding isn't appropriate and collections need to stay locally intact
     • Staging or test clusters
     • Remove as much as possible from the critical path
  13. Multiple clusters? Easy!

     Just make a role for each cluster, with a distinct cluster name, backup host, and backup volumes.
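     For example, a second cluster's role might differ only in its attributes (all names and ids
     below are placeholders):

     # roles/mongodb_analytics.rb -- another cluster, same recipe, different attributes
     name 'mongodb_analytics'
     run_list 'recipe[mongodb_parse::default_replset]'
     default_attributes(
       'mongodb' => {
         'cluster_name' => 'analytics',
         'backup_host'  => 'analytics-snap-1.example.com'
       },
       'backups' => { 'mongo_volumes' => ['vol-1111aaaa', 'vol-2222bbbb'] }
     )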
  14. What if you need to shard?

     Make a new recipe in your wrapper cookbook that includes mongodb::shard, and use that recipe in the cluster's role. Set a shard name per role.
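     A sketch of that recipe plus the per-role shard name; the shard_name attribute below is an
     assumption about the cookbook's attribute layout:

     # mongodb_parse/recipes/sharded_replset.rb
     include_recipe 'mongodb_parse::default_replset'
     include_recipe 'mongodb::shard'     # edelight cookbook's shard recipe

     # roles/mongodb_userdata_shard1.rb (excerpt)
     # default_attributes('mongodb' => { 'shard_name' => 'shard1' })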
  15. Arbiters

     • Arbiters are mongod processes that don't do anything but vote.
     • They are awesome. They give you more flexibility and reliability.
     • To provision an arbiter, use the LWRP (see the sketch after this list).
     • If you have lots of clusters, you may want to run arbiters for multiple clusters on each arbiter host.
     • Arbiters tend to be more reliable than data nodes because they have less to do.
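     One way this can look, running one arbiter per cluster on a shared arbiter host; this assumes a
     mongodb_instance-style resource, and parameter names vary between cookbook versions, so treat it
     as illustrative:

     # Several arbiters on one host, one per cluster, each on its own port
     %w[mydb analytics sessions].each_with_index do |cluster, i|
       mongodb_instance "arbiter_#{cluster}" do
         mongodb_type 'arbiter'
         port         27018 + i
         dbpath       "/var/lib/mongodb/arbiter_#{cluster}"
       end
     end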
  16. Managing votes with arbiters

     • Three arbiter processes on each arbiter node, one arbiter per cluster.
     • You can have a maximum of seven votes per replica set.
     • Now you can survive all secondaries dying, or an AZ outage.
     • If you have even one healthy node per RS, you can continue to serve traffic.
  17. More about snapshots

     • Snapshot often.
     • Set the snapshot node to priority = 0, hidden = 1.
     • Lock Mongo or stop mongod during the snapshot.
     • Always warm up a snapshot before promoting.
       • In MongoDB 2.4 you can use db.touch().
       • Warm up both indexes and data; use dd on the data files to pull them from S3.
       • http://blog.parse.com/2013/03/07/techniques-for-warming-up-mongodb/
     • Run continuous compaction on your snapshot node.
  18. Fragmentation

     • Your RAM gets fragmented too!
     • Leads to underuse of memory.
     • Deletes are not the only source of fragmentation.
     • Use db.<collection>.stats to find the padding factor (between 1 and 2; the higher, the more fragmentation).
     • Repair, compact, or reslave regularly (db.printReplicationInfo() gives the length of your oplog, to see if repair is a viable option).
  19. 3 ways to fix fragmentation

     • Re-sync a secondary from scratch
       • limited by the size of your data & oplog
       • very hard on your primary; use rs.syncFrom() or iptables to sync from a secondary
     • Repair your node
       • also limited by the size of your oplog
       • can cause small discrepancies in your data
     • Run continuous compaction on your snapshot node
       • http://blog.parse.com/2013/03/26/always-be-compacting/
  20. Monitoring

     Most people start with MMS.
     • Hosted solution from 10gen
     • Single python script
     • Very pretty graphs
     • Auto-detects replica sets from a single node
     • Can do alerting
     • Populated by db.serverStatus()
  21. Nagios

     • Nagios should alert (at least) if a node actually goes down, or if secondaries are lagging.
     • We use check_mongodb.py extensively.
     • Automatically apply all mongo checks to all mongo nodes with a hostgroup.
  22. Ganglia

     • The goal: graph everything MMS does, and more.
     • Use the mongodb gmond python modules.
     • Our fork of the ganglia cookbook supports multiple clusters. Add the cluster name as a role attribute.
  23. ... and assign cluster names to ports in your ganglia wrapper cookbook.
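     A hypothetical sketch of such a mapping, assuming an attributes file in the ganglia wrapper
     cookbook; the attribute layout and ports are purely illustrative, not Parse's actual schema:

     # ganglia wrapper cookbook, attributes/default.rb
     default['ganglia']['clusters'] = {
       'mongodb_mydb'      => 8650,
       'mongodb_analytics' => 8651,
       'mongodb_sessions'  => 8652
     }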
  24. Future plans for the mongodb cookbook

     • Get Parse's changes fully upstreamed
     • Use LWRPs for mongod, mongos, mongoc
     • Add better support for ephemeral storage
     • Populate backup volume attributes from the backup host
     • Support bringing up nodes with secondary initial sync
     • Choose which secondary to sync from via attribute
     • Optionally auto-join the cluster
     • Make EBS RAID a LWRP
     • Add ebs_optimized support for PIOPS
     • ... and more.
  25. Glossary of resources

     • Opscode AWS cookbook
       • https://github.com/opscode-cookbooks/aws
     • edelight MongoDB cookbook
       • https://github.com/edelight/chef-mongodb
     • Parse MongoDB cookbook fork
       • https://github.com/ParsePlatform/Ops/tree/master/chef/cookbooks/mongodb
     • Parse compaction scripts and warmup scripts
       • http://blog.parse.com/2013/03/07/techniques-for-warming-up-mongodb/
       • http://blog.parse.com/2013/03/26/always-be-compacting/
  26. Glossary of resources (cont’d)

     • Opscode nagios cookbook
       • https://github.com/opscode-cookbooks/nagios
     • nagios check_mongodb.py check
       • https://github.com/mzupan/nagios-plugin-mongodb
     • Heavywater ganglia cookbook
       • http://github.com/hw-cookbooks/ganglia
     • Parse ganglia cookbook fork
       • https://github.com/ParsePlatform/Ops/tree/master/chef/cookbooks/ganglia
     • Ganglia gmond modules
       • https://github.com/ganglia/gmond_python_modules/tree/master/mongodb