Slide 1

Slide 1 text

Mark Hillick
[email protected] via [email protected] :)
http://www.mongodb.org
Who, what & in the cloud :)
Friday 23 November 12

Slide 2

Slide 2 text

Summary/Agenda
• Who & what
• Example Deployments
• EC2 Notes
• EC2 Best Practices
• Further Tuning

Slide 3

Slide 3 text

Example Deployments
• Replica Sets
• Shards
• Some notes on EC2 deployments

Slide 4

Slide 4 text

Replica Set Configurations
[Diagram: three example replica sets]
• Primary, Arbiter, Secondary (Minimum)
• Primary, Secondary, Secondary (Typical)
• Primary, Secondary, Secondary, Secondary, Secondary

Slide 5

Slide 5 text

Some RS Notes
• Asynchronous replication (single primary)
• Automatic failover
• App-level definition of “write replication”
• Secondary nodes can replicate with a slaveDelay
• Secondary nodes can be hidden
• Maximum of 12 nodes, with 7 voting
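The notes above can be sketched as a small check on a replica set configuration document. This is only an illustration: the host names are hypothetical, the field names (arbiterOnly, hidden, slaveDelay, priority, votes) are those of the replica set configuration in MongoDB releases of this period, and the 12/7 limits are the ones quoted on the slide.

```python
MAX_MEMBERS = 12  # limit quoted on the slide
MAX_VOTING = 7    # limit quoted on the slide


def check_replica_set_config(config):
    """Return a list of problems found in a replica set config dict."""
    problems = []
    members = config["members"]
    if len(members) > MAX_MEMBERS:
        problems.append("more than %d members" % MAX_MEMBERS)
    # A member votes unless its "votes" field is set to 0.
    voting = [m for m in members if m.get("votes", 1) > 0]
    if len(voting) > MAX_VOTING:
        problems.append("more than %d voting members" % MAX_VOTING)
    for m in members:
        # Hidden and delayed members must not be electable as primary.
        special = m.get("hidden", False) or m.get("slaveDelay", 0) > 0
        if special and m.get("priority", 1) != 0:
            problems.append("member %d must have priority 0" % m["_id"])
    return problems


# Hypothetical set: two data nodes, one arbiter, one hidden delayed secondary.
config = {
    "_id": "rs0",
    "members": [
        {"_id": 0, "host": "db1.example.net:27017"},
        {"_id": 1, "host": "db2.example.net:27017"},
        {"_id": 2, "host": "arb.example.net:27017", "arbiterOnly": True},
        {"_id": 3, "host": "backup.example.net:27017",
         "priority": 0, "hidden": True, "slaveDelay": 3600},
    ],
}
print(check_replica_set_config(config))
```

The hidden member with a one-hour slaveDelay is the kind of node that can serve as a backup source without receiving application reads.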

Slide 6

Slide 6 text

Sharding
[Diagram: sharded cluster]
• 4 shards, each a replica set (Primary, Secondary, Secondary)
• 4 mongos
• 3 config DB

Slide 7

Slide 7 text

Sharding Notes
• Each “shard” usually a Replica Set (same options)
• Meta Data for shard stored in ConfigDB
• Copy of meta data stored in-memory by mongos
• Config DB cluster is *not* a replica set
• Data split into chunks, using range based shard key
• Chunks may be migrated between shards
• New chunks created by “splitting” old chunks
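A toy model may help picture range-based chunks, splitting and migration. It is a sketch only: ChunkMap and its methods are hypothetical names, and real mongos routing and balancing are considerably more involved.

```python
import bisect


class ChunkMap:
    """Toy chunk table: chunk i covers keys in [bounds[i], bounds[i + 1])."""

    def __init__(self, shard):
        # Start with a single chunk covering the whole key range.
        self.bounds = [float("-inf")]
        self.shards = [shard]

    def _index(self, key):
        return bisect.bisect_right(self.bounds, key) - 1

    def find_shard(self, key):
        """Route a shard key value to the shard owning its chunk."""
        return self.shards[self._index(key)]

    def split(self, split_key):
        """Split the chunk containing split_key; both halves stay on the same shard."""
        i = self._index(split_key)
        if self.bounds[i] == split_key:
            raise ValueError("chunk already starts at this key")
        self.bounds.insert(i + 1, split_key)
        self.shards.insert(i + 1, self.shards[i])

    def migrate(self, key, to_shard):
        """Move the chunk containing key to another shard."""
        self.shards[self._index(key)] = to_shard


chunks = ChunkMap("shard0")
chunks.split(100)               # chunks: (-inf, 100) and [100, +inf)
chunks.migrate(100, "shard1")   # move the upper chunk
print(chunks.find_shard(5), chunks.find_shard(250))
```

In this model the list pair plays the role of the chunk metadata that the config DB stores and mongos caches in memory.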

Slide 8

Slide 8 text

Shard Server in EC2 (1)
Category/Impact Low Medium High
Disk Speed x
Disk Capacity x
RAM x
CPU x

Slide 9

Slide 9 text

Shard Server in EC2 (2)
• MongoDB is designed for OS defaults on a 64-bit instance
• Use standard virtual memory page size
• Raise “nofiles” ulimit
• Use RAID10 and a modern filesystem (ext4, XFS, etc.)
• Use “noatime” mount option

Slide 10

Slide 10 text

Shard Server in EC2 (3)
• Kernel >= 2.6.23 for ext4, >= 2.6.25 for XFS
• Readahead: how much more to read than what you asked for
• If too high => possible performance impact
• Set to 0 on EBS devices
• Set to desired value on RAID device
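When comparing readahead settings it helps to keep the units straight: blockdev --getra and --setra count in 512-byte sectors, while Linux sysfs reports kilobytes. A minimal sketch, where the device name xvdf is purely hypothetical:

```python
import os

SECTOR_BYTES = 512  # blockdev --getra / --setra count in 512-byte sectors


def sectors_to_kb(sectors):
    """Convert a readahead value in sectors to kilobytes."""
    return sectors * SECTOR_BYTES // 1024


def read_ahead_kb(device):
    """Return the readahead of a block device in KB, or None if not available."""
    path = "/sys/block/%s/queue/read_ahead_kb" % device
    if not os.path.exists(path):
        return None
    with open(path) as f:
        return int(f.read().strip())


print(sectors_to_kb(256))      # 256 sectors is 128 KB
print(read_ahead_kb("xvdf"))   # hypothetical EBS device name
```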

Slide 11

Slide 11 text

Config Server in EC2 (1)
Category/Impact Low Medium High
Disk Speed x
Disk Capacity x
RAM x
CPU x

Slide 12

Slide 12 text

Config Server in EC2 (2)
• Use RAID10
• Use 64-bit instance
• Can run on shard servers

Slide 13

Slide 13 text

Arbiter in EC2 (1)
Category/Impact Low Medium High
Disk Speed x
Disk Capacity x
RAM x
CPU x

Slide 14

Slide 14 text

Arbiter in EC2 (2)
• Can use micro instance
• Elections may be slower
• Can use instance store
• Still want backups :)

Slide 15

Slide 15 text

EC2 Notes

Slide 16

Slide 16 text

Instance Types and Capabilities

Instance Type    API Name     Available RAM (GB)  Network (Gbps)  Cores  EC2 Units
Standard         m1.small     1.71                0.25            1      1
Standard         m1.medium    3.75                0.25            1      2
Standard         m1.large     7.5                 0.5             2      2
Standard         m1.xlarge    15                  1.0             4      8
Hi-Mem           m2.xlarge    17.1                0.25            2      6.5
Hi-Mem           m2.2xlarge   34.2                0.5             4      13
Hi-Mem           m2.4xlarge   68.4                1.0             8      26
Hi-CPU           c1.medium    1.7                 0.25            2      5
Hi-CPU           c1.xlarge    7                   1.0             8      20
Cluster Compute  cc1.4xlarge  23                  10*             8      33.5
Cluster Compute  cc1.8xlarge  60                  10*             16     88
Micro            t1.micro     0.613               0.1             1**    2**

* Although Cluster Compute nodes have 10 Gbps dedicated, there is a 2 Gbps rate limit between the instances and EBS, limiting IO to 2 Gbps
** Micro instances are really just for testing - even their stated EC2 units are burst only

Slide 17

Slide 17 text

Instances Guidelines (1)
• Use 64-bit only, 32-bit is not recommended
• Primary/Secondary should be equal*
• High CPU is not necessary
• High Memory for large mongod instances
• Network capacity is also IO capacity
• EBS

Slide 18

Slide 18 text

Instances Guidelines (2)
• Note the trade-offs - memory/network
• m1.large to m2.xlarge = 2x Mem, 0.5x Network
• Do not use micro except for testing
• m1.medium is usually sufficient for config DB
• m1.small can be used for Arbiters
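The memory/network trade-off can be checked directly against the figures in the instance table. A small sketch, with the RAM and network values copied from that table and a hypothetical helper name:

```python
# (RAM in GB, network in Gbps), as listed in the instance table
instances = {
    "m1.large": (7.5, 0.5),
    "m2.xlarge": (17.1, 0.25),
}


def tradeoff(src, dst):
    """Return (memory ratio, network ratio) when moving from src to dst."""
    src_ram, src_net = instances[src]
    dst_ram, dst_net = instances[dst]
    return dst_ram / src_ram, dst_net / src_net


mem_ratio, net_ratio = tradeoff("m1.large", "m2.xlarge")
print("memory x%.2f, network x%.2f" % (mem_ratio, net_ratio))
```

With these figures the move gives roughly 2.3x the memory for half the network capacity, which the slide rounds to 2x and 0.5x.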

Slide 19

Slide 19 text

EC2/MongoDB Best Practices
• https://wiki.10gen.com/display/DOCS/Amazon+EC2
• https://wiki.10gen.com/display/DOCS/Production+Notes

Slide 20

Slide 20 text

System Configuration
• Use 64-bit, Linux preferred
• Set file descriptor limits (20,000 or above)
• Turn off atime on filesystem (pre-2.6.30 especially)
• Use ext4/XFS as the filesystem (not ext3)
• RAID 10 is recommended everywhere
  • mitigates slow EBS volumes (fail the bad volume)
• Do not use large VM pages
• Do configure swap to prevent OOM Killer
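The file descriptor recommendation can be checked from Python with the standard resource module (Unix only). This sketch inspects the limits of the current process, not of a running mongod, so it is meant to be run from the same account and shell setup that starts mongod; the 20,000 threshold is the figure from the slide.

```python
import resource

RECOMMENDED_NOFILE = 20000  # figure from the slide


def nofile_ok(soft_limit, recommended=RECOMMENDED_NOFILE):
    """True if the soft open-files limit meets the recommendation."""
    # RLIM_INFINITY means the limit is unlimited.
    return soft_limit == resource.RLIM_INFINITY or soft_limit >= recommended


soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("soft=%s hard=%s ok=%s" % (soft, hard, nofile_ok(soft)))
```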

Slide 21

Slide 21 text

Backups
• EBS Snapshots - RAID complicates things
• If possible, use a single EBS volume; a hidden slave can be used to simplify
• Single EBS volume, with journaling means:
  • No fsync & lock required
• Similar applies to LVM snapshots
• http://www.mongodb.org/pages/viewpage.action?pageId=19562846

Slide 22

Slide 22 text

Further Tuning

Slide 23

Slide 23 text

Tweaking for Performance
• Place journal on separate EBS volume(s) - leave readahead as-is
• On data volume, lower readahead to a reasonable level (mongod must be restarted)
• Each EBS volume is ~100 IOPS
• Use MMS and munin-node to track IO over time
• Also track Flush average
• Fragmentation can cause operations to be expensive
• Trade-offs for using compact and repair
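The ~100 IOPS per EBS volume figure gives a quick back-of-the-envelope estimate for a RAID 10 data volume. The sketch below is an idealised model, not a benchmark: it assumes a classic striped-mirrors layout with an even number of volumes, that reads can be served by any volume, and that every write lands on both halves of a mirror pair.

```python
IOPS_PER_EBS_VOLUME = 100  # approximate figure from the slide


def raid10_iops(volumes, per_volume=IOPS_PER_EBS_VOLUME):
    """Idealised read/write IOPS for a RAID 10 array of EBS volumes."""
    if volumes < 4 or volumes % 2:
        raise ValueError("this sketch expects an even number of volumes, at least 4")
    return {
        "read": volumes * per_volume,          # any volume can serve a read
        "write": (volumes // 2) * per_volume,  # each write hits both mirrors
    }


print(raid10_iops(8))
```

Real throughput also depends on the instance's network capacity, since EBS traffic shares it, so tracking actual IO over time remains the better guide.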