Slide 1

Slide 1 text

MongoDB on AWS
J. Randall Hunt, AWS
@jrhunt
December 05, 2014 | SFO AWS Pop Up Loft
© 2014 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.

Slide 2

Slide 2 text

MongoDB

Slide 3

Slide 3 text

Assumptions
• You're familiar with MongoDB
• You've run production workloads with MongoDB
• You're familiar with NoSQL vocabulary
• If you have questions, ask them now.

Slide 4

Slide 4 text

Disclaimer
• I don't work for MongoDB
• Some of this could be wrong
• Run all benchmarks yourself!
• Test your workloads now

Slide 5

Slide 5 text

MongoDB 2.8 Major Changes
• Pluggable Storage Engines
• Collection Level Locking
• Larger Replica Set Support
• Better Authentication Methods
• Better Query Introspection
• Toolchain rewritten in Go

Slide 6

Slide 6 text

Pluggable Storage Engine
• WiredTiger
  – Storage engine written for modern hardware
  – Focuses on efficient use of RAM
• MMAPv1
  – Default, traditional
  – Collection Level Locking
  – What you're used to

Slide 7

Slide 7 text

WiredTiger
• Mixed storage formats (B-tree, column, LSM)
• Non-blocking access and write implementations
• No in-place updates
• Uses cache resources
• Compression choices

Slide 8

Slide 8 text

WiredTiger Compression
storage:
    dbPath: "/data/wt_snappy"
    engine: "wiredTiger"

Slide 9

Slide 9 text

WiredTiger Compression, zlib
storage:
    dbPath: "/ssd/db/wt_zlib"
    engine: "wiredTiger"
    wiredTiger:
        collectionConfig: "block_compressor=zlib"
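To get a rough feel for what block compression can do with document-shaped data, here is a minimal Python sketch that compresses a batch of JSON documents with the standard library's zlib module. It does not go through WiredTiger at all, and the sample documents are made up, so treat the ratio as an illustration only and benchmark your own data.

```python
import json
import zlib


def compression_ratio(documents, level=6):
    """Return raw_bytes / compressed_bytes for a batch of JSON-encoded documents."""
    raw = "\n".join(json.dumps(d, sort_keys=True) for d in documents).encode("utf-8")
    compressed = zlib.compress(raw, level)
    return len(raw) / len(compressed)


# Hypothetical sample data: field names repeat in every document,
# which is typical of a collection and compresses well.
sample_docs = [
    {"user_id": i, "status": "active", "region": "us-west-2", "score": i % 10}
    for i in range(1000)
]

if __name__ == "__main__":
    print("zlib ratio: %.1fx" % compression_ratio(sample_docs))
```

Real collections with less repetition, or with already-compressed payloads, will compress less well than this sample.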

Slide 10

Slide 10 text

Larger Replica Set Support
• Replica sets can now have 50 members
• Still only 7 voting members
• Replica set code rewritten, many more tests
• More stable (in my tests)
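The two limits above are easy to check before applying a configuration. The sketch below is a plain Python validator over a list of member documents. The host names and the build_members helper are hypothetical, and giving non-voting members priority 0 is a choice made here, not something the slide states.

```python
MAX_MEMBERS = 50         # member limit stated on the slide for MongoDB 2.8
MAX_VOTING_MEMBERS = 7   # voting limit, unchanged


def check_replica_set(members):
    """Return a list of problems found in a list of member documents."""
    problems = []
    # A member votes unless its "votes" field is explicitly set to 0.
    voting = [m for m in members if m.get("votes", 1) > 0]
    if len(members) > MAX_MEMBERS:
        problems.append("too many members: %d" % len(members))
    if len(voting) > MAX_VOTING_MEMBERS:
        problems.append("too many voting members: %d" % len(voting))
    return problems


def build_members(total, voting):
    """Hypothetical helper: the first `voting` members vote, the rest do not."""
    return [
        {
            "_id": i,
            "host": "mongo%d.example.internal:27017" % i,  # made-up host names
            "votes": 1 if i < voting else 0,
            "priority": 1 if i < voting else 0,
        }
        for i in range(total)
    ]


if __name__ == "__main__":
    print(check_replica_set(build_members(50, 7)))   # within both limits
    print(check_replica_set(build_members(51, 8)))   # breaks both limits
```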

Slide 11

Slide 11 text

SCRAM-SHA-1 Authentication
• Salted Challenge Response Authentication Mechanism
• SASL mechanism
• RFC 5802
• Even if data is taken, damage is limited
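The "damage is limited" point follows from what the server stores: a salted, iterated StoredKey rather than the password. The Python sketch below follows the key derivation in RFC 5802 using only the standard library, with the example exchange values from the RFC itself. It skips SASLprep normalization and does not reproduce any MongoDB-specific handling, so it illustrates the mechanism and is not a driver implementation.

```python
import base64
import hashlib
import hmac


def _hmac_sha1(key, msg):
    return hmac.new(key, msg, hashlib.sha1).digest()


def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def derive_keys(password, salt, iterations):
    """Return (client_key, stored_key, server_key) as defined in RFC 5802."""
    # Hi() in the RFC is PBKDF2 with HMAC-SHA-1 and a 20-byte output.
    salted = hashlib.pbkdf2_hmac("sha1", password.encode("utf-8"), salt, iterations)
    client_key = _hmac_sha1(salted, b"Client Key")
    stored_key = hashlib.sha1(client_key).digest()
    server_key = _hmac_sha1(salted, b"Server Key")
    return client_key, stored_key, server_key


def client_proof(client_key, stored_key, auth_message):
    """ClientProof = ClientKey XOR HMAC(StoredKey, AuthMessage)."""
    return _xor(client_key, _hmac_sha1(stored_key, auth_message))


def server_verifies(stored_key, auth_message, proof):
    """The server recovers ClientKey from the proof and checks its hash."""
    recovered = _xor(proof, _hmac_sha1(stored_key, auth_message))
    return hmac.compare_digest(hashlib.sha1(recovered).digest(), stored_key)


# Example exchange from RFC 5802: user "user", password "pencil".
salt = base64.b64decode("QSXCR+Q6sek8bf92")
auth_message = (
    b"n=user,r=fyko+d2lbbFgONRv9qkxdawL,"
    b"r=fyko+d2lbbFgONRv9qkxdawL3rfcNHYJY1ZVvWVs7j,s=QSXCR+Q6sek8bf92,i=4096,"
    b"c=biws,r=fyko+d2lbbFgONRv9qkxdawL3rfcNHYJY1ZVvWVs7j"
)
client_key, stored_key, server_key = derive_keys("pencil", salt, 4096)
proof = client_proof(client_key, stored_key, auth_message)

if __name__ == "__main__":
    print("proof:", base64.b64encode(proof).decode("ascii"))
    print("server accepts:", server_verifies(stored_key, auth_message, proof))
```

The server only ever needs stored_key and server_key, so a stolen credential record still requires a brute-force search over the salted, iterated hash to recover the password.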

Slide 12

Slide 12 text

Query Introspection
• .explain() works on every operation
• .explain() provides detailed information about query planner

Slide 13

Slide 13 text

Entire Toolchain Re-written In Go
• https://github.com/mongodb/mongo-tools/
• bsondump (explore bson files)
• mongoimport/export (csv/json import/export)
• mongodump/restore (bson import/export)
• mongostat (monitor servers or clusters)
• mongofiles (IO for gridfs)
• mongooplog (replay oplog entries)
• mongotop (like top for mongod)

Slide 14

Slide 14 text

MMS Automation
https://docs.mms.mongodb.com/tutorial/configure-aws-settings/

Slide 15

Slide 15 text

Running On AWS

Slide 16

Slide 16 text

AWS Marketplace
• https://aws.amazon.com/marketplace/cp/MongoDB/
• Free (still pay for instances)
• Tested
• Optimized

Slide 17

Slide 17 text

CloudFormation Templates
• https://github.com/crcsmnky/aws-cfn-mongodb
• Not terribly up to date
• Doesn't work in every region
• Infrastructure as code!
• Easy to fork and edit
• Good bootstrapping for Chef, Puppet, etc.

Slide 18

Slide 18 text

Runtime Options
• Deploy on generic EC2
• Deploy as an OpsWorks stack
• Deploy with Elastic Beanstalk (so cool)
• Deploy with EMR integration

Slide 19

Slide 19 text

OpsWorks

Slide 20

Slide 20 text

OpsWorks

Slide 21

Slide 21 text

Elastic Beanstalk
• Autoscale app servers
• Autoscale mongos
• Manage application/drivers in one place

Slide 22

Slide 22 text

[Diagram labels: Elastic Load Balancer; three App Server + mongos pairs; MongoDB Cluster; Security Group; Auto Scaling Group]

Slide 23

Slide 23 text

Elastic MapReduce
• https://github.com/mongodb/mongo-hadoop
• MongoDB-Hadoop bi-directional connector
• Hive and Pig integrations
• Works with BSON or live MongoDB

Slide 24

Slide 24 text

Infrastructure
• EIPs
• VPC
• IAM
• Route53 (DNS)
• CloudWatch

Slide 25

Slide 25 text

EC2 Instance Types
• General Purpose (T2, M3)
• Compute Optimized (C3)
• GPU (G2)
• Memory Optimized (R3)
• Storage Optimized (I2, HS1)

Slide 26

Slide 26 text

Sizing Details
• mongod (core process): memory, network, storage; m3, i2, r3
• config server (sharding metadata): low power; t2, m3
• mongos (query router): low power; deploy with app servers

Slide 27

Slide 27 text

Replica Set: Highly Available
• 5 Servers
• 2 Regions
• 3 AZs
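Whether a layout like this survives a failure comes down to whether a majority of voting members is still reachable. The Python sketch below checks that for each Availability Zone and each region. The placement of the 5 servers is an assumption (the slide does not give it), and the region and AZ names are only examples.

```python
from collections import Counter

# Hypothetical layout: (region, availability zone) for each of 5 voting members.
MEMBERS = [
    ("us-west-2", "us-west-2a"),
    ("us-west-2", "us-west-2a"),
    ("us-west-2", "us-west-2b"),
    ("us-west-2", "us-west-2b"),
    ("us-east-1", "us-east-1a"),
]


def has_majority(surviving, total):
    """A primary can be elected only if more than half the voters remain."""
    return surviving > total // 2


def survives_loss(members, index):
    """Map each failure domain to whether a majority remains after losing it.

    index 0 groups members by region, index 1 by Availability Zone.
    """
    total = len(members)
    counts = Counter(m[index] for m in members)
    return {domain: has_majority(total - lost, total) for domain, lost in counts.items()}


if __name__ == "__main__":
    print("per AZ:", survives_loss(MEMBERS, 1))
    print("per region:", survives_loss(MEMBERS, 0))
```

With this particular placement the set keeps a majority after losing any single AZ, but not after losing the region that holds four of the five members.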

Slide 28

Slide 28 text

Questions To Ask For Production
• How much availability do I really need?
• How much IO do I really need?
• Can I withstand the loss of a zone? Of a region?
• Where are my users coming from?
• What are my security requirements?

Slide 29

Slide 29 text

Storage Options
• EBS
  – GP2 (fast, burstable, credit system)
  – PIOPS (stable)
  – Magnetic
• Instance
  – Ephemeral (fast but scary)
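The GP2 credit system can be estimated with simple arithmetic. The Python sketch below uses the figures AWS published for gp2 volumes around the time of this talk (3 IOPS per GiB baseline with a 100 IOPS floor, bursts to 3,000 IOPS, a bucket of 5.4 million I/O credits). Those constants are assumptions here and may have changed, so check the current EBS documentation before relying on them.

```python
# Assumed gp2 figures; verify against current AWS documentation.
GP2_IOPS_PER_GIB = 3
GP2_MIN_BASELINE_IOPS = 100
GP2_BURST_IOPS = 3000
GP2_CREDIT_BUCKET = 5400000  # I/O credits in a full bucket


def gp2_baseline_iops(size_gib):
    """Baseline IOPS for a gp2 volume of the given size."""
    return max(GP2_MIN_BASELINE_IOPS, GP2_IOPS_PER_GIB * size_gib)


def gp2_burst_seconds(size_gib):
    """Seconds a full credit bucket sustains the burst rate.

    Returns None when the baseline already meets or exceeds the burst rate.
    """
    baseline = gp2_baseline_iops(size_gib)
    if baseline >= GP2_BURST_IOPS:
        return None
    # Credits drain at the burst rate and refill at the baseline rate.
    return GP2_CREDIT_BUCKET / (GP2_BURST_IOPS - baseline)


if __name__ == "__main__":
    for size in (10, 100, 500, 1000):
        print(size, "GiB:", gp2_baseline_iops(size), "IOPS baseline,",
              gp2_burst_seconds(size), "s of burst")
```

A sustained write-heavy MongoDB workload on a small gp2 volume can drain the bucket and fall back to the baseline, which is the case where PIOPS is the stable choice.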

Slide 30

Slide 30 text

Randall's Best Practices

Slide 31

Slide 31 text

Instance Configuration
• Use Amazon Linux on HVM
• Install from the mongodb repo with yum
• Use PIOPS EBS for consistent performance
• Use EBS optimized

Slide 32

Slide 32 text

Storage Settings (more important)
• Separate EBS volumes for data, log, journal
• Use EXT4 or XFS (but XFS can be bad)
• Set read ahead to 16k (blockdev --setra) (for EBS volumes only)
• No need for RAID 10, just use RAID 0
• Mount noatime, noexec, nodiratime
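One detail that is easy to get wrong: blockdev --setra takes its value in 512-byte sectors, not kilobytes. The Python sketch below converts a read-ahead size to sectors and builds (without running) the command and a matching fstab line. It reads "16k" as 16 KiB, and the device name and mount point are hypothetical.

```python
SECTOR_BYTES = 512  # blockdev --setra counts 512-byte sectors


def readahead_sectors(kib):
    """Convert a read-ahead size in KiB to the sector count blockdev expects."""
    return kib * 1024 // SECTOR_BYTES


def setra_command(device, kib=16):
    """Build the blockdev command as an argument list; it is not executed here."""
    return ["blockdev", "--setra", str(readahead_sectors(kib)), device]


def fstab_line(device, mount_point, fs_type="ext4"):
    """Build an fstab entry using the mount options listed on the slide."""
    options = "defaults,noatime,nodiratime,noexec"
    return "%s %s %s %s 0 0" % (device, mount_point, fs_type, options)


if __name__ == "__main__":
    # /dev/xvdf and /data are placeholder names for an attached EBS volume.
    print(" ".join(setra_command("/dev/xvdf")))
    print(fstab_line("/dev/xvdf", "/data"))
```

Running the resulting command needs root, and the setting does not persist across reboots unless it is reapplied at boot.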

Slide 33

Slide 33 text

Networking Settings
• Set ulimits to max
• Set TCP keepalive to very high
• Use CNAMEs for servers and configuration so that it's easy to upgrade/add/remove instances
• Use an IAM mongodb role
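Before tuning, it helps to see what a host is currently set to. The Python sketch below reads the open-file limit of the current process and the kernel's TCP keepalive time on Linux. The 64000 threshold is an assumption taken from commonly cited MongoDB guidance, and the script only reports values; it does not change them.

```python
import resource

RECOMMENDED_NOFILE = 64000  # assumption: commonly cited open-files value


def nofile_ok(minimum=RECOMMENDED_NOFILE):
    """Return True if the soft open-files limit is unlimited or at least `minimum`."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft == resource.RLIM_INFINITY or soft >= minimum


def read_keepalive_seconds(path="/proc/sys/net/ipv4/tcp_keepalive_time"):
    """Return the kernel TCP keepalive time in seconds, or None if unavailable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    print("open-files limit ok:", nofile_ok())
    print("tcp_keepalive_time:", read_keepalive_seconds())
```

The limits that matter are those of the mongod process itself, so run a check like this as the same user and under the same init system that starts mongod.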

Slide 34

Slide 34 text

Metrics (CloudWatch + Alerting)
• Storage:
  – Queue depth
  – Read/Write latency
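CloudWatch does not report EBS latency directly. It is usually derived by dividing the summed VolumeTotalReadTime (or VolumeTotalWriteTime) by the summed VolumeReadOps (or VolumeWriteOps) for the same period. The Python sketch below does that arithmetic on numbers you have already fetched; the sample values and alert thresholds are hypothetical.

```python
def avg_latency_ms(total_time_seconds, ops):
    """Average per-operation latency in milliseconds for one period.

    total_time_seconds: Sum of VolumeTotalReadTime or VolumeTotalWriteTime.
    ops: Sum of VolumeReadOps or VolumeWriteOps over the same period.
    """
    if ops == 0:
        return 0.0
    return total_time_seconds * 1000.0 / ops


def needs_attention(queue_length, read_ms, write_ms,
                    max_queue=8.0, max_latency_ms=10.0):
    """Hypothetical alert rule; tune both thresholds to your own workload."""
    return queue_length > max_queue or max(read_ms, write_ms) > max_latency_ms


if __name__ == "__main__":
    # Made-up sample datapoints for a single five-minute period.
    read_ms = avg_latency_ms(total_time_seconds=1.5, ops=3000)
    write_ms = avg_latency_ms(total_time_seconds=12.0, ops=6000)
    print(read_ms, write_ms, needs_attention(2.0, read_ms, write_ms))
```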

Slide 35

Slide 35 text

Tips and Tricks
• Use a delayed secondary
• dump/restore won't work in a timely manner for larger deployments; use MMS backup or EBS snapshots instead: https://github.com/msaffitz/mongolly
• Use MMS for everything.
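A delayed secondary is an ordinary replica set member with a few extra fields in its member document. The Python sketch below builds one as a plain dict. The host name and the one-hour delay are hypothetical, and slaveDelay is the field name used by MongoDB releases of this era.

```python
def delayed_member(member_id, host, delay_seconds=3600):
    """Build a member document for a hidden, delayed secondary."""
    return {
        "_id": member_id,
        "host": host,
        "priority": 0,                # a delayed member must never become primary
        "hidden": True,               # keep clients from reading stale data
        "slaveDelay": delay_seconds,  # how far behind the primary it stays
    }


if __name__ == "__main__":
    # Made-up host name; the document would be added to the replica set config.
    print(delayed_member(4, "mongo-delayed.example.internal:27017"))
```

The delay only helps if someone notices a bad write or dropped collection within that window, so pick it to match how quickly your alerting catches mistakes.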

Slide 36

Slide 36 text

Please give us your feedback on this presentation
#awspopuploft