MongoDB 2.8 and MongoDB on AWS

A few notes about running MongoDB on AWS


Randall Hunt

December 05, 2014


  1. © 2014 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
    December 05, 2014 | SFO AWS Pop Up Loft
    MongoDB on AWS
    J. Randall Hunt, AWS, @jrhunt
  2. MongoDB

  3. Assumptions
    • You're familiar with MongoDB
    • You've run production workloads with MongoDB
    • You're familiar with NoSQL vocabulary
    • If you have questions, ask them now.
  4. Disclaimer
    • I don't work for MongoDB
    • Some of this could be wrong
    • Run all benchmarks yourself!
    • Test your workloads now
  5. MongoDB 2.8 Major Changes
    • Pluggable Storage Engines
    • Collection Level Locking
    • Larger Replica Set Support
    • Better Authentication Methods
    • Better Query Introspection
    • Toolchain rewritten in Go
  6. Pluggable Storage Engine
    • WiredTiger:
      – Storage engine written for modern hardware
      – Focuses on efficient use of RAM
    • MMAPv1:
      – Default, traditional
      – Collection Level Locking
      – What you're used to
  7. WiredTiger
    • Mixed storage formats (B-tree, column, LSM)
    • Non-blocking access and write implementations
    • No in-place updates
    • Use cache resources
    • Compression choices
  8. WiredTiger Compression
    storage:
      dbPath: "/data/wt_snappy"
      engine: "wiredTiger"
  9. WiredTiger Compression, zlib
    storage:
      dbPath: "/ssd/db/wt_zlib"
      engine: "wiredTiger"
      wiredTiger:
        collectionConfig: "block_compressor=zlib"
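
To get a rough feel for the "compression choices" trade-off, here is a minimal sketch that uses Python's standard `zlib` module (not WiredTiger itself) on a block of made-up JSON documents. The sample data and the `sample_block` / `compressed_sizes` helpers are hypothetical, and the ratios it prints say nothing about what WiredTiger would achieve on disk; benchmark your own workload.

```python
import json
import zlib


def sample_block(n_docs=200):
    """Build a block of repetitive JSON documents (hypothetical sample data)."""
    docs = [
        {"_id": i, "user": f"user{i % 10}", "status": "active", "tags": ["aws", "mongodb"]}
        for i in range(n_docs)
    ]
    return "\n".join(json.dumps(d) for d in docs).encode("utf-8")


def compressed_sizes(block, levels=(1, 6, 9)):
    """Return {zlib level: compressed size in bytes} for the given block."""
    return {level: len(zlib.compress(block, level)) for level in levels}


block = sample_block()
sizes = compressed_sizes(block)

print("raw bytes:", len(block))
for level, size in sizes.items():
    # Higher levels spend more CPU looking for a smaller output.
    print(f"zlib level {level}: {size} bytes ({size / len(block):.1%} of raw)")
```

Repetitive documents like these compress well; real collections may not.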
  10. Larger Replica Set Support
    • Replica sets can now have up to 50 members
    • Still only 7 voting members
    • Replica set code rewritten, many more tests
    • More stable (in my tests)
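
The 50-member / 7-voter limits can be sketched as a replica set configuration document. This is a minimal illustration, assuming the usual `_id` / `members` / `votes` / `priority` layout of a replica set config; the set name `rs0`, the host names, and the `build_replset_config` helper are hypothetical. Non-voting members are given priority 0 here so they can never be elected.

```python
MAX_MEMBERS = 50  # member limit from the slide
MAX_VOTING = 7    # voting-member limit from the slide


def build_replset_config(name, hosts, voting=MAX_VOTING):
    """Build a replica set config dict where only the first `voting` hosts vote."""
    if len(hosts) > MAX_MEMBERS:
        raise ValueError(f"at most {MAX_MEMBERS} members, got {len(hosts)}")
    if voting > MAX_VOTING:
        raise ValueError(f"at most {MAX_VOTING} voting members, got {voting}")
    members = []
    for i, host in enumerate(hosts):
        is_voter = i < voting
        members.append({
            "_id": i,
            "host": host,
            "votes": 1 if is_voter else 0,
            # Non-voting members get priority 0 so they are never elected primary.
            "priority": 1 if is_voter else 0,
        })
    return {"_id": name, "members": members}


# Hypothetical host names, for illustration only.
hosts = [f"mongo{i:02d}.example.internal:27017" for i in range(50)]
config = build_replset_config("rs0", hosts)

print("members:", len(config["members"]))
print("voters:", sum(m["votes"] for m in config["members"]))
```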
  11. SCRAM-SHA-1 Authentication
    • Salted Challenge Response Authentication Mechanism
    • SASL mechanism
    • RFC 5802
    • Even if data is taken, damage is limited
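
Why stolen data does limited damage: under RFC 5802 the server stores only derived keys, never the password. The sketch below walks through the RFC's key derivation and proof check with Python's standard library. It is a generic illustration, not MongoDB's implementation; the password, iteration count, and the placeholder `auth_message` are assumptions (in the real protocol the auth message is built from the exchanged SCRAM messages).

```python
import hashlib
import hmac
import os


def hi(password: bytes, salt: bytes, iterations: int) -> bytes:
    """RFC 5802 Hi(): PBKDF2 with HMAC-SHA-1 and a 20-byte output."""
    return hashlib.pbkdf2_hmac("sha1", password, salt, iterations)


def h_mac(key: bytes, msg: bytes) -> bytes:
    return hmac.new(key, msg, hashlib.sha1).digest()


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def server_credentials(password: bytes, salt: bytes, iterations: int):
    """What the server keeps: StoredKey and ServerKey, not the password."""
    salted = hi(password, salt, iterations)
    client_key = h_mac(salted, b"Client Key")
    stored_key = hashlib.sha1(client_key).digest()
    server_key = h_mac(salted, b"Server Key")
    return stored_key, server_key


def client_proof(password: bytes, salt: bytes, iterations: int, auth_message: bytes) -> bytes:
    """ClientProof = ClientKey XOR HMAC(StoredKey, AuthMessage)."""
    salted = hi(password, salt, iterations)
    client_key = h_mac(salted, b"Client Key")
    stored_key = hashlib.sha1(client_key).digest()
    return xor(client_key, h_mac(stored_key, auth_message))


def server_verify(stored_key: bytes, auth_message: bytes, proof: bytes) -> bool:
    """Recover ClientKey from the proof and compare its hash to StoredKey."""
    recovered_client_key = xor(proof, h_mac(stored_key, auth_message))
    return hmac.compare_digest(hashlib.sha1(recovered_client_key).digest(), stored_key)


salt = os.urandom(16)
auth_message = b"placeholder-auth-message"  # assumption: stands in for the real SCRAM messages
stored_key, server_key = server_credentials(b"pencil", salt, 4096)

good = client_proof(b"pencil", salt, 4096, auth_message)
bad = client_proof(b"wrong", salt, 4096, auth_message)
print("correct password accepted:", server_verify(stored_key, auth_message, good))
print("wrong password accepted:", server_verify(stored_key, auth_message, bad))
```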
  12. Query Introspection
    • .explain() works on every operation
    • .explain() provides detailed information about the query planner
  13. Entire Toolchain Rewritten in Go
    • https://github.com/mongodb/mongo-tools/
    • bsondump (explore BSON files)
    • mongoimport/export (CSV/JSON import/export)
    • mongodump/restore (BSON import/export)
    • mongostat (monitor servers or clusters)
    • mongofiles (IO for GridFS)
    • mongooplog (replay oplog entries)
    • mongotop (like top for mongod)
  14. MMS Automation https://docs.mms.mongodb.com/tutorial/configure-aws-settings/

  15. Running On AWS

  16. AWS Marketplace
    • https://aws.amazon.com/marketplace/cp/MongoDB/
    • Free (still pay for instances)
    • Tested
    • Optimized
  17. CloudFormation Templates
    • https://github.com/crcsmnky/aws-cfn-mongodb
    • Not terribly up to date
    • Doesn't work in every region
    • Infrastructure as code!
    • Easy to fork and edit
    • Good bootstrapping for Chef, Puppet, etc.
  18. Runtime Options
    • Deploy on generic EC2
    • Deploy as an OpsWorks stack
    • Deploy with Elastic Beanstalk (so cool)
    • Deploy with EMR integration
  19. OpsWorks

  20. OpsWorks

  21. Elastic Beanstalk
    • Autoscale App Servers
    • Autoscale mongos
    • Manage application/drivers in one place
  22. [Diagram labels: Elastic Load Balancer; App Server + mongos (×3); MongoDB Cluster; Security Group; Auto Scaling Group]
  23. Elastic MapReduce
    • https://github.com/mongodb/mongo-hadoop
    • MongoDB-Hadoop bi-directional connector
    • Hive and Pig integrations
    • Works with BSON or live MongoDB
  24. Infrastructure
    • EIPs
    • VPC
    • IAM
    • Route53 (DNS)
    • CloudWatch
  25. EC2 Instance Types
    • General Purpose (t2, m3)
    • Compute Optimized (c3)
    • GPU (G2)
    • Memory Optimized (R3)
    • Storage Optimized (I2, HS1)
  26. Sizing Details
    mongod | core process | memory, network, storage | m3, i2, r3
    config server | sharding metadata | low power | t2, m3
    mongos | query router | low power | deploy with app servers
  27. Replica Set: Highly Available
    • 5 Servers
    • 2 Regions
    • 3 AZs
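
To see what a 5-server, 2-region, 3-AZ layout can survive, here is a minimal sketch of the majority check. The slide does not say how the five servers are spread out, so the 2/2/1 placement, the region and AZ names, and the `survives` helper are all assumptions; the sketch also assumes every member votes.

```python
# Hypothetical placement: (region, availability zone) for each of the 5 members.
MEMBERS = [
    ("us-east-1", "us-east-1a"),
    ("us-east-1", "us-east-1a"),
    ("us-east-1", "us-east-1b"),
    ("us-east-1", "us-east-1b"),
    ("us-west-2", "us-west-2a"),
]


def majority(n_voters):
    """Smallest number of voters that is more than half."""
    return n_voters // 2 + 1


def survives(members, lost_region=None, lost_az=None):
    """True if the remaining members can still elect a primary."""
    remaining = [
        (region, az) for region, az in members
        if region != lost_region and az != lost_az
    ]
    return len(remaining) >= majority(len(members))


for az in sorted({az for _, az in MEMBERS}):
    print(f"lose AZ {az}: primary electable = {survives(MEMBERS, lost_az=az)}")
for region in sorted({region for region, _ in MEMBERS}):
    print(f"lose region {region}: primary electable = {survives(MEMBERS, lost_region=region)}")
```

With this particular placement any single AZ can be lost, but losing the region that holds four of the five members leaves no majority.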
  28. Questions To Ask For Production
    • How much availability do I really need?
    • How much IO do I really need?
    • Can I withstand the loss of a zone? Of a region?
    • Where are my users coming from?
    • What are my security requirements?
  29. Storage Options
    • EBS
      – GP2 (fast, burstable, credit system)
      – PIOPS (stable)
      – Magnetic
    • Instance
      – Ephemeral (fast but scary)
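
The GP2 "credit system" can be worked through with a little arithmetic. This is a sketch under assumed figures: a baseline of 3 IOPS per GiB with a 100 IOPS floor, bursts up to 3,000 IOPS, and a bucket of 5.4 million I/O credits. These numbers are recalled from AWS's published GP2 description and are not on the slide; check the current EBS documentation before sizing anything.

```python
# Assumed GP2 parameters (verify against current AWS documentation).
BASELINE_IOPS_PER_GIB = 3
MIN_BASELINE_IOPS = 100
BURST_IOPS = 3000
CREDIT_BUCKET = 5_400_000  # I/O credits in a full bucket


def baseline_iops(size_gib):
    """Steady-state IOPS a volume of this size earns."""
    return max(MIN_BASELINE_IOPS, size_gib * BASELINE_IOPS_PER_GIB)


def burst_seconds(size_gib):
    """How long a full credit bucket sustains the burst rate."""
    base = baseline_iops(size_gib)
    if base >= BURST_IOPS:
        return float("inf")  # baseline already meets the burst rate
    # Credits drain at (burst - baseline) per second while bursting.
    return CREDIT_BUCKET / (BURST_IOPS - base)


for size in (10, 100, 500, 1000):
    print(f"{size} GiB: baseline {baseline_iops(size)} IOPS, "
          f"burst lasts {burst_seconds(size):.0f} s")
```

Once the bucket is empty the volume falls back to its baseline, which is why PIOPS is the "stable" option for consistent database load.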
  30. Randall's Best Practices

  31. Instance Configuration
    • Use Amazon Linux on HVM
    • Install from the MongoDB repo with yum
    • Use PIOPS EBS for consistent performance
    • Use EBS optimized
  32. Storage Settings (more important)
    • Separate EBS volumes for data, log, journal
    • Use EXT4 or XFS (but XFS can be bad)
    • Set read ahead to 16k (blockdev --setra) (for EBS volumes only)
    • No need for RAID 10, just use RAID 0
    • Mount with noatime, noexec, nodiratime
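
One detail worth spelling out for the read-ahead setting: `blockdev --setra` takes its value in 512-byte sectors, not kilobytes. A minimal sketch of the conversion, assuming "16k" on the slide means 16 KiB; the device name `/dev/xvdf` and the helper names are hypothetical. The script only prints the command; it does not run it.

```python
SECTOR_BYTES = 512  # blockdev --setra counts 512-byte sectors


def readahead_sectors(kib):
    """Convert a read-ahead size in KiB to 512-byte sectors."""
    return kib * 1024 // SECTOR_BYTES


def setra_command(device, kib=16):
    """Build (but do not run) the blockdev command for a device."""
    return ["blockdev", "--setra", str(readahead_sectors(kib)), device]


# Hypothetical EBS device name; substitute your own data volume.
print(" ".join(setra_command("/dev/xvdf")))
```

The current value can be read back with `blockdev --getra` on the same device.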
  33. Networking Settings
    • Set ulimits to max
    • Set TCP keepalive to very high
    • Use CNAMEs for servers and configuration so that it's easy to upgrade/add/remove instances
    • Use an IAM mongodb role
  34. Metrics (CloudWatch + Alerting)
    • Storage:
      – Queue depth
      – Read/Write latency
  35. Tips and Tricks
    • Use a delayed secondary
    • dump/restore won't work in a timely manner for larger deployments; use MMS backup or EBS snapshots instead: https://github.com/msaffitz/mongolly
    • Use MMS for everything.
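
For the delayed secondary tip, a member entry might look like the sketch below. It assumes the `slaveDelay` field name used by MongoDB releases of this era (later versions renamed it), and the one-hour delay, host name, and `delayed_member` helper are hypothetical. A delayed member is given priority 0 and hidden so it never becomes primary and clients never read stale data from it.

```python
def delayed_member(member_id, host, delay_seconds=3600):
    """Build a replica set member entry for a hidden, delayed secondary."""
    return {
        "_id": member_id,
        "host": host,
        "priority": 0,   # never eligible to become primary
        "hidden": True,  # invisible to client reads
        "slaveDelay": delay_seconds,  # assumed field name for this MongoDB era
    }


# Hypothetical host name, for illustration only.
member = delayed_member(4, "mongo-delayed.example.internal:27017")
print(member)
```

The delay is the window you have to notice a bad write or dropped collection before it reaches this copy.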
  36. Please give us your feedback on this presentation
    #awspopuploft