MongoDB (Or: How I learned to stop worrying, and love the meshuggah DB)

A high-level overview of MongoDB given by Jeff Hickson on May 9th, 2014, including its sharding capabilities and common gotchas.

VM Farms Inc.

June 02, 2014

Transcript

  1. How I know mongo so well. I have spent 30+ hours like this, talking to mongo people around the world, at all times of day.
  2. How mongo works. Main Components:
    • mongod
      ◦ config server
    • mongos
    • tools
    • driver (language)
  3. Components - Non-Sharded
    Mongod
    • Main database program.
    • Stores data, serves queries.
    Driver
    • Programming language component that lets you query mongo from $language.
  4. Components - Non-Sharded
    Tools!
    • Tailing the log. Very useful.
    • mongostat - like top for mongo.
    • mongotop - a less useful top for mongo.
    • mongodump/mongorestore - backup and load.
    • mongo - a mongo shell.
    • mongoimport/mongoexport - import from and export to other formats.
  5. Components - Sharded
    Mongos
    • Acts as a pass-through: serves queries to shards, assembles results.
      ◦ Sorts, merges results, etc.
    • Can be a performance bottleneck!
      ◦ Not single-threaded, it just sucks.
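The "sorts, merges results" step can be illustrated with a small stdlib-only sketch. This is not how mongos is implemented; it only assumes that each shard returns its own matches already sorted by the requested key, so the router has to merge them. The shard names, documents, and the `scatter_gather` helper are hypothetical.

```python
import heapq
from operator import itemgetter

# Hypothetical per-shard results: each shard returns its own matches,
# already sorted by the requested sort key ("age" here).
shard_results = {
    "shard0": [{"name": "ann", "age": 21}, {"name": "cid", "age": 40}],
    "shard1": [{"name": "bob", "age": 30}],
    "shard2": [{"name": "dee", "age": 19}, {"name": "eli", "age": 35}],
}


def scatter_gather(results_by_shard, sort_key, limit=None):
    """Merge per-shard sorted result lists into one globally sorted list."""
    merged = list(heapq.merge(*results_by_shard.values(), key=itemgetter(sort_key)))
    return merged if limit is None else merged[:limit]


merged_docs = scatter_gather(shard_results, "age")
names = [doc["name"] for doc in merged_docs]
print(names)
```

Every sorted query that touches several shards funnels through a merge like this on one process, which is one way to picture why the router can become the bottleneck.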
  6. Components - Sharded
    Mongod - Config Server
    • Normal mongod running with a special start option (--configsvr).
    • Holds metadata about what information is on what shard.
      ◦ Super-critical information. Cannot be actively regenerated.
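To make "metadata about what information is on what shard" concrete, here is a minimal sketch of a chunk table and a lookup against it. The layout is an assumption for illustration only: the bounds, shard names, and `shard_for` helper are hypothetical, and the real config database stores richer records.

```python
import bisect

# Hypothetical chunk table: chunk i covers shard-key values from
# CHUNK_LOWER_BOUNDS[i] up to (but not including) the next lower bound,
# and lives on exactly one shard.
CHUNK_LOWER_BOUNDS = [0, 1000, 2000, 3000]
CHUNK_OWNERS = ["shard0", "shard1", "shard0", "shard2"]


def shard_for(key):
    """Return the shard that owns the chunk containing `key`."""
    if key < CHUNK_LOWER_BOUNDS[0]:
        raise KeyError(f"no chunk covers key {key}")
    # Index of the last lower bound that is <= key.
    idx = bisect.bisect_right(CHUNK_LOWER_BOUNDS, key) - 1
    return CHUNK_OWNERS[idx]


print(shard_for(1500))
```

Without this table nothing records which server holds which key range, which is why losing it is so serious.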
  7. The Ws of shards.
    What?
    • It breaks your data into chunks and distributes them over a number of servers.
    Why?
    • To scale write performance. Read performance can scale with replicas.
  8. The Ws of shards.
    When?
    • Before you have write throughput issues.
    How?
    • Very carefully, and with some downtime.
      ◦ 2-3 restarts to enable it.
      ◦ Time to migrate data to other shards (days to weeks).
      ◦ Create a sharded cluster at the start.
  9. Turning on sharding.
    First, you need config servers.
    • You can only have one or three. Never have only one.
    Second, every server should have at least one replica.
    • No, config servers cannot have replicas (this is dumb).
  10. Turning on sharding.
    Third, prepare your shard keys.
    • Prepare these EXTREMELY carefully.
      ◦ These determine whether the cluster will hotspot and how much data it needs to migrate.
      ◦ Changing a shard key is extremely painful: hours of downtime.
    Fourth, add shards into the sharding config, and enable sharding on collections with your keys.
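Why the shard key decides whether the cluster hotspots can be shown with a small simulation. It is a sketch under stated assumptions: four hypothetical shards, a fixed key space, and SHA-256 standing in for a hash function (not the hash MongoDB itself uses). It compares range partitioning with hash partitioning for a burst of inserts whose key only ever increases, such as a counter or timestamp.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 4          # hypothetical cluster size
KEY_SPACE = 1_000_000   # hypothetical upper bound of the key range


def range_shard(key):
    """Range partitioning: contiguous key ranges map to the same shard."""
    return min(key * NUM_SHARDS // KEY_SPACE, NUM_SHARDS - 1)


def hashed_shard(key):
    """Hash partitioning: neighbouring keys scatter across shards."""
    digest = hashlib.sha256(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


# 10,000 inserts with a monotonically increasing key.
recent_keys = range(900_000, 910_000)

range_load = Counter(range_shard(k) for k in recent_keys)
hashed_load = Counter(hashed_shard(k) for k in recent_keys)

print("range :", dict(range_load))
print("hashed:", dict(hashed_load))
```

With the range scheme every insert in the burst lands on the last shard, while the hashed scheme spreads the same writes across all four. The trade-off is that hashing gives up cheap range queries on that key, which is part of why the choice deserves so much care.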
  11. Databases, collections, documents.
    Difference?
    • Mongo does not enforce schema. You can throw anything into anywhere.
      ◦ No Pancakes.

    Mongo        MySQL
    Database     Database
    Collection   Table
    Document     Row
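A minimal sketch of what "does not enforce schema" means in practice, with a collection modelled as a plain Python list of dicts. No driver or server is involved, and the collection name, fields, and helpers are all hypothetical.

```python
# A "collection" modelled as a plain Python list of dicts ("documents").
# Nothing stops documents from having different fields or types.
people = [
    {"_id": 1, "name": "ann", "email": "ann@example.com"},
    {"_id": 2, "name": "bob", "tags": ["admin", "ops"]},
    {"_id": 3, "nickname": "cid", "age": "forty"},  # no "name", age as a string
]


def find(collection, **criteria):
    """Return documents whose fields equal every given criterion."""
    return [
        doc for doc in collection
        if all(key in doc and doc[key] == value for key, value in criteria.items())
    ]


def fields_in_use(collection):
    """Collect every field name that appears in at least one document."""
    return sorted({key for doc in collection for key in doc})


print(find(people, name="bob"))
print(fields_in_use(people))
```

A table would reject the third record outright. Here it is accepted, so any code reading the collection has to cope with missing or oddly typed fields.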
  12. Indexes
    • Always use them.
    • Mongo is picky about using them.
    • Tack .explain() onto any query to see what mongo wants to do (the query plan).
    • Desperately try to avoid table scans.
    • Mongo is really slow without indexes.
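The cost gap between a table scan and an index lookup can be sketched without a database. The example below is an illustration, not MongoDB's index implementation: the index is a sorted list of (value, position) pairs searched with `bisect`, and the "examined" counts loosely mirror the kind of scanned-document numbers a query plan reports. The data and function names are hypothetical.

```python
import bisect

# 10,000 hypothetical documents; each age value 0..99 occurs exactly 100 times.
docs = [{"_id": i, "age": (i * 37) % 100} for i in range(10_000)]


def collection_scan(docs, age):
    """Look at every document; return the matches and how many were examined."""
    examined = 0
    hits = []
    for doc in docs:
        examined += 1
        if doc["age"] == age:
            hits.append(doc)
    return hits, examined


# A toy index on "age": (value, position) pairs kept in sorted order.
age_index = sorted((doc["age"], pos) for pos, doc in enumerate(docs))


def index_scan(docs, index, age):
    """Binary-search the index, then fetch only the matching documents."""
    lo = bisect.bisect_left(index, (age, -1))
    hi = bisect.bisect_right(index, (age, len(docs)))
    hits = [docs[pos] for _, pos in index[lo:hi]]
    return hits, hi - lo


scan_hits, scan_examined = collection_scan(docs, 42)
index_hits, index_examined = index_scan(docs, age_index, 42)
print(scan_examined, index_examined)
```

Both paths return the same 100 documents, but the scan touches all 10,000 while the index path touches only the matches.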