NoSQL Matters 2012 : MongoDB Sharding

cj_harris5

May 31, 2012

Transcript

  1. Replication
    • MongoDB replication is similar to MySQL replication: asynchronous master/slave
    • Replica sets: a cluster of N servers
    • All writes go to the primary
    • Reads can go to the primary (default) or to a secondary
    • Any one node can become primary
    • Consensus election of the primary
    • Automatic failover
    • Automatic recovery
  2. How MongoDB Replication works
    • A set is made up of 2 or more nodes
    (diagram: Member 1, Member 2, Member 3)
  3. How MongoDB Replication works
    • An election establishes the PRIMARY
    • Data replicates from the PRIMARY to the SECONDARYs
    (diagram: Member 1, Member 2 as Primary, Member 3)
  4. How MongoDB Replication works
    • The PRIMARY may fail
    • A new PRIMARY is elected automatically if a majority of the set remains
    (diagram: Member 2 is DOWN; Members 1 and 3 negotiate a new master)
  5. How MongoDB Replication works
    • New PRIMARY elected
    • Replica set re-established
    (diagram: Member 2 is DOWN; Member 3 is now Primary)
  6. How MongoDB Replication works
    • Automatic recovery
    (diagram: Member 3 is Primary; Member 2 is Recovering)
  7. How MongoDB Replication works
    • Replica set re-established
    (diagram: Member 3 is Primary; Members 1 and 2 are secondaries)
  8. Creating a Replica Set

    > cfg = {
        _id : "myset",
        members : [
          { _id : 0, host : "germany1.acme.com" },
          { _id : 1, host : "germany2.acme.com" },
          { _id : 2, host : "germany3.acme.com" }
        ]
      }
    > use admin
    > db.runCommand( { replSetInitiate : cfg } )
  9. Write scaling - add shards
    (diagram: writes and reads spread across shard1, shard2, and shard3, each a replica set with one primary and two secondaries)
  10. Big Data at a Glance
    (diagram: a large dataset with primary key "username", divided into chunks a, b, c, d, e, f, g, h ... s, t, u, v, w, x, y, z)
  11.–14. Big Data at a Glance
    • Systems like Google File System (which inspired Hadoop's HDFS) and MongoDB's sharding handle the scale problem by chunking
    • Break up pieces of data into smaller chunks, spread across many data nodes
    • Each data node contains many chunks
    • If a chunk gets too large or a node becomes overloaded, data can be rebalanced
    (diagram: a large dataset with primary key "username", divided into chunks a–z)
  15.–16. Big Data at a Glance
    • MongoDB sharding (as well as HDFS) breaks data into chunks (~64 MB)
    (diagram: a large dataset with primary key "username", divided into chunks a–z)
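    To make the chunking idea concrete, here is a small sketch in plain JavaScript (not MongoDB's implementation - the chunkify function and document shapes are made up for illustration). A sorted keyspace is cut into contiguous ranges, each capped at a maximum size, standing in for the ~64 MB chunks above:

```javascript
// Split a sorted list of (key, sizeMB) pairs into contiguous chunks,
// each holding at most maxMB of data (64 stands in for MongoDB's
// default 64 MB chunk size).
function chunkify(sortedDocs, maxMB) {
  const chunks = [];
  let current = { minKey: null, maxKey: null, sizeMB: 0 };
  for (const doc of sortedDocs) {
    if (current.sizeMB + doc.sizeMB > maxMB && current.sizeMB > 0) {
      chunks.push(current);                       // close the full chunk
      current = { minKey: null, maxKey: null, sizeMB: 0 };
    }
    if (current.minKey === null) current.minKey = doc.key;
    current.maxKey = doc.key;
    current.sizeMB += doc.sizeMB;
  }
  if (current.sizeMB > 0) chunks.push(current);
  return chunks;
}

// Usernames sorted by key, each "weighing" 30 MB for illustration.
const docs = ["alice", "bob", "carol", "dave", "zed"]
  .map(key => ({ key, sizeMB: 30 }));
const chunks = chunkify(docs, 64);
// Two 30 MB docs fit in a 64 MB chunk, so the five docs form 3 chunks.
```

    Because a chunk is just a key range, it can later be split or moved without touching the rest of the dataset.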
  17.–18. Scaling
    • Representing data as chunks allows many levels of scale across n data nodes
    (diagram: Data Nodes 1–4, each holding 25% of the chunks)
  19.–20. Scaling
    • The set of chunks can be evenly distributed across n data nodes
    (diagram: chunks a–z spread across Data Nodes 1–5)
  21.–22. Add Nodes: Chunk Rebalancing
    • The goal is equilibrium: an equal distribution. As nodes are added (or even removed), chunks can be redistributed for balance.
    (diagram: chunks a–z rebalanced across Data Nodes 1–5)
  23.–26. Writes Routed to Appropriate Chunk
    • Writes are efficiently routed to the appropriate node & chunk
    (diagram: a write to key "ziggy" routed to the z chunk)
  27.–31. Chunk Splitting & Balancing
    • If a chunk gets too large (the default in MongoDB is 64 MB per chunk), it is split into two new chunks
    (diagram: a write to key "ziggy" grows the z chunk, which splits into z1 and z2)
  32.–33. Chunk Splitting & Balancing
    • Each new part of the z chunk (left & right) now contains half of the keys
    • As chunks continue to grow and split, they can be rebalanced to keep an equal share of data on each server
    (diagram: z1 and z2 alongside chunks a–y on the data nodes)
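    A split can be sketched the same way (a hypothetical splitChunk helper, not MongoDB internals): the oversized chunk's keys are divided at the median so each half holds half of them, as in the z → z1/z2 example above:

```javascript
// Illustrative split: divide one chunk's sorted keys at the median so
// each new chunk holds half of the keys (shapes are made up).
function splitChunk(chunk) {
  const mid = Math.floor(chunk.keys.length / 2);
  return [
    { name: chunk.name + "1", keys: chunk.keys.slice(0, mid) }, // left half
    { name: chunk.name + "2", keys: chunk.keys.slice(mid) },    // right half
  ];
}

const z = { name: "z", keys: ["zach", "zara", "zeke", "ziggy"] };
const [z1, z2] = splitChunk(z);
// z1 holds ["zach", "zara"]; z2 holds ["zeke", "ziggy"]
```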
  34.–36. Reads with Key Routed Efficiently
    • Reading a single value by primary key: the read is routed efficiently to the specific chunk containing the key
    (diagram: a read of key "xavier" routed to the x chunk)
  37.–38. Reads with Key Routed Efficiently
    • Reading multiple values by primary key: reads are routed efficiently to the specific chunks in the range
    (diagram: a read of keys "T" → "X" routed to the t, u, v, w, and x chunks)
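    Range reads work because chunk ranges are contiguous and sorted. A sketch (the chunk layout is hypothetical, mirroring the letter chunks in the diagrams) of selecting only the chunks that overlap a queried key range:

```javascript
// One chunk per first letter, as in the diagrams: each covers all keys
// starting with that letter ("\uffff" is a sentinel upper bound).
const chunkRanges = "abcdefghstuvwxyz".split("").map(c => ({
  min: c,               // inclusive lower bound
  max: c + "\uffff",    // upper bound of this letter's keys
}));

function chunksForRange(lo, hi) {
  // A chunk overlaps [lo, hi] unless it ends before lo or starts after hi.
  return chunkRanges.filter(ch => !(ch.max < lo || ch.min > hi));
}

const hit = chunksForRange("t", "x\uffff").map(ch => ch.min);
// Only the t, u, v, w, x chunks are consulted; a..h and y, z are skipped.
```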
  39. MongoDB Sharding
    • Automatic partitioning and management
    • Range based
    • Convert to a sharded system with no downtime
    • Fully consistent
  40. How MongoDB Sharding works

    > db.posts.save( {age:40} )

    • Data is inserted
    • Ranges are split into more "chunks"
    (diagram: the range -∞..+∞ splits into -∞..40 and 41..+∞)
  41. How MongoDB Sharding works

    > db.posts.save( {age:40} )
    > db.posts.save( {age:50} )
    > db.posts.save( {age:60} )

    (diagram: the range -∞..+∞ splits successively into -∞..40, 41..50, 51..60, and 61..+∞)
  42. mongos
    • Shard router
    • Acts just like a mongod to clients
    • Run 1, or as many as you want
    • Can run on app servers
    • Caches metadata from the config servers
  43. Config Servers
    • There are 3 of them
    • Changes use two-phase commit
    • If any are down, metadata goes read-only
    • The system stays online as long as at least 1 of the 3 is up
  44. Keys

    { name: "Jared", email: "[email protected]" }
    { name: "Scott", email: "[email protected]" }
    { name: "Dan", email: "[email protected]" }

    > db.runCommand( { shardcollection: "test.users", key: { email: 1 } } )
  45. Chunks

    Min Key            Max Key            Shard
    -∞                 [email protected]    1
    [email protected]    [email protected]    1
    [email protected]    +∞                 1

    • Stored in the config servers
    • Cached in mongos
    • Used to route requests and keep the cluster balanced
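    Routing with such a table can be sketched as a simple range lookup (the table contents below are placeholders, since the real min/max keys above are redacted; the helper is hypothetical):

```javascript
// A chunk table like the one above: each entry maps the key range
// [min, max) to the shard that owns it (illustrative values only).
const chunkTable = [
  { min: "-inf", max: "h",    shard: 1 },
  { min: "h",    max: "t",    shard: 2 },
  { min: "t",    max: "+inf", shard: 3 },
];

function shardForKey(key) {
  const chunk = chunkTable.find(ch =>
    (ch.min === "-inf" || key >= ch.min) &&
    (ch.max === "+inf" || key < ch.max));
  return chunk.shard;
}

// mongos would consult its cached copy of this table for every request.
const target = shardForKey("jared");   // "jared" falls in [h, t) → shard 2
```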
  46. Balancing
    (diagram: a mongos with its balancer and 3 config servers in front of Shards 1–4; chunks 1–12 live on Shard 1, 13–24 on Shard 2, 25–36 on Shard 3, and 37–48 on Shard 4)
  47. Balancing
    (diagram: Shard 1 holds twelve chunks (1–12) while Shards 2–4 hold only four each (21–24, 33–36, 45–48) - labelled "Imbalance")
  48. Balancing
    (diagram: the balancer instructs "Move chunk 1 to Shard 2")
  49.–50. Balancing
    (diagram: chunk 1 is migrated from Shard 1 to Shard 2)
  51. Balancing
    (diagram: "Chunk 1 now lives on Shard 2")
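    The balancer's behaviour in slides 46–51 can be sketched as: find the most and least loaded shards and migrate one chunk when the gap is large enough (the threshold value and data shapes here are illustrative, not MongoDB's exact policy):

```javascript
// One balancing round: if the chunk-count gap between the most and
// least loaded shards exceeds a threshold, migrate one chunk across.
// Shard contents are arrays of chunk ids (hypothetical shapes).
function balanceOnce(shards, threshold) {
  const names = Object.keys(shards);
  const most  = names.reduce((a, b) => shards[a].length >= shards[b].length ? a : b);
  const least = names.reduce((a, b) => shards[a].length <= shards[b].length ? a : b);
  if (shards[most].length - shards[least].length <= threshold) return null;
  const chunk = shards[most].shift();   // move one chunk off the loaded shard
  shards[least].push(chunk);
  return { chunk, from: most, to: least };
}

const shards = {
  shard1: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
  shard2: [21, 22, 23, 24],
  shard3: [33, 34, 35, 36],
  shard4: [45, 46, 47, 48],
};
const move = balanceOnce(shards, 7);   // gap of 8 chunks → migrate one
// chunk 1 moves from shard1 to shard2, as in the slides
```

    Repeating rounds like this until the gap falls under the threshold is what drives the cluster toward equilibrium.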
  52. Switching Request
    1. Query arrives at mongos
    2. mongos routes the query to a single shard
    3. The shard returns the results of the query
    4. Results are returned to the client
    (diagram: mongos in front of Shards 1–3)
  53. Scatter Gather
    1. Query arrives at mongos
    2. mongos broadcasts the query to all shards
    3. Each shard returns results for the query
    4. Results are combined and returned to the client
    (diagram: mongos broadcasting to Shards 1–3)
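    The two query paths can be contrasted in a toy model (everything here - the data, the shardFor helper, and the key layout - is hypothetical; "email" plays the role of the shard key):

```javascript
// Targeted vs. scatter-gather: if the query filters on the shard key,
// mongos asks one shard; otherwise it must broadcast to all of them.
const shards = [
  [{ email: "a@x", state: "CA" }],
  [{ email: "j@x", state: "NY" }],
  [{ email: "z@x", state: "CA" }],
];
const shardKey = "email";
const shardFor = email => email < "h" ? 0 : email < "s" ? 1 : 2;

function query(filter) {
  const targets = shardKey in filter
    ? [shards[shardFor(filter[shardKey])]]   // routed to a single shard
    : shards;                                // broadcast to every shard
  // Each shard filters locally; mongos combines the results.
  const results = targets.flatMap(docs =>
    docs.filter(d => Object.entries(filter).every(([k, v]) => d[k] === v)));
  return { shardsQueried: targets.length, results };
}

const routed    = query({ email: "j@x" });   // 1 shard consulted
const scattered = query({ state: "CA" });    // all 3 shards consulted
```

    This is why choosing a shard key that appears in your common queries matters: it turns broadcasts into single-shard requests.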
  54. Writes

    Inserts (require the shard key):
      db.users.insert({ name: "Jared", email: "[email protected]" })
    Removes (routed):
      db.users.remove({ email: "[email protected]" })
    Removes (scattered):
      db.users.remove({ name: "Jared" })
    Updates (routed):
      db.users.update( { email: "[email protected]" }, { $set: { state: "CA" } } )
    Updates (scattered):
      db.users.update( { state: "FZ" }, { $set: { state: "CA" } } )
  55. Queries

    By shard key (routed):
      db.users.find({ email: "[email protected]" })
    Sorted by shard key (routed in order):
      db.users.find().sort({ email: -1 })
    By non-shard key (scatter gather):
      db.users.find({ state: "CA" })
    Sorted by non-shard key (distributed merge sort):
      db.users.find().sort({ state: 1 })