Slide 1

Slide 1 text

Sharding with MongoDB Tyler Brock [email protected] @TylerBrock

Slide 2

Slide 2 text

Philosophy Concepts Architecture Mechanics

Slide 3

Slide 3 text

Philosophy

Slide 4

Slide 4 text

Philosophy MongoDB is a database for developers.

Slide 5

Slide 5 text

Build Philosophy

Slide 6

Slide 6 text

Build Scale Philosophy

Slide 7

Slide 7 text

How to Draw an Owl Philosophy

Slide 8

Slide 8 text

How to Draw an Owl Philosophy

Slide 9

Slide 9 text

Philosophy: Draw Two Circles
> db.runCommand({ enablesharding: "<dbname>" })
> db.runCommand({ shardcollection: "<namespace>", key: <key pattern> })

Slide 10

Slide 10 text

Concepts

Slide 11

Slide 11 text

datastore app Read/Write Simple Web Application

Slide 12

Slide 12 text

What happens when your working set exceeds memory?

Slide 13

Slide 13 text

What happens if your write load is enormous?

Slide 14

Slide 14 text

datastore app Vertical Scaling

Slide 15

Slide 15 text

app Vertical Scaling datastore

Slide 16

Slide 16 text

app Vertical Scaling datastore app app 68 GB Ram Raid10 EBS

Slide 17

Slide 17 text

datastore app Vertical Scaling app app 128 GB Ram Raid10 SSD

Slide 18

Slide 18 text

No content

Slide 19

Slide 19 text

app datastore datastore datastore Horizontal Scaling 60gb

Slide 20

Slide 20 text

app datastore datastore datastore 20gb 20gb 20gb Horizontal Scaling

Slide 21

Slide 21 text

Routing Logic app datastore datastore datastore 20gb 20gb 20gb Horizontal Scaling metadata

Slide 22

Slide 22 text

Routing Logic app datastore datastore datastore 20gb 20gb Horizontal Scaling metadata 60gb

Slide 23

Slide 23 text

app Routing Logic Balancer datastore datastore datastore 20gb 20gb Horizontal Scaling metadata 60gb

Slide 24

Slide 24 text

app Routing Logic Balancer datastore datastore datastore Horizontal Scaling metadata 30gb 30gb 30gb

Slide 25

Slide 25 text

Architecture

Slide 26

Slide 26 text

Shard
Where your data lives
Really just a mongod (or replica set)

Slide 27

Slide 27 text

Config Server
mongod started with the --configsvr option
Must have 3 (or 1 in development)
Data is committed using two-phase commit

Slide 28

Slide 28 text

mongos
Acts as the shard router / proxy
One or as many as you want
Lightweight: can run on app servers
Caches metadata from config servers

Slide 29

Slide 29 text

Routing Logic Balancing metadata datastore datastore datastore

Slide 30

Slide 30 text

metadata datastore mongos datastore datastore

Slide 31

Slide 31 text

metadata datastore mongos datastore datastore app

Slide 32

Slide 32 text

datastore mongos config datastore datastore app

Slide 33

Slide 33 text

datastore mongos config datastore datastore config config app

Slide 34

Slide 34 text

mongos config mongod mongod mongod config config app

Slide 35

Slide 35 text

mongos config mongod mongod mongod mongod mongod mongod mongod mongod mongod RS RS RS config config app

Slide 36

Slide 36 text

mongos config mongod mongod mongod mongod mongod mongod mongod mongod mongod RS RS RS config config app

Slide 37

Slide 37 text

Mechanics

Slide 38

Slide 38 text

How does MongoDB balance my data?

Slide 39

Slide 39 text

Keys (test.users)
{ name: "Joe", email: "[email protected]" },
{ name: "Bob", email: "[email protected]" },
{ name: "Tyler", email: "[email protected]" }

Slide 40

Slide 40 text

Keys (test.users)
> db.runCommand({ shardcollection: "test.users", key: { email: 1 } })
{ name: "Joe", email: "[email protected]" },
{ name: "Bob", email: "[email protected]" },
{ name: "Tyler", email: "[email protected]" }

Slide 41

Slide 41 text

Keys (test.users)
shardcollection: "test.users", key: { email: 1 }
{ name: "Joe", email: "[email protected]" },
{ name: "Bob", email: "[email protected]" },
{ name: "Tyler", email: "[email protected]" }

Slide 42

Slide 42 text

Keys (test.users)
key: { email: 1 }
{ name: "Joe", email: "[email protected]" },
{ name: "Bob", email: "[email protected]" },
{ name: "Tyler", email: "[email protected]" }

Slide 43

Slide 43 text

Chunks -∞ +∞

Slide 44

Slide 44 text

Slide 45

Slide 45 text

Slide 46

Slide 46 text

Chunks -∞ +∞ Split! This is a chunk This is a chunk [email protected] [email protected] [email protected]
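To make the idea of a chunk concrete, here is a minimal sketch of one chunk covering the whole key range being split in two. The median split policy, the `MIN`/`MAX` sentinels, the `splitChunk` helper, and the example.com addresses are all assumptions for illustration; the slides do not say how MongoDB actually picks a split point.

```javascript
// Sentinels standing in for the -∞ and +∞ bounds on the slide.
const MIN = Symbol("-inf");
const MAX = Symbol("+inf");

// A chunk is treated as a half-open range [min, max) of shard key values.
// Split it at the median of the keys it holds (an assumed policy).
function splitChunk(chunk, keys) {
  const sorted = [...keys].sort();
  const splitPoint = sorted[Math.floor(sorted.length / 2)];
  return [
    { min: chunk.min, max: splitPoint },
    { min: splitPoint, max: chunk.max },
  ];
}

// Hypothetical email addresses used as shard key values.
const emails = ["joe@example.com", "bob@example.com", "tyler@example.com"];
const [left, right] = splitChunk({ min: MIN, max: MAX }, emails);
console.log(left.max, right.min); // both are the chosen split point
```

After the split, every key still belongs to exactly one chunk, because the left chunk ends where the right one begins.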

Slide 47

Slide 47 text

Slide 48

Slide 48 text

Slide 49

Slide 49 text

Slide 50

Slide 50 text

Splitting config config config mongos Shard 1 Shard 2 Shard 3 Shard 4

Slide 51

Slide 51 text

Splitting config config config mongos Shard 1 Shard 2 Shard 3 Shard 4

Slide 52

Slide 52 text

Splitting config config config mongos Shard 1 Shard 2 Shard 3 Shard 4

Slide 53

Slide 53 text

Splitting config config config mongos Shard 1 Shard 2 Shard 3 Shard 4

Slide 54

Slide 54 text

Splitting config config config mongos Shard 1 Shard 2 Shard 3 Shard 4 Split this big chunk into 2 chunks

Slide 55

Slide 55 text

Splitting config config config mongos Shard 1 Shard 2 Shard 3 Shard 4

Slide 56

Slide 56 text

Splitting config config config mongos Shard 1 Shard 2 Shard 3 Shard 4 These chunks have split

Slide 57

Slide 57 text

Balancing config config config mongos Shard 1 Shard 2 Shard 3 Shard 4

Slide 58

Slide 58 text

Balancing config config config mongos Shard 1 Shard 2 Shard 3 Shard 4 Shard1, move a chunk to Shard2

Slide 59

Slide 59 text

Balancing config config config mongos Shard 1 Shard 2 Shard 3 Shard 4

Slide 60

Slide 60 text

Balancing config config config mongos Shard 1 Shard 2 Shard 3 Shard 4 Shard1, move another chunk to Shard3

Slide 61

Slide 61 text

Balancing config config config mongos Shard 1 Shard 2 Shard 3 Shard 4

Slide 62

Slide 62 text

Balancing config config config mongos Shard 1 Shard 2 Shard 3 Shard 4 Shard1, move another chunk to Shard4

Slide 63

Slide 63 text

Balancing config config config mongos Shard 1 Shard 2 Shard 3 Shard 4

Slide 64

Slide 64 text

Balancing config config config mongos Shard 1 Shard 2 Shard 3 Shard 4
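The balancing sequence above can be sketched as a loop that keeps moving one chunk from the fullest shard to the emptiest one. This is a simplified model, not the real balancer: the `balance` function, the chunk counts, and the "stop when the difference is at most one" rule are assumptions.

```javascript
// Move one chunk at a time from the shard with the most chunks to the
// shard with the fewest, until the counts differ by at most one.
function balance(chunkCounts) {
  const counts = { ...chunkCounts };
  const shards = Object.keys(counts);
  const moves = [];
  for (;;) {
    const from = shards.reduce((a, b) => (counts[b] > counts[a] ? b : a));
    const to = shards.reduce((a, b) => (counts[b] < counts[a] ? b : a));
    if (counts[from] - counts[to] <= 1) break;
    counts[from] -= 1;
    counts[to] += 1;
    moves.push({ from, to });
  }
  return { counts, moves };
}

// Hypothetical starting point: all four chunks live on shard1.
const result = balance({ shard1: 4, shard2: 0, shard3: 0, shard4: 0 });
console.log(result.moves);
console.log(result.counts);
```

With this starting point the moves mirror the slides: shard1 hands one chunk each to shard2, shard3, and shard4.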

Slide 65

Slide 65 text

How does MongoDB route my queries?

Slide 66

Slide 66 text

Routed Request mongos shard shard shard

Slide 67

Slide 67 text

Routed Request 1 mongos shard shard shard 1. Query arrives at Mongos

Slide 68

Slide 68 text

Routed Request 1 2 mongos shard shard shard 1. Query arrives at Mongos 2. Mongos routes query to a single shard

Slide 69

Slide 69 text

Routed Request 1 2 3 mongos shard shard shard 1. Query arrives at Mongos 2. Mongos routes query to a single shard 3. Shard returns results of query

Slide 70

Slide 70 text

Routed Request
1. Query arrives at mongos
2. mongos routes query to a single shard
3. Shard returns results of query
4. Results returned to client
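A routed request depends on the router knowing which chunk, and therefore which shard, owns a given shard key value. The sketch below models that lookup with a small in-memory chunk table. The table contents, the `null` convention for unbounded ends, and the `routeByShardKey` name are assumptions for illustration.

```javascript
// Hypothetical chunk metadata, as a router might cache it.
// null means "unbounded" on that side of the range.
const chunks = [
  { min: null, max: "h", shard: "shard1" },
  { min: "h", max: "p", shard: "shard2" },
  { min: "p", max: null, shard: "shard3" },
];

// Return the single shard whose chunk range [min, max) contains the key.
function routeByShardKey(chunkTable, key) {
  const chunk = chunkTable.find(
    (c) => (c.min === null || key >= c.min) && (c.max === null || key < c.max)
  );
  return chunk.shard;
}

// Hypothetical email addresses as shard key values.
console.log(routeByShardKey(chunks, "bob@example.com"));
console.log(routeByShardKey(chunks, "joe@example.com"));
console.log(routeByShardKey(chunks, "tyler@example.com"));
```

Because the ranges do not overlap and cover the whole key space, each key resolves to exactly one shard.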

Slide 71

Slide 71 text

Scatter Gather Request shard shard shard mongos

Slide 72

Slide 72 text

Scatter Gather Request 1 1. Query arrives at Mongos shard shard shard mongos

Slide 73

Slide 73 text

Scatter Gather Request 1 1. Query arrives at Mongos 2 2 2 shard shard shard mongos 2. Mongos broadcasts query to all shards

Slide 74

Slide 74 text

Scatter Gather Request 1 1. Query arrives at Mongos 2 2 2 3 3 3 shard shard shard mongos 2. Mongos broadcasts query to all shards 3. Each shard returns results for query

Slide 75

Slide 75 text

Scatter Gather Request
1. Query arrives at mongos
2. mongos broadcasts query to all shards
3. Each shard returns results for query
4. Results combined and returned to client
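The scatter gather steps can be sketched with plain arrays standing in for shards. The documents, shard names, and the `scatterGather` helper are hypothetical; a real router talks to shards over the network rather than filtering arrays.

```javascript
// Hypothetical data: each shard holds some of the users.
const shards = {
  shard1: [{ name: "Bob", state: "NY" }],
  shard2: [{ name: "Joe", state: "CA" }],
  shard3: [{ name: "Tyler", state: "NY" }],
};

// Broadcast the predicate to every shard, then combine the partial results.
function scatterGather(shardMap, predicate) {
  const perShard = Object.values(shardMap).map((docs) => docs.filter(predicate));
  return perShard.flat();
}

const newYorkers = scatterGather(shards, (doc) => doc.state === "NY");
console.log(newYorkers.map((doc) => doc.name));
```

Every shard does work here, even shard2, which has nothing to contribute. That is the cost of querying on a field that is not the shard key.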

Slide 76

Slide 76 text

mongos Distributed Merge Sort Req. shard shard shard

Slide 77

Slide 77 text

mongos Distributed Merge Sort Req. 1 shard shard shard 1. Query arrives at Mongos

Slide 78

Slide 78 text

mongos Distributed Merge Sort Req. 1 2 2 2 shard shard shard 1. Query arrives at Mongos 2. Mongos broadcasts query to all shards

Slide 79

Slide 79 text

mongos Distributed Merge Sort Req. 1 2 2 2 shard shard shard 3 3 3 1. Query arrives at Mongos 2. Mongos broadcasts query to all shards 3. Each shard locally sorts results

Slide 80

Slide 80 text

mongos Distributed Merge Sort Req. 1 2 2 2 4 4 4 shard shard shard 3 3 3 1. Query arrives at Mongos 2. Mongos broadcasts query to all shards 3. Each shard locally sorts results 4. Results returned to mongos

Slide 81

Slide 81 text

mongos Distributed Merge Sort Req. 1 5 2 2 2 4 4 4 shard shard shard 3 3 3 1. Query arrives at Mongos 2. Mongos broadcasts query to all shards 3. Each shard locally sorts results 4. Results returned to mongos 5. Mongos merges sorted results

Slide 82

Slide 82 text

Distributed Merge Sort Request
1. Query arrives at mongos
2. mongos broadcasts query to all shards
3. Each shard locally sorts results
4. Results returned to mongos
5. mongos merges sorted results
6. Combined results returned to client
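Steps 3 to 5 above amount to a local sort on each shard followed by a k-way merge on the router. The sketch below shows that shape with in-memory arrays. The function names and sample documents are assumptions, and the merge is a straightforward linear scan rather than whatever mongos does internally.

```javascript
// Merge several already-sorted arrays into one sorted array.
function mergeSorted(sortedLists, compare) {
  const positions = sortedLists.map(() => 0);
  const merged = [];
  for (;;) {
    let best = -1;
    for (let i = 0; i < sortedLists.length; i++) {
      if (positions[i] >= sortedLists[i].length) continue; // list exhausted
      const candidate = sortedLists[i][positions[i]];
      if (best === -1 || compare(candidate, sortedLists[best][positions[best]]) < 0) {
        best = i;
      }
    }
    if (best === -1) return merged; // every list is exhausted
    merged.push(sortedLists[best][positions[best]++]);
  }
}

// Each shard sorts its own documents; the router only merges.
function distributedSort(shardDocs, field) {
  const compare = (a, b) => (a[field] < b[field] ? -1 : a[field] > b[field] ? 1 : 0);
  const sortedPerShard = shardDocs.map((docs) => [...docs].sort(compare));
  return mergeSorted(sortedPerShard, compare);
}

// Hypothetical documents spread over three shards.
const sorted = distributedSort(
  [[{ state: "NY" }, { state: "CA" }], [{ state: "TX" }, { state: "AZ" }], [{ state: "FL" }]],
  "state"
);
console.log(sorted.map((doc) => doc.state));
```

The router never re-sorts the full result set; it only compares the head of each shard's sorted stream.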

Slide 83

Slide 83 text

Queries
By shard key: routed
db.users.find({email: "[email protected]"})
Sorted by shard key: routed in order
db.users.find().sort({email: -1})
Find by non shard key: scatter gather
db.users.find({state: "NY"})
Sorted by non shard key: distributed merge sort
db.users.find().sort({state: 1})
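The four query cases can be summarised as a small decision function. This is only a sketch of the table above: `classifyQuery` is a hypothetical helper, and how it resolves combinations the slide does not list (for example a filter together with a sort, where the filter wins here) is an assumption.

```javascript
// Classify a query by how a router would have to execute it,
// given the name of the shard key field.
function classifyQuery(shardKey, filter = {}, sort = {}) {
  const filterFields = Object.keys(filter);
  const sortFields = Object.keys(sort);
  if (filterFields.length > 0) {
    return filterFields.includes(shardKey) ? "routed" : "scatter gather";
  }
  if (sortFields.length > 0) {
    return sortFields[0] === shardKey ? "routed in order" : "distributed merge sort";
  }
  return "scatter gather"; // no filter and no sort: every shard is consulted
}

// The email value is a hypothetical placeholder.
console.log(classifyQuery("email", { email: "joe@example.com" }));
console.log(classifyQuery("email", {}, { email: -1 }));
console.log(classifyQuery("email", { state: "NY" }));
console.log(classifyQuery("email", {}, { state: 1 }));
```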

Slide 84

Slide 84 text

Writes
Inserts: requires shard key
db.users.insert({name: "Bob", email: "[email protected]"})
Removes (routed)
db.users.remove({email: "[email protected]"})
Removes (scattered)
db.users.remove({name: "Bob"})
Updates (routed)
db.users.update({email: "[email protected]"}, {$set: {state: "NY"}})
Updates (scattered)
db.users.update({state: "CA"}, {$set: {state: "NY"}})

Slide 85

Slide 85 text

How do I choose my shard key?

Slide 86

Slide 86 text

Choose a field that is common to your queries. Rule of Thumb

Slide 87

Slide 87 text

Write Scaling Writes should be distributed.

Slide 88

Slide 88 text

Writes should be distributed
{ node: "ny153.example.com", application: "apache", time: "2011-01-02T21:21:56Z", level: "ERROR", msg: "something is broken" }
Bad: { time: 1 }

Slide 89

Slide 89 text

Writes should be distributed
{ node: "ny153.example.com", application: "apache", time: "2011-01-02T21:21:56Z", level: "ERROR", msg: "something is broken" }
Bad: { time: 1 }
Better: { node: 1, application: 1, time: 1 }
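A small simulation shows why a time-only key concentrates writes. New log entries always carry the latest timestamps, so they all fall into the last chunk, while a key that leads with `node` spreads them out. The split points, node names other than ny153, and helper names below are assumptions chosen for illustration.

```javascript
// Compare compound keys field by field. A shorter key that matches as a
// prefix compares as equal.
function compareKey(a, b) {
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    if (a[i] < b[i]) return -1;
    if (a[i] > b[i]) return 1;
  }
  return 0;
}

// boundaries are sorted split points; chunk i covers the range from
// boundaries[i - 1] (inclusive) up to boundaries[i] (exclusive).
function chunkIndex(boundaries, key) {
  let i = 0;
  while (i < boundaries.length && compareKey(key, boundaries[i]) >= 0) i++;
  return i;
}

function countPerChunk(boundaries, keys) {
  const counts = new Array(boundaries.length + 1).fill(0);
  for (const key of keys) counts[chunkIndex(boundaries, key)]++;
  return counts;
}

// Three new log entries from three hypothetical nodes.
const entries = [
  { node: "ny151.example.com", application: "apache", time: "2011-01-02T21:21:56Z" },
  { node: "ny152.example.com", application: "apache", time: "2011-01-02T21:21:57Z" },
  { node: "ny153.example.com", application: "apache", time: "2011-01-02T21:21:58Z" },
];

// Key { time: 1 } with chunks split on earlier timestamps.
const byTime = countPerChunk(
  [["2011-01-01T00:00:00Z"], ["2011-01-02T00:00:00Z"]],
  entries.map((e) => [e.time])
);

// Key { node: 1, application: 1, time: 1 } with chunks split on node name.
const byNode = countPerChunk(
  [["ny152.example.com"], ["ny153.example.com"]],
  entries.map((e) => [e.node, e.application, e.time])
);

console.log(byTime); // all three writes land in the last chunk
console.log(byNode); // one write per chunk
```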

Slide 90

Slide 90 text

Query Isolation & Data Locality Queries should be routed to one shard.

Slide 91

Slide 91 text

Queries should be routed to one shard
{ node: "ny153.example.com", application: "apache", time: "2011-01-02T21:21:56Z", level: "ERROR", msg: "something is broken" }
Bad: { msg: 1, node: 1 }

Slide 92

Slide 92 text

Queries should be routed to one shard
{ node: "ny153.example.com", application: "apache", time: "2011-01-02T21:21:56Z", level: "ERROR", msg: "something is broken" }
Bad: { msg: 1, node: 1 }
Better: { node: 1, time: 1 }

Slide 93

Slide 93 text

Cardinality Chunks should be able to split.

Slide 94

Slide 94 text

Chunks should be able to split
{ node: "ny153.example.com", application: "apache", time: "2011-01-02T21:21:56Z", level: "ERROR", msg: "something is broken" }
Bad: { node: 1 }

Slide 95

Slide 95 text

Chunks should be able to split
{ node: "ny153.example.com", application: "apache", time: "2011-01-02T21:21:56Z", level: "ERROR", msg: "something is broken" }
Bad: { node: 1 }
Better: { node: 1, time: 1 }
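The cardinality point can be checked mechanically: a chunk can only be split if it holds at least two distinct shard key values. In the sketch below, `canSplit` and the second log timestamp are hypothetical; the idea is that every entry from one node shares the same `{ node: 1 }` key, while adding `time` makes the keys distinct.

```javascript
// A chunk needs at least two distinct key values to have a split point.
function canSplit(keys) {
  const distinct = new Set(keys.map((key) => JSON.stringify(key)));
  return distinct.size > 1;
}

// Two log entries from the same node at different times.
const logs = [
  { node: "ny153.example.com", time: "2011-01-02T21:21:56Z" },
  { node: "ny153.example.com", time: "2011-01-02T21:21:57Z" },
];

const nodeOnly = canSplit(logs.map((l) => [l.node]));
const nodeAndTime = canSplit(logs.map((l) => [l.node, l.time]));
console.log(nodeOnly, nodeAndTime);
```

With `{ node: 1 }`, a busy node produces one ever-growing chunk that can never be divided or moved in parts.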

Slide 96

Slide 96 text

Configuration

Slide 97

Slide 97 text

Bring up mongods or Replica Sets
mongod --shardsvr
mongod --replSet <setname> --shardsvr
(Diagram: three standalone mongods, or three replica sets (RS) of three mongods each)

Slide 98

Slide 98 text

Bring up Config Servers
mongod --configsvr
(Diagram: three config servers alongside the three replica sets)

Slide 99

Slide 99 text

Bring up Mongos
mongos --configdb <config servers>
(Diagram: one mongos in front of the three config servers and three replica sets)

Slide 100

Slide 100 text

Connect to Mongos + Add Shards
> use admin
> db.runCommand({ "addShard": <shard address> })
Enable Sharding
> db.runCommand({ enablesharding: "<dbname>" });
Shard a Collection
> db.runCommand({ shardcollection: "<namespace>", key: <key pattern> });
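As a recap of the order of operations, the sketch below builds the three kinds of command documents as plain JavaScript objects, in the sequence the slide gives. `shardingSetupCommands` is a hypothetical helper and the shard address is a made-up example; the `test.users` namespace and `{ email: 1 }` key come from the earlier slides. Nothing here contacts a server. In the shell, each document would be passed to `db.runCommand` on the admin database.

```javascript
// Build the admin commands in the order: add shards, enable sharding on
// the database, then shard the collection.
function shardingSetupCommands(shardAddresses, database, collection, key) {
  return [
    ...shardAddresses.map((address) => ({ addShard: address })),
    { enablesharding: database },
    { shardcollection: `${database}.${collection}`, key },
  ];
}

// Hypothetical shard address; database, collection, and key as in the slides.
const commands = shardingSetupCommands(["localhost:27018"], "test", "users", { email: 1 });
for (const command of commands) {
  console.log(JSON.stringify(command));
}
```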