2011-MongoDC-Scaling.pdf
mongodb
July 12, 2011
Transcript
Practical Scaling and Sharding
Eliot Horowitz (@eliothorowitz)
MongoDC, June 27, 2011
Scaling by Optimization
• Schema Design
• Index Design
• Hardware Configuration
Horizontal Scaling
• Vertical scaling is limited
• Hard to scale vertically in the cloud
• Can scale wider than higher
Replica Sets
• One master at any time
• Programmer determines if a read hits the master or a slave
• Easy to set up to scale reads
db.people.find( { state : "NY" } ).addOption( DBQuery.Option.slaveOk )
• routed to a secondary automatically
• will use master if no secondary is available
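
The routing rule on this slide can be sketched in a few lines of plain JavaScript. This is only an illustration under stated assumptions: `pickReadTarget`, the `members` shape, and the host names are hypothetical and are not part of any MongoDB driver or of the talk.

```javascript
// Hypothetical model of a replica set: one master plus zero or more slaves.
// These names do not come from a MongoDB driver; they only illustrate the rule.
function pickReadTarget(members, slaveOk) {
  const master = members.find((m) => m.role === "master" && m.up);
  const slaves = members.filter((m) => m.role === "slave" && m.up);
  // Without slaveOk, reads always go to the master.
  if (!slaveOk) return master ? master.host : null;
  // With slaveOk, prefer a reachable slave and fall back to the master.
  if (slaves.length > 0) return slaves[0].host;
  return master ? master.host : null;
}

const replicaSet = [
  { host: "db1:27017", role: "master", up: true },
  { host: "db2:27017", role: "slave", up: true },
  { host: "db3:27017", role: "slave", up: false },
];

console.log(pickReadTarget(replicaSet, false)); // db1:27017
console.log(pickReadTarget(replicaSet, true)); // db2:27017
```

A real driver would also balance load across secondaries; this sketch simply takes the first reachable one.
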
Not Enough
• Writes don’t scale
• Reads are out of date on slaves
• RAM/Data Size doesn’t scale
Why Shard?
• Distribute write load
• Keep working set in RAM
• Consistent reads
• Preserve functionality
Sharding Design Goals
• Scale linearly
• Increase capacity with no downtime
• Transparent to the application
• Low administration to add capacity
Sharding and Documents
• Rich documents reduce need for joins
• No joins makes sharding solvable
Basics
• Choose how you partition data
• Convert from single replica set to sharding with no downtime
• Full feature set
• Fully consistent by default
Architecture
[Diagram: clients connect to mongos routers; behind them are the shards, each made up of several mongod processes, and the config servers, which are also mongod processes.]
Typical Basic Setup
[Diagram: a primary and a secondary data center hosting members of shards S1, S2, and S3 (each member labeled p=1 or p=0), config servers, and four mongos routers.]
Range Based
• collection is broken into chunks by range
• chunks default to 64 MB or 100,000 objects
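
A minimal sketch of range-based chunking, assuming a numeric shard key. The chunk boundaries, the shard names, and the helpers `findChunk` and `splitChunk` are made up for illustration and do not reflect MongoDB internals.

```javascript
// Each chunk covers shard key values in [min, max) and lives on one shard.
// Chunk boundaries and shard names here are made up for illustration.
const chunks = [
  { min: -Infinity, max: 100, shard: "shard0" },
  { min: 100, max: 200, shard: "shard1" },
  { min: 200, max: Infinity, shard: "shard0" },
];

// Find the chunk whose range contains the key.
function findChunk(chunks, key) {
  return chunks.find((c) => key >= c.min && key < c.max);
}

// Split one chunk at a split point, keeping both halves on the same shard.
function splitChunk(chunks, chunk, splitPoint) {
  const i = chunks.indexOf(chunk);
  const left = { min: chunk.min, max: splitPoint, shard: chunk.shard };
  const right = { min: splitPoint, max: chunk.max, shard: chunk.shard };
  return [...chunks.slice(0, i), left, right, ...chunks.slice(i + 1)];
}

console.log(findChunk(chunks, 150).shard); // shard1
const afterSplit = splitChunk(chunks, findChunk(chunks, 150), 150);
console.log(afterSplit.length); // 4
```

In this model a chunk that grows past the size limit would be split, and the resulting halves could later be moved to other shards independently.
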
Choosing a Shard Key
• Shard key determines how data is partitioned
• Hard to change
• Most important performance decision
Use Case: Photos
{ photo_id : ???? , data : <binary> }
What’s the right key?
• auto increment
• MD5( data )
• month() + MD5( data )
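
To make the three candidates concrete, here is a sketch of how each key could be generated in Node.js. The function names, the in-process counter, and the "YYYY-MM-" prefix format are assumptions for illustration; the slide does not specify how `month()` is encoded.

```javascript
const crypto = require("crypto");

// MD5 hex digest of the photo bytes.
function md5Hex(data) {
  return crypto.createHash("md5").update(data).digest("hex");
}

// Candidate 1: auto-increment counter (hypothetical in-process counter).
let counter = 0;
function autoIncrementKey() {
  counter += 1;
  return counter;
}

// Candidate 2: MD5 of the data.
function md5Key(data) {
  return md5Hex(data);
}

// Candidate 3: month prefix plus MD5 of the data, e.g. "2011-06-<digest>".
function monthMd5Key(data, date) {
  const month = date.toISOString().slice(0, 7); // "YYYY-MM" in UTC
  return `${month}-${md5Hex(data)}`;
}

const photo = Buffer.from("example photo bytes");
const taken = new Date(Date.UTC(2011, 5, 27));
console.log(autoIncrementKey(), autoIncrementKey()); // 1 2
console.log(md5Key(photo));
console.log(monthMd5Key(photo, taken));
```

With range-based chunks, ever-increasing keys such as the counter all land in the last range, while MD5 keys spread across the whole key space. The month prefix groups recent photos together and still spreads them within each month.
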
Initial Loading
• System starts with 1 chunk
• Writes will hit 1 shard and then move
• Pre-splitting for initial bulk loading can dramatically improve bulk load time
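
One way to picture pre-splitting is to compute the split points ahead of time and spread the empty chunks over the shards before loading. The sketch below assumes the shard key is a lowercase hex string such as an MD5 digest; `hexSplitPoints`, `assignChunks`, and the shard names are hypothetical helpers, not MongoDB commands.

```javascript
// Compute evenly spaced split points over a hex key space, using the first
// `prefixLen` hex digits as boundaries (assumption: keys are lowercase hex
// strings such as MD5 digests).
function hexSplitPoints(numChunks, prefixLen = 4) {
  const space = 16 ** prefixLen;
  const points = [];
  for (let i = 1; i < numChunks; i++) {
    const boundary = Math.floor((i * space) / numChunks);
    points.push(boundary.toString(16).padStart(prefixLen, "0"));
  }
  return points;
}

// Assign the resulting chunks to shards round-robin (hypothetical shard names).
function assignChunks(points, shards) {
  const bounds = [null, ...points, null]; // null marks an open-ended range
  const result = [];
  for (let i = 0; i < bounds.length - 1; i++) {
    result.push({
      min: bounds[i],
      max: bounds[i + 1],
      shard: shards[i % shards.length],
    });
  }
  return result;
}

const points = hexSplitPoints(4);
console.log(points); // [ '4000', '8000', 'c000' ]
console.log(assignChunks(points, ["shard0", "shard1"]).length); // 4
```

In an actual cluster the computed points would be fed to the split and move commands of the MongoDB version in use, so that the bulk load writes to every shard from the start.
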
Administering a Cluster
• Do not wait too long to add capacity
• Need capacity for normal workload + cost of moving data
• Stay < 70% operational capacity
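
The 70% guideline can be turned into a small capacity check. This sketch assumes load spreads evenly and every shard has the same capacity; the function names and the operations-per-second figures are invented for the example.

```javascript
// Threshold from the slide: stay below 70% of operational capacity.
const MAX_UTILIZATION_PCT = 70;

// True when per-shard utilization has reached the threshold, assuming load
// is spread evenly and every shard has the same capacity.
function needsMoreCapacity(totalLoad, shardCount, capacityPerShard) {
  return totalLoad * 100 >= shardCount * capacityPerShard * MAX_UTILIZATION_PCT;
}

// Smallest shard count that keeps utilization under the threshold.
function shardsNeeded(totalLoad, capacityPerShard) {
  let n = 1;
  while (needsMoreCapacity(totalLoad, n, capacityPerShard)) n += 1;
  return n;
}

// Example numbers are invented: 3 shards, each good for 1000 ops/s.
console.log(needsMoreCapacity(2000, 3, 1000)); // false (about 67%)
console.log(needsMoreCapacity(2200, 3, 1000)); // true (about 73%)
console.log(shardsNeeded(2200, 1000)); // 4
```

The headroom matters because moving chunks to a new shard is itself extra work for the shards that are already busy.
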
Hardware Considerations
• Understand working set and make sure it can fit in RAM
• Choose appropriately sized boxes for shards
• Too small and admin/overhead goes up
• Too large, and you can’t add capacity smoothly
DEMO
Download MongoDB
http://www.mongodb.org
and let us know what you think
@eliothorowitz @mongodb
10gen is hiring! http://www.10gen.com/jobs
Use Case: User Profiles
{ email : "[email protected]" , addresses : [ { state : "NY" } ] }
• Shard by email
• Lookup by email hits 1 node
• Index on { "addresses.state" : 1 }
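
The difference between a lookup on the shard key and a query on another field can be sketched as follows. The three-shard layout, the range boundaries, the sample address, and `targetShards` are assumptions for illustration, not the behavior of a specific mongos version.

```javascript
// Hypothetical three-shard cluster with ranges on the email shard key.
const shardRanges = [
  { shard: "shard0", min: "", max: "h" },
  { shard: "shard1", min: "h", max: "q" },
  { shard: "shard2", min: "q", max: "\uffff" },
];

// Return the shards a query must visit, given the shard key field "email".
function targetShards(query) {
  if (typeof query.email === "string") {
    const hit = shardRanges.find(
      (r) => query.email >= r.min && query.email < r.max
    );
    return [hit.shard];
  }
  // No shard key in the query: every shard has to be asked.
  return shardRanges.map((r) => r.shard);
}

console.log(targetShards({ email: "jane@example.com" })); // [ 'shard1' ]
console.log(targetShards({ "addresses.state": "NY" })); // [ 'shard0', 'shard1', 'shard2' ]
```

In this model a query by state still reaches every shard, and the index on "addresses.state" keeps each shard's share of the work cheap.
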
Use Case: Activity Stream
{ user_id : XXX, event_id : YYY , data : ZZZ }
• Shard by user_id
• Looking up an activity stream hits 1 node
• Writing events is distributed
• Index on { "event_id" : 1 } for deletes