2011-MongoDC-Scaling.pdf
mongodb
July 12, 2011
Transcript
Practical Scaling and Sharding
Eliot Horowitz (@eliothorowitz)
MongoDC, June 27, 2011
Scaling by Optimization
• Schema Design
• Index Design
• Hardware Configuration
Horizontal Scaling
• Vertical scaling is limited
• Hard to scale vertically in the cloud
• Can scale wider than higher
Replica Sets
• One master at any time
• Programmer determines if a read hits the master or a slave
• Easy to set up to scale reads
db.people.find( { state : "NY" } ).addOption( SlaveOK )
• routed to a secondary automatically
• will use master if no secondary is available
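The routing rule on this slide can be sketched in plain JavaScript. This is only an illustrative model, not driver or server code: the `pickReadTarget` function, the node names, and the `up` flag are all hypothetical.

```javascript
// Hypothetical model of a replica set: one primary plus some secondaries.
// Without slaveOk, reads always go to the primary.
// With slaveOk, reads prefer a reachable secondary and fall back to the primary.
function pickReadTarget(replicaSet, slaveOk) {
  if (!slaveOk) return replicaSet.primary;
  const available = replicaSet.secondaries.filter((node) => node.up);
  if (available.length > 0) return available[0];
  return replicaSet.primary;
}

// Made-up topology for illustration.
const rs = {
  primary: { name: "node-a", up: true },
  secondaries: [
    { name: "node-b", up: false },
    { name: "node-c", up: true },
  ],
};

console.log(pickReadTarget(rs, false).name); // node-a
console.log(pickReadTarget(rs, true).name);  // node-c
```

A real driver would also balance reads across several secondaries; the sketch simply takes the first reachable one.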
Not Enough
• Writes don't scale
• Reads are out of date on slaves
• RAM/Data Size doesn't scale
Why Shard?
• Distribute write load
• Keep working set in RAM
• Consistent reads
• Preserve functionality
Sharding Design Goals
• Scale linearly
• Increase capacity with no downtime
• Transparent to the application
• Low administration to add capacity
Sharding and Documents
• Rich documents reduce need for joins
• No joins makes sharding solvable
Basics
• Choose how you partition data
• Convert from single replica set to sharding with no downtime
• Full feature set
• Fully consistent by default
Architecture
[Diagram. Labels: clients, mongos processes, Shards (groups of mongod processes), Config Servers (mongod processes).]
Typical Basic Setup
[Diagram. Two data centers, Primary and Secondary. Shards S1, S2 and S3 each have three members: two with p=1 and one with p=0. Also shown: three config servers and four mongos processes.]
Range Based
• collection is broken into chunks by range
• chunks default to 64 MB or 100,000 objects
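A minimal sketch of range-based chunking, assuming numeric shard key values. The chunk boundaries, shard names, and the `findChunk` and `splitChunk` helpers are invented for illustration and are not MongoDB's internal data structures.

```javascript
// Chunks cover contiguous, non-overlapping key ranges [min, max).
// Boundaries and shard names are made up for illustration.
const chunks = [
  { min: -Infinity, max: 1000, shard: "shard0" },
  { min: 1000, max: 5000, shard: "shard1" },
  { min: 5000, max: Infinity, shard: "shard2" },
];

// Find the single chunk whose range contains the key.
function findChunk(chunkList, key) {
  return chunkList.find((c) => key >= c.min && key < c.max);
}

// Split one chunk at splitKey. Both halves stay on the same shard
// until a later migration moves one of them.
function splitChunk(chunkList, chunk, splitKey) {
  const i = chunkList.indexOf(chunk);
  const left = { min: chunk.min, max: splitKey, shard: chunk.shard };
  const right = { min: splitKey, max: chunk.max, shard: chunk.shard };
  return [...chunkList.slice(0, i), left, right, ...chunkList.slice(i + 1)];
}

console.log(findChunk(chunks, 1000).shard); // shard1
const afterSplit = splitChunk(chunks, findChunk(chunks, 3000), 3000);
console.log(afterSplit.length); // 4
```

In the real system a split is triggered when a chunk grows past its size limit; here the split point is passed in by hand.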
Choosing a Shard Key
• Shard key determines how data is partitioned
• Hard to change
• Most important performance decision
Use Case: Photos
{ photo_id : ???? , data : <binary> }
What's the right key?
• auto increment
• MD5( data )
• month() + MD5(data)
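The slide only lists the three candidates. One way to compare them is to simulate how many existing chunks a batch of new writes would touch. The sketch below runs under Node.js and is a toy model: the synthetic photos, the chunk size of 100 keys, and all helper names are assumptions, not anything from the talk.

```javascript
const crypto = require("crypto");
const md5 = (s) => crypto.createHash("md5").update(s).digest("hex");

// Synthetic photos: seq is an insertion counter, month a sortable label.
const makePhoto = (seq, month) => ({ seq, month, data: "photo-bytes-" + seq });

// The three candidate shard keys from the slide.
const keyFns = {
  autoIncrement: (p) => String(p.seq).padStart(10, "0"),
  md5: (p) => md5(p.data),
  monthPlusMd5: (p) => p.month + ":" + md5(p.data),
};

// Chunk lower bounds: one boundary every perChunk existing keys.
function chunkBounds(keys, perChunk) {
  return [...keys].sort().filter((_, i) => i % perChunk === 0);
}

// Index of the chunk containing key (last bound that is <= key).
function chunkIndex(bounds, key) {
  let idx = 0;
  for (let i = 0; i < bounds.length; i++) if (bounds[i] <= key) idx = i;
  return idx;
}

// Number of distinct chunks touched by a batch of incoming writes.
function chunksHit(keyFn, existing, incoming, perChunk = 100) {
  const bounds = chunkBounds(existing.map(keyFn), perChunk);
  return new Set(incoming.map((p) => chunkIndex(bounds, keyFn(p)))).size;
}

// 900 existing photos over three months, then 100 new ones this month.
const months = ["2011-05", "2011-06", "2011-07"];
const existing = Array.from({ length: 900 }, (_, i) =>
  makePhoto(i, months[Math.floor(i / 300)]));
const incoming = Array.from({ length: 100 }, (_, i) =>
  makePhoto(900 + i, "2011-07"));

for (const [name, fn] of Object.entries(keyFns)) {
  console.log(name, chunksHit(fn, existing, incoming));
}
```

In this model the auto-increment key sends every new write to the last chunk, the MD5 key scatters writes across the whole key range, and the month-plus-MD5 key spreads them over only the chunks that hold the current month.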
Initial Loading
• System starts with 1 chunk
• Writes will hit 1 shard and then move
• Pre-splitting for initial bulk loading can dramatically improve bulk load time
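Pre-splitting means choosing chunk boundaries before the load starts, so that writes spread across shards from the first insert. A small sketch of computing evenly spaced boundaries, assuming the shard key is a hex string such as an MD5 digest. The `splitPoints` helper is hypothetical, and each boundary would still have to be handed to the cluster's split command.

```javascript
// Compute numChunks - 1 evenly spaced split points over a
// two-hex-digit key prefix (00..ff). Assumes hex-string shard keys.
function splitPoints(numChunks) {
  const points = [];
  for (let i = 1; i < numChunks; i++) {
    const boundary = Math.floor((i * 256) / numChunks);
    points.push(boundary.toString(16).padStart(2, "0"));
  }
  return points;
}

console.log(splitPoints(4)); // [ '40', '80', 'c0' ]
```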
Administering a Cluster
• Do not wait too long to add capacity
• Need capacity for normal workload + cost of moving data
• Stay < 70% operational capacity
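The 70% guideline can be turned into a rough sizing calculation. This is a back-of-the-envelope sketch: the load and capacity figures are invented, and it assumes load spreads evenly across shards, which a poor shard key would break.

```javascript
// Fraction of total capacity in use. Load and capacity share a unit,
// for example operations per second.
function utilization(totalLoad, shards, capacityPerShard) {
  return totalLoad / (shards * capacityPerShard);
}

// Smallest shard count that keeps utilization strictly below the target
// (0.7 by default, following the slide).
function shardsNeeded(totalLoad, capacityPerShard, target = 0.7) {
  let n = Math.max(1, Math.ceil(totalLoad / (capacityPerShard * target)));
  while (utilization(totalLoad, n, capacityPerShard) >= target) n++;
  return n;
}

// Made-up numbers: 20,000 ops/sec of load, 10,000 ops/sec per shard.
console.log(utilization(20000, 2, 10000)); // 1
console.log(shardsNeeded(20000, 10000));   // 3
```

Two shards would run at 100% in this example, leaving no room for chunk migrations; three shards bring utilization to about 67%.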
Hardware Considerations
• Understand working set and make sure it can fit in RAM
• Choose appropriately sized boxes for shards
• Too small and admin/overhead goes up
• Too large, and you can't add capacity smoothly
DEMO
Download MongoDB: http://www.mongodb.org and let us know what you think
@eliothorowitz @mongodb
10gen is hiring! http://www.10gen.com/jobs
Use Case: User Profiles
{ email : "[email protected]" , addresses : [ { state : "NY" } ] }
• Shard by email
• Lookup by email hits 1 node
• Index on { "addresses.state" : 1 }
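A sketch of why a lookup by email hits one node while a query on `addresses.state` does not. The chunk map, shard names, sample email address, and the `shardsForQuery` helper are all hypothetical; a real router handles many more query shapes.

```javascript
// Hypothetical chunk map for a collection sharded on email.
// Boundaries are made up for illustration.
const chunkMap = [
  { min: "", max: "h", shard: "shard0" },
  { min: "h", max: "p", shard: "shard1" },
  { min: "p", max: "\uffff", shard: "shard2" },
];
const allShards = [...new Set(chunkMap.map((c) => c.shard))];

// Decide which shards a query has to visit.
function shardsForQuery(query) {
  if (typeof query.email === "string") {
    // Equality on the shard key: exactly one chunk, so one shard.
    const chunk = chunkMap.find(
      (c) => query.email >= c.min && query.email < c.max);
    return [chunk.shard];
  }
  // No shard key in the query: every shard must be asked, and each
  // one can use its local index on "addresses.state".
  return allShards;
}

console.log(shardsForQuery({ email: "joe@example.com" })); // [ 'shard1' ]
console.log(shardsForQuery({ "addresses.state": "NY" }).length); // 3
```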
Use Case: Activity Stream
{ user_id : XXX, event_id : YYY , data : ZZZ }
• Shard by user_id
• Looking up an activity stream hits 1 node
• Writing events is distributed
• Index on { "event_id" : 1 } for deletes