2011-MongoDC-Scaling.pdf
mongodb
July 12, 2011
Programming
Transcript
Practical Scaling and Sharding
Eliot Horowitz (@eliothorowitz)
MongoDC, June 27, 2011
Scaling by Optimization
• Schema Design
• Index Design
• Hardware Configuration
Horizontal Scaling
• Vertical scaling is limited
• Hard to scale vertically in the cloud
• Can scale wider than higher
Replica Sets
• One master at any time
• Programmer determines if read hits master or a slave
• Easy to set up to scale reads
db.people.find( { state : "NY" } ).addOption( DBQuery.Option.slaveOk )
• routed to a secondary automatically
• will use master if no secondary is available
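
The routing rule on this slide can be sketched in plain JavaScript. This is only an illustration of the decision, not driver code: the `pickReadTarget` function, the shape of the `replicaSet` object, and the host names are all hypothetical.

```javascript
// Hypothetical model of a replica set: one master plus any number of secondaries.
const replicaSet = {
  master: { host: "a.example:27017", up: true },
  secondaries: [
    { host: "b.example:27017", up: true },
    { host: "c.example:27017", up: false },
  ],
};

// Decide where a read goes. With slaveOk set, prefer a reachable secondary;
// otherwise (or if no secondary is reachable) fall back to the master.
function pickReadTarget(set, slaveOk) {
  const reachable = set.secondaries.filter((member) => member.up);
  if (slaveOk && reachable.length > 0) {
    return reachable[0].host;
  }
  return set.master.host;
}

console.log(pickReadTarget(replicaSet, true));  // b.example:27017
console.log(pickReadTarget(replicaSet, false)); // a.example:27017
```

A real driver would also balance reads across secondaries; the sketch always takes the first reachable one to keep the rule visible.
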
Not Enough
• Writes don’t scale
• Reads are out of date on slaves
• RAM/Data Size doesn’t scale
Why Shard?
• Distribute write load
• Keep working set in RAM
• Consistent reads
• Preserve functionality
Sharding Design Goals
• Scale linearly
• Increase capacity with no downtime
• Transparent to the application
• Low administration to add capacity
Sharding and Documents
• Rich documents reduce need for joins
• No joins makes sharding solvable
Basics
• Choose how you partition data
• Convert from single replica set to sharding with no downtime
• Full feature set
• Fully consistent by default
Architecture
[Diagram: clients connect to mongos routers; behind them are the shards (groups of mongod processes) and the config servers (mongod)]
Typical Basic Setup
[Diagram: a primary and a secondary data center; shards S1, S2 and S3 with three members each (two at priority p=1, one at p=0); three config servers; four mongos]
Range Based
• collection is broken into chunks by range
• chunks default to 64 MB or 100,000 objects
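
A minimal sketch of range-based chunks, assuming a numeric shard key. The chunk boundaries, shard names, and the `findChunk` and `splitChunk` helpers are hypothetical and only model the idea; they are not MongoDB internals.

```javascript
// Each chunk covers the half-open key range [min, max) and lives on one shard.
const rangeChunks = [
  { min: -Infinity, max: 1000, shard: "shard0" },
  { min: 1000, max: 5000, shard: "shard1" },
  { min: 5000, max: Infinity, shard: "shard2" },
];

// Find the chunk whose range contains the given key.
function findChunk(chunkList, key) {
  return chunkList.find((chunk) => key >= chunk.min && key < chunk.max);
}

// Split the chunk that strictly contains splitKey into two chunks.
// Both halves stay on the original shard until one of them is moved.
function splitChunk(chunkList, splitKey) {
  return chunkList.flatMap((chunk) =>
    splitKey > chunk.min && splitKey < chunk.max
      ? [
          { min: chunk.min, max: splitKey, shard: chunk.shard },
          { min: splitKey, max: chunk.max, shard: chunk.shard },
        ]
      : [chunk]
  );
}

console.log(findChunk(rangeChunks, 1200).shard); // shard1
console.log(splitChunk(rangeChunks, 3000).length); // 4
```

In this model a chunk that grows past the size threshold would be passed to `splitChunk`, and balancing would then change the `shard` field of one of the halves.
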
Choosing a Shard Key
• Shard key determines how data is partitioned
• Hard to change
• Most important performance decision
Use Case: Photos
{ photo_id : ???? , data : <binary> }
What’s the right key?
• auto increment
• MD5( data )
• month() + MD5(data)
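
One way to compare the three candidate keys is to simulate where new writes land. The sketch below runs on Node.js and makes several assumptions that are not on the slide: the key space is cut into four equal-sized ranges based on 1,000 existing photos, the strings `"photo-<n>"` stand in for the binary data, and the key formats (zero-padded counter, `month:hash`) are illustrative.

```javascript
const crypto = require("crypto");

const md5 = (text) => crypto.createHash("md5").update(text).digest("hex");

// Three candidate photo_id generators. `seq` is the insert order,
// `month` a "YYYY-MM" string, `data` a stand-in for the photo bytes.
const strategies = {
  autoIncrement: (seq, month, data) => String(seq).padStart(10, "0"),
  md5: (seq, month, data) => md5(data),
  monthPlusMd5: (seq, month, data) => month + ":" + md5(data),
};

// Cut the existing keys into `n` equal-sized ranges, like chunks after splitting.
function boundaries(keys, n) {
  const sorted = [...keys].sort();
  const bounds = [];
  for (let i = 1; i < n; i++) {
    bounds.push(sorted[Math.floor((i * sorted.length) / n)]);
  }
  return bounds;
}

// Count how many new keys fall into each range.
function writesPerRange(bounds, newKeys) {
  const counts = new Array(bounds.length + 1).fill(0);
  for (const key of newKeys) {
    counts[bounds.filter((b) => b <= key).length]++;
  }
  return counts;
}

// 1,000 existing photos (half from last month, half from this month),
// then 1,000 new photos, all from this month.
function simulate(makeKey) {
  const existing = [];
  const incoming = [];
  for (let i = 0; i < 1000; i++) {
    existing.push(makeKey(i, i < 500 ? "2011-06" : "2011-07", "photo-" + i));
  }
  for (let i = 1000; i < 2000; i++) {
    incoming.push(makeKey(i, "2011-07", "photo-" + i));
  }
  return writesPerRange(boundaries(existing, 4), incoming);
}

for (const [name, makeKey] of Object.entries(strategies)) {
  console.log(name, simulate(makeKey));
}
```

Under these assumptions the auto-increment key sends every new write to the last range, the MD5 key spreads writes over all ranges, and month() + MD5 puts the writes mostly on the ranges that hold the current month, so older months go cold while the current month still spreads over more than one range.
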
Initial Loading
• System starts with 1 chunk
• Writes will hit 1 shard and then move
• Pre-splitting for initial bulk loading can dramatically improve bulk load time
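
A sketch of how pre-split points might be computed before a bulk load, assuming the shard key starts with a hex string such as an MD5 digest. The `presplitPoints` and `assignChunks` helpers and the shard names are hypothetical; in a real cluster the resulting boundaries would be applied with the server's split and chunk-move admin commands rather than with this code.

```javascript
// Evenly spaced split points over the space of 2-digit hex prefixes ("00".."ff").
function presplitPoints(numChunks) {
  const points = [];
  for (let i = 1; i < numChunks; i++) {
    const prefix = Math.floor((i * 256) / numChunks);
    points.push(prefix.toString(16).padStart(2, "0"));
  }
  return points;
}

// Turn split points into chunks and hand them to shards round-robin,
// so the bulk load writes to every shard from the start.
// null marks an open lower or upper bound.
function assignChunks(points, shards) {
  const mins = [null, ...points];
  return mins.map((min, i) => ({
    min,
    max: i < points.length ? points[i] : null,
    shard: shards[i % shards.length],
  }));
}

const points = presplitPoints(4);
console.log(points); // [ '40', '80', 'c0' ]
console.log(assignChunks(points, ["shard0", "shard1"]));
```

With four chunks on two shards, each shard starts with two empty chunks instead of all writes landing on the single initial chunk.
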
Administering a Cluster
• Do not wait too long to add capacity
• Need capacity for normal workload + cost of moving data
• Stay < 70% operational capacity
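
The 70% guideline can be turned into a quick sizing calculation. The `shardsNeeded` function and the load figures below are hypothetical; the only input taken from the slide is the 70% ceiling.

```javascript
// Smallest number of shards that keeps each shard at or below the utilization ceiling.
// peakLoad and perShardCapacity use the same unit (for example operations per second).
// Integer percentages avoid floating-point rounding at the boundary.
function shardsNeeded(peakLoad, perShardCapacity, maxUtilizationPercent = 70) {
  return Math.ceil((peakLoad * 100) / (perShardCapacity * maxUtilizationPercent));
}

// Assumed numbers: 45,000 ops/s peak, 10,000 ops/s per shard.
console.log(shardsNeeded(45000, 10000)); // 7
```

Five shards could carry this load at 90% utilization, but that would leave no room for the extra work of moving chunks when capacity is added, which is the point of the slide.
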
Hardware Considerations
• Understand working set and make sure it can fit in RAM
• Choose appropriate sized boxes for shards
• Too small and admin/overhead goes up
• Too large, and you can’t add capacity smoothly
DEMO
Download MongoDB
http://www.mongodb.org
and let us know what you think: @eliothorowitz @mongodb
10gen is hiring! http://www.10gen.com/jobs
Use Case: User Profiles
{ email : "[email protected]" , addresses : [ { state : "NY" } ] }
• Shard by email
• Lookup by email hits 1 node
• Index on { "addresses.state" : 1 }
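
Why a lookup by email hits one node while a query on addresses.state does not can be sketched as a routing decision. The chunk boundaries, shard names, sample address, and the `targetShards` function are hypothetical and stand in for what a router does with the shard key.

```javascript
// Hypothetical chunk table for a collection sharded by email.
const profileShardKey = "email";
const profileChunks = [
  { min: "", max: "h", shard: "shard0" },
  { min: "h", max: "q", shard: "shard1" },
  { min: "q", max: "\uffff", shard: "shard2" },
];

// Return the shards a query must visit.
// A query on the shard key is routed to the one shard that owns that key range;
// any other query has to be sent to every shard, each of which uses its own local index.
function targetShards(query) {
  if (profileShardKey in query) {
    const value = query[profileShardKey];
    const chunk = profileChunks.find((c) => value >= c.min && value < c.max);
    return [chunk.shard];
  }
  return [...new Set(profileChunks.map((c) => c.shard))];
}

console.log(targetShards({ email: "jane@example.com" })); // [ 'shard1' ]
console.log(targetShards({ "addresses.state": "NY" }));   // [ 'shard0', 'shard1', 'shard2' ]
```

The index on addresses.state keeps the second query fast on each shard, but it still visits all of them.
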
Use Case: Activity Stream
{ user_id : XXX, event_id : YYY , data : ZZZ }
• Shard by user_id
• Looking up an activity stream hits 1 node
• Writing events is distributed
• Index on { "event_id" : 1 } for deletes