2011-MongoDC-Storage.pdf
mongodb
July 12, 2011
Transcript
Storage and Journalling
Eliot Horowitz (@eliothorowitz)
MongoDC, June 27, 2011

Directory Layout
-rw------- 1 erh admin  64M Jun 26 00:15 test.0
-rw------- 1 erh admin 128M Jun 21 00:20 test.1
-rw------- 1 erh admin 256M Jun 26 00:15 test.2
-rw------- 1 erh admin 512M Jun 21 00:20 test.3
-rw------- 1 erh admin 1.0G Jun 26 00:15 test.4
-rw------- 1 erh admin 2.0G Jun 25 23:08 test.5
-rw------- 1 erh admin  16M Jun 26 00:15 test.ns
• Separate files per database
• Aggressive preallocation
• Always a spare file
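
The listing above suggests that each data file is twice the size of the previous one. A minimal sketch of that progression, assuming a 64 MB first file and the 2 GB maximum file size the deck mentions (the function name `dataFileSizes` is hypothetical, not a MongoDB API):

```javascript
const MB = 1024 * 1024;

// Sizes of the first `count` data files (test.0, test.1, ...), assuming
// each file doubles the previous one until the cap is reached.
function dataFileSizes(count, firstBytes = 64 * MB, maxBytes = 2048 * MB) {
  const sizes = [];
  let size = firstBytes;
  for (let i = 0; i < count; i++) {
    sizes.push(size);
    size = Math.min(size * 2, maxBytes); // assumed cap: 2 GB per file
  }
  return sizes;
}

// Sizes in MB for test.0 through test.6
console.log(dataFileSizes(7).map((s) => s / MB));
```

The first six values match the listing (64M through 2.0G); any later file stays at the cap under this assumption.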

Internal File Format
• Files are broken into extents
• A collection has one or more extents
• Extents grow exponentially, up to 2 GB (which is also the maximum file size)
• Indexes have different extents than data

Sample Extents
> db.foo.validate( { full : true } ).extents.forEach(
      function(z){ print( z.loc + "\t\t" + z.size ); } )
0:3000        20480
0:12000       81920
0:26000       327680
0:76000       1310720
0:1da000      5242880
0:76a000      6291456
0:d6a000      7553024
0:16de000     9064448
0:1f83000     10878976
0:29e3000     13058048
1:2000        15671296
1:ef4000      18808832
1:29e4000     22573056
1:3f6b000     27090944
1:5941000     32509952
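
To see how quickly the extents in this sample grow, the sizes can be compared pairwise. This is only a sketch over the numbers printed on the slide; `growthRatios` is a hypothetical helper, and the one-decimal rounding is a presentation choice rather than anything the server does:

```javascript
// Extent sizes in bytes, copied from the sample output on the slide.
const extentSizes = [
  20480, 81920, 327680, 1310720, 5242880, 6291456, 7553024, 9064448,
  10878976, 13058048, 15671296, 18808832, 22573056, 27090944, 32509952,
];

// Ratio of each extent's size to the previous one, rounded to one decimal.
function growthRatios(sizes) {
  const ratios = [];
  for (let i = 1; i < sizes.length; i++) {
    ratios.push(Math.round((sizes[i] / sizes[i - 1]) * 10) / 10);
  }
  return ratios;
}

console.log(growthRatios(extentSizes));
```

In this sample the first four steps each quadruple the extent size, and every later step is roughly a 1.2x increase.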

Index Extents
> db.system.namespaces.find()
{ "name" : "test2.foo" }
{ "name" : "test2.system.indexes" }
{ "name" : "test2.foo.$_id_" }
> db["foo.$_id_"].validate( { full : true } ).extents.forEach(
      function(z){ print( z.loc + "\t\t" + z.size ); } )
0:9000        36864
0:1b6000      147456
0:6da000      589824
0:149e000     2359296
1:20e4000     9437184
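
The extent locations appear to have the form `file:offset`, with the offset in hexadecimal. That reading is an assumption, but it is consistent with the numbers: under it, data and index extents in file 0 sit back to back. The sketch below checks this for a subset of the extents from the two slides, assuming both listings describe the same collection; `parseLoc` and `findGaps` are hypothetical helpers.

```javascript
// Parse "file:offset" where offset is assumed to be hexadecimal.
function parseLoc(loc) {
  const [file, offset] = loc.split(":");
  return { file: Number(file), offset: parseInt(offset, 16) };
}

// Return pairs of neighbouring extents (in the same file) where the first
// does not end exactly at the start of the second.
function findGaps(extents) {
  const sorted = extents
    .map((e) => ({ ...parseLoc(e.loc), size: e.size, loc: e.loc }))
    .sort((a, b) => a.file - b.file || a.offset - b.offset);
  const gaps = [];
  for (let i = 1; i < sorted.length; i++) {
    const prev = sorted[i - 1];
    const cur = sorted[i];
    if (prev.file === cur.file && prev.offset + prev.size !== cur.offset) {
      gaps.push([prev.loc, cur.loc]);
    }
  }
  return gaps;
}

// A subset of file 0: data extents and index extents from the slides.
const file0Extents = [
  { loc: "0:12000", size: 81920 },    // data
  { loc: "0:26000", size: 327680 },   // data
  { loc: "0:76000", size: 1310720 },  // data
  { loc: "0:1da000", size: 5242880 }, // data
  { loc: "0:76a000", size: 6291456 }, // data
  { loc: "0:9000", size: 36864 },     // index
  { loc: "0:1b6000", size: 147456 },  // index
  { loc: "0:6da000", size: 589824 },  // index
];

console.log(findGaps(file0Extents).length); // number of gaps found
```

For example, the data extent at `0:76000` (1310720 bytes) ends at offset 0x1b6000, which is where the second index extent begins.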

Memory Mapped
• All data files are memory mapped
• Virtual size = total data size + overhead
• Journaled virtual size = ( total data size * 2 ) + overhead
• fsync every 60 seconds (--syncdelay)
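
The two virtual-size formulas above can be written as a small function. This is only the slide's arithmetic; the function name and the sample figures (4 GB of data, 100 MB of overhead) are made up for illustration:

```javascript
// Estimated virtual memory size of the process, per the slide's formulas.
// With journaling on, the data files are counted twice.
function estimateVirtualSize(totalDataBytes, overheadBytes, journaled) {
  const mappedBytes = journaled ? totalDataBytes * 2 : totalDataBytes;
  return mappedBytes + overheadBytes;
}

const GB = 1024 * 1024 * 1024;
const MB = 1024 * 1024;

// Hypothetical figures: 4 GB of data files, 100 MB of overhead.
console.log(estimateVirtualSize(4 * GB, 100 * MB, false) / MB); // 4196
console.log(estimateVirtualSize(4 * GB, 100 * MB, true) / MB);  // 8292
```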

Journalling
• Write-ahead log
• Operations are written to the journal before the memory-mapped regions
• Once the journal is written, data is safe unless there is a hardware problem
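
The write-ahead idea can be illustrated with a toy in-memory model. This is not MongoDB's implementation: the class and its fields are hypothetical stand-ins, and the sketch assumes a journal entry is durable as soon as it is appended.

```javascript
class ToyJournaledStore {
  constructor() {
    this.journal = []; // stands in for the on-disk journal (assumed durable)
    this.mapped = {};  // stands in for the memory-mapped view (lost on crash)
    this.onDisk = {};  // stands in for the data files as of the last fsync
  }

  // Write-ahead: record the operation in the journal first, then apply it.
  write(key, value) {
    this.journal.push({ key, value });
    this.mapped[key] = value;
  }

  // Periodic fsync: data files catch up, so the journal can be discarded.
  fsync() {
    this.onDisk = { ...this.mapped };
    this.journal = [];
  }

  // After a crash, start from the data files and replay the journal.
  crashAndRecover() {
    this.mapped = { ...this.onDisk };
    for (const { key, value } of this.journal) {
      this.mapped[key] = value;
    }
    return this.mapped;
  }
}

const store = new ToyJournaledStore();
store.write("a", 1);
store.fsync();
store.write("b", 2); // journaled, but not yet in the data files
console.log(store.crashAndRecover()); // { a: 1, b: 2 }
```

Without the journal, the write to `b` would be lost in this model, because it happened after the last fsync.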

When is Data Written
• Journal is flushed every 100 ms or every 100 MB written
• j=true flag forces a journal flush
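
The flush rule above amounts to a simple predicate. The sketch below restates it under assumptions: the thresholds come from the slide, "100 MB" is taken to mean 100 * 1024 * 1024 bytes, and `shouldFlush` is a hypothetical name rather than server code.

```javascript
const FLUSH_INTERVAL_MS = 100;            // from the slide
const FLUSH_BYTES = 100 * 1024 * 1024;    // "100 MB", binary megabytes assumed

// True when the journal should be flushed: enough time has passed, enough
// data has accumulated, or the caller asked for it (the j=true case).
function shouldFlush(msSinceLastFlush, bytesSinceLastFlush, forced = false) {
  return (
    forced ||
    msSinceLastFlush >= FLUSH_INTERVAL_MS ||
    bytesSinceLastFlush >= FLUSH_BYTES
  );
}

console.log(shouldFlush(50, 1024));        // false
console.log(shouldFlush(120, 1024));       // true (time threshold)
console.log(shouldFlush(50, 1024, true));  // true (forced)
```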

Journal Admin
• /journal subdirectory in <dbpath> (/data/db)
• Three 1 GB files that get rotated
• Can symlink to a different spindle

Performance
• On 99.9% read systems: no impact
• Write performance: 5-30% slowdown with the journal on the same drive
• With the journal on a separate drive: as low as 3%

When to use
• Single node: required for any data integrity
• Replica set: at least 1 node
• Journaling on all nodes with large data sets removes the need for large resyncs

Fragmentation
• Files can get fragmented over time if documents change size
• Need to improve free list

Compacting
• 1.8 and previous: repairDatabase
• 2.0+: compact command
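
As a rough sketch, the version split above could be expressed as a helper that builds the command document to send. The helper name is hypothetical; the two command documents are the `repairDatabase` and `compact` commands named on the slide.

```javascript
// Pick the compaction command for a given server version string.
function compactionCommand(serverVersion, collectionName) {
  const major = Number(serverVersion.split(".")[0]);
  if (major >= 2) {
    return { compact: collectionName }; // 2.0+: per-collection compact
  }
  return { repairDatabase: 1 };         // 1.8 and earlier: whole database
}

console.log(compactionCommand("1.8.2", "foo")); // { repairDatabase: 1 }
console.log(compactionCommand("2.0.0", "foo")); // { compact: 'foo' }

// In the mongo shell this might be used as:
//   db.runCommand( compactionCommand( db.version(), "foo" ) )
```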

drop/dropDatabase
• drop: frees extents, not data files
• dropDatabase: frees files

update and moves
• Updates can make documents bigger
• Moves are more expensive than other operations

padding
• Adaptive padding factor between 1.0 and 2.0
• Manual control coming
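
To connect padding with moves: if a record is allocated with some slack, a later update that grows the document may still fit in place. The sketch below assumes the allocated size is the document size multiplied by the padding factor, which the slides do not spell out; both function names are hypothetical.

```javascript
// Space reserved for a document, assuming size * padding factor, with the
// factor kept inside the 1.0 to 2.0 range from the slide.
function allocatedRecordBytes(docBytes, paddingFactor) {
  const clamped = Math.min(2.0, Math.max(1.0, paddingFactor));
  return Math.ceil(docBytes * clamped);
}

// An update avoids a move only if the new document fits in the old record.
function updateFitsInPlace(allocatedBytes, newDocBytes) {
  return newDocBytes <= allocatedBytes;
}

const allocated = allocatedRecordBytes(1000, 1.5);
console.log(allocated);                          // 1500
console.log(updateFitsInPlace(allocated, 1400)); // true: updated in place
console.log(updateFitsInPlace(allocated, 1600)); // false: document must move
```

With a factor of 1.0 there is no slack at all, so under this model any growth forces a move.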

Download MongoDB
http://www.mongodb.org
and let us know what you think
@eliothorowitz @mongodb
10gen is hiring! http://www.10gen.com/jobs