Look Ma! No more blobs
Aparna Chaudhary
April 27, 2013
Binary storage using GridFS.
Transcript
Look Ma! No more blobs. Aparna Chaudhary. NoSQL matters @ Cologne, Germany, 2013.
EMBRACE POLYGLOT PERSISTENCE! STOP RDBMS ABUSE! KNOW YOUR USE CASE
We don't do rocket science... Read XML, Parse, Extract, Store.
Use Case (RA):
‣ Runtime support for document types
‣ Metadata definition provided at runtime
‣ Document type names: max 50 char
‣ Look up content based on metadata
Challenges (RA): But… the numbers make it interesting...
‣ Storage of up to one million documents of 10KB to 2GB per document type per year
‣ Write 1MB < x msec
‣ Retrieve 1MB < y msec
‣ ...and details
How? File System MongoDB RDBMS JCR Document Management
If you want to store files, it's logical to use the file system, ain't it?
File System: ✓ Ease of Use ✓ No special skill-set ✓ Backup and Recovery ✓ It's free!
How do I name them? Support for metadata storage? Performance with too many small files? Query - Administration? High Availability? Limitation on total number of files?
Relational database (RDBMS): Integrity, Consistency, Durability, Atomicity, Joins, Aggregations, Backups, High Availability. You name it, we have it!
RDBMS Developer’s Perspective
Challenge #1. RA: We need runtime support for document type.
Challenge #1. Dynamic DDL generation: DOC_1, DOC_2, DOC_3, DOC_4, DOC_5, DOC_6.
Challenge #1. DEV: String concatenations are ugly…
Challenge #1. DEV: Let's build a utility.
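The kind of utility the developers are talking about could look roughly like the sketch below. It is not taken from the talk: SQLite stands in for the relational database, and the function name `create_doc_table_ddl` and the column layout (`id`, `content`, one text column per metadata field) are assumptions made for illustration.

```python
import re
import sqlite3

# Table and column names cannot be bound as SQL parameters,
# so the only defence is to validate them before concatenating.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def create_doc_table_ddl(table_name, metadata_fields):
    """Build a CREATE TABLE statement for one document type at runtime."""
    for name in [table_name] + list(metadata_fields):
        if not IDENTIFIER.fullmatch(name):
            raise ValueError("unsafe identifier: %r" % name)
    # Hypothetical layout: a key, the binary content, one column per metadata field.
    columns = ["id INTEGER PRIMARY KEY", "content BLOB"]
    columns += ["%s TEXT" % field for field in metadata_fields]
    return "CREATE TABLE %s (%s)" % (table_name, ", ".join(columns))


# In-memory SQLite database used only to show that the generated DDL runs.
conn = sqlite3.connect(":memory:")
ddl = create_doc_table_ddl("DOC_1", ["fname", "lname", "country"])
conn.execute(ddl)
```

Even in this small form, every new document type means generating and executing schema changes at runtime, which is the extra work the next slide complains about.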
Challenge #1. More work.
Challenge #2. RA: Document type is 50 char long.
Challenge #2. TABLE NAME LIMITS. Wait… SQL-92 says 128 char? We rule. Let's support only 30 char.
Challenge #2. DEV: Let's create a mapping table (DOC_TYPE_MAPPING).
Challenge #2. Ugly unreadable table names!
So...finally... Read XML. DocumentType defined? If no: Dynamic DDL generation and Document Type Alias. If yes: Extract Metadata, Store Metadata, Store Content. Simple use case becomes complex...
Remember... Our Challenge. QA: Let's see if we are in spec for response time. DEV: Aah.. what about performance now?
MongoDB: Document Based, GridFS, B-Tree, Dynamic Schema, JSON, BSON, Query, Scalable. http://www.10gen.com/presentations/storage-engine-internals
Joins, Complex Transaction.
[Figure: a table with rows ID1 to ID5 and columns F1 to F5, next to documents that each carry a different set of fields; each database holds several collections.]
Concepts: Table = Collection, Column = Field, Row = Document, Database = Database.
GridFS: MongoDB divides the large content into chunks and stores metadata and chunks separately. http://docs.mongodb.org/manual/core/gridfs/
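To make the split concrete, here is a minimal pure-Python sketch of the idea. It does not talk to MongoDB and is not the driver's implementation; the function names are made up, and only the field names (`chunkSize`, `length`, `md5`, `filename`, `metadata`, `files_id`, `n`, `data`) follow the GridFS layout.

```python
import hashlib

CHUNK_SIZE = 256 * 1024  # 262144 bytes, the chunk size used in this deck


def to_gridfs_documents(file_id, filename, data, metadata=None, chunk_size=CHUNK_SIZE):
    """Split content into one metadata document and a list of chunk documents."""
    files_doc = {
        "_id": file_id,
        "chunkSize": chunk_size,
        "length": len(data),
        "md5": hashlib.md5(data).hexdigest(),
        "filename": filename,
        "metadata": metadata or {},
    }
    # Each chunk points back at the file and carries its sequence number n.
    chunks = [
        {"files_id": file_id, "n": n, "data": data[offset:offset + chunk_size]}
        for n, offset in enumerate(range(0, len(data), chunk_size))
    ]
    return files_doc, chunks


def reassemble(chunks):
    """Rebuild the original content by concatenating chunks in order of n."""
    return b"".join(c["data"] for c in sorted(chunks, key=lambda c: c["n"]))


payload = b"x" * 600000
files_doc, chunks = to_gridfs_documents("file-1", "health_record.xml", payload)
```

Metadata queries only touch the small `files` documents; the binary data is read from `chunks` when the content itself is requested.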
> mybucket.files
{
  "_id" : ObjectId("514d5cb8c2e6ea4329646a5c"),
  "chunkSize" : NumberLong(262144),
  "length" : NumberLong(103015),
  "md5" : "34d29a163276accc7304bd69c5520e55",
  "filename" : "health_record_2.xml",
  "contentType" : "application/xml",
  "uploadDate" : ISODate("2013-03-23T07:41:44.907Z"),
  "aliases" : null,
  "metadata" : { "fname" : "Aparna", "lname" : "Chaudhary", "country" : "Netherlands" }
}
ObjectId - 12 byte BSON:
  4 byte - seconds since epoch
  3 byte - machine id
  2 byte - process id
  3 byte - counter
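The 4 + 3 + 2 + 3 byte layout can be checked against the `_id` above with a few lines of standard-library Python. This is an illustrative sketch; `decode_object_id` is a made-up helper, not a driver function, and the process id is left as raw hex because the slide does not state its byte order.

```python
from datetime import datetime, timezone


def decode_object_id(hex_id):
    """Split a 12-byte ObjectId (given as 24 hex characters) into its parts."""
    raw = bytes.fromhex(hex_id)
    if len(raw) != 12:
        raise ValueError("an ObjectId is exactly 12 bytes")
    seconds = int.from_bytes(raw[0:4], "big")  # seconds since the Unix epoch
    return {
        "timestamp": datetime.fromtimestamp(seconds, tz=timezone.utc),
        "machine_id": raw[4:7].hex(),
        "process_id": raw[7:9].hex(),
        "counter": int.from_bytes(raw[9:12], "big"),
    }


parts = decode_object_id("514d5cb8c2e6ea4329646a5c")
```

The decoded timestamp is 2013-03-23 07:41:44 UTC, which agrees with the `uploadDate` of the same document.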
> mybucket.chunks
{
  "_id" : ObjectId("514d5cb8c2e6ea4329646a5d"),
  "files_id" : ObjectId("514d5cb8c2e6ea4329646a5c"),
  "n" : 0,
  "data" : BinData(0,...)
}
I'm storing a 10KB file, but would it use 256KB on disk?
Last chunk = FileSize % 256KB + metadata overhead (x).
1128KB file: 256 + 256 + 256 + 256 + (104 + x) KB.
10KB file: (10 + x) KB.
A chunk is only as big as it needs to be...
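The arithmetic on this slide can be reproduced with a short helper. It is a sketch only: `chunk_sizes_kb` is a hypothetical name, sizes are in KB as on the slide, and the per-chunk metadata overhead (the "x") is left out.

```python
def chunk_sizes_kb(file_size_kb, chunk_size_kb=256):
    """Return the payload size of each chunk for a file, in KB."""
    full_chunks, last_chunk = divmod(file_size_kb, chunk_size_kb)
    sizes = [chunk_size_kb] * full_chunks
    if last_chunk:
        # The final chunk holds only the remainder, not a full 256KB.
        sizes.append(last_chunk)
    return sizes


large = chunk_sizes_kb(1128)
small = chunk_sizes_kb(10)
```

For the 1128KB file this gives four full chunks and a 104KB remainder; the 10KB file gets a single 10KB chunk.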
Challenge #1. RA: We need runtime support for document type. DEV: MongoDB supports Dynamic Schema. You can use a collection per docType, and they are created dynamically.
Challenge #2. RA: Document type is 50 char long. DEV: MongoDB namespace can be up to 123 char.
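A quick way to see that a 50-character document type fits is to build the namespaces and measure them. The sketch below assumes, as an illustration, that each document type becomes a GridFS bucket named after it, so its collections are `<docType>.files` and `<docType>.chunks`; the database name `documents` and the helper names are hypothetical, and 123 is the limit quoted on the slide.

```python
MAX_NAMESPACE_LEN = 123  # limit quoted on the slide


def gridfs_namespaces(db_name, doc_type):
    """Namespaces (database.collection) of a GridFS bucket named after the doc type."""
    return [
        "%s.%s.files" % (db_name, doc_type),
        "%s.%s.chunks" % (db_name, doc_type),
    ]


def fits_namespace_limit(db_name, doc_type):
    """True if both bucket namespaces stay within the limit, measured in bytes."""
    return all(
        len(ns.encode("utf-8")) <= MAX_NAMESPACE_LEN
        for ns in gridfs_namespaces(db_name, doc_type)
    )


longest_doc_type = "d" * 50  # the maximum length allowed by the use case
ok = fits_namespace_limit("documents", longest_doc_type)
```

With these assumptions the longest namespace is 67 characters, so no alias or mapping table is needed.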
So...finally... Read XML, Extract Metadata, Store Metadata & Content. Simple use case remains simple... well, becomes simpler...
Remember... Our Challenge. QA: Let's see if we are in spec for response time. DEV: Performance test is part of our definition of 'DONE'.
Because seeing is believing! Demo:
‣ GridFS 2.4.0
‣ PostgreSQL 9.2
‣ Spring Data
‣ JMeter 2.7
‣ Mac OS X 10.8.3, 2.3GHz Quad-Core Intel Core i7, 16GB RAM
https://github.com/aparnachaudhary/nosql-matters-demo
EMBRACE POLYGLOT PERSISTENCE! STOP RDBMS ABUSE! KNOW YOUR USE CASE
Thank You! Java Developer, Data Lover. Eindhoven, Netherlands. http://blog.aparnachaudhary.com/ @aparnachaudhary