Slide 1

Slide 1 text

Schema Design by Alex Litvinok

Slide 2

Slide 2 text

Schema Design Basic unit of data – Document..

Slide 3

Slide 3 text

Schema Design What is a document?
• BSON document
• Embedding
• Links across documents

Slide 4

Slide 4 text

Schema Design Example

    event = {
      _id: ObjectId('47cc67093475061e3d95369d'),
      name: 'MeetUP #2',
      date: ISODate('2012-04-05 19:00:00'),
      where: {
        city: 'Minsk',
        address: 'Nezavisimosti, 186'
      }
    }

Slide 5

Slide 5 text

Schema Design RDBMS? @#$.? NoSQL!

    Relational DB    Document DB
    Database         Database
    Table            Collection
    Row(s)           Document
    Index            Index
    Join             Embedding and links
    Partition        Shard
    Partition key    Shard key

Slide 6

Slide 6 text

Schema Design Why?
• Make queries easy and fast
• Facilitate sharding and atomicity

Slide 7

Slide 7 text

Schema Design Strategy
• Start with a normalized model
• Embed docs for simplicity and optimization

Slide 8

Slide 8 text

Schema Design Normalized? Denormalized?

Slide 9

Slide 9 text

Schema Design Normalized schema

    Order = {
      _id: orderId,
      user: userInfo,
      items: [
        productId1, productId2, productId3
      ]
    }

    Product = {
      _id: productId,
      name: name,
      price: price,
      desc: description
    }

Order: _id, user, items*
Product: _id, name, price, desc
* Link to the product collection
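Reading a normalized order takes a second lookup to turn the linked product ids into products. A minimal in-memory sketch of that application-side join in plain JavaScript; the sample data and the `resolveItems` helper are hypothetical, and against a real database the lookup would be a second query on the product collection:

```javascript
// Hypothetical sample data standing in for the two collections.
const products = [
  { _id: 1, name: "Ticket", price: 20, desc: "Entry ticket" },
  { _id: 2, name: "T-shirt", price: 15, desc: "Event T-shirt" },
];
const order = { _id: 100, user: "alex", items: [1, 2] };

// Resolve the linked ids into full product documents (the "join").
function resolveItems(order, products) {
  const byId = new Map(products.map((p) => [p._id, p]));
  return order.items.map((id) => byId.get(id));
}

const resolved = resolveItems(order, products);
console.log(resolved.map((p) => p.name));
```

The order stays small and product changes happen in one place, at the cost of the extra lookup on every read.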

Slide 10

Slide 10 text

Schema Design Normalized schema
• Normalized documents are a perfectly acceptable way to use MongoDB.
• Normalized documents provide maximum flexibility.

Slide 11

Slide 11 text

Schema Design Links across documents

DBRef

    { $ref : <collname>, $id : <idvalue>[, $db : <dbname>] }

Or simply store the _id of the linked document.

Slide 12

Slide 12 text

Schema Design Denormalized schema

    Order = {
      _id: orderId,
      user: userInfo,
      items: [
        { _id: productId1, name: name1, price: price1 },
        { _id: productId2, name: name2, price: price2 }
      ]
    }

Order: _id, user, items
Each item: _id, name, price

Slide 13

Slide 13 text

Schema Design Denormalized schema
• Embedded documents are good for fast queries.
• Embedded documents are always available together with their parent document.
• Embedded and nested documents are good for storing complex hierarchies.
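Because the items are embedded, everything needed to display an order arrives with the order itself. A small sketch in plain JavaScript, assuming a hypothetical order document shaped like the denormalized schema; `orderTotal` is an illustrative helper, not part of any API:

```javascript
// Hypothetical denormalized order: product name and price are copied into each item.
const embeddedOrder = {
  _id: 100,
  user: "alex",
  items: [
    { _id: 1, name: "Ticket", price: 20 },
    { _id: 2, name: "T-shirt", price: 15 },
  ],
};

// No second lookup: the total comes straight from the embedded items.
function orderTotal(order) {
  return order.items.reduce((sum, item) => sum + item.price, 0);
}

console.log(orderTotal(embeddedOrder));
```

The trade-off is that a later price change in the product collection does not reach orders that already copied the old price.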

Slide 14

Slide 14 text

Schema Design Embedding documents

    {
      title: "Contributors",
      data: [
        { name: "Grover" },
        { name: "James", surname: "Madison" },
        { surname: "Grant" }
      ]
    }

Slide 15

Slide 15 text

Schema Design ..fast queries

Slide 16

Slide 16 text

Schema Design Indexes

Basics

    > db.collection.ensureIndex({ name: 1 });

Indexing on embedded fields (the dotted field name must be quoted)

    > db.collection.ensureIndex({ "location.city": 1 })

Compound keys

    > db.collection.ensureIndex({ name: 1, age: -1 })
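A compound key such as { name: 1, age: -1 } orders entries by name ascending and, within equal names, by age descending. A sketch of that ordering as a plain JavaScript comparator; `compareByKeySpec` and the sample data are hypothetical, and the code models only the sort order, not how MongoDB stores an index:

```javascript
// Build a comparator from a key specification like { name: 1, age: -1 }.
function compareByKeySpec(spec) {
  const keys = Object.entries(spec); // [[field, direction], ...] in declared order
  return (a, b) => {
    for (const [field, direction] of keys) {
      if (a[field] < b[field]) return -direction;
      if (a[field] > b[field]) return direction;
    }
    return 0; // equal on every key
  };
}

// Hypothetical sample documents.
const people = [
  { name: "Bob", age: 30 },
  { name: "Ann", age: 25 },
  { name: "Ann", age: 40 },
];

const ordered = [...people].sort(compareByKeySpec({ name: 1, age: -1 }));
console.log(ordered);
```

The order of the keys matters: the second key only breaks ties left by the first.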

Slide 17

Slide 17 text

Schema Design Also indexes..

The _id index
• Created automatically, except on capped collections
• Is special and cannot be deleted
• Enforces uniqueness for its keys

Indexing array elements
• An index entry is created for each element of the array

Compound keys
• Each key has a direction (1 for ascending or -1 for descending)
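Indexing an array field means one index entry per array element, so a document can be found through any of its elements. A rough in-memory model in plain JavaScript; `buildMultikeyIndex`, the `tags` field and the sample documents are hypothetical and only illustrate the idea:

```javascript
// Hypothetical documents with an array field.
const docs = [
  { _id: 1, tags: ["mongodb", "schema"] },
  { _id: 2, tags: ["schema", "index"] },
];

// Map each array element to the ids of the documents that contain it.
function buildMultikeyIndex(docs, field) {
  const index = new Map();
  for (const doc of docs) {
    for (const value of doc[field]) {
      if (!index.has(value)) index.set(value, []);
      index.get(value).push(doc._id);
    }
  }
  return index;
}

const tagIndex = buildMultikeyIndex(docs, "tags");
console.log(tagIndex.get("schema"));
```

Large arrays therefore produce many index entries per document, which is worth keeping in mind when choosing what to embed.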

Slide 18

Slide 18 text

Schema Design Again indexes...

Create options: sparse, unique, dropDups, background, v…

Geospatial indexing

    > db.places.ensureIndex( { loc : "2d" } )
    > db.places.ensureIndex( { loc : "2d" } , { min : -500 , max : 500 } )
    > db.places.ensureIndex( { loc : "2d" } , { bits : 26 } )

Slide 19

Slide 19 text

Schema Design Analysis and Optimization Profiler | Explain

Slide 20

Slide 20 text

Schema Design Database Profiler

Profiling level
• 0 - off
• 1 - log slow operations (by default, >100ms is considered slow)
• 2 - log all operations

    > db.setProfilingLevel(2);

Slide 21

Slide 21 text

Schema Design Database Profiler

Viewing the data: collection system.profile

    > db.system.profile.find()
    { "ts" : "Thu Jan 29 2009 15:19:32 GMT-0500 (EST)",
      "info" : "query test.$cmd ntoreturn:1 reslen:66 nscanned:0 query: { profile: 2 } nreturned:1 bytes:50",
      "millis" : 0 }

Slide 22

Slide 22 text

Schema Design Explain

    > db.collection.find( … ).explain()
    {
      cursor : "BasicCursor",
      indexBounds : [ ],
      nscanned : 57594,
      nscannedObjects : 57594,
      nYields : 2,
      n : 3,
      millis : 108,
      indexOnly : false,
      isMultiKey : false,
      nChunkSkips : 0
    }
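In the explain output on this slide, "BasicCursor" means no index was used, and 57594 documents were scanned to return 3. A small sketch of how such a plan could be checked in plain JavaScript; `reviewPlan` and its `maxRatio` threshold of 10 are assumptions made for illustration, not MongoDB features:

```javascript
// Fields copied from the explain output on the slide.
const plan = {
  cursor: "BasicCursor",
  nscanned: 57594,
  nscannedObjects: 57594,
  n: 3,
  millis: 108,
  indexOnly: false,
};

// Hypothetical helper: collect reasons why a plan may need an index.
// maxRatio is an arbitrary threshold chosen for this example.
function reviewPlan(plan, maxRatio = 10) {
  const notes = [];
  if (plan.cursor === "BasicCursor") notes.push("full collection scan");
  const ratio = plan.n > 0 ? plan.nscanned / plan.n : plan.nscanned;
  if (ratio > maxRatio) notes.push(`scanned ${plan.nscanned} to return ${plan.n}`);
  return notes;
}

console.log(reviewPlan(plan));
```

A plan that scans about as many documents as it returns produces no notes under this check.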

Slide 23

Slide 23 text

Schema Design From theory to Actions..

Slide 24

Slide 24 text

Schema Design Seating plan

    {
      _id: ObjectId,
      event_id: ObjectId,
      seats: { A1: 1, A2: 1, A3: 0, … H30: 0 }
    }

Slide 25

Slide 25 text

Schema Design Seating plan

    {
      _id: { event_id: ObjectId, seat: 'C9' },
      updated: new Date(),
      state: 'AVAILABLE'
    }
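With one document per seat, a reservation can be a single conditional change to one document: it applies only if the seat is still available. An in-memory sketch of that check-and-set in plain JavaScript; `reserveSeat`, `seatKey` and the 'RESERVED' state are hypothetical names, and in a database the same idea would be a conditional update that matches on both the _id and the state:

```javascript
// Hypothetical in-memory stand-in for the seats collection, keyed by event and seat.
const seats = new Map();
const seatKey = (eventId, seat) => `${eventId}:${seat}`;
seats.set(seatKey("e1", "C9"), { state: "AVAILABLE", updated: new Date() });

// Reserve only if the seat exists and is still available; report whether it changed.
function reserveSeat(eventId, seat) {
  const doc = seats.get(seatKey(eventId, seat));
  if (!doc || doc.state !== "AVAILABLE") return false;
  doc.state = "RESERVED"; // assumed state name, not taken from the slides
  doc.updated = new Date();
  return true;
}

const first = reserveSeat("e1", "C9"); // first attempt on a free seat
const second = reserveSeat("e1", "C9"); // second attempt on the same seat
console.log(first, second);
```

Two buyers competing for the same seat touch only that seat's document, rather than one large per-event document holding every seat.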

Slide 26

Slide 26 text

Schema Design Feed reader
• Users
• Feed
• Entries

Slide 27

Slide 27 text

Schema Design Feed reader

Storage: users

    {
      _id: ObjectId,
      name: 'username',
      feeds: [ ObjectId, ObjectId, … ]
    }

Slide 28

Slide 28 text

Schema Design Feed reader

Storage: feeds

    {
      _id: ObjectId,
      url: 'http://bbc.com/news/feed',
      name: 'BBC News',
      latest: new Date('2012-01-10T12:30:13Z'),
      entries: [{
        latest: new Date('2012-01-10T12:30:13Z'),
        title: 'Bomb kills Somali sport officials',
        description: '…',
        …
      }]
    }
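The feed document keeps its entries embedded and repeats the newest entry date in a top-level `latest` field, so the two have to be updated together. A minimal sketch in plain JavaScript, assuming a feed object shaped like the one on the slide; `addEntry` and the sample entry are hypothetical:

```javascript
// Hypothetical feed document following the shape on the slide.
const feed = {
  url: "http://bbc.com/news/feed",
  name: "BBC News",
  latest: new Date("2012-01-10T12:30:13Z"),
  entries: [],
};

// Append an entry and keep feed.latest at the newest entry date.
function addEntry(feed, entry) {
  feed.entries.push(entry);
  if (entry.latest > feed.latest) feed.latest = entry.latest;
  return feed;
}

addEntry(feed, { latest: new Date("2012-01-11T08:00:00Z"), title: "Example entry" });
console.log(feed.latest.toISOString());
```

Keeping `latest` at the top level lets a reader sort or filter feeds by freshness without inspecting the embedded entries.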

Slide 29

Slide 29 text

Schema Design Some tips
1. Duplicate data for speed, reference data for integrity
2. Try to fetch data in a single query
3. Design documents to be self-sufficient
4. Override _id when you have your own simple, unique id
5. Don't always use an index

Slide 30

Slide 30 text

Schema Design Conclusion
• Embedded docs are good for fast queries
• Embedded and nested docs are good for storing hierarchies
• Normalized docs are perfectly acceptable

Slide 31

Slide 31 text

Schema Design ? ???