MongoDB Schema Design | Alex Litvinok

Alex Litvinok
Minsk MongoDB User Group, Meetup #2

Transcript

  1. Schema Design: Example

    event = {
      _id: ObjectId('47cc67093475061e3d95369d'),
      name: 'MeetUP #2',
      date: ISODate('2012-04-05 19:00:00'),
      where: { city: 'Minsk', address: 'Nezavisimosti, 186' }
    }
  2. Schema Design: RDBMS? @#$.? NoSQL!

    Relational DB    Document DB
    Database         Database
    Table            Collection
    Row(s)           Document
    Index            Index
    Join             Embedding and Links
    Partition        Shard
    Partition Key    Shard Key
  3. Schema Design: Why?

    • Make queries easy and fast
    • Facilitate sharding and atomicity
  4. Schema Design: Strategy

    • Start with a normalized model
    • Embed docs for simplicity and optimization
  5. Schema Design: Normalized schema

    Order = {
      _id: orderId,
      user: userInfo,
      items: [ productId1, productId2, productId3 ]
    }

    Product = {
      _id: productId,
      name: name,
      price: price,
      desc: description
    }

    Diagram: Order (_id, user, items*) and Product (_id, name, price, desc)
    * Link to the Product collection
  6. Schema Design: Normalized schema

    • Normalized documents are a perfectly acceptable way to use MongoDB.
    • Normalized documents provide maximum flexibility.
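
With the normalized Order and Product documents, reading an order together with its product details takes a second lookup. A minimal sketch in plain JavaScript, using in-memory arrays as stand-ins for the two collections; the sample values and the `resolveOrderItems` helper are hypothetical and not part of the talk:

```javascript
// In-memory stand-ins for the Product and Order collections (sample data is made up).
const products = [
  { _id: "p1", name: "Keyboard", price: 30, desc: "USB keyboard" },
  { _id: "p2", name: "Mouse", price: 10, desc: "Optical mouse" },
];
const order = { _id: "o1", user: "alice", items: ["p1", "p2"] };

// Hypothetical helper: resolve the product ids stored in order.items
// against the product collection (the extra lookup a normalized model needs).
function resolveOrderItems(order, products) {
  const byId = new Map(products.map((p) => [p._id, p]));
  return order.items.map((id) => byId.get(id));
}

const resolved = resolveOrderItems(order, products);
console.log(resolved.map((p) => p.name));
```

The flexibility comes from each product living in exactly one place: a price change touches one document, and every order sees it on the next lookup.
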
  7. Schema Design: Links across documents

    DBRef
      { $ref : <collname>, $id : <idvalue>[, $db : <dbname>] }

    Or simple storage of the _id.
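
A DBRef only names a collection and an `_id`; following the link is still a separate lookup. A rough sketch of that lookup over an in-memory object, where `store` and `resolveRef` are hypothetical names and the optional `$db` field is ignored because the sketch has a single database:

```javascript
// In-memory stand-in for a database: collection name -> array of documents (made-up data).
const store = {
  products: [{ _id: 1, name: "Keyboard" }],
  users: [{ _id: 7, name: "alice" }],
};

// Hypothetical helper: follow a DBRef-shaped link { $ref, $id }.
// Returns null when the collection or the document does not exist.
function resolveRef(store, ref) {
  const collection = store[ref.$ref] || [];
  return collection.find((doc) => doc._id === ref.$id) || null;
}

const linked = resolveRef(store, { $ref: "products", $id: 1 });
const missing = resolveRef(store, { $ref: "products", $id: 99 });
console.log(linked, missing);
```
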
  8. Schema Design: Denormalized schema

    Order = {
      _id: orderId,
      user: userInfo,
      items: [
        { _id: productId1, name: name1, price: price1 },
        { _id: productId2, name: name2, price: price2 }
      ]
    }

    Diagram: Order (_id, user, items), where each item embeds _id, name, price
  9. Schema Design: Denormalized schema

    • Embedded documents are good for fast queries.
    • Embedded documents are always available with the parent document.
    • Embedded and nested documents are good for storing complex hierarchies.
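
Because the embedded items travel with the parent document, an order can be processed without touching the product collection. A small sketch under that assumption; the sample order and the `orderTotal` helper are made up for illustration:

```javascript
// Denormalized order: product name and price are copied into the order (sample data is made up).
const embeddedOrder = {
  _id: "o1",
  user: "alice",
  items: [
    { _id: "p1", name: "Keyboard", price: 30 },
    { _id: "p2", name: "Mouse", price: 10 },
  ],
};

// Hypothetical helper: everything needed is already inside the parent document.
function orderTotal(order) {
  return order.items.reduce((sum, item) => sum + item.price, 0);
}

const total = orderTotal(embeddedOrder);
console.log(total);
```

The trade-off is that the copied name and price no longer follow later changes to the product, which may or may not be what an order history wants.
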
  10. Schema Design: Embedding documents

    {
      title: "Contributors",
      data: [
        { name: "Grover" },
        { name: "James", surname: "Madison" },
        { surname: "Grant" }
      ]
    }
  11. Schema Design: Indexes

    Basics
    > db.collection.ensureIndex({ name: 1 });

    Indexing on Embedded Fields
    > db.collection.ensureIndex({ "location.city": 1 })

    Compound Keys
    > db.collection.ensureIndex({ name: 1, age: -1 })
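
The compound key `{ name: 1, age: -1 }` orders entries by name ascending and, within equal names, by age descending. A sketch of that ordering in plain JavaScript; `compareBySpec` and the sample people are hypothetical, and this only illustrates the sort order, not how MongoDB stores its indexes:

```javascript
// Hypothetical helper: build a comparator from an index spec such as { name: 1, age: -1 }.
function compareBySpec(spec) {
  const keys = Object.entries(spec);
  return (a, b) => {
    for (const [field, direction] of keys) {
      if (a[field] < b[field]) return -direction;
      if (a[field] > b[field]) return direction;
    }
    return 0; // equal on every key in the spec
  };
}

// Made-up sample documents.
const people = [
  { name: "Bob", age: 25 },
  { name: "Ann", age: 30 },
  { name: "Ann", age: 41 },
];

const ordered = [...people].sort(compareBySpec({ name: 1, age: -1 }));
console.log(ordered);
```
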
  12. Schema Design: Also indexes..

    The _id Index
    • Automatically created, except for capped collections
    • The index is special and cannot be deleted
    • Enforces uniqueness for its keys

    Indexing Array Elements
    • Indexes each element of the array

    Compound Keys
    • Direction of the index (1 for ascending or -1 for descending)
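
Indexing an array field produces one index entry per array element, each pointing back at the containing document. A conceptual sketch with a `Map`; the `posts` data and `buildArrayIndex` helper are invented for illustration and say nothing about MongoDB's internal index layout:

```javascript
// Made-up documents with an array field.
const posts = [
  { _id: 1, tags: ["mongodb", "schema"] },
  { _id: 2, tags: ["mongodb", "index"] },
];

// Hypothetical helper: one index entry per array element,
// each entry listing the _id values of the documents that contain it.
function buildArrayIndex(docs, field) {
  const index = new Map();
  for (const doc of docs) {
    for (const value of doc[field]) {
      if (!index.has(value)) index.set(value, []);
      index.get(value).push(doc._id);
    }
  }
  return index;
}

const tagIndex = buildArrayIndex(posts, "tags");
console.log(tagIndex.get("mongodb"));
```
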
  13. Schema Design: Again indexes...

    Create options
    sparse, unique, dropDups, background, v…

    Geospatial Indexing
    > db.places.ensureIndex( { loc : "2d" } )
    > db.places.ensureIndex( { loc : "2d" } , { min : -500 , max : 500 } )
    > db.places.ensureIndex( { loc : "2d" } , { bits : 26 } )
  14. Schema Design: Database Profiler

    Profiling Level
    • 0 - off
    • 1 - log slow operations (by default, >100ms is considered slow)
    • 2 - log all operations

    > db.setProfilingLevel(2);
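
The three profiling levels amount to a simple decision per operation. A sketch of that rule as a plain function; `shouldProfile` is a hypothetical name, and the 100 ms default is the threshold quoted on the slide:

```javascript
// Hypothetical helper mirroring the three profiling levels.
// slowMs defaults to 100, the "slow" threshold quoted on the slide.
function shouldProfile(level, millis, slowMs = 100) {
  if (level === 2) return true;            // log all operations
  if (level === 1) return millis > slowMs; // log slow operations only
  return false;                            // level 0: profiling is off
}

console.log(shouldProfile(1, 108)); // a 108 ms operation at level 1
```
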
  15. Schema Design: Database Profiler

    Viewing the Data: collection system.profile

    > db.system.profile.find()
    { "ts" : "Thu Jan 29 2009 15:19:32 GMT-0500 (EST)" ,
      "info" : "query test.$cmd ntoreturn:1 reslen:66 nscanned:0 query: { profile: 2 } nreturned:1 bytes:50" ,
      "millis" : 0 }
  16. Schema Design: Explain

    > db.collection.find( … ).explain()
    {
      cursor : "BasicCursor",
      indexBounds : [ ],
      nscanned : 57594,
      nscannedObjects : 57594,
      nYields : 2,
      n : 3,
      millis : 108,
      indexOnly : false,
      isMultiKey : false,
      nChunkSkips : 0
    }
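
In the explain output, `BasicCursor` indicates a full collection scan, and comparing `nscanned` with `n` shows how much work each returned document cost. A sketch that reads those two signals from the output copied as a plain object; `summarizePlan` is a hypothetical helper, not a MongoDB API:

```javascript
// The explain() output from the slide, copied as a plain object.
const plan = {
  cursor: "BasicCursor", indexBounds: [], nscanned: 57594, nscannedObjects: 57594,
  nYields: 2, n: 3, millis: 108, indexOnly: false, isMultiKey: false, nChunkSkips: 0,
};

// Hypothetical helper: flag plans that look like they want an index.
function summarizePlan(plan) {
  return {
    usesIndex: plan.cursor !== "BasicCursor", // BasicCursor means no index was used
    scannedPerResult: plan.n > 0 ? plan.nscanned / plan.n : plan.nscanned,
  };
}

const summary = summarizePlan(plan);
console.log(summary);
```

Scanning tens of thousands of documents to return three is the kind of ratio that usually points at a missing index on the queried field.
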
  17. Schema Design: Seating plan

    {
      _id: { event_id: ObjectId, seat: 'C9' },
      updated: new Date(),
      state: 'AVAILABLE'
    }
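
Using the compound `{ event_id, seat }` value as `_id` means there can be only one document per seat per event, so a booking becomes a check-and-set on a single document. A sketch of that idea over an in-memory `Map`; the `BOOKED` state and the `addSeat`, `bookSeat`, and `seatKey` helpers are assumptions added for illustration:

```javascript
// In-memory stand-in for the seats collection, keyed by the compound _id.
const seats = new Map();
const seatKey = (id) => `${id.event_id}:${id.seat}`;

// Hypothetical helper: create a seat document in the shape shown on the slide.
function addSeat(eventId, seat) {
  const _id = { event_id: eventId, seat };
  seats.set(seatKey(_id), { _id, updated: new Date(), state: "AVAILABLE" });
}

// Hypothetical helper: book a seat only if it exists and is still available.
// "BOOKED" is an assumed state name; the slide only shows the available state.
function bookSeat(eventId, seat) {
  const doc = seats.get(seatKey({ event_id: eventId, seat }));
  if (!doc || doc.state !== "AVAILABLE") return false;
  doc.state = "BOOKED";
  doc.updated = new Date();
  return true;
}

addSeat("event1", "C9");
const firstAttempt = bookSeat("event1", "C9");
const secondAttempt = bookSeat("event1", "C9");
console.log(firstAttempt, secondAttempt);
```
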
  18. Schema Design: Feed reader

    Storage: users

    {
      _id: ObjectId,
      name: 'username',
      feeds: [ ObjectId, ObjectId, … ]
    }
  19. Schema Design: Feed reader

    Storage: feeds

    {
      _id: ObjectId,
      url: 'http://bbc.com/news/feed',
      name: 'BBC News',
      latest: Date('2012-01-10T12:30:13Z'),
      entries: [{
        latest: Date('2012-01-10T12:30:13Z'),
        title: 'Bomb kills Somali sport officials',
        description: '…',
        …
      }]
    }
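
The feed document keeps a top-level `latest` date alongside the per-entry dates, so that duplicated value has to be kept in sync whenever an entry is added. A sketch of one way to do that on a plain object; the `addEntry` helper and the second entry are hypothetical:

```javascript
// A feed document in the shape shown on the slide (first entry taken from the slide).
const feed = {
  _id: "f1",
  url: "http://bbc.com/news/feed",
  name: "BBC News",
  latest: new Date("2012-01-10T12:30:13Z"),
  entries: [
    { latest: new Date("2012-01-10T12:30:13Z"), title: "Bomb kills Somali sport officials" },
  ],
};

// Hypothetical helper: append an entry and keep the denormalized
// feed-level `latest` equal to the newest entry date.
function addEntry(feed, entry) {
  feed.entries.push(entry);
  if (entry.latest > feed.latest) feed.latest = entry.latest;
}

// Made-up entry for illustration.
addEntry(feed, { latest: new Date("2012-01-11T08:00:00Z"), title: "Example headline" });
console.log(feed.latest.toISOString());
```
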
  20. Schema Design: Some tips

    1. Duplicate data for speed, reference data for integrity
    2. Try to fetch data in a single query
    3. Design documents to be self-sufficient
    4. Override _id when you have your own simple, unique id
    5. Don't always use an index
  21. Schema Design: Conclusion

    • Embedded docs are good for fast queries
    • Embedded and nested docs are good for storing hierarchies
    • Normalized docs are a most acceptable