Slide 1

Slide 1 text

Schema Design by Example
Matthew Shopsin, Software Engineer
[email protected]

Slide 2

Slide 2 text

Agenda
• MongoDB Data Model
• Blog Posts & Comments
• Geospatial Check-Ins
• Q&A

Slide 3

Slide 3 text

Embedded Documents
{
  _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
  author : "roger",
  date : "Sat Jul 24 2010 19:47:11 GMT-0700 (PDT)",
  text : "Spirited Away",
  tags : [ "Tezuka", "Manga" ],
  comments : [
    {
      author : "Fred",
      date : "Sat Jul 24 2010 20:51:03 GMT-0700 (PDT)",
      text : "Best Movie Ever"
    }
  ]
}

Slide 4

Slide 4 text

Parallels
RDBMS            MongoDB
Table            Collection
Row              Document
Column           Field
Index            Index
Join             Embedding & Linking
Schema Object

Slide 5

Slide 5 text

Relational
User
• Name
• Email Address
Category
• Name
• Url
Article
• Name
• Slug
• Publish date
• Text
Tag
• Name
• Url
Comment
• Comment
• Date
• Author

Slide 6

Slide 6 text

MongoDB
User
• Name
• Email Address
Article
• Name
• Slug
• Publish date
• Text
• Author
Tag[]
• Value
Comment[]
• Comment
• Date
• Author
Category[]
• Value
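The MongoDB model above could be sketched as a single article document, with Tag[], Comment[], and Category[] embedded as arrays and the author stored as a link to a user. All field names and sample values below are illustrative assumptions, not taken from the slides.

```javascript
// Hypothetical user document; the _id and email are made up for illustration.
const user = { _id: "user-roger", name: "roger", email: "roger@example.com" };

// Hypothetical article document: tags, comments and categories are embedded,
// while the author is linked through the user's _id.
const article = {
  name: "Spirited Away",
  slug: "spirited-away",
  publishDate: new Date("2010-07-24T19:47:11-07:00"),
  text: "...",
  author: user._id,
  tags: ["Tezuka", "Manga"],
  comments: [
    {
      comment: "Best Movie Ever",
      date: new Date("2010-07-24T20:51:03-07:00"),
      author: "Fred"
    }
  ],
  categories: ["Movies"]
};

// Everything needed to render the article lives in one document,
// so a single read returns the post together with its tags and comments.
console.log(article.tags.length, article.comments[0].author);
```

Only the author requires a second lookup in this sketch, which mirrors the "Join → Embedding & Linking" row of the parallels table.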

Slide 7

Slide 7 text

Blog Posts and Comments

Slide 8

Slide 8 text

How Should the Documents Look? What Are We Going to Do with the Data?

Slide 9

Slide 9 text

1) Fully Embedded
{
  "blog-title": "Commuting to Work",
  "blog-text": [
    "This section is about airplanes",
    "this section is about trains"
  ],
  comments: [
    { author: "Kevin Hanson", comment: "dude, what about driving?" },
    { author: "John Smith", comment: "this blog is aWful!!11!!!!" }
  ]
}

Slide 10

Slide 10 text

1) Fully Embedded
Pros
• Can query the comments or the blog for results
• Cleanly encapsulated
Cons
• What if we get too many comments? (16MB MongoDB doc size)
• What if we want our results to be comments, not blog posts?

Slide 11

Slide 11 text

2) Each Comment Gets Own Doc
{
  "blog-title": "Commuting to Work",
  "blog-text": [
    "This section is about airplanes",
    "this section is about trains"
  ]
}
{ commenter: "Kevin Hanson", comment: "dude, what about driving?" }
{ commenter: "John Smith", comment: "this blog is aWful!!11!!!!" }

Slide 12

Slide 12 text

2) Each Comment Gets Own Doc
Pros
• Can Query Individual Comments
• Never Need to Worry About Doc Size
Cons
• Many Documents
• Standard Use Cases Become Complicated

Slide 13

Slide 13 text

Managing Arrays
Pushing to an Array Infinitely...
• Document Will Grow Larger than Allocated Space
• Document May Exceed Max Doc Size of 16MB
Can this be avoided?
• Yes!
• A Hybrid of Linking and Embedding
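One possible reading of "a hybrid of linking and embedding" is to store every comment as its own linked document while keeping only the newest few embedded in the post. The sketch below builds the two write payloads as plain objects. The collection names, the field names (`recentComments`, `commentCount`, `postId`), and the cap of 3 are assumptions for illustration; the slides do not specify them. It relies on the `$each` and `$slice` modifiers of `$push`, which exist in MongoDB 2.4 and later.

```javascript
// Sketch: build the writes for adding one comment under a hybrid schema.
function buildCommentWrites(postId, comment, maxEmbedded) {
  return {
    // 1) The full comment is stored as its own document, linked by postId.
    commentDoc: Object.assign({ postId: postId }, comment),
    // 2) The post keeps only the newest `maxEmbedded` comments embedded,
    //    so its array (and the document) stays bounded in size.
    postFilter: { _id: postId },
    postUpdate: {
      $push: { recentComments: { $each: [comment], $slice: -maxEmbedded } },
      $inc: { commentCount: 1 }
    }
  };
}

const writes = buildCommentWrites(
  "post-1",
  { author: "Kevin Hanson", comment: "dude, what about driving?" },
  3
);

// In the mongo shell these would be applied roughly as:
//   db.comments.insert(writes.commentDoc)
//   db.posts.update(writes.postFilter, writes.postUpdate)
console.log(JSON.stringify(writes.postUpdate));
```

With this layout the common case (show a post with its latest comments) is still one read, and older comments are fetched from the linked collection only when needed.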

Slide 14

Slide 14 text

Geospatial Check-Ins

Slide 15

Slide 15 text

We Need 3 Things
• Places
• Check-Ins
• Users

Slide 16

Slide 16 text

Places
Q: Current location
A: Places near location
User Generated Content

Slide 17

Slide 17 text

Tags, Geo Coordinates, and Tips
{
  name: "10gen HQ",
  address: "578 Broadway, 7th Floor",
  city: "New York",
  zip: 10012,
  tags: ["MongoDB", "business"],
  latlong: [40.0, 72.0],
  tips: [
    { user: "kevin", time: "3/15/2012",
      tip: "Make sure to stop by for office hours!" }
  ]
}

Slide 18

Slide 18 text

Updating Tips
db.places.update(
  { name: "10gen HQ" },
  { $push: { tips: {
      user: "nosh",
      time: "3/15/2012",
      tip: "stop by for office hours on Wednesdays from 4-6"
  } } }
)

Slide 19

Slide 19 text

Querying Places
★ Creating Indexes
  db.places.ensureIndex({tags: 1})
  db.places.ensureIndex({name: 1})
  db.places.ensureIndex({latlong: "2d"})
★ Finding Places
  db.places.find({latlong: {$near: [40, 70]}})
★ Regular Expressions
  db.places.find({name: /^typeaheadstring/})
★ Using Tags
  db.places.find({tags: "business"})
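The typeahead query above works because the regular expression is anchored with `^`, which lets a case-sensitive prefix match use the index on `name`. A minimal sketch of building such a filter from user input follows; the helper names `escapeRegex` and `typeaheadFilter` are hypothetical, and the escaping step is an addition not shown on the slide.

```javascript
// Escape regex metacharacters so the typed text is matched literally.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build a filter for place names that start with the typed prefix.
function typeaheadFilter(prefix) {
  return { name: new RegExp("^" + escapeRegex(prefix)) };
}

const filter = typeaheadFilter("10gen (HQ");

// In the mongo shell this would be used as: db.places.find(filter)
console.log(filter.name.source);
```

Without the escaping, input such as an unbalanced parenthesis would produce an invalid pattern instead of a literal prefix match.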

Slide 20

Slide 20 text

User Check-Ins
Record User Check-Ins
[Diagram labels: Check-Ins, Users, Stats]

Slide 21

Slide 21 text

Users
user1 = {
  name: "Kevin Hanson",
  "e-mail": "[email protected]",
  "check-ins": [4b97e62bf1d8c7152c9ccb74, 5a20e62bf1d8c736ab]
}
check-ins [] = ObjectId references to Check-Ins Collection

Slide 22

Slide 22 text

Check-Ins
checkin = {
  place: "10gen HQ",
  ts: 9/20/2010 10:12:00,
  userId:
}
Every Check-In is Two Operations
• Insert a Check-In Object (check-ins collection)
• Update ($push) user object with check-in ID (users collection)
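The two operations above can be sketched as a helper that prepares both writes for one check-in. The function name `buildCheckinOps`, the string IDs, and the sample date are assumptions for illustration; in a real application the check-in ID would typically be a freshly generated ObjectId.

```javascript
// Sketch: prepare the insert and the $push update for a single check-in.
function buildCheckinOps(checkinId, userId, place, ts) {
  return {
    // Operation 1: the document to insert into the check-ins collection.
    checkin: { _id: checkinId, place: place, ts: ts, userId: userId },
    // Operation 2: append the check-in ID to the user's "check-ins" array.
    userFilter: { _id: userId },
    userUpdate: { $push: { "check-ins": checkinId } }
  };
}

const ops = buildCheckinOps(
  "checkin-1",                        // hypothetical check-in ID
  "user-1",                           // hypothetical user ID
  "10gen HQ",
  new Date("2010-09-20T10:12:00Z")
);

// In the mongo shell these would be applied roughly as:
//   db.checkins.insert(ops.checkin)
//   db.users.update(ops.userFilter, ops.userUpdate)
console.log(JSON.stringify(ops.userUpdate));
```

The two writes touch different collections, so they are not applied as one atomic unit; inserting the check-in first means a failed second step leaves an unreferenced check-in rather than a dangling reference on the user.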

Slide 23

Slide 23 text

Simple Stats
db.checkins.find({place: "10gen HQ"})
db.checkins.find({place: "10gen HQ"}).sort({ts: -1}).limit(10)

Slide 24

Slide 24 text

Stats w/ MapReduce
mapFunc = function() {
  emit(this.place, 1);
}
reduceFunc = function(key, values) {
  return Array.sum(values);
}
res = db.checkins.mapReduce(mapFunc, reduceFunc, {
  query: { ts: { $gt: nowminus3hrs } },
  out: { inline: 1 }
})
res.results: [{_id: "10gen HQ", value: 17}, …, …]
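To make the result of the map and reduce steps concrete, here is a plain JavaScript sketch that computes the same per-place counts over an in-memory array. It is not how MongoDB executes mapReduce; the function name `countCheckinsByPlace` and the sample check-ins are made up for illustration.

```javascript
// Count check-ins per place, considering only check-ins newer than `since`.
function countCheckinsByPlace(checkins, since) {
  const counts = {};
  for (const c of checkins) {
    if (c.ts > since) {
      // Equivalent to emit(place, 1) followed by summing the emitted values.
      counts[c.place] = (counts[c.place] || 0) + 1;
    }
  }
  return Object.keys(counts).map(function (place) {
    return { _id: place, value: counts[place] };
  });
}

// Hypothetical sample data.
const now = new Date("2010-09-20T12:00:00Z");
const threeHoursAgo = new Date(now.getTime() - 3 * 60 * 60 * 1000);
const sample = [
  { place: "10gen HQ", ts: new Date("2010-09-20T10:12:00Z") },
  { place: "10gen HQ", ts: new Date("2010-09-20T11:00:00Z") },
  { place: "Cafe", ts: new Date("2010-09-20T08:00:00Z") },   // older than 3 hours
  { place: "Cafe", ts: new Date("2010-09-20T11:30:00Z") }
];

const stats = countCheckinsByPlace(sample, threeHoursAgo);
console.log(JSON.stringify(stats));
```

The time filter plays the role of the `query` option, and the output has the same `{_id, value}` shape as the mapReduce results.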

Slide 25

Slide 25 text

More info at http://www.mongodb.org/
Matthew Shopsin, Software Engineer, 10gen
[email protected]