Slide 1

Slide 1 text

1 Sridhar Nanjundeswaran, 10gen [email protected] @snanjund

Slide 2

Slide 2 text

2
• Quick introduction to MongoDB
• Data modeling in MongoDB, queries, geospatial, updates, and map/reduce
• Using a location-based app as an example
• Example works in the MongoDB JS shell

Slide 3

Slide 3 text

3

Slide 4

Slide 4 text

4 MongoDB is a scalable, high-performance, open source, document-oriented database.
• Fast Querying
• In-place updates
• Full Index Support
• Replication / High Availability
• Auto-Sharding
• Aggregation; Map/Reduce
• GridFS

Slide 5

Slide 5 text

5 MongoDB is implemented in C++
• Windows, Linux, Mac OS X, Solaris
Drivers are available in many languages
10gen supported
• C, C# (.NET), C++, Erlang, Haskell, Java, JavaScript, Perl, PHP, Python, Ruby, Scala, Node.js
• Multiple community-supported drivers

Slide 6

Slide 6 text

6
RDBMS          MongoDB
Table          Collection
Row(s)         Document
Index          Index
Partition      Shard
Join           Embedding/Linking
Fixed Schema   Flexible/Implied Schema

Slide 7

Slide 7 text

7 {
  _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
  author : "Sridhar",
  date : ISODate("2012-02-02T11:52:27.442Z"),
  text : "About MongoDB...",
  tags : [ "tech", "databases", "nosql" ],
  comments : [{ author : "Doug", date : ISODate("2012-02-03T17:22:21.124Z"), text : "Best Post Ever!" }],
  comment_count : 1
}

Slide 8

Slide 8 text

8
• JSON has a powerful but limited set of datatypes
  – Mongo extends datatypes with Date, Int types, ObjectId, …
• MongoDB stores data in BSON
• BSON is a binary representation of JSON
  – Optimized for performance and navigational abilities
  – Also compression
See: bsonspec.org
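
A minimal sketch in plain JavaScript (no MongoDB needed) of why the extra datatypes matter: JSON has no date type, so a date pushed through JSON comes back as a string. The `post` object is illustrative sample data modeled on the earlier blog-post document.

```javascript
// JSON has no date type: a Date survives a JSON round trip only as a string.
const post = { author: "Sridhar", date: new Date("2012-02-02T11:52:27.442Z") };
const roundTripped = JSON.parse(JSON.stringify(post));

console.log(post.date instanceof Date);   // true
console.log(typeof roundTripped.date);    // "string"
```

BSON avoids this by carrying a dedicated date type alongside the JSON-style ones.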

Slide 9

Slide 9 text

9
• Intrinsic support for fast, iterative development
• Super low latency access to your data
• Very little CPU overhead
• No additional caching layer required
• Built-in replication and horizontal scaling support

Slide 10

Slide 10 text

10
• Want to build an app where users can check in to a location
• Leave notes or comments about that location

Slide 11

Slide 11 text

11 "As a user I want to be able to find other locations nearby"
• Need to store locations (Offices, Restaurants, etc.)
  – name, address, tags
  – coordinates
  – User generated content, e.g. tips / notes

Slide 12

Slide 12 text

12 "As a user I want to be able to 'checkin' to a location"
Checkins
  – User should be able to 'check in' to a location
  – Want to be able to generate statistics:
    • Recent checkins
    • Popular locations

Slide 13

Slide 13 text

13
users: user1, user2
locations: loc1, loc2, loc3
checkins: checkin1, checkin2

Slide 14

Slide 14 text

14 > location_1 = { name: "Lotus Flower", address: "123 University Ave", city: "Palo Alto", zipcode: 94012 }

Slide 15

Slide 15 text

15 > location_1 = { name: "Lotus Flower", address: "123 University Ave", city: "Palo Alto", zipcode: 94012 }
> db.locations.save(location_1)
> db.locations.find({name: "Lotus Flower"})

Slide 16

Slide 16 text

16 > location_1 = { name: "Lotus Flower", address: "123 University Ave", city: "Palo Alto", zipcode: 94012 }
> db.locations.ensureIndex({name: 1})
> db.locations.find({name: "Lotus Flower"})

Slide 17

Slide 17 text

17 > location_2 = { name: "Lotus Flower", address: "123 University Ave", city: "Palo Alto", zipcode: 94012, tags: ["restaurant", "dumplings"] }

Slide 18

Slide 18 text

18 > location_2 = { name: "Lotus Flower", address: "123 University Ave", city: "Palo Alto", zipcode: 94012, tags: ["restaurant", "dumplings"] }
> db.locations.ensureIndex({tags: 1})

Slide 19

Slide 19 text

19 > location_2 = { name: "Lotus Flower", address: "123 University Ave", city: "Palo Alto", zipcode: 94012, tags: ["restaurant", "dumplings"] }
> db.locations.ensureIndex({tags: 1})
> db.locations.find({tags: "dumplings"})
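
A rough in-memory sketch of the matching rule behind the tag query: an equality query on an array field matches when any element equals the value. The `findByTag` helper and the second and third sample documents are hypothetical, and the sketch says nothing about how the index itself is stored.

```javascript
// In-memory stand-in for db.locations (sample data; only the first entry is from the slides).
const locations = [
  { name: "Lotus Flower", tags: ["restaurant", "dumplings"] },
  { name: "Corner Cafe", tags: ["cafe"] },  // hypothetical
  { name: "No Tags Inc" }                   // hypothetical: the field may be absent
];

// Mimics find({tags: value}): an array field matches if any element equals the value.
function findByTag(docs, value) {
  return docs.filter(function (doc) {
    return Array.isArray(doc.tags) ? doc.tags.includes(value) : doc.tags === value;
  });
}

const hits = findByTag(locations, "dumplings");
console.log(hits.map(function (d) { return d.name; }));
```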

Slide 20

Slide 20 text

20 > location_3 = { name: "Lotus Flower", address: "123 University Ave", city: "Palo Alto", zipcode: 94012, tags: ["restaurant", "dumplings"], lat_long: [52.5184, 13.387] }

Slide 21

Slide 21 text

21 > location_3 = { name: "Lotus Flower", address: "123 University Ave", city: "Palo Alto", zipcode: 94012, tags: ["restaurant", "dumplings"], lat_long: [52.5184, 13.387] }
> db.locations.ensureIndex({lat_long: "2d"})

Slide 22

Slide 22 text

22 > location_3 = { name: "Lotus Flower", address: "123 University Ave", city: "Palo Alto", zipcode: 94012, tags: ["restaurant", "dumplings"], lat_long: [52.5184, 13.387] }
> db.locations.ensureIndex({lat_long: "2d"})
> db.locations.find({lat_long: {$near:[52.53, 13.4]}})
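
A minimal sketch of what a `$near` query yields, assuming flat (planar) distance as with a "2d" index: documents come back ordered nearest first. The `near` helper and the two extra places are hypothetical, and a real index avoids scanning and sorting every document.

```javascript
// In-memory stand-in for db.locations (only "Lotus Flower" is from the slides).
const places = [
  { name: "Lotus Flower", lat_long: [52.5184, 13.387] },
  { name: "Far Away", lat_long: [48.1, 11.5] },       // hypothetical
  { name: "Next Door", lat_long: [52.529, 13.401] }   // hypothetical
];

// Orders documents by planar distance from the query point, nearest first.
function near(docs, point) {
  function dist(p) { return Math.hypot(p[0] - point[0], p[1] - point[1]); }
  return docs.slice().sort(function (a, b) { return dist(a.lat_long) - dist(b.lat_long); });
}

const ordered = near(places, [52.53, 13.4]).map(function (d) { return d.name; });
console.log(ordered);
```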

Slide 23

Slide 23 text

23 // creating your indexes:
> db.locations.ensureIndex({tags: 1})
> db.locations.ensureIndex({name: 1})
> db.locations.ensureIndex({lat_long: "2d"})
// finding places:
> db.locations.find({lat_long: {$near:[52.53, 13.4]}})
// with regular expressions:
> db.locations.find({name: /^Lin/})
// by tag:
> db.locations.find({tags: "dumplings"})

Slide 24

Slide 24 text

24 Atomic operators: $set, $unset, $inc, $push, $pushAll, $pull, $pullAll, $bit
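
A rough sketch of what a few of these operators do to a document, written as a hypothetical `applyUpdate` helper over a plain object. It handles only top-level fields, does no error checking, and does not model atomicity; the `tip_count` field is invented for illustration.

```javascript
// Hypothetical helper: applies a small subset of update operators in place.
function applyUpdate(doc, update) {
  for (const [field, value] of Object.entries(update.$set || {})) {
    doc[field] = value;                          // $set: assign the field
  }
  for (const [field, amount] of Object.entries(update.$inc || {})) {
    doc[field] = (doc[field] || 0) + amount;     // $inc: add to a number, starting from 0
  }
  for (const [field, item] of Object.entries(update.$push || {})) {
    if (!Array.isArray(doc[field])) doc[field] = [];
    doc[field].push(item);                       // $push: append to an array
  }
  for (const field of Object.keys(update.$unset || {})) {
    delete doc[field];                           // $unset: remove the field
  }
  return doc;
}

const location = { name: "Lotus Flower", zipcode: 94012 };
applyUpdate(location, {
  $set: { city: "Palo Alto" },
  $inc: { tip_count: 1 },
  $push: { tips: { user: "Sridhar", tip: "The sesame dumplings are awesome!" } },
  $unset: { zipcode: 1 }
});
console.log(location);
```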

Slide 25

Slide 25 text

25 // initial data load:
> db.locations.insert(location_3)
// adding a tip with update:
> db.locations.update( {name: "Lotus Flower"}, {$push: { tips: { user: "Sridhar", date: ISODate("2012-09-21T11:52:27.442Z"), tip: "The sesame dumplings are awesome!"} }})

Slide 26

Slide 26 text

26 > db.locations.findOne()
{
  name: "Lotus Flower",
  address: "123 University Ave",
  city: "Palo Alto",
  zipcode: 94012,
  tags: ["restaurant", "dumplings"],
  lat_long: [52.5184, 13.387],
  tips: [{ user: "Sridhar", date: ISODate("2012-09-21T11:52:27.442Z"), tip: "The sesame dumplings are awesome!" }]
}

Slide 27

Slide 27 text

27 "As a user I want to be able to 'checkin' to a location"
Checkins
  – User should be able to 'check in' to a location
  – Want to be able to generate statistics:
    • Recent checkins
    • Popular locations

Slide 28

Slide 28 text

28 > user_1 = { _id: "[email protected]", name: "Sridhar", twitter: "snanjund", checkins: [ {location: "Lotus Flower", ts: ISODate("2012-09-21T11:52:27.442Z")}, {location: "Sheraton", ts: ISODate("2012-09-22T07:15:00.442Z")} ] }
> db.users.ensureIndex({"checkins.location": 1})
> db.users.find({"checkins.location": "Lotus Flower"})

Slide 29

Slide 29 text

29 // find all users who've checked in here:
> db.users.find({"checkins.location":"Lotus Flower"})

Slide 30

Slide 30 text

30 // find all users who've checked in here:
> db.users.find({"checkins.location":"Lotus Flower"})
// find the last 10 checkins here?
> db.users.find({"checkins.location":"Lotus Flower"}).sort({"checkins.ts": -1}).limit(10)

Slide 31

Slide 31 text

31 // find all users who've checked in here:
> db.users.find({"checkins.location":"Lotus Flower"})
// find the last 10 checkins here: Warning!
> db.users.find({"checkins.location":"Lotus Flower"}).sort({"checkins.ts": -1}).limit(10)
Hard to query for the last 10: this returns up to 10 whole user documents, not the 10 most recent checkins.
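
A small in-memory sketch of why the embedded model makes "last 10 checkins" awkward: the query hands back whole user documents, so the individual checkins still have to be flattened, filtered, and sorted by the client. The user ids and numeric timestamps are hypothetical stand-ins.

```javascript
// In-memory stand-in for db.users with embedded checkins (hypothetical data).
const users = [
  { _id: "u1", checkins: [
    { location: "Lotus Flower", ts: 1 },
    { location: "Lotus Flower", ts: 5 },
    { location: "Sheraton", ts: 9 }
  ] },
  { _id: "u2", checkins: [ { location: "Lotus Flower", ts: 3 } ] }
];

// What the query matches: whole users, including their unrelated checkins.
const matchedUsers = users.filter(function (u) {
  return u.checkins.some(function (c) { return c.location === "Lotus Flower"; });
});

// Extra client-side work needed to get the most recent checkins at the location.
const recent = matchedUsers
  .flatMap(function (u) {
    return u.checkins.map(function (c) { return { user: u._id, location: c.location, ts: c.ts }; });
  })
  .filter(function (c) { return c.location === "Lotus Flower"; })
  .sort(function (a, b) { return b.ts - a.ts; })
  .slice(0, 10);

console.log(matchedUsers.length, recent);
```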

Slide 32

Slide 32 text

32 > user_2 = { _id: "[email protected]", name: "Sridhar", twitter: "snanjund" }
> checkin_1 = { location: location_id, user: user_id, ts: ISODate("2012-09-21T11:52:27.442Z") }
> db.checkins.ensureIndex({user: 1})
> db.checkins.find({user: user_id})

Slide 33

Slide 33 text

33 // find all users who've checked in here:
// findOne returns the document itself (find returns a cursor), so take its _id
> location_id = db.locations.findOne({"name":"Lotus Flower"})._id
// $in needs an array of values, so extract the user field from each checkin
> u_ids = db.checkins.find({location: location_id}, {_id: 0, user: 1}).map(function(c) { return c.user; })
> users = db.users.find({_id: {$in: u_ids}})
// find the last 10 checkins here:
> db.checkins.find({location: location_id}).sort({ts: -1}).limit(10)
// count how many checked in today:
> db.checkins.find({location: location_id, ts: {$gt: midnight}}).count()
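
The same three-step lookup can be sketched over in-memory arrays, assuming each checkin links to a location and a user by id. All ids, the second user, and the numeric timestamps are hypothetical sample data.

```javascript
// In-memory stand-ins for the three collections (hypothetical data).
const locations = [{ _id: "loc1", name: "Lotus Flower" }, { _id: "loc2", name: "Sheraton" }];
const users = [{ _id: "user1", name: "Sridhar" }, { _id: "user2", name: "Doug" }];
const checkins = [
  { location: "loc1", user: "user1", ts: 1 },
  { location: "loc2", user: "user2", ts: 2 },
  { location: "loc1", user: "user1", ts: 3 }
];

// Step 1: resolve the location name to its id.
const locationId = locations.find(function (l) { return l.name === "Lotus Flower"; })._id;

// Step 2: collect the distinct user ids that checked in there.
const here = checkins.filter(function (c) { return c.location === locationId; });
const userIds = Array.from(new Set(here.map(function (c) { return c.user; })));

// Step 3: load those users (the $in step).
const visitors = users.filter(function (u) { return userIds.includes(u._id); });

// Last 10 checkins: one sort over the checkins themselves, newest first.
const lastCheckins = here.slice().sort(function (a, b) { return b.ts - a.ts; }).slice(0, 10);

console.log(visitors, lastCheckins);
```

With checkins as their own documents, "last 10" becomes a plain sort and limit, at the cost of extra lookups to find the users.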

Slide 34

Slide 34 text

34 // Find most popular locations
> agg = db.checkins.aggregate( {$match: {ts: {$gt: now_minus_3_hrs}}}, {$group: {_id: "$location", numEntries: {$sum: 1}}} )
> agg.result
[{"_id": "Lotus Flower", "numEntries" : 17}]
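
A minimal in-memory sketch of the two pipeline stages: filter by time (the `$match` step), then count per location (the `$group` step with `$sum: 1`). The sample checkins, the numeric timestamps, and the `cutoff` value are hypothetical; the final sort is an addition, not part of the pipeline on the slide.

```javascript
// In-memory stand-in for db.checkins (hypothetical data, numeric timestamps).
const checkins = [
  { location: "Lotus Flower", ts: 100 },
  { location: "Sheraton", ts: 120 },
  { location: "Lotus Flower", ts: 130 },
  { location: "Lotus Flower", ts: 10 }   // older than the cutoff, so filtered out
];
const cutoff = 50;  // stands in for now_minus_3_hrs

// $match: keep recent checkins. $group: count one per checkin, keyed by location.
const counts = {};
for (const c of checkins) {
  if (c.ts > cutoff) {
    counts[c.location] = (counts[c.location] || 0) + 1;
  }
}

// Shape the output like the grouped result; sorting is an extra step added here.
const result = Object.entries(counts)
  .map(function ([location, n]) { return { _id: location, numEntries: n }; })
  .sort(function (a, b) { return b.numEntries - a.numEntries; });

console.log(result);
```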

Slide 35

Slide 35 text

35 // Find most popular locations
> map_func = function() { emit(this.location, 1); }
> reduce_func = function(key, values) { return Array.sum(values); }
> db.checkins.mapReduce(map_func, reduce_func, {query: {ts: {$gt: now_minus_3_hrs}}, out: "result"})
> db.result.findOne()
{"_id": "Lotus Flower", "value" : 17}
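
A rough in-memory sketch of the map/reduce flow: map emits a (key, value) pair per document, values are grouped by key, and reduce folds each group into one value. The `mapReduce` helper is hypothetical; unlike the shell, it passes `emit` to the map function as an argument, and it uses a standard `reduce` call in place of the shell's `Array.sum`.

```javascript
// Hypothetical in-memory mapReduce: group emitted values by key, then reduce each group.
function mapReduce(docs, mapFn, reduceFn) {
  const groups = new Map();
  function emit(key, value) {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  }
  for (const doc of docs) {
    mapFn.call(doc, emit);  // the document is available as `this`, as in the shell
  }
  return Array.from(groups, function ([key, values]) {
    // A key with a single emitted value is passed through without calling reduce.
    return { _id: key, value: values.length === 1 ? values[0] : reduceFn(key, values) };
  });
}

const checkins = [
  { location: "Lotus Flower" },
  { location: "Sheraton" },
  { location: "Lotus Flower" }
];  // hypothetical sample data

const mapFunc = function (emit) { emit(this.location, 1); };
const reduceFunc = function (key, values) {
  return values.reduce(function (a, b) { return a + b; }, 0);
};

const output = mapReduce(checkins, mapFunc, reduceFunc);
console.log(output);
```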

Slide 36

Slide 36 text

36 Deployment

Slide 37

Slide 37 text

37
• Single server
  - need a strong backup plan
[Diagram: a single primary (P) node]

Slide 38

Slide 38 text

38
• Single server
  - need a strong backup plan
• Replica sets
  - High availability
  - Automatic failover
[Diagram: primary (P) and secondary (S) nodes for each deployment option]

Slide 39

Slide 39 text

39
• Single server
  - need a strong backup plan
• Replica sets
  - High availability
  - Automatic failover
• Sharded
  - Horizontally scale
  - Auto balancing
[Diagram: primary (P) and secondary (S) nodes for each deployment option]

Slide 40

Slide 40 text

40
• User Data Management
• High Volume Data Feeds
• Content Management
• Operational Intelligence
• E-Commerce

Slide 41

Slide 41 text

41

Slide 42

Slide 42 text

42 download at mongodb.org
conferences, appearances, and meetups: http://www.10gen.com/events
Facebook: http://bit.ly/mongofb | Twitter: @mongodb | LinkedIn: http://linkd.in/joinmongo
We’re hiring! [email protected] @snanjund
© Copyright 2010 10gen Inc.