Intro to MongoDB

Sridhar Nanjundeswaran
February 24, 2012

Presented at Microsoft Tech Days Paris, Feb 2012


Transcript

  1. MongoDB – An Introduction
    February 8, 2012
    Sridhar Nanjundeswaran, Software Engineer, 10gen Inc.
    @snanjund [email protected]
    © Copyright 2010 10gen Inc.
  2. Overview
    1. What is MongoDB?
      • Why MongoDB?
      • Overview
      • Terminology
    2. CRUD
    3. Schema Design and Indexing
    4. MongoDB on Windows
  3. Why MongoDB?
    • Issues faced with traditional RDBMS
      • Costs dramatically increase as you scale up
      • Productivity goes down
    • Advantages of MongoDB
      • Application evolution
      • Sharding for write throughput
  4. MongoDB Highlights
    • Internal storage – Binary JSON (www.bsonspec.org)
    • Write-ahead journal
      • Data safety
      • Fast recovery
    • Replication for HA and data safety
      • Single master system
    • Sharding for scaling
    • Versioning scheme indicates production readiness (even-numbered minor releases such as 2.0 are stable; odd-numbered ones are development releases)
  5. MongoDB Storage Management
    • Memory-mapped files – O/S responsible for writing dirty pages
    • Files are allocated as needed
    • Writing to disk
      • Default flush interval 60 seconds
      • O/S may flush sooner
    • Journaling
      • Group commits to journal
      • Journal automatically applied on restart (fast recovery)
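
The journaling bullets above can be pictured with a toy model in plain JavaScript. This is only a sketch of the idea (batch several writes into one journal append, then replay the journal on restart); the class name `JournaledStore` and its fields are hypothetical and do not reflect how mongod lays out its journal or data files.

```javascript
// Toy model of a write-ahead journal with group commits (not MongoDB's actual format).
class JournaledStore {
  constructor(journal = []) {
    this.journal = journal; // stands in for the durable, on-disk journal
    this.pending = [];      // writes waiting for the next group commit
    this.data = {};         // stands in for the in-memory view of the data files
    // Recovery: replay every journaled operation on start.
    for (const op of journal) this.data[op.key] = op.value;
  }

  write(key, value) {
    this.pending.push({ key, value });
    this.data[key] = value;
  }

  groupCommit() {
    // One journal append covers the whole batch of pending writes.
    this.journal.push(...this.pending);
    this.pending = [];
  }
}

const store = new JournaledStore();
store.write("a", 1);
store.write("b", 2);
store.groupCommit();
store.write("c", 3); // never reaches the journal before the simulated crash

// Simulated crash: only the journal survives, and a new instance replays it.
const recovered = new JournaledStore(store.journal);
console.log(recovered.data);
```

In this model the write to "c" is lost because it was never part of a group commit, which is the trade-off a commit interval implies.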
  6. Deployment Models
    • Single instance
      • Not recommended for production
    • Replica Sets
    • Sharding (with replication)
  7. Terminology
    RDBMS            Mongo
    Table, View      Collection
    Row(s)           JSON Document
    Index            Index
    Join             Embedded Document
    Partition        Shard
    Partition Key    Shard Key
  8. JSON documents
    • JavaScript Object Notation
      • Simple types only: number, string, arrays, embedded objects
      • http://www.json.org/
    • BSON – Binary JSON
      • Adds additional types such as Date, BinData
      • http://bsonspec.org/
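
The "simple types only" point can be checked in plain JavaScript without MongoDB: a Date does not survive a JSON round trip as a date, which is the kind of gap BSON's extra types are meant to close. A minimal sketch; the variable names are illustrative only.

```javascript
// Plain JSON has no date type, so a Date is serialized as an ISO 8601 string.
const original = { author: "sridhar", date: new Date(Date.UTC(2012, 1, 8)) };
const roundTripped = JSON.parse(JSON.stringify(original));

const dateSurvived = roundTripped.date instanceof Date; // false
const dateType = typeof roundTripped.date;              // "string"
console.log(dateSurvived, dateType, roundTripped.date);
```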
  9. Simple JSON Document
    Blog Post Document
    p = { author: "sridhar",
          date: new Date(),
          title: "Using the C# driver with MongoDB",
          tags: ["NoSQL", "Mongo", "MongoDB"] }
  10. More complicated JSON document
    { _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
      author : "sridhar",
      date : "Mon Jul 11 2011 19:47:11 GMT-0700 (PDT)",
      text : "Using the C# driver with MongoDB",
      tags : [ "NoSQL", "Mongo", "MongoDB" ],
      comments : [
        { author : "Fred",
          date : "Mon Jul 11 2011 20:51:03 GMT-0700 (PDT)",
          text : "Interesting blog post" }
      ] }
  11. Create
    Blog Post Document
    > p = { author: "sridhar",
            date: new Date(),
            title: "Using the C# driver with MongoDB",
            tags: ["Mongo", "MongoDB"] }
    > db.posts.insert(p)
  12. Read
    > db.posts.find()
    { _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
      author : "sridhar",
      date : ISODate("2012-02-08T06:34:59.719Z"),
      title : "Using the C# driver with MongoDB",
      tags : [ "Mongo", "MongoDB" ] }
  13. Query Operators
    • Conditional Operators
      • $all, $exists, $mod, $ne, $in, $nin, $nor, $or, $size, $type
      • $lt, $lte, $gt, $gte
    // find posts with any tags
    > db.posts.find( {tags: {$exists: true}} )
    // find posts matching a regular expression
    > db.posts.find( {author: /^sri*/i} )
    // count posts by author
    > db.posts.find( {author: 'sridhar'} ).count()
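
The regular expression in the query above is worth a closer look. In the pattern `/^sri*/i` the `*` applies only to the final `i`, so it matches any author starting with "sr", not only those starting with "sri". The sketch below runs both patterns over a made-up list of author names in plain JavaScript; the list and the stricter `prefixPattern` are illustrative assumptions, not part of the talk.

```javascript
// The slide's pattern: "sr" at the start, then zero or more "i", case-insensitive.
const slidePattern = /^sri*/i;
// A stricter alternative that requires the literal prefix "sri".
const prefixPattern = /^sri/i;

// Made-up author names for illustration.
const authors = ["sridhar", "Sridhar", "sri", "sr", "srinivas", "fred"];

const matchedBySlide = authors.filter((a) => slidePattern.test(a));
const matchedByPrefix = authors.filter((a) => prefixPattern.test(a));

console.log(matchedBySlide);  // includes "sr"
console.log(matchedByPrefix); // excludes "sr"
```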
  14. Update
    > db.posts.update({author: "sridhar"}, {$set: {title: "Using the .Net driver with MongoDB"}})
    > db.posts.update({author: "sridhar"}, {$set: {talk: 1}})
    > db.posts.update({author: "sridhar"}, {$push: {tags: "NoSQL"}})
  15. Atomic Operations
    • $set, $unset, $inc, $push, $pushAll, $pull, $pullAll, $bit
    > comment = { author: "fred", date: new Date(), text: "Interesting blog post" }
    > db.posts.update( { _id: "..." }, { $push: {comments: comment} } );
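
To make the effect of a few of these operators concrete, here is an in-memory sketch in plain JavaScript of what `$set`, `$unset`, `$inc` and `$push` do to a single document. The helper `applyUpdate` is hypothetical: it only mimics the resulting document shape and says nothing about atomicity, error handling or the other operators on the server.

```javascript
// Hypothetical helper: applies a small subset of update operators to a copy of a document.
function applyUpdate(doc, update) {
  const result = JSON.parse(JSON.stringify(doc)); // deep copy; fine for plain JSON values
  for (const [field, value] of Object.entries(update.$set || {})) {
    result[field] = value;
  }
  for (const field of Object.keys(update.$unset || {})) {
    delete result[field];
  }
  for (const [field, amount] of Object.entries(update.$inc || {})) {
    result[field] = (result[field] || 0) + amount; // a missing field starts from 0
  }
  for (const [field, value] of Object.entries(update.$push || {})) {
    if (result[field] === undefined) result[field] = []; // a missing field becomes an array
    if (!Array.isArray(result[field])) throw new Error(field + " is not an array");
    result[field].push(value);
  }
  return result;
}

const post = { author: "sridhar", tags: ["Mongo", "MongoDB"] };
const updated = applyUpdate(post, {
  $set: { talk: 1 },
  $inc: { views: 1 },
  $push: { tags: "NoSQL" },
});
console.log(updated);
```

The original `post` is left untouched because the helper works on a copy; a real update modifies the stored document in place.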
  16. Single Table Inheritance - RDBMS
    • shapes table
    id  type    area  radius  d  length  width
    1   circle  3.14  1
    2   square  4             2
    3   rect    10               5       2
  17. Single Table Inheritance - MongoDB
    > db.shapes.find()
    { _id: "1", type: "circle", area: 3.14, radius: 1 }
    { _id: "2", type: "square", area: 4, d: 2 }
    { _id: "3", type: "rect", area: 10, length: 5, width: 2 }
    // find shapes where radius > 0
    > db.shapes.find({radius: {$gt: 0}})
    // create index
    > db.shapes.ensureIndex({radius: 1})
  18. One to Many
    • One to Many relationships.
      • degree of association between objects
      • containment life-cycle
  19. One to Many
    • Embedded Array / Array Keys
      • Single document
      • $slice operator to return subset of array
      • some queries harder, e.g. find latest comments across all documents
    blogs: { author : "Hergé",
             date : ISODate("2011-09-18T09:56:06.298Z"),
             comments : [
               { author : "Sridhar",
                 date : ISODate("2011-09-19T09:56:06.298Z"),
                 text : "great book",
                 replies: [ { author : "James", ...} ] }
             ] }
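
The trade-off in the embedded design can be sketched in plain JavaScript: taking the last few comments of one document is a simple slice, while "latest comments across all documents" needs every array flattened and sorted. The sample blogs, dates and authors below are made up for illustration, and the code runs on in-memory arrays rather than against a server.

```javascript
// Made-up sample data: comments embedded in each blog document.
const blogs = [
  {
    author: "Hergé",
    comments: [
      { author: "Sridhar", date: "2011-09-19T09:56:06.298Z", text: "great book" },
      { author: "James", date: "2011-09-20T08:00:00.000Z", text: "agreed" },
    ],
  },
  {
    author: "Fred",
    comments: [
      { author: "Renaud", date: "2011-09-21T10:00:00.000Z", text: "nice" },
    ],
  },
];

// Easy: the last comment of one document, similar in spirit to slicing the array.
const lastOfFirstBlog = blogs[0].comments.slice(-1);

// Harder: the two latest comments overall need a flatten plus a sort.
// ISO 8601 strings in the same time zone compare chronologically as plain strings.
const latestOverall = blogs
  .flatMap((blog) => blog.comments)
  .sort((a, b) => (a.date < b.date ? 1 : a.date > b.date ? -1 : 0))
  .slice(0, 2);

console.log(lastOfFirstBlog.map((c) => c.author));
console.log(latestOverall.map((c) => c.author));
```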
  20. One to Many
    • Normalized (2 collections)
      • most flexible
      • more queries
    blogs: { author : "Hergé",
             date : ISODate("2011-09-18T09:56:06.298Z"),
             comments : [ {comment : ObjectId("1")} ] }
    comments: { _id : "1",
                author : "Sridhar",
                date : ISODate("2011-09-19T09:56:06.298Z") }
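
The "more queries" cost of the normalized design can be illustrated with an in-memory stand-in for the two collections. In this sketch a `Map` plays the role of the comments collection and `findCommentById` is a hypothetical helper that counts lookups; the second comment is made-up data.

```javascript
// Stand-in for the comments collection, keyed by _id (sample data is illustrative).
const commentsById = new Map([
  ["1", { _id: "1", author: "Sridhar", text: "great book" }],
  ["2", { _id: "2", author: "James", text: "agreed" }],
]);

// The blog document stores only references to its comments.
const blog = { author: "Hergé", comments: [{ comment: "1" }, { comment: "2" }] };

let lookups = 0;
// Hypothetical helper standing in for a query against the comments collection.
function findCommentById(id) {
  lookups += 1;
  return commentsById.get(id);
}

// Reading the blog with its comments costs one extra lookup per reference.
const resolved = blog.comments.map((ref) => findCommentById(ref.comment));
console.log(resolved.map((c) => c.author), lookups);
```

Batching the ids into a single query would reduce the round trips, but the application still has to stitch the two result sets together itself.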
  21. Trees
    • Full Tree in Document
    { comments: [
        { author: "Sridhar", text: "...",
          replies: [ { author: "Renaud", text: "...", replies: [] } ] }
    ] }
    • Pros: Single Document, Performance, Intuitive
    • Cons: Hard to search, Partial Results, 16MB limit
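
"Hard to search" can be made concrete: with the whole tree in one document, finding every comment by a given author means walking the nested `replies` arrays in application code. A minimal recursive sketch in plain JavaScript follows; `findByAuthor` and the comment texts are illustrative assumptions.

```javascript
// Sample document with the full comment tree embedded (texts are made up).
const post = {
  comments: [
    {
      author: "Sridhar",
      text: "first",
      replies: [{ author: "Renaud", text: "reply", replies: [] }],
    },
  ],
};

// Depth-first walk over comments and their nested replies.
function findByAuthor(comments, author) {
  const found = [];
  for (const comment of comments) {
    if (comment.author === author) found.push(comment.text);
    found.push(...findByAuthor(comment.replies || [], author));
  }
  return found;
}

const renaudTexts = findByAuthor(post.comments, "Renaud");
console.log(renaudTexts);
```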
  22. Trees
    • Parent Links
      - Each node is stored as a document
      - Contains the id of the parent
    • Child Links
      - Each node contains the ids of the children
      - Can support graphs (multiple parents / children)
  23. Array of Ancestors
    • Store all Ancestors of a node
    { _id: "a" }
    { _id: "b", ancestors: [ "a" ], parent: "a" }
    { _id: "c", ancestors: [ "a", "b" ], parent: "b" }
    { _id: "d", ancestors: [ "a", "b" ], parent: "b" }
    { _id: "e", ancestors: [ "a" ], parent: "a" }
    { _id: "f", ancestors: [ "a", "e" ], parent: "e" }
    // find all descendants of b:
    > db.tree2.find({ancestors: "b"})
    // find all direct descendants of b:
    > db.tree2.find({parent: "b"})
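
The two queries above can be reproduced on an in-memory copy of the same six documents, which shows why the pattern works: "all descendants" is a membership test on the `ancestors` array, and "direct descendants" is an equality test on `parent`. This plain JavaScript sketch uses array filters as a stand-in for `find`; the helper names are hypothetical.

```javascript
// The same documents as on the slide, held in a plain array.
const tree = [
  { _id: "a" },
  { _id: "b", ancestors: ["a"], parent: "a" },
  { _id: "c", ancestors: ["a", "b"], parent: "b" },
  { _id: "d", ancestors: ["a", "b"], parent: "b" },
  { _id: "e", ancestors: ["a"], parent: "a" },
  { _id: "f", ancestors: ["a", "e"], parent: "e" },
];

// Stand-in for find({ancestors: id}): any node listing id among its ancestors.
function descendantsOf(id) {
  return tree.filter((n) => (n.ancestors || []).includes(id)).map((n) => n._id);
}

// Stand-in for find({parent: id}): only the immediate children.
function childrenOf(id) {
  return tree.filter((n) => n.parent === id).map((n) => n._id);
}

console.log(descendantsOf("b"), childrenOf("b"));
console.log(descendantsOf("a"), childrenOf("a"));
```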
  24. Indexing
    • Similar to RDBMS
    • db.posts.createIndex({author: 1})
      • 1 – is ascending
      • -1 – is descending
    • Different types of indexes
      • Unique, Compound, Sparse
      • Index embedded fields
      • Multikey
      • Geospatial
  25. What’s different
    • Multikey
      • Index an array field
      • Index entry created per array entry
        db.posts.ensureIndex({"comments.author": 1})
      • Useful for creating document key searches
    • Unique sparse indexes
      • Null and not present different
    • Covered indexes
      • Remember to exclude _id from projection
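
"Index entry created per array entry" can be sketched with a small in-memory index in plain JavaScript. The function `buildMultikeyIndex` is hypothetical and uses a `Map` where a real index uses a B-tree; it also skips documents that lack the field, which loosely resembles a sparse index. The sample posts are made up.

```javascript
// Made-up sample documents; the third has no tags field at all.
const posts = [
  { _id: 1, tags: ["NoSQL", "Mongo"] },
  { _id: 2, tags: ["Mongo", "MongoDB"] },
  { _id: 3 },
];

// Hypothetical helper: one index entry per array element, each pointing back at the document.
function buildMultikeyIndex(docs, field) {
  const index = new Map();
  for (const doc of docs) {
    const value = doc[field];
    if (value === undefined) continue; // documents without the field get no entry
    const keys = Array.isArray(value) ? value : [value];
    for (const key of keys) {
      if (!index.has(key)) index.set(key, []);
      index.get(key).push(doc._id);
    }
  }
  return index;
}

const tagIndex = buildMultikeyIndex(posts, "tags");
console.log(tagIndex.get("Mongo")); // ids of documents whose tags array contains "Mongo"
```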
  26. Geospatial
    • Geohash stored in B-Tree
    • 1st two values indexed
      • Can be array or subdocuments
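
A geohash turns a pair of coordinates into a single sortable key by interleaving the bits of the two quantized values, so that nearby points tend to share a key prefix and sit close together in a B-tree. The sketch below shows that general technique in plain JavaScript. The function names, the 4-bit precision and the -180 to 180 range are assumptions chosen for readability, not a description of the exact encoding mongod uses.

```javascript
// Quantize a coordinate in [min, max) to an unsigned integer with `bits` bits.
function quantize(value, min, max, bits) {
  const cells = 2 ** bits;
  const scaled = Math.floor(((value - min) / (max - min)) * cells);
  return Math.min(cells - 1, Math.max(0, scaled));
}

// Interleave the bits of x and y (x first, most significant bit first) into a bit string.
function geoHash(x, y, bits = 4, min = -180, max = 180) {
  const qx = quantize(x, min, max, bits);
  const qy = quantize(y, min, max, bits);
  let hash = "";
  for (let i = bits - 1; i >= 0; i--) {
    hash += (qx >> i) & 1;
    hash += (qy >> i) & 1;
  }
  return hash;
}

console.log(geoHash(0, 0));    // same cell as the next point
console.log(geoHash(10, 10));
console.log(geoHash(100, 10)); // shares only a shorter prefix with the two above
```

With more bits per coordinate the cells shrink, so a longer shared prefix implies closer points.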
  27. 4. MongoDB on Windows
    • Current production release is 2.0.2
    • Prebuilt binaries for 32 and 64 bit
      • Released at the same time as other OSes
      • http://www.mongodb.org/downloads
      • 64 bit recommended and to be used in production
    • Windows feature set on par with other OSes
  28. Windows Service
    • Built-in service support in mongod
      • --install (uses Local System Account)
      • --remove
    • Run as admin in Win7, Win2k8
    • Ensure service user has access to dbpath
  29. Monitoring on Windows
    • MMS – MongoDB Monitoring Service
      • Hosted monitoring by 10gen
    • Task Manager
      • Quick monitoring
      • Memory column shows resident memory, not mapped memory
    • Perfmon – monitor network, disk I/O etc.
    • Mongostat.exe
    • Wireshark can be used for packet level monitoring
      • http://wiki.wireshark.org/Mongo
  30. Client on Windows
    • Can connect to server running Windows or other OSes
    • Drivers work on Windows, e.g.
      • .Net (C#)
      • Python
      • Java
      • Perl
  31. .Net and MongoDB
    • Current version is 1.3.1
    • Full-featured .Net driver from 10gen
      • Written in C# - built in ODM
      • Used from C#, VB, PowerShell and F#
      • 2 dlls (Bson and Driver)
    • Download
      • Zip or msi from Language Center
      • NuGet package available
    • LINQ support coming soon
  32. MongoDB on Azure
    • Uses standard MongoDB binaries
    • Code is open sourced
    • Works with VS Express
    • Automatic replica set initiation on deploy
    • Survives reboots of instances
    • Integration with Azure diagnostics
    • Data persisted on blob storage
  33. The Solution
    • Source
      • https://github.com/mongodb/mongo-azure
    • Documentation
      • http://www.mongodb.org/display/DOCS/MongoDB+on+Azure
    • Issues
      • mongodb-user google group
      • #mongodb IRC
      • https://jira.mongodb.org/browse/AZURE
  34. The Future
    • Scale out using replica sets
    • MongoDB Monitoring
    • Backup and Recovery
    • Sharding
  35. MongoDB – What’s next
    • Here at Microsoft Tech Days, Level 2, Booth 96
    • MongoDB Paris – June 14th. Register now: www.10gen.com/events/mongodb-paris
    • Join the Paris MongoDB User Group: www.meetup.com/Paris-MongoDB-User-Group/
    • Sign up to the EMEA Newsletter: http://www.10gen.com/signup
  36. Drivers at mongodb.org
    • mongodb.org Supported: C, C#, C++, Erlang, Haskell, Java, Javascript, Perl, PHP, Python, Ruby
    • Community Supported: REST, ActionScript3, C# and .NET, Clojure, ColdFusion, Delphi, Erlang, Go: gomongo, Groovy, Haskell, Javascript, Lua, node.js, Objective C, PHP, PowerShell (blog post), Python, Ruby, Scala, Scheme (PLT), Smalltalk: Dolphin Smalltalk