
Intro to MongoDB

Sridhar Nanjundeswaran
February 24, 2012

Presented at Microsoft Tech Days Paris, Feb 2012

Transcript

  1. MongoDB: An Introduction
    February 8, 2012
    Sridhar Nanjundeswaran, Software Engineer, 10gen Inc.
    @snanjund, sridhar@10gen.com
    © Copyright 2010 10gen Inc.
  2. Overview
    1. What is MongoDB?
      • Why MongoDB?
      • Overview
      • Terminology
    2. CRUD
    3. Schema Design and Indexing
    4. MongoDB on Windows
  3. Why MongoDB?
    • Issues faced with traditional RDBMS
      - Costs dramatically increase as you scale up
      - Productivity goes down
    • Advantages of MongoDB
      - Application evolution
      - Sharding for write throughput
  4. MongoDB Highlights
    • Internal storage: Binary JSON (www.bsonspec.org)
    • Write-ahead journal
      - Data safety
      - Fast recovery
    • Replication for HA and data safety
      - Single master system
    • Sharding for scaling
    • Versioning scheme indicates production readiness (even-numbered minor releases such as 2.0 are production releases; odd-numbered ones such as 2.1 are development releases)
  5. MongoDB Storage Management
    • Memory-mapped files: O/S responsible for writing dirty blobs
    • Files are allocated as needed
    • Writing to disk
      - Default flush interval 60 seconds
      - O/S may flush sooner
    • Journaling
      - Group commits to journal
      - Journal automatically applied on restart (fast recovery)
  6. Deployment Models
    • Single instance
      - Not recommended for production
    • Replica Sets
    • Sharding (with replication)
  7. Terminology
    RDBMS            MongoDB
    Table, View      Collection
    Row(s)           JSON Document
    Index            Index
    Join             Embedded Document
    Partition        Shard
    Partition Key    Shard Key
  8. JSON documents
    • JavaScript Object Notation
      - Simple types only: number, string, arrays, embedded objects
      - http://www.json.org/
    • BSON: Binary JSON
      - Adds additional types such as Date, BinData
      - http://bsonspec.org/
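To make the JSON/BSON type difference concrete, here is a minimal sketch in plain JavaScript (no MongoDB needed). The `post` object and its values are made up for illustration; the point is only that JSON text has no date type, which is one reason BSON adds one.

```javascript
// A made-up document holding a Date value, similar to a blog post.
const post = {
  author: "sridhar",
  date: new Date("2012-02-08T06:34:59.719Z"),
  tags: ["NoSQL"],
};

// JSON has no date type, so the Date is written out as a string
// and comes back as a string after parsing.
const roundTripped = JSON.parse(JSON.stringify(post));

console.log(post.date instanceof Date);  // true
console.log(typeof roundTripped.date);   // "string"
console.log(roundTripped.date);          // "2012-02-08T06:34:59.719Z"
```

A BSON-aware driver would hand back a real date object instead of the string shown here.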
  9. Simple JSON Document
    Blog Post Document
        p = { author: "sridhar",
              date: new Date(),
              title: "Using the C# driver with MongoDB",
              tags: ["NoSQL", "Mongo", "MongoDB"] }
  10. More complicated JSON document
        { _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
          author : "sridhar",
          date : "Mon Jul 11 2011 19:47:11 GMT-0700 (PDT)",
          text : "Using the C# driver with MongoDB",
          tags : [ "NoSQL", "Mongo", "MongoDB" ],
          comments : [
            { author : "Fred",
              date : "Mon Jul 11 2011 20:51:03 GMT-0700 (PDT)",
              text : "Interesting blog post" }
          ] }
  11. Create
    Blog Post Document
        > p = { author: "sridhar",
                date: new Date(),
                title: "Using the C# driver with MongoDB",
                tags: ["Mongo", "MongoDB"] }
        > db.posts.insert(p)
  12. Read
        > db.posts.find()
        { _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
          author : "sridhar",
          date : ISODate("2012-02-08T06:34:59.719Z"),
          title : "Using the C# driver with MongoDB",
          tags : [ "Mongo", "MongoDB" ] }
  13. Query Operators
    • Conditional Operators
      - $all, $exists, $mod, $ne, $in, $nin, $nor, $or, $size, $type
      - $lt, $lte, $gt, $gte
        // find posts with any tags
        > db.posts.find( {tags: {$exists: true }} )
        // find posts matching a regular expression (author starts with "sri", case-insensitive)
        > db.posts.find( {author: /^sri/i } )
        // count posts by author
        > db.posts.find( {author: "sridhar"} ).count()
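What a few of these operators mean can be sketched in plain JavaScript. The `matches` and `matchesCondition` helpers below are hypothetical, written only for this illustration; they cover a small subset of operators on primitive values and say nothing about how the server actually evaluates queries. The `views` field is also made up.

```javascript
// Hypothetical helper: checks one field value against one condition.
// Only primitives, regular expressions and a few operators are handled.
function matchesCondition(value, condition) {
  if (condition instanceof RegExp) {
    return typeof value === "string" && condition.test(value);
  }
  if (condition !== null && typeof condition === "object" && !Array.isArray(condition)) {
    // Operator document such as { $gt: 0 } or { $exists: true }.
    return Object.entries(condition).every(([op, arg]) => {
      switch (op) {
        case "$exists": return (value !== undefined) === arg;
        case "$ne": return value !== arg;
        case "$gt": return value !== undefined && value > arg;
        case "$gte": return value !== undefined && value >= arg;
        case "$lt": return value !== undefined && value < arg;
        case "$lte": return value !== undefined && value <= arg;
        case "$in": return arg.includes(value);
        case "$nin": return !arg.includes(value);
        default: throw new Error("unsupported operator: " + op);
      }
    });
  }
  // Plain value: equality, or membership when the field holds an array.
  return Array.isArray(value) ? value.includes(condition) : value === condition;
}

// Hypothetical helper: a document matches when every field condition holds.
function matches(doc, query) {
  return Object.entries(query).every(([field, cond]) => matchesCondition(doc[field], cond));
}

const posts = [
  { author: "sridhar", tags: ["Mongo", "MongoDB"], views: 10 },
  { author: "fred", views: 3 },
];

console.log(posts.filter(d => matches(d, { tags: { $exists: true } })).length); // 1
console.log(posts.filter(d => matches(d, { author: /^sri/i })).length);         // 1
console.log(posts.filter(d => matches(d, { views: { $gte: 3 } })).length);      // 2
```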
  14. Update
        > db.posts.update({author: "sridhar"}, {$set: {title: "Using the .Net driver with MongoDB"}})
        > db.posts.update({author: "sridhar"}, {$set: {talk: 1}})
        > db.posts.update({author: "sridhar"}, {$push: {tags: "NoSQL"}})
  15. Atomic Operations
    • $set, $unset, $inc, $push, $pushAll, $pull, $pullAll, $bit
        > comment = { author: "fred",
                      date: new Date(),
                      text: "Interesting blog post" }
        > db.posts.update( { _id: "..." }, { $push: {comments: comment} } );
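The effect of a few of these modifiers on a single document can be sketched in plain JavaScript. `applyUpdate` is a hypothetical helper written for this illustration; it shows what each modifier does to the document but does not demonstrate atomicity, which the server provides per document. The `views` field is made up.

```javascript
// Hypothetical helper: applies a small subset of update modifiers in place.
function applyUpdate(doc, update) {
  for (const [op, fields] of Object.entries(update)) {
    for (const [field, arg] of Object.entries(fields)) {
      switch (op) {
        case "$set":
          doc[field] = arg;
          break;
        case "$unset":
          delete doc[field];
          break;
        case "$inc":
          // A missing field is treated as 0 before incrementing.
          doc[field] = (doc[field] || 0) + arg;
          break;
        case "$push":
          // A missing field becomes a new one-element array.
          doc[field] = [...(doc[field] || []), arg];
          break;
        case "$pull":
          doc[field] = (doc[field] || []).filter(v => v !== arg);
          break;
        default:
          throw new Error("unsupported modifier: " + op);
      }
    }
  }
  return doc;
}

const post = { author: "sridhar", tags: ["Mongo", "MongoDB"] };
applyUpdate(post, { $set: { title: "Using the .Net driver with MongoDB" } });
applyUpdate(post, { $push: { tags: "NoSQL" } });
applyUpdate(post, { $inc: { views: 1 } });

console.log(post.tags);  // [ 'Mongo', 'MongoDB', 'NoSQL' ]
console.log(post.views); // 1
```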
  16. Single Table Inheritance - RDBMS
    • shapes table
        id  type    area  radius  d  length  width
        1   circle  3.14  1
        2   square  4             2
        3   rect    10               5       2
  17. Single Table Inheritance - MongoDB
        > db.shapes.find()
        { _id: "1", type: "circle", area: 3.14, radius: 1 }
        { _id: "2", type: "square", area: 4, d: 2 }
        { _id: "3", type: "rect", area: 10, length: 5, width: 2 }
        // find shapes where radius > 0
        > db.shapes.find({radius: {$gt: 0}})
        // create index
        > db.shapes.ensureIndex({radius: 1})
  18. One to Many
    • One to Many relationships
      - degree of association between objects
      - containment life-cycle
  19. One to Many
    • Embedded Array / Array Keys
      - Single document
      - $slice operator to return subset of array
      - some queries harder, e.g. find latest comments across all documents
        blogs: { author : "Hergé",
                 date : ISODate("2011-09-18T09:56:06.298Z"),
                 comments : [
                   { author : "Sridhar",
                     date : ISODate("2011-09-19T09:56:06.298Z"),
                     text : "great book",
                     replies : [ { author : "James", ... } ] }
                 ] }
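Why "latest comments across all documents" is harder with embedded arrays can be sketched in plain JavaScript over in-memory data. The sample documents below (including the second post and the extra comments) are invented for illustration. Taking a subset of one document's array is a simple slice, while the cross-document question needs every array flattened and sorted.

```javascript
// Invented sample data: two posts, each embedding its own comments.
const blogs = [
  { author: "Hergé", comments: [
      { author: "Sridhar", date: new Date("2011-09-19T09:56:06.298Z"), text: "great book" },
      { author: "James", date: new Date("2011-09-21T08:00:00Z"), text: "agreed" },
  ] },
  { author: "Fred", comments: [
      { author: "Renaud", date: new Date("2011-09-20T12:00:00Z"), text: "nice" },
  ] },
];

// Subset of a single document's array: the last comment of the first post.
const lastComment = blogs[0].comments.slice(-1);

// Latest comments across all documents: flatten every array, then sort by date.
const latest = blogs
  .flatMap(blog => blog.comments.map(c => ({ post: blog.author, ...c })))
  .sort((a, b) => b.date - a.date)
  .slice(0, 2);

console.log(lastComment[0].author);     // James
console.log(latest.map(c => c.author)); // [ 'James', 'Renaud' ]
```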
  20. One to Many
    • Normalized (2 collections)
      - most flexible
      - more queries
        blogs: { author : "Hergé",
                 date : ISODate("2011-09-18T09:56:06.298Z"),
                 comments : [ { comment : ObjectId("1") } ] }
        comments : { _id : "1",
                     author : "Sridhar",
                     date : ISODate("2011-09-19T09:56:06.298Z") }
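The "more queries" cost of the normalized layout can be sketched as an application-side join in plain JavaScript. The arrays stand in for the two collections, and the ids and extra comments are invented; each lookup step corresponds to one round trip to the database.

```javascript
// Stand-ins for the two collections (invented sample data).
const blogs = [
  { _id: "b1", author: "Hergé", comments: [{ comment: "1" }, { comment: "2" }] },
];
const comments = [
  { _id: "1", author: "Sridhar", text: "great book" },
  { _id: "2", author: "James", text: "agreed" },
  { _id: "3", author: "Fred", text: "belongs to another post" },
];

// Query 1: fetch the blog document.
const blog = blogs.find(b => b.author === "Hergé");

// Query 2: fetch the referenced comments by id.
const ids = blog.comments.map(ref => ref.comment);
const blogComments = comments.filter(c => ids.includes(c._id));

console.log(blogComments.map(c => c.author)); // [ 'Sridhar', 'James' ]
```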
  21. Trees
    • Full Tree in Document
        { comments: [
            { author: "Sridhar", text: "...",
              replies: [
                { author: "Renaud", text: "...", replies: [] }
              ] }
          ] }
    • Pros: Single Document, Performance, Intuitive
    • Cons: Hard to search, Partial Results, 16MB limit
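The "hard to search" drawback can be illustrated in plain JavaScript: once the whole tree lives in one document, finding a node at an arbitrary depth means walking the nesting in application code. `findByAuthor` is a hypothetical helper written for this sketch.

```javascript
// The full comment tree stored in a single document, as on the slide.
const post = {
  comments: [
    { author: "Sridhar", text: "...", replies: [
        { author: "Renaud", text: "...", replies: [] },
    ] },
  ],
};

// Hypothetical helper: depth-first walk collecting nodes written by `author`.
function findByAuthor(nodes, author, depth = 0, found = []) {
  for (const node of nodes) {
    if (node.author === author) found.push({ depth, text: node.text });
    findByAuthor(node.replies || [], author, depth + 1, found);
  }
  return found;
}

console.log(findByAuthor(post.comments, "Renaud")); // [ { depth: 1, text: '...' } ]
```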
  22. Trees
    • Parent Links
      - Each node is stored as a document
      - Contains the id of the parent
    • Child Links
      - Each node contains the ids of the children
      - Can support graphs (multiple parents / children)
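A minimal sketch of the parent-link layout in plain JavaScript, with an in-memory array standing in for the collection. The node ids are invented, and `ancestorsOf` and `childrenOf` are hypothetical helpers. Finding children is a single lookup by parent id, while finding all ancestors takes one lookup per level.

```javascript
// Each node is its own document and stores only its parent's id.
const nodes = [
  { _id: "a", parent: null },
  { _id: "b", parent: "a" },
  { _id: "c", parent: "b" },
  { _id: "e", parent: "a" },
];
const byId = new Map(nodes.map(n => [n._id, n]));

// Hypothetical helper: walk up the parent links, one lookup per level.
function ancestorsOf(id) {
  const path = [];
  let node = byId.get(id);
  while (node && node.parent !== null) {
    path.push(node.parent);
    node = byId.get(node.parent);
  }
  return path;
}

// Hypothetical helper: direct children are found with one equality match.
function childrenOf(id) {
  return nodes.filter(n => n.parent === id).map(n => n._id);
}

console.log(ancestorsOf("c")); // [ 'b', 'a' ]
console.log(childrenOf("a"));  // [ 'b', 'e' ]
```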
  23. Array of Ancestors
    • Store all Ancestors of a node
        { _id: "a" }
        { _id: "b", ancestors: [ "a" ], parent: "a" }
        { _id: "c", ancestors: [ "a", "b" ], parent: "b" }
        { _id: "d", ancestors: [ "a", "b" ], parent: "b" }
        { _id: "e", ancestors: [ "a" ], parent: "a" }
        { _id: "f", ancestors: [ "a", "e" ], parent: "e" }
        // find all descendants of b:
        > db.tree2.find({ancestors: "b"})
        // find all direct descendants (children) of b:
        > db.tree2.find({parent: "b"})
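The ancestor-array pattern can be sketched in plain JavaScript, with an array standing in for the `tree2` collection. `insertNode` and `descendantsOf` are hypothetical helpers written for this illustration: the first shows how a node's `ancestors` array can be derived from its parent at insert time, and the second mirrors the descendant query.

```javascript
const tree = [];

// Hypothetical helper: a new node copies its parent's ancestors and appends the parent.
function insertNode(id, parentId) {
  const parent = tree.find(n => n._id === parentId);
  const node = parent
    ? { _id: id, ancestors: [...(parent.ancestors || []), parent._id], parent: parent._id }
    : { _id: id };
  tree.push(node);
  return node;
}

insertNode("a");
insertNode("b", "a");
insertNode("c", "b");
insertNode("d", "b");
insertNode("e", "a");
insertNode("f", "e");

// All descendants: any node whose ancestors array contains the id.
const descendantsOf = id =>
  tree.filter(n => (n.ancestors || []).includes(id)).map(n => n._id);

// Direct children: nodes whose parent field equals the id.
const childrenOf = id => tree.filter(n => n.parent === id).map(n => n._id);

console.log(descendantsOf("b")); // [ 'c', 'd' ]
console.log(descendantsOf("a")); // [ 'b', 'c', 'd', 'e', 'f' ]
console.log(childrenOf("a"));    // [ 'b', 'e' ]
```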
  24. Indexing
    • Similar to RDBMS
    • db.posts.createIndex({author:1})
      - 1 is ascending
      - -1 is descending
    • Different types of indexes
      - Unique, Compound, Sparse
      - Index embedded fields
      - Multikey
      - Geospatial
  25. What’s different
    • Multikey
      - Index an array field
      - Index entry created per array entry
        db.posts.ensureIndex({"comments.author":1})
      - Useful for creating document key searches
    • Unique sparse indexes
      - Null and not present are treated differently
    • Covered indexes
      - Remember to exclude _id from projection
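The "index entry per array entry" idea can be sketched in plain JavaScript. A `Map` stands in for the index here purely for illustration (the real index is a B-tree), and the sample posts are invented.

```javascript
// Invented sample posts; the third has no comments at all.
const posts = [
  { _id: 1, comments: [{ author: "fred" }, { author: "james" }] },
  { _id: 2, comments: [{ author: "fred" }] },
  { _id: 3 },
];

// Build the stand-in index on "comments.author":
// one entry per array element, mapping the key to the ids of matching documents.
const index = new Map();
for (const post of posts) {
  for (const comment of post.comments || []) {
    if (!index.has(comment.author)) index.set(comment.author, new Set());
    index.get(comment.author).add(post._id);
  }
}

// Document 1 appears under two keys because its array has two entries.
console.log([...index.get("fred")]);  // [ 1, 2 ]
console.log([...index.get("james")]); // [ 1 ]
```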
  26. Geospatial
    • Geohash stored in B-Tree
    • First two values indexed
      - Can be array or subdocuments
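The general geohash idea, interleaving the bits of two quantized coordinates so that nearby points share a key prefix, can be sketched in plain JavaScript. This is an illustration under stated assumptions and not the server's implementation: the 8-bit precision, the -180 to 180 range for both values, and the helper names `firstTwo`, `quantize`, `geohash` and `commonPrefixLength` are all choices made for this sketch.

```javascript
// Take the first two values of a location given as an array or a subdocument.
function firstTwo(loc) {
  const values = Array.isArray(loc) ? loc : Object.values(loc);
  return values.slice(0, 2);
}

// Map a coordinate in [min, max] to an integer cell number with `bits` bits.
function quantize(value, min, max, bits) {
  const cells = 2 ** bits;
  return Math.min(cells - 1, Math.floor(((value - min) / (max - min)) * cells));
}

// Interleave the bits of both cell numbers (x first) into a binary string.
function geohash(loc, bits = 8) {
  const [x, y] = firstTwo(loc);
  const qx = quantize(x, -180, 180, bits);
  const qy = quantize(y, -180, 180, bits);
  let hash = "";
  for (let i = bits - 1; i >= 0; i--) {
    hash += (qx >> i) & 1;
    hash += (qy >> i) & 1;
  }
  return hash;
}

function commonPrefixLength(a, b) {
  let n = 0;
  while (n < a.length && n < b.length && a[n] === b[n]) n++;
  return n;
}

const paris = geohash([2.35, 48.85]);                // array form
const versailles = geohash({ x: 2.13, y: 48.80 });   // subdocument form
const sydney = geohash([151.2, -33.87]);

// Nearby points share a longer key prefix than distant ones.
console.log(commonPrefixLength(paris, versailles) > commonPrefixLength(paris, sydney)); // true
```

Because keys with a shared prefix sort next to each other, an ordered structure such as a B-tree can keep nearby points close together.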
  27. 4. MongoDB on Windows
    • Current production release is 2.0.2
    • Prebuilt binaries for 32 and 64 bit
      - Released at the same time as for other OSes
      - http://www.mongodb.org/downloads
      - 64 bit recommended, and should be used in production
    • Windows feature set on par with other OSes
  28. Windows Service
    • Built-in service support in mongod
      - --install (uses Local System Account)
      - --remove
    • Run as admin in Win7, Win2k8
    • Ensure service user has access to dbpath
  29. Monitoring on Windows
    • MMS: MongoDB Monitoring Service
      - Hosted monitoring by 10gen
    • Task Manager
      - Quick monitoring
      - Memory shows resident, not mapped
    • Perfmon: monitor network, disk I/O etc.
    • Mongostat.exe
    • Wireshark can be used for packet-level monitoring
      - http://wiki.wireshark.org/Mongo
  30. Client on Windows
    • Can connect to a server running Windows or another OS
    • Drivers work on Windows, e.g.
      - .Net (C#)
      - Python
      - Java
      - Perl
  31. .Net and MongoDB
    • Current version is 1.3.1
    • Full-featured .Net driver from 10gen
    • Written in C#, with built-in ODM
    • Used from C#, VB, PowerShell and F#
    • 2 DLLs (Bson and Driver)
    • Download
      - Zip or msi from Language Center
      - NuGet package available
    • LINQ support coming soon
  32. MongoDB on Azure
    • Uses standard MongoDB binaries
    • Code is open sourced
    • Works with VS Express
    • Automatic replica set initiation on deploy
    • Survives reboots of instances
    • Integration with Azure diagnostics
    • Data persisted on blob storage
  33. The Solution
    • Source
      - https://github.com/mongodb/mongo-azure
    • Documentation
      - http://www.mongodb.org/display/DOCS/MongoDB+on+Azure
    • Issues
      - mongodb-user google group
      - #mongodb IRC
      - https://jira.mongodb.org/browse/AZURE
  34. The Future
    • Scale out using replica sets
    • MongoDB Monitoring
    • Backup and Recovery
    • Sharding
  35. MongoDB: What’s next
    • Here at Microsoft Tech Days: Level 2, Booth 96
    • MongoDB Paris, June 14th. Register now: www.10gen.com/events/mongodb-paris
    • Join the Paris MongoDB User Group: www.meetup.com/Paris-MongoDB-User-Group/
    • Sign up to the EMEA Newsletter: http://www.10gen.com/signup
  36. Drivers at mongodb.org
    • mongodb.org Supported: C, C#, C++, Erlang, Haskell, Java, Javascript, Perl, PHP, Python, Ruby
    • Community Supported: REST, ActionScript3, C# and .NET, Clojure, ColdFusion, Delphi, Erlang, Go: gomongo, Groovy, Haskell, Javascript, Lua, node.js, Objective C, PHP, PowerShell, Blog post, Python, Ruby, Scala, Scheme (PLT), Smalltalk: Dolphin Smalltalk