
Intro to MongoDB

Sridhar Nanjundeswaran
February 24, 2012

Presented at Microsoft Tech Days Paris, Feb 2012


Transcript

  1. MongoDB – An Introduction
    February 8, 2012
    Sridhar Nanjundeswaran
    Software Engineer, 10gen Inc.
    @snanjund
    © Copyright 2010 10gen Inc.


  2. Overview
    1. What is MongoDB?
    • Why MongoDB?
    • Overview
    • Terminology
    2. CRUD
    3. Schema Design and Indexing
    4. MongoDB on Windows


  3. NoSQL Really Means...
    non-relational
    next-generation
    operational datastores and databases
    ... focus on the “non-relational” bit.


  4. Horizontally Scalable Architectures
    no joins + no complex transactions
    New Data Models


  5. What is MongoDB?
    • High-performance, horizontally scalable document store
    • Open source


  6. Why MongoDB?
    • Issues faced with traditional RDBMS
      - Costs increase dramatically as you scale up
      - Productivity goes down
    • Advantages of MongoDB
      - Application evolution
      - Sharding for write throughput


  7. MongoDB Highlights
    • Internal storage – Binary JSON (www.bsonspec.org)
    • Write-ahead journal
      - Data safety
      - Fast recovery
    • Replication for HA and data safety
      - Single master system
    • Sharding for scaling
    • Versioning scheme indicates production readiness


  8. MongoDB Storage Management
    • Memory-mapped files – the OS is responsible for writing dirty pages
    • Files are allocated as needed
    • Writing to disk
      - Default flush interval is 60 seconds
      - OS may flush sooner
    • Journaling
      - Group commits to the journal
      - Journal is applied automatically on restart (fast recovery)


  9. Deployment Models
    • Single instance
    • Not recommended for production
    • Replica Sets
    • Sharding (with replication)


  10. Terminology
    RDBMS            MongoDB
    Table, View      Collection
    Row(s)           JSON Document
    Index            Index
    Join             Embedded Document
    Partition        Shard
    Partition Key    Shard Key


  11. JSON documents
    • JavaScript Object Notation
    • Simple types only: number, string, array, embedded object
    • http://www.json.org/
    • BSON – Binary JSON
    • Adds additional types such as Date, BinData
    • http://bsonspec.org/
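
The type gap between JSON and BSON can be seen in plain JavaScript, with no MongoDB involved. This is only a minimal sketch: the `post` object is illustrative, and the point is that JSON has no date type, so a Date comes back from a round trip as a string.

```javascript
// Plain JSON has no date type, so a Date survives a round trip only as a string.
const post = { author: "sridhar", date: new Date("2012-02-08T06:34:59.719Z") };

// Serialize to JSON text and parse it back.
const roundTripped = JSON.parse(JSON.stringify(post));

console.log(post.date instanceof Date); // true
console.log(typeof roundTripped.date);  // "string"
console.log(roundTripped.date);         // 2012-02-08T06:34:59.719Z
```

BSON's extra types such as Date and BinData exist to avoid this kind of loss when documents are stored.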


  12. Simple JSON Document
    Blog Post Document
    p = { author: "sridhar",
          date: new Date(),
          title: "Using the C# driver with MongoDB",
          tags: ["NoSQL", "Mongo", "MongoDB"]}


  13. More complicated JSON document
    { _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
      author : "sridhar",
      date : "Mon Jul 11 2011 19:47:11 GMT-0700 (PDT)",
      text : "Using the C# driver with MongoDB",
      tags : [ "NoSQL", "Mongo", "MongoDB" ],
      comments : [
        {
          author : "Fred",
          date : "Mon Jul 11 2011 20:51:03 GMT-0700 (PDT)",
          text : "Interesting blog post"
        }
      ]}


  14. Try it at try.mongodb.org


  15. Create
    Blog Post Document
    > p = { author: "sridhar",
            date: new Date(),
            title: "Using the C# driver with MongoDB",
            tags: ["Mongo", "MongoDB"]}
    > db.posts.insert(p)


  16. Read
    > db.posts.find()
    { _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
      author : "sridhar",
      date : ISODate("2012-02-08T06:34:59.719Z"),
      title: "Using the C# driver with MongoDB",
      tags: ["Mongo", "MongoDB"]}


  17. Query Operators
    • Conditional Operators
    • $all, $exists, $mod, $ne, $in, $nin, $nor, $or, $size, $type
    • $lt, $lte, $gt, $gte
    // find posts with any tags
    > db.posts.find( {tags: {$exists: true }} )
    // find posts matching a regular expression
    > db.posts.find( {author: /^sri*/i } )
    // count posts by author
    > db.posts.find( {author: 'sridhar'} ).count()
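
The three queries above can be mimicked on an in-memory array to see what each one matches. This is a sketch in plain JavaScript, not the MongoDB shell: the sample documents are made up, and `Array.prototype.filter` stands in for `find`. It also shows that the slide's regular expression `/^sri*/i` means "sr followed by zero or more i", so it is looser than a plain "starts with sri" test.

```javascript
// In-memory stand-in for the posts collection (sample data is made up).
const posts = [
  { author: "sridhar", tags: ["Mongo", "MongoDB"] },
  { author: "Sridhar" },
  { author: "sravan", tags: [] },
  { author: "fred", tags: ["NoSQL"] },
];

// {tags: {$exists: true}} matches any document that has the field, even an empty array.
const withTags = posts.filter((p) => "tags" in p);

// /^sri*/i: "sr" followed by zero or more "i", case-insensitive.
const looseMatch = posts.filter((p) => /^sri*/i.test(p.author));

// /^sri/i: authors that really start with "sri".
const strictMatch = posts.filter((p) => /^sri/i.test(p.author));

// find({author: "sridhar"}).count() is an exact, case-sensitive match.
const exactCount = posts.filter((p) => p.author === "sridhar").length;

console.log(withTags.length, looseMatch.length, strictMatch.length, exactCount); // 3 3 2 1
```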


  18. Update
    > db.posts.update({author:"sridhar"},
        {$set:{title:"Using the .Net driver with MongoDB"}})
    > db.posts.update({author:"sridhar"},
        {$set:{talk:1}})
    > db.posts.update({author:"sridhar"},
        {$push:{tags:"NoSQL"}})


  19. Atomic Operations
    • $set, $unset, $inc, $push, $pushAll, $pull, $pullAll, $bit
    > comment = { author: "fred",
                  date: new Date(),
                  text: "Interesting blog post"}
    > db.posts.update( { _id: "..." },
        { $push: {comments: comment} } );
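
To make the effect of the modifiers concrete, here is a rough emulation of `$set`, `$inc` and `$push` on a single in-memory document. It is a sketch only: `applyUpdate` is a hypothetical helper written for this example, not a MongoDB API, the `views` field is invented, and the server-side atomicity that gives these operators their name is not modelled.

```javascript
// Hypothetical helper: applies a small subset of update modifiers to one document.
function applyUpdate(doc, update) {
  for (const [field, value] of Object.entries(update.$set || {})) {
    doc[field] = value; // $set creates or overwrites the field
  }
  for (const [field, value] of Object.entries(update.$inc || {})) {
    doc[field] = (doc[field] || 0) + value; // $inc treats a missing field as 0
  }
  for (const [field, value] of Object.entries(update.$push || {})) {
    if (doc[field] === undefined) doc[field] = []; // $push creates a missing array
    if (!Array.isArray(doc[field])) throw new Error(field + " is not an array");
    doc[field].push(value);
  }
  return doc;
}

const post = { author: "sridhar", tags: ["Mongo", "MongoDB"] };
applyUpdate(post, { $set: { talk: 1 } });
applyUpdate(post, { $push: { tags: "NoSQL" } });
applyUpdate(post, { $push: { comments: { author: "fred", text: "Interesting blog post" } } });
applyUpdate(post, { $inc: { views: 1 } });

console.log(post.tags);            // [ 'Mongo', 'MongoDB', 'NoSQL' ]
console.log(post.comments.length); // 1
```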


  20. Delete
    // remove all posts by one author
    > db.posts.remove({author:"sridhar"})
    // remove every document in the collection
    > db.posts.remove({})


  21. 3. Schema Design and Indexing


  22. Inheritance


  23. Single Table Inheritance - RDBMS
    • shapes table
    id  type    area  radius  d  length  width
    1   circle  3.14  1
    2   square  4             2
    3   rect    10               5       2


  24. Single Table Inheritance - MongoDB
    > db.shapes.find()
    { _id: "1", type: "circle",area: 3.14, radius: 1}
    { _id: "2", type: "square",area: 4, d: 2}
    { _id: "3", type: "rect", area: 10, length: 5, width: 2}
    // find shapes where radius > 0
    > db.shapes.find({radius: {$gt: 0}})
    // create index
    > db.shapes.ensureIndex({radius: 1})
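
The same three shape documents can be queried in plain JavaScript to show why no empty columns are needed: a document without a `radius` field simply does not match. This is an in-memory sketch, with `filter` standing in for `find`, not shell code.

```javascript
// The three documents from the slide, held in a plain array.
const shapes = [
  { _id: "1", type: "circle", area: 3.14, radius: 1 },
  { _id: "2", type: "square", area: 4, d: 2 },
  { _id: "3", type: "rect", area: 10, length: 5, width: 2 },
];

// {radius: {$gt: 0}}: documents with no radius field are skipped, not treated as 0 or null.
const withRadius = shapes.filter((s) => typeof s.radius === "number" && s.radius > 0);

// Each type carries only its own fields; there is no shared set of empty columns.
const fieldsByType = Object.fromEntries(shapes.map((s) => [s.type, Object.keys(s)]));

console.log(withRadius.map((s) => s._id)); // [ '1' ]
console.log(fieldsByType.square);          // [ '_id', 'type', 'area', 'd' ]
```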


  25. One to Many
    • One to Many relationships.
    • degree of association between objects
    • containment life-cycle


  26. One to Many
    • Embedded Array / Array Keys
      - Single document
      - $slice operator to return a subset of the array
      - some queries are harder,
        e.g. find the latest comments across all documents
    blogs: {
      author : "Hergé",
      date : ISODate("2011-09-18T09:56:06.298Z"),
      comments : [
        {
          author : "Sridhar",
          date : ISODate("2011-09-19T09:56:06.298Z"),
          text : "great book",
          replies: [ { author : "James", ...} ]
        }
      ]}
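
Both points on this slide can be illustrated in plain JavaScript. The sketch below uses made-up blog documents: taking the first comment of each post resembles what a `$slice` projection returns, while "latest comments across all documents" needs every embedded array to be flattened and sorted first, which is why that query is harder with embedding.

```javascript
// Made-up blog documents with embedded comments.
const blogs = [
  {
    author: "Hergé",
    comments: [
      { author: "Sridhar", date: new Date("2011-09-19T09:56:06.298Z"), text: "great book" },
      { author: "James", date: new Date("2011-09-21T08:00:00.000Z"), text: "agreed" },
    ],
  },
  {
    author: "Fred",
    comments: [
      { author: "Renaud", date: new Date("2011-09-20T12:00:00.000Z"), text: "nice" },
    ],
  },
];

// Subset of the embedded array per document, similar in spirit to a $slice projection.
const firstComments = blogs.map((b) => ({ author: b.author, comments: b.comments.slice(0, 1) }));

// Latest comments across all posts: flatten every array, then sort newest first.
const latest = blogs
  .flatMap((b) => b.comments.map((c) => ({ post: b.author, ...c })))
  .sort((a, b) => b.date - a.date)
  .slice(0, 2);

console.log(firstComments[0].comments.length); // 1
console.log(latest.map((c) => c.author));      // [ 'James', 'Renaud' ]
```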


  27. One to Many
    • Normalized (2 collections)
      - most flexible
      - more queries
    blogs: {
      author : "Hergé",
      date : ISODate("2011-09-18T09:56:06.298Z"),
      comments : [
        {comment : ObjectId("1")}
      ]}
    comments : { _id : "1",
      author : "Sridhar",
      date : ISODate("2011-09-19T09:56:06.298Z")}
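
"More queries" means the application resolves the references itself. The following is a plain JavaScript sketch with an in-memory `Map` standing in for the comments collection; the string ids and the dangling reference "2" are made up for illustration. There are no joins, so a reference that points at nothing is only noticed when the lookup comes back empty.

```javascript
// Stand-in for the comments collection, keyed by _id.
const comments = new Map([
  ["1", { _id: "1", author: "Sridhar", date: new Date("2011-09-19T09:56:06.298Z") }],
]);

// First "query": the blog document, which stores only references.
const blog = { author: "Hergé", comments: [{ comment: "1" }, { comment: "2" }] };

// Second "query": look up each referenced comment and drop references that resolve to nothing.
const resolved = blog.comments
  .map((ref) => comments.get(ref.comment))
  .filter((c) => c !== undefined);

console.log(resolved.map((c) => c.author)); // [ 'Sridhar' ]
```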


  28. Trees
    • Full Tree in Document
    { comments: [
        { author: "Sridhar", text: "...",
          replies: [
            {author: "Renaud", text: "...",
             replies: []}
          ]}
      ]
    }
    • Pros: Single Document, Performance, Intuitive
    • Cons: Hard to search, Partial Results, 16MB limit
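
"Hard to search" can be made concrete: finding a comment by author inside a nested tree means walking the whole structure in application code. This is a minimal sketch in plain JavaScript; `findByAuthor` is a hypothetical helper, and it returns the index path to each match.

```javascript
// The full tree from the slide, held as a plain object.
const post = {
  comments: [
    {
      author: "Sridhar",
      text: "...",
      replies: [{ author: "Renaud", text: "...", replies: [] }],
    },
  ],
};

// Hypothetical helper: depth-first walk that collects the index path of every match.
function findByAuthor(nodes, author, path = []) {
  const hits = [];
  nodes.forEach((node, i) => {
    const here = [...path, i];
    if (node.author === author) hits.push(here);
    hits.push(...findByAuthor(node.replies || [], author, here));
  });
  return hits;
}

console.log(findByAuthor(post.comments, "Renaud")); // [ [ 0, 0 ] ]
```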


  29. Trees
    • Parent Links
      - Each node is stored as a document
      - Contains the id of the parent
    • Child Links
      - Each node contains the ids of the children
      - Can support graphs (multiple parents / children)
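
With parent links, finding children is a single lookup by `parent`, but finding all ancestors takes one lookup per level. The sketch below uses an in-memory `Map` in place of a collection; the node ids and the `ancestorsOf` helper are made up for illustration.

```javascript
// One document per node, each holding the id of its parent (null for the root).
const nodes = new Map([
  ["a", { _id: "a", parent: null }],
  ["b", { _id: "b", parent: "a" }],
  ["c", { _id: "c", parent: "b" }],
]);

// Hypothetical helper: follow parent links upward, one lookup per level.
function ancestorsOf(id) {
  const result = [];
  let current = nodes.get(id);
  while (current && current.parent !== null) {
    result.push(current.parent);
    current = nodes.get(current.parent);
  }
  return result;
}

// Children need only one pass over the parent field.
const childrenOfA = [...nodes.values()].filter((n) => n.parent === "a").map((n) => n._id);

console.log(ancestorsOf("c")); // [ 'b', 'a' ]
console.log(childrenOfA);      // [ 'b' ]
```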


  30. Array of Ancestors
    • Store all Ancestors of a node
    { _id: "a" }
    { _id: "b", ancestors: [ "a" ], parent: "a" }
    { _id: "c", ancestors: [ "a", "b" ], parent: "b" }
    { _id: "d", ancestors: [ "a", "b" ], parent: "b" }
    { _id: "e", ancestors: [ "a" ], parent: "a" }
    { _id: "f", ancestors: [ "a", "e" ], parent: "e" }
    // find all descendants of b:
    > db.tree2.find({ancestors: 'b'})
    // find all direct descendants of b:
    > db.tree2.find({parent: 'b'})
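
The two queries can be checked against the six documents above with plain array filtering. This is an in-memory sketch, not shell code: a match on an array field behaves like "the array contains this value", which `includes` reproduces here.

```javascript
// The tree documents from the slide.
const tree = [
  { _id: "a" },
  { _id: "b", ancestors: ["a"], parent: "a" },
  { _id: "c", ancestors: ["a", "b"], parent: "b" },
  { _id: "d", ancestors: ["a", "b"], parent: "b" },
  { _id: "e", ancestors: ["a"], parent: "a" },
  { _id: "f", ancestors: ["a", "e"], parent: "e" },
];

// {ancestors: "x"}: every node whose ancestors array contains x.
const descendantsOf = (id) =>
  tree.filter((n) => (n.ancestors || []).includes(id)).map((n) => n._id);

// {parent: "x"}: direct children only.
const childrenOf = (id) => tree.filter((n) => n.parent === id).map((n) => n._id);

console.log(descendantsOf("b")); // [ 'c', 'd' ]
console.log(descendantsOf("a")); // [ 'b', 'c', 'd', 'e', 'f' ]
console.log(childrenOf("a"));    // [ 'b', 'e' ]
```

For `b` the two result sets happen to coincide; for `a` they differ, which shows why both fields are stored.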


  31. Indexing
    • Similar to RDBMS
    • db.posts.createIndex({author:1})
    • 1 – is ascending
    • -1 – is descending
    • Different types of indexes
    • Unique, Compound, Sparse
    • Index embedded fields
    • Multikey
    • Geospatial
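
The meaning of 1 and -1 is easiest to see with a compound key. The sketch below builds a sort comparator from a key specification such as `{author: 1, date: -1}`; it only mimics the ordering such an index would impose. `comparatorFor` is a hypothetical helper and the sample posts, with small integers standing in for dates, are made up.

```javascript
// Hypothetical helper: turn an index-style key spec into an Array.sort comparator.
function comparatorFor(keySpec) {
  const keys = Object.entries(keySpec); // keeps the order in which fields were listed
  return (a, b) => {
    for (const [field, direction] of keys) {
      if (a[field] < b[field]) return -direction;
      if (a[field] > b[field]) return direction;
    }
    return 0; // equal on every indexed field
  };
}

const posts = [
  { author: "sridhar", date: 2 },
  { author: "fred", date: 1 },
  { author: "sridhar", date: 5 },
];

// author ascending, then date descending within each author.
const ordered = [...posts].sort(comparatorFor({ author: 1, date: -1 }));

console.log(ordered.map((p) => p.author + "/" + p.date)); // [ 'fred/1', 'sridhar/5', 'sridhar/2' ]
```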


  32. What’s different
    • Multikey
      - Index an array field
      - Index entry created per array entry
        db.posts.ensureIndex({"comments.author":1})
      - Useful for creating document key searches
    • Unique sparse indexes
      - Null and missing fields are treated differently
    • Covered indexes
      - Remember to exclude _id from projection
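
"Index entry created per array entry" can be pictured as a map from each array value to the documents that contain it. This is a toy model in plain JavaScript, not how MongoDB stores its B-tree indexes; the sample posts are made up. Like a sparse index, it adds no entry for a document that lacks the field.

```javascript
// Made-up posts; post 3 has no comments field at all.
const posts = [
  { _id: 1, comments: [{ author: "Fred" }, { author: "James" }] },
  { _id: 2, comments: [{ author: "Fred" }] },
  { _id: 3 },
];

// Toy multikey index on "comments.author": one entry per array element.
const index = new Map();
for (const post of posts) {
  for (const comment of post.comments || []) {
    if (!index.has(comment.author)) index.set(comment.author, new Set());
    index.get(comment.author).add(post._id);
  }
}

console.log([...index.get("Fred")]);  // [ 1, 2 ]
console.log([...index.get("James")]); // [ 1 ]
console.log(index.size);              // 2
```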


  33. Geospatial
    • Geohash stored in B-Tree
    • 1st two values indexed
    • Can be array or subdocuments
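
The general idea behind a geohash is to turn two coordinates into one sortable key by interleaving their bits, so that nearby points tend to share a key prefix and sit close together in an ordered index. The sketch below shows that generic idea only. It is not MongoDB's exact encoding; the 8-bit precision, the helper names and the sample coordinates are assumptions chosen for illustration.

```javascript
// Map a value in [min, max] onto an integer cell index using `bits` bits.
function quantize(value, min, max, bits) {
  const cells = 2 ** bits;
  return Math.min(cells - 1, Math.floor(((value - min) / (max - min)) * cells));
}

// Interleave the bits of the two cell indexes into one bit string (longitude first).
function geohashBits(lon, lat, bits = 8) {
  const x = quantize(lon, -180, 180, bits);
  const y = quantize(lat, -90, 90, bits);
  let out = "";
  for (let i = bits - 1; i >= 0; i--) {
    out += (x >> i) & 1;
    out += (y >> i) & 1;
  }
  return out;
}

// Length of the common prefix of two bit strings.
function sharedPrefix(a, b) {
  let i = 0;
  while (i < a.length && a[i] === b[i]) i++;
  return i;
}

const paris = geohashBits(2.35, 48.85);
const versailles = geohashBits(2.13, 48.8);
const newYork = geohashBits(-74.0, 40.7);

console.log(sharedPrefix(paris, versailles)); // 16: same cell at this precision
console.log(sharedPrefix(paris, newYork));    // 0: differ from the first bit
```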


  34. 4. MongoDB on Windows
    • Current production release is 2.0.2
    • Prebuilt binaries for 32-bit and 64-bit
    • Released at the same time as for other OSes
    • http://www.mongodb.org/downloads
    • 64-bit is recommended and should be used in production
    • Windows feature set on par with other OSes


  35. Windows Service
    • Built-in service support in mongod
    • --install (uses Local System Account)
    • --remove
    • Run as admin on Windows 7 and Windows Server 2008
    • Ensure service user has access to dbpath


  36. Monitoring on Windows
    • MMS – MongoDB Monitoring Service
      - Hosted monitoring by 10gen
    • Task Manager
      - Quick monitoring
      - Memory column shows resident memory, not mapped memory
    • Perfmon – monitor network, disk I/O etc.
    • mongostat.exe
    • Wireshark can be used for packet-level monitoring
      - http://wiki.wireshark.org/Mongo


  37. Client on Windows
    • Can connect to a server running Windows or another OS
    • Drivers work on Windows (e.g.)
    • .NET (C#)
    • Python
    • Java
    • Perl


  38. .Net and MongoDB
    • Current driver version is 1.3.1
    • Full-featured .NET driver from 10gen
    • Written in C#, with a built-in ODM
    • Usable from C#, VB, PowerShell and F#
    • 2 DLLs (Bson and Driver)
    • Download
      - Zip or MSI from Language Center
      - NuGet package available
    • LINQ support coming soon


  39. MongoDB on Azure
    • Uses standard MongoDB binaries
    • Code is open sourced
    • Works with VS Express
    • Automatic replica set initiation on deploy
    • Survives reboots of instances
    • Integration with Azure diagnostics
    • Data persisted on blob storage


  40. The Solution
    • Source
      - https://github.com/mongodb/mongo-azure
    • Documentation
      - http://www.mongodb.org/display/DOCS/MongoDB+on+Azure
    • Issues
      - mongodb-user Google Group
      - #mongodb IRC
      - https://jira.mongodb.org/browse/AZURE


  41. The Future
    • Scale out using replica sets
    • MongoDB Monitoring
    • Backup and Recovery
    • Sharding


  42. MongoDB – What’s next
    • Here at Microsoft Tech Days
    Level 2, Booth 96
    • MongoDB Paris – June 14th
    Register now: www.10gen.com/events/mongodb-paris
    • Join the Paris MongoDB User Group
    www.meetup.com/Paris-MongoDB-User-Group/
    • Sign up to the EMEA Newsletter
    http://www.10gen.com/signup


  43. drivers at mongodb.org
    mongodb.org Supported
    • C, C#, C++, Erlang, Haskell, Java, Javascript, Perl, PHP, Python, Ruby
    Community Supported
    • REST, ActionScript3, C# and .NET, Clojure, ColdFusion, Delphi, Erlang,
      Go: gomongo, Groovy, Haskell, Javascript, Lua, node.js, Objective C, PHP,
      PowerShell (blog post), Python, Ruby, Scala, Scheme (PLT),
      Smalltalk: Dolphin Smalltalk


  44. download at mongodb.org
    • Conferences, appearances, and meetups: http://www.10gen.com/events
    • Facebook: http://bit.ly/mongofb
    • Twitter: @mongodb
    • LinkedIn: http://linkd.in/joinmongo
