Intro to MongoDB

Slides for presentation at NoSQLEast in Cambridge, UK (http://www.meetup.com/NoSQL-East/events/159368972/).

James Tan

June 09, 2014

Transcript

  1. MongoDB is a ___________ database
    •  Document
    •  Open source
    •  High performance
    •  Horizontally scalable
    •  Full featured
  2. Document Database
    •  Not for .PDF & .DOC files
    •  A document is essentially an associative array
    •  Document = JSON object
    •  Document = PHP Array
    •  Document = Python Dict
    •  Document = Ruby Hash
    •  etc.
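The "Document = Python Dict = JSON object" equivalence can be sketched with the standard library alone. The field names below are illustrative assumptions borrowed from the sample person document later in the deck; nothing here talks to a MongoDB server.

```python
import json

# Hypothetical example document: an associative array with a nested array
# of sub-documents, expressed as a plain Python dict.
person = {
    "first_name": "Paul",
    "city": "London",
    "cars": [{"model": "Bentley", "year": 1973}],
}

# The same structure serialized as a JSON object and parsed back.
as_json = json.dumps(person, sort_keys=True)
round_tripped = json.loads(as_json)

print(as_json)
print(round_tripped == person)
```

The round trip preserves the nested structure, which is the sense in which a document maps directly onto the native associative array of each language.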
  3. Open Source
    •  MongoDB is an open source project
    •  On GitHub
    •  Licensed under the AGPL
    •  Started & sponsored by MongoDB Inc (formerly 10gen)
    •  Commercial licenses available
    •  Contributions welcome
  4. High Performance
    •  Written in C++
    •  Extensive use of memory-mapped files, i.e. read-through write-through memory caching
    •  Runs nearly everywhere
    •  Data serialized as BSON (fast parsing)
    •  Full support for primary & secondary indexes
    •  Document model = less work
  5. Horizontally Scalable (scale out)
    [Diagram: Shard 1, Shard 2, Shard 3, … Shard N]
    Auto-Sharding
    •  Increase capacity as you go
    •  Commodity and cloud architectures
    •  Improved operational simplicity and cost visibility
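The slide does not say how documents are assigned to shards. As a toy illustration only, the sketch below routes a document by comparing its shard key against sorted split points, which is the general idea behind range partitioning. The split points, shard names, and `route` helper are all hypothetical, and this is not MongoDB's actual implementation.

```python
from bisect import bisect_right

# Hypothetical split points on a string shard key. Keys below "g" go to the
# first shard, keys from "g" up to (not including) "n" to the second, etc.
SPLIT_POINTS = ["g", "n", "t"]
SHARDS = ["shard1", "shard2", "shard3", "shard4"]


def route(shard_key: str) -> str:
    """Return the shard whose key range contains shard_key."""
    return SHARDS[bisect_right(SPLIT_POINTS, shard_key)]


for key in ["alice", "paul", "zoe"]:
    print(key, "->", route(key))
```

Adding capacity in this picture means adding a split point and a shard, then moving only the documents whose keys fall in the new range.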
  6. Full Featured
    Rich Queries
    •  Find Paul’s cars
    •  Find everybody in London with a car built between 1970 and 1980
    Geospatial
    •  Find all of the car owners within 5km of Trafalgar Sq.
    Text Search
    •  Find all the cars described as having leather seats
    Aggregation
    •  Calculate the average value of Paul’s car collection
    Map Reduce
    •  What is the ownership pattern of colors by geography over time? (Is purple trending up in China?)
    Sample document:
    {
      first_name: 'Paul',
      surname: 'Miller',
      city: 'London',
      location: [45.123,47.232],
      cars: [
        { model: 'Bentley',
          year: 1973,
          value: 100000, … },
        { model: 'Rolls Royce',
          year: 1965,
          value: 330000, … }
      ]
    }
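Two of the questions above can be sketched as filter and pipeline documents in MongoDB's query syntax, written here as Python dicts of the kind a driver would accept. The second person in `people` is a hypothetical addition so the filter has something to exclude. No server is contacted: the plain-Python expressions underneath only show what each query is meant to compute.

```python
from statistics import mean

people = [
    {"first_name": "Paul", "surname": "Miller", "city": "London",
     "cars": [{"model": "Bentley", "year": 1973, "value": 100000},
              {"model": "Rolls Royce", "year": 1965, "value": 330000}]},
    # Hypothetical second document, not from the slides.
    {"first_name": "Anna", "surname": "Jones", "city": "Leeds",
     "cars": [{"model": "Mini", "year": 1975, "value": 8000}]},
]

# "Find everybody in London with a car built between 1970 and 1980"
london_1970s_filter = {
    "city": "London",
    "cars": {"$elemMatch": {"year": {"$gte": 1970, "$lte": 1980}}},
}

# "Calculate the average value of Paul's car collection"
avg_value_pipeline = [
    {"$match": {"first_name": "Paul"}},
    {"$unwind": "$cars"},
    {"$group": {"_id": "$first_name", "avg_value": {"$avg": "$cars.value"}}},
]

# Plain-Python equivalents of the two queries, evaluated locally.
london_1970s = [
    p["surname"] for p in people
    if p["city"] == "London"
    and any(1970 <= c["year"] <= 1980 for c in p["cars"])
]
paul_avg = mean(
    c["value"] for p in people if p["first_name"] == "Paul" for c in p["cars"]
)

print(london_1970s)
print(paul_avg)
```

`$elemMatch` matters here because both year bounds must hold for the same car, not for two different cars in the array.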
  7. Global Community
    •  7,000,000+ MongoDB Downloads
    •  200,000+ Online Education Registrants
    •  35,000+ MongoDB Management Service (MMS) Users
    •  30,000+ MongoDB User Group Members
    •  20,000+ MongoDB Days Attendees
  8. MongoDB Use Cases
    •  Big Data
    •  Product & Asset Catalogs
    •  Security & Fraud
    •  Internet of Things
    •  Database-as-a-Service
    •  Mobile Apps
    •  Customer Data Management
    •  Single View
    •  Social & Collaboration
    •  Content Management
    Users shown on the slide:
    •  Intelligence Agencies
    •  Top Investment and Retail Banks
    •  Top US Retailer
    •  Top Global Shipping Company
    •  Top Industrial Equipment Manufacturer
    •  Top Media Company
  9. Terminology
    RDBMS        ➜  MongoDB
    Table, View  ➜  Collection
    Row          ➜  Document
    Index        ➜  Index
    Join         ➜  Embedded Document
    Foreign Key  ➜  Reference
    Partition    ➜  Shard
  10. Typical (relational) ERD
    •  User: Name, Email address
    •  Category: Name, URL
    •  Comment: Comment, Date, Author
    •  Article: Name, Slug, Publish date, Text
    •  Tag: Name, URL
  11. MongoDB ERD (linking vs embedding)
    •  User: Name, Email address
    •  Article: Name, Slug, Publish date, Text, Author
      •  Comment[]: Comment, Date, Author
      •  Tag[]: Value
      •  Category[]: Value
  12. Sample Article document
    {
      _id: ObjectId(…),
      name: 'My first blog post',
      slug: '/2014_01_23_My_first_blog_post',
      published: ISODate('2014-01-23T12:23:42'),
      user_id: ObjectId(…),
      text: 'This is the very first blog post…',
      author: 'John Doe',
      tags: [ 'NoSQL', 'MongoDB' ],
      categories: [ 'Work' ],
      comments: [
        { name: 'Harry',
          text: 'Nice blog post!',
          ts: ISODate('2014-02-03T11:41:25') },
        …
      ]
    }
    Annotations on the slide:
    •  Fields
    •  Typed field values (e.g. Date)
    •  Fields can contain arrays
    •  Fields can contain an array of sub-documents
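The article document mixes both modeling styles: comments are embedded as an array of sub-documents, while `user_id` links to a separate user document. A minimal sketch with plain Python dicts follows. The identifiers `"user-1"` and `"article-1"`, the email address, the second comment, and the helpers `add_comment` and `author_of` are hypothetical stand-ins, since the slide elides the real ObjectId values.

```python
from datetime import datetime

# Hypothetical "users" collection keyed by id (stand-in for ObjectId values).
users = {"user-1": {"name": "John Doe", "email": "john@example.com"}}

article = {
    "_id": "article-1",
    "name": "My first blog post",
    "user_id": "user-1",  # linking: a reference to a document stored elsewhere
    "tags": ["NoSQL", "MongoDB"],
    "comments": [  # embedding: sub-documents stored inside the article
        {"name": "Harry", "text": "Nice blog post!",
         "ts": datetime(2014, 2, 3, 11, 41, 25)},
    ],
}


def add_comment(doc, name, text, ts):
    """Append an embedded comment, as a `$push` update would on the server."""
    doc["comments"].append({"name": name, "text": text, "ts": ts})


def author_of(doc, user_collection):
    """Follow the user_id reference; this costs a second lookup."""
    return user_collection[doc["user_id"]]


add_comment(article, "Sally", "Looking forward to more.", datetime(2014, 2, 4, 9, 0))
print(len(article["comments"]))
print(author_of(article, users)["name"])
```

Embedded data arrives with the article in a single read, whereas a reference trades that for an extra lookup and a single place to update the user.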
  13. Deployment – Sharded cluster
    [Diagram labels: App Server with Mongos (×3); Config Server (×3); Shard (×3), each with Node 1 and Secondary]
  14. Control of reads and writes
    •  Read preferences: primary (default), primaryPreferred, secondary, secondaryPreferred, nearest
    •  Write concerns (e.g. w=1, w=majority, j=1, wtimeout=<ms>)
    •  Tagged reads and writes (to a specific node or subset of nodes)
    •  Tag-aware sharding
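One common place to set these controls is the connection string. The sketch below assembles a URI using the standard option names `readPreference`, `w`, `journal` (the URI spelling of `j`), and `wtimeoutMS`. The host names, database name, replica set name, and timeout value are assumptions for illustration, and the code only builds and re-parses the string rather than connecting.

```python
from urllib.parse import parse_qs, urlencode

# Hypothetical deployment details; only the option names are standard.
options = {
    "replicaSet": "rs0",
    "readPreference": "secondaryPreferred",  # read from a secondary if one is available
    "w": "majority",                         # wait for a majority of nodes to acknowledge
    "journal": "true",                       # wait for the journal write (j=1)
    "wtimeoutMS": "5000",                    # give up waiting after 5 seconds
}
uri = "mongodb://db1.example.com,db2.example.com/blog?" + urlencode(options)

# Parse the options back out to show what a driver would receive.
parsed = parse_qs(uri.split("?", 1)[1])

print(uri)
print(parsed["readPreference"], parsed["w"])
```

Drivers generally also allow the same settings to be overridden per operation, which is useful when only some reads can tolerate slightly stale data.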