MongoDB

Overview of MongoDB in only 10 minutes

saracubillas

April 01, 2014

Transcript

  1. What is MongoDB?
    • Document database
      o Data is structured as schema-less JSON documents
    • One of the most popular NoSQL solutions
    • Cross-platform and open source
      o Written in C++
      o Supports Windows, Linux, Mac OS X, Solaris
  2. Features (I)
    • Document-based storage and querying
      o Queries themselves are JSON documents
    • Full Index Support
      o Allows indexing on any attribute, just like in a traditional SQL solution
    • Replication & High Availability
      o Supports mirroring of data for scalability
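To illustrate the indexing idea on this slide, here is a small sketch in plain JavaScript that runs without MongoDB. It builds a lookup table from one attribute's values to the documents holding them, which is the basic job of an index. The `buildIndex` helper and the sample documents are hypothetical; MongoDB itself stores indexes as B-trees, so treat the `Map` below as a simplification.

```javascript
// Hypothetical sample documents; the fields are illustrative only.
const posts = [
  { _id: 1, author: "bob", rating: 2.2 },
  { _id: 2, author: "mike", rating: 4.0 },
  { _id: 3, author: "bob", rating: 3.1 },
];

// Build a simplified index: attribute value -> list of matching documents.
// (A real MongoDB index is a B-tree; a Map is used here only for illustration.)
function buildIndex(docs, field) {
  const index = new Map();
  for (const doc of docs) {
    const key = doc[field];
    if (!index.has(key)) index.set(key, []);
    index.get(key).push(doc);
  }
  return index;
}

const byAuthor = buildIndex(posts, "author");

// With the index in place, a lookup no longer scans every document.
const bobPosts = byAuthor.get("bob") || [];
console.log(bobPosts.map((p) => p._id)); // prints [ 1, 3 ]
```

Any attribute can be passed as `field`, which mirrors the slide's point that an index can be defined on any attribute.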
  3. Features (II)
    • Auto-Sharding (horizontal scaling)
      o Large data sets can be divided and distributed over multiple shards
    • Fast In-Place Updates
      o Update operations are atomic for contention-free performance
    • Integrated Map/Reduce framework
      o Can perform map/reduce operations on top of the data
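The map/reduce idea can be sketched in plain JavaScript without a server. The example below counts how often each tag appears across a few posts. The `mapReduce` function, its `emit` callback, and the sample posts are hypothetical stand-ins written for this illustration; they loosely follow the map/emit/reduce shape and do not reproduce MongoDB's own framework.

```javascript
// Hypothetical sample documents.
const posts = [
  { _id: 1, tags: ["databases", "mongo"] },
  { _id: 2, tags: ["mongo", "software"] },
  { _id: 3, tags: ["databases"] },
];

// Minimal in-memory map/reduce: group emitted values by key, then reduce each group.
function mapReduce(docs, map, reduce) {
  const groups = new Map();
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  for (const doc of docs) map(doc, emit);

  const result = {};
  for (const [key, values] of groups) result[key] = reduce(key, values);
  return result;
}

const tagCounts = mapReduce(
  posts,
  (doc, emit) => doc.tags.forEach((tag) => emit(tag, 1)), // map: emit one count per tag
  (key, values) => values.reduce((sum, v) => sum + v, 0)  // reduce: sum the counts
);

console.log(tagCounts); // { databases: 2, mongo: 2, software: 1 }
```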
  4. History
    • First developed by 10gen (later MongoDB, Inc.) in 2007
    • Name comes from "humongous"
    • Became open source in 2009
    • Latest stable release (2.4.9) released Jan 2014
  5. Basic Ideas
    {
      _id: 1234,
      author: { name: "Bob Jones", email: "[email protected]" },
      post: "In these troubled times I like to ...",
      date: { $date: "2014-03-12 13:23UTC" },
      location: [ -121.2322, 48.1223222 ],
      rating: 2.2,
      comments: [
        { user: "[email protected]", upVotes: 22, downVotes: 14, text: "Great point! I agree" },
        { user: "[email protected]", upVotes: 421, downVotes: 22, text: "You are a..." }
      ],
      tags: [ "databases", "mongo" ]
    }

    • Collections of JSON objects
    • Embed objects within a single document
    • Flexible schema
    • References
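Embedding and references are two ways to relate data in a document model. The sketch below contrasts them in plain JavaScript. The `authors` collection, the `authorId` field, and the `findById` helper are hypothetical names chosen for this example; `findById` stands in for the second query a client would issue to resolve a reference.

```javascript
// Embedded: the author lives inside the post, so one read returns everything.
const embeddedPost = {
  _id: 1234,
  author: { name: "Bob Jones" },
  tags: ["databases", "mongo"],
};

// Referenced: the post stores only the author's _id; the author is a separate document.
const authors = [{ _id: 7, name: "Bob Jones" }];
const referencedPost = { _id: 1235, authorId: 7, tags: ["databases"] };

// Stand-in for a second lookup against the authors collection.
function findById(collection, id) {
  return collection.find((doc) => doc._id === id) || null;
}

const author = findById(authors, referencedPost.authorId);

console.log(embeddedPost.author.name); // Bob Jones
console.log(author.name);              // Bob Jones
```

Embedding favors reads that need the whole object at once, while a reference avoids duplicating the author across many posts at the cost of an extra lookup.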
  6. Query Example
    db.posts.find({ "author.name": "mike" })
    db.posts.find({ rating: { $gt: 2 } })
    db.posts.find({ tags: "software" })
    db.posts.find().sort({ date: -1 }).limit(10)

    // select * from posts where 'economy' in tags order by ts DESC limit 10
    db.posts.find({ tags: "economy" }).sort({ ts: -1 }).limit(10);

    http://try.mongodb.org/
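To make the query semantics concrete, the sketch below evaluates the same kinds of query documents against an in-memory array in plain JavaScript. It is not MongoDB's implementation: `matches`, `getPath`, `find`, and the sample data are hypothetical, and only equality, `$gt`, dotted paths, and array membership are handled.

```javascript
// Hypothetical sample documents.
const posts = [
  { _id: 1, author: { name: "mike" }, rating: 3.5, tags: ["software"], ts: 30 },
  { _id: 2, author: { name: "bob" }, rating: 1.5, tags: ["economy"], ts: 20 },
  { _id: 3, author: { name: "mike" }, rating: 2.2, tags: ["economy", "software"], ts: 10 },
];

// Resolve a dotted path such as "author.name" against a document.
function getPath(doc, path) {
  return path.split(".").reduce((v, key) => (v == null ? undefined : v[key]), doc);
}

// Test one document against a query document.
function matches(doc, query) {
  return Object.entries(query).every(([path, cond]) => {
    const value = getPath(doc, path);
    if (cond !== null && typeof cond === "object") {
      // Operator form; only $gt is supported in this sketch.
      return "$gt" in cond && value > cond.$gt;
    }
    // A scalar condition matches an equal value or any element of an array field.
    return Array.isArray(value) ? value.includes(cond) : value === cond;
  });
}

const find = (query) => posts.filter((doc) => matches(doc, query));

const byMike = find({ "author.name": "mike" }).map((p) => p._id);
const highRated = find({ rating: { $gt: 2 } }).map((p) => p._id);
const latestEconomy = find({ tags: "economy" })
  .sort((a, b) => b.ts - a.ts) // descending, like sort({ ts: -1 })
  .slice(0, 10)                // like limit(10)
  .map((p) => p._id);

console.log(byMike, highRated, latestEconomy); // [ 1, 3 ] [ 1, 3 ] [ 2, 3 ]
```

Note that the dotted key is quoted in the query object, since an unquoted `author.name` is not a valid JavaScript property name.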
  7. Note on internals
    • Documents stored as BSON (Binary JSON)
    • Memory-mapped files
    • Indexes are B-Trees

    http://bsonspec.org

    {_id: ObjectId(XXXXXXXXX), hello: "world"}

    \x27\x00\x00\x00 \x07 _id\x00 XXXXXXXXXXXX \x02 hello\x00 \x06\x00\x00\x00 world\x00 \x00
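The byte layout on this slide can be reproduced by hand. The sketch below encodes a document with string fields only, using Node.js `Buffer`; `encodeStringDoc` is a hypothetical helper written for this example, not a driver API. It leaves out the `_id` field, so the result is the shorter 22-byte encoding of `{hello: "world"}` instead of the 39-byte (`\x27`) document shown above.

```javascript
// Encode a flat document whose values are all strings into BSON bytes.
function encodeStringDoc(doc) {
  const elements = [];
  for (const [key, value] of Object.entries(doc)) {
    const keyBytes = Buffer.from(key + "\0", "utf8");   // field name as a NUL-terminated cstring
    const strBytes = Buffer.from(value + "\0", "utf8"); // string data plus trailing NUL
    const strLen = Buffer.alloc(4);
    strLen.writeInt32LE(strBytes.length, 0);            // little-endian length, NUL included
    elements.push(Buffer.from([0x02]), keyBytes, strLen, strBytes); // 0x02 = string type tag
  }
  const body = Buffer.concat(elements);
  const total = Buffer.alloc(4);
  total.writeInt32LE(4 + body.length + 1, 0);           // length prefix + body + terminator
  return Buffer.concat([total, body, Buffer.from([0x00])]);
}

const bytes = encodeStringDoc({ hello: "world" });
console.log(bytes.length);          // 22
console.log(bytes.toString("hex")); // 160000000268656c6c6f0006000000776f726c640000
```

The leading `16 00 00 00` is the total document length (22) as a little-endian 32-bit integer, the same role `\x27\x00\x00\x00` plays in the slide's example.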
  8. Cassandra (1.2) VS MongoDB (2.2)
    Cassandra (1.2), best used:
    • When you write more than you read (logging).
    • If every component of the system must be in Java.
    • If you require Availability + Partition Tolerance
    For example: Banking, financial industry (though not necessarily for financial transactions, but these industries are much bigger than that). Writes are faster than reads, so one natural niche is data analysis.

    MongoDB (2.2), best used:
    • If you need dynamic queries.
    • If you prefer to define indexes, not map/reduce functions.
    • If you need good performance on a big DB.
    • If you require Consistency + Partition Tolerance
    For example: For most things that you would do with MySQL or PostgreSQL, but having predefined columns really holds you back.

    source: http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis
  9. Why (and why not) MongoDB?
    Why:
    • If you need dynamic queries
    • If you prefer to define indexes, not map/reduce functions
    • If you need good performance on a big DB
    • If you wanted CouchDB, but your data changes too much, filling up disks
    Why not:
    • It lacks transactions, so if you're a bank, don't use it
    • It doesn't support SQL
    • It doesn't have any built-in revisioning like CouchDB
    • It doesn't have real full text searching features
  10. Production Users
    • Archiving - Craigslist
    • Content Management - MTV Networks
    • E-Commerce - CustomInk
    • Real-time Analytics - Intuit
    • Social Networking - Foursquare
  11. Long-term goals for MongoDB
    To add new features such as:
    • Natural language processing
    • Full text search engine
    • More real-time search in data
  12. Personal conclusion
    • Getting up to speed with MongoDB (document-oriented and schema-free)
    • Advanced usage (tons of features)
    • Administration (easy to admin, replication, sharding)
    • Advanced usage (indexes & aggregation)
    • BSON and memory-mapped files
    • There are times when not all clients can read or write: CP (Consistency and Partition Tolerance).
  13. References
    • MongoDB.org (https://www.mongodb.org/)
    • Wikipedia: MongoDB (http://en.wikipedia.org/wiki/MongoDB)
    • DB-Engines Ranking (http://db-engines.com/en/ranking)
    • Interview about the future of MongoDB (http://strata.oreilly.com/2012/11/the-future-of-mongodb.html)
    • MongoDB Inside and Outside by Kyle Banker (http://vimeo.com/13211523)
    • How This Web Site Uses MongoDB (http://www.businessinsider.com/how-we-use-mongodb-2009-11)
    • Cassandra and MongoDB comparison (http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis)