Slide 1

Slide 1 text

in 10 minutes
Mohannad El Dafrawy, Sara Rodriguez, Lino Valdivia Jr

Slide 2

Slide 2 text

What is MongoDB?
• Document database
  o Data is structured as schema-less JSON documents
• One of the most popular NoSQL solutions
• Cross-platform and open source
  o Written in C++
  o Supports Windows, Linux, Mac OS X, Solaris

Slide 3

Slide 3 text

Features (I)
• Document-based storage and querying
  o Queries themselves are JSON documents
• Full index support
  o Allows indexing on any attribute, just like in a traditional SQL solution
• Replication & high availability
  o Supports mirroring of data for scalability

Slide 4

Slide 4 text

Features (II)
• Auto-sharding (horizontal scaling)
  o Large data sets can be divided and distributed over multiple shards
• Fast in-place updates
  o Update operations are atomic for contention-free performance
• Integrated map/reduce framework
  o Can perform map/reduce operations on top of the data
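To make the map/reduce bullet concrete, here is a minimal sketch in plain JavaScript. It runs against an in-memory array, not a MongoDB server; the `mapReduce` helper and the sample documents are hypothetical and only mimic the general shape of the idea (a map step that emits key/value pairs, a reduce step that folds the values for each key).

```javascript
// In-memory stand-in for a collection; these documents are made up for illustration.
const posts = [
  { _id: 1, tags: ["databases", "mongo"] },
  { _id: 2, tags: ["mongo"] },
  { _id: 3, tags: ["economy", "databases"] },
];

// Hypothetical helper (not a MongoDB API): runs map over every document,
// groups the emitted values by key, then reduces each group to one value.
function mapReduce(docs, map, reduce) {
  const groups = new Map();
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  for (const doc of docs) map(doc, emit);

  const result = {};
  for (const [key, values] of groups) result[key] = reduce(key, values);
  return result;
}

// Count how many posts carry each tag.
const tagCounts = mapReduce(
  posts,
  (doc, emit) => doc.tags.forEach((tag) => emit(tag, 1)),
  (key, values) => values.reduce((sum, v) => sum + v, 0)
);

console.log(tagCounts); // { databases: 2, mongo: 2, economy: 1 }
```

The map function sees one document at a time and the reduce function sees only the values for one key, which is what lets this style of computation be spread across shards.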

Slide 5

Slide 5 text

History
• First developed by 10gen (later MongoDB, Inc.) in 2007
• Name comes from “humongous”
• Became open source in 2009
• Latest stable release (2.4.9) released Jan 2014

Slide 6

Slide 6 text

Basic Ideas

{
  _id: 1234,
  author: { name: "Bob Jones", email: "[email protected]" },
  post: "In these troubled times I like to ...",
  date: { $date: "2014-03-12 13:23UTC" },
  location: [ -121.2322, 48.1223222 ],
  rating: 2.2,
  comments: [
    { user: "[email protected]", upVotes: 22, downVotes: 14, text: "Great point! I agree" },
    { user: "[email protected]", upVotes: 421, downVotes: 22, text: "You are a..." }
  ],
  tags: [ "databases", "mongo" ]
}

● Collections of JSON objects
● Embed objects within a single document
● Flexible schema
● References
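The difference between embedding and referencing can be sketched in plain JavaScript. This is only an in-memory illustration: the `authors` map, the `authorId` field, and the `resolveAuthor` helper are assumptions made for the example, not part of the slide or of any MongoDB API.

```javascript
// Embedded: the comments live inside the post document,
// so a single read returns the post together with its comments.
const embeddedPost = {
  _id: 1234,
  post: "In these troubled times I like to ...",
  comments: [{ upVotes: 22, downVotes: 14, text: "Great point! I agree" }],
};

// Referenced: the post stores only the author's _id (hypothetical field name),
// and a second lookup is needed to fetch the author document.
const authors = new Map([[77, { _id: 77, name: "Bob Jones" }]]);
const referencedPost = { _id: 1235, post: "...", authorId: 77 };

// Hypothetical helper that follows the reference by hand.
function resolveAuthor(post) {
  return authors.get(post.authorId);
}

console.log(embeddedPost.comments[0].text);      // Great point! I agree
console.log(resolveAuthor(referencedPost).name); // Bob Jones
```

Embedding favors data that is read together with its parent; a reference avoids duplicating data that many documents share, at the cost of an extra lookup.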

Slide 7

Slide 7 text

Query Example

db.posts.find({ "author.name": "mike" })
db.posts.find({ rating: { $gt: 2 } })
db.posts.find({ tags: "software" })
db.posts.find().sort({ date: -1 }).limit(10)

// select * from posts where 'economy' in tags order by ts DESC
db.posts.find({ tags: "economy" }).sort({ ts: -1 }).limit(10);

http://try.mongodb.org/
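For readers without a server at hand, the semantics of the queries above can be approximated in plain JavaScript. This is a rough sketch, not MongoDB's query engine: `getPath`, `matches`, and `find` are hypothetical helpers, the sample documents are made up, and only equality, dotted paths, array membership, and `$gt` are handled.

```javascript
// Walk a dotted path such as "author.name" into nested objects.
function getPath(doc, path) {
  return path
    .split(".")
    .reduce((value, key) => (value == null ? undefined : value[key]), doc);
}

// True when the document satisfies every condition in the query document.
function matches(doc, query) {
  return Object.entries(query).every(([path, cond]) => {
    const value = getPath(doc, path);
    if (cond !== null && typeof cond === "object" && "$gt" in cond) {
      return value > cond.$gt;
    }
    // Equality against an array field matches if any element equals the value.
    return Array.isArray(value) ? value.includes(cond) : value === cond;
  });
}

function find(docs, query = {}) {
  return docs.filter((doc) => matches(doc, query));
}

// Made-up sample data.
const posts = [
  { author: { name: "mike" }, rating: 2.2, tags: ["software"], ts: 3 },
  { author: { name: "bob" }, rating: 1.5, tags: ["economy"], ts: 2 },
  { author: { name: "ann" }, rating: 4.0, tags: ["economy", "software"], ts: 5 },
];

const byAuthor = find(posts, { "author.name": "mike" });
const highlyRated = find(posts, { rating: { $gt: 2 } });
const newestEconomy = find(posts, { tags: "economy" })
  .sort((a, b) => b.ts - a.ts) // plays the role of .sort({ ts: -1 })
  .slice(0, 10);               // plays the role of .limit(10)

console.log(byAuthor.length, highlyRated.length, newestEconomy[0].author.name);
// 1 2 ann
```

Note that the dotted key has to be quoted, since `author.name` on its own is not a valid object key in JavaScript.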

Slide 8

Slide 8 text

Note on internals
• Documents stored as BSON (Binary JSON)
• Memory-mapped files
• Indexes are B-Trees

http://bsonspec.org

{_id: ObjectId(XXXXXXXXX), hello: "world"}

\x27\x00\x00\x00 \x07 _id\x00 XXXXXXXXXXXX \x02 hello\x00 \x06\x00\x00\x00 world\x00 \x00
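The byte layout above can be reproduced with a small encoder. This is a minimal sketch that handles only flat documents whose values are all strings (BSON type 0x02); `encodeStringDoc` is a hypothetical helper, not a function from any MongoDB driver. The example leaves out `_id`, so the result is the 22-byte `{hello: "world"}` document rather than the 39-byte one shown on the slide.

```javascript
// Encode a flat document whose values are all strings into BSON bytes.
function encodeStringDoc(doc) {
  const utf8 = new TextEncoder();
  // Little-endian int32, as BSON uses for all lengths.
  const int32 = (n) => [n & 0xff, (n >> 8) & 0xff, (n >> 16) & 0xff, (n >>> 24) & 0xff];

  const body = [];
  for (const [key, value] of Object.entries(doc)) {
    const str = utf8.encode(value);
    body.push(0x02, ...utf8.encode(key), 0x00); // type byte, then NUL-terminated key
    body.push(...int32(str.length + 1));        // string length, counting the trailing NUL
    body.push(...str, 0x00);                    // string bytes, NUL-terminated
  }

  const total = 4 + body.length + 1;            // length prefix + elements + terminator
  return Uint8Array.from([...int32(total), ...body, 0x00]);
}

const bytes = encodeStringDoc({ hello: "world" });
const hex = Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join(" ");
console.log(hex);
// 16 00 00 00 02 68 65 6c 6c 6f 00 06 00 00 00 77 6f 72 6c 64 00 00
```

Adding an ObjectId element (type byte 0x07, the key `_id`, and 12 raw bytes) contributes 17 more bytes, which is how the slide's document arrives at a length prefix of 0x27.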

Slide 9

Slide 9 text

No content

Slide 10

Slide 10 text

Cassandra (1.2) VS MongoDB (2.2)

Cassandra (1.2)
Best used:
• When you write more than you read (logging).
• If every component of the system must be in Java.
• If you require Availability + Partition Tolerance
For example: Banking, financial industry (though not necessarily for financial transactions, but these industries are much bigger than that.) Writes are faster than reads, so one natural niche is data analysis.

MongoDB (2.2)
Best used:
• If you need dynamic queries.
• If you prefer to define indexes, not map/reduce functions.
• If you need good performance on a big DB.
• If you require Consistency + Partition Tolerance
For example: For most things that you would do with MySQL or PostgreSQL, but having predefined columns really holds you back.

source: http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis

Slide 11

Slide 11 text

Why (and why not) MongoDB?
• If you need dynamic queries
• If you prefer to define indexes, not map/reduce functions
• If you need good performance on a big DB
• If you wanted CouchDB, but your data changes too much, filling up disks
• It lacks transactions, so if you're a bank, don't use it
• It doesn't support SQL
• It doesn't have any built-in revisioning like CouchDB
• It doesn't have real full text searching features

Slide 12

Slide 12 text

Production Users
• Archiving - Craigslist
• Content Management - MTV Networks
• E-Commerce - CustomInk
• Real-time Analytics - Intuit
• Social Networking - Foursquare

Slide 13

Slide 13 text

Long-term goals for MongoDB
Add new features such as:
• Natural language processing
• Full text search engine
• More real-time search in data

Slide 14

Slide 14 text

Personal conclusion
• Getting up to speed with MongoDB (document-oriented and schema-free)
• Advanced usage (tons of features)
• Administration (easy to administer; replication, sharding)
• Advanced usage (indexes & aggregation)
• BSON and memory-mapped files
• CP (Consistency and Partition Tolerance): there are times when not all clients can read or write.

Slide 15

Slide 15 text

References
• MongoDB.org (https://www.mongodb.org/)
• Wikipedia: MongoDB (http://en.wikipedia.org/wiki/MongoDB)
• DB-Engines Ranking (http://db-engines.com/en/ranking)
• Interview about the future of MongoDB (http://strata.oreilly.com/2012/11/the-future-of-mongodb.html)
• MongoDB Inside and Outside by Kyle Banker (http://vimeo.com/13211523)
• How This Web Site Uses MongoDB (http://www.businessinsider.com/how-we-use-mongodb-2009-11)
• Cassandra and MongoDB comparison (http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis)