Slide 1

Slide 1 text

An Introduction To NoSQL & MongoDB
Lee Theobald
Twitter: @Leesy
Email: [email protected]

Slide 2

Slide 2 text

NoSQL
A form of database management system that is non-relational.
Systems are often schema-less, avoid joins & are easy to scale.
The term NoSQL was coined in 1998 by Carlo Strozzi and revived in 2009, helped along by events such as the no:sql(east) conference.
A better term would have been “NoREL”, but NoSQL caught on.
Think of it more as meaning “Not Only SQL”.

Slide 3

Slide 3 text

But Why Choose NoSQL?
Amount of data stored is on the up & up.
Facebook is rumoured to hold over 50TB of data in their NoSQL system for their inbox search.
The data we store is more complex than 15 years ago.
Easy Distribution
With all this data, it needs to be easy to add or remove servers without any disruption of service.

Slide 4

Slide 4 text

Choose Your Flavour
Key-Value Store
Graph
BigTable
Document Store

Slide 5

Slide 5 text

Key-Value Store
Data is stored in (unsurprisingly) key/value pairs.
Designed to handle lots of data and heavy load.
Many are based on Amazon’s Dynamo paper.
Example: Voldemort (http://project-voldemort.com/), developed by the guys at LinkedIn.

Key          Value
Name         Joe Bloggs
Age          42
Occupation   Stunt Double
Height       175cm
Weight       77kg
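
As a rough illustration of the key/value model (not Voldemort's actual API), the pairs on this slide can be mimicked with a plain JavaScript Map. The `store` name and the `put`/`get` helpers are assumptions made for this sketch.

```javascript
// In-memory stand-in for a key-value store (hypothetical helper names).
const store = new Map();

// put: write a value under a key, replacing any previous value.
function put(key, value) {
  store.set(key, value);
}

// get: look a value up by key; returns undefined when the key is absent.
function get(key) {
  return store.get(key);
}

put("Name", "Joe Bloggs");
put("Age", 42);
put("Occupation", "Stunt Double");
put("Height", "175cm");
put("Weight", "77kg");

console.log(get("Occupation")); // Stunt Double
console.log(get("Shoe Size"));  // undefined
```

The only access path is the key itself, which is what makes this model simple to spread across many servers.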

Slide 6

Slide 6 text

Graph
Focuses on modeling data & associated connections.
Inspired by mathematical graph theory.
Example: FlockDB (http://github.com/twitter/flockdb), developed by Twitter.
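
To make "data & associated connections" concrete, here is a minimal sketch of a follower graph held as an adjacency list. It is not FlockDB's API; the user names and the `addFollow`, `following` and `followers` helpers are invented for illustration.

```javascript
// Directed "follows" edges stored as an adjacency list (user names are made up).
const follows = new Map();

// Record that `follower` follows `followee`.
function addFollow(follower, followee) {
  if (!follows.has(follower)) follows.set(follower, new Set());
  follows.get(follower).add(followee);
}

// Everyone that `user` follows, in alphabetical order.
function following(user) {
  return [...(follows.get(user) || [])].sort();
}

// Everyone who follows `user` (walks every edge; fine for a sketch).
function followers(user) {
  const result = [];
  for (const [follower, followees] of follows) {
    if (followees.has(user)) result.push(follower);
  }
  return result.sort();
}

addFollow("alice", "bob");
addFollow("carol", "bob");
addFollow("bob", "alice");

console.log(followers("bob")); // [ 'alice', 'carol' ]
console.log(following("bob")); // [ 'alice' ]
```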

Slide 7

Slide 7 text

BigTable / Column Families
Based on the BigTable paper from Google.
Data is grouped by columns, not rows.
Example: Cassandra (http://cassandra.apache.org/), originally developed by Facebook, now an Apache project.
[Diagram: a ColumnFamily holds row keys; each row key holds named columns; each column name holds a set of key/value pairs.]
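
The nesting in the slide's diagram can be sketched with nested maps. This is only a model of the data layout, not Cassandra's API; the row key, column names and values below are made up.

```javascript
// columnFamily -> row key -> column name -> { key: value } (all names are made up).
const columnFamily = new Map();

// Store one key/value pair under a named column of a row.
function setValue(rowKey, columnName, key, value) {
  if (!columnFamily.has(rowKey)) columnFamily.set(rowKey, new Map());
  const row = columnFamily.get(rowKey);
  if (!row.has(columnName)) row.set(columnName, {});
  row.get(columnName)[key] = value;
}

// Read one named column of one row without touching the row's other columns.
function getColumn(rowKey, columnName) {
  const row = columnFamily.get(rowKey);
  return row && row.has(columnName) ? row.get(columnName) : null;
}

setValue("user:1", "profile", "name", "Joe Bloggs");
setValue("user:1", "profile", "age", 42);
setValue("user:1", "address", "city", "London");

console.log(getColumn("user:1", "profile")); // { name: 'Joe Bloggs', age: 42 }
```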

Slide 8

Slide 8 text

Document Store
Data is stored as whole documents.
JSON & XML are popular formats.
Maps well to an object-orientated programming model.
Example: CouchDB (http://couchdb.apache.org/) or …

    {
        "id": "123",
        "name": "Oliver Clothesoff",
        "dob": {
            "year": 1985,
            "month": 5,
            "day": 12
        }
    }
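
A small sketch of why documents map well onto objects: the slide's JSON parses straight into a nested JavaScript object. Only the standard `JSON` functions are used; the variable names are arbitrary.

```javascript
// The document from the slide, written with plain ASCII quotes.
const json =
  '{ "id": "123", "name": "Oliver Clothesoff", ' +
  '"dob": { "year": 1985, "month": 5, "day": 12 } }';

// Parsing gives an ordinary nested object; no joins or row mapping are involved.
const person = JSON.parse(json);

console.log(person.name);     // Oliver Clothesoff
console.log(person.dob.year); // 1985

// Serialising and parsing again returns the whole document as one unit.
const roundTripped = JSON.parse(JSON.stringify(person));
```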

Slide 9

Slide 9 text

MongoDB!
Short for humongous
Open source, with development led by 10gen
Document based
Schema-less
Highly scalable
MapReduce
Easy replication & sharding

Slide 10

Slide 10 text

Familiar Structure
A MongoDB instance is made up of a number of databases.
Each database contains a number of collections, and dotted names (for example blog.posts) let you group collections as if they were nested under one another.
Compare it to MySQL, which has databases and tables.

Slide 11

Slide 11 text

Inserts – As Easy As Pie

    use cookbook;
    db.recipes.save({
        "name": "Cherry Pie",
        "ingredients": ["cherries", "pie"],
        "cooking_time": 30
    });

Slide 12

Slide 12 text

Searching – A Piece Of Cake!

    db.recipes.find({ "cooking_time": { "$gte": 10, "$lt": 30 } });
    db.recipes.findOne();
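
To show what the range query on this slide selects, here is a plain JavaScript sketch over an in-memory array. It does not talk to MongoDB; the sample recipes (other than Cherry Pie) and the helper name `matchesRange` are assumptions.

```javascript
// Hypothetical in-memory stand-in for the recipes collection.
const recipes = [
  { name: "Cherry Pie", cooking_time: 30 },
  { name: "Toast", cooking_time: 5 },
  { name: "Omelette", cooking_time: 10 },
  { name: "Pancakes", cooking_time: 20 },
];

// Same condition as { cooking_time: { $gte: 10, $lt: 30 } }:
// at least 10, and strictly less than 30.
function matchesRange(doc) {
  return doc.cooking_time >= 10 && doc.cooking_time < 30;
}

const results = recipes.filter(matchesRange);
console.log(results.map((r) => r.name)); // [ 'Omelette', 'Pancakes' ]

// findOne() with no query hands back a single document, or null if there is none.
const firstDoc = recipes.length > 0 ? recipes[0] : null;
```

Note that the Cherry Pie saved earlier has a cooking time of exactly 30, so `$lt: 30` leaves it out; `$lte` would include it.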

Slide 13

Slide 13 text

Some More Advanced Syntax
Limiting results:
    db.recipes.find().limit(10);
Skipping results:
    db.recipes.find().skip(5);
Sorting:
    db.recipes.find().sort({cooking_time: -1});
Cursors:
    // find() already returns a cursor
    var cur = db.recipes.find();
    cur.forEach(function(x) { print(tojson(x)); });
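
The three cursor modifiers combine naturally for paging. The sketch below imitates them on a plain array; it is not the shell's implementation, and the `page` helper and sample data are assumptions. It sorts first, then skips, then limits, which is the order MongoDB applies to a query however the calls are chained.

```javascript
// Hypothetical in-memory stand-in for the recipes collection.
const recipes = [
  { name: "Cherry Pie", cooking_time: 30 },
  { name: "Toast", cooking_time: 5 },
  { name: "Omelette", cooking_time: 10 },
  { name: "Pancakes", cooking_time: 20 },
];

// direction is 1 for ascending or -1 for descending, as in sort({field: -1}).
function page(docs, sortKey, direction, skip, limit) {
  return [...docs] // copy so the original array is left untouched
    .sort((a, b) => direction * (a[sortKey] - b[sortKey]))
    .slice(skip, skip + limit);
}

// Roughly: find().sort({cooking_time: -1}).skip(1).limit(2)
const result = page(recipes, "cooking_time", -1, 1, 2);
console.log(result.map((r) => r.name)); // [ 'Pancakes', 'Omelette' ]
```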

Slide 14

Slide 14 text

MapReduce
Great way of doing bulk manipulation or aggregation.
Two or three functions (map, reduce and an optional finalize) written in JavaScript that execute on the server.
An example use could be generating a list of top queries from some search logs.

Slide 15

Slide 15 text

Map Function
Takes some input in the form of key/value pairs, performs some calculations and returns 0 or more key/value pairs.

    map = function() {
        // Skip log entries that have no query field
        if (!this.query) { return; }
        emit(this.query, {count: 1});
    };

Slide 16

Slide 16 text

Reduce Function
Takes the results from the map function, does something (normally combining the results) and produces output in key/value pairs.

    reduce = function(key, values) {
        var count = 0;
        // Add up the counts emitted for this key
        values.forEach(function(v) {
            count += v['count'];
        });
        return {count: count};
    };
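
To see the two functions working together without a server, here is a toy in-memory runner. It is a sketch of the idea, not MongoDB's implementation: `runMapReduce` and the sample log documents are invented, and `emit` is passed to `map` as an argument rather than being a global as it is on the server.

```javascript
// Hypothetical search-log documents; only the "query" field comes from the slides.
const logs = [
  { query: "mongodb" },
  { query: "nosql" },
  { query: "mongodb" },
  { user: "anonymous" }, // no query field, so map emits nothing for it
  { query: "mongodb" },
];

// Same logic as the slide's map, but emit is passed in instead of being a global.
function map(emit) {
  if (!this.query) { return; }
  emit(this.query, { count: 1 });
}

// Same logic as the slide's reduce: add up the counts for one key.
function reduce(key, values) {
  let count = 0;
  values.forEach((v) => { count += v.count; });
  return { count: count };
}

// Toy runner: group emitted values by key, then reduce each group.
function runMapReduce(docs, mapFn, reduceFn) {
  const groups = new Map();
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  docs.forEach((doc) => mapFn.call(doc, emit)); // `this` is the current document
  return [...groups].map(([key, values]) => ({
    _id: key,
    value: reduceFn(key, values),
  }));
}

// Most frequent queries first.
const topQueries = runMapReduce(logs, map, reduce)
  .sort((a, b) => b.value.count - a.value.count);

console.log(topQueries);
```

With the sample logs, "mongodb" comes out on top with a count of 3 and "nosql" follows with a count of 1.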

Slide 17

Slide 17 text

Replica Sets
Master/slave-style configuration: one primary server and one or more secondaries.
If your primary server goes down, one of the secondary ones takes over automatically.
Extremely easy to set up.

Slide 18

Slide 18 text

Auto Sharding – Horizontal Scaling

Slide 19

Slide 19 text

Other Features
GridFS support – distributed file storage
Geospatial indexing
It’s constantly in development, so new features are being worked on all the time!

Slide 20

Slide 20 text

Why Not Try It Yourself
Download it at: http://www.mongodb.org
Online tutorial at: http://www.mongodb.org/display/DOCS/Tutorial
Some handy MongoDB sites:
MongoDB Cookbook: http://cookbook.mongodb.org/
Kyle Banker’s blog: http://kylebanker.com/blog/
There’s also a load of handy reference cards, stickers and other MongoDB freebies up front!

Slide 21

Slide 21 text

Thanks For Listening
Any questions?