DIBI workshop intro

MongoDB technical introduction Ross Lawley - @RossC0 #dibi13

Agenda
• Welcome!
• Who’s who?
• Introduction to MongoDB
• Tutorial & exercises
• High Availability tutorial if time

About MongoDB
• Background
  – Founded in 2007
  – First release of MongoDB in 2009
  – $231M in funding
• MongoDB
  – Core server
  – Native drivers
• Subscriptions, Consulting, Training
• Monitoring (MMS)

Relational Databases

RDBMS Strengths
• Data stored is very compact
• Rigid schemas have led to powerful query capabilities
• Data is optimised for joins and storage
• Robust ecosystem of tools, libraries, integrations
• 40+ years old!

Enter “Big Data”
• Gartner defines it with 3Vs
• Volume
  – Vast amounts of data being collected
• Variety
  – Evolving data
  – Uncontrolled formats, no single schema
  – Unknown at design time
• Velocity
  – Inbound data speed
  – Fast read/write operations
  – Low latency

Mapping Big Data to RDBMS
• Difficult to store uncontrolled data formats
• Scaling via big iron or custom data marts/partitioning schemes
• Schema must be known at design time
• Impedance mismatch with agile development and deployment techniques
• Doesn’t map well to native language constructs

MongoDB Features

Goals
• Scale horizontally over commodity systems
• Incorporate what works for RDBMSs
  – Rich data models, ad-hoc queries, full indexes
• Drop what doesn’t work well
  – Multi-row transactions, complex joins
• Do not homogenize APIs
• Match agile development and deployment workflows

Key Features
• Data stored as documents (JSON)
  – Schema-free
• Full CRUD support (Create, Read, Update, Delete)
  – Atomic in-place updates
  – Ad-hoc queries: Equality, RegEx, Ranges, Geospatial
• Secondary indexes
• Replication – redundancy, failover
• Sharding – partitioning for read/write scalability

Document Oriented, Schema Free
{name: "will", eyes: "blue", birthplace: "NY", aliases: ["bill"], gender: "Male", boss: "ben"}
{name: "tina", birthplace: "NCE", boss: "ben"}
{name: "ross", boss: "ben"}
{name: "ben", hat: "yes"}
{name: "matt", pizza: "DiGiorno", age: 28}

BSON – bsonspec.org

Extent allocation
[Diagram: data files foo.0, foo.1 and foo.2, zero-filled preallocated space, and extents for foo.$freelist, foo.baz, foo.bar and foo.test]
• Allocated per namespace
• ns details stored in foo.ns

Record Allocation
• Record: Header (Size, Offset, Next, Prev), BSON Data, Padding
• Deleted Record: (Size, Offset, Next)

Disk seeks and data locality
• Seek = 5+ ms
• Read = really really fast
[Diagram: User, Comment, Article]

Disk seeks and data locality
[Diagram: Article, User, Comment, Comment, Comment, Comment, Comment]
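The seek figure from the previous slide lends itself to a back-of-the-envelope comparison. The sketch below is an illustration only: it assumes one seek per separately stored record, ignores read time, and treats the record counts (one article, one user, five comments) as the ones drawn in the diagram. `estimatedSeekMs` is a hypothetical helper, not a MongoDB function.

```javascript
const SEEK_MS = 5; // lower bound taken from the slide ("Seek = 5+ ms")

// Hypothetical estimate: one seek per separately stored record, read time ignored.
function estimatedSeekMs(recordCount) {
  return recordCount * SEEK_MS;
}

// Records scattered across the disk: 1 article + 1 user + 5 comments.
const scattered = estimatedSeekMs(1 + 1 + 5);

// Records stored together (assumed here to be a single document): one seek.
const colocated = estimatedSeekMs(1);

console.log(scattered, colocated); // 35 5
```

Real numbers depend on caching and hardware; the point is only that the cost grows with the number of separate locations visited.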

MongoDB Security
• SSL
  – Between your app and MongoDB
  – Between nodes in MongoDB cluster
• Authorization at the database level
  – Read Only / Read + Write / Administrator
• Roadmap
  – 2.4: Pluggable Authentication
  – 2.6: Cell level security

Working with MongoDB

Create (Insert)
> user = { username: "ross", first_name: "Ross", last_name: "Lawley" }
> db.users.insert(user)

Read (Query)
> db.users.findOne()
{
  "_id" : ObjectId("50ed3c5cab4ef39dc735664b"),
  "username" : "ross",
  "first_name" : "Ross",
  "last_name" : "Lawley"
}

_id
• _id is the primary key in MongoDB
• Automatically indexed
• Automatically created as an ObjectId if not provided
• Any unique immutable value could be used

ObjectId
• ObjectId is a special 12 byte value
• Guaranteed to be unique across your cluster

ObjectId("50ed3c5cab4ef39dc735664b")
          |------||----||--||----|
             ts    mac  pid  inc
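The field layout on the slide can be checked by slicing the hex string by hand. The following is a minimal sketch in plain JavaScript (runnable in Node, no driver needed). `parseObjectId` is a hypothetical helper, not part of the MongoDB shell, and it assumes the layout drawn above: 4 bytes of timestamp, 3 bytes of machine id, 2 bytes of process id and a 3-byte counter.

```javascript
// Hypothetical helper: split a 24-character ObjectId hex string into the
// four fields shown on the slide (ts, mac, pid, inc).
function parseObjectId(hex) {
  if (!/^[0-9a-f]{24}$/i.test(hex)) {
    throw new Error("expected a 24-character hex string");
  }
  // First 4 bytes: creation time in seconds since the Unix epoch.
  const seconds = parseInt(hex.slice(0, 8), 16);
  return {
    timestamp: new Date(seconds * 1000),
    machine: hex.slice(8, 14),                // 3 bytes, kept as hex
    pid: parseInt(hex.slice(14, 18), 16),     // 2 bytes
    counter: parseInt(hex.slice(18, 24), 16), // 3 bytes
  };
}

const parts = parseObjectId("50ed3c5cab4ef39dc735664b");
console.log(parts.timestamp.toISOString());
console.log(parts.machine, parts.pid, parts.counter);
```

Because the timestamp comes first, ObjectIds created later sort after earlier ones, at one-second granularity.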

Query Operators
// find users with any tags
> db.users.find( {tags: {$exists: true }} )
// find users matching a regular expression
> db.users.find( {username: /^ro*/i } )
// count users with a given username (matching is case-sensitive)
> db.users.find( {username: "ross"} ).count()

Conditional Operators
  – $all, $exists, $mod, $ne, $in, $nin, $nor, $or, $size, $type
  – $lt, $lte, $gt, $gte
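To make the operator semantics concrete without a running server, here is a small in-memory sketch in plain JavaScript. `matches` is a hypothetical function written for this illustration; it covers only equality, regular expressions and a handful of the operators listed above, on top-level scalar fields, and is not how MongoDB evaluates queries internally.

```javascript
// Hypothetical matcher: does `doc` satisfy `query`? Top-level fields only.
function matches(doc, query) {
  return Object.entries(query).every(([field, cond]) => {
    const value = doc[field];
    if (cond instanceof RegExp) {
      return typeof value === "string" && cond.test(value);
    }
    if (cond !== null && typeof cond === "object") {
      // Operator document such as {$exists: true} or {$gte: 1}.
      return Object.entries(cond).every(([op, arg]) => {
        switch (op) {
          case "$exists": return (field in doc) === arg;
          case "$ne": return value !== arg;
          case "$in": return arg.includes(value);
          case "$gt": return value > arg;
          case "$gte": return value >= arg;
          case "$lt": return value < arg;
          case "$lte": return value <= arg;
          default: throw new Error("unsupported operator: " + op);
        }
      });
    }
    return value === cond; // plain equality
  });
}

// Sample data invented for the example.
const users = [
  { username: "ross", tags: ["superuser", "db_admin"], tag_count: 2 },
  { username: "rob", tag_count: 0 },
  { username: "tina" },
];
const names = (query) =>
  users.filter((u) => matches(u, query)).map((u) => u.username);

console.log(names({ tags: { $exists: true } }));  // [ 'ross' ]
console.log(names({ username: /^ro/i }));         // [ 'ross', 'rob' ]
console.log(names({ tag_count: { $gte: 1 } }));   // [ 'ross' ]
```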

Update
> tags = ["superuser", "db_admin"]
> address = { street: "Scrutton Street", city: "London" }
> db.users.update({}, {"$pushAll": {"tags": tags}, "$set": {"address": address}, "$inc": {"tag_count": 2}})
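What the three operators in that update do to a document can be sketched on a plain JavaScript object. `applyUpdate` is a hypothetical helper for illustration: it handles only $pushAll, $set and $inc on top-level fields and does nothing about atomicity, which the server provides per document.

```javascript
// Hypothetical in-memory sketch of three update operators; not MongoDB code.
function applyUpdate(doc, update) {
  const result = JSON.parse(JSON.stringify(doc)); // work on a deep copy
  for (const [field, values] of Object.entries(update.$pushAll || {})) {
    // Append every value; a missing field starts as an empty array.
    result[field] = (result[field] || []).concat(values);
  }
  for (const [field, value] of Object.entries(update.$set || {})) {
    result[field] = value; // create or overwrite the field
  }
  for (const [field, amount] of Object.entries(update.$inc || {})) {
    // Increment; a missing field starts at 0.
    result[field] = (result[field] || 0) + amount;
  }
  return result;
}

const user = { username: "ross", first_name: "Ross", last_name: "Lawley" };
const tags = ["superuser", "db_admin"];
const address = { street: "Scrutton Street", city: "London" };

const updated = applyUpdate(user, {
  $pushAll: { tags: tags },
  $set: { address: address },
  $inc: { tag_count: 2 },
});
console.log(updated);
```

The fields that did not exist before (tags, address, tag_count) are simply created, which is the schema-free behaviour described earlier.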

Read (Query)
> db.users.findOne()
{
  "_id" : ObjectId("50ed3c5cab4ef39dc735664b"),
  "address" : { "street" : "Scrutton Street", "city" : "London" },
  "first_name" : "Ross",
  "last_name" : "Lawley",
  "tag_count" : 2,
  "tags" : [ "superuser", "db_admin" ],
  "username" : "ross"
}

Atomic operators
• Scalar
  – $set, $unset, $inc
• Array
  – $push, $pushAll, $pull, $pullAll, $addToSet

Secondary Indexes
// 1 means ascending, -1 means descending
> db.users.ensureIndex({username: 1})
> db.users.find({username: "ross"}).explain()
// Multi-key indexes
> db.users.ensureIndex({tags: 1})
// index nested field
> db.users.ensureIndex({"address.city": 1})
// Compound indexes
> db.users.ensureIndex({ "username": 1, "address.city": 1 })
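A rough picture of what the multi-key and nested-field indexes hold can be built in plain JavaScript. This is a simplified sketch, not the server's B-tree: `getPath` and `indexEntries` are hypothetical helpers, the sample documents are invented, and documents without the field are skipped for brevity (a real non-sparse index stores a null key for them).

```javascript
// Resolve a dotted path such as "address.city" against a document.
function getPath(doc, path) {
  return path.split(".").reduce(
    (value, key) => (value == null ? undefined : value[key]),
    doc
  );
}

// Hypothetical sketch: list the (key, id) entries a single-field index would hold.
// An array value yields one entry per element, which is what makes it "multi-key".
function indexEntries(docs, path) {
  const entries = [];
  for (const doc of docs) {
    const value = getPath(doc, path);
    const keys = Array.isArray(value) ? value : [value];
    for (const key of keys) {
      if (key !== undefined) entries.push({ key, id: doc._id });
    }
  }
  // Ascending order, matching the 1 in ensureIndex({field: 1}).
  return entries.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
}

const docs = [
  { _id: 1, username: "ross", tags: ["superuser", "db_admin"], address: { city: "London" } },
  { _id: 2, username: "ben", tags: ["db_admin"], address: { city: "New York" } },
];

console.log(indexEntries(docs, "tags"));
console.log(indexEntries(docs, "address.city"));
```

The first document appears twice in the tags listing, once per tag, so a lookup on either tag can find it.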

Enough talk, let's get started!
