Common Use Cases for MongoDB

Brandon Black (@brandonmblack)
Software Engineer, MongoDB

What is MongoDB?

MongoDB is a ___________ database
•  Document
•  Open source
•  High performance
•  Horizontally scalable
•  Full featured

Document Database
•  Not for .PDF & .DOC files
•  A document is essentially an associative array
•  Document = JSON object
•  Document = PHP Array
•  Document = Python Dict
•  Document = Ruby Hash

Open Source
•  MongoDB is an open source project
•  Available on GitHub
•  Licensed under the AGPL
•  Started & sponsored by MongoDB, Inc. (formerly 10gen)
•  Commercial licenses available
•  Contributions welcome

High Performance
•  Written in C++
•  Extensive use of memory-mapped files, i.e. read-through, write-through memory caching
•  Runs nearly everywhere
•  Data serialized as BSON (fast parsing)
•  Full support for primary & secondary indexes
•  Document model = less work

Horizontally Scalable
[Diagram: data partitioned across Shard 1, Shard 2, Shard 3, ... Shard N]

Database Landscape
[Chart: Memcached, MongoDB, and RDBMS plotted by Depth of Functionality against Scalability & Performance]

Full Featured
•  Ad hoc queries
•  Real-time aggregation
•  Rich query capabilities
•  Strongly consistent
•  Geospatial features
•  Native support for most programming languages
•  Flexible schema

mongodb.org/downloads

NoSQL and MongoDB

NoSQL Features

Flexible Data Models
•  Lists, embedded objects
•  Sparse data
•  Semi-structured data
•  Agile development

High Data Throughput
•  Reads
•  Writes

Big Data
•  Aggregate data size
•  Number of objects

Low Latency
•  For reads and writes
•  Millisecond latency

Cloud Computing
•  Runs everywhere
•  No special hardware

Commodity Hardware
•  Ethernet
•  Local data storage

MongoDB
•  JSON based
•  Dynamic schemas
•  Replica sets to scale reads
•  Sharding to scale writes
•  1000s of shards in a single DB
•  Data partitioning
•  Designed for "typical" OS and local file system
•  Scale-out to overcome hardware limitations
•  In-memory cache
•  Scale-out working set

Use Cases

High Volume Data Feeds

Machine Generated Data
•  More machines, more sensors, more data
•  Variably structured

Securities Data
•  High frequency trading
•  Daily closing price

Social Media
•  Multiple data sources
•  Each changes its format consistently
•  Student scores, telecom logs

High Volume Data Feeds
[Diagram: several data sources writing asynchronously into a sharded cluster]
•  Asynchronous writes
•  Flexible document model can adapt to changes in data format
•  Write to memory with periodic disk flush
•  Scale writes over multiple shards

Operational Intelligence

Ad Targeting
•  Large volume of users
•  Very strict latency requirements

Real-Time Dashboards
•  Expose data to millions of customers
•  Reports on large volumes of data
•  Reports that update in real time

Social Media Monitoring
•  Join the conversation

Operational Intelligence
[Diagram: dashboards and an API reading from the cluster]
•  Low latency reads
•  Parallelize queries across replicas and shards
•  In-database aggregation
•  Flexible schema adapts to changing input data
•  Can use same cluster to collect, store and report on data

Behavioral Profiles

    {
      cookie_id: "1234512413243",
      advertiser: {
        apple: {
          actions: [
            { impression: 'ad1', time: 123 },
            { impression: 'ad2', time: 232 },
            { click: 'ad2', time: 235 },
            { add_to_cart: 'laptop', sku: 'asdf23f', time: 254 },
            { purchase: 'laptop', time: 354 }
          ]
          …

[Diagram: 1 See Ad, 2 See Ad, 3 Click, 4 Convert]
•  Rich profiles collecting multiple complex actions
•  Scale out to support high throughput of activities tracked
•  Dynamic schemas make it easy to …
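As a rough illustration of how a profile like this could be summarized, the sketch below counts actions by type in plain JavaScript. It uses only the part of the document visible on the slide (the source is truncated, so the closing braces are an assumption), and the helpers `actionType` and `countActions` are hypothetical names, not part of any MongoDB API.

```javascript
// Profile as shown on the slide; the closing braces are assumed because
// the original document is truncated.
const profile = {
  cookie_id: "1234512413243",
  advertiser: {
    apple: {
      actions: [
        { impression: "ad1", time: 123 },
        { impression: "ad2", time: 232 },
        { click: "ad2", time: 235 },
        { add_to_cart: "laptop", sku: "asdf23f", time: 254 },
        { purchase: "laptop", time: 354 },
      ],
    },
  },
};

// Hypothetical helper: treat the first key that is not "time" or "sku"
// as the action type.
function actionType(action) {
  return Object.keys(action).find((key) => key !== "time" && key !== "sku");
}

// Hypothetical helper: count how many times each action type occurs.
function countActions(actions) {
  const counts = {};
  for (const action of actions) {
    const type = actionType(action);
    counts[type] = (counts[type] || 0) + 1;
  }
  return counts;
}

const counts = countActions(profile.advertiser.apple.actions);
console.log(counts);
```

Because each action is a small object with its own fields, new action types can be appended to the array without changing the counting code.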

Metadata

Product Catalogue
•  Diverse product portfolio
•  Complex querying and filtering

Data Analysis
•  Data mining

Biometric
•  Retina scans
•  Fingerprints

Metadata

    { ISBN: "00e8da9b", type: "Book", country: "Egypt", title: "Ancient Egypt" }

    { type: "Artefact", medium: "Ceramic", country: "Egypt", year: "3000 BC" }

•  Flexible data model for similar but different objects
•  Indexing and rich query API for easy searching and sorting

    db.archives.find({ "country": "Egypt" });
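To make the idea of querying differently shaped documents concrete, here is a minimal in-memory sketch in plain JavaScript. It is a stand-in for illustration only, not the MongoDB driver or shell: `findByExample` is a hypothetical helper that handles top-level equality and nothing else, and the third record is invented for contrast.

```javascript
// In-memory stand-in for the "archives" collection from the slide.
const archives = [
  { ISBN: "00e8da9b", type: "Book", country: "Egypt", title: "Ancient Egypt" },
  { type: "Artefact", medium: "Ceramic", country: "Egypt", year: "3000 BC" },
  // Hypothetical extra record, not from the slide.
  { type: "Book", country: "Greece", title: "Ancient Greece" },
];

// Hypothetical helper: keep documents whose top-level fields equal
// every key/value pair in `query`. Missing fields simply do not match.
function findByExample(docs, query) {
  return docs.filter((doc) =>
    Object.entries(query).every(([key, value]) => doc[key] === value)
  );
}

const egyptian = findByExample(archives, { country: "Egypt" });
console.log(egyptian.map((doc) => doc.type));
```

The book and the artefact share only some fields, yet a single query on `country` matches both, which is the point the slide makes about similar but different objects.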

Content Management

News Site
•  Comments and user generated content
•  Personalization of content, layout

Multi-device Rendering
•  Generate layout on the fly
•  No need to cache static pages

Sharing
•  Store large objects
•  Simpler modeling of metadata

Content Management

    { camera: "Nikon d4", location: [ -122.418333, 37.775 ] }

    { camera: "Canon 5d mkII", people: [ "Jim", "Carol" ],
      taken_on: ISODate("2012-03-07T18:32:35.002Z") }

    { origin: "facebook.com/photos/xwdf23fsdf",
      license: "Creative Commons CC0",
      size: { dimensions: [ 124, 52 ], units: "pixels" } }

•  Flexible data model for similar but different objects
•  Horizontal scalability for large data sets
•  Geospatial indexing for location-based searches
•  GridFS for large object storage
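A location-based search over documents like these can be sketched as a brute-force distance filter. This is only an illustration of the query's intent, not how a geospatial index works internally. It assumes `location` is stored as [longitude, latitude], as the first photo suggests. The helper names and the New York record are hypothetical.

```javascript
const EARTH_RADIUS_KM = 6371;
const toRadians = (degrees) => (degrees * Math.PI) / 180;

// Great-circle distance between two [longitude, latitude] pairs (haversine formula).
function haversineKm([lng1, lat1], [lng2, lat2]) {
  const dLat = toRadians(lat2 - lat1);
  const dLng = toRadians(lng2 - lng1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRadians(lat1)) * Math.cos(toRadians(lat2)) * Math.sin(dLng / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
}

// Hypothetical helper: photos that have a location within maxKm of center.
// Photos without a location field are skipped rather than treated as errors.
function photosNear(photos, center, maxKm) {
  return photos.filter(
    (photo) =>
      Array.isArray(photo.location) && haversineKm(photo.location, center) <= maxKm
  );
}

const photos = [
  { camera: "Nikon d4", location: [-122.418333, 37.775] },
  { camera: "Canon 5d mkII", people: ["Jim", "Carol"] },
  // Hypothetical record near New York, added for contrast.
  { camera: "Phone", location: [-74.006, 40.7128] },
];

const nearby = photosNear(photos, [-122.4, 37.78], 10);
console.log(nearby.map((photo) => photo.camera));
```

A real deployment would let the database's geospatial index do this work instead of scanning every document.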

Is My Use Case a Good Fit For MongoDB?

Application: Why MongoDB might be a good fit
•  Large number of objects to store: Sharding lets you split objects across multiple servers.
•  High write or read throughput: Sharding + replication lets you scale read and write traffic across multiple servers.
•  Low latency access: Memory-mapped storage engine caches documents in RAM, enabling in-memory performance. Data locality of documents can significantly improve latency over join-based approaches.
•  Variable data in objects: Dynamic schema and JSON data model enable flexible data storage without sparse tables or complex joins.
•  Cloud-based deployment: Sharding and replication let you work around hardware limitations in clouds.
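The idea of splitting objects across servers can be sketched with a simple hash-based router. This is an illustrative assumption, not MongoDB's actual chunk or shard-key mechanism: `simpleHash` is an FNV-1a-style hash over UTF-16 code units, and `shardFor` and `partition` are hypothetical helpers.

```javascript
// FNV-1a-style 32-bit hash over the string's UTF-16 code units.
// Used only to spread keys; not MongoDB's hashing.
function simpleHash(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep as unsigned 32-bit
  }
  return hash;
}

// Hypothetical helper: map a key to a shard number in [0, shardCount).
function shardFor(key, shardCount) {
  return simpleHash(String(key)) % shardCount;
}

// Hypothetical helper: group documents into per-shard arrays by one field.
function partition(docs, keyField, shardCount) {
  const shards = Array.from({ length: shardCount }, () => []);
  for (const doc of docs) {
    shards[shardFor(doc[keyField], shardCount)].push(doc);
  }
  return shards;
}

// Hypothetical sample data: 20 documents keyed by user_id.
const docs = Array.from({ length: 20 }, (_, i) => ({ user_id: `user${i}` }));
const shards = partition(docs, "user_id", 4);
console.log(shards.map((shard) => shard.length));
```

Because the same key always hashes to the same shard, a lookup by that key only has to contact one server, while writes for different keys spread across all of them.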

What Next?

Online Documentation: http://docs.mongodb.org/
MongoDB Blog: http://blog.mongodb.org/

Free Online Courses: http://education.mongodb.com/
Events & Webinars: http://www.mongodb.com/events
Presentations: http://www.mongodb.com/presentations

Thank You

Brandon Black (@brandonmblack)
Software Engineer, MongoDB

Slides Available:
http://github.com/brandonblack/presentations
http://speakerdeck.com/brandonblack
