Common MongoDB Use Cases Webinar - April 5, 2012

When to use MongoDB

Part 1 of a series
• Real-Time Analytics with MongoDB: April 12th
• Content Management with MongoDB: May 17th
@forjared

Emerging NoSQL Space
• The beginning: RDBMS
• Last 10 years: RDBMS, Data Warehouse
• Today: RDBMS, Data Warehouse, NoSQL

Qualities of NoSQL Workloads

Flexible data models
• Lists, Nested Objects
• Sparse schemas
• Semi-structured data
• Agile Development

High Throughput
• Lots of reads
• Lots of writes

Large Data Sizes
• Aggregate data size
• Number of objects

Low Latency
• Both reads and writes
• Millisecond latency

Cloud Computing
• Run anywhere
• No assumptions about hardware
• No / Few Knobs

Commodity Hardware
• Ethernet
• Local disks

MongoDB was designed for this

Flexible data models (lists, nested objects; sparse schemas; semi-structured data; agile development)
• JSON based object model
• Dynamic schemas

High Throughput (lots of reads; lots of writes)
• Replica Sets to scale reads
• Sharding to scale writes

Large Data Sizes (aggregate data size; number of objects)
• 1000's of shards in a single DB
• Partitioning of data

Low Latency (both reads and writes; millisecond latency)
• In-memory cache
• Scale-out working set

Cloud Computing (run anywhere; no assumptions about hardware; no / few knobs)
• Scale-out to overcome hardware limitations

Commodity Hardware (Ethernet; local disks)
• Designed for "typical" OS and local file system

Example customers
• User Data Management
• High Volume Data Feeds
• Content Management
• Operational Intelligence
• Product Data Management

USE CASES THAT LEVERAGE NOSQL

High Volume Data Feeds

Machine Generated Data
• More machines, more sensors, more data
• Variably structured

Stock Market Data
• High frequency trading

Social Media Firehose
• Multiple sources of data
• Each changes its format constantly

High Volume Data Feed
(Diagram: several data sources writing into a sharded cluster)
• Asynchronous writes
• Flexible document model can adapt to changes in sensor format
• Write to memory with periodic disk flush
• Scale writes over multiple shards
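To picture how writes could be spread over multiple shards, here is a toy sketch that assumes range-based partitioning on a hypothetical sensor_id shard key. The names chunks and routeWrite and the sample readings are illustrative assumptions, not MongoDB APIs or data from the talk; in a real cluster the routing is handled by MongoDB itself.

```javascript
// Toy illustration only: these names are hypothetical, not MongoDB APIs.
// Each chunk owns shard-key values in [min, max) and lives on one shard.
const chunks = [
  { min: "", max: "h", shard: "shard0" },
  { min: "h", max: "p", shard: "shard1" },
  { min: "p", max: "\uffff", shard: "shard2" },
];

// Pick the shard whose key range contains the document's shard key.
function routeWrite(doc) {
  const key = doc.sensor_id;
  const chunk = chunks.find((c) => key >= c.min && key < c.max);
  if (!chunk) throw new Error("no chunk covers key " + key);
  return chunk.shard;
}

// Made-up readings with differently shaped payloads, as a flexible
// document model allows; each one is routed by its shard key alone.
const readings = [
  { sensor_id: "alpha-1", temp: 21.5 },
  { sensor_id: "kilo-7", humidity: 0.4, battery: 88 },
  { sensor_id: "zulu-3", temp: 19.0, gps: [-122.4, 37.7] },
];

const placement = readings.map((r) => [r.sensor_id, routeWrite(r)]);
console.log(placement);
```

Because the three keys fall into different ranges, the writes land on different shards, which is the sense in which write load scales out.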

Operational Intelligence

Ad Targeting
• Large volume of state about users
• Very strict latency requirements

Customer Facing Dashboards
• Expose report data to millions of customers
• Report on large volumes of data
• Reports that update in real time

Social Media Monitoring
• Need to join the conversation _now_

Operational Intelligence
(Diagram: dashboards and an API reading from the cluster)
• Low latency reads
• Parallelize queries across replicas and shards
• In database aggregation
• Flexible schema adapts to changing input data
• Can use same cluster to collect, store, and report on data

Behavioral Profiles
(Diagram: 1. See Ad, 2. See Ad, 3. Click, 4. Convert)

{
  cookie_id: "1234512413243",
  advertiser: {
    apple: {
      actions: [
        { impression: 'ad1', time: 123 },
        { impression: 'ad2', time: 232 },
        { click: 'ad2', time: 235 },
        { add_to_cart: 'laptop', sku: 'asdf23f', time: 254 },
        { purchase: 'laptop', time: 354 }
      ]
    }
  }
}

• Rich profiles collecting multiple complex actions
• Scale out to support high throughput of activities tracked
• Indexing and querying to support matching, frequency capping
• Dynamic schemas make it easy to track vendor specific attributes
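As a rough in-memory sketch of how such a profile supports activity tracking and frequency capping, the snippet below appends an action and counts impressions of one ad. The field names follow the slide's document; the function names recordAction, impressionCount and underCap, the cap value, and the extra impression at time 240 are hypothetical. In the database itself the append would typically be an update using the $push operator rather than application-side mutation.

```javascript
// In-memory sketch; starts from the first three actions in the slide's document.
const profile = {
  cookie_id: "1234512413243",
  advertiser: {
    apple: {
      actions: [
        { impression: "ad1", time: 123 },
        { impression: "ad2", time: 232 },
        { click: "ad2", time: 235 },
      ],
    },
  },
};

// Append one tracked action, creating the advertiser entry if it is missing.
function recordAction(profile, advertiser, action) {
  if (!profile.advertiser[advertiser]) {
    profile.advertiser[advertiser] = { actions: [] };
  }
  profile.advertiser[advertiser].actions.push(action);
}

// Count how many times a given ad was shown to this cookie.
function impressionCount(profile, advertiser, adId) {
  const entry = profile.advertiser[advertiser];
  if (!entry) return 0;
  return entry.actions.filter((a) => a.impression === adId).length;
}

// Frequency capping: the ad may be shown only while the count is below the cap.
function underCap(profile, advertiser, adId, cap) {
  return impressionCount(profile, advertiser, adId) < cap;
}

// Hypothetical extra impression, then a check against an assumed cap of 2.
recordAction(profile, "apple", { impression: "ad2", time: 240 });
console.log(underCap(profile, "apple", "ad2", 2));
```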

Product Data

E-Commerce Product Catalog
• Diverse product portfolio
• Complex querying and filtering

Flash Sales
• Scale for short bursts of high volume traffic
• Scalable, but consistent view of inventory

Product Data

{
  sku: "00e8da9b",
  type: "MP3",
  details: {
    artist: "John Coltrane",
    title: "A love supreme",
    length: 123
  }
}

{
  sku: "00a9f3a",
  type: "Book",
  details: {
    author: "David Eggers",
    title: "You shall know our velocity",
    isbn: "0-9703355-5-5"
  }
}

• Flexible data model for similar, but different objects
• Indexing and rich query API for easy searching and sorting

db.products.
  find({ "details.author": "David Eggers" }).
  sort({ "details.title": -1 });
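To make the dotted-path lookup concrete, here is a small in-memory sketch of what a filter on "details.author" followed by a descending sort on the nested title field does. It is plain JavaScript, not the MongoDB query engine; the helpers getPath, find and sortBy are hypothetical, and the third product is invented so the sort has something to order.

```javascript
// The first two documents come from the slide; the third is a made-up example.
const products = [
  { sku: "00e8da9b", type: "MP3",
    details: { artist: "John Coltrane", title: "A love supreme", length: 123 } },
  { sku: "00a9f3a", type: "Book",
    details: { author: "David Eggers", title: "You shall know our velocity", isbn: "0-9703355-5-5" } },
  { sku: "hypothetical-1", type: "Book",
    details: { author: "David Eggers", title: "A hypothetical second title" } },
];

// Resolve a dotted path such as "details.author"; undefined if any step is missing.
function getPath(doc, path) {
  return path.split(".").reduce((v, k) => (v == null ? undefined : v[k]), doc);
}

// Keep documents whose value at every dotted path equals the requested value.
function find(docs, query) {
  return docs.filter((d) =>
    Object.entries(query).every(([p, v]) => getPath(d, p) === v));
}

// Sort a copy by one dotted path; direction is 1 (ascending) or -1 (descending).
function sortBy(docs, path, direction) {
  return [...docs].sort((a, b) => {
    const x = getPath(a, path);
    const y = getPath(b, path);
    return (x < y ? -1 : x > y ? 1 : 0) * direction;
  });
}

const result = sortBy(
  find(products, { "details.author": "David Eggers" }),
  "details.title",
  -1
);
console.log(result.map((d) => d.details.title));
```

The MP3 document has no details.author field, so it simply does not match; no placeholder columns or joins are involved.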

Content Management

News Site
• Comments and user generated content
• Personalization of content, layout

Multi-Device rendering
• Generate layout on the fly for each device that connects
• No need to cache static pages

Sharing
• Store large objects
• Simple modeling of metadata

Content Management

{
  camera: "Nikon d4",
  location: [ -122.418333, 37.775 ]
}

{
  camera: "Canon 5d mkII",
  people: [ "Jim", "Carol" ],
  taken_on: ISODate("2012-03-07T18:32:35.002Z")
}

{
  origin: "facebook.com/photos/xwdf23fsdf",
  license: "Creative Commons CC0",
  size: {
    dimensions: [ 124, 52 ],
    units: "pixels"
  }
}

• Flexible data model for similar, but different objects
• Horizontal scalability for large data sets
• Geo spatial indexing for location based searches
• GridFS for large object storage

User Data Management

Video Games
• User state and session management

Social Graphs
• Scale out to large graphs
• Easy to search and process

Identity Management
• Authentication, Authorization and Accounting

User Game State
• Flexible documents support new game features without schema migration
• Sharding enables the whole data set to be in memory, ensuring low latency
• JSON data model maps well to HTML5/JS and Flash based clients
• Easy to store entire player state in a single document

Social Graph
• Documents enable disk locality of all profile data for a user
• Sharding partitions user profiles across available servers
• Native support for arrays makes it easy to store connections inside the user profile
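A minimal sketch of connections stored as an array inside each user profile, assuming a hypothetical friends field and made-up users. The helpers addConnection and mutualFriends are illustrative only; in MongoDB the add would more likely be an update with the $addToSet operator.

```javascript
// Hypothetical profiles keyed by _id; each keeps its connections in an array.
const users = new Map([
  ["ann", { _id: "ann", name: "Ann", friends: ["bob", "cat"] }],
  ["bob", { _id: "bob", name: "Bob", friends: ["ann", "cat", "dan"] }],
  ["cat", { _id: "cat", name: "Cat", friends: ["ann", "bob"] }],
  ["dan", { _id: "dan", name: "Dan", friends: ["bob"] }],
]);

// Add a two-way connection, skipping ids that are already present.
function addConnection(users, a, b) {
  const ua = users.get(a);
  const ub = users.get(b);
  if (!ua || !ub) throw new Error("unknown user");
  if (!ua.friends.includes(b)) ua.friends.push(b);
  if (!ub.friends.includes(a)) ub.friends.push(a);
}

// Mutual friends: ids that appear in both users' friends arrays.
function mutualFriends(users, a, b) {
  const other = new Set(users.get(b).friends);
  return users.get(a).friends.filter((id) => other.has(id));
}

addConnection(users, "ann", "dan");
console.log(mutualFriends(users, "ann", "dan"));
```

Because the whole connection list lives in one document, reading a user's graph neighborhood is a single document fetch rather than a join.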

IS MY USE CASE A GOOD FIT FOR MONGODB?

Good fits for MongoDB

Application characteristic: why MongoDB might be a good fit
• Large number of objects to store: Sharding lets you split objects across multiple servers.
• High write or read throughput: Sharding + Replication lets you scale read and write traffic across multiple servers.
• Low latency access: Memory mapped storage engine caches documents in RAM, enabling in-memory performance. Data locality of documents can significantly improve latency over join based approaches.
• Variable data in objects: Dynamic schema and JSON data model enable flexible data storage without sparse tables or complex joins.
• Cloud based deployment: Sharding and replication let you work around hardware limitations in clouds.

Thanks!
• Real-Time Analytics: April 12th
• MongoDB and AWS CloudFormation: April 25th
• New Aggregation Framework: May 10th
• Content Management: May 17th
