Common MongoDB Use Cases Webinar - April 5, 2012

mongodb
April 05, 2012

MongoDB works differently from other databases. Its document-oriented data model, range-based partitioning, and strong consistency model are well suited to some problems and less well suited to others. In this webinar, we'll go through real-world use cases of MongoDB that take advantage of these unique features. We'll cover specific customers using MongoDB, how they implemented their solutions, and how you can build similar solutions for your own organization.


Transcript

  1. Part 1 of a series

    • Real-Time Analytics with MongoDB: April 12th
    • Content Management with MongoDB: May 17th
    • @forjared
  2. Emerging NoSQL Space

    • The beginning: RDBMS
    • Last 10 years: RDBMS, Data Warehouse
    • Today: RDBMS, Data Warehouse, NoSQL
  3. Qualities of NoSQL Workloads

    • Flexible data models: lists, nested objects; sparse schemas; semi-structured data; agile development
    • High throughput: lots of reads; lots of writes
    • Large data sizes: aggregate data size; number of objects
    • Low latency: both reads and writes; millisecond latency
    • Cloud computing: run anywhere; no assumptions about hardware; no / few knobs
    • Commodity hardware: Ethernet; local disks
  4. MongoDB was designed for this

    • Flexible data models: lists, nested objects; sparse schemas; semi-structured data; agile development
      MongoDB: JSON-based object model; dynamic schemas
    • High throughput: lots of reads; lots of writes
      MongoDB: replica sets to scale reads; sharding to scale writes
    • Large data sizes: aggregate data size; number of objects
      MongoDB: 1000's of shards in a single DB; partitioning of data
    • Low latency: both reads and writes; millisecond latency
      MongoDB: in-memory cache; scale-out working set
    • Cloud computing: run anywhere; no assumptions about hardware; no / few knobs
      MongoDB: scale-out to overcome hardware limitations
    • Commodity hardware: Ethernet; local disks
      MongoDB: designed for "typical" OS and local file system
  5. Example customers

    • User Data Management
    • High Volume Data Feeds
    • Content Management
    • Operational Intelligence
    • Product Data Management
  6. High Volume Data Feeds

    • Machine Generated Data: more machines, more sensors, more data; variably structured
    • Stock Market Data: high frequency trading
    • Social Media Firehose: multiple sources of data; each changes its format constantly
  7. High Volume Data Feed

    (Diagram: several "Data Sources" boxes)

    • Asynchronous writes
    • Flexible document model can adapt to changes in sensor format
    • Write to memory with periodic disk flush
    • Scale writes over multiple shards
  8. Operational Intelligence

    • Ad Targeting: large volume of state about users; very strict latency requirements
    • Customer Facing Dashboards: expose report data to millions of customers; report on large volumes of data; reports that update in real time
    • Social Media Monitoring: need to join the conversation _now_
  9. Operational Intelligence

    (Diagram labels: Dashboards, API)

    • Low latency reads
    • Parallelize queries across replicas and shards
    • In-database aggregation
    • Flexible schema adapts to changing input data
    • Can use same cluster to collect, store, and report on data
  10. Behavioral Profiles

    Steps: 1. See Ad, 2. See Ad, 3. Click, 4. Convert

        { cookie_id: "1234512413243",
          advertiser: {
            apple: {
              actions: [
                { impression: "ad1", time: 123 },
                { impression: "ad2", time: 232 },
                { click: "ad2", time: 235 },
                { add_to_cart: "laptop", sku: "asdf23f", time: 254 },
                { purchase: "laptop", time: 354 }
              ]
            }
          }
        }

    • Rich profiles collecting multiple complex actions
    • Scale out to support high throughput of activities tracked
    • Indexing and querying to support matching, frequency capping
    • Dynamic schemas make it easy to track vendor-specific attributes
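Slide 10 mentions frequency capping on top of the profile document. The following is a minimal sketch of that idea in plain JavaScript, operating on the slide's document held in memory. It is not MongoDB API code, and the helper names `impressionCount` and `isCapped` and the capping rule itself are assumptions made for illustration.

```javascript
// Profile document as shown on slide 10, held as a plain JavaScript object.
const profile = {
  cookie_id: "1234512413243",
  advertiser: {
    apple: {
      actions: [
        { impression: "ad1", time: 123 },
        { impression: "ad2", time: 232 },
        { click: "ad2", time: 235 },
        { add_to_cart: "laptop", sku: "asdf23f", time: 254 },
        { purchase: "laptop", time: 354 }
      ]
    }
  }
};

// Count how often one ad was shown to this cookie for one advertiser.
// (Hypothetical helper; returns 0 for advertisers with no recorded actions.)
function impressionCount(doc, advertiser, adId) {
  const entry = (doc.advertiser || {})[advertiser];
  if (!entry || !Array.isArray(entry.actions)) return 0;
  return entry.actions.filter((action) => action.impression === adId).length;
}

// Assumed capping rule: stop serving an ad once it has been shown
// maxImpressions times to the same cookie.
function isCapped(doc, advertiser, adId, maxImpressions) {
  return impressionCount(doc, advertiser, adId) >= maxImpressions;
}

console.log(impressionCount(profile, "apple", "ad2"));
console.log(isCapped(profile, "apple", "ad2", 3));
```

Because all actions for a cookie live in one document, the check needs only a single document read; a real deployment would still have to decide how long to keep old actions.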
  11. Product Data

    • E-Commerce Product Catalog: diverse product portfolio; complex querying and filtering
    • Flash Sales: scale for short bursts of high volume traffic; scalable, but consistent view of inventory
  12. Product Data

        { sku: "00e8da9b",
          type: "MP3",
          details: {
            artist: "John Coltrane",
            title: "A love supreme",
            length: 123
          }
        }

        { sku: "00a9f3a",
          type: "Book",
          details: {
            author: "David Eggers",
            title: "You shall know our velocity",
            isbn: "0-9703355-5-5"
          }
        }

    • Flexible data model for similar, but different objects
    • Indexing and rich query API for easy searching and sorting

        // title is nested under details, so the sort key uses dot notation too
        db.products.
          find({ "details.author": "David Eggers" }).
          sort({ "details.title": -1 });
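The query on slide 12 reaches into nested fields with a dotted path such as `details.author`. As a rough illustration of what such a path means, here is a sketch in plain JavaScript over an in-memory array. It is not how MongoDB evaluates queries, and `getPath`, `findByPath`, and `sortDescByPath` are hypothetical helpers, not part of any MongoDB API.

```javascript
// The two product documents from slide 12.
const products = [
  { sku: "00e8da9b", type: "MP3",
    details: { artist: "John Coltrane", title: "A love supreme", length: 123 } },
  { sku: "00a9f3a", type: "Book",
    details: { author: "David Eggers", title: "You shall know our velocity",
               isbn: "0-9703355-5-5" } }
];

// Resolve a dotted path such as "details.author".
// Returns undefined when any step along the path is missing.
function getPath(doc, path) {
  return path.split(".").reduce(
    (value, key) => (value == null ? undefined : value[key]),
    doc
  );
}

// Keep only the documents whose value at `path` equals `expected`.
function findByPath(docs, path, expected) {
  return docs.filter((doc) => getPath(doc, path) === expected);
}

// Return a copy sorted in descending order by the string value at `path`.
// Missing values are treated as empty strings (a simplification).
function sortDescByPath(docs, path) {
  const keyOf = (doc) => {
    const value = getPath(doc, path);
    return value == null ? "" : String(value);
  };
  return [...docs].sort((a, b) => keyOf(b).localeCompare(keyOf(a)));
}

const byAuthor = findByPath(products, "details.author", "David Eggers");
console.log(sortDescByPath(byAuthor, "details.title").map((doc) => doc.sku));
```

The MP3 document has no `details.author` field at all, and the lookup simply yields `undefined` for it, which mirrors how documents of different shapes can share one collection.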
  13. Content Management

    • News Site: comments and user generated content; personalization of content, layout
    • Multi-Device Rendering: generate layout on the fly for each device that connects; no need to cache static pages
    • Sharing: store large objects; simple modeling of metadata
  14. Content Management

        { camera: "Nikon d4",
          location: [ -122.418333, 37.775 ]
        }

        { camera: "Canon 5d mkII",
          people: [ "Jim", "Carol" ],
          taken_on: ISODate("2012-03-07T18:32:35.002Z")
        }

        { origin: "facebook.com/photos/xwdf23fsdf",
          license: "Creative Commons CC0",
          size: {
            dimensions: [ 124, 52 ],
            units: "pixels"
          }
        }

    • Flexible data model for similar, but different objects
    • Horizontal scalability for large data sets
    • Geospatial indexing for location-based searches
    • GridFS for large object storage
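Slide 14 lists location-based searches over documents that carry a `location` pair. The sketch below shows, in plain JavaScript, the kind of question such a search answers: which photos lie within a given radius of a point. It assumes the pair is ordered `[longitude, latitude]`, uses the haversine formula as a stand-in, and says nothing about how MongoDB's geospatial index is implemented. The search center and the helper names are made up for the example.

```javascript
const EARTH_RADIUS_KM = 6371; // mean Earth radius

function toRadians(degrees) {
  return (degrees * Math.PI) / 180;
}

// Great-circle distance in km between two [longitude, latitude] pairs,
// computed with the haversine formula.
function haversineKm([lon1, lat1], [lon2, lat2]) {
  const dLat = toRadians(lat2 - lat1);
  const dLon = toRadians(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRadians(lat1)) * Math.cos(toRadians(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
}

// Keep photos that have a location and lie within radiusKm of center.
// Documents without a location field are skipped rather than rejected.
function photosNear(photos, center, radiusKm) {
  return photos.filter(
    (photo) =>
      Array.isArray(photo.location) &&
      haversineKm(photo.location, center) <= radiusKm
  );
}

// Photo documents from slide 14 (the date is kept as a plain string here).
const photos = [
  { camera: "Nikon d4", location: [-122.418333, 37.775] },
  { camera: "Canon 5d mkII", people: ["Jim", "Carol"], taken_on: "2012-03-07T18:32:35.002Z" },
  { origin: "facebook.com/photos/xwdf23fsdf", license: "Creative Commons CC0",
    size: { dimensions: [124, 52], units: "pixels" } }
];

// Hypothetical search center (roughly Oakland, CA), chosen for illustration.
const center = [-122.2712, 37.8044];
console.log(photosNear(photos, center, 20).map((photo) => photo.camera));
```

A linear scan like this touches every document; the point of a geospatial index is to avoid that scan on large collections.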
  15. User Data Management

    • Video Games: user state and session management
    • Social Graphs: scale out to large graphs; easy to search and process
    • Identity Management: authentication, authorization and accounting
  16. User Game State

    • Flexible documents support new game features without schema migration
    • Sharding enables the whole data set to be in memory, ensuring low latency
    • JSON data model maps well to HTML5/JS and Flash-based clients
    • Easy to store entire player state in a single document
  17. Social Graph

    • Documents enable disk locality of all profile data for a user
    • Sharding partitions user profiles across available servers
    • Native support for arrays makes it easy to store connections inside user profile
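Slide 17 says sharding partitions user profiles across servers, and the abstract describes MongoDB's partitioning as range based. A minimal sketch of range-based routing in plain JavaScript follows. The chunk boundaries, shard names, and the choice of `username` as shard key are all assumptions for illustration, and the code does not model chunk splitting or balancing.

```javascript
// Hypothetical chunk table: each chunk owns shard-key values in [min, max).
// A null max means the chunk is unbounded above.
const chunks = [
  { min: "", max: "h", shard: "shard0" },
  { min: "h", max: "p", shard: "shard1" },
  { min: "p", max: null, shard: "shard2" }
];

// Find the shard whose chunk range contains the given shard-key value.
// Strings are compared with JavaScript's default code-unit ordering.
function shardFor(chunkTable, key) {
  const chunk = chunkTable.find(
    (c) => key >= c.min && (c.max === null || key < c.max)
  );
  if (!chunk) throw new RangeError("no chunk covers key " + key);
  return chunk.shard;
}

// User profiles with connections stored as an array inside each document,
// as the slide describes. The usernames are made up for the example.
const profiles = [
  { username: "alice", friends: ["jim", "zoe"] },
  { username: "jim", friends: ["alice"] },
  { username: "zoe", friends: ["alice", "jim"] }
];

for (const doc of profiles) {
  console.log(doc.username + " is stored on " + shardFor(chunks, doc.username));
}
```

With ranges, neighboring key values land on the same shard, which helps range queries but can concentrate writes when keys arrive in increasing order.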
  18. Good fits for MongoDB

    Application characteristic, followed by why MongoDB might be a good fit:

    • Large number of objects to store: sharding lets you split objects across multiple servers.
    • High write or read throughput: sharding + replication lets you scale read and write traffic across multiple servers.
    • Low latency access: memory-mapped storage engine caches documents in RAM, enabling in-memory performance. Data locality of documents can significantly improve latency over join-based approaches.
    • Variable data in objects: dynamic schema and JSON data model enable flexible data storage without sparse tables or complex joins.
    • Cloud-based deployment: sharding and replication let you work around hardware limitations in clouds.
  19. Thanks!

    • Real-Time Analytics: April 12th
    • MongoDB and AWS CloudFormation: April 25th
    • New Aggregation Framework: May 10th
    • Content Management: May 17th