Benefits of using MongoDB: Reduce Complexity & Adapt to Changes

Vinova

April 03, 2011

Transcript

  1. Agenda
    • What’s MongoDB?
    • Why does MongoDB reduce complexity?
    • Why does MongoDB adapt to changes better?
    • Case studies
  2. What’s MongoDB?
    mongodb.org: “MongoDB (from "humongous") is a scalable, high-performance, open source, document-oriented database”
  3. MongoDB Features
    • JSON style documents
    • Index on any attribute
    • Rich queries
    • In-place update
    • Auto-sharding
    • Map / Reduce
    • GridFS to store files
    • Server-side JavaScript
    • Capped collections
    • Full-text search (coming soon)
  4. MongoDB's flexible data structure, its ability to index and query data, and auto-sharding make it a strong tool that adapts well to change. It also helps reduce complexity compared to a traditional RDBMS.
  5. Why does MongoDB reduce complexity?
    • Get rid of migrations
    • Get rid of (most) relationships
    • Reduce the number of database requests
    • JSON (client, server, and database)
  6. Get rid of migrations
    • No create table
    • No alter column
    • No add column
    • No change column
  7. Get rid of relationships
    • Many one-to-one and one-to-many relationships are not necessary
    • User :has_one :setting
    • User :has_many :addresses
    • User :has_many :roles
    • Post :has_many :tags
  8. Adapt to changes
    • Changes in schema
    • Changes in data & algorithms
    • Changes for performance & scaling
  9. Changes in schema
    • In modern apps, the schema changes quite often (weekly, monthly ...)
    • ALTER TABLE operations are expensive in an RDBMS
    • Dynamic-schema documents make those changes seamless
  10. Changes in data & algorithms
    • Atomic, in-place updates are a powerful way to modify data: $inc, $set, $unset, $push, $pop, $rename, $bit
    • Rich queries and aggregators: $in, $all, $exists, $size, $type, regexp; count(), size(), distinct(), min(), max()
    • Map/Reduce
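To make the update operators on slide 10 concrete, here is a plain-Ruby sketch that mimics what `$inc`, `$set`, `$unset` and `$push` do to a single document. It works on an in-memory Hash rather than a MongoDB server, so it shows only the effect of each operator, not the server-side atomicity. `apply_update` is a hypothetical helper (not part of any driver), it handles top-level fields only, and the sample post is made up.

```ruby
# Plain-Ruby illustration of a few MongoDB update operators.
# `apply_update` is a hypothetical helper, not a driver API. It handles
# top-level fields only and mutates the document in place.
def apply_update(doc, update)
  update.each do |operator, fields|
    fields.each do |field, arg|
      case operator
      when '$inc'   then doc[field] = doc.fetch(field, 0) + arg
      when '$set'   then doc[field] = arg
      when '$unset' then doc.delete(field)
      when '$push'  then (doc[field] ||= []) << arg
      else raise ArgumentError, "unsupported operator: #{operator}"
      end
    end
  end
  doc
end

# Made-up sample document
post = { 'title' => 'Post ABC', 'views' => 10, 'draft' => true }

update = {
  '$inc'   => { 'views' => 1 },                    # increment a counter
  '$set'   => { 'title' => 'Post ABC (edited)' },  # overwrite a field
  '$unset' => { 'draft' => 1 },                    # remove a field
  '$push'  => { 'tags' => 'mongodb' }              # append to an array
}

apply_update(post, update)
```

On a real server the whole update document is applied to the matched document as one atomic step, which is what lets it replace a read-modify-write cycle in application code.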
  11. Changes for performance & scaling
    • Very fast and ready to scale
    • => No need for additional tools (memcached ...)
    • No need to change platforms
  12. Store crawled info as embedded documents
    • Data comes from 3rd-party sources
    • Sources and data formats can change in the future
  13. Store crawled info as embedded documents
    product = {
      "_id" : ObjectId("4d8ace4b0dc3e43231bb930d"),
      "name" : "Product ABC",
      "amazon" : { "asin" : ..., "price" : ..., ... }
    };
  14. Store crawled info as embedded documents
    product = {
      "_id" : ObjectId("4d8ace4b0dc3e43231bb930d"),
      "name" : "Product ABC",
      "amazon" : { "asin" : ..., "price" : ..., "shipping_cost" : ..., ... }
    };
  15. Store crawled info as embedded documents
    product = {
      "_id" : ObjectId("4d8ace4b0dc3e43231bb930d"),
      "name" : "Product ABC",
      "amazon" : { "asin" : ..., "price" : ..., "shipping_cost" : ..., ... },
      "walmart" : { "price" : ..., ... }
    };
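The progression on slides 13 to 15 (a source gains a field, then a second source appears) can be sketched in plain Ruby. This is only an illustration on an in-memory Hash: `store_crawled` is a hypothetical helper, and the ASIN and prices are made-up placeholder values, not data from the slides.

```ruby
# Hypothetical helper: keep whatever a crawler returned under a key named
# after its source. Existing fields for that source are kept and new ones
# are merged in, so no schema change is needed when a source adds fields
# or when a new source appears.
def store_crawled(product, source, data)
  product[source] = (product[source] || {}).merge(data)
  product
end

product = { 'name' => 'Product ABC' }

# First crawl: two fields from one source (placeholder values)
store_crawled(product, 'amazon', { 'asin' => 'EXAMPLE-ASIN', 'price' => 19.99 })

# Later the same source starts exposing one more field
store_crawled(product, 'amazon', { 'shipping_cost' => 4.5 })

# Later still, a second source is added
store_crawled(product, 'walmart', { 'price' => 18.75 })
```

Each source lives in its own embedded document, so a format change in one source does not touch the others.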
  16. Product listing (SQL)
    • Need an extra table to express which product is listed in which category and in which month
      product_id  category_id  month
      1           2            2011-03
      1           2            2011-04
  17. Product listing (SQL)
    • To query products listed in category 2 and month '2011-04':
      Product.joins(:listings).where('category_id = ? AND month = ?', 2, '2011-04')
  18. Product listing (Mongo)
    • Store listings in the product itself
      product = {
        "_id" : ObjectId("4d8ace4b0dc3e43231bb930d"),
        "name" : "Product ABC",
        "listings" : [ [1, "2011-01"], [1, "2011-04"], [3, "2011-01"] ]
      };
  19. Product listing (Mongo)
    • Store listings in the product itself
      product = {
        "_id" : ObjectId("4d8ace4b0dc3e43231bb930d"),
        "name" : "Product ABC",
        "listings" : [ [1, "2011-01"], [1, "2011-04"], [3, "2011-01"] ]
      };
    • Query is simpler
      Product.where("listings" => [1, '2011-04'])
  20. Product listing (Mongo)
    • Store listings in the product itself
      product = {
        "_id" : ObjectId("4d8ace4b0dc3e43231bb930d"),
        "name" : "Product ABC",
        "listings" : [ [1, "2011-01"], [1, "2011-04"], [3, "2011-01"] ]
      };
    • Query is simpler
      Product.where("listings" => [1, '2011-04'])
    • Can index the listings array
      db.products.ensureIndex({"listings" : 1})
  21. Product listing (Mongo)
    • Clearer, but uses more storage
      product = {
        "_id" : ObjectId("4d8ace4b0dc3e43231bb930d"),
        "name" : "Product ABC",
        "listings" : [
          { "category_id" : 1, "month" : "2011-01" },
          { "category_id" : 1, "month" : "2011-04" },
          { "category_id" : 3, "month" : "2011-01" }
        ]
      };
      db.products.find({ "listings" : { "category_id" : 1, "month" : "2011-04" } })
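The listing queries on slides 19 to 21 rely on MongoDB matching a document when its array field contains an element equal to the queried value. A minimal in-memory sketch of that matching rule, using the pair layout from slide 18, might look like this. `listed_in` is a hypothetical stand-in for `Product.where("listings" => [category_id, month])`, and "Product XYZ" is a made-up second product added for contrast.

```ruby
products = [
  { 'name' => 'Product ABC',
    'listings' => [[1, '2011-01'], [1, '2011-04'], [3, '2011-01']] },
  # Made-up second product, listed only in category 2
  { 'name' => 'Product XYZ',
    'listings' => [[2, '2011-04']] }
]

# Hypothetical in-memory stand-in for the Mongo query: a product matches
# when its listings array contains the exact [category_id, month] pair.
def listed_in(products, category_id, month)
  products.select { |product| product['listings'].include?([category_id, month]) }
end

names = listed_in(products, 1, '2011-04').map { |product| product['name'] }
```

Because the pair is compared as a whole, a product listed in category 3 in '2011-01' and category 1 in '2011-04' does not match a query for category 3 in '2011-04'.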
  22. Find unique slug (SQL)
    book1 = #<Book id: .., title => "Ruby", ... >
    book2 = #<Book id: .., title => "Ruby", ... >
    book2.uniq_slug => /books/ruby-1
    • Needs n queries to find a unique slug
      def uniq_slug
        slug = original_slug = title.to_slug
        counter = 0
        ...
  23. Find unique slug (Mongo)
    • Needs one query using regexp matching
      def find_uniq_slug
        original_slug = title.to_slug
        slug_pattern = /^#{original_slug}(-\d+)?$/
        # .first returns a single document, or nil when no slug matches
        book = where(:slug => slug_pattern).order(:slug.desc).limit(1).first
        if book
          # A bare slug such as "ruby" has no numeric suffix, so the counter is 0
          max_counter = book.slug[/-(\d+)$/, 1].to_i
          "#{original_slug}-#{max_counter + 1}"
        else
          original_slug
        end
      end
      db.books.ensureIndex({"slug" : -1 })
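The suffix logic on slide 23 can be tried without a database. The sketch below is a hypothetical pure-Ruby variant: `next_uniq_slug` and its `existing_slugs` argument (standing in for the slugs the regexp query would return) are assumptions, not code from the slides. It takes the numeric maximum of the suffixes instead of sorting slugs as strings, since a string sort places "ruby-9" after "ruby-10".

```ruby
# Hypothetical helper: `existing_slugs` stands in for the slugs that the
# regexp query would return from the books collection.
def next_uniq_slug(original_slug, existing_slugs)
  pattern = /\A#{Regexp.escape(original_slug)}(?:-(\d+))?\z/

  # Collect the numeric suffix of every matching slug; a bare slug
  # ("ruby") has no suffix and counts as 0.
  counters = existing_slugs
             .map { |slug| pattern.match(slug) }
             .compact
             .map { |match| match[1].to_i }

  return original_slug if counters.empty?

  "#{original_slug}-#{counters.max + 1}"
end

first  = next_uniq_slug('ruby', [])
second = next_uniq_slug('ruby', ['ruby'])
later  = next_uniq_slug('ruby', ['ruby', 'ruby-1', 'ruby-9', 'ruby-10'])
```

Here `first` is the bare slug, `second` gets the suffix `-1`, and `later` continues from the highest existing counter.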
  24. Voting
    • A user can vote on each post only once
    • Up and down votes have different points
    • Cache votes_count and votes_point in the post for sorting and querying
    • Post.max(:votes_point)
    • Post.order_by(:votes_count.desc)
  25. Voting (SQL)
    def vote(user_id, post_id, value)
      # Validate
      not_voted = Vote.where(:user_id => user_id, :post_id => post_id).count == 0
      if not_voted
        # Create a new vote
        Vote.create(
          :user_id => user_id,
          :post_id => post_id,
          :value => value
        )
        # Get post
        post = Post.find(post_id)
        # Update votes_point & votes_count
        post.votes_point += POINT[value]
        post.votes_count += 1
        post.save
      end
    end
  26. Voting (SQL): the vote method from slide 25 takes 4 requests (count, create, find, save)
  27. Voting (SQL)
    def unvote(user_id, post_id)
      # Get current vote
      vote = Vote.where(:user_id => user_id, :post_id => post_id).first
      # Check if voted
      if vote
        # Destroy vote
        vote.destroy
        # Get post
        post = Post.find(post_id)
        # Update votes_point & votes_count
        post.votes_point -= POINT[vote.value]
        post.votes_count -= 1
        post.save
      end
    end
  28. Voting (SQL): the unvote method from slide 27 takes 4 requests (first, destroy, find, save)
  29. Voting (Mongo)
    • Embed votes data in the post
    • Use arrays to store who voted up and who voted down
      post = {
        "_id" : ObjectId("4d8ace4b0dc3e43231bb930d"),
        "title" : "Post ABC",
        ...
        "votes" : {
          "up" : [ user_id_1 ],
          "down" : [ user_id_2 ],
          "count" : 2,
          "point" : -1
        }
      };
  30. Voting (Mongo)
    def vote(user_id, post_id, value)
      # Find the post with post_id that was not up voted or down voted by user_id
      query = {
        '_id' => post_id,
        'votes.up' => { '$ne' => user_id },
        'votes.down' => { '$ne' => user_id }
      }
      # Push user_id to votes.up if voting up, or to votes.down if voting down,
      # and update votes.point and votes.count
      update = {
        '$push' => { (value == :up ? 'votes.up' : 'votes.down') => user_id },
        '$inc' => { 'votes.point' => POINT[value], 'votes.count' => +1 }
      }
      # Validate, update and get the result
      post = Post.collection.find_and_modify(
        :query => query,
        :update => update,
        :new => true # return the post after the votes data is updated
      )
    end
  31. Voting (Mongo): the vote method from slide 30 takes one request
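The guard-then-update shape of the Mongo `vote` method on slides 30 and 31 can be imitated on an in-memory Hash. This is only a sketch of the logic: the `POINT` values are an assumption (the slides never define them; +1 and -2 are chosen because they reproduce the `point` of -1 shown on slide 29), the user ids are made up, and plain Ruby gives none of the atomicity that the single `find_and_modify` call provides on the server.

```ruby
# Assumed point values; the slides do not define POINT.
POINT = { up: 1, down: -2 }.freeze

# In-memory stand-in for the single find_and_modify call. The guard plays
# the role of the query, the mutation plays the role of the update.
# Returns the updated post, or nil when the user already voted
# (which corresponds to no document matching the query).
def vote(post, user_id, value)
  votes = post['votes']
  return nil if votes['up'].include?(user_id) || votes['down'].include?(user_id)

  votes[value == :up ? 'up' : 'down'] << user_id
  votes['point'] += POINT[value]
  votes['count'] += 1
  post
end

post = {
  'title' => 'Post ABC',
  'votes' => { 'up' => [], 'down' => [], 'count' => 0, 'point' => 0 }
}

vote(post, 'user_1', :up)
vote(post, 'user_2', :down)
second_try = vote(post, 'user_1', :down) # rejected: user_1 already voted
```

In the Mongo version the check and the update travel in one command, so two concurrent votes by the same user cannot both pass the guard.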
  32. Voting (Mongo)
    def unvote(user_id, post_id)
      # Find the post with post_id that was up voted or down voted by user_id
      query = {
        '_id' => post_id,
        '$or' => [ { 'votes.up' => user_id }, { 'votes.down' => user_id } ]
      }
      # Pull user_id from both votes.up and votes.down,
      # and update votes.point and votes.count.
      # `value` (the direction of the vote being removed) is not defined
      # on the slide and must be supplied by the caller.
      update = {
        '$pull' => { 'votes.up' => user_id, 'votes.down' => user_id },
        '$inc' => { 'votes.point' => -POINT[value], 'votes.count' => -1 }
      }
      # Validate, update and get the result
      post = Post.collection.find_and_modify(
        :query => query,
        :update => update,
        :new => true # return the post after the votes data is updated
      )
    end
  33. Voting (Mongo): the unvote method from slide 32 takes one request
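The `unvote` method on slides 32 and 33 needs to know which way the user voted in order to reverse the points. One possible way to handle that, sketched here on an in-memory Hash, is to derive the direction from whichever array holds the user id. Everything in the sketch is an assumption layered on the slides: the `POINT` values (+1 and -2), the user ids, and the in-memory `unvote` helper itself, which does not reproduce the single-request behaviour of `find_and_modify`.

```ruby
# Assumed point values; the slides do not define POINT.
POINT = { up: 1, down: -2 }.freeze

# In-memory sketch: returns the updated post, or nil when the user has
# not voted on it (which corresponds to no document matching the query).
def unvote(post, user_id)
  votes = post['votes']

  # Work out the direction of the existing vote from the arrays.
  value = if votes['up'].include?(user_id)
            :up
          elsif votes['down'].include?(user_id)
            :down
          end
  return nil unless value

  # Same effect as '$pull' on both arrays, then reverse the counters.
  votes['up'].delete(user_id)
  votes['down'].delete(user_id)
  votes['point'] -= POINT[value]
  votes['count'] -= 1
  post
end

post = {
  'title' => 'Post ABC',
  'votes' => { 'up' => ['user_1'], 'down' => ['user_2'], 'count' => 2, 'point' => -1 }
}

unvote(post, 'user_2')            # removes the down vote
not_found = unvote(post, 'user_3') # user_3 never voted
```

With a real collection the same idea would mean sending a different update depending on the direction, for example one conditional update per array.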
  34. References
    • Introduction to MongoDB
      http://scribd.com/doc/26506063/Introduction-To-MongoDB
      http://slideshare.net/jnunemaker/why-mongodb-is-awesome
    • Schema Design
      http://slideshare.net/kbanker/mongodb-schema-design-mongo-chicago
    • Indexing & Query Optimization
      http://slideshare.net/mongodb/indexing-with-mongodb
      http://slideshare.net/mongodb/mongodb-indexing-the-details