Rails with Massive Data

Ruby Tuesday #21 (Taiwan)

Yi-Ting Cheng

August 19, 2012

Transcript

  1. Agenda
    • Don’t use ActiveRecord
    • Don’t use ActiveRecord
    • Don’t use ActiveRecord
    • Don’t use ActiveRecord
    • Don’t use ActiveRecord
    • ............ Unless you know what you’re doing
    Sunday, August 19, 2012
  2. typical usage
        posts = Post.where(:board_id => 5)
        posts.each do |post|
          post.board_id = 1
          post.save
        end
    ~1,000 records: cool
    ~1,000,000 records: hell
  3. problems
        posts = Post.where(:board_id => 5)
        posts.each do |post|
          post.board_id = 1
          post.save
        end
    • loads ~1,000,000 objects into memory
    • triggers ~1,000,000 callbacks
    • runs a DB transaction for every save
    • updates DB indexes on every write
  4. problems
    • memory bloat
    • too many callbacks
    • too many DB transactions
    • slow queries (updating DB indexes)
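The costs listed above can be made concrete with a toy model in plain Ruby, with no Rails involved. `FakePost` and its counters are hypothetical stand-ins, not ActiveRecord classes; the sketch only counts how many objects are built and how many callbacks and transactions a save-per-record loop would trigger.

```ruby
# Toy model, not ActiveRecord: FakePost and its counters are hypothetical
# stand-ins used only to count the work done by a save-per-record loop.
class FakePost
  class << self
    attr_accessor :instantiated, :callbacks_run, :transactions
  end
  self.instantiated = 0
  self.callbacks_run = 0
  self.transactions = 0

  attr_accessor :board_id

  def initialize(board_id)
    @board_id = board_id
    self.class.instantiated += 1
  end

  # Mimics a save that opens its own transaction and fires one callback.
  def save
    self.class.transactions += 1
    self.class.callbacks_run += 1
    true
  end
end

# Naive loop: every row becomes an object and is saved individually.
def naive_move(row_count, to:)
  posts = Array.new(row_count) { FakePost.new(5) }
  posts.each do |post|
    post.board_id = to
    post.save
  end
  posts.size
end

naive_move(10_000, to: 1)
puts "objects: #{FakePost.instantiated}, " \
     "callbacks: #{FakePost.callbacks_run}, " \
     "transactions: #{FakePost.transactions}"
```

Every counter grows linearly with the row count, which is why the same loop that is fine for a thousand rows becomes painful for a million.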
  5. update_all
    Instead of:
        posts = Post.where(:board_id => 5)
        posts.each do |post|
          post.board_id = 1
          post.save
        end
    use:
        Post.update_all({:board_id => 1}, {:board_id => 5})
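The two-argument `update_all` on this slide is Rails 3 syntax. From Rails 4 onward `update_all` takes only the updates, and the conditions go through `where`. A minimal sketch of the newer form follows; the helper name `move_posts` is hypothetical, and `model` is assumed to be an ActiveRecord class such as `Post`.

```ruby
# Hypothetical helper; `model` is assumed to be an ActiveRecord model class
# such as Post (anything responding to where(...).update_all(...) works).
def move_posts(model, from:, to:)
  # Sends one UPDATE ... SET board_id = to WHERE board_id = from.
  # No records are instantiated, and callbacks and validations are skipped.
  model.where(board_id: from).update_all(board_id: to)
end
```

In a Rails app this would be called as `move_posts(Post, from: 5, to: 1)`. `update_all` returns the number of rows affected. Because callbacks are skipped, this only fits cases where nothing depends on them.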
  6. find_in_batches
        Post.find_in_batches(:conditions => "board_id = 5", :batch_size => 1000) do |posts|
          posts.each do |post|
            post.board_id = 1
            post.save
          end
        end
    loads only ~1,000 objects into memory at a time
  7. transaction
        Post.find_in_batches(:conditions => "board_id = 5", :batch_size => 1000) do |posts|
          Post.transaction do
            posts.each do |post|
              post.board_id = 1
              post.save
            end
          end
        end
    only ~1,000 transactions
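The `:conditions` option to `find_in_batches` is also Rails 3 syntax; newer versions chain `where` first. Below is a sketch combining the batching and the per-batch transaction from the two slides above, assuming `model` is an ActiveRecord class such as `Post`. The helper name `move_posts_in_batches` is made up for illustration.

```ruby
# Hypothetical helper; `model` is assumed to be an ActiveRecord model class.
def move_posts_in_batches(model, from:, to:, batch_size: 1000)
  moved = 0
  # find_in_batches yields arrays of at most batch_size records,
  # so only one batch of objects is held in memory at a time.
  model.where(board_id: from).find_in_batches(batch_size: batch_size) do |posts|
    # One transaction per batch instead of one per save.
    model.transaction do
      posts.each do |post|
        post.board_id = to
        post.save! # raises on failure, which rolls back the current batch
        moved += 1
      end
    end
  end
  moved
end
```

Callbacks still run for every record here; only the object count in memory and the number of transactions go down.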
  8. sneaky-save (gem)
        posts = Post.where(:board_id => 5)
        posts.each do |post|
          post.board_id = 1
          post.sneaky_save
        end
    skips ~1,000,000 * n callbacks
  9. select only needed
        posts = Post.where("id < 10")
        Post Load (18.8ms) SELECT `posts`.* FROM `posts` WHERE (id < 10)
    "post.content" is ~100 KB per record; 10,000 records ≈ 1 GB
        Post.select("column1, column2").where(...)
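The memory estimate on this slide is simple multiplication, worked through below in plain Ruby. The figures (about 100 KB per `content` value, 10,000 rows) come from the slide; the constant names are illustrative only.

```ruby
# Back-of-the-envelope estimate using the slide's figures.
CONTENT_BYTES_PER_ROW = 100 * 1024 # ~100 KB of text per post.content
ROW_COUNT             = 10_000

total_bytes = CONTENT_BYTES_PER_ROW * ROW_COUNT
total_gib   = total_bytes / (1024.0**3)

puts format("%d bytes (~%.2f GiB) just for the content column",
            total_bytes, total_gib)
```

Selecting only the columns that are actually needed, for example `Post.select(:id, :board_id)` in current Rails versions, keeps the large column out of the result set entirely.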
  10. move out big data
        class Post < ActiveRecord::Base
          has_one :meta
          after_create :create_meta
          delegate :content, :to => :meta
        end

        # -*- encoding : utf-8 -*-
        # == Schema Information
        #
        # Table name: post_data
        #
        #  id         :integer          not null, primary key
        #  post_id    :integer
        #  content    :text
        #  created_at :datetime         not null
        #  updated_at :datetime         not null
        #
  11. add index on foreign key
        posts = Post.where(:board_id => 5)

        add_index :posts, :board_id
  12. integer & varchar
        # -*- encoding : utf-8 -*-
        # == Schema Information
        #
        # Table name: post
        #
        #  id         :integer          not null, primary key
        #  board_id   :integer
        #  content    :text
        #  created_at :datetime         not null
        #  updated_at :datetime         not null
        #

        # -*- encoding : utf-8 -*-
        # == Schema Information
        #
        # Table name: post
        #
        #  id         :integer          not null, primary key
        #  board_id   :string(255)
        #  content    :text
        #  created_at :datetime         not null
        #  updated_at :datetime         not null
        #
    ~100x slower (with board_id stored as a varchar)
  13. delete / destroy
    • destroy is slow
    • destroy goes through callbacks
  14. delete / destroy
    • delete is also slow.....
    • DELETE updates indexes
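The difference between the two can be sketched as a pair of helpers. Both helper names are hypothetical, and `model` is assumed to be an ActiveRecord class such as `Post`; `destroy_all` and `delete_all` are the relation-level counterparts of `destroy` and `delete`.

```ruby
# Hypothetical helpers; `model` is assumed to be an ActiveRecord model class.

# Loads every matching record and calls destroy on each one,
# so destroy callbacks and dependent associations are processed.
def remove_with_callbacks(model, board_id:)
  model.where(board_id: board_id).destroy_all
end

# Sends a single DELETE statement; no objects are loaded and no callbacks run.
# The database still has to maintain its indexes for every deleted row.
def remove_without_callbacks(model, board_id:)
  model.where(board_id: board_id).delete_all
end
```

`delete_all` removes the Ruby-side overhead only; the index maintenance noted on the slide happens inside the database either way.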