Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Large Queries | Philly.rb lightning talk

Large Queries | Philly.rb lightning talk

How I handle large query result sets in ActiveRecord, and introducing my secret weapon, postgresql_cursor.

Allen Fair

July 10, 2012
Tweet

More Decks by Allen Fair

Other Decks in Technology

Transcript

  1. HUGE “FIND” • Post.find(:all) • Post.where(“id>0”).all • Loads and instantiates

    all rows in memory! • Process has memory “creep”... and may be reaped Wednesday, July 11, 2012
  2. find_each & find_in_batches • Yields each BATCH of :batch_size||1000 records

    • Starting point with :start => id? • Only “order by id” (numeric primary key) • find_each: Yields each instantiated row • find_in_batches: Yields array of :batch_size instantiated objects to the block Wednesday, July 11, 2012
  3. Cursor • DECLARE cursorname CURSOR FOR select .... • FETCH

    1000 FROM cursorname • CLOSE cursorname • PG Cursor != “DBI” Cursor (an interator) Wednesday, July 11, 2012
  4. • gem ‘postgresql_cursor’ • Post.where(...).each_row { |hash| process (hash) }

    • Post.where(...).each_instance { |m| m.process! } • Post.find_with_cursor(...) { |h| process(h) } • Post.find_by_sql_with_cursor(sql,...) {|h| do (h) } postgresql_cursor Wednesday, July 11, 2012
  5. PostgreSQL Tuning • WORK_MEM=100MB • Default: 1MB, increases connection size

    • CURSOR_TUPLE_FRACTION=1 • Default: 0.1 (Optimizes for 10% of rows) • Can be set by app during transaction • SET ‘work_mem’ TO ‘100MB’ Wednesday, July 11, 2012
  6. Now you have... • Your full query • Your sorting

    • No re-execution of queries • Joy and Happiness!!!!! Wednesday, July 11, 2012