Reducing Memory Usage in Ruby

Presentation about two patches to reduce memory usage in Ruby applications

Aaron Patterson

May 25, 2018

Transcript

  1. Only 70% Complete!

    >> last_year = Date.parse "September 20, 2017"
    => #<Date: 2017-09-20 ((2458017j,0s,0n),+0s,2299161j)>
    >> this_year = Date.parse "May 31, 2018"
    => #<Date: 2018-05-31 ((2458270j,0s,0n),+0s,2299161j)>
    >> ((this_year - last_year) / 365).to_f
    => 0.6931506849315069
    >> sprintf "%f%%", (1.0 - ((this_year - last_year) / 365).to_f) * 100
    => "30.684932%"
  2. Malloc Stack Logging

    $ MallocStackLoggingNoCompact=1 \
      RAILS_ENV=production \
      bin/rails r 'p $$; GC.start; $stdin.getc'

    MallocStackLoggingNoCompact=1   Enable the Logger
    p $$                            Print the PID
    GC.start                        Clean any garbage
    $stdin.getc                     Pause the process
  3. Log File Size

    $ ls -alh trunk_log.log
    -rw-r--r-- 1 aaron staff 6.2G Mar 12 10:42 trunk_log.log
  4. File Contents

    ALLOC 0x7fb1fa600940-0x7fb1fa600b4f [size=528]: thread_7fff8b218340 | start | main | ruby_init | ruby_setup | Init_BareVM | rb_objspace_alloc | calloc | malloc_zone_calloc
    FREE 0x7fb1fa603730: thread_7fff8b218340 | start | main | ruby_init | ruby_setup | rb_call_inits | Init_Encoding | rb_define_method | rb_add_method_cfunc | rb_add_method | rb_method_entry_make | rb_id_table_insert | rb_id_table_insert_key | hash_table_extend | ruby_xfree | ruby_sized_xfree | objspace_xfree | free
  5. Reconciling Live Memory

    allocs = {}   # address => size of each live allocation
    total = 0     # running total of live bytes

    File.open(ARGV[0], "r") do |f|
      f.each_line do |line|
        case line
        when /^(?:ALLOC)\s*([^\s]+)\s+\[size=(\d+)\]:/
          from, to = *$1.split('-', 2)
          size = $2.to_i
          total += size
          allocs[from] = size
          puts total
        when /^(?:FREE)\s*([^\s]+):\s/
          total -= allocs.fetch($1)
          allocs.delete $1
        end
      end
    end

    p allocs      # whatever is left was never freed
  6. Top 20 malloc Callers

    rb_ast_newnode            16%
    new_insn_body             12%
    iseq_setup                10%
    prepare_iseq_build         9%
    ary_resize_capa            8%
    st_init_table_with_size    6%
    io_fillbuf                 6%
    new_insn_send              5%
    rb_ary_modify              5%
    str_new0                   4%
    heap_assign_page           4%
    rb_iseq_new_with_opt       3%
    iseq_compile_each0         2%
    local_push_gen             2%
    CRYPTO_malloc              2%
    rb_str_resize              1%
    rb_str_buf_new             1%
    __opendir_common           1%
    rb_ast_new                 1%
    ruby_strdup                1%
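The slides do not show how the per-caller figures were computed. One plausible approach, sketched here under assumptions, is to attribute each live allocation to the innermost stack frame that is not part of the allocator itself. The `ALLOCATOR_FRAMES` list and the helper names are hypothetical; the two sample lines are the ones shown on slide 4.

```ruby
# Frames treated as "the allocator" rather than "the caller".
# This list is an assumption made for the sketch, not taken from the talk.
ALLOCATOR_FRAMES = %w[
  malloc calloc realloc
  malloc_zone_malloc malloc_zone_calloc malloc_zone_realloc
  ruby_xmalloc ruby_xcalloc ruby_xrealloc
  objspace_xmalloc objspace_xcalloc objspace_xrealloc
].freeze

# Innermost frame that is not an allocator frame.
def caller_of(frames)
  frames.reverse.find { |frame| !ALLOCATOR_FRAMES.include?(frame) }
end

# Sum the bytes that are still live, grouped by calling function.
def live_bytes_by_caller(lines)
  live = {} # start address => [caller, size]
  lines.each do |line|
    if (m = line.match(/^ALLOC\s+(\S+)\s+\[size=(\d+)\]:\s*(.*)$/))
      from, _to = m[1].split('-', 2)
      frames = m[3].split('|').map(&:strip)
      live[from] = [caller_of(frames), m[2].to_i]
    elsif (m = line.match(/^FREE\s+(\S+):\s/))
      live.delete(m[1]) # unknown addresses are ignored
    end
  end

  totals = Hash.new(0)
  live.each_value { |name, size| totals[name] += size }
  totals.sort_by { |_name, bytes| -bytes }.to_h
end

SAMPLE = [
  "ALLOC 0x7fb1fa600940-0x7fb1fa600b4f [size=528]: thread_7fff8b218340 | start | main | ruby_init | ruby_setup | Init_BareVM | rb_objspace_alloc | calloc | malloc_zone_calloc",
  "FREE 0x7fb1fa603730: thread_7fff8b218340 | start | main | ruby_init | ruby_setup | rb_call_inits | Init_Encoding | rb_define_method | rb_add_method_cfunc | rb_add_method | rb_method_entry_make | rb_id_table_insert | rb_id_table_insert_key | hash_table_extend | ruby_xfree | ruby_sized_xfree | objspace_xfree | free"
].freeze

p live_bytes_by_caller(SAMPLE) # => {"rb_objspace_alloc"=>528}
```

Dividing each total by the sum of all totals would give percentages comparable to the list above.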
  7. Shared Strings

    x = '/a/b/c.rb'
    a = x.dup
    b = x[1, x.length - 1]

    [Diagram: x, a, and b all refer to one character buffer, "/a/b/c.rb"]
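To check whether two strings really share a buffer, one option is to read the flags that `ObjectSpace.dump` reports. This is only a sketch: the `string_flags` helper is hypothetical, the flags are CRuby implementation details that vary between versions, and short strings such as '/a/b/c.rb' are typically embedded in the object rather than shared, so the example uses a much longer string.

```ruby
require 'objspace'
require 'json'

# Report the "embedded" and "shared" flags that ObjectSpace.dump emits for a
# string. Both keys are omitted from the dump when the flag is not set.
def string_flags(str)
  info = JSON.parse(ObjectSpace.dump(str))
  { embedded: info.fetch("embedded", false), shared: info.fetch("shared", false) }
end

x = '/a/b/c.rb' * 200          # long enough that current CRuby should not embed it
a = x.dup                      # copy of the whole string
b = x[1, x.length - 1]         # everything after the first character

{ x: x, a: a, b: b }.each do |name, str|
  puts "#{name}: #{str.bytesize} bytes, #{string_flags(str)}"
end
```

No expected output is given because the reported flags depend on the Ruby version in use.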
  8. Requiring The Same File

    require '/a/b/c.rb'
    require '/a/b/c'

    $LOAD_PATH.unshift "/"
    require 'a/b/c.rb'
    require 'a/b/c'

    $LOAD_PATH.unshift "/a"
    require 'b/c.rb'
    require 'b/c'

    $LOAD_PATH.unshift "/a/b"
    require 'c.rb'
    require 'c'
  9. Cache Structure

    features_index = {
      '/a/b/c.rb' => 2,
      '/a/b/c' => 2,
      'a/b/c.rb' => 2,
      'b/c.rb' => 2,
      'b/c' => 2,
      'c.rb' => 2,
      'c' => 2
    }
  10. Generation Algorithm

    def features_index_add(feature, index)
      ext = feature.index('.')
      p = ext ? ext : feature.length

      loop do
        p -= 1
        while p > 0 && feature[p] != '/'
          p -= 1
        end
        break if p == 0

        short_feature = feature[p + 1, feature.length - p - 1] # New Ruby Object
        features_index_add_single(short_feature, index)

        if ext
          # slice out the file extension if there is one
          short_feature = feature[p + 1, ext - p - 1] # New Ruby Object + malloc
          features_index_add_single(short_feature, index)
        end
      end
    end
  11. Key Generation

    require '/a/b/c.rb'

    /a/b/c.rb
    /a/b/c
    a/b/c.rb
    a/b/c
    b/c.rb
    b/c
    c.rb
    c

    [Diagram: character buffer "/a/b/c.rb"]
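The key list on this slide can be reproduced with a few lines of Ruby. This is a simplified sketch rather than the algorithm in `load.c` or on the previous slide: `feature_keys` is a hypothetical helper, and it uses `File.extname` to find the extension.

```ruby
# Return every lookup key for a loaded feature: the full path and each
# suffix that starts after a '/', both with and without the file extension.
def feature_keys(feature)
  ext = File.extname(feature) # ".rb", or "" when there is no extension

  # Offsets at which a key may begin: 0, plus the position after each '/'.
  starts = [0]
  feature.each_char.with_index { |ch, i| starts << i + 1 if ch == '/' }

  starts.flat_map do |start|
    with_ext = feature[start..-1]
    next [] if with_ext.empty?
    ext.empty? ? [with_ext] : [with_ext, with_ext.delete_suffix(ext)]
  end
end

p feature_keys('/a/b/c.rb')
# => ["/a/b/c.rb", "/a/b/c", "a/b/c.rb", "a/b/c", "b/c.rb", "b/c", "c.rb", "c"]
```

Every string returned here is a fresh Ruby object, which is the cost the shared-substring patch is meant to avoid.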
  12. Key Generation

    require '/a/b/c.rb'

    Keys: /a/b/c.rb /a/b/c a/b/c.rb a/b/c b/c.rb b/c c.rb c

    [Diagram: buffer "/a/b/c.rb"; rb_substr( ) calls; separate buffers "/a/b/c", "a/b/c", "b/c", "c"]
  13. Key Generation

    require '/a/b/c.rb'

    Keys: /a/b/c.rb /a/b/c a/b/c.rb a/b/c b/c.rb b/c c.rb c

    [Diagram: buffer "/a/b/c.rb"; rb_substr( ) calls; one buffer "/a/b/c"]
  14. Cache Structure

    Loaded Feature Cache (Hash)

    Keys: /a/b/c.rb /a/b/c a/b/c.rb a/b/c b/c.rb b/c c.rb c

    [Diagram: two character buffers, "/a/b/c.rb" and "/a/b/c"]
  15. Implementation

    From bec1637da7fc5bafd9c91ba6443ad38c29ec656f Mon Sep 17 00:00:00 2001
    From: Aaron Patterson <[email protected]>
    Date: Fri, 9 Feb 2018 13:14:27 -0800
    Subject: [PATCH] Use shared substrings in feature index cache hash

    Before this patch, `features_index_add` would use `rb_str_subseq` to get a substring of the feature being added to the loaded features list. `features_index_add_single` would use `ruby_strdup` to copy that string and use it as a hash key in `loaded_features_index`.

    This patch changes `features_index_add` to index in to the underlying character array stored in the Ruby string, and use that as the hash key without copying its contents. The cache also needs keys that do not contain file extensions, so this patch will allocate one new string that does not contain the file extension, then indexes in to that character array rather than use substrings.

    The strings that do not have the file extension are added to a new array on the VM `loaded_features_index_pool` to ensure liveness. The loaded features array already ensures liveness of the strings *with* file extensions.
    ---
     load.c    | 42 ++++++++++++++++++++++++++----------------
     vm.c      |  1 +
     vm_core.h |  1 +
     3 files changed, 28 insertions(+), 16 deletions(-)

    diff --git a/load.c b/load.c
    index fe1d0280bf..ec046db209 100644
    --- a/load.c
    +++ b/load.c
    @@ -166,6 +166,12 @@ get_loaded_features_index_raw(void)
         return GET_VM()->loaded_features_index;
     }

    +static VALUE
    +get_loaded_features_index_pool_raw(void)
    +{
    +    return GET_VM()->loaded_features_index_pool;
    +}
    +
     static st_table *
  16. Output

    Ruby 2.5

    {["features.rb", 6, :T_STRING]=>[91, 0, 0, 0, 0, 0],
     ["features.rb", 6, :T_DATA]=>[3, 0, 0, 0, 0, 0],
     ["features.rb", 6, :T_FILE]=>[1, 0, 0, 0, 0, 0],
     ["features.rb", 6, :T_ARRAY]=>[5, 0, 0, 0, 0, 0],
     ["features.rb", 6, :T_IMEMO]=>[3, 0, 0, 0, 0, 0],
     ["features.rb", 6, :T_HASH]=>[2, 0, 0, 0, 0, 0]}

    Ruby 2.6

    {["features.rb", 6, :T_STRING]=>[50, 0, 0, 0, 0, 0],
     ["features.rb", 6, :T_DATA]=>[3, 0, 0, 0, 0, 0],
     ["features.rb", 6, :T_FILE]=>[1, 0, 0, 0, 0, 0],
     ["features.rb", 6, :T_ARRAY]=>[4, 0, 0, 0, 0, 0],
     ["features.rb", 6, :T_IMEMO]=>[3, 0, 0, 0, 0, 0],
     ["features.rb", 6, :T_HASH]=>[2, 0, 0, 0, 0, 0]}
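The slide does not name the tool that produced these per-line counts. As a coarser stand-in that needs only the core library, the sketch below diffs `ObjectSpace.count_objects` around a block while GC is disabled. The `allocations_by_type` helper is hypothetical, and the counts include a little noise from the measurement itself.

```ruby
# Count how many objects of each type a block leaves behind.
# GC is disabled while the block runs so nothing is swept in between.
def allocations_by_type
  GC.start
  was_disabled = GC.disable          # returns true if GC was already off
  before = ObjectSpace.count_objects
  yield
  after = ObjectSpace.count_objects

  diff = {}
  after.each do |type, count|
    next unless type.to_s.start_with?('T_')   # skip :TOTAL and :FREE
    delta = count - before.fetch(type, 0)
    diff[type] = delta if delta > 0
  end
  diff
ensure
  GC.enable unless was_disabled
end

kept = []
result = allocations_by_type do
  50.times { |i| kept << "feature #{i}" }
end
p result
```

Unlike the output above, this reports totals per object type only, with no file or line attribution.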
  17. Object Allocations

    [Bar chart: object allocations (axis 0 to 100) by type (T_STRING, T_DATA, T_FILE, T_ARRAY, T_IMEMO, T_HASH), Ruby 2.5 vs Ruby 2.6]
  18. Stack VM

    Instructions (Program), each an Instruction plus an optional Operand:
      [ :push, 3 ]
      [ :push, 5 ]
      [ :add ]
    Stack (Workspace): 3, 5, 8
    PC
  19. Processing Phases

    Product: Source Code (text), AST, Linked List, Byte Code
    Phases: Parsing, Compiling, Optimizations
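CRuby exposes parts of this pipeline through the `RubyVM` constant, which makes it possible to look at the real output for a small program. This is a sketch for CRuby only: `RubyVM::AbstractSyntaxTree` exists from Ruby 2.6 on, and the exact listing printed by `disasm` varies between versions.

```ruby
source = "3 + 5"

# Parsing: source text to AST (available in CRuby 2.6 and later).
if defined?(RubyVM::AbstractSyntaxTree)
  p RubyVM::AbstractSyntaxTree.parse(source)
end

# Parsing and compiling in one step; the result wraps the byte code.
iseq = RubyVM::InstructionSequence.compile(source)

puts iseq.disasm   # human-readable instruction listing
p iseq.eval        # run the compiled code, => 8
```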
  20. Source to AST

    Ruby Code (text): 3 + 5
    AST: + with children 3 and 5
    AST nodes are Ruby Objects! (T_NODE)
  21. AST to Linked List

    AST: + with children 3 and 5, Ruby Objects! (T_NODE)
    Visit each node
    Linked List: Push 3, Push 5, Add
  22. Optimization Pass

    Linked List: Push 3, Push 5, Add
    Optimized Linked List: Push 3, Push 5, Add
  23. Byte Code Translation

    Linked List: Push 3, Push 5, Add
    Translate
    Byte Code: [ 123, 3, 123, 5, 456 ]
  24. Byte Code Translation

    Linked List: Push 3, Push 5, Add
    Byte Code: [ 123, 3, 123, 5, 456 ]
    Labels: Ruby Objects! (T_NODE), Ruby Objects! (IMEMO)
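The translation step can be sketched in a few lines. An Array of `[name, operand]` pairs stands in for the linked list, and the `OPCODES` table and `translate` helper are hypothetical names that reuse the numbers from the slide.

```ruby
# Opcode numbers taken from the slide; the table itself is illustrative.
OPCODES = { push: 123, add: 456 }.freeze

# Flatten each instruction into its opcode followed by its operands.
def translate(instructions)
  instructions.flat_map do |name, *operands|
    [OPCODES.fetch(name), *operands]
  end
end

linked_list = [[:push, 3], [:push, 5], [:add]]
p translate(linked_list) # => [123, 3, 123, 5, 456]
```

Because operands sit inline next to opcodes, a reader of the byte code has to know each instruction's length to tell the two apart.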
  25. Stack VM

    Instructions (Program):
      [ :push, 3 ]
      [ :push, 5 ]
      [ :add ]
    Stack (Workspace)
    PC
  26. Stack VM

    Instructions (Program):
      [
        123, # push
        3,
        123, # push
        5,
        456  # add
      ]
    Stack (Workspace): 3, 5, 8
    PC
  27. Simple VM

    PUSH = 123
    ADD = 456
    PRINT = 789

    byte_code = [ 123, 3, 123, 5, 456, 789 ]

    pc = 0
    stack = []

    # Virtual Machine Loop
    loop do
      case byte_code[pc]
      when nil then break
      when PUSH
        parameter = byte_code[pc + 1]
        stack.push parameter
        pc += 1 # Extra Increment
      when ADD
        a = stack.pop
        b = stack.pop
        c = a + b
        stack.push c
      when PRINT
        puts stack.pop
      end
      pc += 1
    end
  28. Hello World

    AST: puts, <<, "hello", "world" ("hello" and "world" are each a Ruby Object)
    Linked List: Push "hello", Push "world", <<, puts ("hello" and "world" are each a Ruby Object)
  29. Hello World

    Linked List: Push "hello", Push "world", <<, puts
    Translate
    Byte Code:
      [
        123, # push
        111, # hello
        123, # push
        222, # world
        333, # <<
        444, # puts
      ]
    111 and 222 are each an Object Address that refers to a Ruby Object
  30. Hello World

    Byte Code:
      [
        123, # push
        111, # hello
        123, # push
        222, # world
        333, # <<
        444, # puts
      ]
    Byte Code:
      [
        123, # push
        "hello",
        123, # push
        "world",
        333, # <<
        444, # puts
      ]
  31. VM Implementation

    PUSH = 123
    APPEND = 333
    PRINT = 444

    byte_code = [
      123, # push
      "hello",
      123, # push
      "world",
      333, # <<
      444, # puts
    ]

    def run_vm(pc, stack, byte_code)
      # Virtual Machine Loop
      loop do
        case byte_code[pc]
        when nil then break
        when PUSH
          parameter = byte_code[pc + 1]
          stack.push parameter
          pc += 1
        when APPEND
          b = stack.pop
          a = stack.pop
          c = a << b
          stack.push c
        when PRINT
          puts stack.pop
        end
        pc += 1
      end
    end

    run_vm(0, [], byte_code) # helloworld
    run_vm(0, [], byte_code)

    Ruby: puts "hello" << "world"
  32. Stack VM

    Instructions (Program):
      [ :push, "hello" ]
      [ :push, "world" ]
      [ :append ]
    Stack (Workspace): "hello", "world"
    PC
  33. Stack VM

    Instructions (Program):
      [ :push, "hello" ]
      [ :push, "world" ]
      [ :append ]
    Stack (Workspace): "world", "hello", "helloworld"
    PC
    The first instruction is now [ :push, "helloworld" ]
  34. Stack VM

    Instructions (Program):
      [ :push, "helloworld" ]
      [ :push, "world" ]
      [ :append ]
    Stack (Workspace): "world", "helloworldworld"
    PC
    The first instruction is now [ :push, "helloworldworld" ]
  35. New VM

    PUSH = 123
    APPEND = 333
    PRINT = 444

    byte_code = [
      123, # push
      "hello",
      123, # push
      "world",
      333, # <<
      444, # puts
    ]

    def run_vm(pc, stack, byte_code)
      # Virtual Machine Loop
      loop do
        case byte_code[pc]
        when nil then break
        when PUSH
          parameter = byte_code[pc + 1]
          stack.push parameter.dup # Copy
          pc += 1
        when APPEND
          b = stack.pop
          a = stack.pop
          c = a << b
          stack.push c
        when PRINT
          puts stack.pop
        end
        pc += 1
      end
    end
  36. Stack VM

    Instructions (Program):
      [ :push, "hello" ]
      [ :push, "world" ]
      [ :append ]
    Stack (Workspace): "hello" (copy), "world" (copy)
    PC
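The difference between the VM on slide 31 and the one on slide 35 shows up when the same byte code runs twice. The sketch below folds both into one function with a hypothetical `copy_operands:` switch and returns the printed values instead of writing them, so the two behaviours can be compared side by side.

```ruby
PUSH   = 123
APPEND = 333
PRINT  = 444

# Run the byte code and return what PRINT would have written.
# copy_operands selects between pushing the operand itself and pushing a copy.
def run_vm(byte_code, copy_operands:)
  pc = 0
  stack = []
  output = []
  while (insn = byte_code[pc])
    case insn
    when PUSH
      operand = byte_code[pc + 1]
      stack.push(copy_operands ? operand.dup : operand)
      pc += 1                    # skip the operand
    when APPEND
      b = stack.pop
      a = stack.pop
      stack.push(a << b)         # String#<< mutates a in place
    when PRINT
      output << stack.pop.dup    # snapshot, so later runs cannot change it
    end
    pc += 1
  end
  output
end

# Unary plus makes sure the operand strings are mutable.
def program
  [PUSH, +"hello", PUSH, +"world", APPEND, PRINT]
end

shared = program
p run_vm(shared, copy_operands: false) # => ["helloworld"]
p run_vm(shared, copy_operands: false) # => ["helloworldworld"]

copied = program
p run_vm(copied, copy_operands: true)  # => ["helloworld"]
p run_vm(copied, copy_operands: true)  # => ["helloworld"]
```

Without the copy, the first run rewrites the operand stored in the byte code, so the program itself changes between runs.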
  37. ISeq Layout

    ISeq Object
    Byte code: [ 123, 555, 123, 456, 333, 444, ]
    Objects: "hello", "world"
    Array: Mark Array
  38. ISeq GC

    ISeq Object
    Byte code: [ 123, 555, 123, 456, 333, 444, ]
    Objects: "hello", "world"
    Array: GC mark
  39. ISeq GC

    ISeq Object
    Byte code: [ 123, 555, 123, 456, 333, 444, ]
    Objects: "hello", "world"
    Array: Duplicated Information
    The byte code holds a "Hidden" Reference to each object
  40. Bloat Graph

    [Chart: Array Size vs Array Capacity; axis: Number of Elements, 0 to 3000; series: Size, Capacity, Unused]
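The gap between an array's size and its capacity can be observed indirectly with `ObjectSpace.memsize_of`, which only changes when CRuby resizes the array's buffer. This sketch uses a plain Array rather than an ISeq mark array, and `memsize_of` is an implementation-dependent hint, so the exact byte counts are not shown.

```ruby
require 'objspace'

ary = []
sizes = []
3000.times do |i|
  ary << i
  sizes << ObjectSpace.memsize_of(ary)   # bytes reported after each append
end

# Print only the lengths at which the reported size changed.
sizes.each_with_index do |bytes, i|
  puts "length #{i + 1}: #{bytes} bytes" if i.zero? || bytes != sizes[i - 1]
end

puts "#{sizes.uniq.length} distinct sizes for #{ary.length} appends"
```

Between two resizes the array keeps unused capacity, which corresponds to the "Unused" series in the chart.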
  41. ISeq GC

    ISeq Object
    Byte code: [ 123, 555, 123, 456, 333, 444, ]
    Objects: "hello", "world"
    Array: Lives Forever!
  42. ISeq GC

    ISeq Object
    Byte code: [ 123, 555, 123, 456, 333, 444, ]
    Objects: "hello", "world"
    Array
    Decode
  43. Mark Loop

    def mark_params(pc, byte_code)
      # Virtual Machine Loop
      loop do
        case byte_code[pc]
        when nil then break
        when PUSH
          parameter = byte_code[pc + 1]
          gc_mark(parameter) # Mark
          pc += 1
        when APPEND
        when PRINT
        end
        pc += 1
      end
    end
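A runnable variant of the mark loop is sketched below. Since `gc_mark` is not available from Ruby, it yields each operand that would be marked, and it uses a hypothetical `OPERAND_COUNT` table to step over operands so that an operand is never mistaken for an opcode.

```ruby
PUSH   = 123
APPEND = 333
PRINT  = 444

# Number of operands that follow each opcode in this toy instruction set.
OPERAND_COUNT = { PUSH => 1, APPEND => 0, PRINT => 0 }.freeze

# Walk the byte code and yield every PUSH operand, which is the only place
# this toy VM stores object references.
def each_markable(byte_code)
  return enum_for(:each_markable, byte_code) unless block_given?

  pc = 0
  while pc < byte_code.length
    insn = byte_code[pc]
    yield byte_code[pc + 1] if insn == PUSH
    pc += 1 + OPERAND_COUNT.fetch(insn)   # jump to the next opcode
  end
end

byte_code = [PUSH, "hello", PUSH, "world", APPEND, PRINT]
p each_markable(byte_code).to_a # => ["hello", "world"]
```

Decoding the instructions this way finds every reference without keeping a separate array of objects to mark.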
  44. ISeq GC

    ISeq Object
    Byte code: [ 123, 555, 123, 456, 333, 444, ]
    Objects: "hello", "world"
    Array
  45. Actual Code

    commit 9e26858e8c32e7f4b6ae3bccf9896ea7b61ce335
    Author: tenderlove <tenderlove@b2dd03c8-39d4-4d8f-98ff-823fe69b080e>
    Date: Mon Mar 19 18:21:54 2018 +0000

    Reverting r62775, this should fix i686 builds

    We need to mark default values for kwarg methods. This also fixes Bootsnap. IBF iseq loading needed to mark iseqs as "having markable objects".

    git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62851 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

    diff --git a/compile.c b/compile.c
    index 71f60b9b1b..68d4bf549a 100644
    --- a/compile.c
    +++ b/compile.c
    @@ -562,15 +562,6 @@ APPEND_ELEM(ISEQ_ARG_DECLARE LINK_ANCHOR *const anchor, LINK_ELEMENT *before, LI
     #define APPEND_ELEM(anchor, before, elem) APPEND_ELEM(iseq, (anchor), (before), (elem))
     #endif

    -static int
    -iseq_add_mark_object(const rb_iseq_t *iseq, VALUE v)
    -{
    -    if (!SPECIAL_CONST_P(v)) {
    -        rb_iseq_add_mark_object(iseq, v);
    -    }
    -    return COMPILE_OK;
    -}
    -
     static int
     iseq_add_mark_object_compile_time(const rb_iseq_t *iseq, VALUE v)
     {
    @@ -749,6 +740,7 @@ rb_iseq_translate_threaded_code(rb_iseq_t *iseq)
         encoded[i] = (VALUE)table[insn];
         i += len;
  46. Basic Rails App

    [Bar chart: Number of Live Objects (axis 0 to 70000) by Object Type (T_IMEMO, T_STRING, T_ARRAY, T_CLASS, T_OBJECT, T_DATA, T_HASH, T_REGEXP, T_ICLASS, T_MODULE, T_RATIONAL, T_STRUCT, T_SYMBOL, T_BIGNUM, T_FLOAT, T_FILE, T_MATCH, T_COMPLEX), Ruby 2.5 vs Ruby 2.6; callout: Array Reduction]