Fixing Performance and Memory Problems (RubyWine)

Fixing Performance and Memory Problems (RubyWine)

81b61875e41eaa58887543635d556fca?s=128

Frederick Cheung

April 04, 2020
Tweet

Transcript

  1. Fixing performance & Memory problems @fglc2 / dressipi

  2. Part 1: The follow events happen in real-time

  3. Realtime recommender

  4. Realtime recommender Batch Predictions

  5. Realtime recommender Batch Predictions Event Stream Recommender Realtime Predictions

  6. Realtime recommender Batch Predictions Event Stream Recommender Realtime Predictions Results

  7. We deployed to production

  8. None
  9. API Combine Serialize Batch Predictions Realtime Predictions

  10. None
  11. None
  12. class ScoreMap def serialize to_int_array.pack("l<*") end def to_int_array @scores.flat_map do

    |key, value| [ key, (value/(scale_factor + 1).to_i] end end def scale_factor @scores.values.max_by {|n| n.abs } .abs / 1000 end end
  13. class ScoreMap def serialize to_int_array.pack("l<*") end def to_int_array @scores.flat_map do

    |key, value| [ key, (value/(scale_factor + 1)).to_i] end end def scale_factor @scores.values.max_by {|n| n.abs }
  14. class ScoreMap def serialize to_int_array.pack("l<*") end def to_int_array @scores.flat_map do

    |key, value| [ key, (value/(scale_factor + 1)).to_i] end end def scale_factor @scores.values.max_by {|n| n.abs } .abs / 1000
  15. def to_int_array @scores.flat_map do |key, value| [ key, (value/(scale_factor +

    1)).to_i] end end def scale_factor @scores.values.max_by {|n| n.abs } .abs / 1000 end end
  16. Ruby-Prof is a profiler • Records behaviour of your code

    • Produces reports on where your code spends time • Use to analyse problems not detect them
  17. None
  18. require 'ruby-prof'

  19. require 'ruby-prof' scores = load_example_data

  20. require 'ruby-prof' scores = load_example_data result = RubyProf.profile do ScoreMap.new(scores).serialize

    end
  21. require 'ruby-prof' scores = load_example_data result = RubyProf.profile do ScoreMap.new(scores).serialize

    end printer = RubyProf::GraphHtmlPrinter.new(result) File.open('out.html', 'w') do |file| printer.print(file) end
  22. None
  23. None
  24. None
  25. None
  26. None
  27. None
  28. None
  29. def to_int_array @scores.flat_map do |key, value| [ key, (value/(scale_factor +

    1)).to_i] end end def scale_factor @scores.values.max_by {|n| n.abs } .abs / 1000 end
  30. def to_int_array @scores.flat_map do |key, value| [ key, (value/(scale_factor +

    1)).to_i] end end def scale_factor @scale_factor ||= @scores.values.max_by {|n| n.abs } .abs / 1000 end
  31. None
  32. Slows down your code

  33. Can distort things

  34. ruby-prof is exact

  35. Sampling profilers

  36. Sampling profilers • Sampling profilers monitor (sample) call stack periodically

    and reports how often each method seen
  37. Sampling profilers • Sampling profilers monitor (sample) call stack periodically

    and reports how often each method seen • Sampling frequency controls accuracy vs overhead tradeoff
  38. Stackprof StackProf.run(mode: :cpu, out: 'stackprof.dump') do ScoreMap.new(scores).serialize end

  39. Stackprof StackProf.run(mode: :cpu, out: 'stackprof.dump') do ScoreMap.new(scores).serialize end

  40. ================================== Mode: cpu(1000) Samples: 435 (0.00% miss rate) GC: 2

    (0.46%) ================================== TOTAL (pct) SAMPLES (pct) FRAME 433 (99.5%) 433 (99.5%) ScoreMap#scale_factor 2 (0.5%) 2 (0.5%) (garbage collection) 433 (99.5%) 0 (0.0%) ScoreMap#to_int_array 433 (99.5%) 0 (0.0%) ScoreMap#serialize 433 (99.5%) 0 (0.0%) block in <main> 433 (99.5%) 0 (0.0%) <main> 433 (99.5%) 0 (0.0%) <main>
  41. ================================== Mode: cpu(1000) Samples: 435 (0.00% miss rate) GC: 2

    (0.46%) ================================== TOTAL (pct) SAMPLES (pct) FRAME 433 (99.5%) 433 (99.5%) ScoreMap#scale_factor 2 (0.5%) 2 (0.5%) (garbage collection) 433 (99.5%) 0 (0.0%) ScoreMap#to_int_array 433 (99.5%) 0 (0.0%) ScoreMap#serialize 433 (99.5%) 0 (0.0%) block in <main> 433 (99.5%) 0 (0.0%) <main> 433 (99.5%) 0 (0.0%) <main>
  42. Overhead

  43. Overhead • Code on its own: 6.05 s

  44. Overhead • Code on its own: 6.05 s • Rubyprof:

    ~ 27s
  45. Overhead • Code on its own: 6.05 s • Rubyprof:

    ~ 27s • Stackprof (default settings): 6.09s
  46. Part 1 Summary

  47. Part 1 Summary • Use monitoring (or angry users) to

    detect problems
  48. Part 1 Summary • Use monitoring (or angry users) to

    detect problems • Refine with stackprof (if needed)
  49. Part 1 Summary • Use monitoring (or angry users) to

    detect problems • Refine with stackprof (if needed) • Reproduce the problem in a small benchmark
  50. Part 1 Summary • Use monitoring (or angry users) to

    detect problems • Refine with stackprof (if needed) • Reproduce the problem in a small benchmark • Use rubyprof to understand / fix the problem
  51. Part 1 Summary • Use monitoring (or angry users) to

    detect problems • Refine with stackprof (if needed) • Reproduce the problem in a small benchmark • Use rubyprof to understand / fix the problem • Profit!
  52. Part 2: Memory

  53. • Bitmap GC (1.9) • Generational GC (2.1) • Incremental

    GC (2.2) • Symbol GC (2.2) • Transient Heap (2.6) • Compacting GC ( 2.7 )
  54. You still have to worry about memory

  55. NoMemoryError: failed to allocate

  56. None
  57. 0 100 200 300 400

  58. Memory leak

  59. Where do they come from

  60. Where do they come from • Leaks in native code

    (C extensions)
  61. Where do they come from • Leaks in native code

    (C extensions) • Ruby thinks you are still using it (long lived objects, caches, etc.)
  62. Where do they come from • Leaks in native code

    (C extensions) • Ruby thinks you are still using it (long lived objects, caches, etc.) • Not understanding object lifecycles
  63. class LeakingMiddleware def initialize(app) @app = app end def call(env)

    ... end def cached_thing(id) @things ||= {} @things[id] ||= Thing.find id end end
  64. class LeakingMiddleware def initialize(app) @app = app end def call(env)

    ... end def cached_thing(id) @things ||= {} @things[id] ||= Thing.find id end end
  65. Leaks are scary because

  66. Leaks are scary because • Manifest as global problem

  67. Leaks are scary because • Manifest as global problem •

    Memory usage is complicated and noisy (caches, threads, lazy load)
  68. Leaks are scary because • Manifest as global problem •

    Memory usage is complicated and noisy (caches, threads, lazy load) • Easily hidden in a dependency (see bundler-leak)
  69. Back to the problem

  70. At least a year

  71. At least a year • Slow or rare leak ~

    2 weeks to exhaust memory at 3M requests / day
  72. At least a year • Slow or rare leak ~

    2 weeks to exhaust memory at 3M requests / day • App usually restarted often enough (deploys, autoscaling)
  73. At least a year • Slow or rare leak ~

    2 weeks to exhaust memory at 3M requests / day • App usually restarted often enough (deploys, autoscaling) • Failed instances automatically detected as unhealthy and replaced
  74. None
  75. But seriously

  76. I couldn't reproduce the leak

  77. What could be happening?

  78. What could be happening? • Not picking the right endpoint

  79. What could be happening? • Not picking the right endpoint

    • Different usage patterns
  80. What could be happening? • Not picking the right endpoint

    • Different usage patterns • Environment differences
  81. Debugging in production

  82. rbtrace

  83. rbtrace • Count / trace method calls, GCs

  84. rbtrace • Count / trace method calls, GCs • Dump

    backtraces
  85. rbtrace • Count / trace method calls, GCs • Dump

    backtraces • Execute arbitrary ruby
  86. trace_object_allocations

  87. trace_object_allocations • Part of Objectspace (in stdlib)

  88. trace_object_allocations • Part of Objectspace (in stdlib) • Records details

    about each memory allocation (when/ where/what)
  89. trace_object_allocations • Part of Objectspace (in stdlib) • Records details

    about each memory allocation (when/ where/what) • Dump to a file for analysis ( heap dump )
  90. Note that this feature introduces a huge performance decrease and

    huge memory consumption.
  91. The plan

  92. The plan • Only work on one instance

  93. The plan • Only work on one instance • Use

    rbtrace to dump heap from single rails process
  94. The plan • Only work on one instance • Use

    rbtrace to dump heap from single rails process • Analyse heap dump offline
  95. I had to google this!

  96. bundle exec rbtrace --pid $PID -e ' Thread.new { require

    'objspace' ObjectSpace.trace_object_allocations_start }' Start Tracing
  97. bundle exec rbtrace --pid $PID -e ' Thread.new { require

    'objspace' ObjectSpace.trace_object_allocations_start }' Start Tracing
  98. Dump GC.start io=File.open("/tmp/heap.dump", "w"); ObjectSpace.dump_all(output: io); io.close

  99. What's in a heap dump?

  100. None
  101. "address": "0x7ff5bbdfc3f8",

  102. "address": "0x7ff5bbdfc3f8", "type": "OBJECT",
 "class": "0x7ff5ba88b8e8",

  103. "address": "0x7ff5bbdfc3f8", "type": "OBJECT",
 "class": "0x7ff5ba88b8e8", "file": "...gems/mongoid-7.0.5/lib/mongoid/document.rb",
 "line": 335,


    "method": "allocate",
  104. "address": "0x7ff5bbdfc3f8", "type": "OBJECT",
 "class": "0x7ff5ba88b8e8", "file": "...gems/mongoid-7.0.5/lib/mongoid/document.rb",
 "line": 335,


    "method": "allocate", "generation": 62,
  105. "address": "0x7ff5bbdfc3f8", "type": "OBJECT",
 "class": "0x7ff5ba88b8e8", "file": "...gems/mongoid-7.0.5/lib/mongoid/document.rb",
 "line": 335,


    "method": "allocate", "generation": 62, "ivars": 3,
 "memsize": 40,
  106. "address": "0x7ff5bbdfc3f8", "type": "OBJECT",
 "class": "0x7ff5ba88b8e8", "file": "...gems/mongoid-7.0.5/lib/mongoid/document.rb",
 "line": 335,


    "method": "allocate", "generation": 62, "ivars": 3,
 "memsize": 40, "references": [
 "0x7ff5bbe1c4f0"
 ],
  107. "address": "0x7ff5bbdfc3f8", "type": "OBJECT",
 "class": "0x7ff5ba88b8e8", "file": "...gems/mongoid-7.0.5/lib/mongoid/document.rb",
 "line": 335,


    "method": "allocate", "generation": 62, "ivars": 3,
 "memsize": 40, "references": [
 "0x7ff5bbe1c4f0"
 ], "flags": {
 "wb_protected": true
 },
  108. heapy

  109. heapy $ heapy read good_heap.dump

  110. heapy $ heapy read good_heap.dump Analyzing Heap
 ==============

  111. heapy $ heapy read good_heap.dump Analyzing Heap
 ============== Generation: nil

    object count: 475742, mem: 0.0 kb
  112. heapy $ heapy read good_heap.dump Analyzing Heap
 ============== Generation: nil

    object count: 475742, mem: 0.0 kb Generation: 61 object count: 5334, mem: 1782.5 kb
 Generation: 62 object count: 918, mem: 1324.8 kb
 Generation: 63 object count: 4, mem: 0.3 kb
  113. heapy $ heapy read good_heap.dump Analyzing Heap
 ============== Generation: nil

    object count: 475742, mem: 0.0 kb Generation: 61 object count: 5334, mem: 1782.5 kb
 Generation: 62 object count: 918, mem: 1324.8 kb
 Generation: 63 object count: 4, mem: 0.3 kb Generation: 2222 object count: 1, mem: 0.0 kb
 Generation: 2223 object count: 420, mem: 44.6 kb
 Generation: 2224 object count: 635, mem: 42.8 kb
  114. Analyzing Heap ============== Generation: nil object count: 498011, mem: 0.0

    kb Generation: 61 object count: 834, mem: 82.2 kb Generation: 62 object count: 36, mem: 3.9 kb Generation: 63 object count: 9, mem: 0.8 kb Generation: 64 object count: 3, mem: 0.4 kb Generation: 65 object count: 23, mem: 1.8 kb Generation: 66 object count: 16, mem: 1.4 kb Generation: 67 object count: 12, mem: 1.2 kb ...
  115. Analyzing Heap ============== Generation: nil object count: 498011, mem: 0.0

    kb Generation: 61 object count: 834, mem: 82.2 kb Generation: 62 object count: 36, mem: 3.9 kb Generation: 63 object count: 9, mem: 0.8 kb Generation: 64 object count: 3, mem: 0.4 kb Generation: 65 object count: 23, mem: 1.8 kb Generation: 66 object count: 16, mem: 1.4 kb Generation: 67 object count: 12, mem: 1.2 kb ...
  116. Generation: 218 object count: 11, mem: 1.2 kb Generation: 219

    object count: 10, mem: 1.1 kb Generation: 220 object count: 12, mem: 1.2 kb Generation: 221 object count: 12, mem: 1.2 kb Generation: 222 object count: 13, mem: 9.2 kb Generation: 223 object count: 10, mem: 1.0 kb Generation: 224 object count: 12, mem: 1.2 kb Generation: 225 object count: 12, mem: 1.2 kb Generation: 226 object count: 9, mem: 0.9 kb Generation: 227 object count: 11, mem: 1.2 kb ...
  117. Generation: 1874 object count: 8, mem: 0.9 kb Generation: 1875

    object count: 8, mem: 1.0 kb Generation: 1876 object count: 10, mem: 1.0 kb Generation: 1877 object count: 6, mem: 0.7 kb Generation: 1878 object count: 8, mem: 1.0 kb Generation: 1879 object count: 10, mem: 1.0 kb Generation: 1880 object count: 6, mem: 0.7 kb Generation: 1881 object count: 8, mem: 1.0 kb Generation: 1882 object count: 9, mem: 0.8 kb Generation: 1883 object count: 6, mem: 0.7 kb ...
  118. $ heapy read heap.dump 1879

  119. $ heapy read heap.dump 1879 allocated by memory (978) (in

    bytes)
 ==============================
 858 .../action_view/template.rb:296
 120 .../active_support/core_ext/module/remove_method.rb:4
  120. $ heapy read heap.dump 1879 allocated by memory (978) (in

    bytes)
 ==============================
 858 .../action_view/template.rb:296
 120 .../active_support/core_ext/module/remove_method.rb:4 _app_views_mobile_layouts_widgets_html_haml__3857206368_702097383 _app_views_mobile_widgets_outfits_html_haml__3715565700_702097206 _app_views_mobile_layouts_widgets_html_haml__3857206368_702096884
  121. $ heapy read heap.dump 1879 allocated by memory (978) (in

    bytes)
 ==============================
 858 .../action_view/template.rb:296
 120 .../active_support/core_ext/module/remove_method.rb:4 _app_views_mobile_layouts_widgets_html_haml__3857206368_702097383 _app_views_mobile_widgets_outfits_html_haml__3715565700_702097206 _app_views_mobile_layouts_widgets_html_haml__3857206368_702096884 "flags":{"wb_protected":true, "old":true, "uncollectible":true }
  122. 2 Questions

  123. 2 Questions • Why is the method name uncollectible?

  124. 2 Questions • Why is the method name uncollectible? •

    Why so many of them?
  125. Ruby 2.2: Symbols can be garbage collected

  126. Ruby 2.2: Symbols can be garbage collected *Terms and conditions

    may apply Your statutory rights are not affected
  127. 2 Kinds of Symbols

  128. 2 Kinds of Symbols • Mortal symbols: created from foo.to_sym

  129. 2 Kinds of Symbols • Mortal symbols: created from foo.to_sym

    • Immortal symbols: C-level references (rb_intern)
  130. 2 Kinds of Symbols • Mortal symbols: created from foo.to_sym

    • Immortal symbols: C-level references (rb_intern) Defining a method creates an immortal symbol
  131. Action View template caching

  132. Action View template caching • Each template is compiled to

    a method with random name
  133. Action View template caching • Each template is compiled to

    a method with random name • Each entry in the view search path is a resolver
  134. Action View template caching • Each template is compiled to

    a method with random name • Each entry in the view search path is a resolver • Resolvers return compiled template from cache or load from disk
  135. before_action :set_view_path def set_view_path if mobile_site_requested? prepend_view_path "app/views/mobile" end end

  136. before_action :set_view_path def set_view_path if mobile_site_requested? prepend_view_path "app/views/mobile" end end

  137. before_action :set_view_path def set_view_path if mobile_site_requested? prepend_view_path "app/views/mobile" end end

  138. devise/app/views app/views View Paths

  139. devise/app/views app/views View Paths

  140. devise/app/views app/views View Paths app/views/mobile

  141. devise/app/views app/views View Paths app/views/mobile

  142. devise/app/views app/views View Paths

  143. devise/app/views app/views View Paths app/views/mobile

  144. New (empty) cache on each request!

  145. Each request leaks a symbol

  146. None
  147. None
  148. Part 2 Summary

  149. Part 2 Summary • Don't be afraid of production!

  150. Part 2 Summary • Don't be afraid of production! •

    Understand what/how using memory with heap dumps
  151. Part 2 Summary • Don't be afraid of production! •

    Understand what/how using memory with heap dumps • Pay attention to your objects' lifecycles
  152. Thank you! @fglc2 / dressipi

  153. Further Reading • https://samsaffron.com/archive/2015/03/31/debugging-memory- leaks-in-ruby • https://samsaffron.com/archive/2019/10/08/debugging-unmanaged- and-hidden-memory-leaks-in-ruby • https://blog.codeship.com/tracking-object-allocations-in-ruby/

    • https://blog.codeship.com/debugging-a-memory-leak-on-heroku/ • https://www.slideshare.net/authorNari/symbol-gc • https://github.com/schneems/derailed_benchmarks • https://www.schneems.com/2019/11/07/why-does-my-apps- memory-usage-grow-asymptotically-over-time/