Pro Yearly is on sale from $80 to $50! »

Rails Performance Issues and Solutions

76a777ff80f30bd3b390e275cce625bc?s=47 Akira Matsuda
September 29, 2019

Rails Performance Issues and Solutions

Slides for my keynote at RubyConf Indonesia 2019 in Jakarta https://ruby.id/conf/2019/ #RubyconfID

76a777ff80f30bd3b390e275cce625bc?s=128

Akira Matsuda

September 29, 2019
Tweet

Transcript

  1. Rails Performance Issues
 and
 Solutions Akira Matsuda

  2. Akira Matsuda

  3. From Japan !

  4. amatsuda on GH

  5. A Rails Committer &
 A Ruby Committer

  6. Rails Commits This Month

  7. Mostly Micro-Optimizations + Keyword Arguments Fixes

  8. The Keyword Arguments Fixes

  9. The Ruby Team Is Going to Introduce Some Keyword Arguments

    Incompatibilities in Ruby 3 And I'm investigating the new specification on the Rails codebase
  10. I'm Not Going to Talk About The Details About This

    Topic
  11. But Anyway I Work on Such Things Between Ruby and

    Rails
  12. A Gem Author

  13. kaminari, active_decorator, action_args, heavens_door, erd, i18n_generators, etc

  14. But I'm Not Going to Talk About These Gems Today

  15. A Conference Organizer

  16. The Organizer of RubyKaigi https:/ /rubykaigi.org/

  17. But I'm Not Going to Talk About That Conference Today

  18. Today I'm Going to Talk About the Rails Performance

  19. Chapter 1

  20. Let's Start with a Benchmark!

  21. ruby -v ruby 2.6.3p62 (2019-04-16 revision 67580) [x86_64- darwin18]

  22. rails -v Rails 6.0.0

  23. rails new $ rails new rubyconfid --skip- spring --skip-bootsnap

  24. # Gemfile @@ -31,8 +31,6 @@ group :development, :test do

    end group :development do - # Access an interactive console on exception pages or by calling 'console' anywhere in the code. - gem 'web-console', '>= 3.3.0' gem 'listen', '>= 3.0.5', '< 3.2' end @@ -43,6 +41,3 @@ group :test do # Easy installation and use of web drivers to run system tests with browsers gem 'webdrivers' end - -# Windows does not include zoneinfo files, so bundle the tzinfo-data gem -gem 'tzinfo-data', platforms: [:mingw, :mswin, :x64_mingw, :jruby]
  25. scaffold $ rails g scaffold post title

  26. Prepare 50 "Hello" Records $ rails r "50.times { Post.create!

    title: 'hello' }"
  27. Starting Up the Rails Server in `production` Mode $ rails

    assets:precompile $ cp db/development.sqlite3 db/production.sqlite3 $ env RAILS_SERVE_STATIC_FILES=1 rails s -e production
  28. How Can We Measure the Application Performance?

  29. Maybe We Can Use time Command? $ time curl http:/

    /localhost:3000/posts/ > /dev/null
  30. Maybe We Can Use Something Like ab or gatling? $

    ab -n 100 -c 10 -l http:/ /127.0.0.1:3000/posts
  31. These Approaches Have Certain HTTP Overhead

  32. So, Instead,

  33. How About Benchmarking from Inside the Application Using Ruby?

  34. The Default Benchmarking Library in Ruby The stdlib benchmark.rb

  35. benchmark.rb Usage Benchmark.realtime do ... end

  36. Which Part of the App Would We Measure? Maybe the

    controllers' `process_action` method?
  37. A Very Rough Monkey-patch for Benchmarking `process_action` # config/initializers/perf.rb ActionController::Base.prepend(

    Module.new { def process_action(*) result = nil p Benchmark.realtime { result = super } result end } )
  38. config/initializers/*.rb I usually draft this sort of quick monkey-patches in

    `config/ initializers/` No need to explicitly `require` No need to give a name Maybe nobody else does this, but works for me...
  39. Now the App Prints Each Request Processing Time But `ab`ing

    with `-n 100` prints 100 lines What we want is a summary of 100 runs
  40. How About Iterating Inside the Request Rather than making hundreds

    of requests?
  41. Other Benchmarking Libraries for Doing This in Ruby benchmark-ips benchmark_driver

  42. A Benchmarking Script that Iterates the `process_action` # config/initializers/perf.rb ActionController::Base.prepend(

    Module.new { def process_action(*) Benchmark.ips do |x| x.report('process_action') { super } end super end } )
  43. Usage Start up the server and request from outside just

    once
  44. Like This $ curl http:/ /localhost:3000/posts/

  45. This Would Return Pure Execution Time of the Rails Land

    Alone
  46. Result Warming up -------------------------------------- process_action 218.000 i/100ms Calculating ------------------------------------- process_action

    2.321k (± 7.5%) i/s - 11.554k in 5.009905s
  47. Or We Could Include the Rack Middleware Stack and `before_action`

    and Such # config/initializers/perf.rb Rails::Engine.prepend( Module.new { def call(*) Benchmark.ips do |x| x.report('call') { super } end super end } )
  48. Result Warming up -------------------------------------- call 22.000 i/100ms Calculating ------------------------------------- call

    240.911 (± 7.1%) i/s - 1.210k in 5.048689s
  49. This Means That the Action Runs 240 Times per Second

    Seems like quite an accurate number than the time command one
  50. Now We've Got the Basic Benchmarking Tool

  51. This Approach Is Quite Flexible We can `Benchmark.ips` any part

    of the application for measuring its speed So we can
 "devide and conquer"
  52. Benchmarking only the "M" in the Whole MVC Stack Benchmark.ips

    do |x| Model.find(...) end
  53. Benchmarking only the "V" in the Whole MVC Stack Benchmark.ips

    do |x| render(...) end
  54. How Can We Measure the "C"?

  55. In Fact, We Can Create a Simple Action That Does

    Not find nor render
  56. Route to /posts/ok # config/routes.rb Rails.application.routes.draw do - resources :posts

    + resources :posts do + collection do + get :ok + end + end # For details on the DSL available within this file, see https://guides.rubyonrails.org/routing.html end
  57. The Controller Action # app/controllers/posts_controller.rb class PostsController < ApplicationController before_action

    :set_post, only: [:show, :edit, :update, :destroy] + def ok + head :ok + end + # GET /posts # GET /posts.json def index
  58. These Results Tell Us That... Finding a record is slow

    Rendering it is also slow Request handling is slow as well
  59. There's No Single Bottle Neck in This Simple App

  60. Conclusion

  61. Rails Is Slow Because Everything Is Slow!

  62. Chapter 1 End

  63. Chapter 2

  64. Let's Make Everything Faster!

  65. Why Rails Is Slow, and How Can We Make Rails

    Faster?
  66. There Are Some "Numbers" That We Can Get from the

    Ruby Process Time elapsed Memory usage Method invocations System calls etc...
  67. The Number of Object Allocations Today, we're going to focus

    on the number of Object allocations
  68. Why Does the Number of Allocations Matters? Allocating Objects in

    Ruby is a very heavy process Plus, it causes GC pressure The number would not be affected by the environment
  69. Actually, Rails 6 Shows This in the Log Completed 200

    OK in 58ms (Views: 9.0ms | ActiveRecord: 2.6ms | Allocations: 7601)
  70. How Can We Measure Object Allocations? $ ruby -e "c

    = GC.stat(:total_allocated_objects); 100.times { _ = 'hello' }; p GC.stat(:total_allocated_objects) - c" 100
  71. Quite Simple, Right?

  72. Number of Objects Allocated During a Rails Request # config/initializers/perf.rb

    Rails::Engine.prepend( Module.new { def call(*) c = GC.stat(:total_allocated_objects) result = super p GC.stat(:total_allocated_objects) - c result end } )
  73. Result $ curl http:/ /localhost:3000/posts/ok #=> 691

  74. Or If You'd Like a Little Bit More Detailed Report

  75. Gems for Profiling Object Allocations & Memory Usage memory_profiler allocation_stats

    allocation_tracer etc.
  76. bundle memory_profiler # Gemfile gem 'byebug', platforms: [:mri, :mingw, :x64_mingw]

    end gem 'benchmark-ips' +gem 'memory_profiler'
  77. An Utility Method That Prints the Memory Usage module Prof

    def self.mem result = nil MemoryProfiler.report { result = yield }.pretty_print(retained_strings: 0, allocated_strings: 100, normalize_paths: true) result end end
  78. Then Profile Rails::Engine.prepend( Module.new { def call(*) Prof.mem { super

    } end } )
  79. curl the ok Action $ curl http:/ /localhost:3000/posts/ok (abandoning the

    first trial)
  80. Result Total allocated: 63830 bytes (599 objects) Total retained: 21391

    bytes (134 objects) allocated memory by gem ----------------------------------- 27104 actionpack-6.0.0/lib 22574 activesupport-6.0.0/lib 5718 rack-2.0.7/lib 1976 actionview-6.0.0/lib 1944 logger 1520 ipaddr 1157 railties-6.0.0/lib 677 securerandom 504 concurrent-ruby-1.1.5/lib 320 activerecord-6.0.0/lib 176 monitor 80 turbolinks-5.2.1 40 i18n-1.6.0/lib 40 uri allocated memory by file ----------------------------------- 7128 actionpack-6.0.0/lib/action_controller/renderer.rb 3702 rack-2.0.7/lib/rack/utils.rb 3624 actionpack-6.0.0/lib/action_dispatch/http/response.rb ...
  81. It Turned Out That, a simple action that just returns

    a "200 OK" header creates 599 objects and consumes 63830 bytes of memory.
  82. Isn't That Too Much?

  83. Let's Try Reducing This As Much As Possible

  84. The memory_profiler Result Indicates That, actionpack allocates the most Especially,

    ActionController::Renderer
  85. But Who Creates the Renderer?

  86. We Actually Aren't Rendering Anything! We're just returning the "200

    OK" header
  87. Let's See Who Is Instantiating AC::Renderer # actionpack/lib/action_controller/renderer.rb @@ -63,6

    +63,7 @@ module ActionController # It will be merged with the default Rack environment defined by # +ActionController::Renderer::DEFAULTS+. def initialize(controller, env, defaults) + puts caller @controller = controller @defaults = defaults @env = normalize_keys defaults.merge(env)
  88. Result .../actionpack-6.0.0/lib/action_controller/renderer.rb:54:in `new' .../actionpack-6.0.0/lib/action_controller/renderer.rb:54:in `new' .../actiontext-6.0.0/lib/action_text/engine.rb:50:in `block (3 levels) in

    <class:Engine>' .../activesupport-6.0.0/lib/active_support/callbacks.rb:429:in `instance_exec'
  89. action_text/ engine.rb?? We aren't using action_text in this action!

  90. An Initializer in ActionText::Engine # actiontext-6.0.0/lib/action_text/engine.rb initializer "action_text.renderer" do |app|

    app.executor.to_run { ActionText::Content.renderer = ApplicationController.renderer } app.executor.to_complete { ActionText::Content.renderer = ApplicationController.renderer } ActiveSupport.on_load(:action_text_content) do self.renderer = ApplicationController.renderer end ActiveSupport.on_load(:action_controller_base) do before_action { ActionText::Content.renderer = ApplicationController.renderer.new(request.env) } end end
  91. So,

  92. Action Text 6.0.0 Includes This Bug

  93. When Action Text Is Bundled and Required, It Instantiates An

    AC::Renderer Before Each Action In Every Controller
  94. The Solution

  95. Skip Requiring actiontext/ railtie (unless you use it) # config/application.rb

    require_relative 'boot' -require 'rails/all' +%w( + active_record/railtie + active_storage/engine + action_controller/railtie + action_view/railtie + action_mailer/railtie + active_job/railtie + action_cable/engine + action_mailbox/engine + rails/test_unit/railtie + sprockets/railtie +).each do |railtie| + begin + require railtie + rescue LoadError + end +end
  96. Then Let's Measure the Memory Usage Again

  97. Result Total allocated: 63830 bytes (599 objects) ! Total allocated:

    55560 bytes (574 objects)
  98. The Memory Usage Improves 8000+ Bytes

  99. I Filed This Issue https:/ /github.com/rails/ rails/issues/36963

  100. So, This Would Be Fixed in 6.0.1, Hopefully

  101. What If We Skip Requiring Some More Other Components That

    We Don't Use Here? # config/application.rb @@ -2,13 +2,10 @@ require_relative 'boot' %w( active_record/railtie - active_storage/engine action_controller/railtie action_view/railtie action_mailer/railtie active_job/railtie - action_cable/engine - action_mailbox/engine
  102. Result Total allocated: 55560 bytes (574 objects) ↓ Total allocated:

    55560 bytes (574 objects)
  103. Nothing Changed!

  104. Benchmark Result for Each Action (Allocation) ok: 55560 bytes (574

    objects) show: 127210 bytes (1301 objects) new 173108 bytes (1657 objects) index: 597728 bytes (6333 objects)
  105. Now We Got the Numbers We need such numbers to

    see how we're optimizing things
  106. So, Let's Start Optimizing Rails

  107. Now We Have a List of Files that Consume Memory

    allocated memory by file ----------------------------------- 3702 rack-2.0.7/lib/rack/utils.rb 3624 actionpack-6.0.0/lib/action_dispatch/http/response.rb 2944 activesupport-6.0.0/lib/active_support/callbacks.rb 2728 activesupport-6.0.0/lib/active_support/ hash_with_indifferent_access.rb 2720 actionpack-6.0.0/lib/action_dispatch/middleware/cookies.rb 2144 activesupport-6.0.0/lib/active_support/messages/rotator.rb 1944 ruby/lib/logger.rb 1856 activesupport-6.0.0/lib/active_support/parameter_filter.rb 1680 activesupport-6.0.0/lib/active_support/inflector/methods.rb 1520 ruby/lib/ipaddr.rb 1480 actionpack-6.0.0/lib/action_dispatch/journey/router.rb 1280 actionpack-6.0.0/lib/action_controller/metal/ instrumentation.rb
  108. rack, actionpack, activesupport, ruby...

  109. These Things Are Part of Your Application

  110. You Can Make Your Applications Fast by Making Rails Fast

  111. Rails Is Just a Ruby Program That You Can Patch

  112. First, Set Up gem-src

  113. Then Re-install rails

  114. gem-src Will Turn These Gems Into Git Repos

  115. And Now You're Ready to Commit Your Patches $ cd

    (bundle show actionpack) % cd `bundle show actionpack` (zsh)
  116. Just Read These Files in the Memory Usage List

  117. And Try to Reduce Object Creations as Much as Possible

  118. For Example,

  119. Regexp#=~ vs Regexp#match?

  120. Regexp#=~ $ ruby -rmemory_profiler -e "s = 'ABC'; MemoryProfiler.report {

    s =~ / A/ }.pretty_print" Total allocated: 208 bytes (2 objects) allocated memory by location ----------------------------------- 208 -e:1 allocated memory by class ----------------------------------- 168 MatchData 40 String allocated objects by class ----------------------------------- 1 MatchData 1 String
  121. Regexp#match? $ ruby -rmemory_profiler -e "s = 'ABC'; MemoryProfiler.report {

    s.match?(/A/) }.pretty_print" Total allocated: 0 bytes (0 objects) Total retained: 0 bytes (0 objects) allocated memory by gem ----------------------------------- NO DATA
  122. match? Is Way More Efficient Where We Don't Need the

    MatchData Instance
  123. The Patch

  124. Hash#merge vs Hash#[]=

  125. Hash#merge $ ruby -rmemory_profiler -e "h1, h2, h3 = {a:

    1}, {a: 2}, {a: 3}; MemoryProfiler.report { h1.merge(h2.merge(h3)) }.pretty_print" Total allocated: 336 bytes (2 objects) allocated memory by class ----------------------------------- 336 Hash allocated objects by class ----------------------------------- 2 Hash
  126. Hash#[]= $ ruby -rmemory_profiler -e "h1, h2 = {a: 1},

    {a: 2}; MemoryProfiler.report { h = h1.merge(h2); h[:a] = 3; h }.pretty_print" Total allocated: 168 bytes (1 objects) allocated memory by class ----------------------------------- 168 Hash allocated objects by class ----------------------------------- 1 Hash
  127. We'd Better Avoid Unnecessary Hash#merge

  128. The Patch

  129. Hash#fetch vs (Hash#[] || default)

  130. Hash#fetch Taking Two Arguments $ ruby -rmemory_profiler -e "h =

    {x: 1}; MemoryProfiler.report { h.fetch(:x, 'hello') }.pretty_print" Total allocated: 40 bytes (1 objects) allocated memory by class ----------------------------------- 40 String allocated objects by class ----------------------------------- 1 String
  131. Hash#[] || default $ ruby -rmemory_profiler -e "h = {x:

    1}; MemoryProfiler.report { h[:x] || 'hello' }.pretty_print" Total allocated: 0 bytes (0 objects) allocated memory by class ----------------------------------- NO DATA
  132. The Patch

  133. I Pushed About 60 Micro- Optimization Patches Like These to

    Rails Since the 6.0.0 Release Since I decided to talk about Rails performance in this conference And I think I could reduce
 5ʙ10% memory consumption
  134. I Tried to cherry-pick These Commits to 6.0.0 to Show

    You the Result
  135. But I Gave Up We had too many conflicts

  136. So Instead, Let's Bundle Edge Rails and See How Things

    Are Improved in the Current master # Gemfile -gem 'rails', '~> 6.0.0' +gem 'rails', path: 'PATH/TO/RAILS'
  137. Result (posts/ok) 55560 bytes (574 objects) ↓ 54995 bytes (555

    objects)
  138. None
  139. The Improvement Was Indeed Only 1% Since 6.0.0 1% of

    the memory usage 3% of the number of the allocated objects
  140. Maybe Because We've Been Adding Other Heavy Stuffs to master?

  141. What About Other Actions?

  142. Result: posts/show 127931 bytes (1305 objects) ↓ 124773 bytes (1271

    objects)
  143. BTW Why Does posts/show Consume 70K More Bytes Than posts/ok?

  144. Because It Uses Whole M + V + C Stack

    Where posts/ok uses only "C"
  145. posts/show = posts/ok + find + render

  146. In Order to Divide the Problem,

  147. It'd Be Better to Measure M and V Separately

  148. Measuring Active Record Finder Alone # app/controllers/posts_controller.rb private # Use

    callbacks to share common setup or constraints between actions. def set_post - @post = Post.find(params[:id]) + Prof.mem do + @post = Post.find(params[:id]) + end end
  149. Measuring Action View Renderer Alone # app/controllers/posts_controller.rb def show +

    Prof.mem { default_render } end
  150. Action Controller's default_render Method A method that is called when

    no explicit render was called
  151. So Easy, Right?

  152. Result: default_render 40968 bytes (452 objects) out of 124773 bytes

    (1271 objects)
  153. Approx. 1/3

  154. But Why Does Action View Consume 40KB Memory for Composing

    An 1.5KB HTML?
  155. According to the Profiling Results

  156. Things That Consume Memory in Action View Strings and String-ish

    Objects Template engines Template lookup I18n UrlHelper
  157. What Are the String-ish Objects? SafeBuffer object Multibyte::Chars object

  158. ActiveSupport::SafeBuffer A variety of String class that can distinguish "safe"

    string and "unsafe" string Inherits String class to hold a flag as an ivar inside
  159. Can't We Use the taint Flag for This? I tried,

    but it didn't work well Because we cannot create "tainted by default" Strings
  160. BTW Who Here Knows What "the taint" Flag Is? Who

    actually have ever used it?
  161. It's a Flag That Every Ruby Object Has But Nobody

    Uses
  162. So, the Ruby Core Team Just Decided to Remove the

    Flag Around Ruby 3.0
  163. ActiveSupport::SafeBuffer We may be able to reduce so much garbage

    in the views if we can improve this Maybe we can create some kind of extension (that reuses the flag bit in String?) in the future
  164. Template Engines Template Engines tend to create so many garbage

    Strings while composing an HTML body String
  165. Why Don't We Try "The Fastest Template Engine in the

    World" Here?
  166. You Know What?

  167. It's Called "string_template" https:/ /github.com/ amatsuda/string_template

  168. Its Mark Up Syntax Is Ruby's String Interpolation That Everybody

    Knows It's simple
  169. It Doesn't Internally Create the Heavy SafeBuffer Objects And fast

  170. But Has the Tradeoff It has no auto HTML safety

    guard We need to manually call `h()` just like we used to do before Rails 3
  171. Stringification -<p id="notice"><%= notice %></p> +<p id="notice">#{h notice }</p> <p>

    <strong>Title:</strong> - <%= @post.title %> + #{h @post.title } </p> -<%= link_to 'Edit', edit_post_path(@post) %> | -<%= link_to 'Back', posts_path %> +#{ link_to 'Edit', edit_post_path(@post) } | +#{ link_to 'Back', posts_path }
  172. Result: After Stringifying the Templates 40968 bytes (452 objects) ↓

    40467 bytes (454 objects)
  173. Memory Usage Decreased, But Allocated Objects Increased...! Because string_template
 could

    create some extra String / SafeBuffer objects when concatenating strings with explicit `h()` call
  174. string_template Maybe efficient in some particular cases Inside heavy loops

    With lots of string interpolations That needs no HTML escaping But Action View is not generally optimized for this template engine
  175. Now, Let's Take a Look at the "new" Action

  176. Result: posts/new 165836 bytes (1613 objects)

  177. Who Consumes That Much Memory? allocated memory by file -----------------------------------

    8088 i18n-1.6.0/lib/i18n/interpolate/ruby.rb 7272 i18n-1.6.0/lib/i18n/backend/base.rb 6396 actionview/lib/action_view/helpers/tag_helper.rb 5088 actionview/lib/action_view/lookup_context.rb 5060 rack-2.0.7/lib/rack/utils.rb 4944 activesupport/lib/active_support/core_ext/string/output_safety.rb 4552 actionview/lib/action_view/helpers/form_helper.rb
  178. Another Newcomer "I18n" Is the Top Memory Consumer Here!

  179. I18n.t(:hello) => "͜Μʹͪ͸" A very simple function that looks up

    the dictionary with a key, and returns a String
  180. Why Is I18n.t Heavy? Is it doing something more than

    just a Hash lookup?
  181. Let's See What's Happening $ env RAILS_ENV=production rails r 'I18n.t(:hello);

    Prof.mem { I18n.t :hello }'
  182. Result Total allocated: 1336 bytes (9 objects) allocated memory by

    class ----------------------------------- 928 Hash 368 Array 40 String allocated objects by class ----------------------------------- 4 Array 4 Hash 1 String
  183. It Turned Out That,

  184. I18n Creates Some Hash and Array Objects Per Each I18n.t

    Invocation Especially for manipulating given keys and options
  185. How Can We Reduce These Object Creations?

  186. The Solution Wrap I18n.t and cache

  187. Like This? module I18nCache def self.extended(obj) obj.instance_variable_set :@i18n_cache, {} end

    def translate(*args, **options) cache = @i18n_cache[I18n.locale] ||= {} cache_key = args << options cache[cache_key] || begin result = super cache[cache_key] = result if result result ennd alias t translate end I18n.extend I18nCache
  188. This Monkey-Patch May Not Be Production-Ready Yet I'm aware that

    we at least need to handle `:count` option There may be some more corner cases If this approach is too aggressive, maybe we could monkey-patch `Backend#lookup`
  189. Anyway, Now I18n.t Became Much Cheaper $ env RAILS_ENV=production rails

    r 'I18n.t(:hello); Prof.mem { I18n.t :hello }' Total allocated: 272 bytes (2 objects) allocated memory by class ----------------------------------- 232 Hash 40 Array allocated objects by class ----------------------------------- 1 Array 1 Hash
  190. Result: I18n.t(:hello) Total allocated: 1336 bytes (9 objects) ↓ Total

    allocated: 272 bytes (2 objects)
  191. Applying This Monkey- Patch to the Rails App

  192. Result: posts/new Memory Usage 165836 bytes (1613 objects) ↓ 143306

    bytes (1476 objects)
  193. Next, Let's Investigate the index Action Which usually becomes the

    main hotspot in the app
  194. Result: 50 "Hello" Records Total allocated: 580397 bytes (6170 objects)

  195. It Consumes This Much Memory for Fetching and Showing
 "Hello"

    50 Times
  196. Allocated Objects by Class allocated objects by class ----------------------------------- 3235

    String 1300 Array 1019 Hash 162 ActiveSupport::SafeBuffer 100 ActiveModel::Attribute::FromDatabase 50 ActiveModel::AttributeSet 50 ActiveModel::LazyAttributeHash 50 Post
  197. ActiveModel::Attribute::FromDatab ase, ActiveModel::AttributeSet, ActiveModel::LazyAttributeHash These are created through the Attributes

    API
  198. So Here's a Super Monkey-Patch That Could Reduce These Objects

    https:/ /github.com/ amatsuda/ lightweight_attributes
  199. lightweight_attributes' Approach AR model has too many features But in

    most cases, what we need is "data transfer object" instances that are "read only" Such objects can be made more lightweight by dropping some heavy features e.g. dirty tracking, type casting
  200. How to Use lightweight_attributes Bundle the gem. Done.

  201. Allocated Objects by Class: with lightweight_attributes allocated objects by class

    ----------------------------------- 3187 String 1293 Array 996 Hash 162 ActiveSupport::SafeBuffer 52 Class 50 LightweightAttributes::AttributeSet 50 Post
  202. Allocated Objects by Class: Before => After 3235 String 1300

    Array 1019 Hash 162 ActiveSupport::SafeBuffer 100 ActiveModel::Attribute::FromDatabase 50 ActiveModel::AttributeSet 50 ActiveModel::LazyAttributeHash 50 Post ! 3187 String 1293 Array 996 Hash 162 ActiveSupport::SafeBuffer 52 Class 50 LightweightAttributes::AttributeSet 50 Post
  203. Yay! We Could Reduce So Many Objects!!

  204. Result: Total Allocated with lightweight_attributes Total allocated: 580397 bytes (6170

    objects) ↓ Total allocated: 586565 bytes (5981 objects)
  205. Allocated Objects Decreased, But Memory Usage Increased...!

  206. Probably Because We're Not Accessing All Fields in This Example

    The default implementation lazily type casts The lightweight_attributes version immediately type casts
  207. Anyway,

  208. What Consumes Memory in the index Action? allocated memory by

    location ----------------------------------- 35032 actionpack/lib/action_controller/metal/url_for.rb:42 35032 actionpack/lib/action_dispatch/routing/route_set.rb:199 35032 actionpack/lib/action_dispatch/routing/route_set.rb:231 27200 rubyconfid/app/views/posts/index.html.erb:19 26000 lightweight_attributes/lib/lightweight_attributes/attribute_set.rb:17 25587 activesupport/lib/active_support/core_ext/string/output_safety.rb:175 25368 actionpack/lib/action_dispatch/journey/visitors.rb:49 23432 actionview/lib/action_view/helpers/url_helper.rb:683 18920 actionpack/lib/action_dispatch/journey/visitors.rb:38
  209. Mostly from UrlHelper

  210. Improving UrlHelper Would Greatly Improve the index Action Performance We

    need your ideas!
  211. Another Thing That Is Allocated Per Each Action and Can

    Be Cached
  212. ActiveSupport Inflector post => posts post => Post etc.

  213. But We Cannot Just Cache Every Inflector Calls That'll cause

    a memory bloat vulnerability
  214. What If We Cache Only What's Included in the DB

    Schema?
  215. The Monkey-Patches for Inflector def camelize(term, uppercase_first_letter = true) return

    super unless uppercase_first_letter if (cached = InflectorCache.camelize_cache[term]) cached.dup else super ennd def underscore(camel_cased_word) if (cached = InflectorCache.underscore_cache[camel_cased_word]) cached.dup else super ennd
  216. And Another Monkey-Patch That Creates the Cache from Table Name

    module SchemaStatementsExtension def columns(table_name) result = super unless InflectorCache.camelize_cache[table_name] InflectorCache.camelize_cache[table_name] = table_name.camelize end unless InflectorCache.underscore_cache[table_name] InflectorCache.underscore_cache[table_name] = table_name.underscor end result ennd ActiveRecord::ConnectionAdapters::AbstractAdapter.prepend SchemaStatementsExtension
  217. Result: posts/ok with Inflector Cache 50569 bytes (533 objects) =>

    49235 bytes (520 objects)
  218. Yet Another Thing That Can Be Cached

  219. Action View Template Lookup

  220. There Are Three Kinds of Template Lookup Layout Main tamplate

    Partial
  221. Caching Layout Resolution `:layout` per action is almost always non

    dynamic So it can be cached
  222. The Monkey-Patch Can Be Something Like This module CachedLayout private

    def _layout(lookup_context, formats) cache = self.class.instance_variable_get :@_layout_cache implied_layout_name = self.class.send(:_implied_layout_name) prefixes = /\blayouts/.match?(implied_layout_name) ? [] : ["layouts"] cache[[implied_layout_name, prefixes, formats]] ||= super ennd
  223. Only if Given `:layout` Is `nil` or String def _write_layout_method

    # :nodoc: super if _layout.nil? || (String === _layout) @_layout_cache = {} prepend CachedLayout ennd
  224. Result: Layout Cache 116586 bytes (1227 objects) ↓ 113323 bytes

    (1189 objects)
  225. Before Layout Cache allocated memory by file ----------------------------------- 4916 rack-2.0.7/lib/rack/utils.rb

    3839 actionpack/lib/action_dispatch/middleware/static.rb 3821 ruby/lib/logger.rb 3800 actionpack/lib/action_dispatch/http/response.rb 3760 actionview/lib/action_view/lookup_context.rb
  226. After Layout Cache allocated memory by file ----------------------------------- 4884 rack-2.0.7/lib/rack/utils.rb

    3839 actionpack/lib/action_dispatch/middleware/static.rb 3821 ruby/lib/logger.rb 3800 actionpack/lib/action_dispatch/http/response.rb 3688 activesupport/lib/active_support/callbacks.rb 3552 actionpack/lib/action_dispatch/middleware/cookies.rb 3504 activesupport/lib/active_support/hash_with_indifferent_access.rb 3465 activesupport/lib/active_support/json/encoding.rb 3458 actionview/lib/action_view/helpers/tag_helper.rb 3311 activesupport/lib/active_support/message_encryptor.rb 2851 activesupport/lib/active_support/core_ext/string/output_safety.rb 2384 activesupport/lib/active_support/messages/rotator.rb 2344 activesupport/lib/active_support/core_ext/hash/keys.rb 2299 rubyconfid/app/views/layouts/application.html.string 2198 activesupport/lib/active_support/core_ext/object/json.rb 2080 actionview/lib/action_view/lookup_context.rb
  227. For Caching Partials Here's an experimental repository https:/ /github.com/ amatsuda/turbo_partial

  228. turbo_partial `render_partial` takes many parameters So it can't simply be

    cached turbo_partial adds `render_relative` and `render_absolute` features to `render_partial`
  229. Rendering with turbo_partial - render 'form', post: @post + render

    './form', post: @post
  230. I'm Skipping the Result Because It Wasn't That Much Efficient

    in This Case...
  231. Anyway,

  232. Who Are Allocating Objects Now? (post/ok) allocated memory by location

    ----------------------------------- 3451 actionpack/lib/action_dispatch/middleware/static.rb:95 1347 ruby/lib/logger.rb:597 1288 actionpack/lib/action_dispatch/http/response.rb:437 1160 activesupport/lib/active_support/notifications/instrumenter.rb:67 1152 activesupport/lib/active_support/callbacks.rb:405
  233. Logging Creates so many String objects that are usually much

    longer than the response body
  234. ActiveSupport::
 Instrumentation Creates large payload Hash objects and `dup`s them

  235. ActiveSupport::Callbacks Creates many Procs and things

  236. And What Is This Binding Thing? allocated memory by class

    ----------------------------------- 14440 Hash 9987 String 9688 Array 2520 MatchData 1728 ActionDispatch::Request 1608 Thread::Backtrace 1160 ActiveSupport::HashWithIndifferentAccess 1040 Proc 944 Rack::Utils::HeaderHash 912 Class 512 Regexp 352 Binding
  237. It's Created Through `binding.local_variable_get`

  238. binding.local_variable_ get A magical idiom to get a local variable

    that has a name that conflicts with reserved words
  239. This Doesn't Usually Happen, But Keyword Arguments Can Create This

    def do_something(if: nil, unless: nil) if if ... end if unless ... ennd
  240. Which Causes SyntaxError

  241. I Found This When I Was Trying to Introduce Keyword

    Arguments in Rails
  242. I Asked @ko1 How to Avoid This SyntaxError

  243. Then He Introduced This New Magic

  244. binding.local_variable_ get def do_something(if: nil, unless: nil) if binding.local_variable_get(:if) ...

    end if binding.local_variable_get(:unless) ... end end
  245. Rails Started Using This in ActiveSupport::HashWithIndifferen tAccess Since 6.0

  246. But It Turned Out That It Allocates a Binding Object

    I hope we can fix this in Ruby 2.7
  247. HashWithIndifferentAc cess A variety of Hash that treats Symbol keys

    and String keys as the same things Converts all Symbol keys to String internally Abused too much in the Rails internal
  248. Maybe We Can Reduce HashWithIndifferentAccess By defaulting this to be

    Symbol based Thanks to Symbol GC from @nari3 Or by implementing this in C
  249. Maybe We Can Reduce HashWithIndifferentAccess By introducing more keyword arguments

  250. Another Thing That Is Making This Benchmark Slow Thread synchronization

  251. Current Rails Codebase Synchronizes Threads Too Defensively Because not all

    supported Ruby implementations has GIL (GVL)
  252. Monitor#synchronize posts/show calls this 25 times per each request We

    could skip some of these in MRI
  253. Ruby 2.6.4 or Current Master `Monitor#synchronize` creates some extra Hash

    objects
  254. Today's Benchmark Result Becomes Like This allocated memory by file

    ----------------------------------- 7472 ruby/lib/monitor.rb 4908 rack-2.0.7/lib/rack/utils.rb 3839 actionpack/lib/action_dispatch/middleware/static.rb 3821 ruby/lib/logger.rb 3800 actionpack/lib/action_dispatch/http/response.rb 3688 activesupport/lib/active_support/callbacks.rb 3552 actionpack/lib/action_dispatch/middleware/cookies.rb
  255. We Need to Fix This in Ruby 2.7

  256. Summary

  257. posts/ok 55560 bytes (574 objects) ↓ 50299 bytes (531 objects)

    (-9.5%)
  258. show 127210 bytes (1301 objects) ↓ 113047 bytes (1192 objects)

    (-11.1%)
  259. new 173108 bytes (1657 objects) ↓ 139605 bytes (1435 objects)

    (-19.4%)
  260. index 597728 bytes (6333 objects) ↓ 569575 bytes (6092 objects)

    (-4.7%)
  261. Future Works More patches on Cookies, Instrumentation, UrlHelper, etc. Publish

    these monkey-patches as a gem
  262. Actually I Baked Some of These Silly Monkey-Patches into a

    Gem https:/ /github.com/ amatsuda/speed_king
  263. Wrap Up Rails is a huge software that has no

    obvious bottleneck But we can somehow improve it by micro- optimizations I already pushed so many patches in Rails master, so Rails 6.1 has to be faster than 6.0 After such micro-optimizations, today's benchmark highlighted some performance hotspots
  264. Wrap Up (2) Try speed_king gem (not released yet) But

    don't in production...
  265. Wrap up (3) Rails is such a fun thing to

    play with! Performance tuning is like a game aiming for a high-score Everybody can play with it for free! Since it's open sourced! I'm waiting for you to join this game!
  266. end