Slide 1

Slide 1 text

Rails Performance Issues
 and
 Solutions Akira Matsuda

Slide 2

Slide 2 text

Akira Matsuda

Slide 3

Slide 3 text

From Japan !

Slide 4

Slide 4 text

amatsuda on GH

Slide 5

Slide 5 text

A Rails Committer &
 A Ruby Committer

Slide 6

Slide 6 text

Rails Commits This Month

Slide 7

Slide 7 text

Mostly Micro-Optimizations + Keyword Arguments Fixes

Slide 8

Slide 8 text

The Keyword Arguments Fixes

Slide 9

Slide 9 text

The Ruby Team Is Going to Introduce Some Keyword Arguments Incompatibilities in Ruby 3 And I'm investigating the new specification on the Rails codebase

Slide 10

Slide 10 text

I'm Not Going to Talk About The Details About This Topic

Slide 11

Slide 11 text

But Anyway I Work on Such Things Between Ruby and Rails

Slide 12

Slide 12 text

A Gem Author

Slide 13

Slide 13 text

kaminari, active_decorator, action_args, heavens_door, erd, i18n_generators, etc

Slide 14

Slide 14 text

But I'm Not Going to Talk About These Gems Today

Slide 15

Slide 15 text

A Conference Organizer

Slide 16

Slide 16 text

The Organizer of RubyKaigi https:/ /rubykaigi.org/

Slide 17

Slide 17 text

But I'm Not Going to Talk About That Conference Today

Slide 18

Slide 18 text

Today I'm Going to Talk About the Rails Performance

Slide 19

Slide 19 text

Chapter 1

Slide 20

Slide 20 text

Let's Start with a Benchmark!

Slide 21

Slide 21 text

ruby -v ruby 2.6.3p62 (2019-04-16 revision 67580) [x86_64- darwin18]

Slide 22

Slide 22 text

rails -v Rails 6.0.0

Slide 23

Slide 23 text

rails new $ rails new rubyconfid --skip- spring --skip-bootsnap

Slide 24

Slide 24 text

# Gemfile @@ -31,8 +31,6 @@ group :development, :test do end group :development do - # Access an interactive console on exception pages or by calling 'console' anywhere in the code. - gem 'web-console', '>= 3.3.0' gem 'listen', '>= 3.0.5', '< 3.2' end @@ -43,6 +41,3 @@ group :test do # Easy installation and use of web drivers to run system tests with browsers gem 'webdrivers' end - -# Windows does not include zoneinfo files, so bundle the tzinfo-data gem -gem 'tzinfo-data', platforms: [:mingw, :mswin, :x64_mingw, :jruby]

Slide 25

Slide 25 text

scaffold $ rails g scaffold post title

Slide 26

Slide 26 text

Prepare 50 "Hello" Records $ rails r "50.times { Post.create! title: 'hello' }"

Slide 27

Slide 27 text

Starting Up the Rails Server in `production` Mode $ rails assets:precompile $ cp db/development.sqlite3 db/production.sqlite3 $ env RAILS_SERVE_STATIC_FILES=1 rails s -e production

Slide 28

Slide 28 text

How Can We Measure the Application Performance?

Slide 29

Slide 29 text

Maybe We Can Use time Command? $ time curl http:/ /localhost:3000/posts/ > /dev/null

Slide 30

Slide 30 text

Maybe We Can Use Something Like ab or gatling? $ ab -n 100 -c 10 -l http:/ /127.0.0.1:3000/posts

Slide 31

Slide 31 text

These Approaches Have Certain HTTP Overhead

Slide 32

Slide 32 text

So, Instead,

Slide 33

Slide 33 text

How About Benchmarking from Inside the Application Using Ruby?

Slide 34

Slide 34 text

The Default Benchmarking Library in Ruby The stdlib benchmark.rb

Slide 35

Slide 35 text

benchmark.rb Usage Benchmark.realtime do ... end

Slide 36

Slide 36 text

Which Part of the App Would We Measure? Maybe the controllers' `process_action` method?

Slide 37

Slide 37 text

A Very Rough Monkey-patch for Benchmarking `process_action` # config/initializers/perf.rb ActionController::Base.prepend( Module.new { def process_action(*) result = nil p Benchmark.realtime { result = super } result end } )

Slide 38

Slide 38 text

config/initializers/*.rb I usually draft this sort of quick monkey-patches in `config/ initializers/` No need to explicitly `require` No need to give a name Maybe nobody else does this, but works for me...

Slide 39

Slide 39 text

Now the App Prints Each Request Processing Time But `ab`ing with `-n 100` prints 100 lines What we want is a summary of 100 runs

Slide 40

Slide 40 text

How About Iterating Inside the Request Rather than making hundreds of requests?

Slide 41

Slide 41 text

Other Benchmarking Libraries for Doing This in Ruby benchmark-ips benchmark_driver

Slide 42

Slide 42 text

A Benchmarking Script that Iterates the `process_action` # config/initializers/perf.rb ActionController::Base.prepend( Module.new { def process_action(*) Benchmark.ips do |x| x.report('process_action') { super } end super end } )

Slide 43

Slide 43 text

Usage Start up the server and request from outside just once

Slide 44

Slide 44 text

Like This $ curl http:/ /localhost:3000/posts/

Slide 45

Slide 45 text

This Would Return Pure Execution Time of the Rails Land Alone

Slide 46

Slide 46 text

Result Warming up -------------------------------------- process_action 218.000 i/100ms Calculating ------------------------------------- process_action 2.321k (± 7.5%) i/s - 11.554k in 5.009905s

Slide 47

Slide 47 text

Or We Could Include the Rack Middleware Stack and `before_action` and Such # config/initializers/perf.rb Rails::Engine.prepend( Module.new { def call(*) Benchmark.ips do |x| x.report('call') { super } end super end } )

Slide 48

Slide 48 text

Result Warming up -------------------------------------- call 22.000 i/100ms Calculating ------------------------------------- call 240.911 (± 7.1%) i/s - 1.210k in 5.048689s

Slide 49

Slide 49 text

This Means That the Action Runs 240 Times per Second Seems like quite an accurate number than the time command one

Slide 50

Slide 50 text

Now We've Got the Basic Benchmarking Tool

Slide 51

Slide 51 text

This Approach Is Quite Flexible We can `Benchmark.ips` any part of the application for measuring its speed So we can
 "devide and conquer"

Slide 52

Slide 52 text

Benchmarking only the "M" in the Whole MVC Stack Benchmark.ips do |x| Model.find(...) end

Slide 53

Slide 53 text

Benchmarking only the "V" in the Whole MVC Stack Benchmark.ips do |x| render(...) end

Slide 54

Slide 54 text

How Can We Measure the "C"?

Slide 55

Slide 55 text

In Fact, We Can Create a Simple Action That Does Not find nor render

Slide 56

Slide 56 text

Route to /posts/ok # config/routes.rb Rails.application.routes.draw do - resources :posts + resources :posts do + collection do + get :ok + end + end # For details on the DSL available within this file, see https://guides.rubyonrails.org/routing.html end

Slide 57

Slide 57 text

The Controller Action # app/controllers/posts_controller.rb class PostsController < ApplicationController before_action :set_post, only: [:show, :edit, :update, :destroy] + def ok + head :ok + end + # GET /posts # GET /posts.json def index

Slide 58

Slide 58 text

These Results Tell Us That... Finding a record is slow Rendering it is also slow Request handling is slow as well

Slide 59

Slide 59 text

There's No Single Bottle Neck in This Simple App

Slide 60

Slide 60 text

Conclusion

Slide 61

Slide 61 text

Rails Is Slow Because Everything Is Slow!

Slide 62

Slide 62 text

Chapter 1 End

Slide 63

Slide 63 text

Chapter 2

Slide 64

Slide 64 text

Let's Make Everything Faster!

Slide 65

Slide 65 text

Why Rails Is Slow, and How Can We Make Rails Faster?

Slide 66

Slide 66 text

There Are Some "Numbers" That We Can Get from the Ruby Process Time elapsed Memory usage Method invocations System calls etc...

Slide 67

Slide 67 text

The Number of Object Allocations Today, we're going to focus on the number of Object allocations

Slide 68

Slide 68 text

Why Does the Number of Allocations Matters? Allocating Objects in Ruby is a very heavy process Plus, it causes GC pressure The number would not be affected by the environment

Slide 69

Slide 69 text

Actually, Rails 6 Shows This in the Log Completed 200 OK in 58ms (Views: 9.0ms | ActiveRecord: 2.6ms | Allocations: 7601)

Slide 70

Slide 70 text

How Can We Measure Object Allocations? $ ruby -e "c = GC.stat(:total_allocated_objects); 100.times { _ = 'hello' }; p GC.stat(:total_allocated_objects) - c" 100

Slide 71

Slide 71 text

Quite Simple, Right?

Slide 72

Slide 72 text

Number of Objects Allocated During a Rails Request # config/initializers/perf.rb Rails::Engine.prepend( Module.new { def call(*) c = GC.stat(:total_allocated_objects) result = super p GC.stat(:total_allocated_objects) - c result end } )

Slide 73

Slide 73 text

Result $ curl http:/ /localhost:3000/posts/ok #=> 691

Slide 74

Slide 74 text

Or If You'd Like a Little Bit More Detailed Report

Slide 75

Slide 75 text

Gems for Profiling Object Allocations & Memory Usage memory_profiler allocation_stats allocation_tracer etc.

Slide 76

Slide 76 text

bundle memory_profiler # Gemfile gem 'byebug', platforms: [:mri, :mingw, :x64_mingw] end gem 'benchmark-ips' +gem 'memory_profiler'

Slide 77

Slide 77 text

An Utility Method That Prints the Memory Usage module Prof def self.mem result = nil MemoryProfiler.report { result = yield }.pretty_print(retained_strings: 0, allocated_strings: 100, normalize_paths: true) result end end

Slide 78

Slide 78 text

Then Profile Rails::Engine.prepend( Module.new { def call(*) Prof.mem { super } end } )

Slide 79

Slide 79 text

curl the ok Action $ curl http:/ /localhost:3000/posts/ok (abandoning the first trial)

Slide 80

Slide 80 text

Result Total allocated: 63830 bytes (599 objects) Total retained: 21391 bytes (134 objects) allocated memory by gem ----------------------------------- 27104 actionpack-6.0.0/lib 22574 activesupport-6.0.0/lib 5718 rack-2.0.7/lib 1976 actionview-6.0.0/lib 1944 logger 1520 ipaddr 1157 railties-6.0.0/lib 677 securerandom 504 concurrent-ruby-1.1.5/lib 320 activerecord-6.0.0/lib 176 monitor 80 turbolinks-5.2.1 40 i18n-1.6.0/lib 40 uri allocated memory by file ----------------------------------- 7128 actionpack-6.0.0/lib/action_controller/renderer.rb 3702 rack-2.0.7/lib/rack/utils.rb 3624 actionpack-6.0.0/lib/action_dispatch/http/response.rb ...

Slide 81

Slide 81 text

It Turned Out That, a simple action that just returns a "200 OK" header creates 599 objects and consumes 63830 bytes of memory.

Slide 82

Slide 82 text

Isn't That Too Much?

Slide 83

Slide 83 text

Let's Try Reducing This As Much As Possible

Slide 84

Slide 84 text

The memory_profiler Result Indicates That, actionpack allocates the most Especially, ActionController::Renderer

Slide 85

Slide 85 text

But Who Creates the Renderer?

Slide 86

Slide 86 text

We Actually Aren't Rendering Anything! We're just returning the "200 OK" header

Slide 87

Slide 87 text

Let's See Who Is Instantiating AC::Renderer # actionpack/lib/action_controller/renderer.rb @@ -63,6 +63,7 @@ module ActionController # It will be merged with the default Rack environment defined by # +ActionController::Renderer::DEFAULTS+. def initialize(controller, env, defaults) + puts caller @controller = controller @defaults = defaults @env = normalize_keys defaults.merge(env)

Slide 88

Slide 88 text

Result .../actionpack-6.0.0/lib/action_controller/renderer.rb:54:in `new' .../actionpack-6.0.0/lib/action_controller/renderer.rb:54:in `new' .../actiontext-6.0.0/lib/action_text/engine.rb:50:in `block (3 levels) in ' .../activesupport-6.0.0/lib/active_support/callbacks.rb:429:in `instance_exec'

Slide 89

Slide 89 text

action_text/ engine.rb?? We aren't using action_text in this action!

Slide 90

Slide 90 text

An Initializer in ActionText::Engine # actiontext-6.0.0/lib/action_text/engine.rb initializer "action_text.renderer" do |app| app.executor.to_run { ActionText::Content.renderer = ApplicationController.renderer } app.executor.to_complete { ActionText::Content.renderer = ApplicationController.renderer } ActiveSupport.on_load(:action_text_content) do self.renderer = ApplicationController.renderer end ActiveSupport.on_load(:action_controller_base) do before_action { ActionText::Content.renderer = ApplicationController.renderer.new(request.env) } end end

Slide 91

Slide 91 text

So,

Slide 92

Slide 92 text

Action Text 6.0.0 Includes This Bug

Slide 93

Slide 93 text

When Action Text Is Bundled and Required, It Instantiates An AC::Renderer Before Each Action In Every Controller

Slide 94

Slide 94 text

The Solution

Slide 95

Slide 95 text

Skip Requiring actiontext/ railtie (unless you use it) # config/application.rb require_relative 'boot' -require 'rails/all' +%w( + active_record/railtie + active_storage/engine + action_controller/railtie + action_view/railtie + action_mailer/railtie + active_job/railtie + action_cable/engine + action_mailbox/engine + rails/test_unit/railtie + sprockets/railtie +).each do |railtie| + begin + require railtie + rescue LoadError + end +end

Slide 96

Slide 96 text

Then Let's Measure the Memory Usage Again

Slide 97

Slide 97 text

Result Total allocated: 63830 bytes (599 objects) ! Total allocated: 55560 bytes (574 objects)

Slide 98

Slide 98 text

The Memory Usage Improves 8000+ Bytes

Slide 99

Slide 99 text

I Filed This Issue https:/ /github.com/rails/ rails/issues/36963

Slide 100

Slide 100 text

So, This Would Be Fixed in 6.0.1, Hopefully

Slide 101

Slide 101 text

What If We Skip Requiring Some More Other Components That We Don't Use Here? # config/application.rb @@ -2,13 +2,10 @@ require_relative 'boot' %w( active_record/railtie - active_storage/engine action_controller/railtie action_view/railtie action_mailer/railtie active_job/railtie - action_cable/engine - action_mailbox/engine

Slide 102

Slide 102 text

Result Total allocated: 55560 bytes (574 objects) ↓ Total allocated: 55560 bytes (574 objects)

Slide 103

Slide 103 text

Nothing Changed!

Slide 104

Slide 104 text

Benchmark Result for Each Action (Allocation) ok: 55560 bytes (574 objects) show: 127210 bytes (1301 objects) new 173108 bytes (1657 objects) index: 597728 bytes (6333 objects)

Slide 105

Slide 105 text

Now We Got the Numbers We need such numbers to see how we're optimizing things

Slide 106

Slide 106 text

So, Let's Start Optimizing Rails

Slide 107

Slide 107 text

Now We Have a List of Files that Consume Memory allocated memory by file ----------------------------------- 3702 rack-2.0.7/lib/rack/utils.rb 3624 actionpack-6.0.0/lib/action_dispatch/http/response.rb 2944 activesupport-6.0.0/lib/active_support/callbacks.rb 2728 activesupport-6.0.0/lib/active_support/ hash_with_indifferent_access.rb 2720 actionpack-6.0.0/lib/action_dispatch/middleware/cookies.rb 2144 activesupport-6.0.0/lib/active_support/messages/rotator.rb 1944 ruby/lib/logger.rb 1856 activesupport-6.0.0/lib/active_support/parameter_filter.rb 1680 activesupport-6.0.0/lib/active_support/inflector/methods.rb 1520 ruby/lib/ipaddr.rb 1480 actionpack-6.0.0/lib/action_dispatch/journey/router.rb 1280 actionpack-6.0.0/lib/action_controller/metal/ instrumentation.rb

Slide 108

Slide 108 text

rack, actionpack, activesupport, ruby...

Slide 109

Slide 109 text

These Things Are Part of Your Application

Slide 110

Slide 110 text

You Can Make Your Applications Fast by Making Rails Fast

Slide 111

Slide 111 text

Rails Is Just a Ruby Program That You Can Patch

Slide 112

Slide 112 text

First, Set Up gem-src

Slide 113

Slide 113 text

Then Re-install rails

Slide 114

Slide 114 text

gem-src Will Turn These Gems Into Git Repos

Slide 115

Slide 115 text

And Now You're Ready to Commit Your Patches $ cd (bundle show actionpack) % cd `bundle show actionpack` (zsh)

Slide 116

Slide 116 text

Just Read These Files in the Memory Usage List

Slide 117

Slide 117 text

And Try to Reduce Object Creations as Much as Possible

Slide 118

Slide 118 text

For Example,

Slide 119

Slide 119 text

Regexp#=~ vs Regexp#match?

Slide 120

Slide 120 text

Regexp#=~ $ ruby -rmemory_profiler -e "s = 'ABC'; MemoryProfiler.report { s =~ / A/ }.pretty_print" Total allocated: 208 bytes (2 objects) allocated memory by location ----------------------------------- 208 -e:1 allocated memory by class ----------------------------------- 168 MatchData 40 String allocated objects by class ----------------------------------- 1 MatchData 1 String

Slide 121

Slide 121 text

Regexp#match? $ ruby -rmemory_profiler -e "s = 'ABC'; MemoryProfiler.report { s.match?(/A/) }.pretty_print" Total allocated: 0 bytes (0 objects) Total retained: 0 bytes (0 objects) allocated memory by gem ----------------------------------- NO DATA

Slide 122

Slide 122 text

match? Is Way More Efficient Where We Don't Need the MatchData Instance

Slide 123

Slide 123 text

The Patch

Slide 124

Slide 124 text

Hash#merge vs Hash#[]=

Slide 125

Slide 125 text

Hash#merge $ ruby -rmemory_profiler -e "h1, h2, h3 = {a: 1}, {a: 2}, {a: 3}; MemoryProfiler.report { h1.merge(h2.merge(h3)) }.pretty_print" Total allocated: 336 bytes (2 objects) allocated memory by class ----------------------------------- 336 Hash allocated objects by class ----------------------------------- 2 Hash

Slide 126

Slide 126 text

Hash#[]= $ ruby -rmemory_profiler -e "h1, h2 = {a: 1}, {a: 2}; MemoryProfiler.report { h = h1.merge(h2); h[:a] = 3; h }.pretty_print" Total allocated: 168 bytes (1 objects) allocated memory by class ----------------------------------- 168 Hash allocated objects by class ----------------------------------- 1 Hash

Slide 127

Slide 127 text

We'd Better Avoid Unnecessary Hash#merge

Slide 128

Slide 128 text

The Patch

Slide 129

Slide 129 text

Hash#fetch vs (Hash#[] || default)

Slide 130

Slide 130 text

Hash#fetch Taking Two Arguments $ ruby -rmemory_profiler -e "h = {x: 1}; MemoryProfiler.report { h.fetch(:x, 'hello') }.pretty_print" Total allocated: 40 bytes (1 objects) allocated memory by class ----------------------------------- 40 String allocated objects by class ----------------------------------- 1 String

Slide 131

Slide 131 text

Hash#[] || default $ ruby -rmemory_profiler -e "h = {x: 1}; MemoryProfiler.report { h[:x] || 'hello' }.pretty_print" Total allocated: 0 bytes (0 objects) allocated memory by class ----------------------------------- NO DATA

Slide 132

Slide 132 text

The Patch

Slide 133

Slide 133 text

I Pushed About 60 Micro- Optimization Patches Like These to Rails Since the 6.0.0 Release Since I decided to talk about Rails performance in this conference And I think I could reduce
 5ʙ10% memory consumption

Slide 134

Slide 134 text

I Tried to cherry-pick These Commits to 6.0.0 to Show You the Result

Slide 135

Slide 135 text

But I Gave Up We had too many conflicts

Slide 136

Slide 136 text

So Instead, Let's Bundle Edge Rails and See How Things Are Improved in the Current master # Gemfile -gem 'rails', '~> 6.0.0' +gem 'rails', path: 'PATH/TO/RAILS'

Slide 137

Slide 137 text

Result (posts/ok) 55560 bytes (574 objects) ↓ 54995 bytes (555 objects)

Slide 138

Slide 138 text

No content

Slide 139

Slide 139 text

The Improvement Was Indeed Only 1% Since 6.0.0 1% of the memory usage 3% of the number of the allocated objects

Slide 140

Slide 140 text

Maybe Because We've Been Adding Other Heavy Stuffs to master?

Slide 141

Slide 141 text

What About Other Actions?

Slide 142

Slide 142 text

Result: posts/show 127931 bytes (1305 objects) ↓ 124773 bytes (1271 objects)

Slide 143

Slide 143 text

BTW Why Does posts/show Consume 70K More Bytes Than posts/ok?

Slide 144

Slide 144 text

Because It Uses Whole M + V + C Stack Where posts/ok uses only "C"

Slide 145

Slide 145 text

posts/show = posts/ok + find + render

Slide 146

Slide 146 text

In Order to Divide the Problem,

Slide 147

Slide 147 text

It'd Be Better to Measure M and V Separately

Slide 148

Slide 148 text

Measuring Active Record Finder Alone # app/controllers/posts_controller.rb private # Use callbacks to share common setup or constraints between actions. def set_post - @post = Post.find(params[:id]) + Prof.mem do + @post = Post.find(params[:id]) + end end

Slide 149

Slide 149 text

Measuring Action View Renderer Alone # app/controllers/posts_controller.rb def show + Prof.mem { default_render } end

Slide 150

Slide 150 text

Action Controller's default_render Method A method that is called when no explicit render was called

Slide 151

Slide 151 text

So Easy, Right?

Slide 152

Slide 152 text

Result: default_render 40968 bytes (452 objects) out of 124773 bytes (1271 objects)

Slide 153

Slide 153 text

Approx. 1/3

Slide 154

Slide 154 text

But Why Does Action View Consume 40KB Memory for Composing An 1.5KB HTML?

Slide 155

Slide 155 text

According to the Profiling Results

Slide 156

Slide 156 text

Things That Consume Memory in Action View Strings and String-ish Objects Template engines Template lookup I18n UrlHelper

Slide 157

Slide 157 text

What Are the String-ish Objects? SafeBuffer object Multibyte::Chars object

Slide 158

Slide 158 text

ActiveSupport::SafeBuffer A variety of String class that can distinguish "safe" string and "unsafe" string Inherits String class to hold a flag as an ivar inside

Slide 159

Slide 159 text

Can't We Use the taint Flag for This? I tried, but it didn't work well Because we cannot create "tainted by default" Strings

Slide 160

Slide 160 text

BTW Who Here Knows What "the taint" Flag Is? Who actually have ever used it?

Slide 161

Slide 161 text

It's a Flag That Every Ruby Object Has But Nobody Uses

Slide 162

Slide 162 text

So, the Ruby Core Team Just Decided to Remove the Flag Around Ruby 3.0

Slide 163

Slide 163 text

ActiveSupport::SafeBuffer We may be able to reduce so much garbage in the views if we can improve this Maybe we can create some kind of extension (that reuses the flag bit in String?) in the future

Slide 164

Slide 164 text

Template Engines Template Engines tend to create so many garbage Strings while composing an HTML body String

Slide 165

Slide 165 text

Why Don't We Try "The Fastest Template Engine in the World" Here?

Slide 166

Slide 166 text

You Know What?

Slide 167

Slide 167 text

It's Called "string_template" https:/ /github.com/ amatsuda/string_template

Slide 168

Slide 168 text

Its Mark Up Syntax Is Ruby's String Interpolation That Everybody Knows It's simple

Slide 169

Slide 169 text

It Doesn't Internally Create the Heavy SafeBuffer Objects And fast

Slide 170

Slide 170 text

But Has the Tradeoff It has no auto HTML safety guard We need to manually call `h()` just like we used to do before Rails 3

Slide 171

Slide 171 text

Stringification -

<%= notice %>

+

#{h notice }

Title: - <%= @post.title %> + #{h @post.title }

-<%= link_to 'Edit', edit_post_path(@post) %> | -<%= link_to 'Back', posts_path %> +#{ link_to 'Edit', edit_post_path(@post) } | +#{ link_to 'Back', posts_path }

Slide 172

Slide 172 text

Result: After Stringifying the Templates 40968 bytes (452 objects) ↓ 40467 bytes (454 objects)

Slide 173

Slide 173 text

Memory Usage Decreased, But Allocated Objects Increased...! Because string_template
 could create some extra String / SafeBuffer objects when concatenating strings with explicit `h()` call

Slide 174

Slide 174 text

string_template Maybe efficient in some particular cases Inside heavy loops With lots of string interpolations That needs no HTML escaping But Action View is not generally optimized for this template engine

Slide 175

Slide 175 text

Now, Let's Take a Look at the "new" Action

Slide 176

Slide 176 text

Result: posts/new 165836 bytes (1613 objects)

Slide 177

Slide 177 text

Who Consumes That Much Memory? allocated memory by file ----------------------------------- 8088 i18n-1.6.0/lib/i18n/interpolate/ruby.rb 7272 i18n-1.6.0/lib/i18n/backend/base.rb 6396 actionview/lib/action_view/helpers/tag_helper.rb 5088 actionview/lib/action_view/lookup_context.rb 5060 rack-2.0.7/lib/rack/utils.rb 4944 activesupport/lib/active_support/core_ext/string/output_safety.rb 4552 actionview/lib/action_view/helpers/form_helper.rb

Slide 178

Slide 178 text

Another Newcomer "I18n" Is the Top Memory Consumer Here!

Slide 179

Slide 179 text

I18n.t(:hello) => "͜Μʹͪ͸" A very simple function that looks up the dictionary with a key, and returns a String

Slide 180

Slide 180 text

Why Is I18n.t Heavy? Is it doing something more than just a Hash lookup?

Slide 181

Slide 181 text

Let's See What's Happening $ env RAILS_ENV=production rails r 'I18n.t(:hello); Prof.mem { I18n.t :hello }'

Slide 182

Slide 182 text

Result Total allocated: 1336 bytes (9 objects) allocated memory by class ----------------------------------- 928 Hash 368 Array 40 String allocated objects by class ----------------------------------- 4 Array 4 Hash 1 String

Slide 183

Slide 183 text

It Turned Out That,

Slide 184

Slide 184 text

I18n Creates Some Hash and Array Objects Per Each I18n.t Invocation Especially for manipulating given keys and options

Slide 185

Slide 185 text

How Can We Reduce These Object Creations?

Slide 186

Slide 186 text

The Solution Wrap I18n.t and cache

Slide 187

Slide 187 text

Like This? module I18nCache def self.extended(obj) obj.instance_variable_set :@i18n_cache, {} end def translate(*args, **options) cache = @i18n_cache[I18n.locale] ||= {} cache_key = args << options cache[cache_key] || begin result = super cache[cache_key] = result if result result ennd alias t translate end I18n.extend I18nCache

Slide 188

Slide 188 text

This Monkey-Patch May Not Be Production-Ready Yet I'm aware that we at least need to handle `:count` option There may be some more corner cases If this approach is too aggressive, maybe we could monkey-patch `Backend#lookup`

Slide 189

Slide 189 text

Anyway, Now I18n.t Became Much Cheaper $ env RAILS_ENV=production rails r 'I18n.t(:hello); Prof.mem { I18n.t :hello }' Total allocated: 272 bytes (2 objects) allocated memory by class ----------------------------------- 232 Hash 40 Array allocated objects by class ----------------------------------- 1 Array 1 Hash

Slide 190

Slide 190 text

Result: I18n.t(:hello) Total allocated: 1336 bytes (9 objects) ↓ Total allocated: 272 bytes (2 objects)

Slide 191

Slide 191 text

Applying This Monkey- Patch to the Rails App

Slide 192

Slide 192 text

Result: posts/new Memory Usage 165836 bytes (1613 objects) ↓ 143306 bytes (1476 objects)

Slide 193

Slide 193 text

Next, Let's Investigate the index Action Which usually becomes the main hotspot in the app

Slide 194

Slide 194 text

Result: 50 "Hello" Records Total allocated: 580397 bytes (6170 objects)

Slide 195

Slide 195 text

It Consumes This Much Memory for Fetching and Showing
 "Hello" 50 Times

Slide 196

Slide 196 text

Allocated Objects by Class allocated objects by class ----------------------------------- 3235 String 1300 Array 1019 Hash 162 ActiveSupport::SafeBuffer 100 ActiveModel::Attribute::FromDatabase 50 ActiveModel::AttributeSet 50 ActiveModel::LazyAttributeHash 50 Post

Slide 197

Slide 197 text

ActiveModel::Attribute::FromDatab ase, ActiveModel::AttributeSet, ActiveModel::LazyAttributeHash These are created through the Attributes API

Slide 198

Slide 198 text

So Here's a Super Monkey-Patch That Could Reduce These Objects https:/ /github.com/ amatsuda/ lightweight_attributes

Slide 199

Slide 199 text

lightweight_attributes' Approach AR model has too many features But in most cases, what we need is "data transfer object" instances that are "read only" Such objects can be made more lightweight by dropping some heavy features e.g. dirty tracking, type casting

Slide 200

Slide 200 text

How to Use lightweight_attributes Bundle the gem. Done.

Slide 201

Slide 201 text

Allocated Objects by Class: with lightweight_attributes allocated objects by class ----------------------------------- 3187 String 1293 Array 996 Hash 162 ActiveSupport::SafeBuffer 52 Class 50 LightweightAttributes::AttributeSet 50 Post

Slide 202

Slide 202 text

Allocated Objects by Class: Before => After 3235 String 1300 Array 1019 Hash 162 ActiveSupport::SafeBuffer 100 ActiveModel::Attribute::FromDatabase 50 ActiveModel::AttributeSet 50 ActiveModel::LazyAttributeHash 50 Post ! 3187 String 1293 Array 996 Hash 162 ActiveSupport::SafeBuffer 52 Class 50 LightweightAttributes::AttributeSet 50 Post

Slide 203

Slide 203 text

Yay! We Could Reduce So Many Objects!!

Slide 204

Slide 204 text

Result: Total Allocated with lightweight_attributes Total allocated: 580397 bytes (6170 objects) ↓ Total allocated: 586565 bytes (5981 objects)

Slide 205

Slide 205 text

Allocated Objects Decreased, But Memory Usage Increased...!

Slide 206

Slide 206 text

Probably Because We're Not Accessing All Fields in This Example The default implementation lazily type casts The lightweight_attributes version immediately type casts

Slide 207

Slide 207 text

Anyway,

Slide 208

Slide 208 text

What Consumes Memory in the index Action? allocated memory by location ----------------------------------- 35032 actionpack/lib/action_controller/metal/url_for.rb:42 35032 actionpack/lib/action_dispatch/routing/route_set.rb:199 35032 actionpack/lib/action_dispatch/routing/route_set.rb:231 27200 rubyconfid/app/views/posts/index.html.erb:19 26000 lightweight_attributes/lib/lightweight_attributes/attribute_set.rb:17 25587 activesupport/lib/active_support/core_ext/string/output_safety.rb:175 25368 actionpack/lib/action_dispatch/journey/visitors.rb:49 23432 actionview/lib/action_view/helpers/url_helper.rb:683 18920 actionpack/lib/action_dispatch/journey/visitors.rb:38

Slide 209

Slide 209 text

Mostly from UrlHelper

Slide 210

Slide 210 text

Improving UrlHelper Would Greatly Improve the index Action Performance We need your ideas!

Slide 211

Slide 211 text

Another Thing That Is Allocated Per Each Action and Can Be Cached

Slide 212

Slide 212 text

ActiveSupport Inflector post => posts post => Post etc.

Slide 213

Slide 213 text

But We Cannot Just Cache Every Inflector Calls That'll cause a memory bloat vulnerability

Slide 214

Slide 214 text

What If We Cache Only What's Included in the DB Schema?

Slide 215

Slide 215 text

The Monkey-Patches for Inflector def camelize(term, uppercase_first_letter = true) return super unless uppercase_first_letter if (cached = InflectorCache.camelize_cache[term]) cached.dup else super ennd def underscore(camel_cased_word) if (cached = InflectorCache.underscore_cache[camel_cased_word]) cached.dup else super ennd

Slide 216

Slide 216 text

And Another Monkey-Patch That Creates the Cache from Table Name module SchemaStatementsExtension def columns(table_name) result = super unless InflectorCache.camelize_cache[table_name] InflectorCache.camelize_cache[table_name] = table_name.camelize end unless InflectorCache.underscore_cache[table_name] InflectorCache.underscore_cache[table_name] = table_name.underscor end result ennd ActiveRecord::ConnectionAdapters::AbstractAdapter.prepend SchemaStatementsExtension

Slide 217

Slide 217 text

Result: posts/ok with Inflector Cache 50569 bytes (533 objects) => 49235 bytes (520 objects)

Slide 218

Slide 218 text

Yet Another Thing That Can Be Cached

Slide 219

Slide 219 text

Action View Template Lookup

Slide 220

Slide 220 text

There Are Three Kinds of Template Lookup Layout Main tamplate Partial

Slide 221

Slide 221 text

Caching Layout Resolution `:layout` per action is almost always non dynamic So it can be cached

Slide 222

Slide 222 text

The Monkey-Patch Can Be Something Like This module CachedLayout private def _layout(lookup_context, formats) cache = self.class.instance_variable_get :@_layout_cache implied_layout_name = self.class.send(:_implied_layout_name) prefixes = /\blayouts/.match?(implied_layout_name) ? [] : ["layouts"] cache[[implied_layout_name, prefixes, formats]] ||= super ennd

Slide 223

Slide 223 text

Only if Given `:layout` Is `nil` or String def _write_layout_method # :nodoc: super if _layout.nil? || (String === _layout) @_layout_cache = {} prepend CachedLayout ennd

Slide 224

Slide 224 text

Result: Layout Cache 116586 bytes (1227 objects) ↓ 113323 bytes (1189 objects)

Slide 225

Slide 225 text

Before Layout Cache allocated memory by file ----------------------------------- 4916 rack-2.0.7/lib/rack/utils.rb 3839 actionpack/lib/action_dispatch/middleware/static.rb 3821 ruby/lib/logger.rb 3800 actionpack/lib/action_dispatch/http/response.rb 3760 actionview/lib/action_view/lookup_context.rb

Slide 226

Slide 226 text

After Layout Cache allocated memory by file ----------------------------------- 4884 rack-2.0.7/lib/rack/utils.rb 3839 actionpack/lib/action_dispatch/middleware/static.rb 3821 ruby/lib/logger.rb 3800 actionpack/lib/action_dispatch/http/response.rb 3688 activesupport/lib/active_support/callbacks.rb 3552 actionpack/lib/action_dispatch/middleware/cookies.rb 3504 activesupport/lib/active_support/hash_with_indifferent_access.rb 3465 activesupport/lib/active_support/json/encoding.rb 3458 actionview/lib/action_view/helpers/tag_helper.rb 3311 activesupport/lib/active_support/message_encryptor.rb 2851 activesupport/lib/active_support/core_ext/string/output_safety.rb 2384 activesupport/lib/active_support/messages/rotator.rb 2344 activesupport/lib/active_support/core_ext/hash/keys.rb 2299 rubyconfid/app/views/layouts/application.html.string 2198 activesupport/lib/active_support/core_ext/object/json.rb 2080 actionview/lib/action_view/lookup_context.rb

Slide 227

Slide 227 text

For Caching Partials Here's an experimental repository https:/ /github.com/ amatsuda/turbo_partial

Slide 228

Slide 228 text

turbo_partial `render_partial` takes many parameters So it can't simply be cached turbo_partial adds `render_relative` and `render_absolute` features to `render_partial`

Slide 229

Slide 229 text

Rendering with turbo_partial - render 'form', post: @post + render './form', post: @post

Slide 230

Slide 230 text

I'm Skipping the Result Because It Wasn't That Much Efficient in This Case...

Slide 231

Slide 231 text

Anyway,

Slide 232

Slide 232 text

Who Are Allocating Objects Now? (post/ok) allocated memory by location ----------------------------------- 3451 actionpack/lib/action_dispatch/middleware/static.rb:95 1347 ruby/lib/logger.rb:597 1288 actionpack/lib/action_dispatch/http/response.rb:437 1160 activesupport/lib/active_support/notifications/instrumenter.rb:67 1152 activesupport/lib/active_support/callbacks.rb:405

Slide 233

Slide 233 text

Logging Creates so many String objects that are usually much longer than the response body

Slide 234

Slide 234 text

ActiveSupport::
 Instrumentation Creates large payload Hash objects and `dup`s them

Slide 235

Slide 235 text

ActiveSupport::Callbacks Creates many Procs and things

Slide 236

Slide 236 text

And What Is This Binding Thing? allocated memory by class ----------------------------------- 14440 Hash 9987 String 9688 Array 2520 MatchData 1728 ActionDispatch::Request 1608 Thread::Backtrace 1160 ActiveSupport::HashWithIndifferentAccess 1040 Proc 944 Rack::Utils::HeaderHash 912 Class 512 Regexp 352 Binding

Slide 237

Slide 237 text

It's Created Through `binding.local_variable_get`

Slide 238

Slide 238 text

binding.local_variable_ get A magical idiom to get a local variable that has a name that conflicts with reserved words

Slide 239

Slide 239 text

This Doesn't Usually Happen, But Keyword Arguments Can Create This def do_something(if: nil, unless: nil) if if ... end if unless ... ennd

Slide 240

Slide 240 text

Which Causes SyntaxError

Slide 241

Slide 241 text

I Found This When I Was Trying to Introduce Keyword Arguments in Rails

Slide 242

Slide 242 text

I Asked @ko1 How to Avoid This SyntaxError

Slide 243

Slide 243 text

Then He Introduced This New Magic

Slide 244

Slide 244 text

binding.local_variable_ get def do_something(if: nil, unless: nil) if binding.local_variable_get(:if) ... end if binding.local_variable_get(:unless) ... end end

Slide 245

Slide 245 text

Rails Started Using This in ActiveSupport::HashWithIndifferen tAccess Since 6.0

Slide 246

Slide 246 text

But It Turned Out That It Allocates a Binding Object I hope we can fix this in Ruby 2.7

Slide 247

Slide 247 text

HashWithIndifferentAc cess A variety of Hash that treats Symbol keys and String keys as the same things Converts all Symbol keys to String internally Abused too much in the Rails internal

Slide 248

Slide 248 text

Maybe We Can Reduce HashWithIndifferentAccess By defaulting this to be Symbol based Thanks to Symbol GC from @nari3 Or by implementing this in C

Slide 249

Slide 249 text

Maybe We Can Reduce HashWithIndifferentAccess By introducing more keyword arguments

Slide 250

Slide 250 text

Another Thing That Is Making This Benchmark Slow Thread synchronization

Slide 251

Slide 251 text

Current Rails Codebase Synchronizes Threads Too Defensively Because not all supported Ruby implementations has GIL (GVL)

Slide 252

Slide 252 text

Monitor#synchronize posts/show calls this 25 times per each request We could skip some of these in MRI

Slide 253

Slide 253 text

Ruby 2.6.4 or Current Master `Monitor#synchronize` creates some extra Hash objects

Slide 254

Slide 254 text

Today's Benchmark Result Becomes Like This allocated memory by file ----------------------------------- 7472 ruby/lib/monitor.rb 4908 rack-2.0.7/lib/rack/utils.rb 3839 actionpack/lib/action_dispatch/middleware/static.rb 3821 ruby/lib/logger.rb 3800 actionpack/lib/action_dispatch/http/response.rb 3688 activesupport/lib/active_support/callbacks.rb 3552 actionpack/lib/action_dispatch/middleware/cookies.rb

Slide 255

Slide 255 text

We Need to Fix This in Ruby 2.7

Slide 256

Slide 256 text

Summary

Slide 257

Slide 257 text

posts/ok 55560 bytes (574 objects) ↓ 50299 bytes (531 objects) (-9.5%)

Slide 258

Slide 258 text

show 127210 bytes (1301 objects) ↓ 113047 bytes (1192 objects) (-11.1%)

Slide 259

Slide 259 text

new 173108 bytes (1657 objects) ↓ 139605 bytes (1435 objects) (-19.4%)

Slide 260

Slide 260 text

index 597728 bytes (6333 objects) ↓ 569575 bytes (6092 objects) (-4.7%)

Slide 261

Slide 261 text

Future Works More patches on Cookies, Instrumentation, UrlHelper, etc. Publish these monkey-patches as a gem

Slide 262

Slide 262 text

Actually I Baked Some of These Silly Monkey-Patches into a Gem https:/ /github.com/ amatsuda/speed_king

Slide 263

Slide 263 text

Wrap Up Rails is a huge software that has no obvious bottleneck But we can somehow improve it by micro- optimizations I already pushed so many patches in Rails master, so Rails 6.1 has to be faster than 6.0 After such micro-optimizations, today's benchmark highlighted some performance hotspots

Slide 264

Slide 264 text

Wrap Up (2) Try speed_king gem (not released yet) But don't in production...

Slide 265

Slide 265 text

Wrap up (3) Rails is such a fun thing to play with! Performance tuning is like a game aiming for a high-score Everybody can play with it for free! Since it's open sourced! I'm waiting for you to join this game!

Slide 266

Slide 266 text

end