Slide 1

Slide 1 text

Turbo Boosting Real-world Applications Akira Matsuda

Slide 2

Slide 2 text

Turbo Boosting

Slide 3

Slide 3 text

Real-world Applications

Slide 4

Slide 4 text

Question

Slide 5

Slide 5 text

Is Your Application Fast Enough?

Slide 6

Slide 6 text

My Answer No. My app is not.

Slide 7

Slide 7 text

When We Started a Project with a Simple Scaffold, It Wasn't That Slow

Slide 8

Slide 8 text

But Our Production App Today Is Slow I guess this applies to any and all Rails applications

Slide 9

Slide 9 text

Is That Essentially Because Ruby Is Slow? I don't think so

Slide 10

Slide 10 text

Ruby Is Already Doing Very Well Even if we completely disable Ruby GC, we don't actually get that much performance gain Freezing Strings in your application code may not solve the performance problem

Slide 11

Slide 11 text

The Real Problem Lies in the Framework Architecture And some very slow components inside the framework

Slide 12

Slide 12 text

Typical Performance Diagram
 (taken from https://www.skylight.io/)

Slide 13

Slide 13 text

These Are All Serially Executed in the Main Thread For example, while querying the DB, Ruby is doing nothing. Just waiting.

Slide 14

Slide 14 text

In Other Words, These Are All Blocking Operations

Slide 15

Slide 15 text

What If We Can Perform Them Without Blocking the Main Thread?

Slide 16

Slide 16 text

In Parallel?

Slide 17

Slide 17 text

Non-blocking?

Slide 18

Slide 18 text

Menu Turbo Boosting External API Calls Turbo Boosting DB Queries Turbo Boosting Partial Renderings Turbo Boosting Lazy Attributes Turbo Boosting Named Urls

Slide 19

Slide 19 text

Turbo Boosting External API Calls

Slide 20

Slide 20 text

Turbo Boosting External API Calls Let's start with the easiest one

Slide 21

Slide 21 text

API Calls Typically via HTTP Actually call some outside APIs Or microservices

Slide 22

Slide 22 text

"Microservices" Microservices will not solve your performance problem It can be a solution for your scalability problem It would rather add some extra network overhead on your app

Slide 23

Slide 23 text

Problem Calling external APIs makes your application slow

Slide 24

Slide 24 text

While Waiting for the HTTP Response The API call blocks the main thread The CPU does nothing while waiting for the response

Slide 25

Slide 25 text

Can We Make This Non-blocking? By doing the work in a background thread?

Slide 26

Slide 26 text

Example

Slide 27

Slide 27 text

Example The client has to call a heavy API 3 times Each API call takes 1 second

Slide 28

Slide 28 text

The API
# Sleeps 1 second and says 'Hello'
% rackup -b "run ->(e) { sleep 1; [200, {}, ['Hello']] }"

Slide 29

Slide 29 text

The Client Code
% ruby -rhttpclient -e "t = Time.now; 3.times { p HTTPClient.new.get('http://localhost:9292/').content }; p Time.now - t"

Slide 30

Slide 30 text

Result
% ruby -rhttpclient -e "t = Time.now; 3.times { p HTTPClient.new.get('http://localhost:9292/').content }; p Time.now - t"
#=> This takes 3 seconds

Slide 31

Slide 31 text

Solution

Slide 32

Slide 32 text

Using Threads
% ruby -rhttpclient -e "t = Time.now; 3.times.map { Thread.new { HTTPClient.new.get('http://localhost:9292/') } }.each {|t| p t.value.content }; p Time.now - t"

Slide 33

Slide 33 text

Using Threads
% ruby -rhttpclient -e "t = Time.now; 3.times.map { Thread.new { HTTPClient.new.get('http://localhost:9292/') } }.each {|t| p t.value.content }; p Time.now - t"
#=> This finishes in 1 second!

Slide 34

Slide 34 text

"Future Pattern" Thread.new { (do something) }.value Thread#value waits for the block to finish (internally with Thread#join) You can do anything else in the main thread while other threads are running

Slide 35

Slide 35 text

"Future Pattern" Usually we wrap this Thread with a "future object"

Slide 36

Slide 36 text

"Future Pattern" future = Future.execute { some_background_tasks } do_some_heavy_tasks_in_the_main_thread value = future.value # join the background thread

Slide 37

Slide 37 text

Turbo Boosting External API Calls Using Threads Push an I/O blocking task to a child Thread The main thread can do some other heavy tasks I know the reality is not that simple For example, in many cases, you will be caching some results in the client side. In such case, you need to synchronize the threads before caching But anyway, think about using threads. This is the basic idea
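The idea on this slide can be sketched with nothing but the standard library. This is a minimal sketch, not the code of any particular gem: the `Future` class, the `call_api` stand-in (a `sleep` instead of a real HTTP request), and the `cache` Hash are all assumptions made for illustration. It pushes three slow calls to child threads and serializes only the shared cache write with a Mutex.

```ruby
# Hypothetical Future wrapper around Thread (illustration only, not a library API).
class Future
  def self.execute(&block)
    new(&block)
  end

  def initialize(&block)
    @thread = Thread.new(&block) # starts running immediately in the background
  end

  # Joins the background thread and returns the block's result.
  # Thread#value also re-raises an exception raised inside the thread.
  def value
    @thread.value
  end
end

# Stand-in for a slow external API call (an assumption for this sketch).
def call_api(key)
  sleep 0.1
  "response for #{key}"
end

cache = {}
lock  = Mutex.new

started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
futures = %w[a b c].map do |key|
  Future.execute do
    body = call_api(key)                   # the slow I/O runs outside the lock
    lock.synchronize { cache[key] = body } # only the shared write is serialized
    body
  end
end
results = futures.map(&:value)             # join all three background threads
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started

puts results
puts "elapsed: #{elapsed.round(2)}s"
```

Because the three calls overlap, the elapsed time should be close to one call rather than three. Keeping the slow call outside `synchronize` matters: holding the lock during the I/O would serialize the threads again.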

Slide 38

Slide 38 text

Turbo Boosting DB Queries

Slide 39

Slide 39 text

DB Queries Are So Time Consuming Obviously, the most time-consuming tasks in most real-world Rails apps It's essentially just another kind of I/O blocking task

Slide 40

Slide 40 text

While Active Record Is Waiting for the DB Server Response, the Ruby Process Is Doing Nothing!

Slide 41

Slide 41 text

How AR Deals with Connections AR pools the DB connections Each HTTP request kicks one Ruby Thread (or Process) in the app server AR checks out a connection from the pool for each Thread

Slide 42

Slide 42 text

So, One Request Uses Only One Connection, Although There Are So Many More Pooled Connections

Slide 43

Slide 43 text

Problem

Slide 44

Slide 44 text

DB Query Blocks the Main Thread When you throw a query to the DB, you need to wait until you get the results back

Slide 45

Slide 45 text

Solution

Slide 46

Slide 46 text

Querying in a Child Thread Maybe we can apply the same pattern as in the API case?

Slide 47

Slide 47 text

Example

Slide 48

Slide 48 text

A Very Heavy Finder Query
class User < ApplicationRecord
  def self.heavy_find(id)
    select('*, sleep(id)').where(id: id).first
  end
end

Slide 49

Slide 49 text

Takes 3 Seconds for heavy_finding User 1 and 2
% rails r "User.first; p Benchmark.realtime { p User.heavy_find(1).name, User.heavy_find(2).name }"
"user 1"
"user 2"
3.129794000182301

Slide 50

Slide 50 text

We Can Do This in 2 Seconds Using Threads!
% rails r "User.first; p Benchmark.realtime { t1 = Thread.new { User.heavy_find(1).name }; t2 = Thread.new { User.heavy_find(2).name }; p t1.value, t2.value }"
"user 1"
"user 2"
2.0408139997161925

Slide 51

Slide 51 text

This Is Great! Why Doesn't Active Record Do This by Default?

Slide 52

Slide 52 text

Problem with This Approach Each Thread automatically establishes a new connection You'd better use with_connection to explicitly check out and release a connection in a Thread User.connection.pool.with_connection { ... }
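The check-out-and-release discipline described above can be illustrated without a database. The sketch below is an assumption-laden stand-in: `TinyPool` is a hypothetical class built on the standard library's `Queue`, not Active Record's connection pool, and the "connections" are plain strings. It only shows why wrapping the work in a block with `ensure` keeps a thread from leaking its connection.

```ruby
# TinyPool is a hypothetical stand-in for a connection pool,
# not Active Record's implementation.
class TinyPool
  def initialize(size)
    @available = Queue.new
    size.times { |i| @available << "conn-#{i}" }
  end

  # Checks a connection out, yields it, and always returns it to the pool,
  # even if the block raises.
  def with_connection
    conn = @available.pop # blocks while every connection is checked out
    yield conn
  ensure
    @available << conn if conn
  end

  def idle
    @available.size
  end
end

pool = TinyPool.new(2)

# Four "queries" compete for two connections; each thread releases its own.
results = 4.times.map { |i|
  Thread.new { pool.with_connection { |conn| sleep 0.05; "query #{i} via #{conn}" } }
}.map(&:value)

puts results
puts "idle connections afterwards: #{pool.idle}"
```

With more threads than connections, the extra threads simply wait in `pop`. A real pool would typically add a checkout timeout, which is the failure mode to expect if child threads exhaust it.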

Slide 53

Slide 53 text

So This Checks Out 3 Connections...
% rails r "User.first; p Benchmark.realtime { t1 = Thread.new { User.connection.pool.with_connection { User.heavy_find(1).name } }; t2 = Thread.new { User.connection.pool.with_connection { User.heavy_find(2).name } }; p t1.value, t2.value; p User.connection.pool.stat }"
"user 1"
"user 2"
{:size=>5, :connections=>3, :busy=>1, :dead=>2, :idle=>0, :waiting=>0, :checkout_timeout=>5}
2.0807580002583563

Slide 54

Slide 54 text

Implementation

Slide 55

Slide 55 text

I Baked This into an Experimental Plugin

Slide 56

Slide 56 text

With the Following 2 APIs:
# Fires the query in a background Thread. Joins at #records call
AR::Relation#future
# e.g.
@posts = current_user.posts.future
# Runs the block in a background Thread, checking out a new AR connection and releasing it. Returns a Future object
FutureRecords.future(&block)

Slide 57

Slide 57 text

GH/amatsuda/future_records Very roughly implemented No tests, no documentation, no comments But it works Actually, it's already used in our production app at Money Forward Please be careful not to exhaust all the connections in the connection pool

Slide 58

Slide 58 text

Future Improvements Introduce a thread pool instead of Thread.new for performance and safety I'll explain this later through another example

Slide 59

Slide 59 text

Two Other Possible Approaches Don't checkout a new connection per Thread. Share the main connection Use asynchronous connection

Slide 60

Slide 60 text

Sharing the Main Connection Mutex.synchronize { Pass the main connection to a child thread when querying } Cannot run queries in parallel. Less performance gain Maybe we can use Thread + Fiber

Slide 61

Slide 61 text

Async Connection For DB adapters that have asynchronous query API e.g. mysql2, postgres

Slide 62

Slide 62 text

Async Connection Example (mysql2) client.query(some_very_heavy_query, async: true) # This method immediately returns nil # and once the query finishes, result = client.async_result # This returns a normal ResultSet

Slide 63

Slide 63 text

You Need to Create a Mechanism to Detect When the Query Is Done
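One common shape for such a mechanism is to wait on the connection's socket with a timeout and do other work between checks. The sketch below only illustrates that shape under stated assumptions: an `IO.pipe` stands in for the DB socket, a background thread plays the DB server, and no real driver API is used.

```ruby
# A pipe stands in for the DB socket (an assumption; no real driver is involved).
reader, writer = IO.pipe

# Simulated server: the "result" arrives after a delay.
server = Thread.new do
  sleep 0.1
  writer.write("42 rows")
  writer.close
end

polls = 0
# IO.select with a timeout returns nil until the socket becomes readable,
# so the main thread is free to do other work between checks.
until IO.select([reader], nil, nil, 0.02)
  polls += 1 # other main-thread work would go here
end

result = reader.read # the writer closes right after writing, so this returns promptly
server.join
puts "got #{result.inspect} after #{polls} polls"
```

An event loop does essentially the same thing for many sockets at once, which is why this approach tends to pull in a reactor library.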

Slide 64

Slide 64 text

Async Connection + Active Record + EventMachine I could kind of make this work locally, but it required super crazy monkey-patches on AR::Relation, FinderMethods, connection adapters, etc. Also, maybe we need to create another connection pool instance that handles async connections There's an existing library for doing this. Check out em-synchrony project if you're interested in this approach I personally don't want my production Rails app to heavily depend on EM though

Slide 65

Slide 65 text

Turbo Boosting Partial Renderings

Slide 66

Slide 66 text

We Often Have Slow Partial Templates render_partial of course blocks the main thread And in most cases partials do not depend on each other So, we may be able to render them asynchronously

Slide 67

Slide 67 text

With Ajax? Rails Ajax I guess everybody comes up with this idea and has implemented their own plugin

Slide 68

Slide 68 text

And Here's My Implementation <%= render @users, remote: true %>

Slide 69

Slide 69 text

GH/amatsuda/ljax_rails Actually I did this 5.years.ago And realized that this is not really a good approach Because the partial needs an extra route and a controller. It's like creating a whole set of APIs for just a partial template It adds another huge overhead for the Ajax roundtrip, especially on narrowband

Slide 70

Slide 70 text

Instead, Let's Think About Simply Threading render_partial Future pattern again Doesn't this perfectly work if AR connections are not concerned?

Slide 71

Slide 71 text

Initial Implementation
module AsyncRenderer
  def render(context, options, block)
    if (options.delete(:async) || (options[:locals]&.delete(:async)))
      FuturePartial.new { super }
    else
      super
    end
  end
end
class FuturePartial
  def initialize(&block)
    @thread = Thread.new(&block)
  end
  def to_s
    @thread.value
  end
end
ActionView::PartialRenderer.prepend AsyncRenderer

Slide 72

Slide 72 text

Let's Measure! Add <% sleep 1 %> to both a parent template and a partial, and see how the performance changes

Slide 73

Slide 73 text

Like This
# routes.rb
resources :users do
  collection do
    get :a
  end
end
# a.html.erb
A
<%= render 'b', async: true %>
<% sleep 1 %>
# _b.html.erb
B
<% sleep 1 %>

Slide 74

Slide 74 text

The Result This kinda works! Seems like it returns a correct HTML. But, NO performance gain. AT ALL.

Slide 75

Slide 75 text

Why?

Slide 76

Slide 76 text

Let's See What's Actually Executed at the Ruby Level

Slide 77

Slide 77 text

Action View Compiles Each Template to a Ruby Method

Slide 78

Slide 78 text

Let's Check the Compiled Template Source Maybe the easiest way to show the Ruby code is to add something like puts source; puts at the bottom of the bundled actionview gem's ActionView::Template#compile

Slide 79

Slide 79 text

The Source
def _app_views_users_a_html_erb__247788595159253739_70287467036600(local_assigns, output_buffer)
  _old_virtual_path, @virtual_path = @virtual_path, "users/a"
  _old_output_buffer = @output_buffer
  @output_buffer = output_buffer || ActionView::OutputBuffer.new
  @output_buffer.safe_append='A '.freeze
  @output_buffer.append=( render 'b', async: true )
  @output_buffer.safe_append=' '.freeze
  sleep 1
  @output_buffer.to_s
ensure
  @virtual_path, @output_buffer = _old_virtual_path, _old_output_buffer
end

Slide 80

Slide 80 text

@output_buffer.append= @output_buffer.append=( render 'b', async: true );

Slide 81

Slide 81 text

@output_buffer.append= Creates a future object via render async: true, then appends the future object to the buffer

Slide 82

Slide 82 text

Implementation of @output_buffer.append=
module ActionView
  class OutputBuffer < ActiveSupport::SafeBuffer #:nodoc:
    ...
    def <<(value)
      return self if value.nil?
      super(value.to_s)
    end
    alias :append= :<<
    ...

Slide 83

Slide 83 text

Immediate to_s Call is Happening @output_buffer.append= calls to_s on the future object immediately after its creation Then it causes the background Thread's join

Slide 84

Slide 84 text

But Why Do We Need to Call to_s There? Because ActionView::OutputBuffer < ActiveSupport::SafeBuffer < String You need to make sure that the value is_a String before <<

Slide 85

Slide 85 text

Like This
'' << :x
#=> no implicit conversion of Symbol into String (TypeError)
'' << 10
#=> "\n"

Slide 86

Slide 86 text

How Can We Make Future Partial Objects Live Longer? Immediate to_s call is inevitable so far as the buffer is_a String What if we store the view fragments in an Array, then concat them at the very last?

Slide 87

Slide 87 text

The Array Buffer
module ArrayBuffer
  def initialize(*)
    super
    @values = []
  end
  def <<(value)
    @values << value unless value.nil?
    self
  end
  alias :append= :<<
  def to_s
    @values.join # or something like that
  end
  ...
end
ActionView::OutputBuffer.prepend AsyncPartial::ArrayBuffer
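The difference between the two buffer designs can be reproduced outside Rails. The sketch below is an illustration under assumptions, not Action View code: `FuturePartial` is re-declared locally, a plain `String` and a plain `Array` stand in for the two kinds of output buffer, and each "partial" is just a `sleep`.

```ruby
# Hypothetical future object whose to_s joins the background thread.
class FuturePartial
  def initialize(&block)
    @thread = Thread.new(&block)
  end

  def to_s
    @thread.value
  end
end

def now
  Process.clock_gettime(Process::CLOCK_MONOTONIC)
end

# String-based buffer: to_s is called on append, so each partial is joined
# right after it is created and the renders run one after another.
t = now
string_buffer = +""
2.times { |i| string_buffer << FuturePartial.new { sleep 0.1; "partial #{i}\n" }.to_s }
string_time = now - t

# Array-based buffer: the futures are stored as they are and only joined when
# Array#join finally calls to_s on each element, so the renders overlap.
t = now
array_buffer = []
2.times { |i| array_buffer << FuturePartial.new { sleep 0.1; "partial #{i}\n" } }
html = array_buffer.join
array_time = now - t

puts html
puts "string buffer: #{string_time.round(2)}s, array buffer: #{array_time.round(2)}s"
```

Both buffers produce the same HTML; only the moment at which `to_s` forces the join differs.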

Slide 88

Slide 88 text

Measuring Again Completed 200 OK in 1026ms (Views: 1013.0ms | ActiveRecord: 0.7ms)

Slide 89

Slide 89 text

Measuring Again It works perfectly! Now it returns the result in 1 second!

Slide 90

Slide 90 text

BTW, If You're Looking for the Fastest Template Engine on the current String-based OutputBuffer There's an implementation that is faster than Erubi, or Haml, or any other existing template engine in the world The gem is called string_template

Slide 91

Slide 91 text

GH/amatsuda/string_template It compiles the whole template in one single String literal with interpolations Which is of course significantly faster than string << another_string << another_string...

Slide 92

Slide 92 text

Anyway, Now Let's See How the Array-based Version Scales!

Slide 93

Slide 93 text

Extract the Repetition in index.html.erb to a Partial
# app/views/users/index.html.erb
  <% @users.each do |user| %>
-   <tr>
-     <td><%= user.name %></td>
-     <td><%= link_to 'Show', user %></td>
-     <td><%= link_to 'Edit', edit_user_path(user) %></td>
-     <td><%= link_to 'Destroy', user, method: :delete, data: { confirm: 'Are you sure?' } %></td>
-   </tr>
+   <%= render partial: 'user', object: user, locals: {async: true} %>
  <% end %>

Slide 94

Slide 94 text

With Some Random Slowness to the Partial <% sleep(rand(3) / 100.0) %>

Slide 95

Slide 95 text

Register 10 Users % rails r '(1..10).each {|i| User.create! name: "user #{i}"}'

Slide 96

Slide 96 text

The Result

Slide 97

Slide 97 text

Or a 500 Error ActionView::Template::Error (Target thread must not be current thread)

Slide 98

Slide 98 text

What the Hell Is Happening?

Slide 99

Slide 99 text

It's Called a Race Condition

Slide 100

Slide 100 text

Why Does This Code Cause a Race Condition?
def _app_views_users__user_html_erb___590070358791478326_70218505010200(local_assigns, output_buffer)
  _old_virtual_path, @virtual_path = @virtual_path, "users/_user"
  _old_output_buffer = @output_buffer
  user = local_assigns[:user]; user = user
  @output_buffer = output_buffer || ActionView::OutputBuffer.new
  @output_buffer.safe_append=' '.freeze
  @output_buffer.append=( user.name )
  @output_buffer.safe_append=' '.freeze
  @output_buffer.append=( link_to 'Show', user )
  @output_buffer.safe_append=' '.freeze
  @output_buffer.append=( link_to 'Edit', edit_user_path(user) )
  @output_buffer.safe_append=' '.freeze
  @output_buffer.append=( link_to 'Destroy', user, method: :delete, data: { confirm: 'Are you sure?' } )
  @output_buffer.safe_append=' '.freeze
  sleep(rand(3) / 100.0)
  @output_buffer.to_s
ensure
  @virtual_path, @output_buffer = _old_virtual_path, _old_output_buffer
end

Slide 101

Slide 101 text

Because It Shares an Instance Variable @output_buffer Between Threads!

Slide 102

Slide 102 text

We Need to Change the Buffer Object to Be a Local Variable or a Thread Local Variable
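The following sketch contrasts the two designs outside Rails. It is a hypothetical illustration, not the actual patch: the `View` class and both method names are invented here. `render_shared` mimics the compiled template's use of a shared `@output_buffer`, and `render_local` keeps the buffer in a local variable.

```ruby
# Hypothetical view object (illustration only, not Action View code).
class View
  # Every thread rendering through the same view overwrites the same ivar.
  def render_shared(name)
    @output_buffer = +""
    @output_buffer << "<li>"
    sleep 0.01 # widen the window in which other threads swap the buffer
    @output_buffer << name << "</li>"
    @output_buffer
  end

  # One buffer per call, invisible to the other threads.
  def render_local(name)
    output_buffer = +""
    output_buffer << "<li>"
    sleep 0.01
    output_buffer << name << "</li>"
    output_buffer
  end
end

view  = View.new
names = %w[a b c]

shared = names.map { |n| Thread.new { view.render_shared(n) } }.map(&:value)
local  = names.map { |n| Thread.new { view.render_local(n) } }.map(&:value)

puts "shared ivar:    #{shared.inspect}" # usually garbled; the exact output is not deterministic
puts "local variable: #{local.inspect}"
```

Only the local-variable version is guaranteed to return one well-formed fragment per call; the shared version depends on thread scheduling.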

Slide 103

Slide 103 text

And in Order to Achieve This, We Need to Monkey-patch the Erubi Template Handler

Slide 104

Slide 104 text

I'm Not Gonna Paste the Whole Patch Here, But It's Been Done Like This properties[:bufvar] = "output_buffer" # and so on...

Slide 105

Slide 105 text

And So It Works Now!

Slide 106

Slide 106 text

Now Let's Try to Render _form.html.erb Asynchronously # new.html.erb <%= render partial: 'form', locals: {user: @user, async: true} %>

Slide 107

Slide 107 text

Then, It Renders Something Broken

Slide 108

Slide 108 text

OMG

Slide 109

Slide 109 text

This Happens Because of Action View's capture Helper Which is used to render the block content inside <%= ... do %> capture creates a new buffer, swaps @output_buffer ivar, then swaps it back at the end It's impossible to do such a thing for an lvar

Slide 110

Slide 110 text

But I Could Emulate the Behavior in Another Way Somehow

Slide 111

Slide 111 text

With This Patch, Rails Would Run Hundreds or Thousands of Threads at Once Which would make the whole response time slower, not faster

Slide 112

Slide 112 text

We Need to Control the Number of Running Threads

Slide 113

Slide 113 text

Introducing a Thread Pool Thread.new in Ruby is not cheap Running too many Threads at once incurs a non-negligible thread-switching cost

Slide 114

Slide 114 text

Thread Pool Implementation We can create our own Or concurrent-ruby ships with a good one concurrent-ruby should already be bundled in your app through Active Support
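For the "create our own" option, a fixed-size pool needs little more than a `Queue` and a few long-lived worker threads. The sketch below is a minimal illustration under assumptions: `TinyThreadPool` and its `post` and `shutdown` methods are hypothetical names, not concurrent-ruby's API, and error handling inside jobs is left out.

```ruby
# TinyThreadPool is a hypothetical illustration, not concurrent-ruby's API.
class TinyThreadPool
  def initialize(size)
    @jobs = Queue.new
    # A fixed number of workers is created once and reused for every job.
    @workers = size.times.map do
      Thread.new do
        # Queue#pop returns nil once the queue is closed and drained.
        while (job = @jobs.pop)
          job.call
        end
      end
    end
  end

  # Schedules the block and returns a Queue that will receive its result.
  def post(&block)
    result = Queue.new
    @jobs << -> { result << block.call }
    result
  end

  def shutdown
    @jobs.close
    @workers.each(&:join)
  end
end

pool = TinyThreadPool.new(3)
pending = 10.times.map { |i| pool.post { sleep 0.01; i * i } }
squares = pending.map(&:pop) # each pop blocks until that job has finished
pool.shutdown

p squares
```

However many jobs are posted, at most three run at once, which caps both thread creation and switching costs. A job that raises would kill its worker in this sketch, so a production pool needs to rescue and report errors.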

Slide 115

Slide 115 text

So, I Finally Finished Implementing an Async Partial Renderer! With a lot of monkey-patches But, this works only with Erubi so far We have so many other template engines, such as Erubis, Haml, Slim, etc. Especially, monkey-patching Haml is so tough (Even for the main maintainer of Haml...!)

Slide 116

Slide 116 text

The Code

Slide 117

Slide 117 text

GH/amatsuda/async_partial

Slide 118

Slide 118 text

And, These Are All Template Engines for Rendering HTML Files What about .json renderers?

Slide 119

Slide 119 text

Jbuilder The Default JSON Renderer Completely not working Because Jbuilder is implemented very differently from other orthodox template engines

Slide 120

Slide 120 text

I Suppose Many of You May Have Already Switched to a Fast and Elegant Alternative

Slide 121

Slide 121 text

Called Jb

Slide 122

Slide 122 text

Jb of Course Works Perfectly with This Array Buffer and Threaded Partials

Slide 123

Slide 123 text

GH/amatsuda/jb

Slide 124

Slide 124 text

Turbo Boosting Lazy Attributes

Slide 125

Slide 125 text

So, Let's Move on to The View Code, and Find What's Slow There

Slide 126

Slide 126 text

Now Let's Try to Make Something Heavy and Realistic

Slide 127

Slide 127 text

Example

Slide 128

Slide 128 text

Scaffolding % rails g scaffold post col1 col2 col3 col4 col5 col6 col7 col8 col9 col10 col11 col12 col13 col14 col15 col16 col17 col18 col19 col20 col21 col22 col23 col24 col25 col26 col27 col28 col29 col30 col31 col32 col33 col34 col35 col36 col37 col38 col39 col40 col41 col42 col43 col44 col45 col46 col47 col48 col49 col50 col51 col52 col53 col54 col55 col56 col57 col58 col59 col60 col61 col62 col63 col64 col65 col66 col67 col68 col69 col70 col71 col72 col73 col74 col75 col76 col77 col78 col79 col80 col81 col82 col83 col84 col85 col86 col87 col88 col89 col90 col91 col92 col93 col94 col95 col96 col97

Slide 129

Slide 129 text

With the Data % rails r '(1..1000).each {|i| Post.create! col1: i, col2: i, col3: i, col4: i, col5: i, col6: i, col7: i, col8: i, col9: i, col10: i, col11: i, col12: i, col13: i, col14: i, col15: i, col16: i, col17: i, col18: i, col19: i, col20: i, col21: i, col22: i, col23: i, col24: i, col25: i, col26: i, col27: i, col28: i, col29: i, col30: i, col31: i, col32: i, col33: i, col34: i, col35: i, col36: i, col37: i, col38: i, col39: i, col40: i, col41: i, col42: i, col43: i, col44: i, col45: i, col46: i, col47: i, col48: i, col49: i, col50: i, col51: i, col52: i, col53: i, col54: i, col55: i, col56: i, col57: i, col58: i, col59: i, col60: i, col61: i, col62: i, col63: i, col64: i, col65: i, col66: i, col67: i, col68: i, col69: i, col70: i, col71: i, col72: i, col73: i, col74: i, col75: i, col76: i, col77: i, col78: i, col79: i, col80: i, col81: i, col82: i, col83: i, col84: i, col85: i, col86: i, col87: i, col88: i, col89: i, col90: i, col91: i, col92: i, col93: i, col94: i, col95: i, col96: i, col97: i }'

Slide 130

Slide 130 text

Benchmark
% curl http://localhost:3000/posts
Run this several times, and discard the fastest and slowest results

Slide 131

Slide 131 text

Results
Completed 200 OK in 1610ms (Views: 1568.9ms | ActiveRecord: 40.4ms)
Completed 200 OK in 1693ms (Views: 1511.1ms | ActiveRecord: 43.3ms)
Completed 200 OK in 1555ms (Views: 1484.5ms | ActiveRecord: 69.9ms)
Completed 200 OK in 1668ms (Views: 1626.1ms | ActiveRecord: 41.9ms)
Completed 200 OK in 1791ms (Views: 1737.3ms | ActiveRecord: 53.1ms)

Slide 132

Slide 132 text

Let's See What Takes Time in Views

Slide 133

Slide 133 text

What If We Changed the Attribute Accesses to Literals?
- <%= post.col1 %>
- ...
- <%= post.col97 %>
+ <%= 'post.col1' %>
+ ...
+ <%= 'post.col97' %>

Slide 134

Slide 134 text

Results
Completed 200 OK in 803ms (Views: 747.5ms | ActiveRecord: 55.2ms)
Completed 200 OK in 827ms (Views: 782.5ms | ActiveRecord: 44.2ms)
Completed 200 OK in 820ms (Views: 775.9ms | ActiveRecord: 43.2ms)
Completed 200 OK in 833ms (Views: 721.8ms | ActiveRecord: 110.3ms)
Completed 200 OK in 834ms (Views: 781.1ms | ActiveRecord: 52.6ms)

Slide 135

Slide 135 text

This Means,

Slide 136

Slide 136 text

Half of the Response Time Was Spent on Reading Values from Already Selected AR Model Instance

Slide 137

Slide 137 text

Why Does Just Accessing Attributes Take That Much Time? It should be just a method call, right?

Slide 138

Slide 138 text

Let's Count the Number of Method Calls
% rails r 'p = Post.first; (trace = TracePoint.new(:call) {|t| p "#{t.defined_class}##{t.method_id}"}).enable; p.col1; trace.disable'
"##__temp__36f6c613"
"ActiveRecord::AttributeMethods::Read#_read_attribute"
"ActiveModel::AttributeSet#fetch_value"
"ActiveModel::AttributeSet#[]"
"ActiveModel::LazyAttributeHash#[]"
"ActiveModel::LazyAttributeHash#assign_default_value"
"##from_database"
"ActiveModel::Attribute#initialize"
"ActiveModel::Attribute#value"
"ActiveModel::Attribute::FromDatabase#type_cast"
"ActiveModel::Type::Value#deserialize"
"ActiveModel::Type::Value#cast"
"ActiveModel::Type::String#cast_value"

Slide 139

Slide 139 text

13 Method Calls per 1 String Attribute Access!

Slide 140

Slide 140 text

And 30 Method Calls per 1 Timestamp Attribute Access!
% rails r 'p = Post.first; (trace = TracePoint.new(:call) {|t| p "#{t.defined_class}##{t.method_id}"}).enable; p.created_at; trace.disable'
"##__temp__36275616475646f51647"
"ActiveRecord::AttributeMethods::Read#_read_attribute"
"ActiveModel::AttributeSet#fetch_value"
"ActiveModel::AttributeSet#[]"
"ActiveModel::LazyAttributeHash#[]"
"ActiveModel::LazyAttributeHash#assign_default_value"
"##from_database"
"ActiveModel::Attribute#initialize"
"ActiveModel::Attribute#value"
"ActiveModel::Attribute::FromDatabase#type_cast"
"ActiveRecord::AttributeMethods::TimeZoneConversion::TimeZoneConverter#deserialize"
"##deserialize"
"##__getobj__"
"ActiveModel::Type::Value#deserialize"
"##cast"
"ActiveModel::Type::Value#cast"
"ActiveModel::Type::DateTime#cast_value"
"ActiveModel::Type::Helpers::TimeValue#fast_string_to_time"
"ActiveModel::Type::Helpers::TimeValue#new_time"
"ActiveRecord::Type::Internal::Timezone#default_timezone"
"##default_timezone"
"ActiveRecord::AttributeMethods::TimeZoneConversion::TimeZoneConverter#convert_time_to_time_zone"
"Object#acts_like?"
"##zone"
"DateAndTime::Zones#in_time_zone"
"##find_zone!"
"Object#acts_like?"
"DateAndTime::Zones#time_with_zone"
"ActiveSupport::TimeWithZone#initialize"
"ActiveSupport::TimeWithZone#transfer_time_values_to_utc_constructor"

Slide 141

Slide 141 text

So, for Looping 1000 Records and Accessing 100 Columns... Does Ruby make 13 * 100 * 1000 = 1,300,000 method calls?

Slide 142

Slide 142 text

Yes, It Really Does % rails r 'calls = 0; trace = TracePoint.new(:call) {|t| calls += 1 }; Post.all.each {|p| trace.enable; p.id; p.col1; p.col2; p.col3; p.col4; p.col5; p.col6; p.col7; p.col8; p.col9; p.col10; p.col11; p.col12; p.col13; p.col14; p.col15; p.col16; p.col17; p.col18; p.col19; p.col20; p.col21; p.col22; p.col23; p.col24; p.col25; p.col26; p.col27; p.col28; p.col29; p.col30; p.col31; p.col32; p.col33; p.col34; p.col35; p.col36; p.col37; p.col38; p.col39; p.col40; p.col41; p.col42; p.col43; p.col44; p.col45; p.col46; p.col47; p.col48; p.col49; p.col50; p.col51; p.col52; p.col53; p.col54; p.col55; p.col56; p.col57; p.col58; p.col59; p.col60; p.col61; p.col62; p.col63; p.col64; p.col65; p.col66; p.col67; p.col68; p.col69; p.col70; p.col71; p.col72; p.col73; p.col74; p.col75; p.col76; p.col77; p.col78; p.col79; p.col80; p.col81; p.col82; p.col83; p.col84; p.col85; p.col86; p.col87; p.col88; p.col89; p.col90; p.col91; p.col92; p.col93; p.col94; p.col95; p.col96; p.col97; p.created_at; p.updated_at; trace.disable }; p calls' 1335000

Slide 143

Slide 143 text

So, Active Record Is Slow Not because Ruby is slow But because the code is written to be slow

Slide 144

Slide 144 text

Of Course, the Example I Showed Here Is a Silly UI We won't usually render 1,000 records in a single page In such a case, we would use pagination

Slide 145

Slide 145 text

kaminari/kaminari With this plugin

Slide 146

Slide 146 text

But There Are Some Use Cases Where We Deal with Thousands of AR Model Instances, e.g. APIs Batches Fintech apps

Slide 147

Slide 147 text

In Fact, We Actually Hit This Problem at Money Forward We had to render 2,500 models in one page, which was unbearably slow

Slide 148

Slide 148 text

IMO Active Record Model Is Designed to Do Too Much Work What we really need here in this situation is just a value object (something like "entity bean" in the Java world) AR model is apparently overkill for this usage AR object has too many features such as type casting, dirty tracking, serialization, validation, etc.

Slide 149

Slide 149 text

AR Implements Two Different Roles in One Class Data transfer object that transfers readonly data between MVC layers Form object that accepts user inputs and safely saves them to the DB

Slide 150

Slide 150 text

And What We Need in This Scenario Is Just a Lightweight Readonly Object

Slide 151

Slide 151 text

Probably We Can Transfer the ResultSet into Some Kind of DTO (Data Transfer Object)? Which is simply based on Ruby Struct?

Slide 152

Slide 152 text

It Should Kinda Work for a Simple Use Case Like the Example in These Slides But we don't want to do that in Ruby. Ruby is not Java. And we want to use associations, some other methods defined on the model class, etc. And it won't play nice with our favorite decorator plugin

Slide 153

Slide 153 text

GH/amatsuda/active_decorator

Slide 154

Slide 154 text

Instead, Why Don't We Just Store the Attributes as a Hash Instance? And just delegate the attribute accessors to the Hash instance? (Actually, AR used to be designed that way)
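As a rough picture of what Hash-backed attributes look like, here is a minimal sketch in plain Ruby. Everything in it is an assumption for illustration: `LightweightPost` and its `COLUMNS` list are hypothetical and unrelated to Active Record's internals.

```ruby
# Hypothetical lightweight read-only model; not Active Record's implementation.
class LightweightPost
  COLUMNS = %i[id title created_at].freeze

  def initialize(attributes)
    @attributes = attributes.freeze
  end

  # One plain Hash lookup per attribute read: no lazy attribute objects
  # and no type casting.
  COLUMNS.each do |column|
    define_method(column) { @attributes[column] }
  end
end

post = LightweightPost.new(id: 1, title: "Hello", created_at: "2018-04-16 21:13:21.667499")

p post.title
p post.created_at.class # the raw value stays a String because nothing casts it
```

The trade-off is visible in the last line: whatever the database driver hands over is returned unchanged, so a timestamp that arrives as a String stays a String.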

Slide 155

Slide 155 text

Problem AR attribute reader method is slow

Slide 156

Slide 156 text

Solution

Slide 157

Slide 157 text

Let's Solve This Problem Not by Adding More Complexity but by Bringing Back the Simplicity

Slide 158

Slide 158 text

Good Old Hash-based Attributes We need to monkey-patch AR internals

Slide 159

Slide 159 text

Recent Versions of Active Record Implement the "Attribute API"

Slide 160

Slide 160 text

Attribute API Highly extensible, elegantly customizable It's a great feature, indeed But... who actually uses this feature in production?

Slide 161

Slide 161 text

Attribute API Implementation In order to implement this feature, AR holds an instance of LazyAttribute for each column of each model instance

Slide 162

Slide 162 text

Can't We Opt Out of This Rarely Used Feature? And let AR objects work speedily by default? It's great that AR has a lot of elegant features, but we want the model instances to perform as fast as possible by default

Slide 163

Slide 163 text

Implementation

Slide 164

Slide 164 text

If The Model Declares No Custom Attribute, Return a Good Old Simple Hash Based Model Instance I suppose this would speed up 99.8% of AR models in the world

Slide 165

Slide 165 text

Implementation

Slide 166

Slide 166 text

An AttributeSet Alternative That Simply Delegates to a Given Attributes Hash
module LightweightAttributes
  class AttributeSet
    delegate :each_value, :fetch, :except, :[], :[]=, :key?, :keys, to: :attributes
    def initialize(attributes)
      @attributes = attributes
    end
    def fetch_value(name)
      self[name]
    end
    ...
  end
end

Slide 167

Slide 167 text

An AttributeSet Builder That Builds the Lightweight AttributeSet When Building an Instance from a DB Query Result
module LightweightAttributes
  class AttributeSet
    class Builder
      ...
      def build_from_database(values = {}, _additional_types = {})
        LightweightAttributes::AttributeSet.new values
      end
    end
  end
end

Slide 168

Slide 168 text

Overriding AR::Base.attributes_builder to Return the Lightweight AttributeSet Builder
module ARBaseClassMethods
  def attributes_builder
    # If the model has no custom attribute
    if attributes_to_define_after_schema_loads.empty?
      LightweightAttributes::AttributeSet::Builder.new(...)
    else
      super
    end
  end
end

Slide 169

Slide 169 text

Results (Before)
Completed 200 OK in 1610ms (Views: 1568.9ms | ActiveRecord: 40.4ms)
Completed 200 OK in 1693ms (Views: 1511.1ms | ActiveRecord: 43.3ms)
Completed 200 OK in 1555ms (Views: 1484.5ms | ActiveRecord: 69.9ms)
Completed 200 OK in 1668ms (Views: 1626.1ms | ActiveRecord: 41.9ms)
Completed 200 OK in 1791ms (Views: 1737.3ms | ActiveRecord: 53.1ms)

Slide 170

Slide 170 text

Results (After)
Completed 200 OK in 971ms (Views: 926.5ms | ActiveRecord: 44.4ms)
Completed 200 OK in 998ms (Views: 950.3ms | ActiveRecord: 46.8ms)
Completed 200 OK in 1128ms (Views: 1073.2ms | ActiveRecord: 54.1ms)
Completed 200 OK in 927ms (Views: 876.1ms | ActiveRecord: 50.1ms)
Completed 200 OK in 963ms (Views: 919.3ms | ActiveRecord: 42.9ms)

Slide 171

Slide 171 text

Results The whole scaffold app became 40% faster!!! Because of fewer method invocations and fewer object creations

Slide 172

Slide 172 text

It's Still Not Production Ready Though % rails r 'p [(c = Post.first.created_at), c.class]' ["2018-04-16 21:13:21.667499", String]

Slide 173

Slide 173 text

Other Possible APIs Add a new method on AR::Relation that returns a lightweight Model collection, and don't change the default behavior Change Relation#readonly method to return a lightweight Model collection

Slide 174

Slide 174 text

But I Basically Prefer Automagic APIs over Overly Explicit APIs

Slide 175

Slide 175 text

The Code

Slide 176

Slide 176 text

GH/amatsuda/lightweight_attributes

Slide 177

Slide 177 text

Turbo Boosting Named Urls

Slide 178

Slide 178 text

Now the AR Attributes Became Fast Enough, What in the View Is Slow Next?

Slide 179

Slide 179 text

What Is the Slowest Thing in the Scaffold View?

Slide 180

Slide 180 text

The Answer Is, the Links

Slide 181

Slide 181 text

If We Remove These 3 Links from the posts#index View
# app/views/posts/index.html.erb
      <td><%= post.col95 %></td>
      <td><%= post.col96 %></td>
      <td><%= post.col97 %></td>
-     <td><%= link_to 'Show', post %></td>
-     <td><%= link_to 'Edit', edit_post_path(post) %></td>
-     <td><%= link_to 'Destroy', post, method: :delete, data: { confirm: 'Are you sure?' } %></td>
    </tr>
  <% end %>

Slide 182

Slide 182 text

Results (Before)
Completed 200 OK in 971ms (Views: 926.5ms | ActiveRecord: 44.4ms)
Completed 200 OK in 998ms (Views: 950.3ms | ActiveRecord: 46.8ms)
Completed 200 OK in 1128ms (Views: 1073.2ms | ActiveRecord: 54.1ms)
Completed 200 OK in 927ms (Views: 876.1ms | ActiveRecord: 50.1ms)
Completed 200 OK in 963ms (Views: 919.3ms | ActiveRecord: 42.9ms)

Slide 183

Slide 183 text

Results (After)
Completed 200 OK in 661ms (Views: 608.2ms | ActiveRecord: 51.8ms)
Completed 200 OK in 604ms (Views: 563.4ms | ActiveRecord: 40.0ms)
Completed 200 OK in 574ms (Views: 533.2ms | ActiveRecord: 39.8ms)
Completed 200 OK in 735ms (Views: 695.3ms | ActiveRecord: 38.9ms)
Completed 200 OK in 698ms (Views: 657.7ms | ActiveRecord: 39.3ms)

Slide 184

Slide 184 text

Results 35% performance gain even with the 100-column view! For a typical model with 10-ish columns, the gain is larger, around 70%

Slide 185

Slide 185 text

Problem named_url Is Slow

Slide 186

Slide 186 text

Solution If the OutputBuffer is already Array based, there's a very simple solution We can futurize it

Slide 187

Slide 187 text

Rendering the Links Asynchronously
module FutureUrlHelper
  def link_to(name = nil, options = nil, html_options = nil, &block)
    if ((Hash === options) && options.delete(:async)) || ((Hash === html_options) && html_options.delete(:async))
      FutureObject.new { super }
    else
      super
    end
  end
end

Slide 188

Slide 188 text

In This Particular Example, It Won't Be That Effective Because the Links Are Already at the Very Bottom of the Page

Slide 189

Slide 189 text

Another Possible Solution Cache url_for results in memory
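In-memory caching of generated URLs can be sketched as plain memoization keyed by the route name and its parameters. The code below is a hypothetical illustration, not the implementation of any plugin: `UrlCache` and the `slow_url_for` lambda are invented names, and the lambda merely stands in for an expensive route generation.

```ruby
# Hypothetical cache in front of an expensive URL generator (illustration only).
class UrlCache
  attr_reader :misses

  def initialize(&generator)
    @generator = generator
    @cache = {}
    @misses = 0
    @lock = Mutex.new # request threads may share the cache
  end

  # Returns the cached URL for the route name and params, generating it only once.
  def url_for(route, **params)
    key = [route, params]
    @lock.synchronize do
      @cache.fetch(key) do
        @misses += 1
        @cache[key] = @generator.call(route, params)
      end
    end
  end
end

# Stand-in for a slow route generation (an assumption for this sketch).
slow_url_for = ->(route, params) { "/#{route}s/#{params[:id]}" }

urls = UrlCache.new(&slow_url_for)
first = urls.url_for(:post, id: 1)
2.times { urls.url_for(:post, id: 1) } # served from the cache
other = urls.url_for(:post, id: 2)

puts first, other
puts "generator was called #{urls.misses} times"
```

A cache like this is only safe while the routes and default URL options stay the same, so it would need to be cleared when routes are reloaded.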

Slide 190

Slide 190 text

I Created This 2.years.ago It may be helpful if your app heavily uses named urls

Slide 191

Slide 191 text

GH/amatsuda/turbo_urls

Slide 192

Slide 192 text

What We Learned

Slide 193

Slide 193 text

What We Learned (1) If you have external API calls in your app, consider doing them in child threads You can run AR queries in Threads, but be careful not to use up all pooled connections ActionView::OutputBuffer can be Array based, for some future extensions Monkey-patching Haml is hard LazyAttribute is so lazy, and opting this out may drastically boost the performance url_for is slow, and we need to fix it

Slide 194

Slide 194 text

What We Learned (2) You can find what’s slow in your app And YOU can fix it If the problem lies inside the framework, just hack the framework It should be fun!

Slide 195

Slide 195 text

What We Learned (3) Performance is not for free There are certain trade offs In Rails' case, we need to craft so many evil monkey-patches Maybe because the framework is not flexible enough

Slide 196

Slide 196 text

What We Learned (4) Thread programming, especially debugging, is hard I don't wanna do this anymore I'm really looking forward to the new Thread model planned to be introduced in Ruby 3

Slide 197

Slide 197 text

Future Plans Finish implementing the plugins that I introduced today All these plugins are experimental. They basically have no tests, no documentation, no comments at the moment Put them in actual production apps I'm sorry but the title of this talk was probably a little bit misleading Introduce more extensibility to the framework I realized that some things would be better changed on the framework side rather than in monkey-patch plugins

Slide 198

Slide 198 text

end

Slide 199

Slide 199 text

end name: Akira Matsuda GitHub: @amatsuda Twitter: @a_matsuda