
3x Rails


Slides for RailsConf 2016 talk "3x Rails: Tuning the Framework Internals" http://railsconf.com/program#prop_2159

Akira Matsuda

May 04, 2016

Transcript

  1. 3x Rails
    Akira Matsuda
    RAILSCONF 2016


  2. 3x Rails?
    Akira Matsuda
    RAILSCONF 2016


  3. Matz @ RubyKaigi 2015


  4. Ruby 3x3
    Matz: “Ruby 3.0 will
    be 3 times faster!”


  5. 3x Rails?
    Wait until Ruby 3.0
    release
    Run your Rails app on Ruby 3.0
    Done.


  6. self
    name: Akira
    GitHub: amatsuda
    Twitter: @a_matsuda


  7. Ruby


  8. Rails


  9. Gems
    kaminari
    active_decorator
    motorhead
    stateful_enum
    action_args (asakusarb)


  10. View Slide

  11. Asakusa.rb
    Since 2008
    356 meetups
    30+ Ruby Core committers
    Attendees from 19 different
    countries


  12. RubyKaigi
    Chief Organizer


  13. RubyKaigi 2015
    = ͟ ͟͞͞


  14. RubyKaigi 2016
    September 8..10
    Kyoto (not in Tokyo!)


  15. Kyoto (京都)


  16. RubyKaigi - Venue


  17. RubyKaigi - Main Hall


  18. RubyKaigi - Hall B


  19. RubyKaigi - Garden


  20. RubyKaigi 2016
    CFP is open!
    Tickets are available!
    http://rubykaigi.org/


  21. begin


  22. Speeding Up the Rails Framework


  23. Know the Speed


  24. How Can We Measure the
    Speed?


  25. Use benchmark-ips
    You can benchmark
    anything inside the
    block


  26. For Example
    If you want to very
    roughly benchmark the
    whole Rails app's
    request processing...


  27. A Horrible Way to Benchmark
    Rails’ Request Processing
    # This isn't very beautiful code, but it's just an example...
    # And it kinda works...
    require 'benchmark/ips'
    Rails.application.config.after_initialize do
      Rails.application.extend Module.new {
        def call(e)
          super
          Benchmark.ips do |x|
            x.report('Rails.application#call') do
              super
            end
          end
          super
        end
      }
    end


  28. How Can We Improve This Score?


  29. My 1st Assumption
    GC


  30. Ruby Is Slow Because
    Ruby GC Is Slow
    Everyone knows this
    fact, right?
    I heard GC takes 30%
    of the whole Response
    time in Rails


  31. Let's Observe the GC
    GC.stat
    gc_tracer


  32. Adding GC.stat to the ips
    Code
      Rails.application.config.after_initialize do
        Rails.application.extend Module.new {
          def call(e)
            super
    +       p before: GC.stat
            Benchmark.ips do |x|
              x.report('Rails.application#call') do
                super
              end
            end
    +       p after: GC.stat
            super
          end
        }
      end


  33. The GC.stat + ips Result (scaffold index, 100 AR models)
    {:before=>{:count=>53, ..., :minor_gc_count=>44, :major_gc_count=>9, ...}}
    Warming up --------------------------------------
    Rails.application#call     4.000 i/100ms
    Calculating -------------------------------------
    Rails.application#call     45.233 (± 4.4%) i/s - 228.000 in 5.052723s
    {:after=>{:count=>107, ..., :minor_gc_count=>95, :major_gc_count=>12, ...}}


  34. The GC.stat + ips Result (scaffold index, 100 AR models)
    GC is surely happening


  35. Let's Stop the GC
    RUBY_GC_HEAP_INIT_SLOTS=1000000 +
    GC.disable & stat
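A minimal sketch of the `GC.disable` + `GC.stat` half of this experiment, outside Rails. The `gc_runs` helper and the allocation loop are illustrative assumptions, not code from the talk. `RUBY_GC_HEAP_INIT_SLOTS` is an environment variable set before Ruby starts, so it is not shown here.

```ruby
# Count how many GC runs happen while the block executes.
def gc_runs
  before = GC.stat[:count]
  yield
  GC.stat[:count] - before
end

GC.start     # begin from a freshly collected heap
GC.disable   # no collections from here on
begin
  runs_while_disabled = gc_runs { 200_000.times { Object.new } }
ensure
  GC.enable  # always switch the GC back on
end

puts "GC runs while disabled: #{runs_while_disabled}"
```

Running the same block with the GC left enabled gives a count to compare against.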


  36. Result
    {:before=>{:count=>3, ..., :minor_gc_count=>1, :major_gc_count=>2, ...}}
    Warming up --------------------------------------
    Rails.application#call     4.000 i/100ms
    Calculating -------------------------------------
    Rails.application#call     50.449 (± 5.9%) i/s - 252.000 in 5.008789s
    {:after=>{:count=>5, ..., :minor_gc_count=>1, :major_gc_count=>4, ...}}


  37. Summary (GC)
    The GC adds about
    10% overhead
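The "about 10%" figure can be checked from the two benchmark-ips results shown earlier (45.233 i/s with the GC, 50.449 i/s without). This is only arithmetic on those two numbers; the variable names are made up for illustration.

```ruby
with_gc    = 45.233  # i/s from the run with the GC enabled
without_gc = 50.449  # i/s from the run with the GC disabled

# Extra throughput gained by turning the GC off, in percent.
throughput_gain = (without_gc / with_gc - 1) * 100
# Share of the per-request time that disappears without the GC, in percent.
time_share = (1 - with_gc / without_gc) * 100

puts format('throughput gain: %.1f%%', throughput_gain)
puts format('time share:      %.1f%%', time_share)
```

Both values land between 10 and 12 percent, and the error margins reported by benchmark-ips (± 4.4% and ± 5.9%) are of a similar order, so the figure is a rough estimate.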


  38. History of Ruby GC
    Improvement
    1.9.3: Lazy Sweep (nari3)
    2.0 : Bitmap Marking (nari3)
    2.1 : RGenGC (ko1)
    2.2 : Incremental GC (ko1), 2 age RGenGC (ko1), Symbol GC (nari3)


  39. Hats Off to ko1!
    ko1 keeps delivering an
    amazing amount and
    quality of Ruby internal
    improvements!


  40. Garbage Strings


  41. Garbage Strings Used to
    Be a Big Concern
    There was a trend of putting `.freeze`
    here and there in the code
    I thought that made our codebase super ugly. I
    no longer wanted to see PRs that add
    `.freeze` to String literals in the framework
    So I proposed the magic comment (Ruby 2.3): # frozen-string-literal: true
    Not mainly to improve the speed but to
    stop people from polluting the code!


  42. # frozen-string-literal: true
    Has anyone tried this?
    Maybe Rails will become a little bit
    faster if you put this in all .rb files in
    Rails
    Then we could remove all explicit
    `.freeze` calls
    We would need to add some `.dup` calls though
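A small sketch of why some `.dup` calls become necessary once the magic comment is in place. The `join_names` method and `SEPARATOR` constant are hypothetical, not Rails code. The comment only takes effect when it sits at the top of the file, before any code.

```ruby
# frozen_string_literal: true

SEPARATOR = ', ' # frozen when the magic comment is in effect, no explicit .freeze

def join_names(names)
  # A bare '' literal would be frozen and `<<` would raise on it,
  # so take a mutable copy first.
  buffer = ''.dup
  names.each_with_index do |name, i|
    buffer << SEPARATOR unless i.zero?
    buffer << name
  end
  buffer
end

puts join_names(%w[matz ko1 nari3])
```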


  43. Anyway,


  44. Garbage Strings
    Garbage Strings no
    longer affect your app's
    throughput
    Let's stop caring about
    that now!


  45. Another Ruby Myth


  46. Ruby Is Slow Because It's
    a Scripting Language?


  47. Ruby 2.3 New Features!
    RubyVM::InstructionSequence#to_binary(extra_data = nil)
    RubyVM::InstructionSequence.load_from_binary(binary)
    RubyVM::InstructionSequence.load_from_binary_extra_data(binary)


  48. Ruby 2.3 New Features!
    You can precompile
    Ruby code now!
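A minimal round trip through the Ruby 2.3 methods listed on the previous slide, assuming CRuby (MRI). The source string is an arbitrary example. A real precompiler would write `binary` to disk and load it on the next boot instead of parsing the `.rb` file again.

```ruby
source = 'def add(a, b); a + b; end; add(1, 2)'

iseq   = RubyVM::InstructionSequence.compile(source)
binary = iseq.to_binary # serialized bytecode, returned as a String
loaded = RubyVM::InstructionSequence.load_from_binary(binary)
result = loaded.eval    # runs the loaded bytecode

puts "binary size: #{binary.bytesize} bytes, result: #{result}"
```

The binary format is tied to the Ruby build that produced it, so a cache like this has to be regenerated after upgrading Ruby.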


  49. yomikomu?
    See ko1's talk
    tomorrow!


  50. So,


  51. Which Part of the App
    Takes Time?
    Let’s profile!


  52. stackprof
    A sampling call-stack
    profiler
    Flamegraph support
    https://github.com/tmm1/stackprof


  53. peek-rblineprof
    Shows how much time each
    line of your Rails application
    takes throughout a request
    https://github.com/peek/peek-rblineprof


  54. TracePoint


  55. You Can Simply Count the
    Numbers of Method Calls
    Without Adding a Gem
    Use Ruby's built-in
    TracePoint API (ko1)
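A standalone sketch of the idea, with no Rails involved. `Greeter` is a made-up class for illustration. `TracePoint.new`, `#enable`, `#disable`, `#defined_class` and `#method_id` are the standard API.

```ruby
class Greeter
  def greet(name)
    "hi #{name}"
  end
end

counts = Hash.new(0)
trace = TracePoint.new(:call, :c_call) do |tp|
  # :call fires for methods written in Ruby, :c_call for methods written in C.
  counts[[tp.defined_class, tp.method_id]] += 1
end

greeter = Greeter.new
trace.enable
3.times { greeter.greet('matz') }
trace.disable

# Print the most frequently called methods first.
counts.sort_by { |_, n| -n }.first(5).each do |key, n|
  puts "#{key.inspect} => #{n}"
end
```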


  56. Counting Method Calls
    Using TracePoint
    class MethodCounter
      def initialize(app)
        @app = app
      end

      def call(env)
        calls = []
        trace = TracePoint.new(:call, :c_call) do |tp|
          calls << [tp.defined_class, tp.method_id, tp.lineno]
        end
        trace.enable
        ret = @app.call env
        trace.disable
        pp calls.group_by(&:itself).map {|k, v| {k => v.length}}.sort_by {|h| -h.values.first}
        ret
      end
    end
    use MethodCounter


  57. Top 10 Method Calls on the
    Scaffold Index (100 AR models)

    {[ActiveSupport::SafeBuffer, :html_safe?, 212] => 1622},
    {[Object, :html_safe?, 123] => 1213},
    {[Set, :include?, 214] => 1137},
    {[CGI::Escape, :escapeHTML, 39] => 913},
    {[#<Class:ERB::Util>, :unwrapped_html_escape, 34] => 913},
    {[ActiveSupport::Multibyte::Unicode, :tidy_bytes, 245] => 913},
    {[String, :scrub, 248] => 912},
    {[ActiveRecord::AttributeSet, :[], 9] => 900},
    {[ActiveRecord::LazyAttributeHash, :[], 39] => 900},
    {[ActiveRecord::AttributeSet, :fetch_value, 41] => 900},


  58. However,
    These are well-known theories
    that you might have heard of before
    I’ll take a different approach
    today
    I’ll show you some problems that I know
    of from my own experience


  59. Rails Consists of M, V, and C
    Which one of M, V, or C is working
    heavily?


  60. How about ActionPack?
    ActionPack sits on top of Rack
    Let's see if we could find a
    bottleneck in the middleware stack
    Or maybe we could compose a
    minimum Rack middleware stack
    for our app?


  61. Minimum Rack
    Middleware Stack
    This is what rails-api
    (which has been merged
    into Rails 5) does
    Let's see which Rack
    middleware takes time


  62. Measuring Each Rack
    Middleware
    # Again, a very roughly implemented monkey-patch that kinda works...
    $RACK_BENCH_BEFORE_CALL = $RACK_BENCH_AFTER_CALL = nil
    module RackBench
      def call(*)
        p "#{self.class}: before call" => Time.now - $RACK_BENCH_BEFORE_CALL if $RACK_BENCH_BEFORE_CALL
        $RACK_BENCH_BEFORE_CALL = Time.now
        ret = super
        p "#{self.class}: after call" => Time.now - $RACK_BENCH_AFTER_CALL if $RACK_BENCH_AFTER_CALL
        $RACK_BENCH_AFTER_CALL = Time.now
        ret
      end
    end
    Rails.configuration.middleware.each do |m|
      if m.klass.respond_to? :prepend
        m.klass.prepend RackBench
      else
        m.klass.singleton_class.prepend RackBench
      end
    end


  63. Measuring Each Rack
    Middleware - Result
    {"ActionDispatch::Static: before call" => 8.0e-06}
    {"ActionDispatch::Executor: before call" => 0.000107}
    {"AS::Cache::Strategy::LocalCache::Middleware: before call" => 0.002279}
    {"Rack::Runtime: before call" => 1.9e-05}
    {"Rack::MethodOverride: before call" => 8.0e-06}
    {"ActionDispatch::RequestId: before call" => 1.0e-05}
    {"Rails::Rack::Logger: before call" => 4.8e-05}
    {"ActionDispatch::ShowExceptions: before call" => 0.00023}
    {"ActionDispatch::DebugExceptions: before call" => 1.0e-05}
    {"ActionDispatch::RemoteIp: before call" => 1.0e-05}
    {"ActionDispatch::Callbacks: before call" => 1.5e-05}
    {"ActionDispatch::Cookies: before call" => 1.6e-05}
    {"ActionDispatch::Session::CookieStore: before call" => 1.0e-05}
    {"Rack::Head: before call" => 3.1e-05}
    {"Rack::ConditionalGet: before call" => 5.0e-06}
    {"Rack::ETag: before call" => 6.0e-06}


  64. Measuring Each Rack
    Middleware - Result (2)
    {"Rack::ConditionalGet: after call" => 2.4e-05}
    {"Rack::Head: after call" => 5.0e-06}
    {"ActionDispatch::Session::CookieStore: after call" => 0.000269}
    {"ActionDispatch::Cookies: after call" => 7.8e-05}
    {"ActionDispatch::Callbacks: after call" => 4.0e-06}
    {"ActionDispatch::RemoteIp: after call" => 1.3e-05}
    {"ActionDispatch::DebugExceptions: after call" => 6.0e-06}
    {"ActionDispatch::ShowExceptions: after call" => 2.0e-06}
    {"Rails::Rack::Logger: after call" => 2.2e-05}
    {"ActionDispatch::RequestId: after call" => 1.0e-05}
    {"Rack::MethodOverride: after call" => 3.0e-06}
    {"Rack::Runtime: after call" => 1.4e-05}
    {"AS::Cache::Strategy::LocalCache::Middleware: after call" => 7.0e-06}
    {"ActionDispatch::Executor: after call" => 5.0e-06}
    {"ActionDispatch::Static: after call" => 2.0e-06}
    {"Rack::Sendfile: after call" => 7.0e-06}


  65. There's No Slow Middleware
    in the Default Stack
    It wouldn't be that
    effective even if we could
    speed up or remove
    some Rack middleware


  66. Back to the Method Calls
    List Again,


  67. Top 10 Method Calls on the
    Scaffold Index (100 AR models)

    {[ActiveSupport::SafeBuffer, :html_safe?, 212] => 1622},
    {[Object, :html_safe?, 123] => 1213},
    {[Set, :include?, 214] => 1137},
    {[CGI::Escape, :escapeHTML, 39] => 913},
    {[#<Class:ERB::Util>, :unwrapped_html_escape, 34] => 913},
    {[ActiveSupport::Multibyte::Unicode, :tidy_bytes, 245] => 913},
    {[String, :scrub, 248] => 912},
    {[ActiveRecord::AttributeSet, :[], 9] => 900},
    {[ActiveRecord::LazyAttributeHash, :[], 39] => 900},
    {[ActiveRecord::AttributeSet, :fetch_value, 41] => 900},


  68. ActionView


  69. ActionView Has Some
    Performance Problems, For Sure


  70. ActionView Template Rendering Flow
    Template lookup
    Template compilation
    Template rendering


  71. Speeding Up Template Lookup


  72. Current Implementation
    of Template Lookup
    # AV/template/resolver.rb
    module ActionView
      class PathResolver < Resolver #:nodoc:
        def find_template_paths(query)
          Dir[query].uniq.reject do |filename|
            File.directory?(filename) ||
              # deals with case-insensitive file systems.
              !File.fnmatch(query, filename, File::FNM_EXTGLOB)
          end
        end
      end
    end


  73. Current Implementation
    The Resolver queries
    the filesystem on every
    template rendering
    It queries with a Bash-like
    globbing format


  74. Couldn't We Speed This
    Up?
    By default, AV uses a
    Resolver called
    “OptimizedResolver”
    Maybe we can create
    “MoreOptimizedResolver”?


  75. MoreOptimizedResolver -
    Concept
    Why don't we cache
    all filenames, and
    perform the template
    search in memory?
    (in production env)
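A rough sketch of the concept, not the code of the more_optimized_resolver gem. `CachedTemplateIndex` and the sample file names are hypothetical. The filesystem is scanned once, and every later query is matched against the in-memory list with `File.fnmatch`, using the same `File::FNM_EXTGLOB` flag that the current `find_template_paths` uses.

```ruby
require 'tmpdir'
require 'fileutils'

class CachedTemplateIndex
  def initialize(root)
    # One filesystem scan, e.g. at boot in production.
    @paths = Dir.glob(File.join(root, '**', '*'))
                .select { |f| File.file?(f) }
                .sort
                .freeze
  end

  # Match the glob-style query against the cached list instead of the disk.
  def find_template_paths(query)
    @paths.select { |path| File.fnmatch(query, path, File::FNM_EXTGLOB) }
  end
end

found = nil
Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, 'users'))
  FileUtils.touch(File.join(root, 'users', 'index.html.erb'))
  FileUtils.touch(File.join(root, 'users', 'show.html.erb'))

  index = CachedTemplateIndex.new(root)
  query = File.join(root, 'users', 'index{.html,}{.erb,.haml}')
  found = index.find_template_paths(query).map { |path| File.basename(path) }
end

p found
```

A cache like this only suits environments where template files do not change after boot, which is why the slide limits it to production.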


  76. MoreOptimizedResolver -
    Implementation
    https://github.com/amatsuda/more_optimized_resolver


  77. Benchmark
    require 'benchmark/ips'
    view = Class.new(ActionView::Base).new('.')
    path, _prefix, *args = view.lookup_context.send(:args_for_lookup, 'foo', [], false, [], {})
    resolver = ::ActionView::OptimizedFileSystemResolver.new '.'
    Benchmark.ips do |x|
      x.report('default') {
        resolver.find_all(path, '', *args)
      }
    end


  78. Benchmark Result
    # The original Resolver (OptimizedResolver)
    Warming up --------------------------------------
    default 17.000 i/100ms
    Calculating -------------------------------------
    default 179.250 (± 3.3%) i/s - 901.000 in 5.031392s
    # MoreOptimizedResolver
    Warming up --------------------------------------
    default 320.000 i/100ms
    Calculating -------------------------------------
    default 3.266k (± 2.8%) i/s - 16.640k in 5.099793s


  79. Benchmark Result
    18x faster than the
    AV default Resolver!

    (in a micro benchmark)


  80. Inline render partial -
    Concept

    render_partial is basically slow
    Because it looks up the template
    And creates another buffer, runs another
    template compilation & rendering for each partial
    We don't always need a new context for a partial
    Simply concatenating templates (just like PHP's
    include) would be enough in some cases
    e.g. `<%= render 'footer' %>`


  81. Inline render partial -
    Implementation
    WIP


  82. Another Idea
    The `render` method makes too
    many assumptions
    Maybe we can give more
    hints to `render` to make
    template resolution faster?


  83. render :path - Concept
    Maybe `render` can
    accept full_path
    template name so that
    it doesn't have to scan
    through all PathSets?


  84. render :path - API
    render path: __dir__ + '/foo'
    render relative: 'foo'


  85. render :path -
    Implementation
    Unimplemented


  86. Parallelized render partial
    - Concept
    `render_collection` could be
    parallelized
    Partials are basically independent of each other
    We could render all of them
    at once using Threads
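The shape of the idea in plain Ruby, as a sketch only. `render_item` is a hypothetical stand-in for rendering one partial, and this is not the patch the talk refers to.

```ruby
# Stand-in for rendering a single partial; a real one would go through ActionView.
def render_item(item)
  "<li>#{item}</li>"
end

items = %w[apple banana cherry]

# One thread per item. Thread#value joins the thread and returns the block's
# result, and mapping over the threads in order keeps the collection order.
html = items.map { |item| Thread.new { render_item(item) } }.map(&:value).join

puts html
```

In a real app each of these threads may check out its own database connection when it touches a model.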


  87. Parallelized render partial -
    Result
    I tried,
    But with this patch,
    ActiveRecord connections very
    easily bloat up
    And that very often causes
    a "Too many connections" error


  88. Remote Render Partial -
    Concept
    We sometimes want heavy
    partials to be rendered lazily
    Would be nice if we could
    render via Ajax (or
    ActiveJob, maybe)


  89. Remote Render Partial -
    Implementation
    https://github.com/amatsuda/ljax_rails
    (I forgot what "ljax"
    stands for)


  90. ljax_rails - API
    <%= render 'users', remote: true %>


  91. ljax_rails - Result
    Kind of works
    I’m not using it though


  92. Template Rendering
    IMO the most
    unneeded effort in AV
    template rendering is
    Encoding support


  93. Current Implementation
    # template/handlers/erb.rb
    module ActionView
      class Template
        module Handlers
          class ERB
            def call(template)
              # First, convert to BINARY, so in case the encoding is
              # wrong, we can still find an encoding tag
              # (<%# encoding %>) inside the String using a regular
              # expression
              template_source = template.source.dup.force_encoding(Encoding::ASCII_8BIT)
              erb = template_source.gsub(ENCODING_TAG, '')
              encoding = $2
              erb.force_encoding valid_encoding(template.source.dup, encoding)
              # Always make sure we return a String in the default_internal
              erb.encode!
              self.class.erb_implementation.new(
                erb,
                :escape => (self.class.escape_whitelist.include? template.type),
                :trim => (self.class.erb_trim_mode == "-")
              ).src
            end
          end
        end
      end
    end


  94. What We Do for the Multi Encoding Support
    `.dup` the given template source
    `.force_encoding` the source to Binary
    Extract the magic encoding comment from the template source
    `.dup` the given template source again
    `.force_encoding` the template source if a magic comment was found
    `.force_encoding` the ERB template
    `.encode!` the ERB template


  95. Who Needs This Encoding
    Support?
    Who actually writes a non-UTF8 view
    file?
    Who actually puts an encoding magic
    comment in the view files?
    We see some test cases concerning
    Shift JIS encoded templates, but I'm
    sure nobody does this in Japan


  96. Current Status
    99.9% of Rails apps in the
    world do not require this
    feature
    But this default behavior puts
    the brakes on everyone’s apps
    [citation needed]


  97. My Suggestion
    No Encoding conversion!
    Let’s assume that everybody
    writes their template in UTF-8
    If that’s too aggressive, maybe
    we could extract this feature
    to a gem


  98. UTF-8 Only ERBHandler - The Patch
      def call(template)
    -   # 4 lines of comments
    -   template_source = template.source.dup.force_encoding(Encoding::ASCII_8BIT)
    -
    -   erb = template_source.gsub(ENCODING_TAG, '')
    -   encoding = $2
    -
    -   erb.force_encoding valid_encoding(template.source.dup, encoding)
    -
    -   # Always make sure we return a String in the default_internal
    -   erb.encode!
    +   erb = template.source
        self.class.erb_implementation.new(
          erb,
          :escape => (self.class.escape_whitelist.include? template.type),
          :trim => (self.class.erb_trim_mode == "-")
        ).src
      end


  99. UTF-8 Only ERBHandler -
    Benchmark (200 lines ERB)
    require 'benchmark/ips'
    view = Class.new(ActionView::Base).new('.')
    template = view.lookup_context.find_template('foo')
    erb = ::ActionView::Template::Handlers::ERB.new
    Benchmark.ips do |x|
      x.report('default or patched') {
        erb.call template
      }
    end


  100. UTF-8 Only ERBHandler -
    Benchmark Result
    # The Original ERBHandler
    Warming up --------------------------------------
    default 836.000 i/100ms
    Calculating -------------------------------------
    default 8.582k (± 4.9%) i/s - 43.472k in 5.077812s
    # Patched ERBHandler
    Warming up --------------------------------------
    default 1.281k i/100ms
    Calculating -------------------------------------
    default 13.229k (± 6.9%) i/s - 66.612k in 5.058864s


  101. UTF-8 Only ERBHandler -
    Benchmark Result
    1.5x faster!


  102. Only 1.5x?
    This process includes erb
    template => ruby compilation
    One more thing.
    Memory consumption has to
    be reduced


  103. Profiling the Memory
    Usage
    memory_profiler gem
    https://github.com/SamSaffron/memory_profiler


  104. Profiling the Memory
    Usage of the ERB Handler
    require 'memory_profiler'
    view = Class.new(ActionView::Base).new('.')
    template = view.lookup_context.find_template('foo')
    erb = ::ActionView::Template::Handlers::ERB.new
    report = MemoryProfiler.report do
      erb.call template
    end
    report.pretty_print


  105. Memory Usage Result
    # The Original ERBHandler
    allocated memory by class
    -----------------------------------
    1989 String
    640 MatchData
    232 Hash
    160 Array
    144 ActionView::Template::Handlers::Erubis
    80 Symbol
    40 Range
    # Patched ERBHandler
    allocated memory by class
    -----------------------------------
    1660 String
    640 MatchData
    232 Hash
    160 Array
    144 ActionView::Template::Handlers::Erubis
    80 Symbol
    40 Range


  106. Memory Usage
    Memory usage is also very
    important
    If we could reduce this, we
    would be able to put more
    workers in a webapp container


  107. So, I’d Like to Propose Removing
    the Encoding Support from Rails
    Maybe in Rails 6?


  108. BTW,
    This was about the
    ERB Handler


  109. If You're Using Haml
    There are faster alternative
    implementations
    Faml: https://github.com/eagletmt/faml
    Hamlit: https://github.com/k0kubun/hamlit


  110. Just Bundle Either of These
    Gems, Then You’ll Get the Speed!
    (Taken from Hamlit’s README)


  111. AS::SafeBuffer
    As we saw in the
    method calls count,
    SafeBuffer is heavily
    used in ActionView


  112. AS::SafeBuffer
    Very ad hoc implementation
    Every String has a flag inside
    Every template String
    concatenation is performed
    here


  113. Faster AS::SafeBuffer
    I tried to use
    Object#tainted flag...
    but this didn't work
    Maybe we could make a
    faster SafeBuffer in C?


  114. I18n Alternative
    I18n is unnecessarily complex
    (e.g. who uses a non-Yaml
    backend?)
    What if we make a simple I18n
    alternative that does nothing
    but just a simple Hash lookup?
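One way such an alternative could look, as a sketch. `SimpleTranslations` is a hypothetical class, not the author's work-in-progress implementation. Nested translation data (as loaded from YAML) is flattened once into dotted keys, and each lookup is then a single `Hash#fetch`.

```ruby
class SimpleTranslations
  def initialize(nested)
    @flat = {}
    flatten(nested, nil)
    @flat.freeze
  end

  # One Hash lookup per call: no backends, fallbacks, or pluralization.
  def t(key, locale: :en)
    @flat.fetch("#{locale}.#{key}") { "translation missing: #{locale}.#{key}" }
  end

  private

  # Turns {en: {users: {title: 'Users'}}} into {'en.users.title' => 'Users'}.
  def flatten(hash, prefix)
    hash.each do |key, value|
      full_key = prefix ? "#{prefix}.#{key}" : key.to_s
      if value.is_a?(Hash)
        flatten(value, full_key)
      else
        @flat[full_key] = value
      end
    end
  end
end

translations = SimpleTranslations.new(
  en: { users: { title: 'Users' } },
  fr: { users: { title: 'Utilisateurs' } }
)
puts translations.t('users.title')
puts translations.t('users.title', locale: :fr)
```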


  115. I18n Alternative -
    implementation
    WIP
    Almost working, but
    some tests are still
    failing


  116. ActiveRecord


  117. Reducing Arel Objects
    Current AR query creates
    so many Arel Node objects
    Since AR 4, AR caches
    Arel Nodes in memory
    (AdequateRecord)


  118. Reducing Arel Objects -
    Concept
    If the query is simple enough, directly
    compose an SQL statement, just like we
    were doing in AR1 and 2
    If the query is not simple enough,
    fall back to `super` (AR default behavior)
    No caching! (because building the whole
    query is as cheap as computing a cache
    key)


  119. Reducing Arel Objects -
    Implementation
    WIP
    Almost working on Rails 4, not
    working on Rails 5
    The product is called Arenai
    Arel => アレる
    No Arel => アレない


  120. Arenai find -
    Implementation
    # The code is a little bit shortened for the presentation slide
    module Arenai
      module Base
        def find(*ids)
          return super unless ids.length == 1
          return super if block_given? || primary_key.nil? || default_scopes.any? ||
            columns_hash.include?(inheritance_column) || ids.first.kind_of?(Array)
          id = ids.first
          return super if !((Fixnum === id) || (String === id))
          # SELECT "users".* FROM "users" WHERE "users"."id" = $1 [["id", 1]]
          find_by_sql("SELECT #{quoted_table_name}.* FROM #{quoted_table_name} " \
                      "WHERE #{quoted_table_name}.#{connection.quote_column_name primary_key} = $1",
                      [[columns_hash[primary_key], id]]).first
        end
      end
    end


  121. Arenai - Expectation
    Get the AR1 speed back
    Less Object creation
    Less memory
    consumption


  122. AR Object creation

    AR 5 creates an Object for each attribute in a
    model instance
    This brings flexibility
    But it's sometimes too much
    For example, think of a batch system that selects
    100,000 records that have 20 columns. That would
    create 2,000,000 "attribute" Objects
    I'm thinking of a plugin that can reduce this
    Object creation somehow


  123. A Plugin or Patch Doing
    This
    Nothing has been done yet


  124. model.present?
    # We sometimes do this, but…
    if @current_user.present?
    ...
    end


  125. Never Call
    model.present?
    model.present? causes
    a massive number of method calls
    Guess how many. 3? 5?


  126. Method Calls That Happen
    When You Hit @user.present?
    Object#present?
    Object#blank?
    ActiveRecord::AttributeMethods#respond_to?
    ActiveModel::AttributeMethods#respond_to?
    Kernel#respond_to?
    Kernel#respond_to_missing?
    Kernel#respond_to?
    Kernel#respond_to_missing?
    Symbol#to_s
    ActiveModel::AttributeMethods#matched_attribute_method
    Kernel#class
    ActiveModel::AttributeMethods::ClassMethods#attribute_method_matchers_matching
    ActiveModel::AttributeMethods::ClassMethods#attribute_method_matchers_cache
    Concurrent::Collection::MriMapBackend#compute_if_absent
    Concurrent::Collection::NonConcurrentMapBackend#[]
    Mutex#synchronize
    Concurrent::Collection::NonConcurrentMapBackend#compute_if_absent
    Hash#fetch
    ##attribute_method_matchers
    Symbol#to_proc
    Enumerable#partition
    Array#each
    ActiveModel::AttributeMethods::ClassMethods::AttributeMethodMatcher#plain?
    ActiveModel::AttributeMethods::ClassMethods::AttributeMethodMatcher#plain?
    ActiveModel::AttributeMethods::ClassMethods::AttributeMethodMatcher#plain?
    ActiveModel::AttributeMethods::ClassMethods::AttributeMethodMatcher#plain?
    ActiveModel::AttributeMethods::ClassMethods::AttributeMethodMatcher#plain?
    ActiveModel::AttributeMethods::ClassMethods::AttributeMethodMatcher#plain?
    ActiveModel::AttributeMethods::ClassMethods::AttributeMethodMatcher#plain?
    ActiveModel::AttributeMethods::ClassMethods::AttributeMethodMatcher#plain?
    ActiveModel::AttributeMethods::ClassMethods::AttributeMethodMatcher#plain?
    ActiveModel::AttributeMethods::ClassMethods::AttributeMethodMatcher#plain?
    ActiveModel::AttributeMethods::ClassMethods::AttributeMethodMatcher#plain?
    ActiveModel::AttributeMethods::ClassMethods::AttributeMethodMatcher#plain?
    Array#reverse
    Array#flatten
    Array#map
    ActiveModel::AttributeMethods::ClassMethods::AttributeMethodMatcher#match
    Regexp#=~
    ActiveModel::AttributeMethods::ClassMethods::AttributeMethodMatcher#match
    Regexp#=~
    ActiveModel::AttributeMethods::ClassMethods::AttributeMethodMatcher#match
    Regexp#=~
    ActiveModel::AttributeMethods::ClassMethods::AttributeMethodMatcher#match
    Regexp#=~
    #<Class:ActiveModel::AttributeMethods::ClassMethods::AttributeMethodMatcher::AttributeMethodMatch>#new
    Struct#initialize
    ActiveModel::AttributeMethods::ClassMethods::AttributeMethodMatcher#match
    Regexp#=~
    ActiveModel::AttributeMethods::ClassMethods::AttributeMethodMatcher#match
    Regexp#=~
    ActiveModel::AttributeMethods::ClassMethods::AttributeMethodMatcher#match
    Regexp#=~
    ActiveModel::AttributeMethods::ClassMethods::AttributeMethodMatcher#match
    Regexp#=~
    ActiveModel::AttributeMethods::ClassMethods::AttributeMethodMatcher#match
    Regexp#=~
    ActiveModel::AttributeMethods::ClassMethods::AttributeMethodMatcher#match
    Regexp#=~
    ActiveModel::AttributeMethods::ClassMethods::AttributeMethodMatcher#match
    Regexp#=~
    ActiveModel::AttributeMethods::ClassMethods::AttributeMethodMatcher#match
    Regexp#=~
    #<Class:ActiveModel::AttributeMethods::ClassMethods::AttributeMethodMatcher::AttributeMethodMatch>#new
    Struct#initialize
    Array#compact
    Enumerable#detect
    Array#each
    ActiveModel::AttributeMethods::ClassMethods::AttributeMethodMatcher::AttributeMethodMatch#attr_name
    ActiveRecord::AttributeMethods::PrimaryKey#attribute_method?
    ActiveRecord::AttributeMethods#attribute_method?
    ActiveRecord::AttributeSet#key?
    ActiveRecord::LazyAttributeHash#key?
    Hash#key?
    Hash#key?
    Hash#key?
    ActiveModel::AttributeMethods::ClassMethods::AttributeMethodMatcher::AttributeMethodMatch#attr_name
    ActiveRecord::AttributeMethods::PrimaryKey#attribute_method?
    ActiveRecord::AttributeMethods#attribute_method?
    ActiveRecord::AttributeSet#key?
    ActiveRecord::LazyAttributeHash#key?
    Hash#key?
    Hash#key?
    Hash#key?
    NilClass#nil?


  127. 85 Method Calls!
    It became significantly more
    expensive after the AR 4 and 5
    refactoring
    What a trap!
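When the variable holds either a record or `nil`, a plain truthiness check answers the same question without any method dispatch, because a model instance is never blank. A sketch with a hypothetical `User` struct standing in for an ActiveRecord model:

```ruby
# Stand-in for an ActiveRecord model instance.
User = Struct.new(:name)

def greeting_for(current_user)
  # `if current_user` only tests for nil/false; nothing is called on the object.
  if current_user
    "Hello, #{current_user.name}"
  else
    'Please sign in'
  end
end

puts greeting_for(User.new('akira'))
puts greeting_for(nil)
```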


  128. https://github.com/rails/rails/pull/23394
    I suggested a patch to fix
    this situation, but the
    proposal was turned down
    Because Rails is expecting you
    all to be careful enough never
    to walk into this trap


  129. Or You Can Monkey-Patch
    module ActiveRecord
    class Base
    def present?
    true
    end
    def blank?
    false
    end
    end
    end


  130. Other Kinds of
    Performance Concerns
    Development env
    Booting up the App
    Running tests


  131. A Rails App Booting
    Process
    Bundles the gems
    Requires the libraries
    Loads the app
    Runs the "Initializers"


  132. Maybe There's an Initializer
    Taking Too Much Time
      # railties/lib/rails/initializable.rb
      module Rails
        module Initializable
          def run_initializers(group=:default, *args)
            return if instance_variable_defined?(:@ran)
            initializers.tsort_each do |initializer|
    +         now = Time.now
              initializer.run(*args) if initializer.belongs_to?(group)
    +         p initializer.name => Time.now - now if Time.now - now > 0.1
            end
            @ran = true
          end
        end
      end


  133. Which Gem Takes Time
    When Being required?
      # bundler/lib/bundler/runtime.rb
      module Bundler
        class Runtime < Environment
          def require(*groups)
            groups.map!(&:to_sym)
            groups = [:default] if groups.empty?
            @definition.dependencies.each do |dep|
              # Skip the dependency if it is not in any of the requested
              # groups
              next unless (dep.groups & groups).any? && dep.current_platform?
    +         now = Time.now
              required_file = nil
              ...
              end
    +         p dep.name => Time.now - now
            end
          end
        end
      end


  134. This Way, I Found Dozens of Gems
    That Slow Down Our App Boot
    (And so I sent dozens of
    patches)
    Most of them just didn't use
    `AS.on_load` properly
    e.g.) https://github.com/zdennis/activerecord-import/pull/136
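To show what using the hook properly means: `ActiveSupport.on_load(:active_record) { ... }` registers a block that runs only when the framework class is actually loaded, so a gem does not force that load at `require` time. The `LazyHooks` module below is a simplified stand-in written for this sketch, not ActiveSupport's implementation, and `Record` and `Importable` are made-up names.

```ruby
module LazyHooks
  @hooks  = Hash.new { |hash, name| hash[name] = [] }
  @loaded = {}

  class << self
    # Run the block now if the target is already loaded, otherwise keep it for later.
    def on_load(name, &block)
      if @loaded.key?(name)
        @loaded[name].instance_eval(&block)
      else
        @hooks[name] << block
      end
    end

    # Called by the framework class once it has been defined.
    def run_load_hooks(name, base)
      @loaded[name] = base
      @hooks[name].each { |hook| base.instance_eval(&hook) }
    end
  end
end

module Importable
  def import
    :imported
  end
end

# A gem registers its extension without touching the framework class.
LazyHooks.on_load(:record) { include Importable }

class Record; end                         # the expensive class is defined later...
LazyHooks.run_load_hooks(:record, Record) # ...and announces that it is ready

p Record.new.import
```

A gem that instead references the framework class directly at the top level of its entry file pulls that whole class in during boot.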


  135. There Are Some Gems That
    Shouldn't Be required via Bundler
    e.g.) pry-doc

    Actually, pry-*
    Just add `require: false`
    to each of them in your
    Gemfile
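In a Gemfile this looks roughly as follows. The group and the first gem are illustrative choices; `require: false` is the standard Bundler option that keeps a gem installed but skips it in `Bundler.require`.

```ruby
# Gemfile
group :development do
  gem 'pry'
  gem 'pry-doc', require: false # installed, but not required at app boot
end
```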


  136. Prying out the
    Mechanism
    pry-doc takes 0.2sec to load on my MBP(SSD)
    When booting pry, it scans through all the
    installed gems and tries to require every gem
    that matches pry-* (see lib/pry/plugins.rb)
    You don't have to require pry-* via
    Bundler at all. You might not need them until you
    boot pry


  137. Squashing All Gems Into
    One Directory
    Every RubyGem has its own path,
    and its own namespace inside
    Can’t all these gems be merged
    into one directory so we could
    make $LOAD_PATH shorter, then
    make require faster?

  138. bundle-squash -
    Implementation
    https://github.com/amatsuda/bundle-squash

  139. bundle-squash
    Still not perfectly
    working with Rails
    No significant
    performance
    improvement :<

  140. Kernel#require vs
    Kernel#require_relative
    ko1 once told me that
    require_relative must be
    faster
    Can we speed up Rails boot
    by replacing `require` =>
    `require_relative`?

  141. require_relative branch
    I tried.
    https://github.com/amatsuda/rails/tree/require_relative
    No significant speed
    improvement :<

  142. autoload in production
    We had better avoid
    autoloading in production,
    especially in a forked process
    Let’s make sure that we’re not
    autoloading anything

  143. Detecting autoload
    TracePoint.new(:call, :c_call) do |tp|
      if tp.method_id == :autoload
        b = tp.binding
        case tp.event
        when :call
          p [tp.lineno, tp.defined_class, b.local_variable_get(:const_name)]
        when :c_call
          if b.local_variable_defined?(:const_name)
            p [tp.lineno, tp.defined_class, b.local_variable_get(:const_name),
               b.local_variable_get(:full), b.local_variable_get(:path)]
          else
            p [tp.lineno, tp.defined_class, b.local_variables]
            puts caller
            puts
          end
        end
      end
    end.enable
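
A complementary and cheaper check is to ask a module which of its constants are still waiting to be autoloaded. This is a minimal sketch using only core Ruby. `pending_autoloads` and the `Demo` module are hypothetical names, and the path is a placeholder that is never loaded.

```ruby
# Hypothetical helper: lists the constants registered with `autoload`
# directly under +mod+ whose files have not been loaded yet.
def pending_autoloads(mod)
  mod.constants(false).select { |name| mod.autoload?(name) }
end

# Demo module with one pending autoload and one ordinary constant.
module Demo
  autoload :Lazy, '/path/to/demo/lazy.rb'
  Eager = Class.new
end

p pending_autoloads(Demo)
```

Running such a check after boot (for example against `Rack`) shows what would still be loaded lazily on the first request.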

  144. I Found 2 Occurrences
    When posting a form:
    rack-2.0.0.alpha/lib/rack/
    multipart.rb:8
    rack-2.0.0.alpha/lib/rack/
    multipart.rb:9
    There’ll probably be more?

  145. Speeding Up Tests

  146. INSERT INTO
    schema_migrations
    We noticed that our
    app with 600 tables
    took 1 minute to
    create all tables in
    CircleCI

  147. What Was Happening
    "INSERT INTO schema_migrations
    (version) VALUES ('20160504000000');"
    "INSERT INTO schema_migrations
    (version) VALUES ('20160504000001');"
    "INSERT INTO schema_migrations
    (version) VALUES ('20160504000002');"
    … (600 SQLs)

  148. We Changed This to…
    "INSERT INTO
    schema_migrations (version)
    VALUES ('20160504000000'),
    ('20160504000001'),
    ('20160504000002');"
    (1 SQL!)
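
The idea can be sketched in plain Ruby as follows. This is not the code from the Rails commit: `schema_migrations_insert_sql` is a hypothetical helper that only builds the SQL string.

```ruby
# Hypothetical helper (not the code from the Rails commit): builds one
# multi-row INSERT for a list of migration versions.
def schema_migrations_insert_sql(versions)
  rows = versions.map do |version|
    # Versions are timestamps, so accept digits only before interpolating.
    unless version.to_s.match?(/\A\d+\z/)
      raise ArgumentError, "invalid version: #{version.inspect}"
    end
    "('#{version}')"
  end
  "INSERT INTO schema_migrations (version) VALUES #{rows.join(', ')};"
end

puts schema_migrations_insert_sql(%w[20160504000000 20160504000001 20160504000002])
```

One statement means one round trip to the database instead of one per migration.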

  149. This Commit Is Included
    in Rails 5
    https://github.com/rails/rails/commit/42dd233
    This patch was provided
    by MoneyForward
    (@ppworks and me)

  150. If You Feel Like Your
    Database Cleaning Is Slow
    database_cleaner’s deletion and
    truncation strategies delete
    (truncate) all tables
    That is unbearably slow if you
    have hundreds of tables ×
    thousands of test cases

  151. In Such a Case,
    Use database_rewinder

  152. database_rewinder -
    Concept
    It remembers the names of the
    tables inserted into during
    each test
    And deletes only from
    those tables
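
The concept can be illustrated with a small plain-Ruby sketch. This is not database_rewinder's actual code: `InsertedTableTracker` is a hypothetical class that is fed SQL strings by hand, whereas a real tool would hook into the database adapter.

```ruby
require 'set'

# Illustrative sketch of the concept only; not database_rewinder's code.
class InsertedTableTracker
  INSERT_PATTERN = /\A\s*INSERT\s+INTO\s+[`"]?(\w+)[`"]?/i

  def initialize
    @tables = Set.new
  end

  # Call with every SQL statement executed during a test.
  def record(sql)
    match = INSERT_PATTERN.match(sql)
    @tables << match[1] if match
  end

  # Returns DELETE statements for the touched tables only, then resets.
  def cleanup_statements
    statements = @tables.map { |table| "DELETE FROM #{table};" }
    @tables.clear
    statements
  end
end

tracker = InsertedTableTracker.new
tracker.record("INSERT INTO users (name) VALUES ('a')")
tracker.record('SELECT * FROM posts')
tracker.record('INSERT INTO `comments` (body) VALUES (1)')
puts tracker.cleanup_statements
```

With hundreds of tables and a test that touches two of them, cleanup then costs two statements rather than hundreds.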

  153. database_rewinder -
    Implementation
    https://github.com/amatsuda/database_rewinder

  154. ActiveSupport

  155. Some Slow Parts in AS
    Multibyte
    timezones
    (AS::TimeWithZone is
    unbearably slow)
    How can we not load them?

  156. AS::Multibyte
    Consists of Multibyte::Chars and
    Multibyte::Unicode
    Loads the whole Unicode database file
    (AS/values/unicode_tables.dat), which consumes
    time and memory
    I’m not sure if we still need this (doesn’t ruby
    have this?)
    I suppose we Japanese don’t use most of the
    features provided here

  157. AS::TimeWithZone
    Known as a slower
    version of Time

  158. Time vs
    AS::TimeWithZone
    require 'benchmark/ips'

    # Run inside a Rails app (e.g. the Rails console) so that Time.zone is set
    Benchmark.ips do |x|
      x.report('Time') {
        Time.now
      }
      x.report('Time.zone.now') {
        Time.zone.now
      }
      x.compare!
    end

  159. AS::TimeWithZone is 25x
    Slower Than Time!
    Warming up --------------------------------------
    Time 145.872k i/100ms
    Time.zone.now 8.557k i/100ms
    Calculating -------------------------------------
    Time 2.209M (± 7.6%) i/s - 11.086M in 5.048168s
    Time.zone.now 88.469k (± 4.3%) i/s - 444.964k in 5.039123s
    Comparison:
    Time: 2209154.5 i/s
    Time.zone.now: 88468.8 i/s - 24.97x slower

  160. If You’re 100% Sure What
    You Are Doing
    Maybe you could speed
    up your app by using
    Time instead of
    AS::TimeWithZone

  161. Boosting with C
    Extensions
    Sometimes,
    reimplementing a
    performance hotspot in
    C can boost
    performance

  162. Boosting with C
    Extensions
    CGI.escapeHTML (ruby 2.3)
    CGI.escape (ruby 2.4)
    fast_blank (SamSaffron’s gem)
    hwia (HashWithIndifferentAccess
    in C for Rails 2)

  163. Use Newest Ruby
    If you’re still using ruby
    < 2.3
    You’ll get the
    performance for free
    just by updating ruby

  164. Conclusion

  165. Maybe What We Need Is More
    Flexibility and Modularity
    YMMV

  166. YMMV
    There will be no one single
    bottleneck for every app
    Some apps might have 1000 models,
    some apps might have 3000 lines of
    routes.rb
    If you feel your Rails app is slow,
    you need to find your solution

  167. Rails is Omakase
    It’s a really good thing for
    newbies that we don’t need
    any special configuration
    But in some cases we need
    some special customization on
    certain parts

  168. Maybe What We Need Is More
    Flexibility and Modularity
    I know a piece of software
    designed that way
    It used to be called
    “Merb”!

  169. Everybody Let’s Hack!
    There remain so many problems
    So many possibilities for improvement
    Everyone, do reveal your hacks!
    Let there be more alternatives
    That would bring the missing
    “Merbism” back to the community

  170. end
