Upgrade to Pro — share decks privately, control downloads, hide ads and more …

RailsConf 2018 Keynote

RailsConf 2018 Keynote

My keynote at RailsConf 2018

Aaron Patterson

May 11, 2018
Tweet

More Decks by Aaron Patterson

Other Decks in Programming

Transcript

  1. Exact Profiler def fast; end def slow; sleep(0.1); end def

    benchmark_me 1000.times { fast } 10.times { slow } end def exact_profiler counter = Hash.new(0) tp = TracePoint.new(:call) do |event| counter[event.method_id] += 1 end tp.enable yield tp.disable counter end result = exact_profiler { benchmark_me } p result # => {:benchmark_me=>1, :fast=>1000, :slow=>10} Listen for an Event Keep Track of Calls
  2. Sampling Profiler def fast; end def slow; sleep(0.1); end def

    benchmark_me 1000.times { fast } 10.times { slow } end def exact_profiler target = Thread.current samples = Hash.new(0) t = Thread.new do loop do sleep 0.0001 # sample rate samples[target.backtrace_locations.first.label] += 1 end end yield t.kill samples.dup end result = exact_profiler { benchmark_me } p result # => {"sleep"=>6497, "slow"=>1} 6498 samples 99% inside `sleep`
  3. Using StackProf require 'stackprof' def fast; end def slow; sleep(0.1);

    end def benchmark_me 1000.times { fast } 10.times { slow } end StackProf.run(mode: :wall, out: 'profile.dump') do benchmark_me end
  4. `stackprof` output $ stackprof profile.dump ================================== Mode: wall(1000) Samples: 999

    (0.20% miss rate) GC: 0 (0.00%) ================================== TOTAL (pct) SAMPLES (pct) FRAME 999 (100.0%) 999 (100.0%) Object#slow 999 (100.0%) 0 (0.0%) Object#benchmark_me 999 (100.0%) 0 (0.0%) block in <main> 999 (100.0%) 0 (0.0%) <main> 999 (100.0%) 0 (0.0%) <main>
  5. Rails Boot Process Requiring Files "Compiling" Application Compiling Views Time

    Hit Enter Not Ready For Requests First Request Business Logic Second Request Rack Starts Unicorn Starts
  6. Rails Boot Process Requiring Files "Compiling" Application Compiling Views Time

    Restarting the Webserver Warmup Time Business Logic Runtime Running Scripts
  7. Example Output $ stackprof profile.dump ================================== Mode: wall(1000) Samples: 3951

    (6.17% miss rate) GC: 663 (16.78%) ================================== TOTAL (pct) SAMPLES (pct) FRAME 1134 (28.7%) 821 (20.8%) ActiveSupport::Dependencies::Loadable#require 663 (16.8%) 663 (16.8%) (garbage collection) 166 (4.2%) 104 (2.6%) Kernel#require 86 (2.2%) 86 (2.2%) #<Module:0x00007fb24f9f5200>.mechanism 98 (2.5%) 59 (1.5%) Gem::Specification#validate 56 (1.4%) 55 (1.4%) Module#delegate 51 (1.3%) 40 (1.0%) <top (required)> 51 (1.3%) 36 (0.9%) <top (required)> 40 (1.0%) 34 (0.9%) Gem::StubSpecification#data 138 (3.5%) 34 (0.9%) <top (required)> 137 (3.5%) 33 (0.8%) <top (required)> Time in GC
  8. An Object is a 40 byte chunk Ruby Object x

    = "omg this is a very long string!"
  9. An Object is a 40 byte chunk Ruby Object x

    = "omg this is a very long string!" "omg this is a very long string!" `malloc` GC
  10. Ruby Object Empty Slot An Object is a 40 byte

    chunk x = "omg this is a very long string!" "omg this is a very long string!" `malloc` Empty Slot Empty Slot Empty Slot Empty Slot Empty Slot Empty Slot Empty Slot Page
  11. Rails App Boot (default) $ stackprof profile.dump ================================== Mode: wall(1000)

    Samples: 4434 (8.22% miss rate) GC: 730 (16.46%) ================================== TOTAL (pct) SAMPLES (pct) FRAME 1125 (25.4%) 817 (18.4%) ActiveSupport::Dependencies::Loadable#require 730 (16.5%) 730 (16.5%) (garbage collection) 202 (4.6%) 135 (3.0%) Kernel#require 82 (1.8%) 82 (1.8%) #<Module:0x00007ffd3533ce00>.mechanism 106 (2.4%) 60 (1.4%) Gem::Specification#validate 55 (1.2%) 54 (1.2%) Module#delegate 72 (1.6%) 53 (1.2%) <top (required)> 170 (3.8%) 47 (1.1%) <top (required)> 51 (1.2%) 45 (1.0%) <top (required)> 58 (1.3%) 38 (0.9%) <top (required)> 164 (3.7%) 36 (0.8%) <top (required)> 234 (5.3%) 34 (0.8%) <module:Helpers> 41 (0.9%) 32 (0.7%) Gem::StubSpecification#data 16.5% Spent in GC $ RAILS_ENV=production ruby -I. bench.rb
  12. Rails App Boot (RUBY_GC_HEAP_INIT_SLOTS=600000) $ stackprof profile.dump ================================== Mode: wall(1000)

    Samples: 3269 (7.03% miss rate) GC: 279 (8.53%) ================================== TOTAL (pct) SAMPLES (pct) FRAME 924 (28.3%) 623 (19.1%) ActiveSupport::Dependencies::Loadable#require 279 (8.5%) 279 (8.5%) (garbage collection) 162 (5.0%) 97 (3.0%) Kernel#require 86 (2.6%) 86 (2.6%) #<Module: 0x00007f937b52a7e0>.mechanism 56 (1.7%) 55 (1.7%) Module#delegate 95 (2.9%) 51 (1.6%) Gem::Specification#validate 144 (4.4%) 35 (1.1%) <top (required)> 50 (1.5%) 35 (1.1%) <top (required)> 127 (3.9%) 31 (0.9%) <top (required)> 35 (1.1%) 29 (0.9%) <top (required)> 8.3% Spent in GC $ RUBY_GC_HEAP_INIT_SLOTS=600000 RAILS_ENV=production ruby -I. bench.rb
  13. Boot Time $ time RAILS_ENV=production ruby -I. -rconfig/environment -e' '

    real 0m4.105s user 0m3.464s sys 0m0.607s $ time RUBY_GC_HEAP_INIT_SLOTS=600000 RAILS_ENV=production ruby -I. -rconfig/environment -e' ' real 0m3.662s user 0m3.015s sys 0m0.607s
  14. Growth Test Program x = [] a = GC.stat(:count) 600000.times

    do x << Object.new end b = GC.stat(:count) p [a, b]
  15. Heap Growth Patterns Number of Slots 0 150000 300000 450000

    600000 Number of Heap Resizes 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 Goal=0.2 Goal=0.7
  16. Try at Home! $ RUBY_GC_HEAP_FREE_SLOTS_MIN_RATIO=0.1 RUBY_GC_HEAP_FREE_SLOTS_MAX_RATIO=0.9 RUBY_GC_HEAP_FREE_SLOTS_GOAL_RATIO=0.7 ./ruby --disable-gems grow_heap.rb

    $ RUBY_GC_HEAP_FREE_SLOTS_MIN_RATIO=0.1 RUBY_GC_HEAP_FREE_SLOTS_MAX_RATIO=0.9 RUBY_GC_HEAP_FREE_SLOTS_GOAL_RATIO=0.2 ./ruby --disable-gems grow_heap.rb
  17. Object Tracing Output $ stackprof profile.dump ================================== Mode: object(1) Samples:

    951217 (0.00% miss rate) GC: 0 (0.00%) ================================== TOTAL (pct) SAMPLES (pct) FRAME 263967 (27.8%) 199899 (21.0%) ActiveSupport::Dependencies::Loadable#require 51931 (5.5%) 43885 (4.6%) Kernel#require 43744 (4.6%) 23606 (2.5%) Kernel#require 19553 (2.1%) 19553 (2.1%) Module#delegate 57342 (6.0%) 17821 (1.9%) <top (required)> 28941 (3.0%) 15809 (1.7%) Gem::StubSpecification#data 60887 (6.4%) 10937 (1.1%) <module:Helpers> 12754 (1.3%) 10183 (1.1%) <top (required)>
  18. Heap Dumps require 'objspace' ObjectSpace.trace_object_allocations_start require 'config/environment' GC.start File.open("my_dump.json", "w")

    do |f| ObjectSpace.dump_all(output: f) end Enable Allocation Location Recording The Actual Benchmark Clear any garbage Dump the heap
  19. JSON file (1 line = 1 Object) {"address":"0x7fd43461bab0", "type":"ARRAY", "length":6,

    "references": ["0x7fd4342164d8", "0x7fd4338b28d8", "0x7fd4338b28d8"], "file":"/Users/ aaron/git/rails/activesupport/lib/active_support/dependencies.rb", "line":283, "method":"require", "generation":46, "memsize":200, "flags": {"wb_protected":true, "old":true, "uncollectible":true, "marked":true}}
  20. JSON file (1 line = 1 Object) { "address": "0x7fd43461bab0",

    "type": "ARRAY", "length": 6, "references": [ "0x7fd4342164d8", "0x7fd4338b28d8", "0x7fd4338b28d8" ], "file": "/Users/aaron/git/rails/activesupport/lib/active_support/dependencies.rb", "line": 283, "method": "require", "generation": 46, "memsize": 200, "flags": { "wb_protected": true, "old": true, "uncollectible": true, "marked": true } } Object Type References Allocation Location Size in Memory
  21. What File is Most Popular? $ jq .file my_dump.json |

    sort | uniq -c | sort -n | tail -10 Extract the "file" attribute Count Unique Files Sort Get Top 10
  22. Sample Output 1768 "lib/ruby/2.6.0/x86_64-darwin17/openssl.bundle" 2218 "lib/ruby/gems/2.6.0/gems/sass-3.5.4/lib/sass/engine.rb" 2365 "lib/ruby/2.6.0/rubygems/stub_specification.rb" 2469 "lib/ruby/2.6.0/rubygems/specification.rb"

    2805 "lib/ruby/2.6.0/delegate.rb" 2977 "activesupport/lib/active_support/core_ext/module/delegation.rb" 3187 "lib/ruby/gems/2.6.0/gems/tzinfo-1.2.5/lib/tzinfo/zoneinfo_data_source.rb" 10628 "lib/ruby/1.6.0/rubygems/core_ext/kernel_require.rb" 12448 null 25693 "activesupport/lib/active_support/dependencies.rb"
  23. What Line is Most Popular? $ jq 'select(.file == "activesupport/lib/active_support/

    dependencies.rb") | .line' my_dump.json | sort | uniq -c | sort -n | tail -10 Filter objects allocated in this file
  24. Sample Output $ jq 'select(.file == "/Users/aaron/git/rails/activesupport/lib/ active_support/dependencies.rb") | .line'

    my_dump.json | sort | uniq -c | sort -n | tail -10 2 304 3 102 3 173 3 215 3 289 4 559 4 92 48 277 53 16 25487 283
  25. Lambda References def foo(a, b) lambda { a + b

    } end m = foo(10, 11) Lambda Number Number
  26. Lambda References def foo(a, b) lambda { } end m

    = foo(10, 11) Lambda Number Number
  27. No Escape! def foo(a, b) lambda { } end m

    = foo(10, 11) env = m.binding # lambda environment p env.local_variables # [:a, :b] p env.local_variable_get(:a) # 10 p env.local_variable_get(:b) # 11
  28. Rails Boot Process Requiring Files "Compiling" Application Compiling Views Time

    Restarting the Webserver Warmup Time Business Logic Runtime Running Scripts
  29. Rails Boot Process Requiring Files "Compiling" Application Compiling Views Time

    Restarting the Webserver Warmup Time Business Logic Runtime Running Scripts
  30. New `rackup` Options $ ruby -I lib bin/rackup --help Usage:

    rackup [ruby options] [rack options] [rackup config] Profiling options: --heap HEAPFILE Build the application, then dump the heap to HEAPFILE --profile PROFILE Dump CPU or Memory profile to PROFILE (defaults to a tempfile) --profile-mode MODE Profile mode (cpu|wall|object)
  31. Rails Boot Process Requiring Files "Compiling" Application Compiling Views Time

    Restarting the Webserver Warmup Time Business Logic Runtime
  32. Output GitHub (master [490a985]) - Rails 4.2.10.github (development) 2018-04-19 12:16:11

    -0400 12:16 up 6 days, 19:54, 8 users, load averages: 2.94 2.27 2.39 [email protected]:/Users/aaron/github/github pid 35651 - 940 MB RSS - 54.303887s CPU - 2,163,221 objects 9763 loaded features {:TOTAL=>3899911, :FREE=>365601, :T_OBJECT=>370493, :T_CLASS=>138069, :T_MODULE=>32498, :T_FLOAT=>17, :T_STRING=>1001412, :T_REGEXP=>14153, :T_ARRAY=>398461, :T_HASH=>152491, :T_STRUCT=>11063, :T_BIGNUM=>46, :T_FILE=>81, :T_DATA=>166354, :T_MATCH=>37127, :T_COMPLEX=>1, :T_RATIONAL=>907, : T_SYMBOL=>10038, :T_IMEMO=>1083137, :T_NODE=>1729, :T_ICLASS=>116233} {:count=>54, :heap_allocated_pages=>9568, :heap_sorted_length=>9910, :heap_allocatable_pages=>342, :heap_available_slots=>3899911, :heap_live_slots=>3534410, :heap_free_slots=>365501, :heap_final_slots=>0, :heap_marked_slots=>3186821, :heap_eden_pages=>9568, :heap_tomb_pages=>0, :total_al located_pages=>9568, :total_freed_pages=>0, :total_allocated_objects=>37844811, :total_freed_objects=>34310401, :malloc_increase_bytes=>12849616, :malloc_increase_bytes_limit=>128000000, :minor_gc_count=>43, :major_gc_count=>11, :remembered_wb_unprotected_objects=>77457, :remembered_wb_un protected_objects_limit=>154914, :old_objects=>3009873, :old_objects_limit=>6019746, :oldmalloc_increase_bytes=>12850064, :oldmalloc_increase_bytes_limit=>128000000} /Users/aaron/github/github/vendor/ruby/0d9762bcc5e0bb319765c8e40c5524509159003b/bin/ruby ruby 2.4.3p200 (2017-09-15 revision 59914) [x86_64-darwin17] RUBY_GC_HEAP_GROWTH_FACTOR=1.25 RUBY_GC_HEAP_GROWTH_MAX_SLOTS=300000 RUBY_GC_OLDMALLOC_LIMIT=128000000 RUBY_GC_HEAP_INIT_SLOTS=600000 RUBY_GC_MALLOC_LIMIT=128000000 RUBY_GC_HEAP_FREE_SLOTS=600000 10 requests to http://localhost:3000/ as (anonymous) peak memory: 979 MB RSS response size: 504 KB total (50.4 KB/req) response time: 0ms total render time: 0ms total cpu time: 0ms total idle time: 0ms total oob gc: 0 / 0ms in-request gc: 0 / 0ms total allocations: 0 objs total ar objs: 0 objs total smoke: 0 / 0ms total mysql: 0 / 0ms total redis: 0 / 0ms total cache: 0 / 0ms total marshal: 0 / 0ms total zlib: 0 / 0ms total {:TOTAL=>4338897, :FREE=>1023915, :T_OBJECT=>296251, :T_CLASS=>138076, :T_MODULE=>32499, :T_FLOAT=>17, :T_STRING=>911236, :T_REGEXP=>17144, :T_ARRAY=>412142, :T_HASH=>166431, :T_STRUCT=>11006, :T_BIGNUM=>46, :T_FILE=>6, :T_DATA=>166436, :T_MATCH=>401, :T_COMPLEX=>1, :T_RATIONAL=>907, :T_S YMBOL=>10042, :T_IMEMO=>1034234, :T_NODE=>1729, :T_ICLASS=>116378} {Thread::Mutex=>4749, Time=>3263, Proc=>154783, StringIO=>4, UnboundMethod=>1166, Thread=>2, Concurrent::AtomicFixnum=>1, Random=>2, Method=>454, Google::Protobuf::FieldDescriptor=>647, Google::Protobuf::Descriptor=>147, Google::Protobuf::OneofDescriptor=>3, :autoload_i=>663, :autoload=>94, DateTime=>20, Google::Protobuf::EnumDescriptor=>49, Goomba::Sanitizer=>2, Goomba::Selector=>32, PrettyLights::Theme=>1, PrettyLights::Library=>1, Object=>4, FFI::DynamicLibrary=>5, BigDecimal=>3, Keiko=>1, Date=>15, Encoding=>101, ThreadGroup=>1, Binding=>2, RubyVM=>1, ARGF.class=>1, Data=>1, :unknown=>5, JSON::Ext::Generator::State=>3, swig_runtime_data=>1, OpenSSL::Digest::SHA256=>2, ObjectSpace::WeakMap=>1, GRPC::Core::TimeSpec=>3, FFI::FunctionType=>44, FFI::Function=>41, FFI::DynamicLibrary::Symbol=>42, FFI::Type::Mapped=>13, FFI::StructByReference=>4, FFI::VariadicInvoker=>1, FFI::StructLayout=>5, FFI::StructLayout::Number=>8, FFI::StructLayout::Array=>1, FFI::ArrayType=>1, FFI::StructLayout::Pointer=>3, FFI::StructLayout::String=>4, FFI::StructLayout::InnerStruct=>1, FFI::StructByValue=>1, FFI::StructLayout::Mapped=>2, Google::Protobuf::DescriptorPool=>1, MessagePack::Factory=>1, Concurrent::AtomicReference=>1, Rlibmemcached::MemcachedSt=>1, OpenSSL::PKey::EC::Group=>1, OpenSSL::X509::Store=>1, OpenSSL::PKey::DH=>2, RE2::Regexp=>1, FFI::Pointer=>1, FFI::Type::Builtin=>21, OpenSSL::Cipher=>1} {:count=>60, :heap_allocated_pages=>10645, :heap_sorted_length=>10645, :heap_allocatable_pages=>0, :heap_available_slots=>4338897, :heap_live_slots=>3315218, :heap_free_slots=>1023679, :heap_final_slots=>0, :heap_marked_slots=>3314982, :heap_eden_pages=>10645, :heap_tomb_pages=>0, :total_ allocated_pages=>10645, :total_freed_pages=>0, :total_allocated_objects=>42922679, :total_freed_objects=>39607461, :malloc_increase_bytes=>5792, :malloc_increase_bytes_limit=>128000000, :minor_gc_count=>48, :major_gc_count=>12, :remembered_wb_unprotected_objects=>77656, :remembered_wb_unp rotected_objects_limit=>155312, :old_objects=>3144841, :old_objects_limit=>6289682, :oldmalloc_increase_bytes=>6240, :oldmalloc_increase_bytes_limit=>128000000} 3314982 live, 1023915 free slots 979 MB RSS Stackprof results captured in: tmp/stackprof_4.2.10.github_20180419121610.prof
  33. Data I acquired (crap data) -100 925 1950 2975 4000

    Time 2017-01-23 15:53:51 2017-01-23 15:53:56 2017-01-23 15:54:01 2017-01-23 15:54:06 2017-01-23 15:54:11 2017-01-23 15:54:16 2017-01-23 15:54:21 2017-01-23 15:54:26 2017-01-23 15:54:31 2017-01-23 15:54:37 2017-01-23 15:54:42 2017-01-23 15:54:47 2017-01-23 15:54:52 2017-01-23 15:54:57 2017-01-23 15:55:02 2017-01-23 15:55:07 2017-01-23 15:55:12 2017-01-23 15:55:17 2017-01-23 15:55:22 2017-01-23 15:55:27 2017-01-23 15:55:32
  34. I made this Weight (g) -100 925 1950 2975 4000

    Time 2017-01-23 15:53:51 2017-01-23 15:53:56 2017-01-23 15:54:01 2017-01-23 15:54:06 2017-01-23 15:54:11 2017-01-23 15:54:16 2017-01-23 15:54:21 2017-01-23 15:54:26 2017-01-23 15:54:31 2017-01-23 15:54:37 2017-01-23 15:54:42 2017-01-23 15:54:47 2017-01-23 15:54:52 2017-01-23 15:54:57 2017-01-23 15:55:02 2017-01-23 15:55:07 2017-01-23 15:55:12 2017-01-23 15:55:17 2017-01-23 15:55:22 2017-01-23 15:55:27 2017-01-23 15:55:32 Cat Enters Cat Leaves