
SLOMO

No one wants to be stuck in the slow lane, especially Rubyists. In this talk we'll look at the slow process of writing fast code. We'll look at several real-world performance optimizations that may surprise you. We'll then rewind to see how these slow spots were found and fixed. Come to this talk and we will "C" how fast your Ruby can "Go".

Richard Schneeman

November 18, 2016

Transcript

  1. None
  2. None
  3. slomo the movie www. .com Special thanks

  4. Introducing Dr. John Kitchin The neurologist

  5. Introducing Dr. John Kitchin The inspiration

  6. Introducing Dr. John Kitchin The skater

  7. What is your goal in life?

  8. Grow up

  9. Pick a career

  10. Buy a sports car

  11. Retire

  12. The faster the better

  13. Slomo

  14. Said “no”

  15. Introducing Schneems The narrator

  16. Sounds like “schnapps”

  17. The narrator said, explaining his own introduction

  18. None
  19. None
  20. Maintain Sprockets

  21. Those are all my hobbies

  22. Full time job:

  23. Resisting the Trump Administration

  24. I consider myself a very chill person

  25. Like to mix my yoga habits with resistance

  26. “ohm”

  27. BTW

  28. In this talk we will not use “nil”

  29. Naan

  30. Sorry for my naan jokes

  32. This

  33. talk

  34. is

  35. about

  36. speed

  37. None
  38. The fastest way to get where you’re going

  39. Depends on where you’re going

  40. To get there faster

  41. Let’s slow down

  42. Why would we want slower code?

  43. Measure it

  44. task "assets:profile" do
        puts "=============="
        StackProf.run(mode: :wall, out: "tmp/stackprof.dump") do
          Rake::Task["assets:precompile"].invoke
        end
        puts "Running: $ stackprof tmp/stackprof.dump"
        puts `stackprof tmp/stackprof.dump`
      end
  47. $ stackprof tmp/stackprof.dump

  48. Running: $ stackprof tmp/stackprof.dump
      ==================================
        Mode: wall(1000)
        Samples: 2083 (62.23% miss rate)
        GC: 282 (13.54%)
      ==================================
           TOTAL    (pct)     SAMPLES    (pct)     FRAME
             313  (15.0%)         313  (15.0%)     Set#include?
             307  (14.7%)         243  (11.7%)     Sprockets::DigestUtils#digest
             743  (35.7%)         137   (6.6%)     Kernel#require
             476  (22.9%)         124   (6.0%)     Kernel#require
             157   (7.5%)          92   (4.4%)     SassC::Rails::Importer#imports
             243  (11.7%)          81   (3.9%)     #<Module:0x007fc76f34eb10>
              67   (3.2%)          67   (3.2%)     NumericWithFormat#to_s
             160   (7.7%)          53   (2.5%)     PathUtils#atomic_write
              44   (2.1%)          44   (2.1%)     #<Module:0x007fc76bdfb558>
             239  (11.5%)          37   (1.8%)     Sprockets::Cache::FileStore#set
  49. Callout on the profile output from slide 48: Class and method

  50. Callout on the profile output from slide 48: Total # of samples

  51. Callout on the profile output from slide 48: Percentage

  52. Callout on the profile output from slide 48: Total at TOP of stack

  53. Callout on the profile output from slide 48: Percent at TOP of stack

  55. 26% of total execution samples
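
The 26% figure appears to be the two hottest frames added together. A minimal sketch of that arithmetic, using sample counts copied from the profile output; the variable names are mine, not from the talk:

```ruby
# Sample counts copied from the stackprof output in the slides.
total_samples = 2083
top_of_stack = {
  "Set#include?"                  => 313,
  "Sprockets::DigestUtils#digest" => 243,
}

# Share of all samples in which each frame sat at the top of the stack.
shares = top_of_stack.transform_values do |count|
  (count * 100.0 / total_samples).round(1)
end

# Combined share of the two frames.
combined = (top_of_stack.values.sum * 100.0 / total_samples).round(1)

puts shares
puts "combined: #{combined}%"
```

The combined share rounds to a little under 27%, which lines up with the slide's "26%".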

  56. What is calling Set#include? so much?

  57. $ stackprof tmp/stackprof.dump --method Set#include?
      Set#include? (/Users/richardschneeman/.rubies/ruby-2.3.1/lib/ruby/2.3.0/set.rb:214)
        samples:   313 self (15.0%)  /    313 total (15.0%)
        callers:
           312  (   99.7%)  Sprockets::ProcessorUtils#valid_processor_metadata_value?
             1  (    0.3%)  Sprockets::Utils#dfs_paths
        code:
                                        |   214  |   def include?(o)
          313  (15.0%) /   313  (15.0%)  |   215  |     @hash[o]
                                        |   216  |   end
  59. What is calling valid_processor_metadata_value? so much?

  60. $ stackprof tmp/stackprof.dump --method Sprockets::ProcessorUtils#valid_processor_metadata_value?
      Sprockets::ProcessorUtils#valid_processor_metadata_value? (/Users/richardschneeman/.gem/ruby/2.3.1/bundler/gems/sprockets-3b0d6732c13f/lib/sprockets/processor_utils.rb:170)
        samples:    24 self (1.2%)  /   2129 total (102.2%)
        callers:
          1793  (   84.2%)  Sprockets::ProcessorUtils#valid_processor_metadata_value?
           336  (   15.8%)  Sprockets::ProcessorUtils#validate_processor_result!
        callees (2105 total):
          1793  (   85.2%)  Sprockets::ProcessorUtils#valid_processor_metadata_value?
           312  (   14.8%)  Set#include?
        code:
                                         |   170  |   def valid_processor_metadata_value?(value)
           261  (12.5%) /     2   (0.1%)  |   171  |     if VALID_METADATA_VALUE_TYPES.include?(value.class)
                                         |   172  |       true
            61   (2.9%) /     8   (0.4%)  |   173  |     elsif VALID_METADATA_COMPOUND_TYPES.include?(value.class)
          1806  (86.7%) /    13   (0.6%)  |   174  |       value.all? { |v| valid_processor_metadata_value?(v) }
                                         |   175  |     else
             1   (0.0%) /     1   (0.0%)  |   176  |       false
                                         |   177  |     end
  63. def valid_processor_metadata_value?(value)
        if VALID_METADATA_VALUE_TYPES.include?(value.class)
          true
        elsif VALID_METADATA_COMPOUND_TYPES.include?(value.class)
          value.all? { |v| valid_processor_metadata_value?(v) }
        else
          false
        end
      end
  64. Set#include?

  65. # File set.rb, line 214
      def include?(o)
        @hash[o]
      end

  66. Set is powered by a Hash under the hood

  67. Skip the Set, just use a Hash
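
A rough way to check this claim on your own machine is sketched below. It assumes an illustrative list of types (not the Sprockets constants) and uses only the standard library. Timings vary by Ruby version; on the Ruby 2.3 shown in the slides `Set#include?` is a Ruby-level wrapper around a Hash, and newer Rubies may behave differently.

```ruby
require "set"
require "benchmark"

# Illustrative type list; an assumption for this sketch.
TYPES = [String, Symbol, Integer, Float, TrueClass, FalseClass, NilClass].freeze

TYPES_SET  = Set.new(TYPES).freeze
TYPES_HASH = TYPES.each_with_object({}) { |type, hash| hash[type] = true }.freeze

values = ["a", :b, 1, 2.0, true, false, nil, [], {}]

# Both lookups must agree before their speed is worth comparing.
set_answers  = values.map { |v| TYPES_SET.include?(v.class) }
hash_answers = values.map { |v| TYPES_HASH[v.class] == true }

iterations = 200_000
set_time  = Benchmark.realtime { iterations.times { TYPES_SET.include?(String) } }
hash_time = Benchmark.realtime { iterations.times { TYPES_HASH[String] } }

puts format("Set#include?: %.4fs, Hash#[]: %.4fs", set_time, hash_time)
```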

  68. But why?

  69. Ruby has optimized instructions for hash calls

  70. code = "
        foo = Hash.new
        foo[:bar]
      "
      puts RubyVM::InstructionSequence.compile(code).disasm
      # 0000 trace 1 ( 2)
      # 0002 getinlinecache 9, <is:0>
      # 0005 getconstant :Hash
      # 0007 setinlinecache <is:0>
      # 0009 opt_send_without_block <callinfo!mid:new, argc:0, ARGS_SIMPLE>, <callcache>
      # 0012 setlocal_OP__WC__0 2
      # 0014 trace 1 ( 3)
      # 0016 getlocal_OP__WC__0 2
      # 0018 putobject :bar
      # 0020 opt_aref <callinfo!mid:[], argc:1, ARGS_SIMPLE>, <callcache>
      # 0023 leave
  71. Compare to Set

  72. code = "
        s = Set.new
        s.include?(:bar)
      "
      puts RubyVM::InstructionSequence.compile(code).disasm
      # 0000 trace 1 ( 2)
      # 0002 getinlinecache 9, <is:0>
      # 0005 getconstant :Set
      # 0007 setinlinecache <is:0>
      # 0009 opt_send_without_block <callinfo!mid:new, argc:0, ARGS_SIMPLE>, <callcache>
      # 0012 setlocal_OP__WC__0 2
      # 0014 trace 1 ( 3)
      # 0016 getlocal_OP__WC__0 2
      # 0018 putobject :bar
      # 0020 opt_send_without_block <callinfo!mid:include?, argc:1, ARGS_SIMPLE>, <callcache>
      # 0023 leave
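
The difference between the two listings comes down to a single instruction. The sketch below checks for it programmatically rather than by eye. `uses_instruction?` is a hypothetical helper name, and `RubyVM::InstructionSequence` is specific to MRI, so instruction names may differ across Ruby versions.

```ruby
# Returns true when the compiled bytecode for `source` mentions `instruction`.
# Compiling does not execute the code, so Set does not need to be loaded.
def uses_instruction?(source, instruction)
  RubyVM::InstructionSequence.compile(source).disasm.include?(instruction)
end

hash_source = "foo = Hash.new\nfoo[:bar]\n"
set_source  = "s = Set.new\ns.include?(:bar)\n"

hash_uses_opt_aref = uses_instruction?(hash_source, "opt_aref")
set_uses_opt_aref  = uses_instruction?(set_source, "opt_aref")

puts "Hash#[] compiles to opt_aref: #{hash_uses_opt_aref}"
puts "Set#include? compiles to opt_aref: #{set_uses_opt_aref}"
```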
  73. None
  74. insns.def

  75. /**
        @c optimize
        @e []
        @j 最適化された recv[obj]。
       */
      DEFINE_INSN
      opt_aref
      (CALL_INFO ci, CALL_CACHE cc)
      (VALUE recv, VALUE obj)
      (VALUE val)
      {
          if (!SPECIAL_CONST_P(recv)) {
              if (RBASIC_CLASS(recv) == rb_cArray && BASIC_OP_UNREDEFINED_P(BOP_AREF, ARRAY_REDEFINED_OP_FLAG) && FIXNUM_P(obj)) {
                  val = rb_ary_entry(recv, FIX2LONG(obj));
              }
              else if (RBASIC_CLASS(recv) == rb_cHash && BASIC_OP_UNREDEFINED_P(BOP_AREF, HASH_REDEFINED_OP_FLAG)) {
                  val = rb_hash_aref(recv, obj);
              }
              else {
                  goto INSN_LABEL(normal_dispatch);
              }
          }
          else {
            INSN_LABEL(normal_dispatch):
              PUSH(recv);
              PUSH(obj);
              CALL_SIMPLE_METHOD(recv);
          }
      }
  77. Versus

  78. /**
        @c optimize
        @e Invoke method without block
        @j Invoke method without block
       */
      DEFINE_INSN
      opt_send_without_block
      (CALL_INFO ci, CALL_CACHE cc)
      (...)
      (VALUE val) // inc += -ci->orig_argc;
      {
          struct rb_calling_info calling;
          calling.blockptr = NULL;
          vm_search_method(ci, cc, calling.recv = TOPN(calling.argc = ci->orig_argc));
          CALL_METHOD(&calling, ci, cc);
      }
  80. Ruby optimizes Hash calls by Skipping Method Lookup

  81. BTW

  82. Don’t subclass Hash

  83. /**
        @c optimize
        @e []
        @j 最適化された recv[obj]。
       */
      DEFINE_INSN
      opt_aref
      (CALL_INFO ci, CALL_CACHE cc)
      (VALUE recv, VALUE obj)
      (VALUE val)
      {
          if (!SPECIAL_CONST_P(recv)) {
              if (RBASIC_CLASS(recv) == rb_cArray && BASIC_OP_UNREDEFINED_P(BOP_AREF, ARRAY_REDEFINED_OP_FLAG) && FIXNUM_P(obj)) {
                  val = rb_ary_entry(recv, FIX2LONG(obj));
              }
              else if (RBASIC_CLASS(recv) == rb_cHash && BASIC_OP_UNREDEFINED_P(BOP_AREF, HASH_REDEFINED_OP_FLAG)) {
                  val = rb_hash_aref(recv, obj);
              }
              else {
                  goto INSN_LABEL(normal_dispatch);
              }
          }
          else {
            INSN_LABEL(normal_dispatch):
              PUSH(recv);
              PUSH(obj);
              CALL_SIMPLE_METHOD(recv);
          }
      }
      You lose speed
  84. “I don’t subclass hash”

  85. Hash With Indifferent Access

  86. Hashie

  87. Rack::Utils::HeaderHash

  88. None
  89. Rack is 23% faster without HeaderHash

  90. Don’t subclass Hash
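
The reason follows from the `RBASIC_CLASS(recv) == rb_cHash` check in `opt_aref`: the fast path tests the receiver's exact class, not its ancestry. A small sketch of that distinction in plain Ruby, using a made-up subclass name:

```ruby
# Hypothetical subclass, standing in for the Hash subclasses named in the slides.
class IndifferentishHash < Hash
end

plain    = { a: 1 }
subclass = IndifferentishHash.new
subclass[:a] = 1

# Exact class comparison, mirroring the check in the C code.
plain_is_exactly_hash    = plain.class.equal?(Hash)
subclass_is_exactly_hash = subclass.class.equal?(Hash)

puts "plain Hash passes exact-class check: #{plain_is_exactly_hash}"
puts "subclass passes exact-class check:   #{subclass_is_exactly_hash}"
puts "subclass is still a Hash by ancestry: #{subclass.is_a?(Hash)}"
```

The subclass still behaves like a Hash, but its `[]` calls go through normal method dispatch.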

  91. None
  92. Switch Set to a hash

  93. VALID_METADATA_VALUE_TYPES_HASH = VALID_METADATA_VALUE_TYPES.each_with_object({}) do |type, hash|
        hash[type] = true
      end.freeze
      def valid_processor_metadata_value?(value)
        if VALID_METADATA_VALUE_TYPES_HASH[value.class]
          true
        elsif VALID_METADATA_COMPOUND_TYPES_HASH[value.class]
          value.all? { |v| valid_processor_metadata_value?(v) }
        else
          false
        end
      end
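
For a version that runs on its own, here is a sketch that fills in the parts the slide leaves out. The two type lists are assumptions for illustration, not the actual Sprockets constants, and `lookup_table` is a hypothetical helper.

```ruby
require "set"

# Assumed type lists; the real Sprockets constants may differ.
VALID_METADATA_VALUE_TYPES    = Set.new([String, Symbol, TrueClass, FalseClass, NilClass, Integer]).freeze
VALID_METADATA_COMPOUND_TYPES = Set.new([Array, Hash, Set]).freeze

# Turns a collection of classes into a frozen { klass => true } lookup table.
def lookup_table(types)
  types.each_with_object({}) { |type, hash| hash[type] = true }.freeze
end

VALID_METADATA_VALUE_TYPES_HASH    = lookup_table(VALID_METADATA_VALUE_TYPES)
VALID_METADATA_COMPOUND_TYPES_HASH = lookup_table(VALID_METADATA_COMPOUND_TYPES)

def valid_processor_metadata_value?(value)
  if VALID_METADATA_VALUE_TYPES_HASH[value.class]
    true
  elsif VALID_METADATA_COMPOUND_TYPES_HASH[value.class]
    # Compound values are valid only if every element is valid.
    value.all? { |v| valid_processor_metadata_value?(v) }
  else
    false
  end
end

puts valid_processor_metadata_value?(["a", :b, [1, nil]])
puts valid_processor_metadata_value?([Object.new])
```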
  94. Skips Method Lookup

  95. Did it help?

  96. 18.325s → 17.981s

  97. 1.8% Faster!

  98. Bugger

  99. None
  100. Keep gliding

  101. Round 2

  102.    TOTAL    (pct)     SAMPLES    (pct)     FRAME
           2328 (109.6%)         362  (17.0%)     Sprockets::ProcessorUtils#valid_processor_metadata_value?
            348  (16.4%)         256  (12.1%)     Sprockets::DigestUtils#digest
            486  (22.9%)         106   (5.0%)     Kernel#require
             97   (4.6%)          97   (4.6%)     ActiveSupport::NumericWithFormat#to_s
            123   (5.8%)          94   (4.4%)     Sprockets::PathUtils#atomic_write
            581  (27.4%)          76   (3.6%)     Kernel#require
             61   (2.9%)          61   (2.9%)     #<Module:0x007fb0a6027728>.mechanism
            193   (9.1%)          52   (2.4%)     Sprockets::Cache::FileStore#set
             95   (4.5%)          48   (2.3%)     SassC::Rails::Importer#imports
             36   (1.7%)          36   (1.7%)     ExecJS::ExternalRuntime#exec_runtime
             59   (2.8%)          25   (1.2%)     Kernel#require
             75   (3.5%)          25   (1.2%)     Module#delegate
  103. $ stackprof tmp/stackprof.dump --method Sprockets::DigestUtils#digest
       # . . .
       Sprockets::DigestUtils#digest (lib/sprockets/digest_utils.rb:46)
         samples:     4 self (0.2%)  /      7 total (0.3%)
         callers:
              5  (   71.4%)  Sprockets::Cache#expand_key
              2  (   28.6%)  Sprockets::Loader#load_from_unloaded
         callees (3 total):
              3  (  100.0%)  Sprockets::DigestUtils#digest_class
         code:
                                         |    46  |   def digest(obj)
             4   (0.2%) /     1   (0.0%)  |    47  |     digest = digest_class.new
                                         |    48  |     queue = [obj]
                                         |    49  |
                                         |    50  |     while queue.length > 0
                                         |    51  |       obj = queue.shift
                                         |    52  |       klass = obj.class
                                         |    53  |
             2   (0.1%) /     2   (0.1%)  |    54  |       if klass == String
                                         |    55  |         digest << obj
                                         |    56  |       elsif klass == Symbol
                                         |    57  |         digest << 'Symbol'
                                         |    58  |         digest << obj.to_s
                                         |    59  |       elsif klass == Fixnum
  105. def digest(obj)
         digest = digest_class.new
         queue = [obj]
         while queue.length > 0
           obj = queue.shift
           klass = obj.class
           if klass == String
             digest << obj
           elsif klass == Symbol
             digest << 'Symbol'
             digest << obj.to_s
           elsif klass == Fixnum
             digest << 'Fixnum'
             digest << obj.to_s
           elsif klass == Bignum
             digest << 'Bignum'
             digest << obj.to_s
           elsif klass == TrueClass
             digest << 'TrueClass'
           elsif klass == FalseClass
             digest << 'FalseClass'
  106. Let me paraphrase

  107. if String

  108. if String elsif Symbol

  109. if String elsif Symbol elsif Fixnum

  110. if String elsif Symbol elsif Fixnum elsif Bignum

  111. if String elsif Symbol elsif Fixnum elsif Bignum elsif TrueClass

  112. if String elsif Symbol elsif Fixnum elsif Bignum elsif TrueClass elsif FalseClass

  113. if String elsif Symbol elsif Fixnum elsif Bignum elsif TrueClass elsif FalseClass elsif NilClass

  114. if String elsif Symbol elsif Fixnum elsif Bignum elsif TrueClass elsif FalseClass elsif NilClass elsif Array

  115. if String elsif Symbol elsif Fixnum elsif Bignum elsif TrueClass elsif FalseClass elsif NilClass elsif Array elsif Hash

  116. if String elsif Symbol elsif Fixnum elsif Bignum elsif TrueClass elsif FalseClass elsif NilClass elsif Array elsif Hash elsif Set

  117. if String elsif Symbol elsif Fixnum elsif Bignum elsif TrueClass elsif FalseClass elsif NilClass elsif Array elsif Hash elsif Set elsif Encoding

  118. If we pass in a Set object, we must make 10 comparisons

  119. if String elsif Symbol elsif Fixnum elsif Bignum elsif TrueClass elsif FalseClass elsif NilClass elsif Array elsif Hash elsif Set elsif Encoding
       Expand and Iterate

  120. if/elsif is hidden iteration
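
To make the "hidden iteration" concrete, the sketch below models the chain as an ordered list and counts how many `klass == X` checks run before a branch matches. The list holds class names as strings because `Fixnum` and `Bignum` no longer exist as constants in current Ruby; `comparisons_for` is a hypothetical helper.

```ruby
# Class names in the order the if/elsif chain tests them.
CHAIN = %w[
  String Symbol Fixnum Bignum TrueClass FalseClass
  NilClass Array Hash Set Encoding
].freeze

# Number of comparisons evaluated before a branch matches. A class that is
# not in the chain is compared against every entry before reaching `else`.
def comparisons_for(class_name)
  position = CHAIN.index(class_name)
  position ? position + 1 : CHAIN.length
end

puts "String: #{comparisons_for('String')} comparison(s)"
puts "Set:    #{comparisons_for('Set')} comparison(s)"
```

Types near the top of the chain are cheap and types near the bottom pay for every branch above them.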

  121. How do we go faster?

  122. Get rid of iteration

  123. Case statements

  124. or

  125. Hash lookups

  126. Before

  127. def digest(obj)
         digest = digest_class.new
         queue = [obj]
         while queue.length > 0
           obj = queue.shift
           klass = obj.class
           if klass == String
             digest << obj
           elsif klass == Symbol
             digest << 'Symbol'
             digest << obj.to_s
           elsif klass == Fixnum
             digest << 'Fixnum'
             digest << obj.to_s
           elsif klass == Bignum
             digest << 'Bignum'
             digest << obj.to_s
           elsif klass == TrueClass
             digest << 'TrueClass'
           elsif klass == FalseClass
             digest << 'FalseClass'
           elsif klass == NilClass
             digest << 'NilClass'.freeze
           elsif klass == Array
             digest << 'Array'
             queue.concat(obj)
           elsif klass == Hash
             digest << 'Hash'
             queue.concat(obj.sort)
           elsif klass == Set
             digest << 'Set'
             queue.concat(obj.to_a)
           elsif klass == Encoding
             digest << 'Encoding'
             digest << obj.name
           else
             raise TypeError, "couldn't digest #{klass}"
           end
         end
         digest.digest
       end
  128. After

  129. def digest(obj)
         digest = digest_class.new
         ADD_VALUE_TO_DIGEST[obj.class].call(obj, digest)
         digest.digest
       end

  130. Store logic in a Hash

  131. def digest(obj)
         digest = digest_class.new
         ADD_VALUE_TO_DIGEST[obj.class].call(obj, digest)
         digest.digest
       end
       Constant time lookup
  132. Logic Lives in Lambdas

  133. ADD_VALUE_TO_DIGEST = {
         String     => ->(val, digest) { digest << val },
         FalseClass => ->(val, digest) { digest << 'FalseClass'.freeze },
         TrueClass  => ->(val, digest) { digest << 'TrueClass'.freeze },
         NilClass   => ->(val, digest) { digest << 'NilClass'.freeze },
         Symbol     => ->(val, digest) {
           digest << 'Symbol'.freeze
           digest << val.to_s
         },
         Integer    => ->(val, digest) {
           digest << 'Integer'.freeze
           digest << val.to_s
         },
         Array      => ->(val, digest) {
           digest << 'Array'.freeze
           val.each do |element|
             ADD_VALUE_TO_DIGEST[element.class].call(element, digest)
           end
         },
  137. zOMG WAT

  138. Recursive hash is Recursive
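
A reduced, runnable sketch of the same idea follows. It covers only four types, uses `Digest::SHA256` in place of `digest_class`, and raises `TypeError` for unknown classes through a Hash default block; all three choices are assumptions of this sketch rather than what Sprockets does.

```ruby
require "digest"

# Unknown classes fall through to the default block and raise.
ADD_VALUE_TO_DIGEST = Hash.new do |_hash, klass|
  raise TypeError, "couldn't digest #{klass}"
end

ADD_VALUE_TO_DIGEST.merge!(
  String  => ->(val, digest) { digest << val },
  Symbol  => ->(val, digest) { digest << "Symbol" << val.to_s },
  Integer => ->(val, digest) { digest << "Integer" << val.to_s },
  Array   => ->(val, digest) {
    digest << "Array"
    # The lambda looks itself up through the constant, so nesting recurses.
    val.each { |element| ADD_VALUE_TO_DIGEST[element.class].call(element, digest) }
  }
)

def digest(obj)
  digest = Digest::SHA256.new
  ADD_VALUE_TO_DIGEST[obj.class].call(obj, digest)
  digest.digest
end

puts digest(["a", :b, [1, 2]]).unpack1("H*")
```

The Array lambda can refer to `ADD_VALUE_TO_DIGEST` inside its own definition because the constant is only resolved when the lambda is called.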

  139. Faster?

  140. 14.9s → 12.54s

  141. 16% faster asset compilation with no cache
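
The percentage can be reproduced from the two timings on the previous slide. A minimal sketch, where `percent_faster` is a hypothetical helper that measures time saved relative to the original run:

```ruby
# Percentage of wall-clock time saved, relative to the original run.
def percent_faster(before_seconds, after_seconds)
  ((before_seconds.to_f - after_seconds) / before_seconds * 100).round(1)
end

saved = percent_faster(14.9, 12.54)
puts "#{saved}% less time"
```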

  142. Repeat for all supported types

  143. Who likes audience participation?

  144. Can we go faster?

  145. ADD_VALUE_TO_DIGEST[String]
       ADD_VALUE_TO_DIGEST[Hash]
       ADD_VALUE_TO_DIGEST[true]
       ADD_VALUE_TO_DIGEST[false]
       ADD_VALUE_TO_DIGEST[nil]
       ADD_VALUE_TO_DIGEST[Array]
       ADD_VALUE_TO_DIGEST[Set]
       ADD_VALUE_TO_DIGEST[Integer]

  146. String Hash true false nil Array Set Integer All Constants

  147. Hash#compare_by_identity

  148. Compare object, not value

  149. String Hash true false nil Array Set Integer All Constants
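
A short sketch of what `compare_by_identity` changes. With class constants as keys the lookup result is the same either way, because a constant always refers to one object. Two equal but distinct strings show the difference. The variable names are illustrative only.

```ruby
# Class constants are single objects, so identity lookup finds them.
by_identity = { String => :string, Array => :array }.compare_by_identity

# Two strings with equal contents but different object identities.
key  = String.new("name")
twin = String.new("name")

regular = {}
regular[key] = 1

identity = {}.compare_by_identity
identity[key] = 1

puts by_identity[String].inspect  # found: same constant, same object
puts regular[twin].inspect        # found: regular hashes compare by value
puts identity[key].inspect        # found: the very same object
puts identity[twin].inspect       # missing: equal value, different object
```

Identity comparison skips calling `hash` and `eql?` on the key, which is presumably where the speedup on the next slides comes from.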

  150. None
  151. 7% speed improvement

  152. Let’s slow it down

  153. What did we do?

  154. Identify a goal

  155. None
  156. None
  157. Gather potentially useful tools

  158. None
  159. None
  160. None
  161. None
  162. Iterate on the problem

  163. None
  164. None
  165. None
  166. None
  167. RubyVM::InstructionSequence.compile(code).disasm

  168. Shoulders of giants

  169. None
  170. The more you understand about _how_ your code works, the faster you can make it
  171. Find a speed buddy

  172. (or buddies)

  173. “C” how fast your program will go

  174. gem 'sassc-rails'

  175. Sassc uses libsass

  176. 31% faster

  177. Sprockets 4 will support sassc out of the box (but not yet)
  178. Don’t abandon your language for speed

  179. Find the slow parts and tag team them with other languages
  180. C-extensions

  181. Use Rust with helix

  182. Use Go with gorb

  183. p.s. have you tried JRuby?

  184. What about cached compile times?

  185. None
  186. Bootscale caches require lookups

  187. Without it, the more gems on your system the LONGER `require` takes
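
A simplified model of why that happens is sketched below. It is not Bootscale's implementation and not how `require` is written internally; it only imitates a linear search through load-path directories and compares it with a precomputed name-to-path Hash. `scan_load_path` and the fake `gem_N/lib` directories are made up for the example.

```ruby
require "tmpdir"
require "fileutils"

# Probes each directory in order until the file turns up.
# Returns the path found (or nil) and the number of directories probed.
def scan_load_path(load_path, feature)
  probes = 0
  load_path.each do |dir|
    probes += 1
    candidate = File.join(dir, "#{feature}.rb")
    return [candidate, probes] if File.file?(candidate)
  end
  [nil, probes]
end

root = Dir.mktmpdir
begin
  # Fifty fake gem directories; only the last one contains the file.
  load_path = Array.new(50) { |i| File.join(root, "gem_#{i}", "lib") }
  load_path.each { |dir| FileUtils.mkdir_p(dir) }
  File.write(File.join(load_path.last, "needle.rb"), "")

  path, probes = scan_load_path(load_path, "needle")
  puts "uncached: #{probes} directory probes"

  # A cache maps the feature name straight to its path: one Hash lookup.
  cache = { "needle" => path }
  puts "cached: #{File.basename(cache.fetch('needle'))} found without scanning"
ensure
  FileUtils.remove_entry(root)
end
```

In this model the number of probes grows with the number of directories, while the cached lookup stays constant.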
  188. 6s cached compile times → 2s cached compile times

  189. 3x faster

  190. None
  191. Don’t be an asshole

  192. Don’t apologize for the programming language you love

  193. Set your own goals

  194. Go at your own speed

  195. These are the good days of our programming life

  196. Live them to the fullest

  197. None
  198. None
  199. None
  200. None