Let's Subclass Hash - What's the Worst That Could Happen?

Have you ever been tempted to subclass a core class like Hash or String? Or have you read blog posts about why you shouldn't do that, but been left confused as to the specifics? As a maintainer of Hashie, a gem that commits this exact sin, I'm here to tell you why you want to reach for other tools instead of the subclass.

In this talk, you'll hear stories from the trenches about what can go wrong when you subclass core classes. We'll dig into Ruby internals and you will leave with a few new tools for tracking down seemingly inexplicable performance issues and bugs in your applications.

Michael Herold

November 14, 2018

Transcript

  1. None
  2. None
  3. None
  4. None
  5. Let’s Subclass Hash What’s the worst that could happen?

  6. My name is Michael Herold. Please tweet me at @mherold

    or say hello@michaeljherold.com.
  7. None
  8. @mherold

  9. This talk is about a little gem called … @mherold

  10. This talk is about a little gem called … Hashie

    @mherold
  11. “Hashie is a collection of classes and mixins that make

    hashes more powerful.” @mherold
  12. “Hashie is a collection of classes and mixins that make

    hashes more powerful.” @mherold
  13. @mherold

  14. @mherold @mherold

  15. @mherold @mherold

  16. @mherold

  17. –Uncle Ben “With great power comes great responsibility.” @mherold

  18. –Alexander Pope “To err is human …” @mherold

  19. @mherold

  20. 1. Indifferent Access 2. Mash keys 3. Destructuring a Dash

    @mherold
  21. 1. Indifferent Access @mherold

  22. None
  23. class MyHash < Hash end @mherold

  24. class MyHash < Hash include Hashie::Extensions::MergeInitializer end @mherold

  25. Merge Initializer @mherold hash = MyHash.new( cat: 'meow', dog: {

    name: 'Rover', sound: 'woof' } )
  26. Merge Initializer @mherold hash = MyHash.new( cat: 'meow', dog: {

    name: 'Rover', sound: 'woof' } ) hash[:cat] #=> "meow"
  27. Merge Initializer @mherold hash = MyHash.new( cat: 'meow', dog: {

    name: 'Rover', sound: 'woof' } ) hash[:cat] #=> "meow" hash[:dog] #=> {:name=>"Rover", :sound=>"woof"}
  28. class MyHash < Hash include Hashie::Extensions::MergeInitializer end @mherold

  29. class MyHash < Hash include Hashie::Extensions::MergeInitializer include Hashie::Extensions::IndifferentAccess end @mherold

  30. Indifferent Access @mherold hash = MyHash.new( cat: 'meow', 'dog' =>

    { name: 'Rover', sound: 'woof' } )
  31. Indifferent Access @mherold hash = MyHash.new( cat: 'meow', 'dog' =>

    { name: 'Rover', sound: 'woof' } ) hash['cat'] == hash[:cat] #=> true
  32. Indifferent Access @mherold hash = MyHash.new( cat: 'meow', 'dog' =>

    { name: 'Rover', sound: 'woof' } ) hash['cat'] == hash[:cat] #=> true hash['dog'] == hash[:dog] #=> true
  33. class MyHash < Hash include Hashie::Extensions::MergeInitializer include Hashie::Extensions::IndifferentAccess end @mherold

  34. class MyHash < Hash include Hashie::Extensions::MergeInitializer include Hashie::Extensions::IndifferentAccess end hash

    = MyHash.new( cat: 'meow', 'dog' => { name: 'Mango', sound: 'woof' } ) @mherold
  35. class MyHash < Hash include Hashie::Extensions::MergeInitializer include Hashie::Extensions::IndifferentAccess end hash

    = MyHash.new( cat: 'meow', 'dog' => { name: 'Mango', sound: 'woof' } ) new_dog = hash[:dog].merge(breed: 'Blue Heeler') #=> NoMethodError: undefined method `convert!' @mherold
  36. @mherold

  37. module Hashie::Extensions::IndifferentAccess def merge(*) super.convert! end end @mherold

  38. module Hashie::Extensions::IndifferentAccess def merge(*) super.convert! end def convert! # ...

    end end @mherold
  39. What is happening? @mherold

  40. hash = MyHash.new( cat: 'meow', 'dog' => { name: 'Mango',

    sound: 'woof' } ) @mherold
  41. hash = MyHash.new( cat: 'meow', 'dog' => { name: 'Mango',

    sound: 'woof' } ) hash.respond_to?(:convert!) #=> true @mherold
  42. hash = MyHash.new( cat: 'meow', 'dog' => { name: 'Mango',

    sound: 'woof' } ) hash.respond_to?(:convert!) #=> true hash[:dog].respond_to?(:convert!) #=> true @mherold
  43. We need to go deeper. @mherold

  44. Pry + Byebug = @mherold

  45. module Hashie::Extensions::IndifferentAccess def merge(*) super.convert! end end @mherold

  46. module Hashie::Extensions::IndifferentAccess def merge(*) super.tap { |result| binding.pry }.convert! end

    end @mherold
  47. module Hashie::Extensions::IndifferentAccess def merge(*) super.tap { |result| binding.pry }.convert! end

    end hash.merge(breed: 'Blue Heeler') @mherold
  48. module Hashie::Extensions::IndifferentAccess def merge(*) super.tap { |result| binding.pry }.convert! end

    end hash.merge(breed: 'Blue Heeler') 134: def merge(*args) => 135: super.tap { |result| binding.pry }.convert! 136: end [1] pry(#<Pry::Config>)> @mherold
  49. @mherold

  50. self.class #=> Hash @mherold

  51. self.class #=> Hash result.class #=> Hash @mherold

  52. self.class #=> Hash result.class #=> Hash respond_to?(:convert!) #=> true @mherold

  53. self.class #=> Hash result.class #=> Hash respond_to?(:convert!) #=> true result.respond_to?(:convert!)

    #=> false @mherold
  54. self.class #=> Hash result.class #=> Hash respond_to?(:convert!) #=> true result.respond_to?(:convert!)

    #=> false singleton_class.ancestors #=> […, Hashie::Extensions::IndifferentAccess, …] @mherold
  55. self.class #=> Hash result.class #=> Hash respond_to?(:convert!) #=> true result.respond_to?(:convert!)

    #=> false singleton_class.ancestors #=> […, Hashie::Extensions::IndifferentAccess, …] result.singleton_class.ancestors #=> No indifferent access @mherold
  56. module Hashie::Extensions::IndifferentAccess def merge(*) super.convert! end end @mherold

  57. module Hashie::Extensions::IndifferentAccess def merge(*) - super.convert! end end @mherold

  58. module Hashie::Extensions::IndifferentAccess def merge(*) - super.convert! + result = super

    + IndifferentAccess.inject!(result) + result.convert! end end @mherold
  59. hash = MyHash.new( cat: 'meow', 'dog' => { name: 'Mango',

    sound: 'woof' } ) @mherold
  60. hash = MyHash.new( cat: 'meow', 'dog' => { name: 'Mango',

    sound: 'woof' } ) new_dog = hash[:dog].merge(breed: 'Blue Heeler') #=> {"name"=>"Mango", "sound"=>"woof", "breed"=>"Blue Heeler"} @mherold
  61. Why was this a problem? @mherold

  62. Hash has 178 public methods @mherold

  63. None
  64. None
  65. 2. Mash keys @mherold

  66. None
  67. Hashie is almost synonymous with Mash @mherold

  68. None
  69. None
  70. Mash @mherold mash = Hashie::Mash.new mash.name? # => false mash.name

    # => nil mash.name = "My Mash" mash.name # => "My Mash" mash.name? # => true mash.inspect # => <Hashie::Mash name="My Mash">
  71. Mash @mherold def method_missing(method_name, *args, &blk) return self.[](method_name, &blk) if

    key?(method_name) end
  72. Mash @mherold def method_missing(method_name, *args, &blk) return self.[](method_name, &blk) if

    key?(method_name) name, suffix = method_name_and_suffix(method_name) end
  73. Mash @mherold def method_missing(method_name, *args, &blk) return self.[](method_name, &blk) if

    key?(method_name) name, suffix = method_name_and_suffix(method_name) case suffix when '='.freeze then assign_property(name, args.first) end end
  74. Mash @mherold def method_missing(method_name, *args, &blk) return self.[](method_name, &blk) if

    key?(method_name) name, suffix = method_name_and_suffix(method_name) case suffix when '='.freeze then assign_property(name, args.first) when '?'.freeze then !!self[name] end end
  75. Mash @mherold def method_missing(method_name, *args, &blk) return self.[](method_name, &blk) if

    key?(method_name) name, suffix = method_name_and_suffix(method_name) case suffix when '='.freeze then assign_property(name, args.first) when '?'.freeze then !!self[name] when '!'.freeze then initializing_reader(name) end end
  76. Mash @mherold def method_missing(method_name, *args, &blk) return self.[](method_name, &blk) if

    key?(method_name) name, suffix = method_name_and_suffix(method_name) case suffix when '='.freeze then assign_property(name, args.first) when '?'.freeze then !!self[name] when '!'.freeze then initializing_reader(name) when '_'.freeze then underbang_reader(name) end end
  77. Mash @mherold def method_missing(method_name, *args, &blk) return self.[](method_name, &blk) if

    key?(method_name) name, suffix = method_name_and_suffix(method_name) case suffix when '='.freeze then assign_property(name, args.first) when '?'.freeze then !!self[name] when '!'.freeze then initializing_reader(name) when '_'.freeze then underbang_reader(name) else self[method_name] end end
  78. The README used to say “use it for JSON responses”

    … @mherold
  79. … so that’s what people do. @mherold

  80. response = HTTP.get("http://myawesomeapi.com") @mherold

  81. response = HTTP.get("http://myawesomeapi.com") json = JSON.parse(response.body) @mherold

  82. response = HTTP.get("http://myawesomeapi.com") json = JSON.parse(response.body) mash = Hashie::Mash.new(json) @mherold

  83. But remember: a Mash is a Hash @mherold

  84. Hash has 178 public methods @mherold

  85. Would any of these conflict? @mherold class count hash length

    trust zip
  86. mash = Hashie::Mash.new( name: 'Millennium Biltmore', zip: '90071' ) @mherold

  87. mash = Hashie::Mash.new( name: 'Millennium Biltmore', zip: '90071' ) mash.zip

    #=> [[["name", "Millennium Biltmore"]], [["zip", "90071"]]] @mherold
  88. Enumerable#zip @mherold

  89. The method is not missing @mherold

  90. … so it behaves unexpectedly. @mherold

  91. What should we do? @mherold

  92. None
  93. Hashie::Extensions::MethodAccessWithOverride @mherold class MyMash < Hashie::Mash end

  94. Hashie::Extensions::MethodAccessWithOverride @mherold class MyMash < Hashie::Mash include Hashie::Extensions::MethodAccessWithOverride end

  95. Hashie::Extensions::MethodAccessWithOverride @mherold class MyMash < Hashie::Mash include Hashie::Extensions::MethodAccessWithOverride end mash

    = MyMash.new
  96. Hashie::Extensions::MethodAccessWithOverride @mherold class MyMash < Hashie::Mash include Hashie::Extensions::MethodAccessWithOverride end mash

    = MyMash.new mash.awesome = 'sauce'
  97. Hashie::Extensions::MethodAccessWithOverride @mherold class MyMash < Hashie::Mash include Hashie::Extensions::MethodAccessWithOverride end mash

    = MyMash.new mash.awesome = 'sauce' mash['awesome'] #=> 'sauce'
  98. Hashie::Extensions::MethodAccessWithOverride @mherold class MyMash < Hashie::Mash include Hashie::Extensions::MethodAccessWithOverride end mash

    = MyMash.new mash.awesome = 'sauce' mash['awesome'] #=> 'sauce' mash.zip = 'a-dee-doo-dah'
  99. Hashie::Extensions::MethodAccessWithOverride @mherold class MyMash < Hashie::Mash include Hashie::Extensions::MethodAccessWithOverride end mash

    = MyMash.new mash.awesome = 'sauce' mash['awesome'] #=> 'sauce' mash.zip = 'a-dee-doo-dah' mash.zip #=> 'a-dee-doo-dah'
  100. Hashie::Extensions::MethodAccessWithOverride @mherold class MyMash < Hashie::Mash include Hashie::Extensions::MethodAccessWithOverride end mash

    = MyMash.new mash.awesome = 'sauce' mash['awesome'] #=> 'sauce' mash.zip = 'a-dee-doo-dah' mash.zip #=> 'a-dee-doo-dah' mash.__zip #=> [[['awesome', 'sauce']], [['zip', 'a-dee-doo-dah']]]
  101. 3. Destructuring a Dash @mherold

  102. None
  103. None
  104. None
  105. ruby = { name: 'Ruby 2.5', release_date: 'Christmas' } @mherold

  106. ruby = { name: 'Ruby 2.5', release_date: 'Christmas' } {

    **ruby, name: 'Ruby 2.6' } #=> {:name=>"Ruby 2.6", :release_date=>"Christmas"} @mherold
  107. Dash @mherold class PersonHash < Hashie::Dash end

  108. Dash @mherold class PersonHash < Hashie::Dash property :name property :nickname

    end
  109. Dash @mherold class PersonHash < Hashie::Dash property :name property :nickname

    end PersonHash.new(foo: 'bar') #=> NoMethodError: The property 'foo' is not defined
  110. @mherold

  111. @mherold sam = PersonHash.new(name: 'Samwise', nickname: 'Sam')

  112. @mherold sam = PersonHash.new(name: 'Samwise', nickname: 'Sam') result = {

    **sam, height: '1.66m' } #=> {:name=>"Samwise", :nickname=>"Sam", :height=>"1.66m"}
  113. @mherold sam = PersonHash.new(name: 'Samwise', nickname: 'Sam') result = {

    **sam, height: '1.66m' } #=> {:name=>"Samwise", :nickname=>"Sam", :height=>"1.66m"} result[:height] #=> NoMethodError: The property 'height' is not defined
  114. @mherold sam = PersonHash.new(name: 'Samwise', nickname: 'Sam') result = {

    **sam, height: '1.66m' } #=> {:name=>"Samwise", :nickname=>"Sam", :height=>"1.66m"} result[:height] #=> NoMethodError: The property 'height' is not defined { height: '1.66m', **sam }[:height] #=> "1.66m"
  115. @mherold sam = PersonHash.new(name: 'Samwise', nickname: 'Sam') result = {

    **sam, height: '1.66m' } #=> {:name=>"Samwise", :nickname=>"Sam", :height=>"1.66m"} result[:height] #=> NoMethodError: The property 'height' is not defined { height: '1.66m', **sam }[:height] #=> "1.66m" { **sam.to_h, height: '1.66m' }[:height] #=> "1.66m"
  116. Why? @mherold

  117. What happens when we double-splat? @mherold

  118. class Test def to_hash { foo: 'bar' } end end

    @mherold
  119. class Test def to_hash { foo: 'bar' } end end

    { **Test.new, baz: 'quux' } #=> {:foo=>"bar", :baz=>"quux"} @mherold
  120. None
  121. What happens when we double-splat inside a Hash literal? @mherold

  122. @mherold { **sam, height: '1.66m' }

  123. @mherold "{ **sam, height: '1.66m' }"

  124. @mherold RubyVM::InstructionSequence.compile( "{ **sam, height: '1.66m' }" )

  125. @mherold RubyVM::InstructionSequence.compile( "{ **sam, height: '1.66m' }" ).disasm

  126. @mherold puts RubyVM::InstructionSequence.compile( "{ **sam, height: '1.66m' }" ).disasm

  127. @mherold puts RubyVM::InstructionSequence.compile( "{ **sam, height: '1.66m' }" ).disasm ==

    disasm: #<ISeq:<compiled>@<compiled>:1 (1,0)-(1,26)>================= 0000 putspecialobject 1 ( 1)[Li] 0002 putself 0003 opt_send_without_block <callinfo!mid:sam, argc:0, FCALL|VCALL|ARGS_SIMPLE>, <callcache> 0006 opt_send_without_block <callinfo!mid:core#hash_merge_kwd, argc:1, ARGS_SIMPLE>, <callcache> 0009 opt_send_without_block <callinfo!mid:dup, argc:0, ARGS_SIMPLE>, <callcache> 0012 putspecialobject 1 0014 swap 0015 putobject :height 0017 putstring "1.66m" 0019 opt_send_without_block <callinfo!mid:core#hash_merge_ptr, argc:3, ARGS_SIMPLE>, <callcache> 0022 leave
  128. @mherold puts RubyVM::InstructionSequence.compile( "{ **sam, height: '1.66m' }" ).disasm ==

    disasm: #<ISeq:<compiled>@<compiled>:1 (1,0)-(1,26)>================= 0000 putspecialobject 1 ( 1)[Li] 0002 putself 0003 opt_send_without_block <callinfo!mid:sam, argc:0, FCALL|VCALL|ARGS_SIMPLE>, <callcache> 0006 opt_send_without_block <callinfo!mid:core#hash_merge_kwd, argc:1, ARGS_SIMPLE>, <callcache> 0009 opt_send_without_block <callinfo!mid:dup, argc:0, ARGS_SIMPLE>, <callcache> 0012 putspecialobject 1 0014 swap 0015 putobject :height 0017 putstring "1.66m" 0019 opt_send_without_block <callinfo!mid:core#hash_merge_ptr, argc:3, ARGS_SIMPLE>, <callcache> 0022 leave
  129. Look for core_hash_merge_kwd in Ruby’s source code @mherold

  130. @mherold static VALUE core_hash_merge_kwd(int argc, VALUE *argv) { VALUE hash,

    kw; rb_check_arity(argc, 1, 2); hash = argv[0]; kw = rb_to_hash_type(argv[argc-1]); if (argc < 2) hash = kw; rb_hash_foreach(kw, argc < 2 ? kwcheck_i : kwmerge_i, hash); return hash; }
  131. @mherold static VALUE core_hash_merge_kwd(int argc, VALUE *argv) { VALUE hash,

    kw; rb_check_arity(argc, 1, 2); hash = argv[0]; kw = rb_to_hash_type(argv[argc-1]); if (argc < 2) hash = kw; rb_hash_foreach(kw, argc < 2 ? kwcheck_i : kwmerge_i, hash); return hash; }
  132. Ruby’s VM casts the value to a hash … @mherold

  133. … but only when it isn’t already a Hash. @mherold

  134. Recall: a Dash is a Hash. @mherold

  135. The VM does not call #to_hash @mherold

  136. @mherold { **sam, height: '1.66m' } #=> {:name=>"Samwise", :nickname=>"Sam", :height=>"1.66m"}

  137. @mherold { **sam, height: '1.66m' } #=> {:name=>"Samwise", :nickname=>"Sam", :height=>"1.66m"}

    sam.merge(height: '1.66m') #=> NoMethodError: The property 'height' is not defined
  138. @mherold

  139. @mherold static VALUE core_hash_merge_kwd(int argc, VALUE *argv) { VALUE hash,

    kw; rb_check_arity(argc, 1, 2); hash = argv[0]; kw = rb_to_hash_type(argv[argc-1]); if (argc < 2) hash = kw; rb_hash_foreach(kw, argc < 2 ? kwcheck_i : kwmerge_i, hash); return hash; }
  140. The VM does not call #merge @mherold

  141. Dash’s property logic lives in Ruby methods that the VM’s C code

    never calls, so it isn’t run. @mherold
  142. Unfortunately, we can’t “fix” this. @mherold

  143. So we wrote it up in the README. @mherold

  144. @mherold

  145. @mherold

  146. 1. Indifferent Access 2. Mash keys 3. Destructuring a Dash

    @mherold
  147. class MyHash < Hash @mherold

  148. Your interface is suddenly 178 methods (and counting) @mherold

  149. Do you think you can catch all the corner cases?

    @mherold
  150. (If so, please contact me - we’d love another co-maintainer!) @mherold
  151. But wait …

  152. A wild PSA appears!

  153. Hashie::Mash @mherold

  154. @mherold

  155. None
  156. None
  157. Gem Name (Total Downloads Rank): omniauth (199), inspec (262),

    elasticsearch-api (264), elasticsearch-transport (265), restforce (567), chef-zero (716), elasticsearch-model (782), ridley (890), zendesk_api (911). Data from the 2018-11-12 RubyGems.org data dump. Queries can be found at https://michaeljherold.com/rubyconf2018 @mherold
  158. @mherold

  159. You might not need Hashie::Mash @mherold

  160. json = JSON.parse(<<JSON) { "foo": "bar", "bazes": [ "baz", "quux"

    ] } JSON @mherold
  161. json = JSON.parse(<<JSON) { "foo": "bar", "bazes": [ "baz", "quux"

    ] } JSON parsed = Hashie::Mash.new(json) #=> #<Hashie::Mash bazes=#<Hashie::Array ["baz", "quux"]> foo="bar"> @mherold
  162. json = JSON.parse(<<JSON) { "foo": "bar", "bazes": [ "baz", "quux"

    ] } JSON parsed = Hashie::Mash.new(json) #=> #<Hashie::Mash bazes=#<Hashie::Array ["baz", "quux"]> foo="bar"> parsed.foo #=> "bar" @mherold
  163. json = JSON.parse(<<JSON) { "foo": "bar", "bazes": [ "baz", "quux"

    ] } JSON parsed = Hashie::Mash.new(json) #=> #<Hashie::Mash bazes=#<Hashie::Array ["baz", "quux"]> foo="bar"> parsed.foo #=> "bar" parsed['foo'] #=> "bar" @mherold
  164. json = JSON.parse(<<JSON) { "foo": "bar", "bazes": [ "baz", "quux"

    ] } JSON parsed = Hashie::Mash.new(json) #=> #<Hashie::Mash bazes=#<Hashie::Array ["baz", "quux"]> foo="bar"> parsed.foo #=> "bar" parsed['foo'] #=> "bar" parsed[:foo] #=> "bar" @mherold
  165. json = JSON.parse(<<JSON) { "foo": "bar", "bazes": [ "baz", "quux"

    ] } JSON parsed = Hashie::Mash.new(json) #=> #<Hashie::Mash bazes=#<Hashie::Array ["baz", "quux"]> foo="bar"> parsed.foo #=> "bar" parsed['foo'] #=> "bar" parsed[:foo] #=> "bar" parsed.bazes #=> ["baz", "quux"] @mherold
  166. Hashie::Mash @mherold

  167. @mherold

  168. require 'json' require 'ostruct' @mherold

  169. json = <<JSON { "foo": "bar", "bazes": [ "baz", "quux"

    ] } JSON @mherold
  170. json = <<JSON { "foo": "bar", "bazes": [ "baz", "quux"

    ] } JSON parsed = JSON.parse(json, object_class: OpenStruct) #=> #<OpenStruct foo="bar", bazes=["baz", "quux"]> @mherold
  171. json = <<JSON { "foo": "bar", "bazes": [ "baz", "quux"

    ] } JSON parsed = JSON.parse(json, object_class: OpenStruct) #=> #<OpenStruct foo="bar", bazes=["baz", "quux"]> parsed.foo #=> "bar" @mherold
  172. json = <<JSON { "foo": "bar", "bazes": [ "baz", "quux"

    ] } JSON parsed = JSON.parse(json, object_class: OpenStruct) #=> #<OpenStruct foo="bar", bazes=["baz", "quux"]> parsed.foo #=> "bar" parsed['foo'] #=> "bar" @mherold
  173. json = <<JSON { "foo": "bar", "bazes": [ "baz", "quux"

    ] } JSON parsed = JSON.parse(json, object_class: OpenStruct) #=> #<OpenStruct foo="bar", bazes=["baz", "quux"]> parsed.foo #=> "bar" parsed['foo'] #=> "bar" parsed[:foo] #=> "bar" @mherold
  174. json = <<JSON { "foo": "bar", "bazes": [ "baz", "quux"

    ] } JSON parsed = JSON.parse(json, object_class: OpenStruct) #=> #<OpenStruct foo="bar", bazes=["baz", "quux"]> parsed.foo #=> "bar" parsed['foo'] #=> "bar" parsed[:foo] #=> "bar" parsed.bazes #=> ["baz", "quux"] @mherold
  175. @mherold

  176. My name is Michael Herold. Please tweet me at @mherold

    or say hello@michaeljherold.com.
  177. None
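
The Indifferent Access story (slides 35 to 58) can be reproduced without Hashie. The sketch below is only an illustration: `StringKeys`, `NaiveMerge`, and `SafeMerge` are hypothetical modules made up for this example, not Hashie's code, and `extend` stands in for whatever `IndifferentAccess.inject!` does internally. It shows that `Hash#merge` returns a fresh Hash that does not carry modules mixed into the receiver's singleton class, so calling `convert!` on the result fails unless the result is extended first.

```ruby
# Hypothetical helper module: converts all keys to strings in place.
module StringKeys
  def convert!
    transform_keys!(&:to_s) # requires Ruby 2.5 or newer
    self
  end
end

# Mirrors the buggy shape from the slides: assumes the merged
# result also knows about convert!.
module NaiveMerge
  include StringKeys

  def merge(*)
    super.convert!
  end
end

# Mirrors the shape of the fix: re-inject the module into the
# result before converting it.
module SafeMerge
  include StringKeys

  def merge(*)
    result = super
    result.extend(SafeMerge)
    result.convert!
  end
end

naive = { 'name' => 'Mango' }.extend(NaiveMerge)
naive_error =
  begin
    naive.merge(breed: 'Blue Heeler')
    nil
  rescue NoMethodError => e
    e
  end

safe = { 'name' => 'Mango' }.extend(SafeMerge)
merged = safe.merge(breed: 'Blue Heeler')

puts naive_error.class # the merged copy lost the singleton module
puts merged.inspect
```

The receiver responds to `convert!` but the object returned by `super` does not, which matches what the Pry session in the talk uncovered.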
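
The Mash keys story (slides 85 to 90) comes down to one rule: `method_missing` only runs when no real method exists. The following is a minimal sketch using a hypothetical `TinyMash` class, a heavily stripped-down stand-in for `Hashie::Mash` that only supports reading string keys. It also shows one way to list which keys of a payload collide with methods Hash already defines.

```ruby
# Hypothetical, stripped-down stand-in for Hashie::Mash.
class TinyMash < Hash
  # Only reached when Ruby finds no real method with this name.
  def method_missing(name, *args)
    key = name.to_s
    return self[key] if key?(key)

    super
  end

  def respond_to_missing?(name, include_private = false)
    key?(name.to_s) || super
  end
end

mash = TinyMash.new
mash['name'] = 'Millennium Biltmore'
mash['zip'] = '90071'

name_via_method = mash.name # no Hash#name, so method_missing answers
zip_via_method = mash.zip   # Enumerable#zip exists, so it wins

# Keys that are shadowed by methods Hash already defines.
conflicts = mash.keys.select { |key| Hash.method_defined?(key) }

puts name_via_method
puts zip_via_method.inspect
puts conflicts.inspect
```

Running a check like `conflicts` against real API payloads is one cheap way to find out whether a Mash-style wrapper will silently return the wrong thing.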
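
The Destructuring a Dash story (slides 112 to 141) can also be sketched without Hashie. `StrictHash` below is a hypothetical class written for this example, not Dash's implementation: it records every call to its Ruby-level `[]=`, `merge`, and `to_hash` so we can observe that a double-splat inside a Hash literal goes through none of them. The class of the splatted result differs between Ruby versions, so the sketch deliberately does not depend on it.

```ruby
# Hypothetical Hash subclass that validates keys, loosely in the
# spirit of a Dash. CALLS records which Ruby-level methods ran.
class StrictHash < Hash
  ALLOWED = %i[name nickname].freeze
  CALLS = []

  def []=(key, value)
    CALLS << :[]=
    raise ArgumentError, "unknown key: #{key.inspect}" unless ALLOWED.include?(key)

    super
  end

  def merge(other)
    CALLS << :merge
    dup.tap { |copy| other.each { |key, value| copy[key] = value } }
  end

  def to_hash
    CALLS << :to_hash
    super
  end
end

sam = StrictHash.new
sam[:name] = 'Samwise'
sam[:nickname] = 'Sam'
StrictHash::CALLS.clear

# The double-splat is handled inside the VM, not via method calls.
splatted = { **sam, height: '1.66m' }
calls_during_splat = StrictHash::CALLS.dup

# An explicit merge does go through the Ruby-level validation.
merge_error =
  begin
    sam.merge(height: '1.66m')
    nil
  rescue ArgumentError => e
    e
  end

puts calls_during_splat.inspect
puts splatted.to_a.inspect
puts merge_error.message
```

Because the validation lives in Ruby methods, any code path that manipulates the underlying hash from C skips it, which is why the talk concludes this cannot be fixed from the gem.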