How to optimize Ruby internal.

Watson
September 18, 2017

"Ruby 3" aims to improve performance as one of its release goals. I have written several patches that optimize Ruby internals toward that goal.

This talk describes how Ruby internals were optimized in Ruby 2.5.

Transcript

  1. About this talk • How to measure Ruby internals • Ideas for optimizing Ruby internals • Optimize
  2. Prepare benchmark code

    hash1 = {aaa: 12, bbb: 34}
    hash2 = {ccc: 56, ddd: 78}
    loop do
      hash1.merge(hash2)
    end
  3. Measure

    $ iprofiler -timeprofiler ./miniruby ~/benchmark.rb

    hash1 = {aaa: 12, bbb: 34}
    hash2 = {ccc: 56, ddd: 78}
    loop do
      hash1.merge(hash2)
    end
  4. Method execution... • Method dispatching • Look up constants / methods • Ruby method executing • Implemented in Hash/String/Array/Time...
  5. • Focused on reducing method execution time • Remove dispatching inside methods • Remove redundant allocations
  6. rb_obj_dup() • It calls Object#initialize_dup via rb_funcall() • Replace rb_obj_dup() with something like rb_ary_dup() to remove the redundant Object#initialize_dup call
  7. Hash#merge performance [Chart: Ruby vs. Ruby dev; values not recoverable]

    hash1 = { "a" => 100, "b" => 200 }
    hash2 = { "b" => 254, "c" => 300 }
    hash1.merge(hash2)
  8. Time methods • Time methods called other Ruby methods via rb_funcall() • Added some internal APIs to call those methods directly
  9. Result (2.4.1 vs 2.5.0-dev) [Chart: Array, Hash, String, Time; values not recoverable] Environment: Ubuntu 17.04, gcc version 7.0.1, ruby 2.5.0dev (2017-08-27 trunk 59665) [x86_64-linux]
  10. Top 10 (per-item figures not recoverable) • String#[nth] (other) • String#insert(pos, other) • Array#rassoc(obj) • Time#subsec • Time#to_i • Time#tv_sec • Hash#has_value? (novalue) • Hash#value? (novalue) • Time#to_r • Array#max(n)
  11. Worst 10 (per-item figures not recoverable) • Array#cycle(n) {|obj| block} • Array#each_index {|index|} • String#to_i • Array#any? {|x| block} • Array#rindex(val) (notfound) • String#lines • Time#usec • Array#bsearch_index {|x| block} • String#lines {|line|} • Hash literal
  12. Worst 10 (per-item figures not recoverable) • Array#cycle(n) {|obj| block} • Array#each_index {|index|} • String#to_i • Array#any? {|x| block} • Array#rindex(val) (notfound) • String#lines • Time#usec • Array#bsearch_index {|x| block} • String#lines {|line|} • Hash literal
  13. Hash Internal [Diagram: RHash (RBasic, st_table *, int, VALUE) points to st_table (char, char, char, int, st_hash_type *, st_index_t, st_index_t *, st_index_t, st_index_t, st_table_entry *), which points to an st_index_t [] array and an st_table_entry [] array]
  14. 4 allocations [Same diagram: RHash, st_table, st_index_t [], st_table_entry []]
  15. Reused & faster / Slow allocating [Same diagram: RHash, st_table, st_index_t [], st_table_entry []]
  16. Always allocating [Same diagram: RHash, st_table, st_index_t [], st_table_entry []]
  17. Hash literal performance [Chart: Before vs. After; values not recoverable]

    h = {foo: 12, bar: 34, baz: 56}

    Caution: this is just a prototype
  18. vs. Ruby 2.4.1 [Chart: After vs. Ruby 2.4.1; values not recoverable] Base: ruby 2.5.0dev (2017-09-10 trunk 59745) [x86_64-linux]

    h = {foo: 12, bar: 34, baz: 56}

    Caution: this is just a prototype
  19. You might learn: • How to measure • Some ways to optimize effectively • A part of current Ruby-dev status
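
Slides 2 and 3 profile an endless loop with an external profiler. As a rough in-process complement, the sketch below times a bounded number of Hash#merge calls with Benchmark.realtime from Ruby's standard library. The iteration count and the names iterations and elapsed are assumptions for illustration and do not come from the talk. Wall-clock timing like this shows only the total cost, not where inside the interpreter the time goes.

```ruby
require "benchmark"

# Same hashes as the benchmark on slide 2.
hash1 = { aaa: 12, bbb: 34 }
hash2 = { ccc: 56, ddd: 78 }

# Assumed iteration count; raise it for more stable numbers.
iterations = 200_000

# Benchmark.realtime returns the elapsed wall-clock time in seconds.
elapsed = Benchmark.realtime do
  iterations.times { hash1.merge(hash2) }
end

puts format("%d merges in %.3f s", iterations, elapsed)
```

Running the same script under two interpreter builds gives a simple before/after comparison of the kind the result slides chart.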
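
Slide 6 says that rb_obj_dup() reaches Object#initialize_dup through a method call. The same hook chain can be observed from Ruby: Object#dup invokes initialize_dup, which by default invokes initialize_copy. The sketch below counts those hook calls. The Tracked class and its copies counter are hypothetical names used only for this illustration.

```ruby
# Hypothetical class used only to observe the copy hook.
class Tracked
  @copies = 0

  class << self
    attr_accessor :copies
  end

  attr_reader :payload

  def initialize(payload)
    @payload = payload
  end

  # Object#dup calls initialize_dup, which in turn calls initialize_copy.
  def initialize_copy(source)
    super
    self.class.copies += 1
  end
end

original = Tracked.new([1, 2, 3])
copy = original.dup

puts "hook calls: #{Tracked.copies}"
puts "distinct object: #{!copy.equal?(original)}"
```

Because user code can override this hook, a generic dup has to dispatch to it. A type-specific internal copy such as rb_ary_dup(), as the slide suggests, can skip that dispatch where the hook is known to be redundant.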