Slide 1

Slide 1 text

How to optimize Ruby internal • Shizuo Fujita

Slide 2

Slide 2 text

Self • @watson1978 • Ubiregi Inc. • Ruby committer

Slide 3

Slide 3 text

Ruby 3x3 • Ruby 3 needs to be 3 times faster than Ruby 2

Slide 4

Slide 4 text

About this talk • How to measure Ruby internals • Ideas for optimizing Ruby internals • Optimize

Slide 5

Slide 5 text

How to measure

Slide 6

Slide 6 text

Prepare benchmark code

    hash1 = {aaa: 12, bbb: 34}
    hash2 = {ccc: 56, ddd: 78}
    loop do
      hash1.merge(hash2)
    end

Slide 7

Slide 7 text

Measure

    $ iprofiler -timeprofiler ./miniruby ~/benchmark.rb

    hash1 = {aaa: 12, bbb: 34}
    hash2 = {ccc: 56, ddd: 78}
    loop do
      hash1.merge(hash2)
    end
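For a quick check without a profiler, the same workload can be timed with Ruby's standard Benchmark module. This is only a minimal sketch: the helper name `time_merge` and the iteration count are assumptions for illustration and do not come from the talk, and the infinite loop is replaced by a fixed number of iterations so the script terminates.

```ruby
require "benchmark"

# Hypothetical helper (not from the talk): runs Hash#merge a fixed number
# of times and returns the elapsed wall-clock time in seconds.
def time_merge(iterations)
  hash1 = {aaa: 12, bbb: 34}
  hash2 = {ccc: 56, ddd: 78}
  Benchmark.realtime do
    iterations.times { hash1.merge(hash2) }
  end
end

elapsed = time_merge(100_000)
puts format("100000 merges took %.4f s", elapsed)
```

Running the same script under two Ruby builds gives a rough before/after comparison; a profiler is still needed to see where the time goes.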

Slide 8

Slide 8 text

No content

Slide 9

Slide 9 text

No content

Slide 10

Slide 10 text

No content

Slide 11

Slide 11 text

No content

Slide 12

Slide 12 text

Idea to optimize Ruby internal

Slide 13

Slide 13 text

Method execution... • Method dispatching • Looking up constants / methods • Ruby method execution • Implemented in Hash/String/Array/Time...

Slide 14

Slide 14 text

Dispatching Method execution

Slide 15

Slide 15 text

Dispatching Method execution

Slide 16

Slide 16 text

Dispatching Method execution

Slide 17

Slide 17 text

Dispatching Method execution

Slide 18

Slide 18 text

• Focus on reducing method execution time

Slide 19

Slide 19 text

• Focus on reducing method execution time • Remove dispatching in methods • Remove redundant allocations

Slide 20

Slide 20 text

Remove dispatching Dispatching Method execution Dispatching Ruby method via rb_funcall()

Slide 21

Slide 21 text

Remove dispatching Dispatching Method execution

Slide 22

Slide 22 text

Remove redundant allocations • Diagram labels: allocations, Dispatching, Method execution

Slide 23

Slide 23 text

Remove redundant allocations • Diagram labels: allocations, allocations, Dispatching, Method execution

Slide 24

Slide 24 text

Optimize

Slide 25

Slide 25 text

Hash#merge • It used rb_obj_dup(), which calls rb_funcall()

Slide 26

Slide 26 text

rb_obj_dup() • It calls Object#initialize_dup via rb_funcall()

Slide 27

Slide 27 text

rb_obj_dup() • It calls Object#initialize_dup via rb_funcall() • Replace rb_obj_dup() with something like rb_ary_dup() to remove the redundant Object#initialize_dup call
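The dispatch performed by a generic dup can be observed from Ruby code. The sketch below is an illustration only: the class `Tracked` and its `copied_from` attribute are hypothetical and not part of the talk. It relies on the documented behavior that Object#dup calls initialize_dup, which by default calls initialize_copy.

```ruby
# Hypothetical class (not from the talk) that records when the copy hook runs.
class Tracked
  attr_reader :copied_from

  # Object#dup calls initialize_dup, which by default calls initialize_copy.
  def initialize_copy(source)
    super
    @copied_from = source
  end
end

original = Tracked.new
copy = original.dup

# The hook was dispatched during dup, so the copy remembers its source.
puts copy.copied_from.equal?(original)
# The original itself was never produced by a copy.
puts original.copied_from.inspect
```

Because any class may override these hooks, a generic dup has to go through method dispatch; a type-specific duplication function inside the interpreter can skip that step when the hook is known to be redundant.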

Slide 28

Slide 28 text

Patch for Hash#merge

Slide 29

Slide 29 text

Patch for Hash#merge • Replaced rb_obj_dup() to remove rb_funcall()

Slide 30

Slide 30 text

Hash#merge performance • [Bar chart: Ruby vs. Ruby dev; version numbers and values lost in extraction]

    hash1 = { "a" => 100, "b" => 200 }
    hash2 = { "b" => 254, "c" => 300 }
    hash1.merge(hash2)

Slide 31

Slide 31 text

Patch for Time (1)

Slide 32

Slide 32 text

Patch for Time (2)

Slide 33

Slide 33 text

Patch for Time (3)

Slide 34

Slide 34 text

Time methods • Time methods called Ruby methods via rb_funcall()

Slide 35

Slide 35 text

Time methods • Time methods called Ruby methods via rb_funcall() • Added some internal APIs to call the methods directly

Slide 36

Slide 36 text

Time#- performance • [Bar chart: Ruby vs. Ruby dev; version numbers and values lost in extraction]

    Time.now - Time.at(0)

Slide 37

Slide 37 text

Others

Slide 38

Slide 38 text

Result (2.4.1 vs 2.5.0-dev) • [Chart: relative performance for Array, Hash, String, Time; values lost in extraction] • Ubuntu 17.04 • gcc version 7.0.1 • ruby 2.5.0dev (2017-08-27 trunk 59665) [x86_64-linux]

Slide 39

Slide 39 text

Top 10 • String#[operator lost in extraction](other) • String#insert(pos, other) • Array#rassoc(obj) • Time#subsec • Time#to_i • Time#tv_sec • Hash#has_value? (no value) • Hash#value? (no value) • Time#to_r • Array#max(n)

Slide 40

Slide 40 text

Worst 10 • Array#cycle(n) {|obj| block} • Array#each_index {|index|} • String#to_i • Array#any? {|x| block} • Array#rindex(val) (not found) • String#lines • Time#usec • Array#bsearch_index {|x| block} • String#lines {|line|} • Hash literal

Slide 41

Slide 41 text

Worst 10 • Array#cycle(n) {|obj| block} • Array#each_index {|index|} • String#to_i • Array#any? {|x| block} • Array#rindex(val) (not found) • String#lines • Time#usec • Array#bsearch_index {|x| block} • String#lines {|line|} • Hash literal

Slide 42

Slide 42 text

Ruby 2.5.0-dev 35.7 % slow down

Slide 43

Slide 43 text

Regression was fixed by shyouhei

Slide 44

Slide 44 text

One more thing…

Slide 45

Slide 45 text

Hash • A Hash object needs to allocate several heap areas

Slide 46

Slide 46 text

Hash Internal • RHash: RBasic, st_table *, int, VALUE • st_table: char, char, char, int, st_hash_type *, st_index_t, st_index_t *, st_index_t, st_index_t, st_table_entry * • st_index_t []: st_index_t, st_index_t, …, st_index_t • st_table_entry []: st_table_entry, st_table_entry, …, st_table_entry

Slide 47

Slide 47 text

4 allocations • RHash: RBasic, st_table *, int, VALUE • st_table: char, char, char, int, st_hash_type *, st_index_t, st_index_t *, st_index_t, st_index_t, st_table_entry * • st_index_t []: st_index_t, st_index_t, …, st_index_t • st_table_entry []: st_table_entry, st_table_entry, …, st_table_entry

Slide 48

Slide 48 text

Reused & faster / Slow allocating • RHash: RBasic, st_table *, int, VALUE • st_table: char, char, char, int, st_hash_type *, st_index_t, st_index_t *, st_index_t, st_index_t, st_table_entry * • st_index_t []: st_index_t, st_index_t, …, st_index_t • st_table_entry []: st_table_entry, st_table_entry, …, st_table_entry

Slide 49

Slide 49 text

Always allocating • RHash: RBasic, st_table *, int, VALUE • st_table: char, char, char, int, st_hash_type *, st_index_t, st_index_t *, st_index_t, st_index_t, st_table_entry * • st_index_t []: st_index_t, st_index_t, …, st_index_t • st_table_entry []: st_table_entry, st_table_entry, …, st_table_entry
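The cost of a Hash beyond its object slot can be glimpsed from Ruby code. The following is a rough sketch, not a measurement from the talk: the helper `allocated_objects` is hypothetical, it only counts object slots handed out by the GC (not the separate heap areas), and the byte sizes reported by ObjectSpace.memsize_of depend on the Ruby version, since the internal Hash layout has changed since 2.5.0-dev.

```ruby
require "objspace"

# Hypothetical helper (not from the talk): counts how many objects the GC
# hands out while the block runs.
def allocated_objects
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

# Every evaluation of the literal creates a new Hash object.
count = allocated_objects do
  1_000.times { {foo: 12, bar: 34, baz: 56} }
end
puts "objects allocated: #{count}"

# memsize_of reports the bytes an object occupies, including heap areas
# it owns outside its slot; exact numbers vary between Ruby versions.
small = {}
large = {}
1_000.times { |i| large[i] = i }
puts "empty hash: #{ObjectSpace.memsize_of(small)} bytes"
puts "1000-entry hash: #{ObjectSpace.memsize_of(large)} bytes"
```

The growing memsize_of value reflects the table storage that lives outside the fixed-size object slot, which is the part the slides describe as slow to allocate.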

Slide 50

Slide 50 text

Concatenate heap areas • Before: 2 allocations • After: 1 allocation

Slide 51

Slide 51 text

Hash literal performance • [Bar chart: Before vs. After; values lost in extraction] • Caution: This is just a prototype

    h = {foo: 12, bar: 34, baz: 56}

Slide 52

Slide 52 text

vs. Ruby 2.4.1 • [Bar chart: After vs. Ruby 2.4.1; values lost in extraction] • Base: ruby 2.5.0dev (2017-09-10 trunk 59745) [x86_64-linux] • Caution: This is just a prototype

    h = {foo: 12, bar: 34, baz: 56}

Slide 53

Slide 53 text

You might have learned: • How to measure • Some ways to optimize effectively • Part of the current Ruby-dev status

Slide 54

Slide 54 text

Thank you !!