A Million Things YARV Has to Do Before It Runs Your Code

Slides from a presentation delivered on 19.12.2012 at the Warsaw Ruby Users Group meetup.

http://wrug.eu/2012/12/12/spotkanie-grudniowe/

Jan Stępień

December 19, 2012

Transcript

  1. A true history, told on
    December 19, 2012, in which
    Jan Stępień
    and a band of daredevils under the banner of the Red
    Crystal plunge into the uncharted depths of
    Virtual Machines
    and at the very bottom come face to face with a monster
    beyond the grasp of human reason,
    YARV
    by name.

  2. Virtual machines make life simpler
    Abstraction over memory
    Dynamic influence on execution
    Abstraction over hardware
    Nothing comes for free

  3. Lisp, Smalltalk, Self… Those were the days!

  4. So, Ruby was a Lisp originally,
    in theory. Let’s call it
    MatzLisp from now on. ;-)
    Matz
    Re: Ruby’s lisp features
    February 13, 2006

  5. I have always been more interested
    in designing the language than
    implementing it. So Ruby interpreter
    is always slower than it should be.
    Matz
    The Ruby VM: Episode I
    2007

  6. Matz’s Ruby
    Interpreter

  7. Yet another
    Ruby VM
    笹田耕一 (Koichi Sasada)

  8. parse.c
    Source code → abstract syntax tree

  9. irb(main):000:0> require 'ripper'
    irb(main):001:0> pp Ripper.sexp "Math.sqrt 3 + 4"
    [:program,
     [[:command_call,
       [:var_ref, [:@const, "Math", [1, 0]]],
       :".",
       [:@ident, "sqrt", [1, 5]],
       [:args_add_block,
        [[:binary, [:@int, "3", [1, 10]], :+,
          [:@int, "4", [1, 14]]]],
        false]]]]

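The nested arrays returned by Ripper.sexp can be walked like any other Ruby data. The sketch below, which is not part of the original slides, collects the scanner-event leaves (the arrays whose tag starts with `@`). The helper name `tokens` is made up for this illustration, and the exact shape of the tree may differ between Ruby versions.

```ruby
require 'ripper'

# Collect scanner-event leaves such as [:@int, "3", [1, 10]] from a Ripper
# s-expression. `tokens` is a hypothetical helper, not part of Ripper itself.
def tokens(node, acc = [])
  return acc unless node.is_a?(Array)

  if node.first.is_a?(Symbol) && node.first.to_s.start_with?('@')
    acc << [node[0], node[1]]                   # keep the tag and the source text
  else
    node.each { |child| tokens(child, acc) }    # descend into parser-event nodes
  end
  acc
end

sexp = Ripper.sexp('Math.sqrt 3 + 4')
found = tokens(sexp)
p found
```

The leaves come back in source order, so they roughly correspond to what the lexer saw before parse.c built the tree.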

  10. compile.c
    abstract syntax tree → bytecode

  11. RubyVM::InstructionSequence
    (also known as iseq)

  12. irb(main):000:0> iseq =
    RubyVM::InstructionSequence.compile "x + 6"
    irb(main):001:0> puts iseq.disasm
    == disasm: ==
    0000 trace 1 ( 1)
    0002 putself
    0003 send :x, 0, nil, 24,
    0009 putobject 6
    0011 opt_plus
    0013 leave

  13. irb(main):002:0> pp iseq.to_a
    ["YARVInstructionSequence/SimpleDataFormat",
     1, 2, 1,
     {arg_size: 0, local_size: 1, stack_max: 2},
     "", "", nil, 1, :top,
     [], 0, [],
     [1,
      [:trace, 1],
      [:putself],
      [:send, :a, 0, nil, 24, 0],
      [:putself],
      [:send, :b, 0, nil, 24, 1],
      [:opt_plus, 3],
      [:leave]]]

  14. #if SUPPORT_JOKE
    ...
    CONST_ID(goto_id, "__goto__");
    CONST_ID(label_id, "__label__");
    if (nd_type(node) == NODE_FCALL &&
        (mid == goto_id || mid == label_id)) {
    ...
    #endif

  15. $ cd ruby
    $ sed -i vm_opts.h -e \
      's/define SUPPORT_JOKE *0/define SUPPORT_JOKE 1/'
    $ make

  16. __label__ :start
    puts "haters gonna hate"
    __goto__ :start

  17. haters gonna hate haters gonna hate haters gonna ha
    gonna hate haters gonna hate haters gonna hate hate
    hate haters gonna hate haters gonna hate haters go
    haters gonna hate haters gonna hate haters gonna ha
    gonna hate haters gonna hate haters gonna hate hate
    hate haters gonna hate haters gonna hate haters go
    haters gonna hate haters gonna hate haters gonna ha
    gonna hate haters gonna hate haters gonna hate hate
    hate haters gonna hate haters gonna hate haters go
    haters gonna hate haters gonna hate haters gonna ha
    gonna hate haters gonna hate haters gonna hate hate
    hate haters gonna hate haters gonna hate haters go
    haters gonna hate haters gonna hate haters gonna ha

  18. insns.def
    Virtual machine instruction definitions

  19. /**
    @c put
    @e put new array.
    @j 新しい配列をスタック上の num
    個の値で初期化して生成しプッシュする。
    */
    DEFINE_INSN
    newarray
    (rb_num_t num)
    (...)
    (VALUE val) // inc += 1 - num;
    {
    val = rb_ary_new4((long)num,
    STACK_ADDR_FROM_TOP(num));
    POPN(num);
    }

  20. /**
    @c joke
    @e The Answer to Life, the Universe,
    and Everything
    @j 人生、宇宙、すべての答え。
    */
    DEFINE_INSN
    answer
    ()
    ()
    (VALUE ret)
    {
    ret = INT2FIX(42);
    }

  21. irb(main):000:0> RubyVM::INSTRUCTION_NAMES
    => ["nop", "getlocal", "setlocal", "getspecial",
    "setspecial", "getdynamic", "setdynamic",
    "getinstancevariable", "setinstancevariable",
    "getclassvariable", "setclassvariable",
    "getconstant", "setconstant", ...

  22. YARVinstructions — How Ruby 1.9
    Executes Your Ruby Script
    http://yarvinstructions.heroku.com

  23. $ cd ruby
    $ make V=1
    ...
    ruby ./tool/insns2vm.rb srcdir="." insns.inc
    ruby ./tool/insns2vm.rb srcdir="." insns_info.inc
    ruby ./tool/insns2vm.rb srcdir="." optinsn.inc
    ruby ./tool/insns2vm.rb srcdir="." optunifs.inc
    ruby ./tool/insns2vm.rb srcdir="." opt_sc.inc
    ruby ./tool/insns2vm.rb srcdir="." vmtc.inc
    ruby ./tool/insns2vm.rb srcdir="." vm.inc
    ...

  24. vm.inc
    This is where the bulk of the virtual machine lives

  25. Example:
    getinstancevariable
    label_insn_getinstancevariable:
    id = stack_pop();
    ic = stack_pop();
    val = vm_get_ivar(get_self(), id, ic);
    stack_push(val);
    instruction_pointer += 1;
    goto instruction_labels[instruction_pointer];

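The dispatch pattern in the pseudocode above (fetch an instruction, act on the stack, advance, jump to the next one) can be sketched as a toy stack machine in Ruby. This is only an illustration under stated assumptions: the instruction names mimic YARV, but the `run` method, the array encoding of the program, and the hash standing in for the receiver's instance variables are all invented here. Operands travel with each instruction, as they do in the disasm listing, rather than being popped from the stack.

```ruby
# A toy stack machine. Instruction names mimic YARV; everything else
# (the `run` method, the program encoding, the ivars hash) is hypothetical.
def run(program, ivars = {})
  stack = []
  pc = 0                                  # index of the next instruction
  loop do
    op, arg = program[pc]                 # fetch opcode and optional operand
    pc += 1
    case op
    when :putobject
      stack.push(arg)
    when :getinstancevariable
      stack.push(ivars.fetch(arg))        # stand-in for vm_get_ivar
    when :opt_plus
      b = stack.pop
      a = stack.pop
      stack.push(a + b)
    when :leave
      return stack.pop                    # top of the stack is the result
    else
      raise ArgumentError, "unknown instruction #{op.inspect}"
    end
  end
end

program = [
  [:getinstancevariable, :@x],
  [:putobject, 6],
  [:opt_plus],
  [:leave]
]
result = run(program, { :@x => 36 })
puts result
```

The real interpreter loop is generated C with a label per instruction and computed gotos; the `case` statement here plays the role of that jump table.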

  26. vm_exec_core
    A function that would never pass any code review

  27. A moment to collect our thoughts

  28. [Figure: profiler call graph of a Ruby process, one node per C function with its own and cumulative sample counts. vm_exec, vm_exec_core, vm_call_method and vm_call_cfunc each account for 1285 cumulative samples (80.9%); garbage-collector functions such as st_foreach (21.3%) and gc_mark_children (17.9%) form a separate cluster.]

  29. [Figure: close-up of the call graph from slide 28, showing the garbage-collector marking functions: st_foreach, gc_mark_children, mark_tbl, iseq_mark, mark_entry and related nodes.]

  30. [Figure: close-up of the call graph from slide 28 around vm_exec, vm_call_method and vm_exec_core.]

  31. Calling a method
    in five easy steps

  32. 1. Prepare the arguments and the block
    2. Prepare the stack if the method
    returns multiple values
    3. Determine the receiver's class
    4. Find the method being called
    5. Call the method

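Steps 3 to 5 can be approximated in plain Ruby with reflection. This is a sketch and not how the VM does it internally: `lookup` and `slow_send` are hypothetical names, singleton classes are ignored, and `method_missing` is not handled.

```ruby
# Step 4: walk the ancestor chain and return the first definition found.
# `lookup` is a hypothetical helper for this illustration.
def lookup(klass, name)
  owner = klass.ancestors.find do |mod|
    mod.instance_methods(false).include?(name) ||
      mod.private_instance_methods(false).include?(name)
  end
  owner && owner.instance_method(name)
end

# Steps 3 to 5 combined; steps 1 and 2 are handled by Ruby itself here.
def slow_send(receiver, name, *args, &block)
  klass = receiver.class                        # step 3: receiver's class
  meth = lookup(klass, name)                    # step 4: find the method
  raise NoMethodError, "undefined method #{name}" unless meth

  meth.bind(receiver).call(*args, &block)       # step 5: call it
end

p slow_send([3, 1, 2], :sort)
p slow_send(40, :+, 2)
```

Walking the whole ancestor chain on every call is exactly the cost that the caches on the following slides try to avoid.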

  33. Cache
    the cure for all worries

  34. The number of object classes that show up
    at method call sites

  35. Inline cache
    Store the method you found at the call
    site

  36. 1. Prepare the arguments and the block
    2. Prepare the stack if the method
    returns multiple values
    3. Determine the receiver's class
    4. Check the inline cache
    5. Check the global cache
    6. Look the method up the normal way
    7. Call the method

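Step 4 can be illustrated with a small Ruby object that stands in for one call site. This is a minimal sketch, assuming a one-entry cache keyed on the receiver's class; `CallSite` and its counters are invented for the example, the global cache of step 5 is left out, and a real VM would also have to invalidate the entry when a method is redefined.

```ruby
# One call site with a one-entry inline cache. `CallSite` is hypothetical.
class CallSite
  attr_reader :hits, :misses

  def initialize(name)
    @name = name
    @cached_class = nil
    @cached_method = nil
    @hits = 0
    @misses = 0
  end

  def call(receiver, *args)
    klass = receiver.class
    if klass.equal?(@cached_class)
      @hits += 1                                      # cache hit: skip the lookup
    else
      @misses += 1                                    # cache miss: full lookup
      @cached_class = klass
      @cached_method = klass.instance_method(@name)
    end
    @cached_method.bind(receiver).call(*args)
  end
end

site = CallSite.new(:to_s)
[1, 2, 3, "four", 5].each { |obj| site.call(obj) }
puts "hits=#{site.hits} misses=#{site.misses}"   # prints "hits=2 misses=3"
```

The cache pays off only when the same class keeps arriving at the same call site, which is the observation behind slide 34.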

  37. Instruction specialization
    Speeding up commonly
    performed operations

  38. send
    The virtual machine instruction used
    to call methods

  39. a + b → opt_plus

  40. a[b] → opt_aref

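The two mappings above can be checked on a CRuby build that exposes RubyVM. The sketch below compiles three snippets and lists the instruction names in each; `insn_names` is a helper made up for this example, and instruction names other than `opt_plus` and `opt_aref` vary between Ruby versions.

```ruby
# Return the instruction names of a compiled snippet (CRuby only).
# `insn_names` is a hypothetical helper, not part of RubyVM.
def insn_names(source)
  body = RubyVM::InstructionSequence.compile(source).to_a.last
  # The body mixes line numbers, labels and instruction arrays;
  # only the arrays are instructions.
  body.select { |entry| entry.is_a?(Array) }.map(&:first)
end

plus = insn_names('a + b')
aref = insn_names('a[b]')
call = insn_names('a.foo(b)')

puts "a + b    -> #{plus.inspect}"
puts "a[b]     -> #{aref.inspect}"
puts "a.foo(b) -> #{call.inspect}"
```

An ordinary call such as `a.foo(b)` goes through the generic send path, while `+` and `[]` get their own instructions.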

  41. YARV: Yet Another RubyVM
    Innovating the Ruby Interpreter
    Koichi Sasada
    Graduate School of Technology,
    Tokyo University of Agriculture and Technology
    2-24-16 Nakacho, Koganei-shi, Tokyo, Japan.
    [email protected]
    ABSTRACT
    Ruby - an Object-Oriented scripting language - is used worldwide
    because of its ease of use. However, the current interpreter
    is slow. To solve this problem, some virtual machines
    were developed, but none with adequate performance
    or functionality. To fill this gap, I have developed a Ruby
    interpreter called YARV (Yet Another Ruby VM). YARV
    is based on a stack machine architecture and features
    optimizations for high speed execution of Ruby programs. In
    this poster, I introduce the Ruby programming language,
    discuss certain characteristics of Ruby from the aspect of
    a Ruby interpreter implementer, and explain methods of
    implementation and optimization. Benchmark results are
    given at the end.
    Categories and Subject Descriptors
    D.3 [PROGRAMMING LANGUAGES]: Processors—
    Interpreters
    General Terms
    Languages
    • Normal OO features (class, method call, etc.)
    • Advanced OO features (all values are objects, Mix-in,
    Singleton method, etc.)
    • Dynamic-typing, re-definable behavior, dynamic evaluation
    • Operator overloading
    • Exception handling
    • Closure and method invocation with a block
    • Garbage collection support
    • Dynamic module loading
    • Many useful libraries
    • Highly portable
    However, the current Ruby interpreter (old-ruby) is slow.
    This is because it works by traversing abstract syntax tree
    and evaluating each node. To solve this problem, I have de-

  42. What I did not even touch on
    Garbage collector MRI 2.0
    Other virtual machines

  43. Summary!
    1. Starting with MRI 1.9, Ruby runs on
    a true virtual machine,
    2. Source code → abstract syntax tree → bytecode,
    3. The 80 YARV instructions are defined in insns.def,
    4. Optimizations: method cache, inline cache, instruction
    specialization.

  44. [email protected]
    hp://stepien.cc/~jan
    @janstepien
    Serdecznie
    dziękuję

    View full-size slide

  45. This presentation was prepared with LaTeX.
    © 2012 Jan Stępień. Some rights reserved.