Learn Programming Essence from Ruby patches

takkanm
September 09, 2016

Transcript

  1. Learn Programming Essence
    from Ruby patches
    Mitsutaka Mimura
    @takkanm

  2. Today’s Topic
    The aim of this session is to review the
    essence of programming, as described in
    programming textbooks, by reading Ruby
    code and its libraries.

  3. about me
    me:
    name: Mitsutaka Mimura
    github: takkanm
    company: EiwaSystemManagement
    community: Asakusa.rb
    book: PerfectRuby

  4. [Short URL on bit.ly; recovered letters: esmrubykaigibento, separators lost]

  5. “Ajyarimochi sounds like Agile dajare_club”
    My boss said

  6. Agenda
    ✤What is programming knowledge?
    ✤Learn from patches.
    ✤Example case

  7. What is Programming Knowledge?

  8. What Knowledge of
    Programming Do We Use
    in Daily Work?

  9. Knowledge of
    Programming Used in
    Daily Tasks
    Daily tasks require knowledge of the
    following:
    ✤ Language features
    ✤ How to use libraries
    ✤ How to use frameworks

  10. Knowledge of Programming
    Seen in Occasions Like
    University Lectures

  11. Knowledge of Programming
    Seen in Occasions Like
    University Lectures
    ✤Algorithm
    ✤Data Structure
    ✤Computational Complexity

  12. Are they unnecessary in
    daily work?
    We rarely get a chance to use such knowledge
    when writing business code.
    But we believe textbooks still hold many
    secrets, such as algorithms for writing
    smarter or more efficient code.

  13. How to study?

  14. Boring Textbooks
    From textbooks we can learn a lot, such as how
    algorithms are implemented and their pros and cons.
    But it is not easy to imagine how such knowledge
    applies to the code we actually write.

  15. Book: “Programming Technique”
    This book teaches programming techniques by
    reading the source of UNIX commands.
    From it I got some idea of how what I learned
    at school is used in actual programming.

  16. Proposal
    Let's make what you have learned more
    familiar by reading Ruby patches.

  17. Learn from patches

  18. Why patches?
    ✤ The code of interest is clear
    ✤ The patch comes with a lot of
    explanation
    ✤ You can choose patches that suit your
    reading level

  19. What do you learn from a
    patch?
    ✤Understanding of libraries and Ruby
    implementation
    ✤Understanding of the algorithm
    ✤Practical implementation

  20. At an Asakusa.rb Meetup
    I was taught about Ruby and about patches that
    improve its performance.
    By reading them, I learned again how a change
    of algorithm can bring improvements.

  21. Improving Performance in
    Ruby
    That might remind you of advanced and esoteric
    work, such as improving GC algorithms or a JIT.
    But some patches are not so hard to read.

  22. But You Should Learn
    about the Ruby Interpreter
    That said, you need to know at least a little about
    how Ruby works in order to read Ruby patches.
    So I will introduce the materials I recommend.

  23. Ruby Under
    a Microscope

  24. Ruby Under
    a Microscope

  25. doc/extension.rdoc (in the Ruby source tree)

  26. yotii23’s slides:
    WALK AROUND THE Ruby FOREST MORE DEEPLY.

  27. Example case

  28. https://bugs.ruby-lang.org/issues/12142

  29. about st_table
    Ruby's core has an 'st_table' structure that is
    used in many parts of the code.
    The contents of a Ruby 'Hash' are also stored
    in this structure.

  30. review of Hash structure

  31. review of Hash
    hash = Hash.new
    hash[:key_a] = 1

  32. review of Hash
    [Diagram: HashTable]

  33. review of Hash
    [Diagram: the KEY/VALUE pair for key_a beside the HashTable]

  34. review of Hash
    [Diagram: the KEY/VALUE pair for key_a, a Hash function, and the HashTable]

  35. review of Hash
    [Diagram: the KEY/VALUE pair for key_a, a Hash function, and the HashTable]

  36. review of Hash
    [Diagram: the KEY/VALUE pair for key_a and the HashTable]

  37. in Ruby
    [Diagram: the KEY/VALUE pair for key_a]

  38. in Ruby
    [Diagram: the KEY/VALUE pair for key_a, with the labels st_table and st_table_entry]

  39. st_table
    struct st_table {
        const struct st_hash_type *type;
        st_index_t num_bins;
        unsigned int entries_packed : 1;
        st_index_t num_entries : ST_INDEX_BITS - 1;
        union {
            struct {
                struct st_table_entry **bins;
                void *private_list_head[2];
            } big;
            struct {
                struct st_packed_entry *entries;
                st_index_t real_entries;
            } packed;
        } as;
    };

  40. st_table_entry
    struct st_table_entry {
        st_index_t hash;
        st_data_t key;
        st_data_t record;
        st_table_entry *next;
        struct list_node olist;
    };

  41. Within Ruby
    [Diagram: the KEY/VALUE pair for key_a, with the labels st_table and st_table_entry]

  42. Within Ruby
    [Diagram: the key/record pair for key_a, with the labels st_table bins and st_table_entry]

  43. Adding another key
    [Diagram: key/record entries for key_a and key_b, and a Hash function]

  44. Within Ruby
    [Diagram: key/record entries for key_a and key_b]

  45. If hash values collide
    [Diagram: key/record entries for key_a, key_b, and key_c, and a Hash function]

  46. Algorithms for resolving
    hash collisions

  47. Separate chaining
    When a hash function returns the same value for
    different keys, a single slot in the hash table is
    allowed to hold multiple elements.
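
As a rough illustration of the idea above, here is a minimal Ruby sketch of separate chaining. It is a toy model and not Ruby's st.c: the class name ChainedTable and its methods are hypothetical, and each bin is simply a Ruby Array of [key, value] pairs. insert returns false for a new key and true for an update, loosely mirroring the 0/1 return values of st_insert.

```ruby
# Toy hash table using separate chaining (illustrative only, not Ruby's st.c).
class ChainedTable
  def initialize(num_bins = 4)
    @bins = Array.new(num_bins) { [] }   # each bin holds a list of [key, value] pairs
  end

  # Returns true if the key already existed (value updated), false if newly added.
  def insert(key, value)
    chain = @bins[key.hash % @bins.size]
    pair = chain.find { |k, _| k.eql?(key) }
    if pair
      pair[1] = value
      true
    else
      chain << [key, value]
      false
    end
  end

  # Returns the stored value, or nil when the key is absent.
  def lookup(key)
    pair = @bins[key.hash % @bins.size].find { |k, _| k.eql?(key) }
    pair && pair[1]
  end

  # Length of the longest collision list.
  def longest_chain
    @bins.map(&:size).max
  end
end

table = ChainedTable.new(1)   # a single bin forces every key to collide
table.insert(:key_b, 2)
table.insert(:key_c, 3)
table.lookup(:key_c)          # => 3
table.longest_chain           # => 2
```

With only one bin, both keys land in the same slot, and the lookup has to walk the list to find the right pair.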

  48. Separate chaining
    [Diagram: key/record entries for key_a, key_b, and key_c, and a Hash function]

  49. Separate chaining
    [Diagram: key/record entries for key_a, key_b, and key_c]

  50. Current Ruby
    Ruby uses separate chaining.
    To see its implementation, you can look at
    the function that inserts into hashes.

  51. st_insert
    int
    st_insert(register st_table *table, register st_data_t key, st_data_t value)
    {
        st_index_t hash_val;
        register st_index_t bin_pos;
        register st_table_entry *ptr;
        hash_val = do_hash(key, table);
        if (table->entries_packed) {
            st_index_t i = find_packed_index(table, hash_val, key);
            if (i < table->real_entries) {
                PVAL_SET(table, i, value);
                return 1;
            }
            add_packed_direct(table, key, value, hash_val);
            return 0;
        }

  52. st_insert
        FIND_ENTRY(table, ptr, hash_val, bin_pos);
        if (ptr == 0) {
            add_direct(table, key, value, hash_val, bin_pos);
            return 0;
        }
        else {
            ptr->record = value;
            return 1;
        }
    }

  53. add_direct
    static inline void
    add_direct(st_table *table, st_data_t key, st_data_t value,
               st_index_t hash_val, register st_index_t bin_pos)
    {
        register st_table_entry *entry;
        if (table->num_entries > ST_DEFAULT_MAX_DENSITY * table->num_bins) {
            rehash(table);
            bin_pos = hash_pos(hash_val, table->num_bins);
        }
        entry = new_entry(table, key, value, hash_val, bin_pos);
        list_add_tail(st_head(table), &entry->olist);
        table->num_entries++;
    }
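
The density check at the top of add_direct can be modeled in a few lines of Ruby. This is only a sketch under assumptions: GrowingTable is a hypothetical class, the threshold value 5 is assumed for the example, and doubling the bin count is a simplification (the real rehash chooses its own sizes). As in add_direct, the caller is assumed to know the key is not yet in the table.

```ruby
# Toy chained table that grows when it gets too dense (names are hypothetical).
class GrowingTable
  MAX_DENSITY = 5                    # assumed threshold for this sketch

  attr_reader :num_bins, :num_entries

  def initialize(num_bins = 2)
    @num_bins = num_bins
    @bins = Array.new(num_bins) { [] }
    @num_entries = 0
  end

  # Like add_direct, this assumes the caller already knows the key is absent.
  def add(key, value)
    rehash if @num_entries > MAX_DENSITY * @num_bins
    @bins[key.hash % @num_bins] << [key, value]
    @num_entries += 1
  end

  def lookup(key)
    pair = @bins[key.hash % @num_bins].find { |k, _| k.eql?(key) }
    pair && pair[1]
  end

  private

  # Doubles the number of bins and redistributes every pair.
  # (Doubling is an assumption; the real rehash chooses its own sizes.)
  def rehash
    pairs = @bins.flatten(1)
    @num_bins *= 2
    @bins = Array.new(@num_bins) { [] }
    pairs.each { |k, v| @bins[k.hash % @num_bins] << [k, v] }
  end
end

table = GrowingTable.new(2)
12.times { |i| table.add("key_#{i}", i) }
table.num_bins      # => 4 (the 12th add saw 11 entries > 5 * 2 and rehashed first)
```

Because the bin position depends on the number of bins, every pair has to be placed again after growing, which is why add_direct recomputes bin_pos after calling rehash.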

  54. new_entry
    static inline st_table_entry *
    new_entry(st_table * table, st_data_t key, st_data_t value,
              st_index_t hash_val, register st_index_t bin_pos)
    {
        register st_table_entry *entry = st_alloc_entry();
        entry->next = table->bins[bin_pos];
        table->bins[bin_pos] = entry;
        entry->hash = hash_val;
        entry->key = key;
        entry->record = value;
        return entry;
    }
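
The two pointer assignments in new_entry put the new entry at the head of its bin's chain. The Ruby sketch below models just that step. Entry, push_entry and chain_keys are hypothetical names for this illustration, and a Ruby object reference stands in for the C pointer.

```ruby
# Toy model of a chained bin (names are hypothetical).
Entry = Struct.new(:hash_val, :key, :record, :next_entry)

# Mirrors the shape of new_entry: the new entry points at the old head,
# then becomes the head of the bin's chain.
def push_entry(bins, key, record)
  hash_val = key.hash
  bin_pos = hash_val % bins.size
  entry = Entry.new(hash_val, key, record, bins[bin_pos])
  bins[bin_pos] = entry
  entry
end

# Walks one chain, following next pointers, and returns the keys in chain order.
def chain_keys(bins, bin_pos)
  keys = []
  entry = bins[bin_pos]
  while entry
    keys << entry.key
    entry = entry.next_entry
  end
  keys
end

bins = Array.new(1)            # a single bin, so all keys collide
push_entry(bins, :key_b, 2)
push_entry(bins, :key_c, 3)
chain_keys(bins, 0)            # => [:key_c, :key_b]
```

The most recently added key comes first in the chain, and reaching key_b means following a reference from key_c. This is the pointer chasing that the proposal discussed later wants to avoid.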

  55. If hash values collide
    [Diagram: key/record entries for key_b and key_c, and a Hash function]

  56. If hash values collide
    [Diagram: key/record entries for key_b and key_c]

  57. If hash values collide
    [Diagram: key/record entries for key_b and key_c]

  58. Feature #12142
    Hash tables with open addressing
    switching to open addressing hash tables
    for access by keys.
    Removing hash collision lists lets us avoid
    *pointer chasing*, a common problem that
    produces bad data locality. I see a tendency
    to move from chaining hash tables to open
    addressing hash tables due to their better fit
    to modern CPU memory organizations.

  59. Open addressing
    A method that stores every element directly in
    the hash table itself. When the results of the
    hash function collide, the element is
    placed in another empty slot.
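
A minimal sketch of this idea in Ruby, using linear probing as the way to find "another empty slot". Linear probing is a textbook choice made for this example and is not claimed to be what the patch does. ProbingTable and SlotKey are hypothetical names; SlotKey overrides hash only so that the collision is reproducible. The table has a fixed size and does not support deletion.

```ruby
# Key with a fixed hash value so collisions are easy to reproduce (hypothetical helper).
SlotKey = Struct.new(:name, :fixed_hash) do
  def hash
    fixed_hash
  end
end

# Toy open-addressing table with linear probing (fixed size, no deletion).
class ProbingTable
  def initialize(size = 8)
    @slots = Array.new(size)       # each slot is nil or a [key, value] pair
  end

  # Returns the slot index where the pair was stored.
  def insert(key, value)
    index = probe(key)
    raise "table is full" if index.nil?
    @slots[index] = [key, value]
    index
  end

  # Returns the stored value, or nil when the key is absent.
  def lookup(key)
    index = probe(key)
    slot = index && @slots[index]
    slot && slot[1]
  end

  private

  # Starting at the home slot, steps forward until the key or an empty slot is found.
  def probe(key)
    home = key.hash % @slots.size
    @slots.size.times do |step|
      index = (home + step) % @slots.size
      slot = @slots[index]
      return index if slot.nil? || slot[0].eql?(key)
    end
    nil
  end
end

table = ProbingTable.new(4)
key_b = SlotKey.new(:key_b, 1)
key_c = SlotKey.new(:key_c, 1)     # same hash value as key_b
table.insert(key_b, 2)             # => 1 (home slot)
table.insert(key_c, 3)             # => 2 (home slot taken, next slot used)
table.lookup(key_c)                # => 3
```

All elements live in one flat array, so a lookup walks neighboring slots instead of following references to separately allocated entries.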

  60. If hash values collide
    [Diagram: key/record entries for key_b and key_c, and a Hash function]

  61. If hash values collide
    [Diagram: key/record entries for key_b and key_c, a Hash function, and a step labeled inc]

  62. If hash values collide
    [Diagram: key/record entries for key_b and key_c]

  63. Corresponding Patches
    Half a year has passed since the original
    proposal, and various refinements have
    been adopted since then. This talk is based
    on the most recently attached version, the
    st_table_with_array2 branch of funny-falcon/ruby.

  64. st_table
    struct st_table {
        const struct st_hash_type *type;
        union {
            struct st_table_entry* entries;
            st_idx_t* bins;
            uint8_t* smallbins;
            uint16_t* medbins;
        } as;
        st_idx_t num_entries;
        st_idx_t first, last;
        unsigned sz : 8;
        unsigned rebuild_num : 24;
    };

  65. st_table_entry
    struct st_table_entry {
        st_idx_t hash;
        st_idx_t next;
        st_data_t key;
        st_data_t record;
    };

  66. new st_table
    [Diagram: st_table with its bins and entries arrays]

  67. st_insert
    int
    st_insert(register st_table *table, register st_data_t key, st_data_t value)
    {
        st_idx_t hash_val, idx;
        hash_val = do_hash(key, table);
        idx = find_entry(table, key, hash_val);
        if (idx == IDX_NULL) {
            add_direct(table, key, value, hash_val);
            return 0;
        }
        else {
            table->as.entries[idx].record = value;
            return 1;
        }
    }

  68. find_entry
    static inline st_idx_t
    find_entry(const st_table *table, st_data_t key, st_idx_t hash_val)
    {
        unsigned rebuild_num ST_UNUSED = table->rebuild_num;
        if (st_sz[table->sz].nbins == 0) {
            st_idx_t idx = table->first, last = table->last;
            st_table_entry* ptr = &table->as.entries[idx];
            for (; idx < last; idx++, ptr++) {
                if (ptr->hash == hash_val && EQUAL(table, key, ptr)) {
                    return idx;
                }
            }
            st_assert(rebuild_num == table->rebuild_num);
            return IDX_NULL;

  69. find_entry
        } else {
            st_idx_t bin_pos = hash_pos(hash_val, table->sz);
            st_idx_t idx = bin_get(table, bin_pos);
            FOUND_ENTRY;
            while (PTR_NOT_EQUAL(table, idx, hash_val, key)) {
                COLLISION;
                idx = table->as.entries[idx].next;
            }
            st_assert(rebuild_num == table->rebuild_num);
            return idx;
        }
    }
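
The first branch of find_entry (taken when the table has no bins) is a plain linear scan over the entries array. A hedged Ruby sketch of that branch follows. SmallEntry and scan_small_table are hypothetical names, and nil stands in for IDX_NULL.

```ruby
SmallEntry = Struct.new(:hash_val, :key, :record)

# Scans entries[first...last] in order and returns the index of the matching
# entry, or nil (standing in for IDX_NULL) when the key is absent.
def scan_small_table(entries, first, last, key)
  hash_val = key.hash
  (first...last).each do |idx|
    entry = entries[idx]
    # Compare the stored hash first; the full key comparison runs only on a match.
    return idx if entry.hash_val == hash_val && entry.key.eql?(key)
  end
  nil
end

entries = [:key_a, :key_b, :key_c].map { |k| SmallEntry.new(k.hash, k, k.to_s) }
scan_small_table(entries, 0, entries.size, :key_b)   # => 1
scan_small_table(entries, 0, entries.size, :key_z)   # => nil
```

For a handful of entries, walking a small contiguous array like this can be cheaper than maintaining bins at all.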

  70. add_direct
    static inline void
    add_direct(st_table *table, st_data_t key, st_data_t value,
               st_idx_t hash_val)
    {
        register st_table_entry *entry;
        st_idx_t en_idx, bin_pos;
        if (table->last == st_sz[table->sz].nentries) {
            st_rehash(table);
        }

  71. add_direct
        en_idx = table->last;
        table->last++;
        entry = &table->as.entries[en_idx];
        if (st_sz[table->sz].nbins != 0) {
            bin_pos = hash_pos(hash_val, table->sz);
            entry->next = bin_get(table, bin_pos);
            bin_set(table, bin_pos, en_idx);
        }
        entry->hash = hash_val;
        entry->key = key;
        entry->record = value;
        table->num_entries++;
    }
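
To see how the bins and the entries array work together, the insert and lookup path above can be modeled in Ruby. This is a simplified sketch and not the patch itself. IndexedTable, IndexedEntry and DemoKey are hypothetical names, -1 is an assumed stand-in for IDX_NULL, and resizing and the small-table case are left out. DemoKey overrides hash only to make the collision reproducible.

```ruby
IDX_NULL = -1   # assumed sentinel for "no entry" in this sketch

# Key with a fixed hash value so collisions are easy to reproduce (hypothetical helper).
DemoKey = Struct.new(:name, :fixed_hash) do
  def hash
    fixed_hash
  end
end

IndexedEntry = Struct.new(:hash_val, :next_idx, :key, :record)

# Entries live in one array in insertion order; bins store indexes into it.
class IndexedTable
  attr_reader :bins, :entries

  def initialize(num_bins = 4)
    @bins = Array.new(num_bins, IDX_NULL)
    @entries = []
  end

  # Returns true if the key already existed (record updated), false if newly added.
  def insert(key, record)
    hash_val = key.hash
    idx = find_entry(key, hash_val)
    if idx == IDX_NULL
      bin_pos = hash_val % @bins.size
      # The new entry remembers the bin's previous index, then the bin points at it.
      @entries << IndexedEntry.new(hash_val, @bins[bin_pos], key, record)
      @bins[bin_pos] = @entries.size - 1
      false
    else
      @entries[idx].record = record
      true
    end
  end

  # Follows next indexes from the bin until the key is found or the chain ends.
  def find_entry(key, hash_val = key.hash)
    idx = @bins[hash_val % @bins.size]
    while idx != IDX_NULL
      entry = @entries[idx]
      return idx if entry.hash_val == hash_val && entry.key.eql?(key)
      idx = entry.next_idx
    end
    IDX_NULL
  end
end

table = IndexedTable.new(4)
table.insert(DemoKey.new(:key_a, 0), 1)
table.insert(DemoKey.new(:key_b, 1), 2)
table.insert(DemoKey.new(:key_c, 1), 3)   # collides with key_b in bin 1
table.bins                                # => [0, 2, -1, -1]
table.entries[2].next_idx                 # => 1
```

After the collision, the bin holds index 2 (key_c) and key_c's next index is 1 (key_b). The links are array indexes instead of pointers, and all entries sit next to each other in memory.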

  72. new st_table
    [Diagram: st_table with bins and entries; a new entry (key, value, hash) for key_a]

  73. new st_table
    [Diagram: st_table with bins and entries; the key_a entry (key, value, hash) is placed in entries]

  74. new st_table
    [Diagram: bins now holds index 0, referring to key_a in entries]

  75. new st_table
    [Diagram: bins holds index 0 for key_a; a new entry (key, value, hash) for key_b]

  76. new st_table
    [Diagram: key_b is placed in entries after key_a; bins holds index 0]

  77. new st_table
    [Diagram: bins holds indexes 0 and 1, referring to key_a and key_b in entries]

  78. new st_table
    [Diagram: bins holds indexes 0 and 1; a new entry (key, value, hash) for key_c]

  79. new st_table
    [Diagram: key_c is placed in entries after key_a and key_b; bins holds indexes 0 and 1]

  80. new st_table
    [Diagram: key_c collides with key_b, so the key_c entry records index 1 as its next]

  81. new st_table
    [Diagram: bins now holds indexes 0 and 2; the key_c entry keeps index 1 (key_b) as its next]

  82. Impressions after reading the patch
    ✤ Reading the patch description first made
    it easy to guess the contents, which
    helped me read the code.
    ✤ The patch format made the changes easy
    to follow.
    ✤ I got to know a practical implementation,
    beyond what I had learned from books.

  83. Conclusion
    By reading real code, I think you will be able to
    see the situations where ideas described in books
    are used. By reading patches, you can easily find
    the main changes you are interested in.
    I recommend trying code reading to rediscover
    the programming knowledge you have learned. How
    about giving it a try?