Learn Programming Essence from Ruby patches

takkanm
September 09, 2016


Transcript

  1. Today's Topic

    The aim of this session is to review the essence of programming, which you can read about in many programming textbooks, by reading some Ruby code and libraries.
  2. Knowledge of Programming Used in Daily Tasks

    What daily tasks require is knowledge of the following:
    ✤ Language features
    ✤ How to use libraries
    ✤ How to use frameworks
  3. Is Textbook Knowledge Unnecessary in Daily Work?

    We usually don't get a chance to use such knowledge when writing business code. But we believe textbooks still hold many more secrets, such as algorithms for writing smarter or more efficient code.
  4. Boring Textbooks

    From textbooks we can learn a lot, such as how algorithms are implemented and what their pros and cons are. But it is not easy to imagine how such knowledge applies to the code we actually write.
  5. Book "Programming Technique"

    This book teaches programming techniques by reading the source of UNIX commands. From it, I got some idea of how what I learned at school is used in actual programming.
  6. Why patch?

    ✤ The code of interest is clear
    ✤ A patch comes with plenty of explanation
    ✤ You can choose patches that suit your reading level
  7. What do you learn from a patch?

    ✤ Understanding of libraries and the Ruby implementation
    ✤ Understanding of algorithms
    ✤ Practical implementation
  8. At an Asakusa.rb Meetup

    I was taught about Ruby and patches that improve its performance. By reading them, I learned again about improvements that come from a change of algorithm.
  9. Improving Performance in Ruby

    This might remind you of advanced and esoteric work, such as improving GC algorithms, a JIT, or something like that. But some patches are not so hard to read.
  10. But You Should Learn about the Ruby Interpreter

    That said, you need to know at least a little about how Ruby works to read a Ruby patch. So, I will introduce my recommended materials.
  11. About st_table

    Ruby's core has an 'st_table' structure that is used in many places. The contents of a Ruby 'Hash' are also stored in this structure.
  12. st_table

    struct st_table {
        const struct st_hash_type *type;
        st_index_t num_bins;
        unsigned int entries_packed : 1;
        st_index_t num_entries : ST_INDEX_BITS - 1;
        union {
            struct {
                struct st_table_entry **bins;
                void *private_list_head[2];
            } big;
            struct {
                struct st_packed_entry *entries;
                st_index_t real_entries;
            } packed;
        } as;
    };
  13. Adding another key

    [Diagram. Labels: key, record, key_a, key_b, Hash function]
  14. Within Ruby

    [Diagram. Labels: key, record, key_a, key_b]
  15. If hash values conflict

    [Diagram. Labels: key, record, key_a, key_b, key_c, Hash function]
  16. Separate chaining

    When a hash function evaluates to the same value for different keys, a single address in the hash table is allowed to hold multiple elements.
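To make the idea on this slide concrete, here is a minimal sketch of separate chaining in C. It is not Ruby's st.c: the names `ch_table`, `ch_entry`, `ch_insert`, and `ch_lookup`, the fixed bin count, and the modulo hash are all assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdlib.h>

#define CH_NUM_BINS 4 /* hypothetical; tiny so that collisions are easy to trigger */

/* One element of a bin's chain (illustrative, not Ruby's st_table_entry). */
struct ch_entry {
    unsigned long key;
    long record;
    struct ch_entry *next; /* next element that landed in the same bin */
};

struct ch_table {
    struct ch_entry *bins[CH_NUM_BINS];
};

/* Hypothetical hash function: the key modulo the number of bins. */
static unsigned long ch_hash(unsigned long key)
{
    return key % CH_NUM_BINS;
}

/* Returns 1 if an existing key was updated, 0 if a new element was added,
 * -1 if memory allocation failed. */
static int ch_insert(struct ch_table *t, unsigned long key, long record)
{
    unsigned long pos;
    struct ch_entry *e;

    assert(t != NULL);
    pos = ch_hash(key);
    for (e = t->bins[pos]; e != NULL; e = e->next) {
        if (e->key == key) {
            e->record = record;
            return 1;
        }
    }
    e = malloc(sizeof(*e));
    if (e == NULL)
        return -1;
    e->key = key;
    e->record = record;
    e->next = t->bins[pos]; /* colliding elements share one bin via the chain */
    t->bins[pos] = e;
    return 0;
}

/* Returns 1 and stores the record if the key is found, 0 otherwise. */
static int ch_lookup(const struct ch_table *t, unsigned long key, long *record)
{
    const struct ch_entry *e;

    assert(t != NULL && record != NULL);
    for (e = t->bins[ch_hash(key)]; e != NULL; e = e->next) {
        if (e->key == key) {
            *record = e->record;
            return 1;
        }
    }
    return 0;
}
```

With four bins, keys 1 and 5 hash to the same bin, so inserting both leaves a two-element chain there, and a lookup has to follow the `next` pointers.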
  17. Separate chaining

    [Diagram. Labels: key, record, key_a, key_b, key_c, Hash function]
  18. Separate chaining

    [Diagram. Labels: key, record, key_a, key_b, key_c]
  19. Current Ruby

    Ruby uses separate chaining. To see its implementation, you can refer to the function that inserts into hashes.
  20. st_insert

    int
    st_insert(register st_table *table, register st_data_t key, st_data_t value)
    {
        st_index_t hash_val;
        register st_index_t bin_pos;
        register st_table_entry *ptr;

        hash_val = do_hash(key, table);

        if (table->entries_packed) {
            st_index_t i = find_packed_index(table, hash_val, key);
            if (i < table->real_entries) {
                PVAL_SET(table, i, value);
                return 1;
            }
            add_packed_direct(table, key, value, hash_val);
            return 0;
        }
  21. st_insert

        FIND_ENTRY(table, ptr, hash_val, bin_pos);

        if (ptr == 0) {
            add_direct(table, key, value, hash_val, bin_pos);
            return 0;
        }
        else {
            ptr->record = value;
            return 1;
        }
    }
  22. add_direct

    static inline void
    add_direct(st_table *table, st_data_t key, st_data_t value,
               st_index_t hash_val, register st_index_t bin_pos)
    {
        register st_table_entry *entry;

        if (table->num_entries > ST_DEFAULT_MAX_DENSITY * table->num_bins) {
            rehash(table);
            bin_pos = hash_pos(hash_val, table->num_bins);
        }

        entry = new_entry(table, key, value, hash_val, bin_pos);
        list_add_tail(st_head(table), &entry->olist);
        table->num_entries++;
    }
  23. new_entry

    static inline st_table_entry *
    new_entry(st_table * table, st_data_t key, st_data_t value,
              st_index_t hash_val, register st_index_t bin_pos)
    {
        register st_table_entry *entry = st_alloc_entry();

        entry->next = table->bins[bin_pos];
        table->bins[bin_pos] = entry;
        entry->hash = hash_val;
        entry->key = key;
        entry->record = value;

        return entry;
    }
  24. If hash values conflict

    [Diagram. Labels: key, record, key_b, key_c, Hash function]
  25. If hash values conflict

    [Diagram. Labels: key, record, key_b, key_c]
  26. If hash values conflict

    [Diagram. Labels: key, record, key_b, key_c]
  27. Feature #12142 Hash tables with open addressing

    switching to open addressing hash tables for access by keys. Removing hash collision lists lets us avoid *pointer chasing*, a common problem that produces bad data locality. I see a tendency to move from chaining hash tables to open addressing hash tables due to their better fit to modern CPU memory organizations.
  28. Open addressing

    A way to store all hash elements in the hash table itself. When the results of the hash function conflict, the element is placed in another empty slot.
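The following is a minimal sketch of open addressing in C, under stated assumptions: it uses linear probing (try the next slot, wrapping around) as the rule for finding "another empty slot", it has a fixed number of slots, and it does not support deletion. The names `oa_table`, `oa_slot`, `oa_init`, `oa_insert`, and `oa_lookup` are hypothetical and do not come from Ruby or from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define OA_SLOTS 8 /* hypothetical fixed capacity */

struct oa_slot {
    int used; /* 0 while the slot is empty */
    unsigned long key;
    long record;
};

/* All elements live directly in the slot array; there are no chains. */
struct oa_table {
    struct oa_slot slots[OA_SLOTS];
    size_t count;
};

static void oa_init(struct oa_table *t)
{
    size_t i;

    assert(t != NULL);
    for (i = 0; i < OA_SLOTS; i++)
        t->slots[i].used = 0;
    t->count = 0;
}

/* Hypothetical hash function: the key modulo the number of slots. */
static size_t oa_hash(unsigned long key)
{
    return (size_t)(key % OA_SLOTS);
}

/* Returns 1 if an existing key was updated, 0 if a new element was stored,
 * -1 if the table is full. */
static int oa_insert(struct oa_table *t, unsigned long key, long record)
{
    size_t pos, i;

    assert(t != NULL);
    pos = oa_hash(key);
    for (i = 0; i < OA_SLOTS; i++) {
        struct oa_slot *s = &t->slots[(pos + i) % OA_SLOTS];
        if (!s->used) { /* first empty slot on the probe path */
            s->used = 1;
            s->key = key;
            s->record = record;
            t->count++;
            return 0;
        }
        if (s->key == key) {
            s->record = record;
            return 1;
        }
    }
    return -1;
}

/* Returns 1 and stores the record if the key is found, 0 otherwise. */
static int oa_lookup(const struct oa_table *t, unsigned long key, long *record)
{
    size_t pos, i;

    assert(t != NULL && record != NULL);
    pos = oa_hash(key);
    for (i = 0; i < OA_SLOTS; i++) {
        const struct oa_slot *s = &t->slots[(pos + i) % OA_SLOTS];
        if (!s->used) /* an empty slot ends the probe path */
            return 0;
        if (s->key == key) {
            *record = s->record;
            return 1;
        }
    }
    return 0;
}
```

With eight slots, keys 3 and 11 both hash to slot 3. The second one is placed in slot 4, and a lookup walks the same path through adjacent memory instead of following pointers.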
  29. If hash values conflict

    [Diagram. Labels: key, record, key_b, key_c, Hash function]
  30. If hash values conflict

    [Diagram. Labels: key, record, key_b, key_c, Hash function, inc]
  31. If hash values conflict

    [Diagram. Labels: key, record, key_b, key_c]
  32. Corresponding Patches

    Half a year has passed since the original proposal, and various refinements have been adopted since. This talk is based on the latest attached st_table_with_array2 branch of funny-falcon/ruby.
  33. st_table

    struct st_table {
        const struct st_hash_type *type;
        union {
            struct st_table_entry* entries;
            st_idx_t* bins;
            uint8_t* smallbins;
            uint16_t* medbins;
        } as;
        st_idx_t num_entries;
        st_idx_t first, last;
        unsigned sz : 8;
        unsigned rebuild_num : 24;
    };
  34. new st_table

    [Diagram. Labels: st_table, bins, entries]
  35. st_insert

    int
    st_insert(register st_table *table, register st_data_t key, st_data_t value)
    {
        st_idx_t hash_val, idx;

        hash_val = do_hash(key, table);
        idx = find_entry(table, key, hash_val);

        if (idx == IDX_NULL) {
            add_direct(table, key, value, hash_val);
            return 0;
        }
        else {
            table->as.entries[idx].record = value;
            return 1;
        }
    }
  36. find_entry

    static inline st_idx_t
    find_entry(const st_table *table, st_data_t key, st_idx_t hash_val)
    {
        unsigned rebuild_num ST_UNUSED = table->rebuild_num;
        if (st_sz[table->sz].nbins == 0) {
            st_idx_t idx = table->first, last = table->last;
            st_table_entry* ptr = &table->as.entries[idx];
            for (; idx < last; idx++, ptr++) {
                if (ptr->hash == hash_val && EQUAL(table, key, ptr)) {
                    return idx;
                }
            }
            st_assert(rebuild_num == table->rebuild_num);
            return IDX_NULL;
  37. find_entry

        } else {
            st_idx_t bin_pos = hash_pos(hash_val, table->sz);
            st_idx_t idx = bin_get(table, bin_pos);
            FOUND_ENTRY;
            while (PTR_NOT_EQUAL(table, idx, hash_val, key)) {
                COLLISION;
                idx = table->as.entries[idx].next;
            }
            st_assert(rebuild_num == table->rebuild_num);
            return idx;
        }
    }
  38. add_direct

    static inline void
    add_direct(st_table *table, st_data_t key, st_data_t value,
               st_idx_t hash_val)
    {
        register st_table_entry *entry;
        st_idx_t en_idx, bin_pos;

        if (table->last == st_sz[table->sz].nentries) {
            st_rehash(table);
        }
  39. add_direct

        en_idx = table->last;
        table->last++;
        entry = &table->as.entries[en_idx];
        if (st_sz[table->sz].nbins != 0) {
            bin_pos = hash_pos(hash_val, table->sz);
            entry->next = bin_get(table, bin_pos);
            bin_set(table, bin_pos, en_idx);
        }
        entry->hash = hash_val;
        entry->key = key;
        entry->record = value;
        table->num_entries++;
    }
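The layout in the patch code above can be sketched in a much-reduced form: entries are appended to one contiguous array in insertion order, and the bins hold indices into that array rather than pointers. The sketch below is only an illustration under assumptions. The names `ix_table`, `ix_entry`, `ix_init`, `ix_find`, and `ix_insert` are hypothetical, the sizes are fixed, the key is used directly as its hash value, and a full table returns an error where the patch calls `st_rehash`.

```c
#include <assert.h>
#include <stddef.h>

#define IX_NULL ((unsigned)-1) /* plays the role of IDX_NULL: "no entry" */
#define IX_NBINS 4             /* hypothetical fixed sizes */
#define IX_NENTRIES 8

struct ix_entry {
    unsigned long hash;
    unsigned long key;
    long record;
    unsigned next; /* index of the next entry in the same bin, or IX_NULL */
};

struct ix_table {
    struct ix_entry entries[IX_NENTRIES]; /* filled in insertion order */
    unsigned bins[IX_NBINS];              /* index of the newest entry per bin */
    unsigned last;                        /* next free position in entries */
};

static void ix_init(struct ix_table *t)
{
    size_t i;

    assert(t != NULL);
    for (i = 0; i < IX_NBINS; i++)
        t->bins[i] = IX_NULL;
    t->last = 0;
}

/* Returns the index of the entry holding key, or IX_NULL if absent. */
static unsigned ix_find(const struct ix_table *t, unsigned long key)
{
    unsigned idx;

    assert(t != NULL);
    idx = t->bins[key % IX_NBINS];
    while (idx != IX_NULL && t->entries[idx].key != key)
        idx = t->entries[idx].next; /* follow indices, not pointers */
    return idx;
}

/* Returns 1 if an existing key was updated, 0 if a new entry was appended,
 * -1 if the entries array is full. */
static int ix_insert(struct ix_table *t, unsigned long key, long record)
{
    unsigned idx, bin;

    assert(t != NULL);
    idx = ix_find(t, key);
    if (idx != IX_NULL) {
        t->entries[idx].record = record;
        return 1;
    }
    if (t->last == IX_NENTRIES)
        return -1;
    idx = t->last++;
    bin = (unsigned)(key % IX_NBINS);
    t->entries[idx].hash = key; /* assumption: identity hash */
    t->entries[idx].key = key;
    t->entries[idx].record = record;
    t->entries[idx].next = t->bins[bin];
    t->bins[bin] = idx;
    return 0;
}
```

After inserting keys 2, 6, and 3 in that order, `entries[0..2]` hold them in insertion order, bin 2 points at index 1 (key 6), and that entry's `next` is index 0 (key 2).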
  40. new st_table

    [Diagram. Labels: st_table, bins, entries, key, value, hash, key_a]
  41. new st_table

    [Diagram. Labels: st_table, bins, entries, key, value, hash, key_a]
  42. new st_table

    [Diagram. Labels: st_table, bins, entries, key, value, hash, key_a]
  43. new st_table

    [Diagram. Labels: st_table, bins, entries, key, value, hash, key_a, key_b]
  44. new st_table

    [Diagram. Labels: st_table, bins, entries, key, value, hash, key_a, key_b]
  45. new st_table

    [Diagram. Labels: st_table, bins, entries, key, value, hash, key_a, key_b]
  46. new st_table

    [Diagram. Labels: st_table, bins, entries, key, value, hash, key_a, key_b, key_c]
  47. new st_table

    [Diagram. Labels: st_table, bins, entries, key, value, hash, key_a, key_b, key_c]
  48. new st_table

    [Diagram. Labels: st_table, bins, entries, key, value, hash, key_a, key_b, key_c]
  49. new st_table

    [Diagram. Labels: st_table, bins, entries, key, value, hash, key_a, key_b, key_c]
  50. Feelings after reading the patch

    ✤ Reading the patch's description made it easy to guess the contents, which helped me read the code.
    ✤ The patch format made the changes easy to follow.
    ✤ I got to know a practical implementation, beyond what I had learned from books.
  51. Conclusion

    By reading real code, we suppose you will be able to tell the scenarios where ideas described in books are used. By reading patches, I think you can easily tell the main changes you are interested in. I recommend that you try code reading to rediscover the programming knowledge you have learned. How about that?