GC in Ruby 2.2

Zete
December 17, 2014

Slides of my local tech share in Beijing, 2014-12-06





  1. GC in Ruby 2.2 zete@odigo.jp

  2. Memory Management: Application View
    • stack
    • alloca
    • heap
    • tied to stack scope: RAII, auto_ptr, __attribute__(destructor), …
    • manual with help: arena, buddy memory, memory pool, …
    • reference counted (shared_ptr, regexp, IO)
    • GC
  3. Memory Management: Operating System View
    • Virtual memory
    • Segment: a segmentation fault is a serious error, although segmentation itself is rarely used now
    • Page: a page fault is not an error, but may trigger very slow disk access
    • Translations are cached in the TLB; the page-table base is held in CR3 (movl cr3, eax)
    • IPC memory
    • Pipes
    • Process shared memory
    • With regard to IPC memory management, raptor uses mbuf and many other techniques to avoid copying
  4. Memory Management: Hardware View
    • CPU talks to memory through the Address Bus and Data Bus; the bus clock cycle is several times slower than the CPU clock
    • SDRAM and RDRAM are high bandwidth (throughput), high latency (100+ CPU cycles)
    • L1, L2, L3 caches: 90% of memory accesses go through cache
    • Multi-way cache lines: the more “ways”, the more precise and the more complicated the circuit
    • DMA (direct memory access) mode: devices read from or write to memory directly
    • Memory fences: LoadLoad, LoadStore, StoreLoad, StoreStore, volatile (rb_gc_guarded_ptr_val)
  5. Simple to use, hard to implement

  6. Implementation Considerations
    • CPU interruptions (Boehm GC page-fault)
    • Locality (heap allocations)
    • Predicting performance (G1GC -XX:MaxGCPauseMillis)
    • Debugging (how to debug a segfault in GC?)
    • Pointer compression (Jikes VM, LLVM compression on linked lists)
    • Language features (Erlang and Haskell take advantage of immutability)
    • Internals of C APIs (tcmalloc, jemalloc, … which to use?)
    • OS APIs (mmap)
    • (Disable) Compiler optimisations (volatile)
    • CPU arch (memory fences to ensure execution order)
  7. Many GCs

    GC               conservative   mark sweep   generational
    CMS (Java)       N              Y            Y
    G1GC (Java 7)    N              Y            Infinite
    CPython          N              N            Y
    Rubinius         N              Y            Y
    Lua              N              Y            N
    Go               Y              Y            N
    Boehm GC         Y              Y            N

  8. CRuby GC
    • Conservative
    • Bit marking
    • Lazy sweep
    • Generational
    • Incremental marking
  9. Implementation Choices
    • Ruby is not fast, but it is easy to optimize with C-ext; a conservative GC makes C-ext easier to write
    • Thanks to the GIL, the GC does not need locks or spinlocks yet
    • Cross-architecture requirements and code simplicity
    • GC provides tools for C-ext use
  10. Experiments for GC
    • Rubinius GC
    • MRuby GC (root_scan, incremental_mark, incremental_sweep)
  11. Parallel GC
    • Many threads mark and sweep
    • Java 6: (CMS) concurrent mark and sweep is in fact a parallel GC… (-XX:+CMSIncrementalMode)
  13. Concurrent GC
    • Low to zero stop time
    • Usually achieved by incremental mark/sweep or a separate GC thread
    • No STW (stop-the-world)
  15. How CRuby Achieves “Concurrent”
    • Trade throughput (~10%) and code complexity for reduced pause time
    • one mark -> one sweep
    • one mark -> many sweeps (lazy sweep)
    • many marks -> many sweeps (tri-color marking)
  17. Generational
    • Based on a heuristic: young objects die young
    • A conservative GC cannot use semispace or mark-compact collection: efficient pointer rewriting is hard to make portable across platforms, and hard on C-ext
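
The split between young and old collections can be observed from Ruby itself. The following is a minimal sketch, assuming CRuby 2.2 or later, where `GC.stat` exposes `:minor_gc_count` and `:major_gc_count` and `GC.start` accepts `full_mark:`. The helper `gc_counts` is a name made up for this example.

```ruby
# Compare minor (young-generation) and major (full) collections.
# gc_counts is a hypothetical helper, not part of Ruby.
def gc_counts
  stat = GC.stat
  { total: stat[:count], minor: stat[:minor_gc_count], major: stat[:major_gc_count] }
end

before = gc_counts
50_000.times { "short-lived #{rand}" }   # create young garbage
GC.start(full_mark: false)               # request a minor GC (the VM may still escalate it)
middle = gc_counts
GC.start                                 # full_mark defaults to true: a major GC
after = gc_counts

puts "minor GCs since start: #{after[:minor] - before[:minor]}"
puts "major GCs since start: #{after[:major] - before[:major]}"
```

The exact numbers depend on the Ruby version and heap state, so none are shown here.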
  18. Concurrent and Generational
    • Problem 1: object changed during sweeping
    • Solution: write barriers
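
Why a generational collector needs a write barrier can be shown with a toy model. This is a sketch written for illustration, not CRuby's implementation: `ToyObject`, `ToyHeap`, `write` and `minor_mark` are all hypothetical names. A minor GC does not traverse old objects, so a young object referenced only from an old one is found only if the barrier recorded that old object in a remembered set.

```ruby
require "set"

# Toy model only; all class and method names are hypothetical.
class ToyObject
  attr_reader :name, :refs
  attr_accessor :old

  def initialize(name, old: false)
    @name = name
    @refs = []
    @old = old
  end
end

class ToyHeap
  def initialize
    @remembered = Set.new
  end

  # Write barrier: every reference store is supposed to go through here.
  # An old object that starts pointing at a young one is remembered.
  def write(parent, child)
    parent.refs << child
    @remembered << parent if parent.old && !child.old
  end

  # Minor marking: start from young roots plus the children of remembered
  # old objects, and never traverse into old objects.
  def minor_mark(roots)
    marked = Set.new
    stack = roots.reject(&:old) + @remembered.flat_map(&:refs)
    until stack.empty?
      obj = stack.pop
      next if obj.old || marked.include?(obj)
      marked << obj
      stack.concat(obj.refs)
    end
    marked
  end
end

heap = ToyHeap.new
old_a = ToyObject.new("old-a", old: true)
old_b = ToyObject.new("old-b", old: true)
young_a = ToyObject.new("young-a")
young_b = ToyObject.new("young-b")

heap.write(old_a, young_a)   # barrier fires, old_a is remembered
old_b.refs << young_b        # barrier bypassed, old_b is not remembered

survivors = heap.minor_mark([old_a, old_b]).map(&:name)
puts survivors.inspect       # young-b is missing although it is reachable
```

In this model `young-b` would be swept while still referenced, which is the kind of bug the barrier prevents.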
  19. Inserting Write Barriers
    • WB_OBJ
    • most macros are covered

  20. Concurrent and Generational
    • Problem 2: marked object changed
    • Solution: tri-color invariant
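
The tri-color invariant says that a black (fully scanned) object must never point at a white (unvisited) one. The sketch below is a toy incremental marker under that assumption. `Node`, `ToyMarker`, `shade`, `step` and `write` are hypothetical names, and the barrier used here simply shades the new child grey, which may differ in detail from what CRuby does.

```ruby
# Toy incremental marker; every name here is hypothetical.
class Node
  attr_reader :name, :children
  attr_accessor :color

  def initialize(name)
    @name = name
    @children = []
    @color = :white
  end
end

class ToyMarker
  def initialize(roots)
    @grey = []
    roots.each { |root| shade(root) }
  end

  # white -> grey: the object is known to be live but not yet scanned
  def shade(obj)
    return unless obj.color == :white
    obj.color = :grey
    @grey << obj
  end

  # One marking increment: scan at most `budget` grey objects.
  # Returns true once marking is complete.
  def step(budget)
    budget.times do
      obj = @grey.pop
      break if obj.nil?
      obj.children.each { |child| shade(child) }
      obj.color = :black
    end
    @grey.empty?
  end

  # Write barrier: keeps "no black object points at a white object".
  def write(parent, child)
    parent.children << child
    shade(child) if parent.color == :black
  end
end

root = Node.new("root")
a = Node.new("a")
root.children << a            # built before marking starts

marker = ToyMarker.new([root])
marker.step(1)                # root is now black, a is grey

late = Node.new("late")
marker.write(root, late)      # barrier shades late grey
unsafe = Node.new("unsafe")
root.children << unsafe       # barrier bypassed

marker.step(10)               # finish marking
colors = [root, a, late, unsafe].map { |n| [n.name, n.color] }.to_h
puts colors.inspect
```

`unsafe` stays white even though `root` references it, so a sweep would free a live object. The store that went through the barrier is marked correctly.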
  21. Other Optimizations: Bit Marking
    • Bit marking was already in place to be copy-on-write friendly
    • To represent colors, 2 bits per object are used (the result is 4 bits)
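
Keeping mark state in a bitmap outside the objects means that marking does not write to the objects' own memory pages, which is what makes it friendly to copy-on-write after `fork`. The sketch below packs a 2-bit color per slot into one Integer. It is an illustration only: `ColorBitmap` is a hypothetical class, and the layout is an assumption rather than CRuby's real bitmap layout.

```ruby
# Toy side-table storing a 2-bit color per heap slot.
# ColorBitmap and its encoding are assumptions made for this example.
class ColorBitmap
  COLORS = { white: 0b00, grey: 0b01, black: 0b10 }.freeze
  NAMES = COLORS.invert.freeze

  def initialize
    @bits = 0   # arbitrary-precision Integer used as a bit array
  end

  def []=(slot, color)
    shift = slot * 2
    cleared = @bits & ~(0b11 << shift)            # zero the slot's two bits
    @bits = cleared | (COLORS.fetch(color) << shift)
  end

  def [](slot)
    NAMES.fetch((@bits >> (slot * 2)) & 0b11)
  end
end

bitmap = ColorBitmap.new
bitmap[0] = :black
bitmap[3] = :grey
puts [bitmap[0], bitmap[1], bitmap[3]].inspect

bitmap[0] = :white            # reset, as a sweep would do for the next cycle
puts bitmap[0].inspect
```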
  22. When Does GC Run?
    • new object
    • rb_gc_malloc()
    • method returned
    • rb_gc_start()
    • GC.stress
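
`GC.stress` is the easiest of these triggers to observe from Ruby. A minimal sketch, assuming CRuby, where setting `GC.stress = true` makes the collector run at every opportunity (it is a debugging aid and very slow):

```ruby
runs_before = GC.count
begin
  GC.stress = true            # run GC at every allocation opportunity
  5.times { Object.new }
ensure
  GC.stress = false           # always switch it back off
end
stress_runs = GC.count - runs_before

puts "GC ran #{stress_runs} times while allocating 5 objects under GC.stress"
```

The exact count varies by Ruby version, so it is printed rather than stated here.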
  23. Ways to Control GC
    • OOB GC (out-of-band GC) in unicorn, passenger
    • GC.stress
    • GC.disable … GC.start
    • rb_gc_mark() rb_gc_register()
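
The out-of-band idea can be sketched in plain Ruby with `GC.disable`, `GC.enable` and `GC.start`: keep the collector off while handling a request and collect between requests. This is a toy loop, not how unicorn or passenger implement it, and `handle_request` is a hypothetical stand-in for application work.

```ruby
# handle_request is a made-up placeholder for real application work.
def handle_request(n)
  Array.new(1_000) { |i| "req-#{n}-#{i}" }.size
end

results = []
3.times do |n|
  GC.disable                     # no collection should start during the request
  results << handle_request(n)
  GC.enable
  GC.start                       # collect between requests, out of band
end

puts results.inspect
```

Disabling GC lets the heap grow unchecked during the request, so real servers usually bound this by request count or memory use.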
  24. Performance Tools
    • gdb/lldb
    • rbtrace
    • tmm1/stackprof
    • ko1/gc_tracer
      • require 'gc_tracer'
      • GC::Tracer.start_logging("log.txt")
  25. Useful Methods
    • GC.latest_gc_info
    • GC.stat
    • GC::INTERNAL_CONSTANTS
    • "string".freeze
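
A short sketch of the methods listed above, assuming CRuby 2.2 or later. The hash keys used (`:gc_by`, `:count`, `:heap_live_slots`) are the ones CRuby reports in these versions; other implementations may differ, and the contents of `GC::INTERNAL_CONSTANTS` change between releases.

```ruby
GC.start
info = GC.latest_gc_info          # Hash describing the most recent collection
stat = GC.stat                    # Hash of cumulative counters

puts "last GC triggered by: #{info[:gc_by].inspect}"
puts "total GC runs:        #{stat[:count]}"
puts "live heap slots:      #{stat[:heap_live_slots]}"

# Build-time constants of the collector (CRuby specific).
if defined?(GC::INTERNAL_CONSTANTS)
  GC::INTERNAL_CONSTANTS.each { |name, value| puts "#{name}: #{value}" }
end

# A frozen string literal can be shared instead of allocated on every call,
# which reduces the number of objects the GC has to track.
frozen = "string".freeze
puts frozen.frozen?
```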

  26. References
    • http://www.atdot.net/~ko1/activities/#idx4
    • https://speakerdeck.com/samsaffron/why-ruby-2-dot-1-excites-me
    • https://speakerdeck.com/pat_shaughnessy/visualizing-garbage-collection-in-rubinius-jruby-and-ruby-2-dot-0
    • 徹底解剖「G1GC」
    • github’s ruby