Garbage Collection as a Joint Venture

A common problem arising from embedding a virtual machine as a component in a larger software system is the management of references between the VM’s managed heap and the embedder’s heap. Direct references between components allow fast communication, but can result in cycles over component boundaries, making these hybrid systems prone to memory leaks. In this paper we present a design and implementation of a tracing mechanism for effective and efficient garbage collection over component boundaries which is implemented and shipped in Chrome’s JavaScript virtual machine V8 and the Blink rendering engine. Tracing through the C++ heap of Blink poses several challenges on which we elaborate: 1) an abstract visitation mechanism for C++ objects that can be used by the V8 garbage collector, 2) write barriers for C++ objects to reduce pauses through incremental marking, and 3) a mechanism to verify correctness of write barrier usage.

Presented at MoreVMs workshop, Brussels, Belgium, 2017.
https://2017.programming-conference.org/event/morevms-2017-papers-garbage-collection-as-a-joint-venture

Michael Lippautz

March 06, 2018

Transcript

  1. Garbage Collection as a Joint Venture. Michael Lippautz, Ulan Degenbaev, Jochen Eisinger, Kentaro Hara, Marcel Hlopko, Hannes Payer. Google Inc. MoreVMs Workshop, Brussels, 2017
  2. Contents • The Web, JavaScript, and Chrome • The Cycle Problem • Cross-Component Garbage Collection • Tracing from JS to C++ and back • Incremental tracing: Write barriers • Correctness • Benchmarking / Results
  3. Contents • The Web, JavaScript, and Chrome • The Cycle Problem • Cross-Component Garbage Collection • Tracing from JS to C++ and back • Incremental tracing: Write barriers • Correctness • Benchmarking / Results • SPOILER: Enabled in Chrome M57
  4. Web - JavaScript - Chrome • Document Object Model (DOM) • Cross-platform language-independent representation of HTML • Web Interface Definition Language (IDL) • Traditionally encoded as objects in the rendering engine (e.g. Blink)
  5. Web - JavaScript - Chrome <script type="text/javascript"> var newDiv; function createDiv() { newDiv = document.createElement("div"); document.body.appendChild(newDiv); } window.onload = createDiv; </script>
  6. Web - JavaScript - Chrome [Diagram: the Chrome browser composed of components such as V8 (JS VM), Blink (Renderer), and others] Idea: Abstraction of the browser into components (and most of the time subcomponents) at runtime
  7. Web - JavaScript - Chrome [Diagram: V8 side with JSObject (newDiv), JSObject (document), JSObject (body), String "div"; Embedder / Chrome / Blink side with HTMLDocument, HTMLBodyElement, HTMLDivElement; the two sides connected through Bindings]
  8. Bindings: Gluing together V8 and Blink. V8: • Managed heap for JavaScript objects • Dynamic object layout (shape changes) • Precise GC • Mark-Sweep-Compact Collector ◦ Incremental marking ◦ Concurrent sweeping ◦ Parallel compaction. Blink: • Managed heap for (most of) the DOM • Static object layout • Precise and conservative GC (until Q1’16 ref counting) • Stop-the-world Mark-Sweep(-Compact) Collector ◦ Incremental sweeping ◦ (Compaction on backing stores of collections)
  9. Bindings: Gluing together V8 and Blink. V8: • Managed heap for JavaScript objects • Dynamic object layout (shape changes) • Precise GC • Mark-Sweep-Compact Collector ◦ Incremental marking ◦ Concurrent sweeping ◦ Parallel compaction. Blink: • Managed heap for (most of) the DOM • Static object layout • Precise and conservative GC (until Q1’16 ref counting) • Stop-the-world Mark-Sweep(-Compact) Collector ◦ Incremental sweeping ◦ (Compaction on backing stores of collections). Two very different systems communicating with each other!
  10. The Cycle Problem [Diagram: V8 side with JSObject (newDiv), JSObject (document), JSObject (body), String "div"; Embedder / Chrome / Blink side with HTMLDocument, HTMLBodyElement, HTMLDivElement; the two sides connected through Bindings]
  11. The Cycle Problem • Manual breaking of cycles using weak references ◦ Not possible when cross-component reference requires object to stay alive. Observations: • Similar to cycle problem when using reference counts ◦ Lots of algorithms and literature on how to deal with cycles • Cross-component information required to break cycles automatically
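
The comparison with reference counting can be illustrated with standard C++ smart pointers. This is a minimal sketch, not Blink or V8 code: `DomNode`, `Wrapper`, and `LiveNodesAfterDrop` are hypothetical names. Two objects that hold each other strongly are never freed, while a weak back reference avoids the cycle, which only helps when the back reference is not required to keep its target alive.

```cpp
#include <memory>

struct Wrapper;

// Hypothetical stand-in for a DOM-side object; counts live instances.
struct DomNode {
  std::shared_ptr<Wrapper> wrapper;  // strong reference to the wrapper
  static int live;
  DomNode() { ++live; }
  ~DomNode() { --live; }
};
int DomNode::live = 0;

// Hypothetical stand-in for the object on the other side of the boundary.
struct Wrapper {
  std::shared_ptr<DomNode> strong_node;  // strong back reference: forms a cycle
  std::weak_ptr<DomNode> weak_node;      // weak back reference: no cycle
};

// Builds a node/wrapper pair, drops all outside references, and returns
// how many DomNode instances from this call are still alive afterwards.
int LiveNodesAfterDrop(bool use_weak_back_reference) {
  const int before = DomNode::live;
  std::weak_ptr<DomNode> observer;
  {
    auto node = std::make_shared<DomNode>();
    auto wrapper = std::make_shared<Wrapper>();
    node->wrapper = wrapper;
    if (use_weak_back_reference) {
      wrapper->weak_node = node;
    } else {
      wrapper->strong_node = node;
    }
    observer = node;
  }
  const int still_alive = DomNode::live - before;
  // Break any remaining cycle by hand so the sketch itself does not leak.
  if (auto node = observer.lock()) node->wrapper.reset();
  return still_alive;
}
```

With a strong back reference the pair survives although nothing outside refers to it; with a weak back reference it is reclaimed as soon as the last outside reference goes away.
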
  12. Cross-Component Garbage Collection [Diagram: components A and B] It is sufficient to provide enough information to break one link of the cycle
  13. Tracing across component boundaries. trace :: Object -> [Object] Return all objects that are transitively reachable from a given object
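
The `trace` signature above can be sketched as a plain worklist traversal. This is an illustration only: `Object` and `Trace` are hypothetical names, the start object is included in the result, and nothing here models how V8 or Blink lay out objects.

```cpp
#include <unordered_set>
#include <vector>

// Hypothetical object with outgoing references; an edge may point into
// another component's heap.
struct Object {
  std::vector<Object*> references;
};

// Returns every object transitively reachable from `root`, including
// `root` itself. Each object is visited once, so cycles terminate.
std::vector<Object*> Trace(Object* root) {
  std::vector<Object*> reachable;
  if (root == nullptr) return reachable;
  std::unordered_set<Object*> visited;
  std::vector<Object*> worklist{root};
  while (!worklist.empty()) {
    Object* current = worklist.back();
    worklist.pop_back();
    if (!visited.insert(current).second) continue;  // already seen
    reachable.push_back(current);
    for (Object* next : current->references) {
      if (next != nullptr) worklist.push_back(next);
    }
  }
  return reachable;
}
```
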
  14. V8: Tracing from JavaScript to C++ and back • Batch-based communication • Control over when to start tracing. V8 exposes an interface that embedders need to implement: • RegisterReferences([Object]): Allows communicating objects that are entry points into the embedder’s heap • RegisterExternallyReferencedObject(Object): Communicate that Object is reachable through some entry point object • AdvanceTracing(deadline, force_completion): Trace objects on the embedder’s heap
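
One way to picture how the three calls fit together is the toy model below. The method names are taken from the slide; everything else is an assumption made for illustration, including the types `JsObject`, `EmbedderObject`, `ToyVm`, and `ToyEmbedderTracer`, and the use of an object budget in place of the slide's deadline. It is not the actual V8 API.

```cpp
#include <deque>
#include <vector>

struct JsObject { bool marked = false; };  // hypothetical VM-side object

// Hypothetical embedder-side object.
struct EmbedderObject {
  std::vector<EmbedderObject*> children;  // edges within the embedder heap
  JsObject* js_reference = nullptr;       // edge back into the VM heap
  bool marked = false;
};

// Toy VM: the embedder reports VM objects it found reachable.
class ToyVm {
 public:
  void RegisterExternallyReferencedObject(JsObject* object) {
    object->marked = true;
  }
};

class ToyEmbedderTracer {
 public:
  explicit ToyEmbedderTracer(ToyVm* vm) : vm_(vm) {}

  // VM -> embedder: a batch of entry points into the embedder's heap.
  void RegisterReferences(const std::vector<EmbedderObject*>& batch) {
    for (EmbedderObject* object : batch) worklist_.push_back(object);
  }

  // Processes up to `budget` worklist entries (assumed stand-in for a
  // deadline), or everything if `force_completion` is set.
  // Returns true if more work remains.
  bool AdvanceTracing(int budget, bool force_completion) {
    while (!worklist_.empty() && (force_completion || budget-- > 0)) {
      EmbedderObject* current = worklist_.front();
      worklist_.pop_front();
      if (current->marked) continue;
      current->marked = true;
      if (current->js_reference != nullptr) {
        vm_->RegisterExternallyReferencedObject(current->js_reference);
      }
      for (EmbedderObject* child : current->children) {
        worklist_.push_back(child);
      }
    }
    return !worklist_.empty();
  }

 private:
  ToyVm* vm_;
  std::deque<EmbedderObject*> worklist_;
};
```

In this model the VM decides when to call `AdvanceTracing`, which matches the slide's point about keeping control over when tracing starts, and references travel in batches rather than one call per edge.
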
  15. Interlude: Incremental Marking • Main thread gets interrupted and marks n objects until all live objects are marked • Ingredients: ◦ One marking deque ◦ Three colors ▪ White: unknown ▪ Grey: on marking deque and will be processed ▪ Black: marked ◦ Barriers ensuring consistency ◦ A driver that ensures marking progress [Timeline: JS execution interleaved with Mark steps, ending with Finalize]
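
The ingredients listed above (one marking deque, three colors, a step that marks n objects) can be put together in a small sketch. `Node`, `Color`, and `IncrementalMarker` are hypothetical names, and the driver and barriers are left out; the mutator is assumed not to change the graph between steps.

```cpp
#include <deque>
#include <vector>

enum class Color { kWhite, kGrey, kBlack };

// Hypothetical heap object.
struct Node {
  std::vector<Node*> fields;
  Color color = Color::kWhite;  // white: not yet seen by the marker
};

class IncrementalMarker {
 public:
  void AddRoot(Node* root) { Shade(root); }

  // One marking step: processes at most `n` grey objects.
  // Returns true once the marking deque is empty.
  bool Step(int n) {
    while (n-- > 0 && !deque_.empty()) {
      Node* current = deque_.front();
      deque_.pop_front();
      for (Node* field : current->fields) Shade(field);
      current->color = Color::kBlack;  // all fields have been visited
    }
    return deque_.empty();
  }

 private:
  // White -> grey: the object is queued and will be processed later.
  void Shade(Node* node) {
    if (node == nullptr || node->color != Color::kWhite) return;
    node->color = Color::kGrey;
    deque_.push_back(node);
  }

  std::deque<Node*> deque_;
};
```

Between two calls to `Step` the program keeps running, which is why the barriers mentioned on the slide are needed once the graph can change during marking.
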
  16. Blink: Write Barrier • Preserving strong tri-color invariant ◦ No black to white edges • Dijkstra-style write barrier: write(Object source, Object value): if (is_tracing && is_marked(source) && !is_marked(value)) { mark(value); marking_deque.push(value); }
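
A C++ rendering of the barrier pseudocode might look as follows. It is a sketch under assumptions: `Cell`, `MarkingState`, and `Write` are hypothetical names, each object has a single field, and the store itself is folded into the same function for brevity.

```cpp
#include <deque>

// Hypothetical object with one reference field and a mark bit.
struct Cell {
  Cell* field = nullptr;
  bool marked = false;
};

// Hypothetical marker state shared with the barrier.
struct MarkingState {
  bool is_tracing = false;
  std::deque<Cell*> marking_deque;
};

// Stores `value` into `source->field`, then applies the barrier: if
// marking is active and a marked source now points to an unmarked value,
// the value is marked and queued so the marker will still visit it.
void Write(MarkingState& state, Cell* source, Cell* value) {
  source->field = value;
  if (value == nullptr) return;
  if (state.is_tracing && source->marked && !value->marked) {
    value->marked = true;
    state.marking_deque.push_back(value);
  }
}
```

The check on `source->marked` mirrors the slide: only stores into already marked objects can create a black to white edge, so other stores skip the extra work.
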
  17. Correctness: C++ Write Barriers. Key problem: Not emitting the write barrier • Idea: Rely on C++ type system to enforce the barriers • GC pointers: GC-aware smart pointers ◦ Back pointer to object header ◦ Emit barrier on write, (copy) construction, move • Restrict tracing methods to accepting only those GC pointers ◦ Result: Compile failure if raw pointers are used. Successfully annotated ~250 fields in the DOM that required a barrier
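
The type-system idea can be sketched with a small smart pointer and a visitor that refuses raw pointers. All names here (`GcObject`, `GcMember`, `Visitor`, `IsTraceable`, `g_barrier_count`) are hypothetical and do not reproduce Blink's classes; the barrier is reduced to a counter, and the back pointer to the object header is omitted.

```cpp
#include <type_traits>
#include <utility>

struct GcObject {};  // hypothetical base of garbage-collected objects

int g_barrier_count = 0;  // stand-in for the real barrier's work
void WriteBarrier(GcObject* value) {
  if (value != nullptr) ++g_barrier_count;
}

// GC-aware pointer: every way of storing a pointer runs the barrier.
template <typename T>
class GcMember {
 public:
  GcMember() = default;
  explicit GcMember(T* raw) : raw_(raw) { WriteBarrier(raw_); }
  GcMember(const GcMember& other) : raw_(other.raw_) { WriteBarrier(raw_); }
  GcMember(GcMember&& other) : raw_(other.raw_) {
    other.raw_ = nullptr;
    WriteBarrier(raw_);
  }
  GcMember& operator=(const GcMember& other) {
    raw_ = other.raw_;
    WriteBarrier(raw_);
    return *this;
  }
  GcMember& operator=(T* raw) {
    raw_ = raw;
    WriteBarrier(raw_);
    return *this;
  }
  T* get() const { return raw_; }

 private:
  T* raw_ = nullptr;
};

// Tracing accepts GcMember only; the raw-pointer overload is deleted, so
// tracing a raw pointer field fails to compile.
class Visitor {
 public:
  template <typename T>
  void Trace(const GcMember<T>& member) {
    if (member.get() != nullptr) ++traced_;
  }
  template <typename T>
  void Trace(T*) = delete;
  int traced() const { return traced_; }

 private:
  int traced_ = 0;
};

// Compile-time probe: is `visitor.Trace(arg)` well-formed for this type?
template <typename Arg, typename = void>
struct IsTraceable : std::false_type {};
template <typename Arg>
struct IsTraceable<
    Arg, decltype(std::declval<Visitor&>().Trace(std::declval<Arg>()))>
    : std::true_type {};

static_assert(!IsTraceable<GcObject*>::value, "raw pointers are rejected");
static_assert(IsTraceable<const GcMember<GcObject>&>::value,
              "GC pointers are accepted");
```

Because a field has to be traced to stay alive and tracing only compiles for `GcMember`, a field that needs a barrier cannot silently remain a raw pointer.
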
  18. Baseline: Object Grouping • Objects are grouped together based on rules, e.g., DOM tree • A group is considered live if one object of the group is alive • Incremental over-approximation possible but expensive ◦ Needs to consider live and dead objects • Prone to memory leaks in certain situations
  19. Benchmarking • Real-world workloads using Catapult benchmarking framework • Trace every single GC event, e.g. marking, marking finalization, atomic pause • Hypothesis testing using Wilcoxon Rank Sum Test