Down Memory Lane: Two Decades with the Slab Allocator

My presentation at Systems We Love in San Francisco, December 13, 2016. Video: https://www.youtube.com/watch?v=IznEq2Uf2xk

Bryan Cantrill

December 14, 2016

Transcript

  1. Down Memory Lane:
    Two Decades with the Slab Allocator
    Bryan Cantrill
    CTO
    [email protected]
    @bcantrill

  2. Two decades ago…

  3. Aside: Me, two decades ago…

  4. Aside: Me, two decades ago…

  5. Slab allocator, per Vahalia

  6. First <3: Allocator footprint

  7. I <3: Cache-aware data structures

  8. I <3: Cache-aware data structures
    Hardware Antiques Roadshow!

  9. I <3: Magazine layer

  10. I <3: Magazine autoscaling

  11. I <3: Debugging support
    • For many, the most salient property of the slab allocator is its
    eponymous allocation strategy…
    • …but for me, its debugging support is much more meaningful
    • Rich support for debugging memory corruption issues:
    allocation auditing + detection of double-free/use-after-free/
    use-before-initialize/buffer overrun (see the sketch below)
    • Emphasis was on leaving the functionality compiled in (and
    therefore available in production) — even if off by default
    • kmem_flags often used to debug many kinds of problems!
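
    By way of illustration, a minimal sketch of the fill-pattern idea
    behind use-after-free detection (hypothetical names; the real kmem
    machinery is far richer): freed buffers are filled with a known
    pattern, and the pattern is verified when the buffer is next handed
    out; any word that no longer matches was written after free.

      #include <stdint.h>
      #include <stddef.h>

      /* kmem famously fills freed memory with 0xdeadbeef */
      #define FREE_PATTERN ((uintptr_t)0xdeadbeefUL)

      /* Fill a just-freed buffer with the pattern. */
      static void
      fill_pattern(uintptr_t *buf, size_t nwords)
      {
              for (size_t i = 0; i < nwords; i++)
                      buf[i] = FREE_PATTERN;
      }

      /*
       * Verify the pattern at allocation time: returns the address of
       * the first corrupted word, or NULL if the buffer is clean.
       */
      static uintptr_t *
      verify_pattern(uintptr_t *buf, size_t nwords)
      {
              for (size_t i = 0; i < nwords; i++) {
                      if (buf[i] != FREE_PATTERN)
                              return (&buf[i]);
              }
              return (NULL);
      }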

  12. I <3: Debugging support

  13. I <3: Debugger support

  14. I <3: MDB support
    • Work in crash(1M) inspired deeper kmem debugging support
    in MDB, including commands to:
    • Walk the buffers from a cache (::walk kmem)
    • Display buffers allocated by a thread (::allocdby)
    • Verify integrity of kernel memory (::kmem_verify)
    • Determine the kmem cache of a buffer (::whatis)
    • Find memory leaks (::findleaks)
    • And of course, for Bonwick, ::kmastat and ::kmausers! (see the
    sample session below)
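
    For example, against a live kernel or a crash dump (output elided;
    the address passed to ::whatis is a placeholder), a session might
    look like:

      # mdb -k
      > ::kmastat
      > ::kmem_verify
      > ::findleaks
      > addr::whatis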

  15. I <3: libumem
    • Tragic to have a gold-plated kernel facility while user-land
    suffered in the squalor of malloc(3C)…
    • In the (go-go!) summer of 2000, Jonathan Adams (then a Sun
    Microsystems intern from Caltech) ported the kernel memory
    allocator to user-land as libumem
    • Jonathan returned to Sun as a full-time engineer to finish and
    integrate libumem; it became open source with OpenSolaris
    • libumem is the allocator for illumos and derivatives like
    SmartOS — and has since been ported to other systems (typical
    usage below)
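
    For example, on illumos, libumem can be interposed on an existing
    binary and its debugging features switched on via the environment
    (the binary name here is hypothetical; UMEM_DEBUG is documented in
    umem_debug(3MALLOC)):

      $ LD_PRELOAD=libumem.so UMEM_DEBUG=default ./myapp

    A core dumped from such a process can then be examined with the
    umem analogues of the kmem dcmds above, e.g. ::umastat,
    ::umem_verify and ::findleaks.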

  16. libumem performance
    • The slab allocator was designed to be scalable — not
    necessarily to accommodate pathological software
    • At user-level, pathological software is much more common…
    • Worse, the flexibility of user-level means that operations that
    are quick in the kernel (e.g., grabbing an uncontended lock)
    are more expensive at user-level
    • Upshot: while libumem provided great scalability, its latency
    was worse than other allocators for applications with small,
    short-lived allocations

  17. libumem performance: node.js running MD5

  18. I <3: Per-thread caching libumem
    • Robert Mustacchi added per-thread caching to libumem:
    • free() doesn’t free the buffer, but rather enqueues it on a per-
    size cache on the ulwp_t (thread) structure
    • malloc() checks the thread’s cache at the given size first
    (see the sketch after this list)
    • Several problems:
    • How to prevent the cache from growing without bound?
    • How to map dynamic libumem object cache sizes to a fixed
    range in the ulwp_t structure without slowing the fast path?
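
    A minimal sketch of the idea (hypothetical names; a handful of
    fixed 16-byte-granularity size classes; LIFO lists linked through
    the buffers themselves; and free() takes the size explicitly to
    keep the sketch small, where the real allocator recovers it from
    the buffer — the implementation on the ulwp_t is far subtler):

      #include <stdlib.h>

      #define NBINS     16
      #define BIN_SHIFT 4   /* bin i caches blocks of (i + 1) * 16 bytes */

      static __thread void   *ptc_bins[NBINS]; /* per-thread free lists */
      static __thread size_t  ptc_bytes;       /* total cached by thread */

      static void *
      ptc_alloc(size_t size)
      {
              size_t bin = (size - 1) >> BIN_SHIFT;
              void *buf;

              if (size == 0 || bin >= NBINS)
                      return (malloc(size));  /* too big to cache */

              if ((buf = ptc_bins[bin]) != NULL) {
                      ptc_bins[bin] = *(void **)buf;  /* pop */
                      ptc_bytes -= (bin + 1) << BIN_SHIFT;
                      return (buf);
              }

              /* Round up to the bin size so the buffer can be recycled. */
              return (malloc((bin + 1) << BIN_SHIFT));
      }

      static void
      ptc_free(void *buf, size_t size)
      {
              size_t bin = (size - 1) >> BIN_SHIFT;

              if (size == 0 || bin >= NBINS) {
                      free(buf);
                      return;
              }

              *(void **)buf = ptc_bins[bin];  /* push (LIFO) */
              ptc_bins[bin] = buf;
              ptc_bytes += (bin + 1) << BIN_SHIFT;
      }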

  19. I <3: Per-thread caching libumem
    • Cache growth problem solved by having each thread track
    the total memory in its cache, and allowing per-thread cache
    size to be tuned (default is 1M)
    • Solving the problem of optimal malloc() in light of arbitrary
    libumem cache sizes is a little (okay, a lot) gnarlier; from
    usr/src/lib/libumem/umem.c:
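
    The umem.c code itself is beyond a slide; continuing the
    hypothetical sketch above, the bounding idea alone looks roughly
    like this (the tunable name is invented; only the 1M default comes
    from the slide):

      /* Stop caching once the thread has ptc_limit bytes squirreled away. */
      static __thread size_t ptc_limit = 1024 * 1024;  /* 1M default */

      static void
      ptc_free_bounded(void *buf, size_t size)
      {
              size_t bin = (size - 1) >> BIN_SHIFT;

              if (size == 0 || bin >= NBINS ||
                  ptc_bytes + ((bin + 1) << BIN_SHIFT) > ptc_limit) {
                      free(buf);      /* cache full (or uncacheable) */
                      return;
              }

              *(void **)buf = ptc_bins[bin];
              ptc_bins[bin] = buf;
              ptc_bytes += (bin + 1) << BIN_SHIFT;
      }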

  20. I <3 the slab allocator!
    • It shows the importance of well-designed, well-considered,
    well-described core services
    • We haven’t needed to rewrite it — it has withstood a ~4-order-of-
    magnitude increase in machine size
    • We have been able to meaningfully enhance it over the years
    • It is (for me) the canonical immaculate system: more one to
    use and be inspired by than one to fix or change
    • It remains at the heart of our system — and its ethos very
    much remains the zeitgeist of illumos!

  21. Acknowledgements
    • Jeff Bonwick for creating not just an allocator, but a way of
    thinking about systems problems and of implementing their
    solutions — and the courage to rewrite broken software!
    • Jonathan Adams for not just taking on libumem, but also
    writing about it formally and productizing it
    • Robert Mustacchi for per-thread caching libumem
    • Ryan Zezeski for bringing the slab allocator to a new
    generation with his Papers We Love talk

  22. Further reading
    • Jeff Bonwick, The Slab Allocator: An Object-Caching Kernel Memory
    Allocator, USENIX Summer Technical Conference, 1994
    • Jeff Bonwick and Jonathan Adams, Magazines and Vmem: Extending
    the Slab Allocator to Many CPUs and Arbitrary Resources, USENIX
    Annual Technical Conference, 2001
    • Robert Mustacchi, Per-thread caching in libumem, blog entry on
    dtrace.org, July 2012
    • Ryan Zezeski, Memory by the Slab: The Tale of Bonwick’s Slab
    Allocator, Papers We Love NYC, September 2015
    • Uresh Vahalia, UNIX Internals: The New Frontiers, Prentice Hall, 1996
