Slide 1

Down Memory Lane: Two Decades with the Slab Allocator

Bryan Cantrill (@bcantrill), CTO, [email protected]

Slide 2

Two decades ago…

Slide 3

Aside: Me, two decades ago…

Slide 4

Aside: Me, two decades ago…

Slide 5

Slab allocator, per Vahalia

Slide 6

First <3: Allocator footprint

Slide 7

I <3: Cache-aware data structures

Slide 8

I <3: Cache-aware data structures
Hardware Antiques Roadshow!

Slide 9

I <3: Magazine layer

Slide 10

I <3: Magazine autoscaling

Slide 11

I <3: Debugging support

• For many, the most salient property of the slab allocator is its eponymous allocation strategy…
• …but for me, its debugging support is much more meaningful
• Rich support for debugging memory corruption issues: allocation auditing plus detection of double-free, use-after-free, use-before-initialize and buffer overrun
• Emphasis was on leaving the functionality compiled in (and therefore available in production) — even if off by default
• kmem_flags is often used to debug many kinds of problems (see the example below)!
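As an illustration (my addition, not from the slide): on Solaris/illumos this debugging machinery is typically switched on via the kmem_flags tunable in /etc/system, which takes effect at the next boot. The flag values below are as defined in kmem_impl.h; verify them against your release.

    * /etc/system: enable kmem debugging (takes effect at next boot)
    * 0xf = KMF_AUDIT | KMF_DEADBEEF | KMF_REDZONE | KMF_CONTENTS
    set kmem_flags = 0xf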

Slide 12

I <3: Debugging support

Slide 13

I <3: Debugger support

Slide 14

I <3: MDB support

• Work in crash(1M) inspired deeper kmem debugging support in MDB (sample session below), including commands to:
• Walk the buffers from a cache (::walk kmem)
• Display buffers allocated by a thread (::allocdby)
• Verify integrity of kernel memory (::kmem_verify)
• Determine the kmem cache of a buffer (::whatis)
• Find memory leaks (::findleaks)
• And of course, for Bonwick, ::kmastat and ::kmausers!
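A usage sketch (mine, not the slide's): these dcmds run from MDB attached to the live kernel; output formats vary by release, so only the invocations are shown, and addr stands for whatever address is under investigation. Note that ::kmem_verify is most useful with kmem_flags debugging enabled.

    # mdb -k
    > ::kmem_verify
    > ::findleaks
    > addr::whatis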

Slide 15

I <3: libumem

• Tragic to have a gold-plated kernel facility while user-land suffered in the squalor of malloc(3C)…
• In the (go-go!) summer of 2000, Jonathan Adams (then a Sun Microsystems intern from Caltech) ported the kernel memory allocator to user-land as libumem
• Jonathan returned to Sun as a full-time engineer to finish and integrate libumem; it became open source with OpenSolaris
• libumem is the allocator for illumos and derivatives like SmartOS — and has since been ported to other systems (usage sketched below)
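By way of example (my addition): libumem can be interposed on an unmodified binary via LD_PRELOAD, with its kernel-style debugging enabled through the environment variables documented in umem_debug(3MALLOC); the umem dcmds in MDB can then inspect the live process or its core.

    $ LD_PRELOAD=libumem.so \
      UMEM_DEBUG=default \
      UMEM_LOGGING=transaction \
      ./myapp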

Slide 16

libumem performance

• The slab allocator was designed to be scalable — not necessarily to accommodate pathological software
• At user-level, pathological software is much more common…
• Worse, the flexibility of user-level code means that operations that are quick in the kernel (e.g., grabbing an uncontended lock) are more expensive at user-level
• Upshot: while libumem provided great scalability, its latency was worse than that of other allocators for applications with small, short-lived allocations

Slide 17

libumem performance: node.js running MD5

Slide 18

I <3: Per-thread caching libumem

• Robert Mustacchi added per-thread caching to libumem:
• free() doesn't free the buffer, but rather enqueues it on a per-size cache on the ulwp_t (thread) structure
• malloc() checks the thread's cache at the given size first (see the sketch after this list)
• Several problems:
• How to prevent the cache from growing without bound?
• How to map dynamic libumem object cache sizes to a fixed range in the ulwp_t structure without slowing the fast path?
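To make the shape of the fast path concrete, here is a minimal C sketch of per-thread size-class caching — my illustration under stated assumptions, not libumem's actual implementation (names like tcache_t, NBUCKETS and QUANTUM are hypothetical; the matching free() side, with the growth bound, follows the next slide):

    #include <stdlib.h>

    #define NBUCKETS 16    /* hypothetical number of cached size classes */
    #define QUANTUM  64    /* hypothetical size-class granularity */

    typedef struct cached_buf {
            struct cached_buf *cb_next;   /* freed buffers form a free list */
    } cached_buf_t;

    typedef struct tcache {
            cached_buf_t *tc_head[NBUCKETS]; /* one free list per size class */
            size_t tc_bytes;              /* total bytes cached by this thread */
    } tcache_t;

    static __thread tcache_t tcache;      /* per-thread, hence lock-free */

    /* Map a request size to a size-class index, or -1 if too big to cache. */
    static int
    size_to_bucket(size_t size)
    {
            size_t b = (size == 0) ? 0 : (size - 1) / QUANTUM;
            return (b < NBUCKETS ? (int)b : -1);
    }

    void *
    tc_malloc(size_t size)
    {
            int b = size_to_bucket(size);

            if (b < 0)
                    return (malloc(size));  /* too large: no per-thread cache */

            if (tcache.tc_head[b] != NULL) {
                    /* Fast path: pop a cached buffer; no locks, no atomics. */
                    cached_buf_t *cb = tcache.tc_head[b];
                    tcache.tc_head[b] = cb->cb_next;
                    tcache.tc_bytes -= (size_t)(b + 1) * QUANTUM;
                    return (cb);
            }
            /* Slow path: round up to the class size so the buffer is reusable. */
            return (malloc((size_t)(b + 1) * QUANTUM));
    }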

Slide 19

I <3: Per-thread caching libumem

• Cache growth problem solved by having each thread track the total memory in its cache, and allowing the per-thread cache size to be tuned (default is 1M)
• Solving the problem of an optimal malloc() in light of arbitrary libumem cache sizes is a little (okay, a lot) gnarlier; from usr/src/lib/libumem/umem.c:
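(The umem.c excerpt shown on the slide is not preserved in this transcript. In its place, here is a hypothetical continuation of the earlier sketch showing how a per-thread byte budget — the 1M default mentioned above — can bound cache growth; TC_BUDGET is my invention, and this is not how umem.c actually implements it.)

    #define TC_BUDGET (1024 * 1024)  /* per-thread cache budget: 1M default */

    void
    tc_free(void *buf, size_t size)  /* libumem recovers size internally */
    {
            int b = size_to_bucket(size);
            size_t cap = (b < 0) ? 0 : (size_t)(b + 1) * QUANTUM;

            if (b >= 0 && tcache.tc_bytes + cap <= TC_BUDGET) {
                    /* Within budget: push onto this thread's free list. */
                    cached_buf_t *cb = buf;
                    cb->cb_next = tcache.tc_head[b];
                    tcache.tc_head[b] = cb;
                    tcache.tc_bytes += cap;
                    return;
            }
            free(buf);               /* over budget or uncacheable */
    }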

Slide 20

I <3 the slab allocator!

• It shows the importance of well-designed, well-considered, well-described core services
• We haven't needed to rewrite it — it has withstood a ~4 order-of-magnitude increase in machine size
• We have been able to meaningfully enhance it over the years
• It is (for me) the canonical immaculate system: one to use and be inspired by more than one to fix or change
• It remains at the heart of our system — and its ethos very much remains the zeitgeist of illumos!

Slide 21

Acknowledgements

• Jeff Bonwick, for creating not just an allocator but a way of thinking about systems problems and of implementing their solutions — and for the courage to rewrite broken software!
• Jonathan Adams, for not just taking on libumem but also writing about it formally and productizing it
• Robert Mustacchi, for per-thread caching libumem
• Ryan Zezeski, for bringing the slab allocator to a new generation with his Papers We Love talk

Slide 22

Further reading

• Jeff Bonwick, The Slab Allocator: An Object-Caching Kernel Memory Allocator, Summer USENIX, 1994
• Jeff Bonwick and Jonathan Adams, Magazines and Vmem: Extending the Slab Allocator to Many CPUs and Arbitrary Resources, USENIX Annual Technical Conference, 2001
• Robert Mustacchi, Per-thread caching in libumem, blog entry on dtrace.org, July 2012
• Ryan Zezeski, Memory by the Slab: The Tale of Bonwick's Slab Allocator, Papers We Love NYC, September 2015
• Uresh Vahalia, UNIX Internals, 1996