day 21
• Prerequisite
  ◦ Basic knowledge of OS/OSv [2]
  ◦ Knowledge of the C++11 specification
  ◦ Basic knowledge of algorithms and data structures
  ◦ It's better if you're familiar with the Linux kernel source
• Source tree: the latest master branch [3] as of Dec 21, 2014
  ◦ HEAD commit: f7d3ddd648b38789daa8287626a66863a780f139
  ◦ For simplicity, debug features, trace features and exclusive control (locking) mechanisms are omitted
• Pathnames in this document are relative to the top of the OSv source tree
The implementation is very simple
  ◦ A comment says "Our malloc() is very coarse" :-)
• Specialized for small objects (I guess it's for the JVM)
• One malloc() implementation is used for both the kernel and user apps: it's the OSv way!
  ◦ It's like calling kmalloc() directly from a user application in Linux
• The allocation size is rounded up to the smallest 2^n >= 8 bytes (n is a positive integer)
  ◦ e.g. malloc(4) returns an 8-byte memory object
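The size rounding described above can be sketched as follows. This is a minimal illustration, not OSv's code: `round_up_alloc_size` is a hypothetical helper name, and the only facts taken from the slide are the 8-byte minimum and the power-of-two rounding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not an OSv function): round a request up to the
// smallest power of two that is >= 8 bytes and >= size.
inline std::size_t round_up_alloc_size(std::size_t size) {
    // Precondition: the result must fit in size_t, otherwise the loop
    // below could never terminate.
    assert(size <= (SIZE_MAX >> 1) + 1);
    std::size_t rounded = 8;      // minimum object size from the slide
    while (rounded < size) {
        rounded <<= 1;            // try the next power of two
    }
    return rounded;
}
```

With this sketch, a 4-byte request maps to 8 bytes and a 1000-byte request maps to 1024 bytes, which matches the malloc(4) example above.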
mechanisms depend on the requested size
  ◦ <= 1KiB: mempool
  ◦ > 1KiB: page allocator
[Figure: malloc()/free(), called from user or kernel code, sits on top of the memory management subsystem, which consists of the mempool (like the sl[auo]b allocators in the Linux kernel) and the page allocator (like the buddy allocator in the Linux kernel)]
by other memory allocation APIs (*1)

  Round size up to 2^n >= 8 bytes (n is a positive integer)
  if (size <= 1KiB && after SMP initialization (always true for applications))
      malloc_pools[lg(size)].alloc()   # Allocate from mempool
  else if (1KiB < size <= 4KiB)
      memory::alloc_page()             # Allocate one page from the page allocator
  else
      malloc_large()                   # Allocate pages from the page allocator
                                       # This path has the most complex logic of
                                       # the three (not explained here)

*1) If you explicitly set an alignment, the call trace becomes more complex. For more information, please refer to the implementation of std_malloc()
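The three-way decision above can be written as a small classifier. This is only a sketch that follows the slide's pseudocode literally. The names `Backend`, `choose_backend` and `pool_index` are hypothetical, the 1KiB and 4KiB limits are the x86_64 values quoted on the slide, and alignment handling is left out.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical labels for the three paths named on the slide.
enum class Backend { Mempool, SinglePage, LargeAlloc };

constexpr std::size_t kMaxPoolObject = 1024;  // 1KiB, from the slide
constexpr std::size_t kPageSize = 4096;       // 4KiB page on x86_64

// Index of the pool serving `size`: n such that 2^n is the smallest
// power of two >= max(size, 8). Only meaningful for pool-sized requests.
inline unsigned pool_index(std::size_t size) {
    assert(size <= kMaxPoolObject);
    unsigned n = 3;                            // 2^3 = 8 bytes minimum
    while ((std::size_t{1} << n) < size) {
        ++n;
    }
    return n;
}

// Mirrors the slide's if / else if / else chain as written.
inline Backend choose_backend(std::size_t size, bool smp_initialized) {
    if (size <= kMaxPoolObject && smp_initialized) {
        return Backend::Mempool;               // malloc_pools[...].alloc()
    }
    if (size > kMaxPoolObject && size <= kPageSize) {
        return Backend::SinglePage;            // memory::alloc_page()
    }
    return Backend::LargeAlloc;                // malloc_large()
}
```

For an application, where SMP initialization has always completed, a 1024-byte request goes to the mempool, a 1025-byte request takes one page, and a 4097-byte request takes the large-allocation path.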
described before
  ◦ Managed by class malloc_pool (inherits from class pool)
• Definition: malloc_pool malloc_pools[]
  ◦ One pool for each object size
    ▪ The number of entries in this array is lg(page size)+1. On x86_64 (page size 4KiB), it's 12+1=13
    ▪ Each pool corresponds to one allocation size: 1, 2, 4, 8, ..., 2KiB, 4KiB
• I wonder why the last entries exist...
  ◦ malloc() from applications never uses malloc_pools[11] and above, because sizes over 1KiB go to the page allocator
<= page size
  ◦ Managed by class pool # I think its name is too abstract
• Handles memory objects from 8 bytes to 1 page (4KiB)
  ◦ [8B, 1KiB] => multiple objects in one page
  ◦ (1KiB, 4KiB] => one object in one page
• Has a per-CPU cache to improve scalability on MP systems
  ◦ Thanks to locality of reference, this raises the probability of allocating an object that the calling CPU used recently
  ◦ Exclusive control (locking) is not necessary for memory allocation from a mempool
• For more information, please refer to the comment beginning with "Memory allocation strategy" and to the source code
  ◦ Managed by "class page_range_allocator"
• Has two levels of cache, named L1 and L2
  ◦ L1: per-CPU cache
  ◦ L2: global cache
• Global page allocator: the bottom layer under these two caches
[Figure: layers of the page allocator, top to bottom: kernel subsystem (including mempool) -> L1 cache (per-CPU cache) -> L2 cache (shared among all CPUs) -> global page allocator]
• Managed by class l2
  ◦ Caches up to "l2::max" pages
• Interface (used by mempool): the "try_*" variants don't synchronously fetch or return pages
  ◦ Multiple page allocation: l2::{try_,}alloc_page_batch
  ◦ Multiple page free: l2::{try_,}free_page_batch
• Interface to the global page allocator
  ◦ Async: uses a per-CPU thread, "page_pool_l2"
    ▪ # of pages < l2::max*¼ => fetch some pages from the global page allocator
    ▪ # of pages > l2::max*¾ => return some pages to the global page allocator
  ◦ Sync: if # of pages reaches 0 or l2::max, fetch or return some pages immediately
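The ¼ and ¾ thresholds used by the background thread can be sketched as a pure decision function. This is an illustration under stated assumptions, not the code of class l2: `L2Action` and `background_action` are hypothetical names, and "fetch" and "return" here mean moving pages from and to the global page allocator.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical result type: what the background thread should do next.
enum class L2Action { None, FetchFromGlobal, ReturnToGlobal };

// Decide the action from the current page count and the cache capacity
// (l2::max on the slide). Integer arithmetic avoids fractions:
//   nr < max/4    is tested as  nr*4 < max
//   nr > max*3/4  is tested as  nr*4 > max*3
inline L2Action background_action(std::size_t nr_pages, std::size_t max_pages) {
    assert(nr_pages <= max_pages);
    if (nr_pages * 4 < max_pages) {
        return L2Action::FetchFromGlobal;   // cache is running low
    }
    if (nr_pages * 4 > max_pages * 3) {
        return L2Action::ReturnToGlobal;    // cache is nearly full
    }
    return L2Action::None;                  // between the two watermarks
}
```

With a capacity of 512 pages, the sketch asks for more pages below 128, gives pages back above 384, and does nothing in between.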
Tokyo/Fall, Takuya ASADA, Cloudius Systems
   http://www.slideshare.net/syuu1228/osv-in-osc2014-tokyofall
3. The source tree of OSv
   https://github.com/cloudius-systems/osv
4. Slab allocation, Wikipedia
   http://en.wikipedia.org/wiki/Slab_allocation
5. A Journey Through malloc (glibc edition), こさき@ぬまづ
   http://www.slideshare.net/kosaki55tea/glibc-malloc
tried to read just the code of malloc(). However, in the end, I read most of the memory management code.
  ◦ I forgot that the kernel and user apps run in the same memory space
• The source is simple and easy to read (compared with the giant Linux kernel code)
  ◦ There seems to be plenty of room to improve performance
  ◦ Nice code for learning about OSes
• There is a malloc(size, align), which is similar to posix_memalign()
  ◦ Only the kernel can use it because it's not exported to user applications
  ◦ It coexists with malloc(size) thanks to C++ function overloading. Viva C++!
    ▪ When I encountered these two functions, I couldn't understand why it works, since I had forgotten about overloading at that time
• My last experience with C++ was 10 years ago (C++98 era)...
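As a reminder of how two allocation functions with the same name can coexist, here is a minimal overloading sketch. It is not OSv's implementation: the namespace `sketch` and the name `my_malloc` are hypothetical, and the aligned variant simply forwards to the POSIX call posix_memalign().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdlib.h>   // malloc, free, posix_memalign (POSIX)

namespace sketch {

// One-argument overload: plain allocation.
inline void* my_malloc(std::size_t size) {
    return ::malloc(size);
}

// Two-argument overload: aligned allocation. The compiler picks this one
// whenever the caller passes two arguments.
inline void* my_malloc(std::size_t size, std::size_t align) {
    // posix_memalign() requires a power-of-two multiple of sizeof(void*).
    assert(align >= sizeof(void*) && (align & (align - 1)) == 0);
    void* p = nullptr;
    if (::posix_memalign(&p, align, size) != 0) {
        return nullptr;
    }
    return p;
}

// Helper for checking the alignment of a returned pointer.
inline bool is_aligned(const void* p, std::size_t align) {
    return reinterpret_cast<std::uintptr_t>(p) % align == 0;
}

}  // namespace sketch
```

Calling `sketch::my_malloc(16)` selects the first overload and `sketch::my_malloc(16, 64)` selects the second. Both results can be released with free().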