Slide 1

Slide 1 text

Managed Runtime Systems
Memory Management
Foivos Zakkak
https://foivos.zakkak.net
Except where otherwise noted, this presentation is licensed under the Creative Commons Attribution 4.0 International License. Third party marks and brands are the property of their respective holders.

Slide 2

Slide 2 text

Managed Runtime Systems CC-BY https://Foivos.Zakkak.net
Acknowledgments
The following slides are based on the corresponding slides of Mario Wolczko about Memory Management:
■ Memory management part 1
■ Memory management part 2
■ Memory management debugging hints

Slide 3

Slide 3 text

Static vs Dynamic Allocation

Slide 4

Slide 4 text

Static Memory Allocation
■ Binding of name to memory address at compile/link time
■ All sizes are fixed, i.e., known at compile time
■ No stack allocation
■ Used in early FORTRAN, BASIC, and various languages for embedded/real-time systems

Slide 5

Slide 5 text

Static Memory Allocation
■ Pros
  – No runtime overheads for allocation/de-allocation
  – Memory requirements known at compile time
  – No failures due to lack of memory
■ Cons
  – Need to allocate and keep the maximum possible memory footprint for the whole program execution
  – No recursion due to lack of stack allocation

Slide 6

Slide 6 text

Dynamic Allocation (on the stack)
■ Languages (at least most of them) are based on procedures
■ LIFO/Depth-First invocation order (in most cases)
  – See Scheme and Smalltalk for counter-examples
■ Memory used by procedures can be managed as a stack
  – Allocate on invocation
  – Release on return
■ Hardware support (SP register, call/ret instructions) since the 1960s

Slide 7

Slide 7 text

Dynamic Allocation (on the stack)
■ Pros
  – Low runtime overheads
  – Bump stack pointer for allocation
  – Bulk de-allocation on return
  – No memory leaks
■ Cons
  – Names must be passed as parameters; the deeper a procedure sits in the call chain, the more parameters it receives
  – Cannot return memory to calling procedures
  – Data lifetime equals the procedure's lifetime
  – Can't handle complex data structures like graphs

Slide 8

Slide 8 text

Dynamic Allocation (on the heap)
■ Arbitrary requests for memory segments
■ Allocations may fail
■ Fast allocation vs fast de-allocation trade-offs
■ Heap may not be contiguous

Slide 9

Slide 9 text

Dynamic Allocation (on the heap)
■ Pros
  – Arbitrary allocation sizes
  – Allocation and de-allocation from different procedures
  – Handling of complex data structures
■ Cons
  – Noticeable runtime overheads
  – Need to perform a de-allocation for each allocation
    ■ See region-based allocation for an enhancement
  – Memory leaks are possible

Slide 10

Slide 10 text

Heap Allocation
[Figure (animation, slides 10-15): heap diagram with a region labeled "Free"; the last frame adds two "?" marks.]

Slide 16

Slide 16 text

Heap Allocation: Free-list variations
■ First-Fit:
  – Search the free-list from the beginning, pick the first block that fits
  – Add the remainder of the block (if any) as a new block to the list
  – May result in a number of small blocks at the beginning of the free-list
■ Best-Fit:
  – Search the free-list from the beginning, pick the block that fits best
  – Add the remainder of the block (if any) as a new block to the list
  – Reduces fragmentation
  – Slow, since we need to traverse the whole free-list
■ Next-Fit:
  – Search from where we stopped last time, pick the first block that fits
  – Add the remainder of the block (if any) as a new block to the list
  – Might increase fragmentation
  – Often faster than First-Fit
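The three policies above can be sketched over a toy free-list. This is only an illustration under stated assumptions: the names `FreeListDemo`, `Block`, `take`, and `cursor` are hypothetical, addresses and sizes are plain integers, and freeing/coalescing is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a free-list; not taken from any real allocator.
class FreeListDemo {
    // A free block: start address and size, in illustrative units.
    static final class Block {
        int start, size;
        Block(int start, int size) { this.start = start; this.size = size; }
    }

    final List<Block> free = new ArrayList<>();
    int cursor = 0; // index where next-fit resumes its search

    FreeListDemo(int[][] blocks) {
        for (int[] b : blocks) free.add(new Block(b[0], b[1]));
    }

    // Carve `size` units out of block i; the remainder (if any) stays in the list.
    private int take(int i, int size) {
        Block b = free.get(i);
        int addr = b.start;
        if (b.size == size) {
            free.remove(i);
        } else {
            b.start += size;
            b.size -= size;
        }
        return addr;
    }

    // First-fit: first block, starting from the head, that is large enough.
    int firstFit(int size) {
        for (int i = 0; i < free.size(); i++)
            if (free.get(i).size >= size) return take(i, size);
        return -1; // allocation failed
    }

    // Best-fit: scan the whole list and keep the smallest block that fits.
    int bestFit(int size) {
        int best = -1;
        for (int i = 0; i < free.size(); i++) {
            int s = free.get(i).size;
            if (s >= size && (best < 0 || s < free.get(best).size)) best = i;
        }
        return best < 0 ? -1 : take(best, size);
    }

    // Next-fit: like first-fit, but start where the previous search stopped.
    int nextFit(int size) {
        int n = free.size();
        for (int k = 0; k < n; k++) {
            int i = (cursor + k) % n;
            if (free.get(i).size >= size) {
                cursor = i;
                return take(i, size);
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[][] layout = { {0, 8}, {20, 4}, {40, 16} };
        System.out.println(new FreeListDemo(layout).firstFit(4)); // 0
        System.out.println(new FreeListDemo(layout).bestFit(4));  // 20
    }
}
```

With free blocks of sizes 8, 4, and 16, a request of 4 splits the first block under first-fit, while best-fit consumes the exact-size block and leaves the larger ones intact.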

Slide 17

Slide 17 text

First-Fit
[Figure (animation, slides 17-20): free-list starting at "Head", searched for a block that satisfies a "Request".]

Slide 21

Slide 21 text

Best-Fit
[Figure (animation, slides 21-27): free-list starting at "Head", searched for a block that satisfies a "Request".]

Slide 28

Slide 28 text

Next-Fit
[Figure (animation, slides 28-36): free-list starting at "Head", searched for a "Request"; from slide 32 on, a second "Request" is served.]

Slide 37

Slide 37 text

Fragmentation
■ Fragmentation is the phenomenon of not being able to use parts of memory because of inefficient management
■ A heavily fragmented system may have plenty of free memory, but chopped into small blocks that don't fit a new request

Slide 38

Slide 38 text

Fragmentation Categorization
■ Internal Fragmentation is the result of allocating larger chunks than actually needed (often due to alignment restrictions)
■ External Fragmentation is the result of constantly splitting free blocks, resulting in multiple small non-contiguous free blocks that cannot be used
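As a small worked example of internal fragmentation caused by alignment, the sketch below rounds a request up to the next multiple of the alignment and reports the wasted bytes. The class and method names are hypothetical, and the alignment is assumed to be a power of two.

```java
// Hypothetical helper: how many bytes are lost when sizes are rounded up.
class AlignDemo {
    // Round size up to a multiple of alignment (assumed to be a power of two).
    static int alignUp(int size, int alignment) {
        return (size + alignment - 1) & ~(alignment - 1);
    }

    // Bytes allocated but never used by the requester.
    static int internalWaste(int size, int alignment) {
        return alignUp(size, alignment) - size;
    }

    public static void main(String[] args) {
        // A 13-byte request under 8-byte alignment occupies 16 bytes.
        System.out.println(alignUp(13, 8));       // 16
        System.out.println(internalWaste(13, 8)); // 3
    }
}
```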

Slide 39

Slide 39 text

Free-list Allocation Main Overheads
1) Loop over free blocks, even over obvious non-matches
2) For each block, check if it fits
3) Split the block if it's bigger than the request

Slide 40

Slide 40 text

Single-size Free-lists
1) A set of free-lists, one for each size (for a set of common sizes)
2) A generic free-list for the rest
3) Always pick the first block from the list of the desired size
4) If it is empty, take a block from the generic free-list and split it
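The scheme above might look roughly as follows. This is a simplified sketch, not a real allocator: the size classes (16, 32, 64) are made up, the "generic free-list" is reduced to a single contiguous block, and all names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of per-size free-lists backed by one generic region.
class SizeClassDemo {
    static final int[] SIZE_CLASSES = {16, 32, 64}; // assumed common sizes

    // One free-list (stack of addresses) per size class.
    final Map<Integer, Deque<Integer>> lists = new HashMap<>();
    // Simplification: the generic free-list is a single block [genericStart, +genericSize).
    int genericStart, genericSize;

    SizeClassDemo(int heapSize) {
        for (int s : SIZE_CLASSES) lists.put(s, new ArrayDeque<>());
        genericStart = 0;
        genericSize = heapSize;
    }

    int allocate(int size) {
        Deque<Integer> list = lists.get(size);
        // Fast path: pop the first block of the desired size (no search, no split).
        if (list != null && !list.isEmpty()) return list.pop();
        // Slow path: split a piece off the generic block.
        if (genericSize < size) return -1;
        int addr = genericStart;
        genericStart += size;
        genericSize -= size;
        return addr;
    }

    void free(int addr, int size) {
        Deque<Integer> list = lists.get(size);
        if (list != null) list.push(addr);
        // Blocks of other sizes would go back to the generic list; omitted here.
    }

    public static void main(String[] args) {
        SizeClassDemo heap = new SizeClassDemo(128);
        int a = heap.allocate(32);
        heap.free(a, 32);
        System.out.println(heap.allocate(32) == a); // true: the freed block is reused
    }
}
```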

Slide 41

Slide 41 text

Single-size Free-lists
[Figure: several free-lists, with heads labeled Head1, Head2, Head3, and Heads.]

Slide 42

Slide 42 text

What about multi-threaded applications?

Slide 43

Slide 43 text

Garbage Collection

Slide 44

Slide 44 text

Garbage Collection
1) Ease programming, no need to reason about object lifetimes
2) Eliminate errors due to dangling pointers
3) Take care of the previous issues
4) Still possible to leak memory!

Slide 45

Slide 45 text

How does it work?

    import java.io.IOException;
    import java.nio.charset.Charset;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.List;

    public static void main(String[] args) throws IOException {
        // Read the whole file into memory, one String per line.
        List<String> lines = Files.readAllLines(Paths.get(args[0]), Charset.defaultCharset());
        int nLines = lines.size();
        // reclaim lines??
        System.out.println(nLines);
    }

Slide 46

Slide 46 text

Liveness
An object is dead when it is no longer needed

Slide 47

Slide 47 text

Liveness
An object is dead when it is no longer needed
"But, VMs (and compilers) have severely limited crystal balls" – Mario Wolczko

Slide 48

Slide 48 text

Liveness in the Real World
■ An object is dead when it is no longer reachable
  – An object is reachable if it can be reached by following pointers starting from the system's roots
  – The system's roots are all the variables in scope (of all threads)
  – Requires traversal of stacks and globals

Slide 49

Slide 49 text

Reachability
Roots:
● Static variables
● Java stack frame
● Native stack frame
● JNI
● etc.
[Figure: object graph with objects A-H; A, F, H, D, G, and E are highlighted.]

Slide 50

Slide 50 text

Reference Counting
■ Keep a reference counter per object
■ Increment it when a reference to that object is assigned to a variable
■ Decrement it when a reference to that object is overwritten
■ If the counter is zero, the object can be reclaimed
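The four rules above can be sketched as follows. This is a minimal illustration, not how any particular runtime implements it: `Obj`, `incRef`, `decRef`, `assign`, and `addField` are hypothetical names, and "reclaiming" just records the object's name.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of naive reference counting.
class RefCountDemo {
    static final class Obj {
        final String name;
        int refCount;                               // one counter per object
        final List<Obj> fields = new ArrayList<>(); // outgoing references
        Obj(String name) { this.name = name; }
    }

    static final List<String> reclaimed = new ArrayList<>();

    static void incRef(Obj o) {
        if (o != null) o.refCount++;
    }

    static void decRef(Obj o) {
        if (o == null) return;
        if (--o.refCount == 0) {
            // Reclaiming an object drops the references it holds (recursively).
            for (Obj child : o.fields) decRef(child);
            o.fields.clear();
            reclaimed.add(o.name);
        }
    }

    // Models "slot = newTarget": count the new reference, release the overwritten one.
    static Obj assign(Obj oldTarget, Obj newTarget) {
        incRef(newTarget); // increment first, so self-assignment is safe
        decRef(oldTarget);
        return newTarget;
    }

    static void addField(Obj owner, Obj target) {
        owner.fields.add(target);
        incRef(target);
    }

    static List<String> demo() {
        reclaimed.clear();
        Obj a = new Obj("a"), b = new Obj("b");
        Obj root = assign(null, a); // a: 1
        addField(a, b);             // b: 1
        root = assign(root, null);  // a drops to 0, which releases b as well
        return new ArrayList<>(reclaimed);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [b, a]
    }
}
```

Note that `decRef` recurses through the fields of a dying object, so releasing the head of a long chain triggers a cascade of reclamations.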

Slide 51

Slide 51 text

Reference Counting
Roots:
● Static variables
● Java stack frame
● Native stack frame
● JNI
● etc.
[Figure (animation, slides 51-53): the object graph A-H, each object annotated with its reference count.]

Slide 54

Slide 54 text

Reference Counting Drawbacks
1) Reference counting has to be performed on all variables (stack, global, and heap)
2) References in an activation record have to be decremented before de-allocating the frame upon return
3) Decrementing the reference counter often incurs a cache miss
4) Decrementing the reference counter always incurs a write
5) Concurrent threads might contend on the reference counter
6) Cannot reclaim cycles

Slide 55

Slide 55 text

Reference Counting Cycle
Roots:
● Static variables
● Java stack frame
● Native stack frame
● JNI
● etc.
[Figure: objects F, E, D, H, and G, none of which has a reference count of zero (the counts shown are 1, 1, 1, 1, and 2).]
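The cycle problem can be reproduced with two objects. The sketch below is a hypothetical minimal model (counters are updated by hand rather than by a runtime) that shows both counters stuck at one after the only root reference is dropped.

```java
// Hypothetical sketch: a two-object cycle that reference counting never reclaims.
class RefCountCycleDemo {
    static final class Obj {
        int refCount;
        Obj field;
    }

    // Returns the two counters after the cycle has become unreachable.
    static int[] demo() {
        Obj d = new Obj(), e = new Obj();
        d.refCount++;               // a root references d
        d.field = e; e.refCount++;  // d -> e
        e.field = d; d.refCount++;  // e -> d, closing the cycle
        d.refCount--;               // the root reference is overwritten
        // Neither counter reached zero, so neither object is reclaimed,
        // although nothing outside the cycle can reach them anymore.
        return new int[] { d.refCount, e.refCount };
    }

    public static void main(String[] args) {
        int[] counts = demo();
        System.out.println(counts[0] + " " + counts[1]); // 1 1
    }
}
```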

Slide 56

Slide 56 text

Reference Counting with Delayed Reclamation
■ Avoid unbounded recursion
■ Reduce the number of pauses for Garbage Collection

Slide 57

Slide 57 text

Tracing Collection: Mark-n-Sweep
■ Mark: Follow the system's roots and mark reachable objects
■ Sweep: Reclaim unmarked objects at the end
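A minimal sketch of the two phases follows, assuming a toy heap modeled as a list of objects with a mark bit each. The names `MarkSweepDemo`, `Obj`, `mark`, and `collect` are hypothetical, and the example graph is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of mark-and-sweep over a toy heap.
class MarkSweepDemo {
    static final class Obj {
        final String name;
        final List<Obj> fields = new ArrayList<>();
        boolean marked;
        Obj(String name) { this.name = name; }
    }

    // Mark: flag everything reachable from obj (recursive variant).
    static void mark(Obj obj) {
        if (obj == null || obj.marked) return;
        obj.marked = true;
        for (Obj child : obj.fields) mark(child);
    }

    // Sweep: drop unmarked objects and clear the marks for the next cycle.
    static List<String> collect(List<Obj> heap, List<Obj> roots) {
        for (Obj r : roots) mark(r);
        List<String> reclaimed = new ArrayList<>();
        heap.removeIf(o -> {
            if (o.marked) {
                o.marked = false;
                return false;
            }
            reclaimed.add(o.name);
            return true;
        });
        return reclaimed;
    }

    static List<String> demo() {
        Obj a = new Obj("a"), b = new Obj("b"), c = new Obj("c"), d = new Obj("d");
        a.fields.add(b);
        c.fields.add(d);
        d.fields.add(c); // c and d form an unreachable cycle
        List<Obj> heap = new ArrayList<>();
        heap.add(a); heap.add(b); heap.add(c); heap.add(d);
        List<Obj> roots = new ArrayList<>();
        roots.add(a);
        return collect(heap, roots);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [c, d]
    }
}
```

Unlike reference counting, the unreachable cycle between `c` and `d` is reclaimed, because liveness is decided by reachability from the roots rather than by counters.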

Slide 58

Slide 58 text

Mark-n-sweep
Roots:
● Static variables
● Java stack frame
● Native stack frame
● JNI
● etc.
[Figure (animation, slides 58-64): object graph with objects A-H. The set of marked objects grows frame by frame: A, F; then H; then D, G; then E. In the last frame, the unmarked objects B and C are shown apart from the rest.]

Slide 65

Slide 65 text

Mark Implementations
■ Recursive: In the worst case each object creates an activation (e.g., when marking a singly linked list)
■ Work queue: Each object creates a new node in the queue
  – Dominant approach in parallel garbage collectors
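The work-queue variant can be sketched with an explicit worklist, so that marking a long linked list does not depend on the depth of the call stack. The names below are hypothetical, and `ArrayDeque` is used as a stack purely for simplicity.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch: marking with an explicit worklist instead of recursion.
class WorklistMarkDemo {
    static final class Node {
        Node[] fields = new Node[0];
        boolean marked;
    }

    // Marks everything reachable from root; returns the number of marked objects.
    static int markIterative(Node root) {
        Deque<Node> work = new ArrayDeque<>();
        int count = 0;
        if (root != null) {
            root.marked = true;
            work.push(root);
        }
        while (!work.isEmpty()) {
            Node n = work.pop();
            count++;
            for (Node child : n.fields) {
                // Mark before pushing, so every object enters the worklist once.
                if (child != null && !child.marked) {
                    child.marked = true;
                    work.push(child);
                }
            }
        }
        return count;
    }

    // Builds a singly linked list of the given length (the recursive worst case).
    static Node chain(int length) {
        Node head = null;
        for (int i = 0; i < length; i++) {
            Node n = new Node();
            if (head != null) n.fields = new Node[] { head };
            head = n;
        }
        return head;
    }

    public static void main(String[] args) {
        System.out.println(markIterative(chain(200_000))); // 200000
    }
}
```

A recursive marker would need one activation per list element here, which may exhaust the thread stack for long lists; the worklist keeps that state on the heap instead.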

Slide 66

Slide 66 text

Sweep Implementations

Slide 67

Slide 67 text

Sweep with Free-list
■ Add reclaimed chunks to a free-list
■ Requires parsing the whole heap to find the non-marked objects
  – Possible (using the object headers), but inefficient
■ Coalescing adjacent free blocks
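A sweep that coalesces adjacent dead chunks while building the free-list might look like the sketch below. It assumes the heap can be walked in address order and that each chunk already carries the result of the mark phase; `Chunk` and `sweep` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: sweep a heap in address order, merging adjacent free chunks.
class CoalescingSweepDemo {
    static final class Chunk {
        final int start, size;
        final boolean live; // outcome of a previous mark phase
        Chunk(int start, int size, boolean live) {
            this.start = start;
            this.size = size;
            this.live = live;
        }
    }

    // Returns the free-list as {start, size} pairs.
    static List<int[]> sweep(List<Chunk> heapInAddressOrder) {
        List<int[]> freeList = new ArrayList<>();
        int[] current = null; // free block currently being extended
        for (Chunk c : heapInAddressOrder) {
            if (c.live) {
                current = null; // a live object ends the current free run
            } else if (current != null && current[0] + current[1] == c.start) {
                current[1] += c.size; // adjacent: grow the existing free block
            } else {
                current = new int[] { c.start, c.size };
                freeList.add(current);
            }
        }
        return freeList;
    }

    static List<int[]> demo() {
        List<Chunk> heap = new ArrayList<>();
        heap.add(new Chunk(0, 8, true));
        heap.add(new Chunk(8, 8, false));
        heap.add(new Chunk(16, 16, false));
        heap.add(new Chunk(32, 8, true));
        heap.add(new Chunk(40, 8, false));
        return sweep(heap);
    }

    public static void main(String[] args) {
        for (int[] b : demo()) System.out.println(b[0] + " +" + b[1]);
        // 8 +24
        // 40 +8
    }
}
```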

Slide 68

Slide 68 text

Compacting Sweep
■ Move live objects to consecutive memory addresses
  – Moving objects breaks references though
■ Create a contiguous large free space after the live objects
■ Run on every collection or when heap is heavily fragmented


Slide 71

Slide 71 text

Compacting Sweep with Forwarding Pointers
Add an extra field in each object's header – the forwarding pointer
1) Compute forwarding pointers
2) Update all pointers using the forwarding pointers
3) Move the objects
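The three passes can be sketched on a toy heap in which references are stored as integer addresses. Everything here is an assumption made for illustration: the `Obj` layout, the `byAddr` lookup table standing in for pointer dereferencing, and a heap that starts at address 0.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of sliding compaction with forwarding pointers.
class CompactDemo {
    static final class Obj {
        int addr;           // current address
        final int size;
        final boolean live; // outcome of a previous mark phase
        int forward;        // forwarding pointer (the extra header field)
        final int[] fields; // addresses of referenced objects
        Obj(int addr, int size, boolean live, int... fields) {
            this.addr = addr;
            this.size = size;
            this.live = live;
            this.fields = fields;
        }
    }

    // heap must be in address order; roots holds addresses and is updated in place.
    static void compact(List<Obj> heap, int[] roots) {
        Map<Integer, Obj> byAddr = new HashMap<>();
        for (Obj o : heap) byAddr.put(o.addr, o);

        // 1) Compute forwarding pointers: live objects slide towards address 0.
        int free = 0;
        for (Obj o : heap) {
            if (o.live) {
                o.forward = free;
                free += o.size;
            }
        }

        // 2) Update all pointers (roots and fields of live objects).
        for (int i = 0; i < roots.length; i++)
            roots[i] = byAddr.get(roots[i]).forward;
        for (Obj o : heap)
            if (o.live)
                for (int i = 0; i < o.fields.length; i++)
                    o.fields[i] = byAddr.get(o.fields[i]).forward;

        // 3) Move the objects; dead ones simply disappear.
        heap.removeIf(o -> !o.live);
        for (Obj o : heap) o.addr = o.forward;
    }

    // Returns {root, a.addr, c.addr, a.fields[0], c.fields[0]} after compaction.
    static int[] demo() {
        Obj a = new Obj(0, 16, true, 48);
        Obj b = new Obj(16, 32, false);
        Obj c = new Obj(48, 8, true, 0);
        List<Obj> heap = new ArrayList<>();
        heap.add(a); heap.add(b); heap.add(c);
        int[] roots = { 48 };
        compact(heap, roots);
        return new int[] { roots[0], a.addr, c.addr, a.fields[0], c.fields[0] };
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(demo())); // [16, 0, 16, 16, 0]
    }
}
```

Pointers are updated before the objects move because the forwarding pointer lives in the old copy's header; once the object is overwritten, that information is gone.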

Slide 72

Slide 72 text

Compacting Sweep with Temporary Table
■ Instead of using a forwarding pointer, replace the actual header with a pointer to a temporary table entry
■ Each temporary table entry holds a header and a forwarding location

Slide 73

Slide 73 text

Compacting Sweep with Threading
1) Replace the object header with a pointer to a list
2) The list starts at the object and goes through all the fields that reference it
3) The last field in the list contains the initial content of the object header
4) When the object is moved, the list is traversed to update the corresponding fields

Slide 74

Slide 74 text

Copying Collection
■ Combination of Trace and Compaction
■ Split heap memory into "from" and "to" semi-spaces
■ While tracing, copy live objects to the "to" semi-space
■ Leave forwarding pointers in the "from" semi-space
■ At the end, "from" becomes "to" and vice versa
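One common way to organize such a collector is a Cheney-style scan, where the to-space itself serves as the work queue. The slide does not name a specific algorithm, so this choice, along with the names `CopyingDemo`, `Obj`, `forward`, and `collect`, is an assumption of this sketch; to-space is modeled as a list that grows by appending.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a semi-space copying collector (Cheney-style scan).
class CopyingDemo {
    static final class Obj {
        final String name;
        final List<Obj> fields = new ArrayList<>();
        Obj forwardedTo; // forwarding pointer left behind in from-space
        Obj(String name) { this.name = name; }
    }

    // Copy obj into to-space on first visit; afterwards follow the forwarding pointer.
    static Obj forward(Obj obj, List<Obj> toSpace) {
        if (obj.forwardedTo == null) {
            Obj copy = new Obj(obj.name);
            copy.fields.addAll(obj.fields); // still from-space references; fixed by the scan
            obj.forwardedTo = copy;
            toSpace.add(copy);              // appending models bump allocation in to-space
        }
        return obj.forwardedTo;
    }

    // Returns the new to-space; roots are updated in place.
    static List<Obj> collect(List<Obj> roots) {
        List<Obj> toSpace = new ArrayList<>();
        for (int i = 0; i < roots.size(); i++)
            roots.set(i, forward(roots.get(i), toSpace));
        // Scan pointer chases the allocation pointer: to-space doubles as the work queue.
        for (int scan = 0; scan < toSpace.size(); scan++) {
            Obj o = toSpace.get(scan);
            for (int i = 0; i < o.fields.size(); i++)
                o.fields.set(i, forward(o.fields.get(i), toSpace));
        }
        return toSpace;
    }

    static List<Obj> demo() {
        Obj a = new Obj("a"), b = new Obj("b"), c = new Obj("c");
        a.fields.add(b);
        b.fields.add(a); // a reachable cycle
        c.fields.add(a); // c itself is unreachable and is never visited
        List<Obj> roots = new ArrayList<>();
        roots.add(a);
        return collect(roots);
    }

    public static void main(String[] args) {
        for (Obj o : demo()) System.out.println(o.name); // a, then b
    }
}
```

Only the reachable objects are ever touched: the garbage object `c` is neither visited nor copied, and its space is recovered wholesale when the semi-spaces swap roles.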

Slide 75

Slide 75 text

Copying Collection
■ Pros
  – Bump allocation
  – Traverse only live objects
  – Can be used by parallel GCs
  – Increase locality?
■ Cons
  – Requires twice the memory (at least during collection)
  – Copies all live objects in each collection
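Bump allocation, listed among the pros, works because the free part of a semi-space is always one contiguous region after a collection. A minimal sketch, with hypothetical names and integer addresses:

```java
// Hypothetical sketch of bump allocation inside one contiguous region.
class BumpAllocator {
    private final int limit; // end of the region
    private int top;         // next free address

    BumpAllocator(int size) {
        this.limit = size;
    }

    // Allocation is a bounds check plus a pointer increment.
    int allocate(int size) {
        if (top + size > limit) return -1; // region full: a collection would be needed
        int addr = top;
        top += size;
        return addr;
    }

    // After a copying collection the whole old region becomes free at once.
    void reset() {
        top = 0;
    }

    public static void main(String[] args) {
        BumpAllocator space = new BumpAllocator(64);
        System.out.println(space.allocate(16)); // 0
        System.out.println(space.allocate(8));  // 16
    }
}
```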