Slide 1

Control Theory and Concurrent Garbage Collection: A Deep Dive Into the Go GC Pacer
Madhav Jivrajani

Slide 2

$ whoami
● I get super excited about systems-y stuff
● Work @ VMware (Kubernetes Upstream)
● Within the Kubernetes community: SIG-{API Machinery, Scalability, Architecture, ContribEx}
  ○ Please reach out if you'd like to get started in the community!
● Doing Go stuff for ~3 years; I particularly love things around the Go runtime!

Slide 3

Agenda
● Control Theory
● A Typical Go Application
● The GC Pacer
● GC Pacer Prior to Go 1.18
● GC Pacer Since Go 1.18
● How Did This Affect a Kubernetes Release?
● Mitigating These Effects
● A Small Note on the Go 1.19 Soft Memory Limit

Slides 4-16 (diagram-only slides; no slide text)

Slide 17

● SV - Set Variable (what we hope to achieve)
● PV - Process Variable (the output of the system)
● Error - the difference between SV and PV

Slide 18 (diagram only; no slide text)

Slides 19-27

Transient and Steady State

Slide 28

Transient and Steady State

The lifetime of the controller can be viewed as a series of steady states strung together by a series of transient states.

Slide 29

However, it's often not this ideal. With the controller applying adjustments, the following questions come to mind:
● What if the adjustment applied overshoots or undershoots the SV?
● Can we take past experience into account and adjust accordingly - in other words, can we compensate?
● Can we look at our current state and predict what the state will be in the future?

Slide 30

Past, Present and Future: The PID Controller

Slide 31

P Controller
● P - Proportional: adjust proportionally to the error
● Advantages:
  ○ Easy to reason about
  ○ Minimal state to maintain
● Disadvantages:
  ○ Very prone to under- and over-shoot
  ○ Steady-state error does not converge to 0 ("proportional droop")

Slide 32

I Controller
● I - Integral: adjust based on what the error has been in the past
● Advantages:
  ○ Drives steady-state error to 0
● Disadvantages:
  ○ Prone to under- and over-shoot

Slide 33

D Controller
● D - Derivative: adjust based on how the error is changing (anticipate the future)
● Advantages:
  ○ Great for applying corrective actions
  ○ Speeds up the time taken to reach steady state
● Disadvantages:
  ○ Highly sensitive to noise
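To make the roles of the three terms concrete, here is a minimal PID controller sketch in Go. It is purely illustrative: the type, gains, and the toy "plant" in main are made up for this example and are not taken from the Go runtime.

package main

import "fmt"

// PID is a toy controller combining the P, I and D terms above.
type PID struct {
    Kp, Ki, Kd float64 // proportional, integral, derivative gains
    integral   float64 // accumulated past error (the "I" memory)
    prevErr    float64 // previous error (used for the "D" slope)
}

// Update returns the adjustment to apply, given the set variable (SV)
// and the process variable (PV) observed this tick.
func (c *PID) Update(sv, pv, dt float64) float64 {
    err := sv - pv                  // Error = SV - PV
    c.integral += err * dt          // I: what the error has been
    deriv := (err - c.prevErr) / dt // D: how the error is changing
    c.prevErr = err
    return c.Kp*err + c.Ki*c.integral + c.Kd*deriv
}

func main() {
    c := &PID{Kp: 0.5, Ki: 0.1, Kd: 0.05}
    pv := 0.0
    for i := 0; i < 5; i++ {
        pv += c.Update(10.0, pv, 1.0) // drive PV toward SV = 10
        fmt.Printf("tick %d: pv=%.2f\n", i, pv)
    }
}

Setting Ki and Kd to 0 gives the plain P controller; the I term is what lets the steady-state error converge to 0.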

Slide 34

What does a typical Go application look like?

Slides 35-42 (diagram-only slides; no slide text)

Slide 43

It's time to clean up! How can we go about doing that?

Slide 44

Hmm… One way could be to stop the application and do a sweep.

Slides 45-47 (diagram-only slides; no slide text)

Slide 48

Another way could be to do things concurrently…

Slide 49 (diagram only; no slide text)

Slide 50

Go does GC this way!

Slide 51

But…

Slide 52

The application is still allocating!

Slide 53

So, how does Go do GC?

Slide 54

Interlude

Slide 55

Considering that mutators and collectors will run together at some point, there is a fundamental tradeoff involved. Let's consider the following:

Slides 56-59 (diagram-only slides; no slide text)

Slide 60

Over a period of time (and under normal circumstances), this tradeoff translates to:
● More collector CPU usage => more time spent doing GC work => potentially lower memory footprint for that time period.
  ○ But: less time given to the application => higher application latencies.
● More mutator CPU usage => more time given to the application => potentially lower application latencies for that time period.
  ○ But: less time spent doing GC work => higher memory footprint.

Slides 61-62 (diagrams illustrating this tradeoff)

Slide 63

GOGC

Slides 64-67

GOGC
● Let H_m(n) be the size of the objects that were marked live after cycle n.
  ○ "Objects" is intentionally vague for now!
● Let H_g(n) be the value to which we are willing to let the memory footprint grow before we start a GC cycle.
  ○ Or, in other words, the heap goal.

H_g(n) = H_m(n-1) x [1 + GOGC/100]

Slide 68

GOGC
"The GOGC variable sets the initial garbage collection target percentage. A collection is triggered when the ratio of freshly allocated data to live data remaining after the previous collection reaches this percentage." - https://pkg.go.dev/runtime
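As a worked example of the formula above, here is a small sketch (heapGoal is a made-up helper, not a runtime API; the numbers are illustrative):

package main

import "fmt"

// heapGoal computes H_g(n) = H_m(n-1) x [1 + GOGC/100].
func heapGoal(liveBytes, gogc uint64) uint64 {
    return liveBytes + liveBytes*gogc/100
}

func main() {
    // With 512 MiB marked live after the previous cycle and GOGC=100,
    // the heap may grow to 1024 MiB before the next cycle is triggered.
    fmt.Println(heapGoal(512<<20, 100)>>20, "MiB") // prints: 1024 MiB
}

The same knob can also be changed at runtime via runtime/debug.SetGCPercent.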

Slide 69

Interlude: GC Mark Assists
Let's take a look at our app again.

Slides 70-71 (diagram-only slides; no slide text)

Slides 72-73

● If a goroutine starts allocating faster than the GC can keep up, it can lead to unbounded heap growth.
● To deal with this, the GC asks this goroutine to assist with marking.

Slide 74

Mark assists are a reactive measure to keep memory usage stable in unstable conditions, at the expense of CPU time.

Slide 75

The question we now need to try and answer is: when do we start a GC cycle?

Slides 76-82

When do we start a GC cycle?
● Prior to Go 1.5, the GC was a parallel, stop-the-world (STW) collector.
  ○ No allocations happened during GC.
  ○ We could simply start a cycle once the heap size reached H_g(n).
● But now, mutators and collectors run concurrently.
  ○ Allocations happen during the concurrent marking phase.
  ○ How do we still respect H_g(n)?
  ○ We need to start early! How early? That's a question for the GC Pacer.

Slide 83

The GC Pacer has 2 fundamental goals:
1. Maintain a target GC CPU utilization.
2. Get the size of the live heap as close to the heap goal as possible.

Slides 84-87

Note:
● It helps to think of the Pacer as a level-triggered system as opposed to an edge-triggered one.
  ○ The Pacer concerns itself with a macro view of the system, and cares about how behaviour aggregates over a period of time.
  ○ It does not concern itself with moment-to-moment, individual allocations.
● Instead, it expects the application to settle on some state: ✨ the steady state ✨
● The goals of the pacer are defined for this steady state.

Slide 88

✨ Steady State ✨
● Constant allocation rate
● Constant heap size
● Constant heap composition
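As a toy illustration (hypothetical, not from the talk), a workload like the following exhibits such a steady state: every tick it replaces one fixed-size buffer, so the allocation rate, heap size and heap composition all stay roughly constant.

package main

import "time"

func main() {
    live := make([][]byte, 64) // a constant-size window of live memory
    for i := 0; ; i = (i + 1) % len(live) {
        live[i] = make([]byte, 1<<20) // drop one 1 MiB buffer, allocate another
        time.Sleep(10 * time.Millisecond)
    }
}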

Slide 89

GC Pacer Prior To Go 1.18

Slides 90-96

GC Pacer Prior To Go 1.18
Optimization Goal 1: Target CPU Utilization

Slides 97-104

GC Pacer Prior To Go 1.18
Optimization Goal 2: Get the size of the live heap as close to the heap goal as possible.
● From our earlier GOGC discussion: H_g(n) = H_m(n-1) x [1 + GOGC/100]
● H_m(n-1) here is the amount of heap memory marked live after cycle n-1.
● We do not take into account other sources of GC work.
  ○ Assume they are negligible compared to the heap.

Slides 105-106

GC Pacer Prior To Go 1.18
● H_t is the size of the heap at which we trigger a GC cycle.
● We determine this value by using our optimization goals as "guides".
  ○ Or, more formally, as constraints.
● We know where we are currently and we know where we'd like to be - given this, how do we compute H_t?

The Go GC Pacer made use of a proportional (P) controller for this.

Slides 107-110

GC Pacer Prior To Go 1.18
● How does the controller work?
  ○ We could just adjust the trigger point based on how much the heap over- or under-shot the goal:
    ■ H_g - sizeOfHeapEndOfCycle
    ■ But this does not take into account our CPU utilization goal.
    ■ So, instead, we ask the following:

Slides 111-114

GC Pacer Prior To Go 1.18
● How does the controller work?
  ○ Assuming that we are at the goal utilization, how much would the heap have grown since the last cycle?
  ○ If we are at double the utilization:
    ■ This is probably because we did double the scan work (through dedicated mark workers or assists),
    ■ which implies the heap grew to twice the size it was expected to (the heap goal),
    ■ which means we should try to start the cycle earlier next time.

Slide 115

GC Pacer Prior To Go 1.18
We are essentially trying to determine a trigger point such that we optimize both of our goals.

Slides 116-117

GC Pacer Prior To Go 1.18
● How does the controller work?
  ○ If the heap does end up overshooting:
    ■ There should be a maximum amount by which this can happen.
    ■ This is defined as the "hard" goal.

The hard goal is defined as 1.1x the heap goal (e.g. with a 1 GiB heap goal, the heap may grow to at most 1.1 GiB).

Slide 118

GC Pacer Prior To Go 1.18
Let's Talk About Assists

Slides 119-121

GC Pacer Prior To Go 1.18
● How does the controller work?
  ○ Ideally, in the steady state, we should not have any mark assists.
  ○ Due to the way the error term of the P controller is defined, it can go to 0 even when our optimization goals are not met.
    ■ If this persists, it can trick the controller into thinking that all's good - because look! No error!

Slides 122-123

GC Pacer Prior To Go 1.18
● How does the controller work?
  ○ Telling whether a 0 error is due to our goals being met (or not) means asking: "are we under-pacing, over-pacing, or pacing on point?"
  ○ To know the answer to this question, we need to actually perform GC assists.
    ■ This is where the 5% extension comes from!

Slide 124

GC Pacer Prior To Go 1.18
So, that sounds great and everything, but what's the downside?

Slide 125

GC Pacer Prior To Go 1.18
Downside 1: When non-heap sources of work are not negligible.

Slides 126-131

GC Pacer Prior To Go 1.18
Downside 2: When GOGC is large, changes to the live heap size cause excessive assists.
● When GOGC is large, there is a lot of runway in terms of how much the heap can grow.
● If, at some point during the cycle, all of that memory turns out to be live:
  ○ And we have the hard heap goal to adhere to:
    ■ To try and meet it, the rate of assists will skyrocket, starving mutators.
    ■ And recovering from this can itself take a while!

Slide 132

GC Pacer Prior To Go 1.18
Downside 3: The steady-state error of a P controller will never converge to 0.

Slide 133

GC Pacer Prior To Go 1.18
Downside 4: Mark assists are part of the steady state (the 5% extension on top of the 25% goal).

Slide 134

GC Pacer Prior To Go 1.18
Downside 1: When non-heap sources of work are not negligible.
Downside 2: When GOGC is large, changes to the live heap size cause excessive assists.
Downside 3: The steady-state error of a P controller will never converge to 0.
Downside 4: Mark assists are part of the steady state (30% total utilization).

Slide 135

The GC Pacer Redesign!
GC Pacer Since Go 1.18

Slides 136-140

GC Pacer Since Go 1.18
Major Trends:
1. Include non-heap sources of work in pacing decisions.
2. Re-frame the pacing decision as a "search problem".
3. Use a PI controller.
4. Change the target CPU utilization to 25%.

Slides 141-145

GC Pacer Since Go 1.18
Include Non-Heap Sources of Work in Pacing Decisions
● Previously, we considered only the heap as a source of GC work.
● Now, we also include non-heap sources of work, namely stacks (S_n) and globals (G_n):

H_g(n) = [H_m(n-1) + S_n + G_n] x [1 + GOGC/100]

● This essentially changes the expected behaviour of GOGC!
● Most programs with default GOGC values are likely to end up using more memory.
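A quick, illustrative comparison of the old and new formulas (all numbers are made up; this is not runtime code):

package main

import "fmt"

func main() {
    // Illustrative numbers: 64 MiB live heap, 8 MiB of goroutine
    // stacks, 2 MiB of globals, GOGC=100.
    live, stacks, globals := uint64(64<<20), uint64(8<<20), uint64(2<<20)
    const gogc = 100

    oldGoal := live * (1 + gogc/100)                      // pre-1.18: 128 MiB
    newGoal := (live + stacks + globals) * (1 + gogc/100) // 1.18+:   148 MiB
    fmt.Printf("old goal: %d MiB, new goal: %d MiB\n", oldGoal>>20, newGoal>>20)
}

Same GOGC, same live heap - but the goal (and hence peak memory) grows by the stacks and globals terms.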

Slide 146

GC Pacer Since Go 1.18
Let's take a step back for a minute.

Slides 147-151

GC Pacer Since Go 1.18
The GC Pacer knows 2 notions of time:
● T_a - the time taken to allocate after the trigger.
● T_s - the time taken to perform GC work.
● Ideally, the pacer needs these to "complete" in the same amount of time.
● In the steady state, the amount of GC work is going to be roughly constant.

Slides 152-153

GC Pacer Since Go 1.18
● Continuing from this:
  ○ Our application can spend its time either on itself (which could include allocating) or on doing GC work.
  ○ So, these 2 notions of time can be thought of as "bytes allocated" and "bytes scanned".

Slides 154-156

GC Pacer Since Go 1.18
● Considering that these 2 notions of time need to "complete" at the same time…
● If B_a is bytes allocated and B_s is bytes scanned (heap, stacks and globals included), the amount we scan is going to be proportional to the amount we allocate:

B_a = r x B_s, i.e. r = B_a / B_s

● r acts as a conversion factor between these 2 notions of time.

Slide 157

GC Pacer Since Go 1.18
r = B_a / B_s
● Subsequently, we'd like these 2 notions of time to "complete" in the same amount of time while maintaining the target CPU utilization.
● To achieve this, we scale r:

r = [B_a / B_s] x K(u_T, u_n)

Slides 158-159

GC Pacer Since Go 1.18
● If we know what our goal is, and we somehow know how many bytes will be allocated in a GC cycle, we can reliably calculate when to start a GC cycle:

T_n = H_g - r x B_s

Slides 160-161

GC Pacer Since Go 1.18
T_n = H_g - r x B_s
● Intuitively, the size of the live heap (A) when we start a GC cycle will always be greater than (or, in some extreme cases, equal to) T_n:

A ≥ T_n => A ≥ H_g - r x B_s
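A minimal sketch of this condition (the helper and the numbers are illustrative, not the runtime's code):

package main

import "fmt"

// shouldTrigger reports whether the live heap A has crossed the
// trigger condition A ≥ H_g - r x B_s.
func shouldTrigger(a, heapGoal, r, scanWork float64) bool {
    return a >= heapGoal-r*scanWork
}

func main() {
    heapGoal, r, scanWork := 1024.0, 1.0, 200.0            // MiB; r is dimensionless
    fmt.Println(shouldTrigger(900, heapGoal, r, scanWork)) // true: 900 ≥ 824
}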

Slide 162

GC Pacer Since Go 1.18
A ≥ H_g - r x B_s
● This is a condition, not a predetermined trigger point as before.
● We now have a search space formulated by a condition that encapsulates both of our optimization goals!

Slides 163-164

GC Pacer Since Go 1.18
A ≥ H_g - r x B_s

Slide 165

GC Pacer Since Go 1.18
● Since we know H_g and the amount of scan work, we need to search for a value of r such that we trigger at the right point.
● This turns our pacing problem from an optimization problem into a search problem.

Slide 166 (diagram only)

Slides 167-170

GC Pacer Since Go 1.18
How do we calculate r over a GC cycle?

r = [B_a / B_s] x K(u_T, u_n)

● At the end of a GC cycle, B_a is PeakLiveHeap - Trigger
  ○ Or, in other words, the amount we have allocated since the cycle started.
● This value can be calculated only at the end of the cycle, and it is what the value of r should have been in order to meet our target.
● In the steady state, we would expect the next GC cycle to be similar to this one; if that is true, it stands to reason that we can use this value of r for the next cycle.

Slides 171-174

GC Pacer Since Go 1.18
How do we calculate r over a GC cycle?
● It turns out that using this value directly is very noisy, and we might end up missing the target.
● Instead, we search for a more "ideal" r value in the long run, using a PI controller with the measured r value (as discussed) as its set point.
● The controller might bounce around a little, but the value it bounces around will probably be a better r value than the raw one we would have used.

What does that look like?
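A rough sketch of what such a PI-driven search for r could look like (the gains and structure are made up; the runtime's actual controller is more involved):

package main

import "fmt"

// piController nudges its output toward a set point using a
// proportional and an integral term.
type piController struct {
    kp, ki   float64
    integral float64
    value    float64 // the current smoothed r
}

// next takes the r measured at the end of a cycle (the set point)
// and returns the smoothed r to use for the next cycle.
func (c *piController) next(measured float64) float64 {
    err := measured - c.value
    c.integral += err
    c.value += c.kp*err + c.ki*c.integral
    return c.value
}

func main() {
    c := &piController{kp: 0.5, ki: 0.1, value: 1.0}
    for _, m := range []float64{1.4, 0.9, 1.3, 1.0, 1.2} { // noisy measurements
        fmt.Printf("r = %.3f\n", c.next(m))
    }
}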

Slide 175

GC Pacer Since Go 1.18
GODEBUG=gcpacertrace=1
● This outputs a handful of metrics internal to the pacer, such as the r value, the amount of scan work to be done, etc.

Slide 176 (diagram only)

Slides 177-180

GC Pacer Since Go 1.18
● Due to this way of doing things:
  ○ We reframe our pacing problem and no longer suffer from the controller getting saturated due to the P-only error term.
  ○ Which means we no longer need mark assists in the steady state.
  ○ And we don't require the 5% extension - the goal utilization can be reduced to 25%!
  ○ This potentially means better application latencies as well!

Slide 181

GC Pacer Since Go 1.18
If we run the garbage benchmark (garbage -benchtime 30s -benchmem 512) and collect execution traces, the minimum mutator utilization curve for Go 1.17 looks like:

Slide 182

GC Pacer Since Go 1.18
And for Go 1.18, the minimum mutator utilization curve:

Slide 183

GC Pacer Since Go 1.18
On scalability tests run against 5000-node Kubernetes clusters, Pod Startup Latency seemed to improve significantly after shifting to Go 1.18. (Thank you Antoni and Marseel from SIG Scalability for helping out with this picture!)

Slide 184

GC Pacer Since Go 1.18
Okay, phew… so far in our discussion of the redesign, we've spoken about:
● Including non-heap sources of work in pacing decisions
● Reframing the problem as a search problem
● Making use of a PI controller for this search problem

Slide 185

GC Pacer Since Go 1.18
So far, this helps us mitigate the following downsides that we previously had:
● P-only controller disadvantages.
● Cases where non-heap sources of work are significant.
● We also reduce the goal utilization to 25%, potentially improving application latencies.

Slide 186

GC Pacer Since Go 1.18
Okay, but what about assists?

Slides 187-190

GC Pacer Since Go 1.18
● Assists come into play when we find more GC work than expected (non-heap sources included).
● The worst case is when all scannable memory turns out to be live.
● Previously, we always assumed the worst case was likely to happen, and bounded heap growth to 1.1x of the heap goal (arbitrarily).
  ○ But when we overshoot this hard limit (in cases where we have a large GOGC and the runway to do so) and the worst case actually happens, the rate of assists skyrockets, starving mutators.

Slides 191-194

GC Pacer Since Go 1.18
● But if we have all this live memory, the next GC cycle is going to use at least this much memory anyway!
● So why panic and ramp up assists? Let's let it slide for now and keep our rate of assists calm and smooth.
● But we cannot let this "deferring" shoot up the heap goal of the next cycle either.
● Let H_L be the size of the original live heap.

Slide 195

GC Pacer Since Go 1.18
● In the steady state, the heap goal for the current cycle would be: [1 + GOGC/100] x H_L
● And the heap goal for the next cycle would be: [1 + GOGC/100] x [1 + GOGC/100] x H_L

Slide 196

GC Pacer Since Go 1.18
[1 + GOGC/100] x [1 + GOGC/100] x H_L
● Assuming GOGC = 100, the worst-case memory usage of the next cycle would be 4x the size of the original live heap.
● Maintaining this invariant, we extend the hard heap goal of this cycle to the worst-case heap goal of the next cycle.
● We allow using more memory in the current cycle, because the next cycle is going to use at least this much extra memory anyway.
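Spelling out the arithmetic (illustrative numbers):

package main

import "fmt"

func main() {
    hL := 100.0        // MiB of original live heap (illustrative)
    g := 1 + 100.0/100 // growth factor for GOGC = 100
    // The extended hard goal equals the worst-case heap goal of the
    // next cycle: g x g x H_L.
    fmt.Printf("extended hard goal: %.0f MiB (%.0fx H_L)\n", hL*g*g, g*g) // 400 MiB (4x)
}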

Slide 197

GC Pacer Since Go 1.18
[1 + GOGC/100] x [1 + GOGC/100] x H_L
● This shields us from skyrocketing mark assist rates, but in the worst case, program memory consumption could spike up to 4x (for GOGC = 100) the size of the original live heap.

Slide 198

GC Pacer Since Go 1.18
● For the sake of robustness, in some truly worst-case scenarios, we bound this scenario too, to 1.1x of the worst-case goal.
  ○ So, in these scenarios, program memory could spike up to: 1.1 x [1 + GOGC/100] x [1 + GOGC/100] x H_L

Slide 199

How Did These Changes Affect A Kubernetes Release?

Slides 200-202

How Did These Changes Affect A Kubernetes Release?
● Kubernetes runs scalability tests on clusters of different sizes (100, 500, 5000 nodes).
● When the change to Go 1.18 was made:
  ○ All clusters experienced a noticeable increase in memory consumption.
  ○ Tests running on 5000 nodes experienced a sharp drop in Pod Startup Latencies.
    ■ Pod Startup Latency = the time from when a pod is created to when all of its containers are reported as started, observed via a watch.
● The learnings from this are generally applicable to all Go programs, not just Kubernetes!

Slides 203-205

How Did These Changes Affect A Kubernetes Release?
● In the 5000-node cluster, the memory footprint of the kube-apiserver increased by at least 10%, causing a release-blocking scalability regression.
● This increased footprint was due to the pacer redesign - specifically, due to the change in the meaning of GOGC.
● The pacer tries to bring the peak live heap size as close as possible to the heap goal of that cycle.
  ○ Previously, the heap goal factored in only heap sources of work.
  ○ Now it also considers stacks and globals, increasing the heap goal by some amount.
● In general, a Go program will experience a noticeable increase in memory usage after switching to Go 1.18 if its non-heap memory is non-negligible compared to its heap memory.

Slides 206-207

How Did These Changes Affect A Kubernetes Release?
The solution to mitigate this is to adjust GOGC accordingly!

Note: Kubernetes does not set GOGC or any other runtime variables.

Slide 208

Mitigating Increased Memory Usage Due To Pacer Redesign
By adjusting GOGC, we can return to our previous memory consumption. But what do we set GOGC to?

Slide 209

Mitigating Increased Memory Usage Due To Pacer Redesign
● If M_o is the old memory consumption and M_n is the new, increased memory consumption, an approximation of the new GOGC can be obtained from:

M_o x [1 + GOGC_old / 100] = M_n x [1 + GOGC_new / 100]

● We then solve for GOGC_new.

Slide 210

Mitigating Increased Memory Usage Due To Pacer Redesign
M_o x [1 + GOGC_old / 100] = M_n x [1 + GOGC_new / 100]
● M_n can also be derived from M_o if we know how much heap and non-heap memory we are using after making the switch to Go 1.18 or higher:

M_n = [1 + non-heap/heap] x M_o
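Solving that equation for GOGC_new, as a small sketch (newGOGC is a made-up helper and the numbers are illustrative):

package main

import "fmt"

// newGOGC solves M_o x [1 + GOGC_old/100] = M_n x [1 + GOGC_new/100]
// for GOGC_new.
func newGOGC(mOld, mNew, gogcOld float64) float64 {
    return (mOld*(1+gogcOld/100)/mNew - 1) * 100
}

func main() {
    // E.g. memory consumption rose from 10 GiB to 11 GiB at GOGC=100:
    fmt.Printf("GOGC_new ≈ %.0f\n", newGOGC(10, 11, 100)) // ≈ 82
}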

Slide 211

A Note On Go 1.19 Soft Memory Limit

Slide 212

A Note On Go 1.19 Soft Memory Limit
● Go 1.19 introduced a limit on the total amount of memory the Go runtime can use (to help mitigate out-of-memory errors and GC workarounds).
● The pacer ties in tightly with this limit, since it has to decide when to start a GC cycle (now also trying to respect this limit).
● The pacer sets the heap goal to the minimum of our previous definition and the heap limit derived from GOMEMLIMIT.
● This change also limits GC CPU consumption to 50% (compromising on meeting the heap goal in some cases); beyond that, the GC gives time back to the application in order to prevent "death spirals".
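A minimal example of setting the soft limit programmatically - runtime/debug.SetMemoryLimit is the actual Go 1.19 API, while the 4 GiB figure is illustrative; the same limit can be set with the GOMEMLIMIT environment variable (e.g. GOMEMLIMIT=4GiB):

package main

import (
    "fmt"
    "runtime/debug"
)

func main() {
    prev := debug.SetMemoryLimit(4 << 30) // 4 GiB soft limit
    fmt.Println("previous limit:", prev)  // math.MaxInt64 if never set
}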

Slide 213

References
● GC Pacer Redesign Design Proposal
● Original GC Pacer Design Proposal
● golang/go#42430 (GC Pacer meta-issue)
● golang/go#14951 (tracking aggressive assist rates)
● Separate Soft and Hard Heap Limit Design Proposal
● On release-branch.go1.18: src/runtime/{mgc.go, mgcpacer.go, proc.go}
● Pod Startup Latency SLO
● Kubernetes Performance Dashboard
● A Guide to the Go Garbage Collector
● Commit message of the change implementing the redesign
● A Parallel, Real-Time Garbage Collector
● Garbage Collection Semantics
● Go GC: Solving the Latency Problem
● Loop Preemption in Go 1.14
● Golang Garbage Collection Benchmarks
● Introduction to Control Theory and Its Applications to Computing Systems
● Control Theory in Container Fleet Management
● Feedback Control for Computer Systems

Slide 214

Thank you!
Twitter: @MadhavJivrajani
Gophers/K8s/CNCF Slack: @madhav