Slide 1

Slide 1 text

Go inliner: past, present and the future
Backend
Iskander (Alex) Sharipov
Go contributor, open source enthusiast
DevFest 2018, Novosibirsk, 23-25 November, 2018

Slide 2

Slide 2 text

The only slide
Problem: function calls are slow.
Solution: just inline them.
Fin. Problem solved.

Slide 3

Slide 3 text

The only slide (not really)
*Inlined everything*
Now we have two problems:
1. Overall performance degraded.
2. Binary became larger.

Slide 4

Slide 4 text

I thought I solved the problem

Slide 5

Slide 5 text

High-level overview

Slide 6

Slide 6 text

High-level overview
Inlining dichotomy
No inlining makes code run slow.
Too much inlining also makes code run slow, and it makes binaries bigger.

Slide 7

Slide 7 text

High-level overview

Slide 8

Slide 8 text

High-level overview

Slide 9

Slide 9 text

High-level overview

Slide 10

Slide 10 text

High-level overview
Main parts of the inliner
1) Cost model. Estimates the approximate “code size” of the function body.
2) Decision making. Tells whether a function is inlineable in general or at a particular call site.
3) Inlining algorithm and trade-off handling. How inlining is performed and when it’s not supported.

Slide 11

Slide 11 text

Inliner: the good parts
(Before we start discussing everything else)

Slide 12

Slide 12 text

Inliner: the good parts
The compiler can inline function calls across package boundaries. This makes packages that provide convenience wrappers around inlineable functions almost free.

Slide 13

Slide 13 text

Inliner: the good parts

func canInlineFuncLitCall() int {
    add1 := func(x int) int { return x + 1 }
    return add1(10) // Inlined
}
// Optimized to return 11

Slide 14

Slide 14 text

Inliner: the good parts

func canInlineClosureCall() int {
    x := 10
    add1 := func() int { return x + 1 }
    return add1() // “return 11”
}

Slide 15

Slide 15 text

Inliner: the good parts

func canInlineExplicitPanic() {
    panic("can be inlined")
}
// Implicit panics are also
// handled. They can occur in
// slice indexing expressions.

Slide 16

Slide 16 text

Inliner: the good parts
Mid-stack inlining (inlining of non-leaf calls)
We almost have it...
Still causes significant code bloating. But it gets better.
https://golang.org/issue/19348

Slide 17

Slide 17 text

Function “cost” calculation

Slide 18

Slide 18 text

Function “cost” calculation
Cost is a function over syntax
More syntactic elements mean a higher cost. Sometimes code can be rewritten so that it has a lower cost but behaves the same way.
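To make "cost is a function over syntax" concrete, here is a minimal sketch that counts AST nodes in a function body using the standard go/ast package. It is only a rough proxy: the real compiler walks its own internal representation and weighs node kinds differently. The names roughCost, verbose and terse are made up for this illustration.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// roughCost counts the AST nodes in the body of the named function.
// This is an illustrative proxy only, not the compiler's cost model.
func roughCost(src, name string) (int, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "example.go", src, 0)
	if err != nil {
		return 0, err
	}
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok || fn.Name.Name != name || fn.Body == nil {
			continue
		}
		cost := 0
		ast.Inspect(fn.Body, func(n ast.Node) bool {
			if n != nil { // Inspect also calls back with nil after each subtree
				cost++
			}
			return true
		})
		return cost, nil
	}
	return 0, fmt.Errorf("function %q not found", name)
}

// Two functions with the same behavior but a different amount of syntax.
const src = `package p

func verbose(x int) int {
	var result int
	if x > 0 {
		result = x
	} else {
		result = -x
	}
	return result
}

func terse(x int) int {
	if x > 0 {
		return x
	}
	return -x
}
`

func main() {
	for _, name := range []string{"verbose", "terse"} {
		cost, err := roughCost(src, name)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s: %d nodes\n", name, cost)
	}
}
```

Both functions compute the same result, yet the terse one is built from fewer nodes, which is the kind of rewrite the slide hints at.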

Slide 19

Slide 19 text

Function “cost” calculation

Slide 20

Slide 20 text

Function “cost” calculation

Slide 21

Slide 21 text

Function “cost” calculation
Static vs dynamic cost
The inliner is very static. Once the cost is calculated and it turns out to be over the budget, the function is never an inlining candidate.

Slide 22

Slide 22 text

Function “cost” calculation
Static approach disadvantage
Even if a function could be further optimized, or even constant-folded, when called with constant arguments, it just won’t happen.

Slide 23

Slide 23 text

Function “cost” calculation

xs = append(xs, 1) // cost=5
x := 1             // cost=5

// Builtin costs are weird.
//
// Budget of 80 with appends
// leads to >1kb of x86 code!

Slide 24

Slide 24 text

Function “cost” calculation
The cost of wrapping
A forwarding function does not have any execution-time cost, but every inlining increases the total cost by introducing extra AST nodes.

Slide 25

Slide 25 text

Function “cost” calculation
Inlining dichotomy
It looks like there are so many things to improve...
Yet, most of these paths lead to performance regressions.
https://golang.org/issue/17566

Slide 26

Slide 26 text

“Extra budget” experiment

Slide 27

Slide 27 text

“Extra budget” experiment
Hypothesis 1/2
Inlining matters more inside loops. Constant arguments also make a call a better inlining candidate.

Slide 28

Slide 28 text

“Extra budget” experiment
Hypothesis 2/2
A conditional branch that is not known to be “likely taken” makes inlining less beneficial.

Slide 29

Slide 29 text

“Extra budget” experiment
Proposed solution
Increase the maximum inlining budget. The budget starts from the base value and is adjusted as we follow the program’s control flow.

Slide 30

Slide 30 text

“Extra budget” experiment

func f(x, y int) {
    // Does something smart.
}
// Imagine that f has a cost of
// 95. Base budget is 80, so we
// can’t inline it normally.

Slide 31

Slide 31 text

“Extra budget” experiment

// Base budget is 80.
for _, x := range xs {     // 90
    for _, y := range ys { // 100
        f(x, y)            // Can inline!
    } // budget is 90 again
} // budget is 80 again

Slide 32

Slide 32 text

“Extra budget” experiment

for _, x := range xs {     // 90
    for _, y := range ys { // 100
        if x + y < 100 {   // 90
            f(x, y)        // Won’t inline
        } // budget is 100 again
    } // budget is 90 again
} // budget is 80 again

Slide 33

Slide 33 text

“Extra budget” experiment

// Constant arguments effect.
var x, y int
f(x, y)   // 80, won’t inline
f(10, y)  // 90, won’t inline
f(10, 20) // 100, can inline!
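The budget arithmetic from these examples can be written down as a small sketch. It assumes, based only on the numbers shown on the slides, a step of 10 per enclosing loop, per constant argument, and per enclosing branch that is not known to be likely taken. The callSite type and both function names are hypothetical; this is not the code of the actual experiment.

```go
package main

import "fmt"

const (
	baseBudget = 80 // base budget quoted on the slides
	step       = 10 // adjustment size implied by the slide examples (assumption)
)

// callSite describes the context of one call (hypothetical type).
type callSite struct {
	loopDepth   int // number of enclosing loops
	branchDepth int // enclosing branches not known to be "likely taken"
	constArgs   int // number of constant arguments
}

// budget returns the adjusted inlining budget for a call site.
func budget(cs callSite) int {
	return baseBudget + step*cs.loopDepth - step*cs.branchDepth + step*cs.constArgs
}

// canInline reports whether a callee of the given cost fits the budget.
func canInline(cost int, cs callSite) bool {
	return cost <= budget(cs)
}

func main() {
	const fCost = 95 // the cost of f assumed on the slides
	sites := []struct {
		name string
		cs   callSite
	}{
		{"plain call", callSite{}},
		{"two nested loops", callSite{loopDepth: 2}},
		{"two loops and an if", callSite{loopDepth: 2, branchDepth: 1}},
		{"one constant argument", callSite{constArgs: 1}},
		{"two constant arguments", callSite{constArgs: 2}},
	}
	for _, s := range sites {
		fmt.Printf("%-24s budget=%d inline=%v\n", s.name, budget(s.cs), canInline(fCost, s.cs))
	}
}
```

The sketch keeps the decision purely local to the call site, which matches the "simple to implement" property claimed for the experiment.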

Slide 34

Slide 34 text

“Extra budget” experiment
Pros & Cons
Pros: simple to implement, doesn’t make compilation slower, avoids code bloating.
Cons: the real benefits, apart from the microbenchmarks, are not apparent. It doesn’t seem to work.

Slide 35

Slide 35 text

Recent improvements
(The present)

Slide 36

Slide 36 text

Recent improvements

var total, i int
loop:
    if i == len(xs) {
        return total
    }
    total += xs[i]
    i++
    goto loop

Slide 37

Slide 37 text

Recent improvements

func countSum(xs []int) int {
    var total int
    for _, x := range xs {
        total += x
    }
    return total
}
// Inlineable with CL148777.

Slide 38

Slide 38 text

Recent improvements
Less inlining in big functions
The budget drops from 80 to 20 for very large functions.
See https://golang.org/cl/125516.

Slide 39

Slide 39 text

Recent improvements

func wrapper(x int) int {
    return nonLeafCall(x)
}
// Inlineable with CL147361.
// Functions with a single non-leaf call
// inside bodies can be inlineable now.
// This is a part of mid-stack enabling
// work that is done by David Chase.

Slide 40

Slide 40 text

Potential improvements
(The future?)

Slide 41

Slide 41 text

Potential improvements
Additional inlining round(s)
Some functions can become trivially inlineable after SSA optimizations. A second inlining round could inline them.

Slide 42

Slide 42 text

Potential improvements
Arg-based specializations
Collect control-flow specializations for constant arguments if they fit the budget. This could solve issues like #27149.

Slide 43

Slide 43 text

Potential improvements
More special-casing for idioms
Constructor functions, for example, can benefit from being inlined, since inlining can let more of the created objects be stack-allocated.
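A small sketch of the constructor case, with made-up names (point, newPoint, manhattan). Whether the allocation really ends up on the stack depends on the compiler version and on what the caller does with the value, so treat the comments as the expected tendency rather than a guarantee.

```go
package main

import "fmt"

type point struct{ x, y int }

// newPoint is a typical small constructor. Compiled as a standalone
// function, it returns a pointer that outlives the call, so the value
// has to be heap-allocated.
func newPoint(x, y int) *point {
	return &point{x: x, y: y}
}

func abs(v int) int {
	if v < 0 {
		return -v
	}
	return v
}

// manhattan only uses the point locally. If newPoint is inlined here,
// escape analysis can see that p never leaves this function and may
// keep it on the stack.
func manhattan(x, y int) int {
	p := newPoint(x, y)
	return abs(p.x) + abs(p.y)
}

func main() {
	fmt.Println(manhattan(3, -4)) // prints 7
}
```

Building with -gcflags=-m shows both the inlining and the escape analysis decisions for code like this.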

Slide 44

Slide 44 text

Fundamental problems

Slide 45

Slide 45 text

Fundamental problems
Even within one architecture, like AMD64, the optimal inliner parameters can vary. Not to mention how different they are for other architectures (instruction cache is one of the concerns).

Slide 46

Slide 46 text

Fundamental problems
The lack of compile flags demands “best fit defaults”. Compile time is important in Go, so we can’t perform expensive operations (and the user can’t ask for them).

Slide 47

Slide 47 text

Fundamental problems
Calling convention
If the Go calling convention changes from stack-based to register-based, we should reconsider some inliner heuristics.

Slide 48

Slide 48 text

Pragmatic hints
(Don’t follow blindly)

Slide 49

Slide 49 text

Pragmatic hints
Don’t rely on inlining (if you can)
If possible, avoid even thinking about inlining and how it will affect your application’s performance. It may be irrelevant.

Slide 50

Slide 50 text

Pragmatic hints
Write idiomatic Go code
Inlining heuristics may rely on very widely used Go idioms to make the right choice.

Slide 51

Slide 51 text

Pragmatic hints
Manual function specialization
Split big CPU-intensive functions into several smaller (inlineable) ones and call the suitable specialization at the call site.
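A minimal sketch of the idea, with hypothetical names throughout. Here apply stands in for a big general-purpose function; in this toy form it is small, so assume its real counterpart would be over the inlining budget. The specializations add and mul are tiny and are chosen by the caller.

```go
package main

import "fmt"

type op int

const (
	opAdd op = iota
	opMul
)

// apply stands in for a big general-purpose function. Assume, for the
// sake of the example, that the real one is too costly to inline.
func apply(o op, a, b int) int {
	switch o {
	case opAdd:
		return a + b
	case opMul:
		return a * b
	}
	panic("unknown op")
}

// add and mul are hand-written specializations of apply. Each is small
// enough to be a plausible inlining candidate.
func add(a, b int) int { return a + b }
func mul(a, b int) int { return a * b }

// dotGeneral goes through the general function on every iteration.
// It assumes len(ys) >= len(xs).
func dotGeneral(xs, ys []int) int {
	total := 0
	for i := range xs {
		total = apply(opAdd, total, apply(opMul, xs[i], ys[i]))
	}
	return total
}

// dotSpecialized picks the specializations at the call site instead.
// It assumes len(ys) >= len(xs).
func dotSpecialized(xs, ys []int) int {
	total := 0
	for i := range xs {
		total = add(total, mul(xs[i], ys[i]))
	}
	return total
}

func main() {
	xs, ys := []int{1, 2, 3}, []int{4, 5, 6}
	fmt.Println(dotGeneral(xs, ys), dotSpecialized(xs, ys)) // prints "32 32"
}
```

The caller knows which operation it needs, so it can skip the dispatch entirely and leave the compiler with small functions to inline into the hot loop.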

Slide 52

Slide 52 text

Pragmatic hints
Write inlining tests
If it is crucial for some functions to be inlineable, add a test for that.
github.com/Quasilyte/inltest

Slide 53

Slide 53 text

Pragmatic hints
You can forbid function inlining.

//go:noinline
func coldPathFunc() T {
    // Inlineable, but never
    // called from a hot path.
}

Slide 54

Slide 54 text

Pragmatic hints
You can debug inliner decisions.

$ go tool compile -m=2 foo.go
cannot inline coldPathFunc: marked go:noinline

Also works with “go build”:

$ go build -gcflags='-m=2' foo.go
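The same diagnostics can back a home-grown inlining check. The sketch below scans text in the style of the -m output for "can inline" and "cannot inline" messages. The helper names and the sample diagnostics (file positions, add1) are invented for illustration, and the exact message format may differ between Go versions. This is not how inltest itself is implemented.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// diagName returns the function name that follows marker on the line,
// for example "add1" in "./foo.go:3:6: can inline add1 with cost 4".
func diagName(line, marker string) (string, bool) {
	i := strings.Index(line, marker)
	if i < 0 {
		return "", false
	}
	rest := line[i+len(marker):]
	if j := strings.IndexAny(rest, " :"); j >= 0 {
		rest = rest[:j]
	}
	return rest, true
}

// inlineStatus scans compiler diagnostics and reports whether fn was
// marked inlineable. found is false when no diagnostic mentions fn.
func inlineStatus(diagnostics, fn string) (inlineable, found bool) {
	sc := bufio.NewScanner(strings.NewReader(diagnostics))
	for sc.Scan() {
		line := sc.Text()
		if name, ok := diagName(line, "cannot inline "); ok && name == fn {
			return false, true
		}
		if name, ok := diagName(line, "can inline "); ok && name == fn {
			return true, true
		}
	}
	return false, false
}

// sample imitates the -m output; positions and names are made up.
const sample = `./foo.go:3:6: can inline add1
./foo.go:8:6: cannot inline coldPathFunc: marked go:noinline
`

func main() {
	for _, fn := range []string{"add1", "coldPathFunc", "missing"} {
		inlineable, found := inlineStatus(sample, fn)
		fmt.Printf("%s: inlineable=%v found=%v\n", fn, inlineable, found)
	}
}
```

In a real test, the diagnostics would come from running the build with -gcflags=-m and capturing its output; the test then fails if a function that must stay inlineable is reported otherwise.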

Slide 55

Slide 55 text

Thanks for your attention
Iskander (Alex) Sharipov
Go contributor, open source enthusiast
@quasilyte