Internal Architecture of Delve

Slide 1

Slide 1 text

Internal Architecture

Slide 2

Slide 2 text

What is This
● Delve is:
  – A symbolic debugger for Go https://github.com/derekparker/delve
  – Used by Goland IDE, VSCode Go, vim-go (and others)
● This talk will:
  – give a general overview of delve’s architecture
  – explain why other debuggers have difficulties with Go programs

Slide 3

Slide 3 text

Table of Contents
● Assembly Basics
● Architecture of Delve
● Implementation of some Delve features

Slide 4

Slide 4 text

Assembly Basics

Slide 5

Slide 5 text

CPU
● Computers have CPUs
● CPUs have registers, in particular:
  – “Program Counter” (PC): address of the next instruction to execute
    ● also known as Instruction Pointer, IP
  – “Stack Pointer” (SP): address of the “top” of the call stack
● CPUs execute assembly instructions that look like this:
  MOVQ DX, 0x58(SP)

Slide 6

Slide 6 text

Call Stack
● Stores arguments, local variables and return address of a function call

Locals of runtime.main
Ret. address
Locals of main.main
Arguments of main.f
Ret. address
Locals of main.f   <- SP

Slide 7

Slide 7 text

Call Stack
Goroutine 1 starts by calling runtime.main

Locals of runtime.main   <- SP

Dotted box: space allocated for the stack
Solid box: space in use

Slide 8

Slide 8 text

Call Stack
runtime.main calls main.main by pushing a return address on the stack

Locals of runtime.main
Ret. address   <- SP

Slide 9

Slide 9 text

Call Stack
main.main pushes its local variables on the stack

Locals of runtime.main
Ret. address
Locals of main.main   <- SP

Slide 10

Slide 10 text

Call Stack
When main.main calls another function (main.f), it:
• pushes the arguments of main.f on the stack
• pushes the return address on the stack

Locals of runtime.main
Ret. address
Locals of main.main
Arguments of main.f
Ret. address   <- SP

Slide 11

Slide 11 text

Call Stack
Finally main.f pushes its local variables on the stack

Locals of runtime.main
Ret. address
Locals of main.main
Arguments of main.f
Ret. address
Locals of main.f   <- SP

Slide 12

Slide 12 text

Threads and Goroutines
● M:N threading / green threads
  – M goroutines are scheduled cooperatively on N threads
  – N initially equal to $GOMAXPROCS (by default the number of CPU cores)
● Unlike threads, goroutines:
  – are scheduled cooperatively
  – have a stack that starts small and grows/shrinks during execution

Slide 13

Slide 13 text

Threads and Goroutines
● When a Go function is called
  – it checks that there is enough space on the stack for its local variables
  – if there is not enough space, runtime.morestack_noctxt is called
  – runtime.morestack_noctxt allocates more space for the stack
  – if the memory area below the current stack is already in use, the stack is copied somewhere else in memory and then expanded
● Goroutine stacks can move in memory
  – debuggers normally assume stacks don’t move

Slide 14

Slide 14 text

Architecture of Delve

Slide 15

Slide 15 text

Architecture of Delve UI Layer Symbolic Layer Target Layer

Slide 16

Slide 16 text

Architecture of a Symbolic Debugger
UI Layer
Symbolic Layer: knows about line numbers, types, variable names, etc.
Target Layer: controls target process, doesn’t know anything about your source code.

Slide 17

Slide 17 text

Features of the Target Layer
● Attach/detach from target process
● Enumerate threads in the target process
● Can start/stop individual threads (or the whole process)
● Receives “debug events” (thread creation/death and, most importantly, a thread stopping on a breakpoint)
● Can read/write the memory of the target process
● Can read/write the CPU registers of a stopped thread
  – actually these are the CPU registers saved in the thread descriptor of the OS scheduler

Slide 18

Slide 18 text

Target Layer in Delve (1)
● We have 3 implementations of the target layer:
  – pkg/proc/native: controls target process using OS API calls, supports:
    ● Windows: WaitForDebugEvent, ContinueDebugEvent, SuspendThread...
    ● Linux: ptrace, waitpid, tgkill...
    ● macOS: notification/exception ports, ptrace, mach_vm_region...
  – default backend on Windows and Linux

Slide 19

Slide 19 text

Target Layer in Delve (2) ● Second implementation of Target Layer: – pkg/proc/core: reads linux_amd64 core files

Slide 20

Slide 20 text

Target Layer in Delve (3)
● We have 3 (but really 5) implementations of the target layer:
  – pkg/proc/gdbserial: used to connect to:
    ● debugserver on macOS (default setup on macOS)
    ● lldb-server
    ● Mozilla RR (a time travel debugger backend, only works on linux/amd64)
  – The name comes from the protocol it speaks, the Gdb Remote Serial Protocol
    ● https://sourceware.org/gdb/onlinedocs/gdb/Remote-Protocol.html
    ● https://github.com/llvm-mirror/lldb/blob/master/docs/lldb-gdb-remote.txt

Slide 21

Slide 21 text

About debugserver
● pkg/proc/gdbserial connected to debugserver is the default target layer for macOS
● Two reasons:
  – the native backend uses undocumented APIs and never worked properly
  – the kernel APIs used by the native backend are restricted and require a signed executable
    ● distributing a signed executable as an open source project is problematic
    ● users often got the self-signing process wrong

Slide 22

Slide 22 text

Symbolic Layer
Diagram: UI / Symbolic Layer / Target Layer; Code -> Compiler/Linker -> Executable File (debug symbols) -> Symbolic Layer
● Does its job by opening the executable file and reading the debug symbols that the compiler wrote
● The format of the debug symbols for Go is DWARFv4: http://dwarfstd.org/

Slide 23

Slide 23 text

DWARF Sections (1)
● Defines many sections:
  debug_info, debug_types, debug_loc, debug_ranges, debug_line, debug_pubnames, debug_pubtypes, debug_aranges, debug_macinfo, debug_frame, debug_str, debug_abbrev

Slide 24

Slide 24 text

DWARF Sections (1)
debug_info, debug_types, debug_loc, debug_ranges, debug_line, debug_pubnames, debug_pubtypes, debug_aranges, debug_macinfo, debug_frame, debug_str, debug_abbrev
● The important ones:
  – debug_line: a table mapping instruction addresses to file:line pairs
  – debug_frame: stack unwind information
  – debug_info: describes all functions, types and variables in the program

Slide 25

Slide 25 text

debug_info example (1)

package main

type Point struct {
	X, Y int
}

func NewPoint(x, y int) Point {
	p := Point{x, y}
	return p
}

Slide 26

Slide 26 text

debug_info example (2)

Subprogram
  Name: main.NewPoint
  Lowpc: 0x452e60
  Highpc: 0x452ea8
  FormalParameter
    Name: x
    Type: 0x1f5ad
    Location: call_frame_cfa
  FormalParameter
    Name: y
    Type: 0x1f5ad
    Location: fbreg+0x8
  Variable
    Name: p
    Type: 0x29e1d
    Location: fbreg+0x10

BasicType (0x1f5ad)
  Name: int
  Encoding: signed
  ByteSize: 8

StructType (0x29e1d)
  Name: main.Point
  ByteSize: 16
  Member
    Name: X
    DataMemberLoc: 0
    Type: 0x1f5ad
  Member
    Name: Y
    DataMemberLoc: 8
    Type: 0x1f5ad

Legend: "reference" arrows link a Type offset to the entry it names; "child" arrows link an entry to its nested entries.
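To make the dump above concrete, here is a minimal Go sketch of how a symbolic layer could turn those entries into addresses. The `Member` and `Variable` types, the `memberAddr` helper and the frame base value are all hypothetical, introduced only for illustration; they are not Delve's data structures. The offsets (fbreg+0x10 for p, DataMemberLoc 0 and 8 for X and Y) come from the slide.

```go
package main

import "fmt"

// Member mirrors a DWARF Member entry: a field name, its offset inside
// the struct (DataMemberLoc) and its size in bytes.
type Member struct {
	Name   string
	Offset uint64
	Size   uint64
}

// Variable mirrors a Variable entry whose location is "frame base +
// FrameOffset" (fbreg+N in the dump). Real offsets are signed; this
// sketch only handles non-negative ones.
type Variable struct {
	Name        string
	FrameOffset uint64
	Members     []Member
}

// memberAddr returns the address of one struct member for a given frame base.
func memberAddr(frameBase uint64, v Variable, name string) (uint64, bool) {
	for _, m := range v.Members {
		if m.Name == name {
			return frameBase + v.FrameOffset + m.Offset, true
		}
	}
	return 0, false
}

func main() {
	// p as described on the slide: fbreg+0x10, with X at 0 and Y at 8.
	p := Variable{Name: "p", FrameOffset: 0x10, Members: []Member{{"X", 0, 8}, {"Y", 8, 8}}}
	frameBase := uint64(0xc000049f00) // hypothetical frame base
	for _, m := range p.Members {
		addr, _ := memberAddr(frameBase, p, m.Name)
		fmt.Printf("p.%s: %d bytes at %#x\n", m.Name, m.Size, addr)
	}
}
```

The point of the sketch is that debug_info never stores absolute addresses for locals: everything is expressed relative to the frame, so the same entries work for every call of the function.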

Slide 27

Slide 27 text

Stacktraces
● Get the list of instruction addresses
  – 0x4519c9, 0x451a00, 0x426450, 0x44c021
● Look up debug_info to find the name of the function
● Look up debug_line to find the source line corresponding to the instruction

2 0x00000000004519c9 in main.f at ./panicy.go:4
3 0x0000000000451a00 in main.main at ./panicy.go:8
4 0x0000000000426450 in runtime.main at /usr/local/go/src/runtime/proc.go:198
5 0x000000000044c021 in runtime.goexit at /usr/local/go/src/runtime/asm_amd64.s:2361
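The same two-step idea (collect addresses, then resolve each one to a function and a file:line) can be tried from inside a Go program with the standard `runtime.Callers` and `runtime.CallersFrames` functions. This is only an analogy: the Go runtime resolves addresses with its own tables rather than by reading debug_info and debug_line as a debugger does. The `trace` and `f` helpers are made up for this sketch.

```go
package main

import (
	"fmt"
	"runtime"
)

// trace collects the instruction addresses of the current call stack and
// resolves each one to a function name and a file:line pair.
func trace() []string {
	pcs := make([]uintptr, 32)
	n := runtime.Callers(1, pcs) // skip=1: start at trace itself
	frames := runtime.CallersFrames(pcs[:n])
	var out []string
	for {
		fr, more := frames.Next()
		out = append(out, fmt.Sprintf("%#x in %s at %s:%d", fr.PC, fr.Function, fr.File, fr.Line))
		if !more {
			break
		}
	}
	return out
}

// f exists only to add one more frame to the trace.
func f() []string { return trace() }

func main() {
	for i, line := range f() {
		fmt.Println(i, line)
	}
}
```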

Slide 28

Slide 28 text

Stacktraces (2)
● If functions had no local variables or arguments this would be easy
● A stack trace would be the value of the PC register
● Followed by the return addresses read from the stack, starting at SP

Ret. address of main.f   <- SP
Ret. address of main.main
Ret. address of runtime.main

Slide 29

Slide 29 text

debug_frame
● A table giving you the size of the current stack frame given the address of an instruction
  – Actually has many more features, but that’s the only thing you need for pure Go

Locals of runtime.main
Arguments of main.main
Ret. address
Locals of main.main
Arguments of main.f
Ret. address
Locals of main.f   <- SP

● To create a stack trace:
  – start with
    ● PC_0 = the value of the PC register
    ● SP_0 = the value of the SP register
  – look up PC_i in debug_frame
    ● get the size of the current frame, sz_i
  – get the return address ret_i at SP_i + sz_i - 8
  – repeat the procedure with
    ● PC_{i+1} = ret_i
    ● SP_{i+1} = SP_i + sz_i
  – The stack trace is PC_0, PC_1, PC_2, ...
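The procedure above can be sketched in a few lines of Go over a simulated stack. Everything here is an assumption made for illustration: `unwind` and `example` are hypothetical names, memory is a map of 8-byte words, the debug_frame lookup is an exact-match map (a real table is indexed by address ranges), and the frame sizes and stack addresses are invented. Only the four PC values are taken from the earlier stack trace slide.

```go
package main

import "fmt"

// unwind follows the slide's procedure. mem maps a stack address to the
// 8-byte word stored there; frameSize stands in for the debug_frame lookup
// and maps a PC to the size of its frame, return address included.
func unwind(pc, sp uint64, mem, frameSize map[uint64]uint64) []uint64 {
	var trace []uint64
	for {
		trace = append(trace, pc)
		sz, ok := frameSize[pc]
		if !ok {
			return trace // no frame information: treat as the outermost frame
		}
		ret, ok := mem[sp+sz-8] // return address is the last word of the frame
		if !ok {
			return trace
		}
		pc, sp = ret, sp+sz
	}
}

// example builds a fake three-frame stack with invented sizes and addresses.
func example() []uint64 {
	frameSize := map[uint64]uint64{0x4519c9: 0x20, 0x451a00: 0x30, 0x426450: 0x58}
	mem := map[uint64]uint64{
		0x1018: 0x451a00, // SP_0 + 0x20 - 8
		0x1048: 0x426450, // SP_1 + 0x30 - 8
		0x10a0: 0x44c021, // SP_2 + 0x58 - 8
	}
	return unwind(0x4519c9, 0x1000, mem, frameSize)
}

func main() {
	for i, pc := range example() {
		fmt.Printf("%d %#x\n", i, pc)
	}
}
```

Note that the loop never inspects the contents of a frame other than its return address, which is why a table of frame sizes is sufficient.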

Slide 30

Slide 30 text

Symbolic Layer in Delve ● mostly pkg/proc ● support code in pkg/dwarf and stdlib debug/dwarf

Slide 31

Slide 31 text

Actual Architecture of Delve (1) UI Layer Symbolic Layer Target Layer

Slide 32

Slide 32 text

Actual Architecture of Delve (1) UI Layer Symbolic Layer Target Layer This is a Lie

Slide 33

Slide 33 text

Actual Architecture of Delve (2)
UI Layer
Service Layer
  JSON RPC
Service Layer
Symbolic Layer
Target Layer
This makes embedding Delve into other programs easier

Slide 34

Slide 34 text

User Interfaces for Delve
● Built-in command line prompt
● Plugins
  – Atom plugin https://github.com/lloiser/go-debug
  – Emacs plugin https://github.com/benma/go-dlv.el/
  – Vim-go https://github.com/fatih/vim-go
  – VS Code Go https://github.com/Microsoft/vscode-go
● IDE
  – JetBrains Goland IDE https://www.jetbrains.com/go
  – LiteIDE https://github.com/visualfc/liteide
● Standalone GUI debuggers
  – Gdlv https://github.com/aarzilli/gdlv

Slide 35

Slide 35 text

Actual Architecture of Delve (3)
UI Layer: pkg/terminal
Service Layer: service/...
  JSON RPC
Service Layer: service/...
Symbolic Layer: pkg/proc
Target Layer: pkg/proc/native, pkg/proc/core, pkg/proc/gdbserial

Slide 36

Slide 36 text

Implementation of some Delve features

Slide 37

Slide 37 text

Variable Evaluation (on the way down)
UI Layer: print a
  -> EvalExpression(“a”) -> EvalExpression(“a”)
Symbolic Layer: determines address and size of a using debug_info
  -> ReadMemory(0xc000049f38, 8)
Target Layer

Slide 38

Slide 38 text

Variable Evaluation (on the way up)
Target Layer
  -> []byte{ 0x01, 0x00, 0x00… }
Symbolic Layer
  -> Variable{ Address: 0xc000049f38, Name: “a”, Type: “int”, Value: 1, ... }
UI Layer: a = int(1)
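As a rough sketch of the round trip, the Go program below reads 8 bytes from a fake target and decodes them as a little-endian signed integer, which is how an `int` is laid out on amd64. The `fakeMemory` type and `evalInt` function are hypothetical stand-ins for the target and symbolic layers, not Delve's API; only the address and the byte values come from the slides.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// fakeMemory stands in for the target layer: it maps an address to the
// bytes stored starting there.
type fakeMemory map[uint64][]byte

// ReadMemory returns size bytes starting at addr.
func (m fakeMemory) ReadMemory(addr uint64, size int) ([]byte, error) {
	buf, ok := m[addr]
	if !ok || len(buf) < size {
		return nil, fmt.Errorf("cannot read %d bytes at %#x", size, addr)
	}
	return buf[:size], nil
}

// evalInt plays the symbolic layer: it knows (from debug_info) that the
// variable is an 8-byte signed integer and decodes the raw bytes.
func evalInt(m fakeMemory, addr uint64) (int64, error) {
	buf, err := m.ReadMemory(addr, 8)
	if err != nil {
		return 0, err
	}
	return int64(binary.LittleEndian.Uint64(buf)), nil
}

func main() {
	mem := fakeMemory{0xc000049f38: {0x01, 0, 0, 0, 0, 0, 0, 0}}
	v, err := evalInt(mem, 0xc000049f38)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("a = int(%d)\n", v) // prints "a = int(1)"
}
```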

Slide 39

Slide 39 text

Variable Evaluation gdb vs delve

(dlv) print err1
error(*main.astruct) *{A: 1, B: 2}

(gdb) p err1
$1 = {tab = 0x4f4ca0 <*main.astruct,error>, data = 0xc00008c030}

(dlv) print ch1
chan int { qcount: 4, dataqsiz: 10, buf: *[10]int [1,4,3,2,0,0,0,0,0,0], ...

(gdb) print ch1
$5 = (void *) 0xc0000b2000

Slide 40

Slide 40 text

Creating Breakpoints
UI Layer: break main.f
  -> SetBreakpoint(FunctionName: “main.f”)
Symbolic Layer: looks up main.f in debug_info
  -> writeBreakpoint(0x452e60)
Target Layer
● The target layer overwrites the instruction at 0x452e60 with an instruction that, when executed, stops execution of the thread and makes the OS notify the debugger.
  – On Intel amd64 it’s the instruction INT 3, which is encoded as 0xCC
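A minimal sketch of the overwrite step, using a map as fake target memory. Saving the original byte so that it can be restored later is an assumption added here (it is common debugger practice but is not stated on the slide), and the `breakpoint` type, the two functions and the original byte value 0x48 are hypothetical.

```go
package main

import "fmt"

// int3 is the one-byte encoding of the INT 3 instruction on amd64.
const int3 = 0xCC

// breakpoint remembers the byte that was overwritten (assumption: the
// debugger needs it to restore the instruction later).
type breakpoint struct {
	addr uint64
	orig byte
}

// writeBreakpoint replaces the first byte of the instruction at addr with INT 3.
func writeBreakpoint(mem map[uint64]byte, addr uint64) (breakpoint, error) {
	orig, ok := mem[addr]
	if !ok {
		return breakpoint{}, fmt.Errorf("address %#x is not mapped", addr)
	}
	mem[addr] = int3
	return breakpoint{addr: addr, orig: orig}, nil
}

// clearBreakpoint puts the original byte back.
func clearBreakpoint(mem map[uint64]byte, bp breakpoint) {
	mem[bp.addr] = bp.orig
}

func main() {
	mem := map[uint64]byte{0x452e60: 0x48} // 0x48 is a made-up original byte
	bp, err := writeBreakpoint(mem, 0x452e60)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("with breakpoint: %#x\n", mem[0x452e60])
	clearBreakpoint(mem, bp)
	fmt.Printf("restored:        %#x\n", mem[0x452e60])
}
```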

Slide 41

Slide 41 text

Creating Breakpoints gdb vs delve

0x452eb0 MOVQ FS:0xfffffff8, CX
0x452eb9 CMPQ 0x10(CX), SP
0x452ebd JBE 0x452ee3
0x452ebf SUBQ $0x28, SP
0x452ec3 MOVQ BP, 0x20(SP)
0x452ec8 LEAQ 0x20(SP), BP
0x452ecd XORPS X0, X0
0x452ed0 MOVUPS X0, 0(SP)
0x452ed4 CALL main.NewPoint(SB)
0x452ed9 MOVQ 0x20(SP), BP
0x452ede ADDQ $0x28, SP
0x452ee2 RET
0x452ee3 CALL runtime.morestack_noctxt(SB)
0x452ee8 JMP main.main(SB)

(On the slide, “gdb” and “dlv” labels mark where each debugger places the breakpoint.)

● Instructions in red on the slide (the stack check at the top and the runtime.morestack_noctxt call at the bottom) are the stack-split prologue
  – checks if the function needs more stack and calls runtime.morestack if it does
● A breakpoint set on the function’s entry point will be hit twice when the stack is resized, giving the impression that the function was executed twice

Slide 42

Slide 42 text

Continue (on the way down)
UI Layer: continue
  -> Continue()
Symbolic Layer
  -> ContinueOnce()
Target Layer
● ContinueOnce resumes all threads and waits for a debug event

Slide 43

Slide 43 text

Continue (on the way up)
Target Layer: returns value of PC register for all threads
Symbolic Layer: list of running goroutines with their file:line position, the function they are executing and which breakpoint they are stopped at, if any
UI Layer: > main.main() ./main.go:200 (PC: 0x4a3277)

Slide 44

Slide 44 text

Mapping Goroutines to Threads
● Each goroutine is described by a runtime.g struct
● All g structs are saved into runtime.allgs
● The goroutine running on a given thread is stored in the thread’s local storage (TLS)
  – Actual implementation varies depending on GOOS and GOARCH
    ● linux/amd64: FS:0xfffffff8
    ● windows/amd64: GS:0x28
    ● macOS/amd64: GS:0x8a0 or GS:0x30 (starting with go1.11)

type g struct {
	stack       stack
	stackguard0 uintptr
	stackguard1 uintptr
	_panic      *_panic // innermost panic - offset known to liblink
	_defer      *_defer // innermost defer
	...
	goid        int64
	...
}

Slide 45

Slide 45 text

Conditional Breakpoints
● A breakpoint that should stop the execution of the program only when a boolean condition is true
● Setting them is the same as setting normal breakpoints
● When ContinueOnce (target layer) returns:
  – Continue (symbolic layer) evaluates the condition(s) associated with (all) the current breakpoint(s)
  – if it’s true Continue returns
  – otherwise ContinueOnce is called again.
● Optimizations are possible
  – Peter B. Kessler. 1990. Fast Breakpoints: Design and Implementation. PLDI '90 Proceedings of the ACM SIGPLAN 1990 conference on Programming language design and implementation. Pages 78-84
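The Continue/ContinueOnce loop described above can be sketched against a simulated target. All names here (`stop`, `countdownTarget`, `continueUntil`) are hypothetical and only model the control flow; the fake target hits the breakpoint with n = 10, 9, 8, ... and the condition is n == 7.

```go
package main

import "fmt"

// stop is a simulated debug event: the breakpoint was hit while the
// target's variable n had the given value.
type stop struct{ n int }

// countdownTarget fakes the target layer. Each call to the returned
// function behaves like ContinueOnce: it reports the next breakpoint hit,
// or false once the fake process has exited.
func countdownTarget(start int) func() (stop, bool) {
	n := start
	return func() (stop, bool) {
		if n < 0 {
			return stop{}, false
		}
		s := stop{n: n}
		n--
		return s, true
	}
}

// continueUntil plays the symbolic layer's Continue: it keeps resuming the
// target until the condition holds or the process exits. It returns the
// final stop, how many times the target was resumed, and whether the
// condition was met.
func continueUntil(continueOnce func() (stop, bool), cond func(stop) bool) (stop, int, bool) {
	resumes := 0
	for {
		s, alive := continueOnce()
		resumes++
		if !alive {
			return stop{}, resumes, false
		}
		if cond(s) {
			return s, resumes, true
		}
	}
}

func main() {
	s, resumes, hit := continueUntil(countdownTarget(10), func(s stop) bool { return s.n == 7 })
	fmt.Println(s.n, resumes, hit) // prints "7 4 true"
}
```

Every failed condition costs a full stop/resume cycle of the target, which is the overhead that the optimizations cited above aim to reduce.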

Slide 46

Slide 46 text

Step Over ● Executes one line of source code, “steps over” function calls ● Also known as “next”

Slide 47

Slide 47 text

Wrong “next” strategy, step 0

package main

func fib(n int) int {
	if n == 0 {
		return 1
	}
	if n == 1 {
		return 1
	}
	a := fib(n-1)
	b := fib(n-2)
	return a+b
}

func main() {
	r := fib(10)
	println(r)
}

Slide 48

Slide 48 text

Wrong “next” strategy, step 1
Set a breakpoint on every line of the current function

package main

func fib(n int) int {
	if n == 0 {
		return 1
	}
	if n == 1 {
		return 1
	}
	a := fib(n-1)
	b := fib(n-2)
	return a+b
}

func main() {
	r := fib(10)
	println(r)
}

Slide 49

Slide 49 text

Wrong “next” strategy, step 2
Set a breakpoint on the return address of the current frame

package main

func fib(n int) int {
	if n == 0 {
		return 1
	}
	if n == 1 {
		return 1
	}
	a := fib(n-1)
	b := fib(n-2)
	return a+b
}

func main() {
	r := fib(10)
	println(r)
}

Slide 50

Slide 50 text

Wrong “next” strategy, step 3 ● Set a breakpoint on the first deferred function ● Call continue

Slide 51

Slide 51 text

Wrong “next” strategy, bug 1: Can’t handle concurrency

package main

func fib(n int) int {
	if n == 0 {
		return 1
	}
	if n == 1 {
		return 1
	}
	a := fib(n-1)
	b := fib(n-2)
	return a+b
}

func main() {
	for i := 1; i < 10; i++ {
		go func() {
			r := fib(i)
			println(r)
		}()
	}
}

Slide 52

Slide 52 text

Wrong “next” strategy, bug 2: Can’t handle recursion

package main

func fib(n int) int {
	if n == 0 {
		return 1
	}
	if n == 1 {
		return 1
	}
	a := fib(n-1)
	b := fib(n-2)
	return a+b
}

func main() {
	r := fib(10)
	println(r)
}

Slide 53

Slide 53 text

Better “next” strategy
● Set a breakpoint on every line of the current function
  – condition: stay on the same goroutine & stack frame
● Set a breakpoint on the return address of the current frame
  – condition: stay on the same goroutine & previous stack frame
● Set a breakpoint on the most recently deferred function
  – condition: stay on the same goroutine & check that it was called through a panic
● Call Continue

Slide 54

Slide 54 text

Better “next” strategy gdb vs. delve
● gdb doesn’t know about defer
● gdb doesn’t know about goroutines
● gdb can’t check that we didn’t change stack frame
  – goroutine stacks will move when resized
  – gdb assumes stacks always stay in the same place

Slide 55

Slide 55 text

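Implementing “next” checks
● “same goroutine” check:
  – read the goid field of the runtime.g struct on the current thread
● “same frame” check:
  – compare the value of SP + current_frame_size - g.stack.stackhi
    ● where g is the runtime.g struct for the current thread
    ● this is the frame’s offset from the top of the stack, so it stays the same when the stack is moved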
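The two checks can be combined into a single comparable key, as in the sketch below. The `g` struct here is a simplified stand-in holding only the two fields the checks need, `frameKey` and `keyFor` are hypothetical names, and all addresses and sizes are invented. The example shows that the key survives a stack move but changes for a deeper (recursive) frame.

```go
package main

import "fmt"

// g is a simplified stand-in for runtime.g with only the fields used here.
type g struct {
	goid    int64
	stackhi uint64
}

// frameKey identifies a frame by goroutine id and by the frame's offset
// from the top of the goroutine stack.
type frameKey struct {
	goid   int64
	offset int64
}

// keyFor computes SP + current_frame_size - stackhi for the given goroutine.
// The offset is negative because the stack grows towards lower addresses.
func keyFor(gr g, sp, frameSize uint64) frameKey {
	return frameKey{goid: gr.goid, offset: int64(sp + frameSize - gr.stackhi)}
}

func main() {
	// The frame before the stack is resized.
	before := keyFor(g{goid: 1, stackhi: 0xc000050000}, 0xc00004ff00, 0x40)
	// The same frame after the stack was copied elsewhere: absolute
	// addresses changed, the distance from the top did not.
	after := keyFor(g{goid: 1, stackhi: 0xc000100000}, 0xc0000fff00, 0x40)
	// A deeper frame of the same function, as in a recursive call.
	deeper := keyFor(g{goid: 1, stackhi: 0xc000100000}, 0xc0000ffec0, 0x40)

	fmt.Println(before == after, before == deeper) // prints "true false"
}
```

Comparing absolute SP values instead would report a different frame after every stack move, which is the assumption the slides attribute to gdb.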

Slide 56

Slide 56 text

The End