Bare Metal Gophers: can you write an OS Kernel in Go?

Slides for my GolangUK '17 talk. Abstract: Go is a great language for building server applications but can you use it to write an OS kernel? Let's talk about the challenges involved in writing, compiling and linking Go code that runs in Ring-0 and code a simple "Hello World" demo.


Achilleas Anagnostopoulos

August 17, 2017


  1. Bare Metal Gophers: Can you write an OS Kernel in Go?

    Achilleas Anagnostopoulos, github.com/achilleasa
    Sr. Software Engineer, Geckoboard
  2. A little bit of theory: ring-based security

    [Diagram: protection rings from Ring-3 (normal code, userland) through Ring-1 to Ring-0 (OS Kernel); userland code enters the kernel via a syscall]
  3. Running Go applications at Ring-0

    • Why would we want to do this?
      ◦ Performance boost for some types of applications
      ◦ Exclusive access to (shared) resources
      ◦ Remove some of the abstraction layers between our code and the real hardware
    • What’s the easiest way to do this?
      ◦ Lots of interest around the concept of uni-kernels
      ◦ Bundle your Go app with a minimal OS-in-a-package
      ◦ Running on the cloud? How about a hypervisor-aware kernel?
  4. Is Go suitable for such a task?

    • Arguments against
      ◦ Go uses a GC
      ◦ You need to re-implement the entire runtime
    • Arguments in favor
      ◦ Compile-time checks
      ◦ Bounds-checking for slices
      ◦ Interfaces
  5. Let’s build something simple

    • A simple 32-bit hello world “Kernel” in Go!
      ◦ Target arch x86
      ◦ No paging
      ◦ Minimal ASM code for low-level initialization
    • Our host machine runs a 64-bit OS
      ◦ Go toolchain supports cross-compilation out of the box
    • Run on VirtualBox (or alternatively, qemu or real hardware)
  6. How do we load our kernel to memory?

    • We need to use a bootloader
      ◦ GRUB2
    • How does the kernel communicate with the bootloader?
      ◦ Kernel must define a special header that begins with a magic value
      ◦ Header must be defined in the first 4K of the kernel image

    We need to tell the linker where to place each section of the binary
    • We can do this with GNU ld
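The "special header that begins with a magic value" can be illustrated with a small sketch. It assumes the Multiboot 1 header layout (magic, flags, checksum), which GRUB2 can boot; the slides do not say which Multiboot version the talk uses, and the name `headerChecksum` is hypothetical. In that layout the three 32-bit fields must sum to zero:

```go
package main

import "fmt"

// Assumption: Multiboot 1 layout, where the header starts with three
// 32-bit fields: magic, flags, checksum.
const multibootMagic uint32 = 0x1BADB002

// headerChecksum (hypothetical helper) returns the value that makes
// magic + flags + checksum wrap around to zero in 32-bit arithmetic.
func headerChecksum(flags uint32) uint32 {
	return 0 - (multibootMagic + flags)
}

func main() {
	flags := uint32(0) // no optional features requested
	sum := headerChecksum(flags)
	fmt.Printf("magic=%#x flags=%#x checksum=%#x\n", multibootMagic, flags, sum)
}
```

In a real kernel these three values are emitted by the assembler into the `.multiboot_header` section rather than computed at run time.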
  7. What does a linker script look like?

        ENTRY(_rt0_entry)

        SECTIONS {
            . = 1M; /* load kernel at 1M */

            .multiboot : { *(.multiboot_header) }

            /* executable code */
            .text BLOCK(4K) : ALIGN(4K) { *(.text) }

            /* read-only data */
            .rodata BLOCK(4K) : ALIGN(4K) { *(.rodata) }

            /* read-write data (initialized) */
            .data BLOCK(4K) : ALIGN(4K) { *(.data) }

            /* read-write data (not initialized) */
            .bss BLOCK(4K) : ALIGN(4K) { *(COMMON) *(.bss) }
        }

    [Diagram: memory layout starting at 0x00; the kernel image, entered at _rt0_entry, is placed at 1 MB]
  8. The kernel is loaded to memory; what’s next?

    • There is no stack
    • Streaming SIMD Extensions (SSE) are disabled
      ◦ SIMD → Single Instruction; Multiple Data
      ◦ Allows us to perform one operation on multiple values at once
    • CPU is in 32-bit protected mode
  9. Accessing memory

    • Flat memory model
      ◦ A pointer (uintptr) indexes directly into one address range: 0x00, 0x01, 0x02, 0x03, 0x04, ... up to 4 GB
    • Segmented addressing
      ◦ An address is written as segment register plus offset, e.g. gs:0x02 → segment register gs, offset 0x02
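The difference between the two models can be sketched in Go. This is only a userland model under simplifying assumptions: the `segment` type and its `linear` method are hypothetical names, and a real segment descriptor also carries a limit and access flags that are ignored here.

```go
package main

import "fmt"

// segment is a hypothetical model of a segment descriptor that keeps
// only the base address.
type segment struct {
	base uint32
}

// linear translates a segment-relative offset into a linear address:
// base + offset.
func (s segment) linear(offset uint32) uint32 {
	return s.base + offset
}

func main() {
	flat := segment{base: 0}     // flat model: the offset is the address
	gs := segment{base: 0x1000}  // illustrative base for the gs segment

	fmt.Printf("flat 0x02 -> %#x\n", flat.linear(0x02))
	fmt.Printf("gs:0x02   -> %#x\n", gs.linear(0x02))
}
```

With a base of zero the two models coincide, which is why a flat pointer and an offset into a zero-based segment refer to the same byte.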
  10. What happens when a Go function runs?

    • A small bit of code (prologue) executes before the actual function
    • Stack Growth Check

    Code calls foo()
      - Fetch pointer to current g
      - If SP < g.stackguard0 { runtime.GrowStack() }
      - Jump to foo() code

    $GOROOT/src/runtime/runtime2.go:

        type g struct {
            stack       stack
            stackguard0 uintptr
            …
        }

        type stack struct {
            lo uintptr
            hi uintptr
        }
  11. Stack growth check: behind the scenes

        func foo() { print("hello") }

    GOOS=linux GOARCH=386 go build
    objdump -d output (Intel syntax):

        0808aae0 <main.foo>:
            mov ecx,DWORD PTR gs:0x0
            mov ecx,DWORD PTR [ecx-0x4]
            cmp esp,DWORD PTR [ecx+0x8]
            jbe 909ab19                    (grow stack)

    gs:0x00 → Ptr to Thread Control Block (TCB)
    [TCB - 0x04] → Ptr to g

    Current g:
        0x00 stack.lo
        0x04 stack.hi
        0x08 stackguard0

    What things could possibly go wrong here?
  12. Bootstrap code: allocate stack and initialize g

    • Reserve 16K in .bss (uninitialized data) section of the kernel image
      ◦ Load stack pointer with the address of the end of this block
    • Setup gs register according to the TLS ABI
    • Populate a g struct
      ◦ Runtime package defines a g instance called g0
      ◦ Set g0.stack.hi / g0.stack.lo to our stack block
      ◦ Set g0.stackguard0 = g0.stack.lo → bypass stack growth checks

    Now we can safely jump to the Go code
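To see why setting `g0.stackguard0 = g0.stack.lo` bypasses the growth check, here is a userland model of the prologue comparison. It is a sketch only: `needsGrowth` and `newBootG` are hypothetical names, the struct keeps just the fields shown on the slides, and the real bootstrap is done in assembly before any Go code runs.

```go
package main

import "fmt"

// Trimmed-down copies of the runtime types shown on the slides.
type stack struct {
	lo uintptr
	hi uintptr
}

type g struct {
	stack       stack
	stackguard0 uintptr
}

// needsGrowth models the cmp/jbe pair in the function prologue:
// the runtime is asked to grow the stack when SP <= stackguard0.
func needsGrowth(sp uintptr, gp *g) bool {
	return sp <= gp.stackguard0
}

// newBootG (hypothetical) fills in a g for a fixed stack block and
// returns it together with the initial stack pointer. The stack grows
// down, so SP starts at the end of the block.
func newBootG(base, size uintptr) (g, uintptr) {
	gp := g{stack: stack{lo: base, hi: base + size}}
	gp.stackguard0 = gp.stack.lo // guard sits at the very bottom
	return gp, gp.stack.hi
}

func main() {
	g0, sp := newBootG(0x100000, 16*1024) // illustrative address, 16K block
	fmt.Println("grow at initial SP:", needsGrowth(sp, &g0))
	fmt.Println("grow at stack.lo:  ", needsGrowth(g0.stack.lo, &g0))
}
```

As long as SP stays above the bottom of the reserved block the check never fires, so the prologue falls through to the function body without calling into the (not yet usable) runtime.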
  13. Overriding the Go build process

    • go build cross-compiles everything for us
      ◦ Including the Go runtime
    • We must perform a separate link step
    • Idea: intercept go build steps, tweak and execute them
      ◦ -n flag to go build outputs the build script for our package
  14. Let’s automate the work!

        GOARCH=386 GOOS=linux go build -n 2>&1 | sed \
            -e "1s|^|set -e\n|" \
            -e "1s|^|export GOOS=linux\n|" \
            -e "1s|^|export GOARCH=386\n|" \
            -e "1s|^|WORK='$(BUILD_ABS_DIR)'\n|" \
            -e "s|-extld|-linkmode=external -tmpdir='$(BUILD_ABS_DIR)' -extldflags='-nostdlib' -extld|g" \
            | sh
  15. Linking the final kernel image

    • Use GNU ld to link
      ◦ The object files produced by nasm
      ◦ The go.o file produced by the modified go build step
    • We invoke the linker and...

        build/arch/x86/asm/rt0.o: In function `_rt0_entry':
        arch/x86/asm/rt0.s:61: undefined reference to `runtime.g0'

    • Why did this happen?
      ◦ runtime.g0 is not a global symbol in go.o, so other object files cannot reference it
      ◦ Objcopy to the rescue!

        $ objcopy --globalize-symbol runtime.g0 \
            $(BUILD_DIR)/go.o $(BUILD_DIR)/go.o
  16. Success!

        $ ls -logh build/*.bin
        -rwxr-xr-x 1 1.0M Aug 17 11:40 build/kernel-x86.bin
  17. Screen output without an OS

    • VGA 80x25 text-mode → Linear framebuffer located at 0xb8000
      ◦ 2 bytes per character: the character byte followed by an attribute byte (‘H’ attr ‘i’ attr ...)
      ◦ R:0, C:0 → fb + 0
      ◦ R:0, C:79 → fb + 158
      ◦ R:1, C:0 → fb + 160
    • Attribute byte
      ◦ 4 bits each for fg / bg color
      ◦ 16 available colors
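The offsets listed for this slide follow from a simple formula: each cell takes 2 bytes and each row holds 80 cells. The sketch below uses a plain byte slice as a stand-in for the memory at 0xb8000 so it can run in userland; the function names are hypothetical, and placing the foreground color in the low nibble and the background in the high nibble is the standard VGA text-mode convention rather than something stated on the slide.

```go
package main

import "fmt"

const (
	cols = 80 // characters per row
	rows = 25 // rows on screen
)

// cellOffset returns the byte offset of the cell at (row, col):
// 2 bytes per cell, cols cells per row.
func cellOffset(row, col int) int {
	return (row*cols + col) * 2
}

// attr packs two 4-bit colors into one attribute byte:
// background in the high nibble, foreground in the low nibble.
func attr(fg, bg uint8) uint8 {
	return bg<<4 | fg&0x0f
}

// putChar stores a character and its attribute in the framebuffer.
func putChar(fb []byte, row, col int, ch byte, a uint8) {
	off := cellOffset(row, col)
	fb[off] = ch
	fb[off+1] = a
}

func main() {
	// Stand-in for the real framebuffer; a kernel would write to 0xb8000.
	fb := make([]byte, cols*rows*2)

	a := attr(0x7, 0x0) // light grey on black
	putChar(fb, 0, 0, 'H', a)
	putChar(fb, 0, 1, 'i', a)

	fmt.Println(cellOffset(0, 0), cellOffset(0, 79), cellOffset(1, 0))
}
```

In the kernel itself the slice allocation would not be available; the same arithmetic would be applied to a pointer or fixed-size array mapped over the framebuffer address.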
  18. Before we begin coding: limitations

    • Unsupported Go features
      ◦ Maps
      ◦ Interfaces
      ◦ Goroutines
      ◦ defer
    • Go memory allocator is not available
      ◦ Calls to the allocator will cause a triple-fault

    No variable should escape to the heap
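One way to work within these limits is to keep all scratch space in fixed-size arrays supplied by the caller. The sketch below formats a number that way; `formatUint` is a hypothetical helper, not code from the talk. Whether a given variable stays off the heap is decided by the compiler's escape analysis, which `go build -gcflags=-m` reports, so it is worth checking rather than assuming.

```go
package main

import "fmt"

// formatUint writes the decimal digits of v into buf and returns how
// many bytes were written. It uses only fixed-size arrays, so it does
// not need maps, interfaces, or slices that grow.
func formatUint(v uint32, buf *[10]byte) int {
	var tmp [10]byte // a uint32 has at most 10 decimal digits
	n := 0
	for {
		tmp[n] = byte('0' + v%10) // digits come out least significant first
		n++
		v /= 10
		if v == 0 {
			break
		}
	}
	// Reverse into the caller's buffer.
	for i := 0; i < n; i++ {
		buf[i] = tmp[n-1-i]
	}
	return n
}

func main() {
	var buf [10]byte
	n := formatUint(1234, &buf)
	// fmt is used only for this userland demo; it allocates and would
	// not be usable in the kernel at this stage.
	fmt.Println(string(buf[:n]))
}
```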
  19. Let’s write some code!

  20. How to overcome these limitations

    • Bootstrap the Go memory allocator
      ◦ Allocator is implemented entirely in Go
      ◦ Requires some OS-specific hooks
      ◦ Our Kernel must provide its own implementation for these hooks
    • Unlock more Go features
      ◦ Maps, interfaces and defer
      ◦ Package init() functions

    Can we actually DO this without triggering memory allocations?
  21. Bare Metal Gophers We enabled SSE. Let’s do some math!

  22. Final remarks

    Can you write an OS Kernel in Go? YES! But...
    • It is harder compared to other languages
    • Initial code must be designed to avoid memory allocations
    • Things get easier once the allocator is up and running

    Slides and code for this talk: https://github.com/achilleasa/bare-metal-gophers
    Bonus! A work-in-progress 64-bit Kernel using the concepts from this talk: https://github.com/achilleasa/gopher-os
    P.S. We are hiring: https://geckoboard.com/careers