Linux Kernel Library - A Library Version of Linux Kernel

Hajime Tazaki

February 02, 2020

Transcript

  1. Linux Kernel Library: A Library Version of Linux Kernel

    Hajime Tazaki (@thehajime), IIJ Research Laboratory. FOSDEM 2020: February 2020.
  2. Motivations

    Plenty of Linux kernel-like projects: WSL (Windows Subsystem for Linux), gVisor, Graphene, Noah. We don't wish to rewrite the Linux kernel.
  3. Motivations (cont'd)

    The Linux kernel is written in C. Our programs may also be written in C. Why not use the kernel code as a library? Quite similar motivation to the NetBSD rump kernel.
  4. Linux Kernel Library (LKL)

    A library (liblkl.{so,a}). Out-of-tree architecture (h/w-independent). Run Linux code in various ways with a reusable library. The h/w-dependent layer runs on Linux/Windows/FreeBSD/Android userspace, as a unikernel, on UEFI, and in a network simulator (ns-3). Code size: 2.4 KLoC (h/w independent), 6.6 KLoC (h/w dependent).
  5. Alternatives (userspace network stacks)

             year   lang    how            API     features                  original (if any)
    lwip     2001   C       src-embedded   custom  v4,v6,ipfwd,tcp           scratch
    Seastar  2014   C++17   static lib     custom  v4,tcp,dpdk               scratch
    OSv      2013   C++/C   static lib     POSIX   v4,tcp                    (freebsd)
    gVisor   2018   golang  go pkg         custom  v4,v6,tcp                 scratch
    mTCP     2014   C       static lib     custom  v4,tcp,dpdk               scratch
    rump     2007   C,asm   static/sh lib  POSIX   v4,v6,ipfwd,tcp           NetBSD
    Linux    1991   C,asm   (kernel)       POSIX   v4,v6,ipfwd,tcp,xdp?      Linux
    LKL      2007?  C,asm   static/sh lib  POSIX   v4,v6,ipfwd,tcp,dpdk      Linux
  6. LKL: internals

    Core design: outsource machine-dependent code; keep application and kernel code untouched. Components: 1. host backend (host_ops) 2. CPU independent arch. (arch/lkl) 3. application interface
  7. 1. host backend

    The environment-dependent part. Unifies an interface across different platforms (rump-hypercall like). Device interface with Virtio: block device <=> disk image; networking <=> TAP, raw socket, DPDK, VDE.
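
The idea of one interface over different platforms can be sketched as a table of function pointers that the portable code calls instead of touching the host directly. This is only an illustration: `struct demo_host_ops`, its members, and both backends below are hypothetical names, and LKL's real host_ops structure is larger and named differently.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical, trimmed-down operations table (not LKL's real host_ops). */
struct demo_host_ops {
    const char *name;
    void *(*mem_alloc)(size_t size);
    void (*mem_free)(void *ptr);
    unsigned long long (*time_ns)(void);
    void (*print)(const char *msg, size_t len);
};

/* Backend A: a hosted environment with a C library. */
unsigned long long demo_posix_time_ns(void)
{
    struct timespec ts;

    if (timespec_get(&ts, TIME_UTC) != TIME_UTC)
        return 0; /* clock unavailable */
    return (unsigned long long)ts.tv_sec * 1000000000ULL +
           (unsigned long long)ts.tv_nsec;
}

void demo_posix_print(const char *msg, size_t len)
{
    fwrite(msg, 1, len, stdout);
}

const struct demo_host_ops demo_posix_ops = {
    .name = "posix",
    .mem_alloc = malloc,
    .mem_free = free,
    .time_ns = demo_posix_time_ns,
    .print = demo_posix_print,
};

/* Backend B: same interface, but output is only counted, not shown. */
size_t demo_null_bytes;

void demo_null_print(const char *msg, size_t len)
{
    (void)msg;
    demo_null_bytes += len;
}

const struct demo_host_ops demo_null_ops = {
    .name = "null",
    .mem_alloc = malloc,
    .mem_free = free,
    .time_ns = demo_posix_time_ns,
    .print = demo_null_print,
};

/* Portable side: reaches the host only through the table it is given. */
int demo_boot(const struct demo_host_ops *ops)
{
    char *banner = ops->mem_alloc(64);
    int len;

    if (banner == NULL)
        return -1;
    len = snprintf(banner, 64, "booting on %s backend\n", ops->name);
    if (len > 0)
        ops->print(banner, (size_t)len);
    ops->mem_free(banner);
    return len > 0 ? 0 : -1;
}
```

Calling `demo_boot(&demo_posix_ops)` and `demo_boot(&demo_null_ops)` runs the same portable code against two different environments, which is the property the host backend is meant to provide.
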
  8. 2. CPU independent architecture

    Architecture (arch/lkl). Transparent architecture bind (as a CPU arch); requires no modification to the other implementation. Provides thread information (struct thread_info), irq, timer, and syscall handler. Access to the underlying layer goes through host_ops.
  9. 3. Application interface

    1. use exposed API (LKL syscall) 2. use host libc (LD_PRELOAD) 3. extend (alternative) libc
  10. API 1: use exposed API (LKL syscall)

    Call entry points of the LKL kernel: lkl_sys_open(), lkl_sys_socket(). Almost the same as ordinary syscalls; the return value and errno notification are different. LKL syscalls and host syscalls can be used simultaneously: read an ext4 file by lkl_sys_read() => write into the host (Windows) by write().
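
The difference in error reporting, and the mixing of library-kernel calls with host calls, can be sketched without LKL itself. The sketch assumes the kernel-style convention of returning a negative error number rather than setting errno. All `demo_*` functions are hypothetical stand-ins, not LKL's API, and the in-memory string plays the role of a file inside a disk image.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

#define DEMO_FD 3 /* pretend descriptor inside the library kernel */

static const char demo_image_data[] = "hello from ext4\n";

/* Stand-in for an entry point such as lkl_sys_open(): failure is a
 * negative error number in the return value, errno is left untouched. */
long demo_lkl_sys_open(const char *path)
{
    if (path == NULL || path[0] == '\0')
        return -ENOENT;
    return DEMO_FD;
}

/* Stand-in for a positional read from the library kernel's file. */
long demo_lkl_sys_pread(long fd, void *buf, size_t count, size_t offset)
{
    size_t size = sizeof(demo_image_data) - 1;

    if (fd != DEMO_FD)
        return -EBADF;
    if (offset >= size)
        return 0; /* end of file */
    if (count > size - offset)
        count = size - offset;
    memcpy(buf, demo_image_data + offset, count);
    return (long)count;
}

/* Convert a kernel-style result to the libc convention:
 * -1 with errno set on failure, the value itself on success. */
long demo_to_libc_result(long ret)
{
    if (ret < 0) {
        errno = (int)-ret;
        return -1;
    }
    return ret;
}

/* Read through the library-kernel stand-ins, write through the host's
 * C library. Returns bytes copied or a negative error number. */
long demo_copy_to_host(const char *lkl_path, FILE *host_out)
{
    char buf[8];
    size_t offset = 0;
    long fd = demo_lkl_sys_open(lkl_path);

    if (fd < 0)
        return fd;
    for (;;) {
        long n = demo_lkl_sys_pread(fd, buf, sizeof(buf), offset);

        if (n < 0)
            return n;
        if (n == 0)
            break;
        if (fwrite(buf, 1, (size_t)n, host_out) != (size_t)n)
            return -EIO;
        offset += (size_t)n;
    }
    return (long)offset;
}
```

Here `fwrite()` stands in for the host-side `write()` named on the slide so that the sketch stays within standard C.
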
  11. API 2: hijack host standard library

    Dynamically replace symbols of host syscalls (of libc) with LD_PRELOAD: socket() => lkl_sys_socket(). Can use a host binary (executable) as-is. Limitations: only a limited set of symbols is replaceable; needs syscall translation on a non-Linux host.
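
Symbol replacement of this kind can be sketched as a small file that defines its own `socket()` with the libc signature and forwards to the alternative stack. `demo_lkl_sys_socket()` is a hypothetical stub standing in for lkl_sys_socket(), its return values are made up for the illustration, and the negative-error return convention is an assumption.

```c
#include <errno.h>
#include <sys/socket.h>

/* Hypothetical stub for the library kernel's socket entry point.
 * Assumed convention: negative error number on failure. */
long demo_lkl_sys_socket(int domain, int type, int protocol)
{
    (void)type;
    (void)protocol;
    if (domain != AF_INET && domain != AF_INET6)
        return -EAFNOSUPPORT;
    return 100; /* pretend descriptor owned by the library kernel */
}

/* Same name and signature as libc's socket(). When this file is built
 * as a shared object and listed in LD_PRELOAD, the dynamic linker
 * resolves the application's socket() calls to this definition first. */
int socket(int domain, int type, int protocol)
{
    long ret = demo_lkl_sys_socket(domain, type, protocol);

    if (ret < 0) {
        errno = (int)-ret; /* translate to the libc convention */
        return -1;
    }
    return (int)ret;
}
```

With GCC or Clang on Linux this could be built with `cc -shared -fPIC -o libdemo-hijack.so demo-hijack.c` and activated with `LD_PRELOAD=./libdemo-hijack.so ./app`; the file and library names are placeholders. Only calls that go through the dynamic symbol are redirected, which is one reason the set of replaceable symbols is limited.
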
  12. API 3: extend (alternative) libc

    Only call LKL syscalls, with our own libc. Also introduced as a virtual CPU architecture. A program can link this instead of the host libc. Can't access (underlying) host resources directly via this. LKL syscall as a patch for musl libc.
  13. mount a disk image w/o root privilege

    To mount/modify a disk image: mount it as a loopback device (may not work on a foreign OS) or use a VM. LKL is for you: modifying a btrfs image as a non-root user.
  14. out of tree network protocols on Android
  15. UNIX pipe as a NIC

    Access control by grep. Port mirroring by tee. Service function chaining, huh? https://github.com/thehajime/blog/issues/3
  16. Linux kernel on web browser

    Compile the kernel with clang/llvm, generate JS code by emscripten, run on a browser.
  17. Convert Linux kernel (C) to JS

    C source:

        asmlinkage __visible void __init start_kernel(void)
        {
                char *command_line;
                char *after_dashes;

                set_task_stack_end_magic(&init_task);
                smp_setup_processor_id();
                debug_objects_early_init();

                cgroup_init_early();

                local_irq_disable();
                early_boot_irqs_disabled = true;

                /*
                 * Interrupts are still disabled. Do necessary setups, then
                ...

    Generated JS:

        function _start_kernel() {
         var $0 = 0, $1 = 0, $10 = 0, $11 = 0, $12 = 0, $13 = 0, $14 = 0, $15 = 0,
             $16 = 0, $17 = 0, $18 = 0, $19 = 0, $2 = 0, $20 = 0, $21 = 0, $22 = 0,
             $23 = 0, $24 = 0, $25 = 0, $26 = 0;
         var $27 = 0, $28 = 0, $29 = 0, $3 = 0, $30 = 0, $4 = 0, $5 = 0, $6 = 0,
             $7 = 0, $8 = 0, $9 = 0, $spec$select$i = 0, $vararg_buffer = 0,
             $vararg_buffer1 = 0, $vararg_buffer4 = 0, $vararg_buffer6 = 0,
             $vararg_buffer8 = 0, $vararg_ptr11 = 0, label = 0, sp = 0;
         sp = STACKTOP;
         STACKTOP = STACKTOP + 48|0; if ((STACKTOP|0) >= (STACK_MAX|0)) abortStackOverflow(48|0);
         $vararg_buffer8 = sp + 32|0;
         $vararg_buffer6 = sp + 24|0;
         $vararg_buffer4 = sp + 16|0;
         $vararg_buffer1 = sp + 8|0;
         $vararg_buffer = sp;
         ...
  18. Running Linux container on non-Linux host

    Port LKL to macOS. Docker integration (OCI runtime) w/ modified dockerd, w/o Hypervisor.framework.
  19. Fuzz testing with LKL

    Syscall fuzzer by LKL. Focus on filesystem fuzzing tests. Found a number of unknown bugs (ext4, btrfs, f2fs, etc.). Xu et al., Fuzzing File Systems via Two-Dimensional Input Space Exploration, IEEE S&P 2019.
  20. Network simulation

    What? Network simulation (ns-3) with the Linux network stack. Why? Less abstraction, more realistic, fully reproducible.
  21. LKL Upstreaming

    Initial patches on LKML (2008). Proposed on LKML (2015). Recently restarted (Oct. 2019) as a mode of UML (UMMODE=library). 1st step: eliminate duplicated features (devices); still ongoing.