Slide 1

Slide 1 text

LKL.js: Running Linux Kernel on JavaScript *Directly*
July 21, 2018
kernelvm14@Tokyo
@retrage

Slide 2

Slide 2 text

Motivation
• Q: Can Linux run on JavaScript?
• A: Obviously yes, use JSLinux[1].
• Q: Can Linux run on JavaScript *directly*?
• A: ??? -> Challenge accepted.
[Figure: two software stacks. JSLinux: Linux on an emulator on a JavaScript engine. LKL.js: Linux directly on a JavaScript engine.]

Slide 3

Slide 3 text

Before We Port
• OK, but JavaScript is a programming language and the Linux kernel runs on machines, so there is a gap.
• Linux Kernel Library[2] (LKL)
  • anykernel: runs a real OS kernel in userspace
  • A fork of torvalds/linux that adds LKL as a new architecture, arch/lkl
• Emscripten[3]
  • LLVM-based C/C++ to JS/WASM transpiler
  • Provides a Unix-like environment
• -> LKL and Emscripten will fill the gap.

Slide 4

Slide 4 text

Building Linux Kernel with Clang
• Q: The Linux kernel depends heavily on GCC extensions. Can we build it with Clang?
• A: Yes. There was once the LLVMLinux[4] project, and now, thanks to the efforts of the Google Android team, two LTS kernels (4.4 and 4.9) can be built with Clang[5].
• => The latest LKL can be built with Clang.

Slide 5

Slide 5 text

LKL Build Flow 101
• $ make -C tools/lkl
• -> Determine which files to compile from the Kconfig settings, then compile them.
• -> Archive the object files into built-in.o files.
• -> Link the built-in.o files into vmlinux.
• -> Compile the host-side code and link it into liblkl-in.o.
• -> Link everything into liblkl.so.

Slide 6

Slide 6 text

JavaScript is a New Exec Format
• In the LLVM world, code is compiled to the target like this:
  • Source -> LLVM IR -> Target
• In Emscripten, "linking" means translating LLVM IR to JavaScript.
• Therefore, all source code (including the libc that Emscripten provides) must be compiled to LLVM IR.

Slide 7

Slide 7 text

Generating vmlinux.bc (1)
• make -C tools/lkl CC="$CC $CFLAGS" AR="$PY $PWD/ar.py" V=1
• Options for Emscripten. See the manuals[6] for more details.

    "-s WASM=0"
    "-s ASYNCIFY=1"
    "-s EMULATE_FUNCTION_POINTER_CASTS=1"
    "-s USE_PTHREADS=1"
    "-s PTHREAD_POOL_SIZE=4"
    "-s TOTAL_MEMORY=1342177280"

• Parameters obtained from ELF

    "-DMAX_NR_ZONES=2"
    "-DNR_PAGEFLAGS=20"
    "-DSPINLOCK_SIZE=0"
    "-DF_GETLK64=12"
    "-DF_SETLK64=13"
    "-DF_SETLKW64=14"

Slide 8

Slide 8 text

Generating vmlinux.bc (2)
• LLVM has llvm-link, a linker for LLVM bitcode files. As llvm-link does not support archive files, we have to pass all object files to it.
• ar.py records in objs every object file that would have been archived.

    import sys

    objs = []
    for i, arg in enumerate(sys.argv):
        # Skip the script name, the ar flags and the archive name (i <= 2),
        # and skip nested built-in.o archives.
        if ".o" in arg and "built-in" not in arg and i > 2:
            objs.append(arg)
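To make the filter concrete, here is a minimal sketch that wraps the same condition in a function and runs it on a sample ar-style command line. The function name `collect_objects` and the sample paths are hypothetical, and the assumed argument layout (script, flags, archive, then members) is an inference from the `i > 2` check, not something the slides state.

```python
def collect_objects(argv):
    """Return the object files named on an ar-style command line.

    Assumed layout (not confirmed by the slides): argv[0] is the script,
    argv[1] the ar flags, argv[2] the output archive, then the members.
    """
    objs = []
    for i, arg in enumerate(argv):
        # Same condition as the slide: plain .o members only,
        # nested built-in.o archives are skipped.
        if i > 2 and ".o" in arg and "built-in" not in arg:
            objs.append(arg)
    return objs


# Hypothetical invocation, shaped like a kernel build calling ar.
sample = [
    "ar.py", "rcs", "kernel/built-in.o",
    "kernel/fork.o", "kernel/exit.o", "kernel/sched/built-in.o",
]
print(collect_objects(sample))
```

The nested `kernel/sched/built-in.o` is dropped on the assumption that its members were already recorded when that archive itself was created.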

Slide 9

Slide 9 text

Generating vmlinux.bc (3)
• clean-obj.py cleans up objs by eliminating duplicate file paths.
• link-vmlinux-gen.py generates link-vmlinux.sh from objs.
• link-vmlinux.sh generates vmlinux.bc.

    python "${srctree}/clean-obj.py"
    python "${srctree}/link-vmlinux-gen.py"
    bash "${srctree}/link-vmlinux.sh"

• -> vmlinux.bc, a Linux kernel fully written in LLVM bitcode!
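The two generator steps can be sketched as follows. This is not the actual clean-obj.py or link-vmlinux-gen.py, only an illustration under assumptions: the function names `dedupe` and `make_link_script` are hypothetical, and the generated script is assumed to be a single `llvm-link -o <output> <inputs...>` invocation.

```python
import shlex


def dedupe(paths):
    """Drop repeated file paths while keeping first-seen order."""
    seen = set()
    unique = []
    for path in paths:
        if path not in seen:
            seen.add(path)
            unique.append(path)
    return unique


def make_link_script(objs, output="vmlinux.bc", linker="llvm-link"):
    """Return the text of a shell script that links objs into one bitcode file."""
    args = " ".join(shlex.quote(path) for path in objs)
    return f"#!/bin/bash\n{shlex.quote(linker)} -o {shlex.quote(output)} {args}\n"


# Hypothetical object list with one duplicate entry.
objs = dedupe(["kernel/fork.o", "kernel/exit.o", "kernel/fork.o"])
print(make_link_script(objs), end="")
```

Keeping first-seen order is a deliberate choice in this sketch so that the link order follows the build order.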

Slide 10

Slide 10 text

Generating boot.js (1)
• As LKL is a library OS (LibOS), vmlinux.bc cannot run without an application part. Our target is tools/lkl/tests/boot.

    $LINK -o $LKL/tests/boot.bc \
        $LKL/tests/boot-in.o $LKL/lib/liblkl-in.o $LKL/lib/lkl.o

• boot-in.o: Application part
• liblkl-in.o: Host side part
• lkl.o: Linux kernel (vmlinux.bc)

Slide 11

Slide 11 text

Generating boot.js (2)
• Disassemble all LLVM bitcode files.
• Run rename_symbols.py to avoid function name collisions in the LLVM IR files (e.g. strcmp).

    $DIS -o $LKL/tests/boot.ll $LKL/tests/boot.bc
    $DIS -o js/dlmalloc.ll js/dlmalloc.bc
    $DIS -o js/libc.ll js/libc.bc
    $DIS -o js/pthreads.ll js/pthreads.bc
    $PY rename_symbols.py $LKL/tests/boot.ll $LKL/tests/boot-mod.ll
    EMCC_DEBUG=1 $CC -o js/boot.html \
        $LKL/tests/boot-mod.ll $CFLAGS -v

• This generates boot.js, a Linux kernel fully written in JavaScript!!
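A renaming pass over textual LLVM IR can be sketched like this. It is not the real rename_symbols.py: the function name, the `lkl_` prefix and the sample IR are hypothetical, and the sketch only handles unquoted global identifiers of the form `@name`.

```python
import re


def rename_symbols(ir_text, names, prefix="lkl_"):
    """Prefix every use of the given global symbols in textual LLVM IR."""
    if not names:
        return ir_text
    alternatives = "|".join(re.escape(name) for name in names)
    # The lookahead rejects longer identifiers such as @strcmp_fast or @strcmp.1.
    pattern = re.compile(r"@(" + alternatives + r")(?![\w.$-])")
    return pattern.sub(lambda match: "@" + prefix + match.group(1), ir_text)


# Hypothetical IR fragment in which the kernel defines its own strcmp.
ir = (
    "define i32 @strcmp(i8* %a, i8* %b) {\n"
    "  %r = call i32 @strcmp(i8* %a, i8* %b)\n"
    "  %s = call i32 @strcmp_fast(i8* %a)\n"
)
print(rename_symbols(ir, ["strcmp"]))
```

Renaming definitions and call sites together keeps the kernel's own copy internally consistent while leaving the libc symbol of the same name untouched in the other IR files.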

Slide 12

Slide 12 text

Fix inline assemblies
• Most of the Linux kernel code is machine independent, but there are some exceptions.
• In set_normalized_timespec64:

    while (nsec < 0) {
        asm("" : "+rm"(nsec));
        nsec += NSEC_PER_SEC;
        --sec;
    }

• As Emscripten does not support inline asm, replace it with emscripten_asm_const_int.

Slide 13

Slide 13 text

early_param & initcall
• early_param, __setup and initcalls use ELF tricks for function management.
• Example: early_param (__setup_param)

    #define __setup_param(str, unique_id, fn, early)                \
        static const char __setup_str_##unique_id[] __initconst    \
            __aligned(1) = str;                                     \
        static struct obs_kernel_param __setup_##unique_id         \
            __used __section(.init.setup)                           \
            __attribute__((aligned((sizeof(long)))))                \
            = { __setup_str_##unique_id, fn, early }

    __setup_start = .;
    KEEP(*(.init.setup))
    __setup_end = .;

    const struct obs_kernel_param *p;
    for (p = __setup_start; p < __setup_end; p++) {

Slide 14

Slide 14 text

early_param & initcall
• These mechanisms are broken when generating JS.
• To fix them, hard code them.

    const char *early_params[MAX_INIT_ARGS+2] = {
        "debug", "quiet", "loglevel", NULL,
    };
    /* snip */
    for (i = 0; early_params[i]; i++) {

• For initcalls, export_initcalls.py will generate the code.

    /* initcall0 */
    EM_ASM({ _net_ns_init(); });
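A generator for the initcall lines might look like the sketch below. It is not the real export_initcalls.py: the function name and input format are hypothetical, and it assumes the initcall function names per level are already known, since the slides do not say how the script discovers them.

```python
def generate_initcalls(levels):
    """Emit one EM_ASM call per initcall, grouped by level.

    levels: iterable of (level, [function names]); hypothetical input format.
    """
    lines = []
    for level, names in levels:
        lines.append(f"/* initcall{level} */")
        for name in names:
            # Emscripten exposes C functions to JavaScript with a leading underscore.
            lines.append(f"EM_ASM({{ _{name}(); }});")
    return "\n".join(lines) + "\n"


# The single entry shown on the slide.
print(generate_initcalls([(0, ["net_ns_init"])]), end="")
```

Calling each function by name replaces the walk over the linker-defined initcall sections, which do not survive the translation to JavaScript.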

Slide 15

Slide 15 text

Demo
https://retrage.github.io/lkl-js/

Slide 16

Slide 16 text

Limitations
• boot.js just "boots", but it is not useful yet.
• start_kernel succeeds, but other operations fail.
• It fails
  • to create kernel threads.
    • Workaround: Remove kthread dependencies.
  • to run some initcalls.
  • to mount rootfs.
  • to execute init.
• -> Many workarounds are needed just to show dmesg.

Slide 17

Slide 17 text

Related work
• Rump Kernels
  • A NetBSD-based anykernel.
  • Ported to JavaScript by Antti Kantee in 2012[8][9].
• Google Native Client[10]
  • Runs native code in a browser sandbox.
  • NaCl (nexe) and PNaCl (pexe).
  • NaCl uses ELF.
  • We can build LKL using the NaCl toolchains without any modification.

Slide 18

Slide 18 text

Conclusions
• Q: Can Linux run on JavaScript directly?
• A: Partially yes, but there are a lot of problems.
• By using LKL and Emscripten, we can generate a Linux kernel fully written in JavaScript.
• As the Linux kernel uses ELF tricks, we need to fix what they break in the generated JavaScript.
• Source code is available:
  • https://github.com/retrage/linux/tree/retrage/em-v2

Slide 19

Slide 19 text

FAQ
• Q: What about WebAssembly?
• A: Emscripten fails to generate a WASM file from boot-mod.ll.
• Q: How large is boot.js?
• A: 99MB, 1,951,686 lines, 103,337,468 characters; 50MB when compressed.
• Q: Why are the call traces empty?
• A: Because JS functions do not have addresses.
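The size figures in the FAQ can be cross-checked with a little arithmetic. This sketch assumes one byte per character (plain ASCII output), which the slides do not state; under that assumption "99MB" matches the binary unit (MiB) rather than the decimal one.

```python
chars = 103_337_468   # characters in boot.js, from the FAQ
lines = 1_951_686     # lines in boot.js, from the FAQ

mib = chars / 2**20   # binary megabytes, assuming 1 byte per character
mb = chars / 10**6    # decimal megabytes under the same assumption
avg_line = chars / lines

print(f"{mib:.2f} MiB, {mb:.2f} MB, about {avg_line:.0f} characters per line")
```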

Slide 20

Slide 20 text

References
• [1] https://bellard.org/jslinux/
• [2] https://github.com/lkl/linux
• [3] https://github.com/kripken/emscripten
• [4] https://wiki.linuxfoundation.org/llvmlinux
• [5] https://lwn.net/Articles/734071/
• [6] https://kripken.github.io/emscripten-site/docs/index.html
• [7] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/SharedArrayBuffer
• [8] https://blog.netbsd.org/tnf/entry/kernel_drivers_compiled_to_javascript
• [9] ftp://ftp.ulakbim.gov.tr/pub/NetBSD/misc/pooka/rump.js/index.html
• [10] https://developer.chrome.com/native-client