LKL.js: Running Linux Kernel on JavaScript *Directly*

lkl.js is the Linux Kernel Library (LKL) ported to JavaScript using Emscripten. Unlike JSLinux, lkl.js is a Linux kernel compiled entirely to JavaScript, and it runs without an emulator. For now lkl.js only boots the Linux kernel and is still completely useless. It shows how powerful Emscripten is and how flexible the Linux kernel is. This presentation describes how to port LKL to JavaScript.

Akira Moroo

July 21, 2018

Transcript

  1. LKL.js: Running Linux Kernel on JavaScript *Directly*

    July 21, 2018, kernelvm14@Tokyo, @retrage
  2. Motivation

    • Q: Can Linux run on JavaScript?
    • A: Obviously yes, use JSLinux[1].
    • Q: Can Linux run on JavaScript *directly*?
    • A: ??? -> Challenge accepted.

    [Figure: software stacks. JSLinux: JavaScript Engine, Emulator, Linux. LKL.js: JavaScript Engine, Linux.]
  3. Before We Port

    • OK, but JavaScript is a programming language and the Linux kernel runs on machines, so there is a gap.
    • Linux Kernel Library[2] (LKL)
      • anykernel: runs a real OS kernel in userspace
      • A fork of torvalds/linux, with LKL added as a new architecture, arch/lkl
    • Emscripten[3]
      • LLVM-based C/C++ to JS/WASM transpiler
      • Provides a Unix-like environment
    • -> LKL and Emscripten will fill the gap.
  4. Building Linux Kernel with Clang

    • Q: The Linux kernel depends heavily on GCC extensions. Can we build it with Clang?
    • A: Yes. Once upon a time there was the LLVMLinux[4] project, and now, thanks to the efforts of the Google Android team, two LTS kernels (4.4 and 4.9) can be built with Clang[5].
    • => The latest LKL can be built with Clang.
  5. LKL Build Flow 101

    • $ make -C tools/lkl
      • -> Determine which files to compile from the Kconfig settings, then compile them.
      • -> Archive the object files into built-in.o.
      • -> Link the built-in.o files into vmlinux.
      • -> Compile the host-side code and link it into liblkl-in.o.
      • -> Link everything into liblkl.so.
  6. JavaScript is a New Exec Format

    • In the LLVM world, code is compiled for a target like this:
      • Source -> LLVM IR -> Target
    • In Emscripten, “linking” means translating LLVM IR to JavaScript.
    • Therefore all source code (including the libc that Emscripten provides) must first be compiled to LLVM IR.
  7. Generating vmlinux.bc (1)

    • make -C tools/lkl CC="$CC $CFLAGS" AR="$PY $PWD/ar.py" V=1
    • Options for Emscripten. See manuals[6] for more details.

        "-s WASM=0"
        "-s ASYNCIFY=1"
        "-s EMULATE_FUNCTION_POINTER_CASTS=1"
        "-s USE_PTHREADS=1"
        "-s PTHREAD_POOL_SIZE=4"
        "-s TOTAL_MEMORY=1342177280"

    • Parameters obtained from ELF

        "-DMAX_NR_ZONES=2"
        "-DNR_PAGEFLAGS=20"
        "-DSPINLOCK_SIZE=0"
        "-DF_GETLK64=12"
        "-DF_SETLK64=13"
        "-DF_SETLKW64=14"
  8. Generating vmlinux.bc (2)

    • LLVM has llvm-link, a linker for LLVM bitcode files. Because llvm-link does not support archive files, we have to pass all object files to it.
    • ar.py records every object file that would have been archived in objs.

        import sys

        objs = []
        for i, arg in enumerate(sys.argv):
            # Keep object files, but skip the built-in.o archives themselves.
            if ".o" in arg and "built-in" not in arg and i > 2:
                objs.append(arg)
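
The filter on this slide can be tried on a sample command line. The following is a minimal sketch, not the actual ar.py: the function name `collect_objects` is hypothetical, and it assumes the wrapper is called like ar itself (script name, flags, archive name, then member files), which is what the `i > 2` test suggests. It also uses a stricter suffix match than the slide's substring test.

```python
def collect_objects(argv):
    """Return the member object files named on an ar-style command line."""
    objs = []
    for i, arg in enumerate(argv):
        # Assumed layout: argv[0] script, argv[1] ar flags, argv[2] archive name.
        if i > 2 and arg.endswith(".o") and "built-in" not in arg:
            objs.append(arg)
    return objs


if __name__ == "__main__":
    sample = ["ar.py", "rcs", "kernel/built-in.o",
              "kernel/fork.o", "kernel/sched/built-in.o", "kernel/exit.o"]
    print(collect_objects(sample))
```

Nested built-in.o archives are skipped because their members were already recorded when those archives were created.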
  9. Generating vmlinux.bc (3)

    • clean-obj.py cleans up objs by eliminating duplicated file paths.
    • link-vmlinux-gen.py generates link-vmlinux.sh from objs.
    • link-vmlinux.sh generates vmlinux.bc.
    • -> vmlinux.bc, a Linux kernel fully written in LLVM bitcode!

        python "${srctree}/clean-obj.py"
        python "${srctree}/link-vmlinux-gen.py"
        bash "${srctree}/link-vmlinux.sh"
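
The clean-up and script-generation steps on this slide can be sketched in a few lines of Python. This is a hypothetical illustration, not the code from the repository: the function names `dedupe_paths` and `make_link_script` are made up, and it is an assumption that the generated script boils down to one `llvm-link -o vmlinux.bc ...` command.

```python
import shlex


def dedupe_paths(paths):
    """Drop repeated object paths while keeping the first-seen order."""
    seen = set()
    unique = []
    for path in paths:
        if path not in seen:
            seen.add(path)
            unique.append(path)
    return unique


def make_link_script(objs, output="vmlinux.bc", linker="llvm-link"):
    """Return the text of a shell script that links all objects in one command."""
    quoted = " ".join(shlex.quote(obj) for obj in objs)
    return "#!/bin/bash\n{} -o {} {}\n".format(linker, shlex.quote(output), quoted)


if __name__ == "__main__":
    recorded = ["init/main.o", "kernel/fork.o", "init/main.o"]
    print(make_link_script(dedupe_paths(recorded)), end="")
```

Keeping the first-seen order makes the generated command line reproducible between runs.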
  10. Generating boot.js (1)

    • Because LKL is a library OS (LibOS), vmlinux.bc cannot run without an application part. Our target is tools/lkl/tests/boot.

        $LINK -o $LKL/tests/boot.bc \
            $LKL/tests/boot-in.o $LKL/lib/liblkl-in.o $LKL/lib/lkl.o

    • boot-in.o: application part
    • liblkl-in.o: host-side part
    • lkl.o: Linux kernel (vmlinux.bc)
  11. Generating boot.js (2)

    • Disassemble all LLVM bitcode files.
    • Run rename_symbols.py to avoid function name collisions in the LLVM IR files (e.g. strcmp).

        $DIS -o $LKL/tests/boot.ll $LKL/tests/boot.bc
        $DIS -o js/dlmalloc.ll js/dlmalloc.bc
        $DIS -o js/libc.ll js/libc.bc
        $DIS -o js/pthreads.ll js/pthreads.bc
        $PY rename_symbols.py $LKL/tests/boot.ll $LKL/tests/boot-mod.ll
        EMCC_DEBUG=1 $CC -o js/boot.html \
            $LKL/tests/boot-mod.ll $CFLAGS -v

    • This generates boot.js, a Linux kernel fully written in JavaScript!!
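
The renaming step can be pictured as a text rewrite over the disassembled IR. The sketch below is an assumption about how such a tool might work, not the actual rename_symbols.py: the function name `rename_symbols`, the `lkl_` prefix, and the idea of passing an explicit set of colliding names are all hypothetical. It handles only unquoted global identifiers and would also rewrite matches inside string constants.

```python
import re

# Unquoted LLVM IR global identifiers look like @name.
GLOBAL_NAME = re.compile(r"@([-a-zA-Z$._][-a-zA-Z$._0-9]*)")


def rename_symbols(ir_text, colliding, prefix="lkl_"):
    """Prefix every global whose name is in `colliding`; leave others alone."""
    def replace(match):
        name = match.group(1)
        if name in colliding:
            return "@" + prefix + name
        return match.group(0)
    return GLOBAL_NAME.sub(replace, ir_text)


if __name__ == "__main__":
    ir = ("define i32 @strcmp(i8* %a, i8* %b) {\n"
          "  %r = call i32 @strncmp(i8* %a, i8* %b, i32 8)\n")
    print(rename_symbols(ir, {"strcmp"}))
```

Matching the whole identifier before comparing it means `@strncmp` is left alone when only `strcmp` is listed.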
  12. Fix inline assemblies

    • Most of the Linux kernel code is machine independent, but there are some exceptions.
    • In set_normalized_timespec64:

        while (nsec < 0) {
            asm("" : "+rm"(nsec));
            nsec += NSEC_PER_SEC;
            --sec;
        }

    • Emscripten does not support asm, so replace it with emscripten_asm_const_int.
  13. early_param & initcall

    • early_param, __setup and initcalls use ELF tricks for function management.
    • Example: early_param (__setup_param)

        #define __setup_param(str, unique_id, fn, early) \
            static const char __setup_str_##unique_id[] __initconst \
                __aligned(1) = str; \
            static struct obs_kernel_param __setup_##unique_id \
                __used __section(.init.setup) \
                __attribute__((aligned((sizeof(long))))) \
                = { __setup_str_##unique_id, fn, early }

        __setup_start = .;
        KEEP(*(.init.setup))
        __setup_end = .;

        const struct obs_kernel_param *p;
        for (p = __setup_start; p < __setup_end; p++) {
  14. early_param & initcall

    • These mechanisms break when generating JS.
    • To fix them, hard-code them.

        const char *early_params[MAX_INIT_ARGS+2] = {
            "debug", "quiet", "loglevel", NULL,
        };
        /* snip */
        for (i = 0; early_params[i]; i++) {

    • For initcalls, export_initcalls.py generates the code.

        /* initcall0 */
        EM_ASM({ _net_ns_init(); });
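
A generator for the initcall code could look roughly like this. It is a sketch under stated assumptions, not the actual export_initcalls.py: the function name `generate_initcalls` and the input format (a mapping from initcall level to function names) are hypothetical, and `example_core_init` is a made-up name. Only the output format is taken from the slide.

```python
def generate_initcalls(initcalls_by_level):
    """Emit one EM_ASM call per initcall, grouped by level in ascending order."""
    lines = []
    for level in sorted(initcalls_by_level):
        lines.append("/* initcall{} */".format(level))
        for name in initcalls_by_level[level]:
            # The slide calls the compiled C function with a leading underscore.
            lines.append("EM_ASM({{ _{}(); }});".format(name))
    return "\n".join(lines)


if __name__ == "__main__":
    # net_ns_init comes from the slide; example_core_init is hypothetical.
    print(generate_initcalls({1: ["example_core_init"], 0: ["net_ns_init"]}))
```

Sorting the levels keeps the calls in the order the kernel would run them, lowest level first.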
  15. Demo

    https://retrage.github.io/lkl-js/

  16. Limitations

    • boot.js just “boots”, but it is still completely useless.
    • start_kernel succeeds, but other operations fail.
    • It fails
      • to create kernel threads.
        • Workaround: remove kthread dependencies.
      • to run some initcalls.
      • to mount rootfs.
      • to execute init.
    • -> Many workarounds are in place just to show dmesg.
  17. Related work

    • Rump Kernels
      • A NetBSD-based anykernel.
      • Ported to JavaScript by Antti Kantee in 2012[8][9].
    • Google Native Client[10]
      • Runs native code in a browser sandbox.
      • NaCl (nexe) and PNaCl (pexe).
      • NaCl uses ELF.
      • We can build LKL using the NaCl toolchains without any modification.
  18. Conclusions

    • Q: Can Linux run on JavaScript directly?
    • A: Partially yes, but there are a lot of problems.
    • By using LKL and Emscripten, we can generate a Linux kernel fully written in JavaScript.
    • Because the Linux kernel uses ELF tricks, we need to fix what breaks in the generated JavaScript.
    • Source code is available:
      • https://github.com/retrage/linux/tree/retrage/em-v2
  19. FAQ

    • Q: What about WebAssembly?
    • A: Emscripten fails to generate a WASM file from boot-mod.ll.
    • Q: How large is boot.js?
    • A: 99MB, 1,951,686 lines, 103,337,468 characters; 50MB when compressed.
    • Q: Why are the call traces empty?
    • A: Because JS functions do not have addresses.
  20. Reference

    • [1] https://bellard.org/jslinux/
    • [2] https://github.com/lkl/linux
    • [3] https://github.com/kripken/emscripten
    • [4] https://wiki.linuxfoundation.org/llvmlinux
    • [5] https://lwn.net/Articles/734071/
    • [6] https://kripken.github.io/emscripten-site/docs/index.html
    • [7] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/SharedArrayBuffer
    • [8] https://blog.netbsd.org/tnf/entry/kernel_drivers_compiled_to_javascript
    • [9] ftp://ftp.ulakbim.gov.tr/pub/NetBSD/misc/pooka/rump.js/index.html
    • [10] https://developer.chrome.com/native-client