Wrapping native libraries using WebAssembly

Go is gaining momentum, with projects that have large codebases in older languages like C looking to it to provide a modern, productive development experience without sacrificing performance. However, Go apps tend to be written entirely in Go: while cgo exists to reuse libraries written in languages like C, it introduces limitations such as requiring a C toolchain for building, possible incompatibilities between libc versions, and limited platform and cross-compilation support. Indeed, many popular Go applications are built with cgo disabled to ensure broad usability.

Luckily, many native libraries can be compiled to WebAssembly and run with the pure Go WebAssembly runtime, wazero. This means the code can be compiled once with no external dependencies like libc, and executed in any Go application, regardless of cgo support. This presentation will give an overview of what WebAssembly is, how to compile legacy codebases into WebAssembly for use in Go, and will present some case studies of a few libraries, with insights into tradeoffs of wrapping vs rewriting. At the end of the presentation, you’ll be able to tell your colleagues, still hesitant about jumping from C to Go, that they can try it out without losing access to their existing code.

From GopherCon 2023: https://www.gophercon.com/agenda/speakers/3058151

Anuraag Agrawal

October 11, 2023

Transcript

  1. C source code, WebAssembly .wat text format, WebAssembly .wasm binary format

    C source code:
    int factorial(int n) {
      if (n == 0)
        return 1;
      else
        return n * factorial(n-1);
    }

    WebAssembly .wat text format:
    (func (param i64) (result i64)
      local.get 0
      i64.eqz
      if (result i64)
        i64.const 1
      else
        local.get 0
        local.get 0
        i64.const 1
        i64.sub
        call 0
        i64.mul
      end)

    WebAssembly .wasm binary format:
    00 61 73 6D 01 00 00 00 01 00 01 60 01 73 01 73 06 03 00 01 00 02 0A 00 01 00 00 20 00 50 04 7E 42 01 05 20 00 20 00 42 01 7D 10 00 7E 0B 0B 15 17
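The first eight bytes of the binary on this slide are the module preamble: the magic `\0asm` followed by version 1 as a little-endian 32-bit integer. A minimal sketch of checking that preamble in Go follows; `checkPreamble` is a hypothetical helper name used only for illustration and is not part of wazero or any other library.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

// wasmMagic is the 4-byte magic that starts a WebAssembly binary: "\0asm".
var wasmMagic = []byte{0x00, 0x61, 0x73, 0x6D}

// checkPreamble (hypothetical helper) returns the version stored in the
// 8-byte preamble, or an error if the bytes do not look like Wasm.
func checkPreamble(b []byte) (uint32, error) {
	if len(b) < 8 {
		return 0, errors.New("too short for a wasm preamble")
	}
	if !bytes.Equal(b[:4], wasmMagic) {
		return 0, errors.New("missing \\0asm magic")
	}
	// The version field is a little-endian uint32.
	return binary.LittleEndian.Uint32(b[4:8]), nil
}

func main() {
	// First eight bytes from the slide: 00 61 73 6D 01 00 00 00.
	header := []byte{0x00, 0x61, 0x73, 0x6D, 0x01, 0x00, 0x00, 0x00}
	version, err := checkPreamble(header)
	fmt.Println(version, err)
}
```

Only the preamble is validated here; decoding the sections that follow is what a runtime's parser does before compiling.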
  2. Java 🤔 Wasm

    Java:
    ldc p
    iconst_2
    putfield x

    Wasm:
    global.get p
    i32.const 2
    i32.store offset(x)

    v128.load
    v128.load
    i8x16.add
  3. [Diagram: a host embeds a WebAssembly runtime that reads and compiles the Wasm binary; the compiled Wasm runs in a sandbox with its own memory.]

    Host functions: proxy_replace_header_map_value, proxy_send_local_response, clock_time_get, fd_write
    Guest functions: proxy_on_request_header, proxy_on_request_body
  4. [2023-09-15 01:43:07.570341][1][error][wasm] [source/extensions/common/wasm/wasm_vm.cc:38] Function: proxy_on_configure failed: Uncaught RuntimeError: memory access out of bounds

    Proxy-Wasm plugin in-VM backtrace:
    0: 0x70cae - (*regexp/syntax.compiler).compile
    1: 0x715cd - (*regexp/syntax.compiler).compile
    2: 0x71523 - (*regexp/syntax.compiler).compile
    3: 0x71629 - (*regexp/syntax.compiler).compile
    4: 0x71523 - (*regexp/syntax.compiler).compile
    5: 0x71629 - (*regexp/syntax.compiler).compile
    6: 0x71523 - (*regexp/syntax.compiler).compile
    7: 0x71629 - (*regexp/syntax.compiler).compile
    8: 0x71523 - (*regexp/syntax.compiler).compile
    9: 0x71629 - (*regexp/syntax.compiler).compile
    [2023-09-15 01:43:07.570355][1][error][wasm] [source/extensions/common/wasm/wasm.cc:109] Wasm VM failed Failed to configure base Wasm plugin
  5. cgo is not Go

    cgo is TinyGo

    // #cgo LDFLAGS: -Linternal/wasm -lre2
    C.re2_new()
  6. 00 61 73 6D 01 00 00 00 01 00 01 60 01 73 01 73 06 03 00 01 00 02 0A 00 01 00 00 20 00 50 04 7E 42 01 05 20 00 20 00 42 01 7D 10 00 7E 0B 0B 15 17

    wazeroir
    Run in interpreter
    x86/arm64 ASM + JMP
  7. Compile and invoke

    import "github.com/tetratelabs/wazero"

    //go:embed add.wasm
    var wasmAdd []byte

    r := wazero.NewRuntime(ctx)
    defer r.Close(ctx)
    mod, _ := r.Instantiate(ctx, wasmAdd)
    res, _ := mod.ExportedFunction("add").Call(ctx, 1, 2)
    println(res[0])
  8. Benchmark results (time/op)

    name \ time/op     build/wafbench_stdlib.txt  build/wafbench.txt                          build/wafbench_cgo.txt
    WAF/FTW-2          29.82 ± ∞ ¹                26.38 ± ∞ ¹       -11.55% (p=0.008 n=5)     26.95 ± ∞ ¹       -9.61% (p=0.008 n=5)
    WAF/POST/1-2       3.359m ± ∞ ¹               3.550m ± ∞ ¹      +5.70% (p=0.008 n=5)      3.563m ± ∞ ¹      +6.09% (p=0.008 n=5)
    WAF/POST/1000-2    20.532m ± ∞ ¹              6.194m ± ∞ ¹      -69.83% (p=0.008 n=5)     5.211m ± ∞ ¹      -74.62% (p=0.008 n=5)
    WAF/POST/10000-2   187.29m ± ∞ ¹              25.94m ± ∞ ¹      -86.15% (p=0.008 n=5)     17.69m ± ∞ ¹      -90.56% (p=0.008 n=5)
    WAF/POST/100000-2  1852.4m ± ∞ ¹              220.2m ± ∞ ¹      -88.11% (p=0.008 n=5)     143.8m ± ∞ ¹      -92.23% (p=0.008 n=5)
    Match/Easy0/16-2   4.355n ± ∞ ¹               348.200n ± ∞ ¹    +7895.41% (p=0.008 n=5)   224.400n ± ∞ ¹    +5052.70% (p=0.008 n=5)
    Match/Medium/32-2  953.6n ± ∞ ¹               332.7n ± ∞ ¹      -65.11% (p=0.008 n=5)     227.9n ± ∞ ¹      -76.10% (p=0.008 n=5)
    Match/Hard/32-2    1193.0n ± ∞ ¹              332.2n ± ∞ ¹      -72.15% (p=0.008 n=5)     227.8n ± ∞ ¹      -80.91% (p=0.008 n=5)
    Match/Hard/32K-2   1617883.0n ± ∞ ¹           1151.0n ± ∞ ¹     -99.93% (p=0.008 n=5)     226.4n ± ∞ ¹      -99.99% (p=0.008 n=5)
  9. Benchmark results (time/op)

    name \ time/op     build/wafbench_stdlib.txt  build/wafbench.txt                      build/wafbench_cgo.txt
    WAF/FTW-2          41.63 ± ∞ ¹                39.58 ± ∞ ¹     -4.94% (p=0.008 n=5)    41.79 ± ∞ ¹
    WAF/POST/1-2       4.858m ± ∞ ¹               4.611m ± ∞ ¹    ~ (p=0.095 n=5)         4.800m ± ∞ ¹
    WAF/POST/1000-2    27.05m ± ∞ ¹               26.35m ± ∞ ¹    ~ (p=0.151 n=5)         26.23m ± ∞ ¹
    WAF/POST/10000-2   232.7m ± ∞ ¹               231.6m ± ∞ ¹    ~ (p=0.690 n=5)         226.4m ± ∞ ¹
    WAF/POST/100000-2  2.243 ± ∞ ¹                2.227 ± ∞ ¹     ~ (p=0.548 n=5)         2.271 ± ∞ ¹

    name \ time/op                            build/wafbench_stdlib.txt  build/wafbench.txt  build/wafbench_cgo.txt
    BurntSushi/sherlock/name/alt1/default-2   443µs ± 1%                 469µs ± 0%          64µs ± 0%
    BurntSushi/sherlock/name/alt1/dfa-2       425µs ± 2%                 481µs ± 0%          64µs ± 1%
    BurntSushi/sherlock/name/nocase1/default  12.7ms ± 0%                4.1ms ± 0%          0.9ms ± 1%
    BurntSushi/sherlock/name/nocase1/dfa-2    4.08ms ± 0%                4.04ms ± 0%         0.86ms ± 1%
  10. Benchmark results (time/op)

    name \ time/op                        build/wafbench_stdlib.txt  build/wafbench.txt  build/wafbench_cgo.txt
    WAF/FTW-2                             35.6s ± 0%                 36.5s ± 1%          35.3s ± 0%
    WAF/POST/1-2                          4.13ms ± 2%                4.22ms ± 1%         4.10ms ± 1%
    WAF/POST/1000-2                       21.6ms ± 1%                22.1ms ± 3%         21.2ms ± 1%
    WAF/POST/10000-2                      189ms ± 1%                 195ms ± 0%          186ms ± 1%
    WAF/POST/100000-2                     1.86s ± 1%                 1.93s ± 1%          1.83s ± 1%
    SQLiDriver/tests/test-sqli-038.txt-2  580ns ± 1%                 2430ns ± 1%         968ns ± 1%
    IsXSS/x_><script>alert(1);</script>-  1.13µs ± 0%                0.49µs ± 1%         0.32µs ± 2%
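  11. Fuzz test comparing the Go rewrite against the Wasm build

    func FuzzSQLi(f *testing.F) {
      // Seed the corpus with the known SQL injection test cases.
      for _, tc := range sqliTests {
        f.Add(tc)
      }
      f.Fuzz(func(t *testing.T, tc string) {
        // Both implementations must agree on every input.
        resGo := libinjectiongo.DetectSQLi(tc)
        resWasm := libinjectionwasm.DetectSQLi(tc)
        if resGo != resWasm {
          t.Errorf("result mismatch %v %v", resGo, resWasm)
        }
      })
    }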
  12. wazero performance

    goos: darwin
    goarch: arm64
    pkg: github.com/tetratelabs/wazero/internal/engine/wazevo
    Benchmark_wazevo
    Benchmark_wazevo/old
    Benchmark_wazevo/old-10       106   10891799 ns/op
    Benchmark_wazevo/wazevo
    Benchmark_wazevo/wazevo-10    297    4031899 ns/op
    PASS
  13. Reusing native libraries in Go

    Generally use cgo, but cgo is not Go
    • Requires native toolchain
    • Can have compatibility issues with e.g. libc
    • etc, etc
  14. Reusing native libraries in Go using WebAssembly

    • Only need native toolchain when releasing
      ◦ Built library is cross-platform
    • Pure Go WebAssembly runtime = pure Go applications
  15. WebAssembly is not Assembly

    • Stack-based virtual machine bytecode
      ◦ So is Java
    • No semantics
      ◦ No operations on structs
    • Some hardware bias
      ◦ SIMD instructions

    Common hardware consists of register machines, not stack machines
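To make "stack-based virtual machine bytecode" concrete, here is a toy interpreter sketched in Go. The opcode names mirror Wasm's `i64.const`, `i64.add`, `i64.sub`, and `i64.mul`, but the `Op` and `Instr` types and their encoding are invented for this illustration and are not real Wasm or wazero internals.

```go
package main

import "fmt"

// Op is a toy opcode; the encoding is made up for this sketch.
type Op int

const (
	Const Op = iota // push Arg onto the operand stack
	Add             // pop b, pop a, push a+b
	Sub             // pop b, pop a, push a-b
	Mul             // pop b, pop a, push a*b
)

// Instr is one instruction; Arg is only used by Const.
type Instr struct {
	Op  Op
	Arg int64
}

// run evaluates prog on an operand stack and returns the value left on top.
// It assumes a well-formed program (no stack underflow checks).
func run(prog []Instr) int64 {
	var stack []int64
	pop := func() int64 {
		v := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		return v
	}
	for _, in := range prog {
		if in.Op == Const {
			stack = append(stack, in.Arg)
			continue
		}
		b := pop() // the right operand is on top of the stack
		a := pop()
		switch in.Op {
		case Add:
			stack = append(stack, a+b)
		case Sub:
			stack = append(stack, a-b)
		case Mul:
			stack = append(stack, a*b)
		}
	}
	return pop()
}

func main() {
	// (5 - 1) * 4, written in stack order.
	prog := []Instr{{Const, 5}, {Const, 1}, {Sub, 0}, {Const, 4}, {Mul, 0}}
	fmt.Println(run(prog))
}
```

Operands never name a register: every instruction implicitly reads from and writes to the stack, which is the mismatch with register-based hardware that the slide points out.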
  16. So not fast?

    • Often no: https://zaplib.com/docs/blog_post_mortem.html
    • Overhead when calling functions across sandbox
    • Self-contained apps might be “pretty fast”
      ◦ But it all depends on the runtime
  17. WebAssembly has a bias but is not only for web

    • WebAssembly is a web standard
      ◦ Biggest contributors in the early days were Google and Mozilla
      ◦ Much tooling focuses on integrating wasm and browser
    • Language runtimes can allow running outside the browser
      ◦ Go: wazero, Rust: wasmtime, JS: V8, etc
      ◦ Tooling still immature for integrating wasm and host process
  18. What is WebAssembly?

    • General bytecode for targeting from any programming language
    • Sandboxed execution model
      ◦ No access to host memory, syscalls without explicitly exposing through ABI
    • No structure, standard library, etc
      ◦ Compile the entire language into binary

    Compile existing binary for loading in web (not as fast as native but maybe enough)
    Allow extending apps in a safe way (or only way, e.g. Go)
  19. What is an ABI?

    • Interface to allow interaction between two separately compiled executables
    • Generally refers to code executing in the same process
      ◦ IPC is the interface between different processes on a single machine
      ◦ RPC is the interface between different processes (often) on different machines
  20. What is a WebAssembly ABI

    • Interface for WebAssembly program to interact with the host
    • By default, WebAssembly is in a sandbox that cannot interact at all
    • When extending / writing plugins, gives access to the extended object
  21. ABI parts

    • Functions with numeric parameters
    • …
    • Nope that’s it (at least in current finalized spec)
  22. Shared memory

    • Sandbox cannot access host memory directly
    • For host to pass to sandbox, sandbox must alloc / dealloc
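The alloc / dealloc handshake can be sketched without a real runtime. The Go simulation below models the sandbox as a byte slice plus functions that take only numbers; the `guest` type, its bump allocator, and `countVowels` are all hypothetical stand-ins for whatever a compiled module would export.

```go
package main

import "fmt"

// guest simulates a sandboxed module: a linear memory plus "exported"
// functions whose parameters and results are only numbers.
type guest struct {
	mem  []byte // linear memory owned by the sandbox
	next uint32 // bump-allocator cursor (illustrative only)
}

// alloc reserves size bytes in guest memory and returns their offset.
func (g *guest) alloc(size uint32) uint32 {
	ptr := g.next
	g.next += size
	return ptr
}

// dealloc is a no-op for this bump allocator; a real guest would free here.
func (g *guest) dealloc(ptr, size uint32) {}

// countVowels receives a string as (offset, length), since the ABI has
// no string type to pass directly.
func (g *guest) countVowels(ptr, size uint32) uint32 {
	var n uint32
	for _, c := range g.mem[ptr : ptr+size] {
		switch c {
		case 'a', 'e', 'i', 'o', 'u':
			n++
		}
	}
	return n
}

// callWithString is the host side: ask the sandbox for memory, copy the
// bytes in, call with numeric arguments, then release the memory.
func callWithString(g *guest, s string) uint32 {
	size := uint32(len(s))
	ptr := g.alloc(size)
	defer g.dealloc(ptr, size)
	copy(g.mem[ptr:], s)
	return g.countVowels(ptr, size)
}

func main() {
	g := &guest{mem: make([]byte, 64)}
	fmt.Println(callWithString(g, "webassembly"))
}
```

With a real runtime the shape stays the same, but `alloc`, `dealloc`, and the worker function would be exports of the module (for example `malloc` and `free` from its libc), and the copy would go through the runtime's memory API.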
  23. Host ABI - WASI

    • POSIX-like set of host functions for accessing OS functionality
      ◦ fd_write($fd i32, $iovec i32, $len i32, $written i32) result i32
    • Invoked by wasi-libc
    • De facto target for non-browser Wasm binaries
      ◦ Another similar host ABI is emscripten, for browsers
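The `fd_write` signature shows how a POSIX-style call is flattened into numbers: `$iovec` is an offset into guest memory where an array of (pointer, length) pairs lives, and `$written` is an offset where the host stores the byte count. Below is a rough Go sketch of the host side under simplifying assumptions: the file descriptor lookup is omitted and output goes to a supplied `io.Writer`, bounds checks are skipped, and `fdWrite` and `demo` are illustrative names rather than any runtime's API.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

const errnoIO = 29 // "io" errno in WASI preview 1

// fdWrite sketches a host implementation: walk the iovec array in guest
// memory, write each buffer to out, store the total at offset written.
// Each iovec entry is 8 bytes: buffer offset (u32) then length (u32).
func fdWrite(mem []byte, out io.Writer, iovs, iovsLen, written uint32) uint32 {
	var total uint32
	for i := uint32(0); i < iovsLen; i++ {
		entry := mem[iovs+i*8 : iovs+i*8+8]
		ptr := binary.LittleEndian.Uint32(entry[0:4])
		size := binary.LittleEndian.Uint32(entry[4:8])
		n, err := out.Write(mem[ptr : ptr+size])
		total += uint32(n)
		if err != nil {
			return errnoIO
		}
	}
	binary.LittleEndian.PutUint32(mem[written:], total)
	return 0 // errno 0 means success
}

// demo lays out two buffers and an iovec array in a fake guest memory,
// calls fdWrite, and returns the output, the errno, and the stored count.
func demo() (string, uint32, uint32) {
	mem := make([]byte, 64)
	copy(mem[32:], "hello ")
	copy(mem[40:], "wasm\n")
	binary.LittleEndian.PutUint32(mem[0:], 32)  // iovec[0] buffer offset
	binary.LittleEndian.PutUint32(mem[4:], 6)   // iovec[0] length
	binary.LittleEndian.PutUint32(mem[8:], 40)  // iovec[1] buffer offset
	binary.LittleEndian.PutUint32(mem[12:], 5)  // iovec[1] length
	var out bytes.Buffer
	errno := fdWrite(mem, &out, 0, 2, 16)
	return out.String(), errno, binary.LittleEndian.Uint32(mem[16:])
}

func main() {
	s, errno, n := demo()
	fmt.Printf("%q errno=%d written=%d\n", s, errno, n)
}
```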
  24. 00 61 73 6D 01 00 00 00 01 00 01 60 01 73 01 73 06 03 00 01 00 02 0A 00 01 00 00 20 00 50 04 7E 42 01 05 20 00 20 00 42 01 7D 10 00 7E 0B 0B 15 17

    wazeroir
    Run in interpreter
    x86/arm64 Native call
  25. Compile and invoke

    import "github.com/tetratelabs/wazero"

    //go:embed add.wasm
    var wasmAdd []byte

    r := wazero.NewRuntime(ctx)
    defer r.Close(ctx)
    mod, _ := r.Instantiate(ctx, wasmAdd)
    res, _ := mod.ExportedFunction("add").Call(ctx, 1, 2)
  26. wasi-libc

    • Implementation of libc with syscalls invoking WASI host functions
    • Based on musl
    • Many common features missing
      ◦ mmap, signals
      ◦ Threads support is under development
  27. wasi-sdk

    • Collection of LLVM and wasi-libc
    • Compiles to wasm32-wasi out of the box

    docker run -it --rm -v `pwd`:/workspace --workdir /workspace ghcr.io/webassembly/wasi-sdk:wasi-sdk-20 bash
  28. Takeaways

    • Slower than Go regexp for simple cases
      ◦ There is overhead when invoking Wasm
    • Dramatically faster for complex cases
      ◦ Security filters will benefit, URL routers will not
    • Can support cgo as well with same library
      ◦ Both wazero and cgo invoke a C ABI
      ◦ Main difference is memory sandbox
  29. Takeaways

    • Not faster for SQL injection testing, faster for XSS testing
      ◦ Corresponds to optimization work done for former but not latter
    • Rewrite does come with bugs
      ◦ “Don't panic on an incomplete HTML entity”
    • Can still support a rewrite by fuzz testing against native
  30. Takeaways

    • Wasm always outperforms for realistic cases
      ◦ Dictionary of 4 terms is very small
    • Go rewrite was competitive in initial version
      ◦ Maintaining a port is hard
  31. General takeaways (Coraza project)

    • Reimplemented business logic from cgo to enable pure Go
      ◦ Significant engineering time to do rewrite
      ◦ Even more to fix bugs and performance issues
    • Wasm can allow reusing instead of rewriting
      ◦ Spend engineering time on creation
      ◦ Targeting both Wasm and cgo provides flexibility with relatively low code overhead
    • Wasm can support rewriting
      ◦ Ideal is always to rewrite if resources allow
      ◦ Testing / fuzzing in pure Go vs the actual original library
      ◦ Maintainers don’t need C toolchain, Docker can create prebuilt binary
  32. General takeaways (cgo vs wazero)

    • cgo will support any compilable code
      ◦ Wasm still missing support for exceptions, signals, more
    • Wasm compiled at start of app
      ◦ Can be cached for platform, in which case still relatively hefty cache key computation
    • Wasm supports profiling
      ◦ https://github.com/stealthrocket/wzprof
      ◦ And possibly step debugging in the future?
    • Wasm can be used by any Go app
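The "hefty cache key computation" point is easier to see with a sketch. The Go snippet below derives a key by hashing the whole Wasm binary and tagging it with the platform; this is one plausible scheme assumed for illustration, not a description of how wazero actually keys its compilation cache, and `cacheKey` is a hypothetical name.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"runtime"
)

// cacheKey (hypothetical) identifies compiled code for one binary on one
// platform. Hashing reads every byte of the module, so the cost grows
// with binary size even when compilation itself is skipped.
func cacheKey(wasm []byte) string {
	sum := sha256.Sum256(wasm)
	return runtime.GOOS + "-" + runtime.GOARCH + "-" + hex.EncodeToString(sum[:])
}

func main() {
	// Stand-in bytes; a real module would be the embedded .wasm file.
	module := []byte{0x00, 0x61, 0x73, 0x6D, 0x01, 0x00, 0x00, 0x00}
	fmt.Println(cacheKey(module))
}
```

For a native library compiled to a multi-megabyte module, a hash like this runs on every application start, which is the residual cost the slide refers to.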
  33. But no threads…

    • Stateful code must lock invocation
      ◦ Each regular expression has a lock
    • Stateless code can use sync.Pool
    • Each expression or pooled object duplicates significant memory
      ◦ Strings, such as error messages
      ◦ Miscellaneous global state
  34. WebAssembly Threads

    • Atomic memory instructions in WebAssembly
    • Phase 3 draft
      ◦ Phase 4 is near-final
    • Prototype implementation for wazero and go-re2
      ◦ Locks removed
      ◦ Single module, no extraneous memory usage
      ◦ Waiting on spec phase 4 before upstreaming