for user-space programs to access OS kernel functionality.
• System call hooks allow us to intercept a system call and redirect execution to a user-defined hook function.
[Figure: a system call hook sits between the user-space program and the kernel-space OS subsystem and redirects execution to a user-defined hook function.]
• System call hook mechanisms allow us to transparently apply user-space OS subsystems to existing applications.
• A system call hook can transparently glue user-space subsystems to existing applications.
• User-space OS subsystems can be highly performant: in TCP ping-pong throughput [K reqs/sec], lwIP on DPDK (a user-space TCP stack) is 8.8 times faster than the Linux TCP stack.
[Figure: a system call hook redirects the user-space program to a user-defined hook function backed by a user-space OS subsystem instead of the kernel-space OS subsystem.]
There are several options. Categories of system call hook mechanisms:
- Common Kernel Support
- BPF-based Hooks
- Non-upstreamed Extensions
- Function Call Hooking
- Binary Rewriting
Table 1: Categories of system call hook mechanisms and their properties (§ 2).
Property columns: System Call Emulation, Easy-to-Use, Instruction-level Hook.
- Common Kernel Support (ptrace): ✔ ✔ ✔. High overhead due to process scheduling between tracer and tracee.
- BPF-based Hooks: ✔ ✔ ✔. Cannot achieve System Call Emulation due to BPF VM restrictions.
- Non-upstreamed Extensions: ✔ ✔ ✔. Requires modifying kernels or standard libraries. Concerns about security, stability, and future maintenance costs.
- Function Call Hooking (LD_PRELOAD): ✔ ✔ ✔.
- Binary Rewriting: ✔ ✔ ✔ ✔. Has advantages over the other options.
Previous Mechanisms:
- designed for x86: Instruction Punning, e9patch, DataHook, lazypoline, X-Containers, zpoline, …
- for ARM64: HermiTux, ASC-Hook
-> only a few choices for ARM64
triggers a system call
• svc: 0x01 0x00 0x00 0xd4 (#imm is 0)
• Our goal: replace svc with something that jumps to a user-defined hook function.
• Question: what should we put there?
[Figure: virtual memory layout; the svc location (marked ???) must jump to the user-defined hook function.]
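As a rough illustration of the byte pattern above, the following Python sketch scans a buffer for `svc #0` at 4-byte-aligned offsets. The function name `find_svc_offsets` and the sample buffer are hypothetical and only serve this example; this is not the svc-hook implementation.

```python
import struct

# svc #0 is stored as the little-endian bytes 01 00 00 d4, i.e. the word 0xd4000001.
SVC_0 = 0xD4000001
NOP = 0xD503201F  # AArch64 nop, used here only as filler


def find_svc_offsets(code: bytes) -> list[int]:
    """Return the 4-byte-aligned offsets in `code` that hold `svc #0`."""
    offsets = []
    for off in range(0, len(code) - 3, 4):
        (word,) = struct.unpack_from("<I", code, off)
        if word == SVC_0:
            offsets.append(off)
    return offsets


# Hypothetical code buffer: nop, svc #0, nop, svc #0
sample = struct.pack("<4I", NOP, SVC_0, NOP, SVC_0)
print(find_svc_offsets(sample))
```

Because every ARM64 instruction occupies one aligned 4-byte word, a plain word comparison is enough for this sketch; a real scanner would additionally restrict itself to executable regions.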
fixed size
• -> We can replace svc with any other single instruction
• Existing Methods:
  • #1: bl (HermiTux [VEE '19])
  • #2: br (ASC-Hook [LCTES '25])
• At first glance, a bl instruction looks like a reasonable option
• bl: used for function calls
• Pitfall: bl saves the return address to the x30 register -> loses the original return address

Pitfall: bl Breaks Return Address
Register state as execution passes through the bl placed at address R1:
- before the bl: pc: R1-4, x30: <original return address>
- at the bl: pc: R1, x30: <original return address>
- bl executes: pc: R1, x30: R1+4
- in the hook: pc: <hook entry>, x30: R1+4
- hook's ret: pc: <hook exit>, x30: R1+4
- back after the bl: pc: R1+4, x30: R1+4
- enclosing function's ret: pc: <end of the function>, x30: R1+4 -> cannot return to the caller

bl: Return Address Lost
• HermiTux [VEE '19] uses this approach
• It mitigates this by checking, using binary analysis, whether the loss of the return address is critical
• It has fallbacks such as a trap-based approach
• -> High performance penalty
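The register trace above can be replayed with a toy model. This is a minimal sketch that tracks only pc and x30 and assumes the enclosing function never saved x30 to the stack; the addresses `R1`, `CALLER`, and `HOOK` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Regs:
    pc: int
    x30: int  # link register


def bl(regs: Regs, target: int) -> None:
    """Branch with link: x30 <- address of the next instruction, pc <- target."""
    regs.x30 = regs.pc + 4
    regs.pc = target


def ret(regs: Regs) -> None:
    """Return: pc <- x30."""
    regs.pc = regs.x30


R1 = 0x1000      # hypothetical address of the replaced svc
CALLER = 0x2000  # hypothetical original return address held in x30
HOOK = 0x8000    # hypothetical hook entry

regs = Regs(pc=R1, x30=CALLER)
bl(regs, HOOK)   # the bl that replaced svc overwrites x30 with R1+4
ret(regs)        # the hook returns to R1+4
at_resume = regs.pc
ret(regs)        # the enclosing function's ret uses the clobbered x30
after_function_ret = regs.pc
print(hex(at_resume), hex(after_function_ret))
```

In this model the second `ret` lands on R1+4 again instead of `CALLER`, which is the "cannot return to the caller" state from the trace.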
Possible Primitive #2: br
• br jumps to the address stored in a specified register without saving the return address to x30

br x8: zpoline-Like Approach
• Possible idea: zpoline-like approach
• The x8 register holds the system call number on Linux
• What if we replace svc with 'br x8'? The code before the svc sets syscall_nr to x8, so br x8 jumps to the address held in x8, near 0x0
[Figure: virtual memory with code at address 0x0 (labeled "bl hook" on the slide) leading to the user-defined hook function.]

Pitfall of br x8: PC Alignment
• The program stops due to a PC misalignment fault
• The system call number is not always a multiple of 4, so it violates the alignment requirement in most cases

br x8: Additional Costs
• Rewrite the instruction that sets the system call number to x8 before svc so that it sets (syscall_nr * 4) and satisfies PC alignment
• Requires additional run-time costs to find and replace the instructions that set the system call number to x8 before the svc instruction

br x8: Another Issue
• What if there is an instruction that refers to x8 AFTER aligning x8?
• May cause undefined behavior

• ASC-Hook [LCTES '25] uses this approach
• Cons:
  • Additional run-time costs
  • May cause unexpected behavior
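To make the alignment pitfall concrete, here is a small sketch that checks a handful of Linux ARM64 system call numbers against the 4-byte PC alignment requirement, and then checks the scaled values (syscall_nr * 4). The selection of system calls and the helper name `is_pc_aligned` are choices made for this example.

```python
# A few Linux ARM64 system call numbers (generic syscall table).
SYSCALLS = {
    "openat": 56,
    "close": 57,
    "read": 63,
    "write": 64,
    "exit": 93,
    "getpid": 172,
}


def is_pc_aligned(addr: int) -> bool:
    """ARM64 requires the program counter to be 4-byte aligned."""
    return addr % 4 == 0


# Using the raw system call number as a jump target faults for these calls.
misaligned = sorted(name for name, nr in SYSCALLS.items() if not is_pc_aligned(nr))

# Scaling the number by 4 always yields an aligned target.
scaled_ok = all(is_pc_aligned(nr * 4) for nr in SYSCALLS.values())

print(misaligned, scaled_ok)
```

The scaled value fixes the jump, but any later instruction that still reads x8 as a system call number now sees a different value, which is the second issue raised above.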
svc-hook Primitive: b
• b #imm jumps to an offset within ±128 MiB
• without saving the return address
• -> This means it does not suffer from the bl-related issue of overwriting x30

svc-hook: Why Trampoline?
Register state when the b at address R1 jumps straight to the hook function:
- at the b: pc: R1, x30: <original return address>
- in the hook: pc: <hook entry>, x30: <original return address>
- at the hook's ret: pc: <end of the function>, x30: <original return address> -> cannot return to R1+4
• -> We cannot simply jump straight to the hook function
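The ±128 MiB limit follows from the encoding of b: a 26-bit signed word offset, i.e. a byte offset in [-2^27, 2^27). The sketch below encodes and decodes that field; the helper names `encode_b` and `decode_b_offset` are hypothetical and not taken from svc-hook.

```python
B_OPCODE = 0x14000000        # AArch64 unconditional branch (b): bits [31:26] = 000101
B_RANGE = 128 * 1024 * 1024  # 2**27 bytes


def encode_b(src: int, dst: int) -> int:
    """Encode `b` placed at address `src` that jumps to address `dst`."""
    offset = dst - src
    if offset % 4 != 0:
        raise ValueError("branch target must be 4-byte aligned relative to src")
    if not -B_RANGE <= offset < B_RANGE:
        raise ValueError("target is outside the ±128 MiB range of b")
    imm26 = (offset >> 2) & 0x3FFFFFF  # signed word offset, two's complement
    return B_OPCODE | imm26


def decode_b_offset(insn: int) -> int:
    """Recover the signed byte offset from an encoded `b`."""
    imm26 = insn & 0x3FFFFFF
    if imm26 & (1 << 25):  # sign bit of the 26-bit field
        imm26 -= 1 << 26
    return imm26 * 4


print(hex(encode_b(0x1000, 0x1100)))
```

A backward branch round-trips as well: `decode_b_offset(encode_b(0x8000000, 0x1000))` gives `0x1000 - 0x8000000`.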
svc-hook: Trampoline for Each svc
a trampoline code for each replaced svc instruction
• To embed the return address
• The b jumps to the instantiated trampoline code
[Figure: the b #imm at R1 jumps to its per-svc trampoline code, which leads to the user-defined hook function.]
svc-hook: How Trampoline Works
• 1) save registers
• 2) call hook function
• 3) return from the hook function
• 4) restore registers
• 5) return to the original control flow (jump to R1+4)

svc-hook: No Context Lost
• No register information is lost
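The five steps above can be modeled in a few lines. This is a toy sketch, not the actual trampoline code: registers are a dictionary, the hook result is written to x0 (where Linux on ARM64 returns system call results), and the names `make_trampoline` and `dummy_getpid_hook` as well as the addresses are hypothetical.

```python
from typing import Callable, Dict

Regs = Dict[str, int]


def make_trampoline(svc_addr: int, hook: Callable[[Regs], int]) -> Callable[[Regs], None]:
    """Instantiate a trampoline for one svc; its return address is embedded."""
    return_addr = svc_addr + 4  # R1+4, fixed per replaced svc

    def trampoline(regs: Regs) -> None:
        saved = dict(regs)          # 1) save registers
        result = hook(dict(saved))  # 2) call the hook, 3) the hook returns
        regs.clear()
        regs.update(saved)          # 4) restore registers
        regs["x0"] = result         # system call result goes to x0
        regs["pc"] = return_addr    # 5) jump back to R1+4

    return trampoline


def dummy_getpid_hook(regs: Regs) -> int:
    """Emulate getpid (172 on Linux ARM64) with a dummy process ID."""
    regs["x30"] = 0xDEAD  # the hook may clobber its own copy freely
    return 12345 if regs["x8"] == 172 else -38  # -38 is -ENOSYS


R1 = 0x1000  # hypothetical address of the replaced svc
regs = {"pc": R1, "x0": 0, "x8": 172, "x30": 0x2000}
make_trampoline(R1, dummy_getpid_hook)(regs)
print(regs)
```

After the call, x30 still holds the original return address and pc is R1+4, matching the "no register information lost" property.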
[Figure: the b #imm is surrounded by allocated regions, so we cannot place the trampoline code ❌ -> the hook fails.]
Does this situation ever occur in real-world binaries?
checker program
• reads the ELF header and creates a memory map
• finds all svc instructions in the executable regions
• for each svc instruction:
  • checks whether the ±128 MiB range of the svc instruction is entirely used for placing objects of the binary or not
• Run this program for all Ubuntu 24.04 LTS apt packages
[Figure: ELF memory map with the svc at R1; the usable memory region for the per-svc trampoline code spans R1-128 MiB to R1+128 MiB.]

Results
- Total: 81625 packages
- ARM64 Binary: 22441 packages
- ELF Binary with svc: 705 packages
- svc instructions: 1588
- There is no svc whose ±128 MiB range is entirely used for placing objects of the ELF

Figure 2: CDF (§ 3.3.2). Distance to the nearest free memory region; x-axis: Offset [MiB], y-axis: CDF. Maximum: 90.2 MiB (limit: 128 MiB); 99th percentile: 23.4 MiB.
• -> This issue does not significantly limit the applicability
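The per-svc check can be sketched as follows, assuming the memory map is given as a list of occupied [start, end) address ranges. The function names and the simplifications (distance measured to the boundary of the occupied block, trampoline size ignored) are assumptions of this example and not the checker program itself.

```python
from typing import List, Optional, Tuple

MIB = 1 << 20
RANGE = 128 * MIB  # reach of the b instruction


def merge(regions: List[Tuple[int, int]]) -> List[List[int]]:
    """Merge overlapping or touching [start, end) regions."""
    merged: List[List[int]] = []
    for start, end in sorted(regions):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def nearest_free_distance(svc_addr: int, regions: List[Tuple[int, int]]) -> Optional[int]:
    """Distance from svc_addr to the nearest free memory, or None if the
    whole ±128 MiB range around svc_addr is occupied."""
    for start, end in merge(regions):
        if start <= svc_addr < end:
            candidates = [end - svc_addr]  # free space above the block
            if start > 0:
                candidates.append(svc_addr - start)  # free space below it
            dist = min(candidates)
            return dist if dist < RANGE else None
    return 0  # svc_addr is not inside any occupied region


# Hypothetical layout: two adjacent 2 MiB and 8 MiB mappings.
BASE = 0x400000
layout = [(BASE, BASE + 2 * MIB), (BASE + 2 * MIB, BASE + 10 * MIB)]
print(nearest_free_distance(BASE + 1 * MIB, layout) // MIB)
```

Collecting this distance for every svc and sorting the values gives the kind of CDF summarized in Figure 2.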
overhead of system call hooking
• Measured the time needed to hook a system call
• Hook the getpid system call (the simplest system call)
• Emulate it by returning a dummy process ID
Figure 3: Network server performance (§ 4.2). y-axis: throughput [K reqs/sec]; mechanisms: Linux, ptrace, seccomp, brk, svc-hook, LD_PRELOAD. Annotations compared to LD_PRELOAD, in slide order: 93%, 16%, 15%, 2%.
Figure 3: Network server performance (§ 4.2), second panel. Mechanisms: Linux, ptrace, seccomp, brk, svc-hook, LD_PRELOAD. Annotations compared to LD_PRELOAD, in slide order: 95%, 36%, 34%, 7%.
• svc-hook is efficient enough to preserve the performance benefits of user-space OS subsystems
CPUs
• based on binary rewriting
• replaces svc with a b instruction
• instantiates dedicated trampoline code for each replaced svc
• The ±128 MiB jump offset of b does not limit the applicability
• Efficient enough to maintain the performance merit of user-space OS subsystems
• svc-hook is open source: https://github.com/retrage/svc-hook
• Supported Platforms: Linux (Android), FreeBSD, NetBSD
Supplemental
The 6-page paper is too short to present all of our work!
OS Environment
C. Other CPU Architectures
D. Memory Footprint of the Trampoline Code
E. Comparison with ASC-Hook
F. Performance Overhead for Other Workloads
  A. SQLite
  B. PostgreSQL
  C. Samba
zpoline [ATC '23]
• Employed by almost all run-time binary rewriting syscall hooks
1. Run LIBSVCHOOK=hook.so LD_PRELOAD=libsvchook.so <target>
2. libsvchook.so has a function that is called before the target's main function:
   1. Scan the memory regions to find svc
   2. Set up the trampoline code
   3. Rewrite svc
   4. Load the hook function specified with the LIBSVCHOOK environment variable
supported
• OpenBSD and macOS: Not applicable due to restrictions of mprotect
• OpenBSD:
  • mimmutable(2): Prohibits further manipulation of page mappings
  • The libc calls this syscall once a program is loaded
• macOS:
  • Denies R-X -> RWX memory region attribute changes
• These limitations are common to all run-time binary rewriting approaches
value
• x86: Not applicable
  • syscall/sysenter are 2 bytes: too small to hold jump/call instructions
• RISC-V: May be applicable, but less flexible
  • j imm is equivalent to the ARM64 b instruction
  • offset range: ±1 MiB, narrower than ARM64
Breakdown of the trampoline code
- Concept: save regs, call hook func, restore regs, jump to original CF
- Implementation: a Common part and an Uncommon part (slide labels: call syscall table, jump to original CF, Misc)
- Only one common part: 428 bytes
- The uncommon part is instantiated for each svc: 28 bytes

• Measured the footprint
• Target: int main() {}
• 728 svc instructions

Size [bytes]
Common: 428
Uncommon: 20384
Total: 20812

Results
428 bytes + 728 svc * 28 bytes = 20,812 bytes ≈ 5.08 pages with a 4 KiB page size
The memory footprint is acceptably small
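The footprint arithmetic can be reproduced directly from the numbers on the slide. The function name `trampoline_footprint` is hypothetical; the constants are the measured sizes quoted above.

```python
COMMON_BYTES = 428   # single shared part of the trampoline
PER_SVC_BYTES = 28   # part instantiated for each replaced svc
PAGE_SIZE = 4096     # 4 KiB pages


def trampoline_footprint(num_svc: int) -> int:
    """Total trampoline bytes for a process with `num_svc` replaced svc instructions."""
    return COMMON_BYTES + num_svc * PER_SVC_BYTES


total = trampoline_footprint(728)
pages = total / PAGE_SIZE
print(total, round(pages, 2))
```

Since the per-svc part dominates, the footprint grows linearly with the number of svc instructions found in the process.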
finish installing a hook to the ls command binary
• ASC-Hook requires disassembly
  • To find syscall assignment instructions
• svc-hook only needs to find svc instructions
  • A simple pattern matching works well

Time to install a hook
Mechanism: Time [millisec]
ASC-Hook: 1890
svc-hook: 25
-> 75x faster
pgbench: 10923 and 10790 TPS (1.22% difference)
• PostgreSQL 17.6
• Ran the benchmark locally
• 12/16 cores: DB
• 4/16 cores: pgbench
• 36 clients in total
• TPS: Transactions Per Second