Introspecting Mach

Digging into Mach/Darwin internals with DTrace and GDB.

Volodymyr Kyrylov

March 31, 2012

Transcript

  1. introspection |ˌintrəˈspekSHən| noun
    the examination or observation of one's own mental and emotional processes: quiet introspection can be extremely valuable.
  2. History
    • Carnegie Mellon University
    • 1984-1995
    • Accent successor
    • UNIX kernel is a “dumping ground for virtually every new feature or facility”
    • service operating system
    • multiprocessing
    • transparent distributed operation, message-passing
    • started on top of 4.3BSD code base
  3. Mach
    • Microkernel based on four key abstractions
      • tasks hold resources and run threads
      • threads are contexts of execution on processors
      • ports are unidirectional queues between tasks
      • messages are structured objects sent to ports
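The port and message abstractions can be pictured with a toy model in plain C. This is a minimal sketch, not Mach code: every `toy_*` name, the queue depth, and the integer task ids are assumptions made up for illustration. It shows only the shape of the idea: a port is a one-way queue, anyone holding it may enqueue a structured message, and only the single receiving task may dequeue.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_QUEUE_DEPTH 4   /* hypothetical fixed queue size */
#define TOY_BODY_MAX    32  /* hypothetical payload limit */

/* A message is a structured object: a small header plus a payload. */
struct toy_msg {
    int  id;                    /* what the receiver is asked to do */
    char body[TOY_BODY_MAX];    /* payload, always NUL-terminated */
};

/* A port is a one-way queue; exactly one task receives from it. */
struct toy_port {
    int            receiver;    /* id of the task on the receiving end */
    size_t         head, count; /* ring buffer state */
    struct toy_msg queue[TOY_QUEUE_DEPTH];
};

/* Initialise an empty port whose messages go to task `receiver`. */
static void toy_port_init(struct toy_port *p, int receiver)
{
    assert(p != NULL);
    memset(p, 0, sizeof *p);
    p->receiver = receiver;
}

/* Enqueue a message. Returns 0 on success, -1 if the queue is full. */
static int toy_send(struct toy_port *p, int id, const char *body)
{
    assert(p != NULL && body != NULL);
    if (p->count == TOY_QUEUE_DEPTH)
        return -1;
    struct toy_msg *m = &p->queue[(p->head + p->count) % TOY_QUEUE_DEPTH];
    m->id = id;
    strncpy(m->body, body, TOY_BODY_MAX - 1);
    m->body[TOY_BODY_MAX - 1] = '\0';
    p->count++;
    return 0;
}

/* Dequeue the oldest message on behalf of `task`.
 * Returns 0 on success, -1 if `task` is not the receiver
 * or the queue is empty. */
static int toy_receive(struct toy_port *p, int task, struct toy_msg *out)
{
    assert(p != NULL && out != NULL);
    if (task != p->receiver || p->count == 0)
        return -1;
    *out = p->queue[p->head];
    p->head = (p->head + 1) % TOY_QUEUE_DEPTH;
    p->count--;
    return 0;
}
```

A reply would need a second port pointing the other way, which is why the queues are described as unidirectional.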
  4. Mach
    • All about IPC
      • no direct access to another task’s data: separate address spaces
      • no syscalls: a trap results in a message transfer
    • Everything is a server
      • even a filesystem
      • BSD code components turned into Mach servers
      • component failure does not take the system down
  5. Real World
    BSD syscall: 21 μs
    Mach IPC: 144 μs
    * numbers are almost made up, just get the idea
    ** L4 guys are happy; that’s the other huge story
    The Mach project failed, Mach did not.
  6. Mach appliances
    • DEC OSF/1 (1988)
    • IBM Workplace OS (1991)
    • NeXTSTEP (1989)
    • MkLinux (1996)
      • First Linux on PowerPC by Apple
    • and ...
  7. UNIX
    • Single UNIX Specification
      • just a standardized set of APIs!
      • everything is a file, right
    • Implementation details are not defined!
      • Linux
      • BSD (Free, Net, Open, DragonFly)
      • Solaris (Illumos, SmartOS)
      • Darwin (XNU kernel)
        • definitely the most fun one
  8. XNU recipe
    • let’s take a Mach microkernel
    • add a BSD (FreeBSD) server
    • blend them together
      • we need speed, right
    • write a stable driver framework
      • in C++
  9. Getting inside the kernel
    % hdiutil mount -quiet kernel_debug_kit_10.7.3_11d50.dmg
    % plj /Library/Preferences/SystemConfiguration/com.apple.Boot.plist
    { "Kernel Flags": "-v kmem=1" }
    % cat kdebug.gdb
    set logging file /tmp/gdb
    set logging on
    #set logging overwrite
    target darwin-kernel
    file /Volumes/KernelDebugKit/mach_kernel
    source /tank/proger/dev/darwin/xnu/kgmacros
    source /tank/proger/dev/darwin/xnu/kgmacros.my
    attach
    % sudo gdb -q -x ./kdebug.gdb
    Loading Kernel GDB Macros package. Type "help kgm" for more info.
    Connected.
    (gdb)
  10. Let’s begin
    % /bin/cat &
    [1] 11131
    [1]  + 11131 suspended (tty input)  /bin/cat
    % ps -lp $!
      UID   PID  PPID    F CPU PRI NI      SZ  RSS WCHAN  S  ADDR              TTY      TIME     CMD
      501 11131 10620 4006   0  26  5 2434924  464 -      TN ffffff801a388740  ttys000  0:00.00  /bin/cat
    (gdb) showproc 0xffffff801a388740
    task                vm_map              ipc_space           #acts  pid  process  io_policy  wq_state  command
    0xffffff80172aa530  0xffffff80194c8d98  0xffffff8018d6e3b0  1  -  -  11131  0xffffff801a388740  cat
  11. BSD to Mach
    • A BSD process maps onto a Mach task
    • A task is a set of:
      • virtual address space (struct vmmap)
      • port namespace (struct ipc_space)
      • a set of execution thread states
    • A process extends that by:
      • UNIX credentials (pid, uid, etc.)
      • file descriptor table
      • signals
      • ...
  12. Mach Ports
    • A unique queue identifier
    • Associated with capabilities
      • One receive right
        • (it actually is a port identifier)
      • Many send/send-once rights
    • Capabilities associate with port namespaces
    • Namespaces associate with tasks
    • A thread within a task references a port by a local name (like a file descriptor)
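The relationship between rights, namespaces and local names can be sketched as a small per-task table. This is a toy model under stated assumptions, not the kernel's `struct ipc_space`: the `toy_*` names, the fixed slot count, and the use of plain integers as system-wide port ids are all hypothetical. Send-once rights and the "only one receive right per port" rule are left out to keep it short.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical capability bits a task can hold for a port. */
enum {
    TOY_RIGHT_RECEIVE = 1 << 0,
    TOY_RIGHT_SEND    = 1 << 1
};

#define TOY_SPACE_SLOTS 8   /* hypothetical namespace size */

struct toy_entry {
    int port;     /* system-wide port id; 0 marks a free slot */
    int rights;   /* bitmask of TOY_RIGHT_* held by this task */
};

/* One namespace per task; a local name is a slot number + 1. */
struct toy_space {
    struct toy_entry slot[TOY_SPACE_SLOTS];
};

/* Record that the task owning `s` holds `rights` on `port`.
 * Returns the task-local name, or 0 if the namespace is full.
 * Rights for a port the task already names are merged. */
static int toy_space_insert(struct toy_space *s, int port, int rights)
{
    int free_slot = -1;
    assert(s != NULL && port != 0);
    for (int i = 0; i < TOY_SPACE_SLOTS; i++) {
        if (s->slot[i].port == port) {
            s->slot[i].rights |= rights;
            return i + 1;
        }
        if (s->slot[i].port == 0 && free_slot < 0)
            free_slot = i;
    }
    if (free_slot < 0)
        return 0;
    s->slot[free_slot].port = port;
    s->slot[free_slot].rights = rights;
    return free_slot + 1;
}

/* Translate a local name into a port id, but only if the task
 * holds every right in `needed`. Returns 0 on failure. */
static int toy_space_lookup(const struct toy_space *s, int name, int needed)
{
    assert(s != NULL);
    if (name < 1 || name > TOY_SPACE_SLOTS)
        return 0;
    const struct toy_entry *e = &s->slot[name - 1];
    if (e->port == 0 || (e->rights & needed) != needed)
        return 0;
    return e->port;
}
```

As with file descriptors, the same port can appear under different local names in different tasks, and a name is meaningless outside the namespace that issued it.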
  13. (gdb) showtaskrights 0xffffff80172aa530
    task                vm_map              ipc_space           #acts  pid    process             io_policy  wq_state  command
    0xffffff80172aa530  0xffffff80194c8d98  0xffffff8018d6e3b0  1      11131  0xffffff801a388740  cat
    ipc_space           is_table            table_next          flags  ports  splaysize           splaybase
    0xffffff8018d6e3b0  0xffffff8019343a00  0xffffff801341c804  A      21     0x0000000000000000  0xffffff8018d6e3f8
    object              name        rite  urefs  destname            destination
    0xffffff802639bc60  0x00000107  R     0      0x0000000000000107  cat(11131)
    0xffffff802ec19338  0x00000207  R     0      0x0000000000000207  cat(11131)
    0xffffff8027200ba8  0x0000030b  R     0      0x000000000000030b  cat(11131)
    0xffffff802ee7fb50  0x00000407  R     0      0x0000000000000407  cat(11131)
    0xffffff80154786d0  0x00000507  S     32771  0x0000000000003703  launchd(334)
    0xffffff80187c9c60  0x0000060b  R     0      0x000000000000060b  cat(11131)
    0xffffff802752fac8  0x00000707  S     1      0xffffff80277b5cc0  kobject(THREAD)
    0xffffff8019dcb730  0x00000807  R     0      0x0000000000000807  cat(11131)
    0xffffff80194dd810  0x00000907  S     3      0xffffff80172aa530  kobject(TASK)
    0xffffff802ec0b0b8  0x00000a07  SR    1      0x0000000000000a07  cat(11131)
    0xffffff8013425308  0x00000b03  S     1      0xffffff800082c1a0  kobject(CLOCK)
    0xffffff80261428a0  0x00000c03  S     1      0xffffff80182fb6e0  kobject(SEMAPH
    0xffffff8017a26cc0  0x00000d03  SR    1      0x0000000000000d03  cat(11131)
    0xffffff80177783f0  0x00000e03  R     0      0x0000000000000e03  cat(11131)
    0xffffff8026760c08  0x00000f03  R     0      0x0000000000000f03  cat(11131)
    0xffffff8016adb8f8  0x00001003  S     1      0x0000000000008603  launchd(334)
    0xffffff8015b183f0  0x00001103  S     1      0xffffff8014896298  kobject()
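The `name` column in this listing is not arbitrary: the values run 0x107, 0x207, 0x30b and so on, which suggests a table index in the upper bits and a small generation counter in the low byte. The sketch below assumes that split, as XNU's `MACH_PORT_INDEX` macro did at the time as far as I recall. The `toy_*` helper names are hypothetical, and the layout should be treated as an implementation detail rather than an interface.

```c
#include <stdint.h>

/* Assumed layout of a port name: table index in the upper 24 bits,
 * generation counter in the low 8 bits. */
static uint32_t toy_port_index(uint32_t name)
{
    return name >> 8;
}

static uint32_t toy_port_gen(uint32_t name)
{
    return name & 0xffu;
}
```

Under that assumption the last row, name 0x00001103, is entry 0x11 (17) of the task's table, which matches the seventeen rights listed.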
  14. % cat portsnoop.d
    #define ipc_space_kernel (void *)0xffffff801341bf40 /* looked up in gdb */
    #define space_comm(space) ((proc_t)space->is_task->bsd_info)->p_comm
    #define IO_BITS_ACTIVE 0x80000000

    ipc_kmsg_send:entry
    {
        this->h = ((struct ipc_kmsg *)arg0)->ikm_header;
        this->localp = this->h->msgh_local_port;
        this->remotep = this->h->msgh_remote_port;
        this->rvalid = this->remotep->ip_object.io_bits & IO_BITS_ACTIVE;
        this->rspace = this->remotep->data.receiver;
    }

    ipc_kmsg_send:entry
    /this->rvalid != 0 && this->rspace != ipc_space_kernel && this->rspace->is_task == 0/
    {
        printf("%s -> NULL TASK (space %p)", execname, this->rspace);
        stack();
    }
  15. ipc_kmsg_send:entry
    /this->rspace == ipc_space_kernel
    #ifdef FILTER_COMM
        && execname == FILTER_COMM
    #endif
    /
    {
        printf("%p -> %p (%s -> kernel)", this->localp, this->remotep, execname);
    }

    ipc_kmsg_send:entry
    /this->rvalid != 0 && this->rspace != ipc_space_kernel && this->rspace->is_task != 0
    #ifdef FILTER_COMM
        && (space_comm(this->rspace) == FILTER_COMM || execname == FILTER_COMM)
    #endif
    /
    {
        printf("%p -> %p (%s -> %s)", this->localp, this->remotep, execname, space_comm(this->rspace));
        @[execname, this->localp, this->remotep, ustack()] = count();
    }
  16. % dtrace -C -DFILTER_COMM='"cat"' -s ./portsnoop.d
    CPU     ID                FUNCTION:NAME
      0 146323  ipc_kmsg_send:entry ffffff80271e0b50 -> ffffff801a059840 (cat -> kernel)
      0 146323  ipc_kmsg_send:entry ffffff80275ee648 -> ffffff8013425e88 (cat -> kernel)
      0 146323  ipc_kmsg_send:entry ffffff80275259e0 -> ffffff80272e8228 (cat -> kernel)
      0 146323  ipc_kmsg_send:entry ffffff80275259e0 -> ffffff801a059840 (cat -> kernel)
    ....
      0 146323  ipc_kmsg_send:entry ffffff80275259e0 -> ffffff801a059840 (cat -> kernel)
      0 146323  ipc_kmsg_send:entry ffffff80275259e0 -> ffffff801a059840 (cat -> kernel)
      0 146323  ipc_kmsg_send:entry ffffff80275259e0 -> ffffff801a059840 (cat -> kernel)
      0 146323  ipc_kmsg_send:entry ffffff80275259e0 -> ffffff801a059840 (cat -> kernel)
      0 146323  ipc_kmsg_send:entry ffffff80275259e0 -> ffffff80154786d0 (cat -> launchd)
      0 146323  ipc_kmsg_send:entry 0 -> ffffff80275259e0 (launchd -> cat)
      0 146323  ipc_kmsg_send:entry ffffff80275259e0 -> ffffff801a059840 (cat -> kernel)
      0 146323  ipc_kmsg_send:entry ffffff80275259e0 -> ffffff801a059840 (cat -> kernel)
      0 146323  ipc_kmsg_send:entry ffffff80275259e0 -> ffffff801a059840 (cat -> kernel)
      0 146323  ipc_kmsg_send:entry ffffff80275259e0 -> ffffff801a059840 (cat -> kernel)
    ....
      0 146323  ipc_kmsg_send:entry ffffff80275259e0 -> ffffff801a059840 (cat -> kernel)
      1 146323  ipc_kmsg_send:entry taskgated -> NULL TASK (space ffffff801341beb0)
                  mach_kernel`mach_msg_overwrite_trap+0xb8
                  mach_kernel`thread_set_child+0x150
                  mach_kernel`hndl_mach_scall64+0x13
  17. snooping message dispatcher
    ipc_kmsg_send:entry
    /this->rspace == ipc_space_kernel
    #ifdef FILTER_COMM
        && execname == FILTER_COMM
    #endif
    /
    {
        printf("%p -> %p (%s -> kernel)", this->localp, this->remotep, execname);
        self->mon = 1;
    }

    fbt:::
    /self->mon == 1/
    {}

    ipc_kmsg_send:return
    /self->mon == 1/
    {
        self->mon = 0;
    }
  18. MIG
    const struct mig_subsystem *mig_e[] = {
        (const struct mig_subsystem *)&mach_vm_subsystem,
        (const struct mig_subsystem *)&mach_port_subsystem,
        (const struct mig_subsystem *)&mach_host_subsystem,
        (const struct mig_subsystem *)&host_priv_subsystem,
        (const struct mig_subsystem *)&host_security_subsystem,
        (const struct mig_subsystem *)&clock_subsystem,
        (const struct mig_subsystem *)&clock_priv_subsystem,
        (const struct mig_subsystem *)&processor_subsystem,
        (const struct mig_subsystem *)&processor_set_subsystem,
        (const struct mig_subsystem *)&is_iokit_subsystem,
        (const struct mig_subsystem *)&memory_object_name_subsystem,
        (const struct mig_subsystem *)&lock_set_subsystem,
        (const struct mig_subsystem *)&ledger_subsystem,
        (const struct mig_subsystem *)&task_subsystem,
        (const struct mig_subsystem *)&thread_act_subsystem,
    #if VM32_SUPPORT
        (const struct mig_subsystem *)&vm32_map_subsystem,
    #endif
        (const struct mig_subsystem *)&UNDReply_subsystem,
        (const struct mig_subsystem *)&default_pager_object_subsystem,
    #if XK_PROXY
        (const struct mig_subsystem *)&do_uproxy_xk_uproxy_subsystem,
    #endif /* XK_PROXY */
    #if MACH_MACHINE_ROUTINES
        (const struct mig_subsystem *)&MACHINE_SUBSYSTEM,
    #endif /* MACH_MACHINE_ROUTINES */
    #if MCMSG && iPSC860
        (const struct mig_subsystem *)&mcmsg_info_subsystem,
    #endif /* MCMSG && iPSC860 */
    #if CONFIG_MACF
        (const struct mig_subsystem *)&security_subsystem,
    #endif
    };
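One way to picture what such a table is for: every MIG message carries a numeric id, each subsystem covers a contiguous range of ids, and the offset into that range selects the server routine. The sketch below assumes exactly that model and nothing more. The `toy_*` types, the three routines and the id ranges 100 and 200 are hypothetical, and XNU's real `struct mig_subsystem` and lookup path are more involved.

```c
#include <stddef.h>

/* Hypothetical server routine: takes one argument, returns a result. */
typedef int (*toy_routine_t)(int arg);

/* Simplified stand-in for a MIG subsystem descriptor. */
struct toy_subsystem {
    int                  start;    /* first message id served */
    int                  end;      /* one past the last message id */
    const toy_routine_t *routine;  /* routine[id - start] handles `id` */
};

static int toy_double(int x) { return 2 * x; }
static int toy_negate(int x) { return -x; }
static int toy_inc(int x)    { return x + 1; }

static const toy_routine_t toy_a_routines[] = { toy_double, toy_negate };
static const toy_routine_t toy_b_routines[] = { toy_inc };

/* Toy counterpart of the subsystem array: two subsystems, made-up ids. */
static const struct toy_subsystem toy_table[] = {
    { 100, 102, toy_a_routines },
    { 200, 201, toy_b_routines },
};

/* Find the routine for a message id, or NULL if no subsystem claims it. */
static toy_routine_t toy_find_routine(int msg_id)
{
    for (size_t i = 0; i < sizeof toy_table / sizeof toy_table[0]; i++) {
        const struct toy_subsystem *s = &toy_table[i];
        if (msg_id >= s->start && msg_id < s->end)
            return s->routine[msg_id - s->start];
    }
    return NULL;
}
```

A caller would check the result before invoking it, for example `toy_routine_t r = toy_find_routine(101); if (r) r(5);`.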
  19. % env SRCROOT=$PWD/osfmk OBJROOT=/tmp/mig ARCHS=i386 bash -x ./libsyscall/xcodescripts/mach_install_mig.sh
    % ls /tmp/mig
    clockServer.c        host_privServer.c      mach_hostServer.c  processor_setServer.c
    clockUser.c          host_privUser.c        mach_hostUser.c    processor_setUser.c
    clock_privServer.c   host_securityServer.c  mach_portServer.c  taskServer.c
    clock_privUser.c     host_securityUser.c    mach_portUser.c    taskUser.c
    clock_replyServer.c  ledgerServer.c         mach_vmServer.c    thread_actServer.c
    clock_replyUser.c    ledgerUser.c           mach_vmUser.c      thread_actUser.c
    excServer.c          lock_setServer.c       processorServer.c  vm_mapServer.c
    excUser.c            lock_setUser.c         processorUser.c    vm_mapUser.c