Some Topics to Cover
● Request lifecycle
● Types of PHP extensions
● Internal data structures
● Navigating PHP’s source
● Executing a PHP file
● Opcodes, caching, and optimization
● Examining how PHP interacts with C
● Debugging crashes
Types of Extensions
● Extensions (aka “modules”)
  ○ Allow new concepts to be added to PHP
  ○ Integrate with C and system libraries
  ○ e.g. APCu, MongoDB, OpenSSL
● Zend Extensions
  ○ More hooks for changing PHP’s behavior
  ○ Commonly used for debuggers and profilers
  ○ May also register normal extensions for user APIs
  ○ e.g. OPCache, Xdebug, phpdbg, Blackfire
Global Scope
● Globals
  ○ Use persistent allocation, initialize to zero
  ○ Requests access globals via TSRM macros
  ○ Zend hash globals can be used as a cache
● Process model
  ○ GINIT is called once, before MINIT
  ○ GSHUTDOWN is called once, during MSHUTDOWN
● Thread model
  ○ GINIT and GSHUTDOWN are additionally called each time a thread spawns or dies
  ○ TSRM macros take care of thread-local storage
PHP Process Manager
● Uses ReactPHP to manage a pool of PHP processes (CLI)
● Bootstraps application once per worker process
● Each worker handles a series of HTTP requests
● Leverages request/response design in frameworks
● Operates entirely within one CLI “request”
● Issues with memory leaks, resetting state
Zvals
● 16-byte struct consisting of three union fields
● value (8 bytes)
  ○ Integer and double values are stored inline
  ○ Pointers are used for other types (e.g. string, array, object)
  ○ Not used for null and boolean values (denoted by type_info)
● u1 contains type_info (4 bytes)
  ○ Type byte and various bit flags
● u2 is a multi-purpose union (4 bytes)
  ○ Used for hash tables, AST line numbers, foreach iteration, etc.
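The layout described above can be sketched in plain C. This is a simplified stand-in, not the actual Zend definition: the `sketch_` names, the reduced set of union members, and the type constants are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative type tags; hypothetical names, not the real IS_* macros. */
enum {
    SKETCH_NULL = 1, SKETCH_FALSE, SKETCH_TRUE,
    SKETCH_LONG, SKETCH_DOUBLE, SKETCH_STRING
};

/* 8-byte value: scalars inline, everything else behind a pointer. */
typedef union {
    int64_t lval;  /* integers are stored inline */
    double  dval;  /* doubles are stored inline */
    void   *ptr;   /* strings, arrays, objects are reached via a pointer */
} sketch_value;

typedef struct {
    sketch_value value;                              /* 8 bytes */
    union { uint32_t type_info; } u1;                /* 4 bytes: type byte + flags */
    union { uint32_t next; uint32_t lineno; } u2;    /* 4 bytes: multi-purpose */
} sketch_zval;

/* The three fields pack into 16 bytes. */
_Static_assert(sizeof(sketch_zval) == 16, "sketch_zval should be 16 bytes");

/* Integers use the value field. */
void sketch_set_long(sketch_zval *z, int64_t v)
{
    z->value.lval = v;
    z->u1.type_info = SKETCH_LONG;
}

/* Booleans need no value at all: the type tag alone encodes them. */
void sketch_set_bool(sketch_zval *z, int b)
{
    z->u1.type_info = b ? SKETCH_TRUE : SKETCH_FALSE;
}

/* The low byte of type_info holds the type; upper bits are left for flags. */
uint8_t sketch_type(const sketch_zval *z)
{
    return (uint8_t)(z->u1.type_info & 0xff);
}
```

Calling `sketch_set_long(&z, 42)` and then `sketch_type(&z)` yields `SKETCH_LONG`, while `sketch_set_bool` never touches `value`, which mirrors the "not used for null and boolean values" point.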
Zval Improvements in PHP 7
● Zvals are no longer individually heap-allocated
  ○ Can now be directly embedded (e.g. hash buckets)
● Zvals are no longer refcounted
  ○ Refcounts stored on values themselves (e.g. zend_string)
  ○ Values can be shared independently of the zval struct
● Much less indirection and pointer traversal
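A minimal sketch of "refcounts stored on values themselves" follows. The `rc_string` and `rc_zval` types and their helpers are hypothetical simplifications, loosely modeled on the idea behind zend_string rather than copied from the Zend source.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical refcounted string: refcount, length, and data in one allocation. */
typedef struct {
    uint32_t refcount;
    size_t   len;
    char     val[];  /* flexible array member: characters stored inline */
} rc_string;

/* The container only points at the value; it carries no refcount itself. */
typedef struct {
    rc_string *str;
} rc_zval;

/* Allocate a new string with a refcount of 1. Returns NULL on allocation failure. */
rc_string *rc_string_new(const char *s)
{
    size_t len = strlen(s);
    rc_string *str = malloc(sizeof(*str) + len + 1);
    if (str == NULL) {
        return NULL;
    }
    str->refcount = 1;
    str->len = len;
    memcpy(str->val, s, len + 1);  /* copy including the trailing NUL */
    return str;
}

/* Copying a container shares the value and bumps the value's refcount. */
void rc_zval_copy(rc_zval *dst, const rc_zval *src)
{
    dst->str = src->str;
    dst->str->refcount++;
}

/* Release one holder; the string is freed only when the last one lets go.
 * Returns 1 if the string was freed, 0 otherwise. */
int rc_zval_release(rc_zval *z)
{
    int freed = 0;
    if (--z->str->refcount == 0) {
        free(z->str);
        freed = 1;
    }
    z->str = NULL;
    return freed;
}
```

After `rc_zval_copy(&b, &a)`, both containers point at the same allocation with a refcount of 2, so no string data is duplicated and either container can be released first.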
Improvements to Other Types
● New string representation
  ○ zend_string struct replaces char*
  ○ Encapsulates refcount, string length, and data
● Hash table redesigned
  ○ Buckets allocated in sequence (less pointer traversal)
  ○ Optimizations for “packed arrays”
● Objects are more lightweight
  ○ Declared properties embedded in zend_object
Examining Opcodes (vld)
function name:  test
number of ops:  4
compiled vars:  none
line  #*  E I O  op      fetch  ext  return  operands
-----------------------------------------------------
   5  0   E >    ECHO                        'Hello%21%0A'
   6  1     >    RETURN
   8  2*         ECHO                        'Not+executed.%0A'
   9  3*    >    RETURN                      null
Opcode Caching
● Various caching strategies
● OpArrays (i.e. opcode sequences) can be optimized
● OpArrays still interpreted at runtime
● JIT would allow machine code caching
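To make "optimized, but still interpreted" concrete, here is a toy interpreter over a four-op sequence shaped like the vld listing for test(). Everything in it is a hypothetical simplification: the `sketch_` names, the two-opcode instruction set, and the optimization pass, which only handles straight-line code without jumps and is not how OPCache works internally.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { OP_ECHO, OP_RETURN } sketch_opcode;

typedef struct {
    sketch_opcode opcode;
    const char   *operand;  /* string literal for ECHO; unused for RETURN */
} sketch_op;

/* Sample op array: ECHO, RETURN, then two ops that can never run. */
const sketch_op sketch_test_ops[] = {
    { OP_ECHO,   "Hello!\n" },
    { OP_RETURN, NULL },
    { OP_ECHO,   "Not executed.\n" },
    { OP_RETURN, NULL },
};
const size_t sketch_test_op_count =
    sizeof(sketch_test_ops) / sizeof(sketch_test_ops[0]);

/* Interpret ops in order, appending ECHO output to `out`, and stop at the
 * first RETURN. Returns the number of ops executed. */
size_t sketch_execute(const sketch_op *ops, size_t count,
                      char *out, size_t out_size)
{
    size_t executed = 0;
    if (out_size == 0) {
        return 0;
    }
    out[0] = '\0';
    for (size_t i = 0; i < count; i++) {
        executed++;
        if (ops[i].opcode == OP_RETURN) {
            break;
        }
        /* OP_ECHO: append the operand, truncating if the buffer is full. */
        strncat(out, ops[i].operand, out_size - strlen(out) - 1);
    }
    return executed;
}

/* Toy optimization: without jumps, every op after the first RETURN is
 * unreachable. Returns the op count that is worth keeping (and caching). */
size_t sketch_strip_unreachable(const sketch_op *ops, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (ops[i].opcode == OP_RETURN) {
            return i + 1;
        }
    }
    return count;
}
```

Running `sketch_execute` on the sample executes two ops and produces only the first string; `sketch_strip_unreachable` reports that two ops are enough, so a cache could store the shorter sequence and the interpreter loop would still run it at request time.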
System Calls
● open and close file descriptors
● read and write files, sockets, devices
● fork, exec, or wait on another process
● exit the current process
● Send signals to other processes (kill)
● Map files or devices to memory (mmap, munmap)
● Allocate process memory (brk, sbrk)
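A few of these calls can be exercised directly from C on a POSIX system. The sketch below is illustrative only: the helper names `run_child_and_wait` and `pipe_round_trip` are made up for this example, and a pipe stands in for the files, sockets, and devices mentioned above.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* fork a child that exits immediately with `code`; the parent waits for it.
 * Returns the child's exit status, or -1 on error. */
int run_child_and_wait(int code)
{
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        _exit(code);  /* child: exit the current process */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Open a pipe, write `msg` into one descriptor, read it back from the other,
 * and close both. `buf_size` must be at least 1. Returns the number of bytes
 * read, or -1 on error. */
ssize_t pipe_round_trip(const char *msg, char *buf, size_t buf_size)
{
    int fds[2];
    if (pipe(fds) < 0) {
        return -1;
    }
    ssize_t written = write(fds[1], msg, strlen(msg));
    close(fds[1]);  /* done writing */
    if (written < 0) {
        close(fds[0]);
        return -1;
    }
    ssize_t n = read(fds[0], buf, buf_size - 1);
    close(fds[0]);
    if (n >= 0) {
        buf[n] = '\0';  /* make the result a C string */
    }
    return n;
}
```

Tracing a program that calls these helpers with a tool such as strace should show the underlying system calls, although the exact names vary by platform (on Linux, for example, fork is typically implemented via clone).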
Debugging Crashes (gdb)
● The most common crashes are segfaults
  ○ C makes it trivially easy to access memory incorrectly
● Ideally, crashes produce core dumps, which can be inspected
  ○ https://bugs.php.net/bugs-generating-backtrace.php
● PHP source includes a .gdbinit file with helpful macros
Capturing a Core Dump
$ gdb -q --args php class-tostring-recursion.php
Reading symbols from php...done.
(gdb) run
Starting program: /usr/bin/php class-tostring-recursion.php
Program received signal SIGSEGV, Segmentation fault.
0x0000555555c9ab9e in zend_call_function (fci=<error reading variable: Cannot access memory at address 0x7fffff7fef78>, fci_cache=<error reading variable: Cannot access memory at address 0x7fffff7fef70>) at /tmp/build_php-7.2.6.sx8/php-7.2.6/Zend/zend_execute_API.c:659
(gdb)
Traversing the Backtrace
(gdb) bt -20
#45993 0x0000555555ce6e8a in zend_call_method (object=0x7ffff3423110,
#45994 0x0000555555d13de3 in zend_std_cast_object_tostring (readobj=0x
#45995 0x0000555555ca5c8f in _zval_get_string_func (op=0x7ffff3423110)
#45996 0x0000555555cb4b68 in zend_make_printable_zval (expr=0x7ffff342
#45997 0x0000555555cae0a8 in concat_function (result=0x7ffff3423120, o
#45998 0x0000555555d4069a in ZEND_CONCAT_SPEC_CONST_TMPVAR_HANDLER ()
#45999 0x0000555555db6336 in execute_ex (ex=0x7ffff34230c0) at /tmp/bu
#46000 0x0000555555c9b9e1 in zend_call_function (fci=0x7fffffffb0b0, f
#46001 0x0000555555ce6e8a in zend_call_method (object=0x7ffff3423090,
#46002 0x0000555555d13de3 in zend_std_cast_object_tostring (readobj=0x
#46003 0x0000555555ca5c8f in _zval_get_string_func (op=0x7ffff3423090)
#46004 0x0000555555cb4b68 in zend_make_printable_zval (expr=0x7ffff342
#46005 0x0000555555cae0a8 in concat_function (result=0x7ffff34230b0, o
#46006 0x0000555555d4069a in ZEND_CONCAT_SPEC_CONST_TMPVAR_HANDLER ()
#46007 0x0000555555db6336 in execute_ex (ex=0x7ffff3423030) at /tmp/bu
#46008 0x0000555555dbaa7d in zend_execute (op_array=0x7ffff3489300, re
#46009 0x0000555555cb8f1e in zend_execute_scripts (type=8, retval=0x0,
#46010 0x0000555555befaa9 in php_execute_script (primary_file=0x7fffff
#46011 0x0000555555dbd88f in do_cli (argc=2, argv=0x5555568f9380) at /
#46012 0x0000555555dbed26 in main (argc=2, argv=0x5555568f9380) at /tm
Resources and Further Reading
● References about Maintaining and Extending PHP
  ○ https://wiki.php.net/internals/references
● PHP Internals (Thomas Punt)
  ○ https://phpinternals.net/
● Derick Rethans’ Blog
  ○ https://derickrethans.nl/
● Nikita Popov’s Blog
  ○ https://nikic.github.io/
● PHP Internals Book (Nikita Popov and Julien Pauli)
  ○ http://www.phpinternalsbook.com/