
PHP Internals for the Inquisitive Developer


Presented February 8, 2019 at Sunshine PHP.

Presented December 4, 2018 at Lisbon PHP meetup.

Presented October 11, 2018 at Symfony Loves PHP 2018 - San Francisco.

Jeremy Mikola

February 08, 2019


Transcript

  1. Some Topics to Cover
    • Request lifecycle
    • Types of PHP extensions
    • Internal data structures
    • Navigating PHP’s source
    • Executing a PHP file
    • Opcodes, caching, and optimization
    • Examining how PHP interacts with C
    • Debugging crashes
  2. Server APIs
    • Apache mod_php
    • Command Line Interface (CLI)
    • Common Gateway Interface (CGI)
      ◦ FastCGI allows for persistent processes
      ◦ Apache mod_fcgid
    • FastCGI Process Manager (FPM)
      ◦ Apache mod_proxy_fcgi
  3. Types of Extensions
    • Extensions (aka “modules”)
      ◦ Allow new concepts to be added to PHP
      ◦ Integration with C, system libraries
      ◦ e.g. APCu, MongoDB, OpenSSL
    • Zend Extensions
      ◦ More hooks for changing PHP’s behavior
      ◦ Commonly used for debuggers and profilers
      ◦ May also register normal extensions for user APIs
      ◦ e.g. OPCache, Xdebug, phpdbg, Blackfire
  4. Module Scope
    • Module initialization
      ◦ Allocate persistent read-only globals
      ◦ Register INI entries, classes, constants, etc.
      ◦ Initialize third-party libraries
    • Module shutdown
      ◦ Free persistent allocations
      ◦ Unregister INI entries
  5. Request Scope
    • Request initialization
      ◦ Allocate request-bound memory
      ◦ Reset globals as needed
      ◦ Avoid unintentionally altering global state
    • Request shutdown
      ◦ Free request-bound allocations
      ◦ Zend Memory Manager helps catch leaks
  6. Global Scope
    • Globals
      ◦ Use persistent allocation, initialize to zero
      ◦ Requests access globals via TSRM macros
      ◦ Zend hash globals can be used as a cache
    • Process model
      ◦ GINIT is called once, before MINIT
      ◦ GSHUTDOWN is called once, during MSHUTDOWN
    • Thread model
      ◦ Additionally called each time a thread spawns or dies
      ◦ TSRM macros take care of thread local storage
  7. PHP Process Manager
    • Uses ReactPHP to manage a pool of PHP processes (CLI)
    • Bootstraps application once per worker process
    • Each worker handles a series of HTTP requests
    • Leverages request/response design in frameworks
    • Operates entirely within one CLI “request”
    • Issues with memory leaks, resetting state
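The worker model on this slide can be approximated in plain PHP. This is only a minimal sketch and does not use the PHP-PM or ReactPHP APIs: `bootstrapApp()`, `handleRequest()`, and the list of paths are hypothetical stand-ins for a framework kernel and incoming HTTP requests. It shows the application being bootstrapped once and state surviving from one request to the next, which is where the leak and state-reset issues come from.

```php
<?php
// Hypothetical stand-in for an expensive framework bootstrap.
function bootstrapApp(): array
{
    static $bootCount = 0;
    $bootCount++;

    return ['bootCount' => $bootCount, 'visitors' => []];
}

// Hypothetical handler: anything stored in $app survives across requests,
// because the whole loop runs inside a single CLI "request".
function handleRequest(array &$app, string $path): string
{
    $app['visitors'][] = $path; // keeps growing unless it is reset

    return sprintf('%s handled; %d paths remembered', $path, count($app['visitors']));
}

$app = bootstrapApp(); // once per worker, not once per HTTP request

$responses = [];
foreach (['/a', '/b', '/c'] as $path) {
    $responses[] = handleRequest($app, $path);
}

echo implode("\n", $responses), "\n";
```

Under mod_php or FPM the `visitors` list would start empty on every request; in a long-running worker it has to be cleared explicitly.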
  8. It looks like you're giving a presentation on PHP internals. Would you like help?
    • Talk about zvals
  9. Zvals
    • 16-byte struct consisting of three union fields
    • value (8 bytes)
      ◦ Integer and double values are stored inline
      ◦ Pointers are used for other types (e.g. string, array, object)
      ◦ Not used for null and boolean values (denoted by type_info)
    • u1 contains type_info (4 bytes)
      ◦ Type byte and various bit flags
    • u2 is a multi-purpose union (4 bytes)
      ◦ Used for hash tables, AST line numbers, foreach iteration, etc.
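To make the 16-byte layout concrete, the sketch below models it in userland PHP with `pack()`. It only illustrates the field sizes and order (value, then type_info in u1, then u2) and is not how the engine builds zvals. `modelZval()` and the `MODEL_*` constants are hypothetical names; the type numbers assume PHP 7's `IS_TRUE` (3) and `IS_LONG` (4), and the code assumes a 64-bit build.

```php
<?php
// Type bytes, assumed to match PHP 7's Zend/zend_types.h.
const MODEL_IS_TRUE = 3;
const MODEL_IS_LONG = 4;

// Hypothetical helper: pack an 8-byte value, a 4-byte type_info, and a 4-byte u2.
// 'P' = unsigned 64-bit little endian, 'V' = unsigned 32-bit little endian.
function modelZval(int $value, int $typeInfo, int $u2 = 0): string
{
    return pack('PVV', $value, $typeInfo, $u2);
}

$long = modelZval(42, MODEL_IS_LONG); // integer stored inline in value
$true = modelZval(0, MODEL_IS_TRUE);  // booleans only need type_info

$fields = unpack('Pvalue/Vtype_info/Vu2', $long);

printf("size=%d value=%d type=%d\n", strlen($long), $fields['value'], $fields['type_info']);
```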
  10. Zval Improvements in PHP 7
    • Zvals are no longer individually heap-allocated
      ◦ Can now be directly embedded (e.g. hash buckets)
    • Zvals are no longer refcounted
      ◦ Refcounts stored on values themselves (e.g. zend_string)
      ◦ Values can be shared independently of the zval struct
    • Much less indirection and pointer traversal
  11. Improvements to Other Types
    • New string representation
      ◦ zend_string struct replaces char*
      ◦ Encapsulates refcount, string length, and data
    • Hash table redesigned
      ◦ Buckets allocated in sequence (less pointer traversal)
      ◦ Optimizations for “packed arrays”
    • Objects are more lightweight
      ◦ Declared properties embedded in zend_object
  12. Types and Attributes (✘ marks an attribute the type has)

                       Refcounted  Collectable  Copyable  Immutable
      Simple Types
      String               ✘                       ✘
      Array                ✘            ✘          ✘
      Object               ✘            ✘
      Resource             ✘
      Reference            ✘
      Interned String
      Immutable Array                                         ✘
  13. An Admittedly Contrived PHP Script

      <?php

      function test()
      {
          echo "Hello!\n";
          return true;

          echo "Not executed.\n";
      }
  14. Parsing Tokens (token_get_all)

      T_OPEN_TAG T_WHITESPACE
      T_FUNCTION T_WHITESPACE T_STRING ( ) T_WHITESPACE { T_WHITESPACE
      T_ECHO T_WHITESPACE T_CONSTANT_ENCAPSED_STRING ; T_WHITESPACE
      T_RETURN T_WHITESPACE T_STRING ; T_WHITESPACE
      T_ECHO T_WHITESPACE T_CONSTANT_ENCAPSED_STRING ; T_WHITESPACE
      } T_WHITESPACE
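A token listing like the one on this slide can be produced with `token_get_all()` and `token_name()` from PHP's tokenizer extension. The sketch below feeds it the contrived script as a string. The exact whitespace tokens depend on how the source is laid out, so the output may differ slightly from the slide.

```php
<?php
// The contrived script, held as a string (nowdoc, so nothing is interpolated).
$source = <<<'PHP'
<?php

function test()
{
    echo "Hello!\n";
    return true;

    echo "Not executed.\n";
}
PHP;

$names = [];
foreach (token_get_all($source) as $token) {
    // Most tokens are arrays of [id, text, line]; single-character tokens
    // such as "(", "{", and ";" are returned as bare strings.
    $names[] = is_array($token) ? token_name($token[0]) : $token;
}

echo implode(' ', $names), "\n";
```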
  15. Examining Opcodes (vld)

      function name: test
      number of ops: 4
      compiled vars: none
      line  #*  E I O  op      fetch  ext  return  operands
      -----------------------------------------------------
         5   0  E >    ECHO                        'Hello%21%0A'
         6   1    >    RETURN                      <true>
         8   2*        ECHO                        'Not+executed.%0A'
         9   3*   >    RETURN                      null
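In the vld listing, ops 2 and 3 carry an asterisk, which appears to be how vld flags operations it considers unreachable. A minimal sketch, assuming the same `test()` function as the contrived script, confirms the runtime behavior by capturing the output with output buffering:

```php
<?php
function test()
{
    echo "Hello!\n";
    return true;

    echo "Not executed.\n"; // unreachable: it follows the return
}

// Capture whatever test() echoes so it can be inspected.
ob_start();
$result = test();
$output = ob_get_clean();

var_dump($result, $output);
```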
  16. Opcode Caching
    • Various caching strategies
    • OpArrays (i.e. opcode sequences) can be optimized
    • OpArrays still interpreted at runtime
    • JIT would allow machine code caching
  17. OpCode Optimizations

      Pass  Bit      Optimization
      1     1 << 0   Casts, operators, internal functions with constant and literal arguments
      2     1 << 1   Type coercion in expressions, conditional elimination (e.g. if statements)
      3     1 << 2   Optimize self-assignment, post-increment, and jumps
      5     1 << 4   Block optimization of control flow graph (CFG)
      9     1 << 8   Optimize usage of temporary variables (register allocation)
      10    1 << 9   Remove NOPs
      -     1 << 14  Collect constants for future replacement

      https://stackoverflow.com/a/21291587/162228
      https://phpinternals.net/categories/opcache
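In the table, each numbered pass N corresponds to bit `1 << (N - 1)`, so a set of passes can be combined into a single bitmask. The sketch below builds one. `maskForPasses()` is a hypothetical helper, not part of OPcache.

```php
<?php
// Hypothetical helper: pass N in the table corresponds to bit 1 << (N - 1).
function maskForPasses(int ...$passes): int
{
    $mask = 0;
    foreach ($passes as $pass) {
        $mask |= 1 << ($pass - 1);
    }

    return $mask;
}

// Select only passes 1, 2, 3, and 10 from the table.
$mask = maskForPasses(1, 2, 3, 10);

printf("0x%04X\n", $mask);
```

A mask like this is the kind of value the `opcache.optimization_level` INI setting takes. Which passes are safe to disable depends on the PHP version, so treat the selection here as illustrative.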
  18. System Calls
    • Open and close file descriptors
    • Read and write files, sockets, devices
    • Fork, exec, or wait on another process
    • Exit the current process
    • Send signals to other processes (kill)
    • Map files or devices to memory (mmap, munmap)
    • Allocate process memory (brk, sbrk)
  19. Tracing System Calls (strace)

      <?php echo "Hello!\n";

      $ strace -e write php example.php > /dev/null
      write(1, "Hello!\n", 7) = 7
      +++ exited with 0 +++
  20. Tracing System Calls (strace)

      $ strace -e openat php example.php 2>&1 | grep ini
      openat(AT_FDCWD, "/usr/bin/php-cli.ini", O_RDONLY) = -1
      openat(AT_FDCWD, "/etc/php/7.2/cli/php-cli.ini", O_RDONLY) = -1
      openat(AT_FDCWD, "/usr/bin/php.ini", O_RDONLY) = -1
      openat(AT_FDCWD, "/etc/php/7.2/cli/php.ini", O_RDONLY) = 3
      openat(AT_FDCWD, "/etc/php/7.2/cli/conf.d/10-opcache.ini", …
      openat(AT_FDCWD, "/etc/php/7.2/cli/conf.d/10-pdo.ini", …
  21. Tracing Library Calls (ltrace)

      $ ltrace -l mongodb.so php example.php
      mongodb.so->mongoc_log_trace_disable(0, 1, 0x55a78b57a410, …
      mongodb.so->mongoc_log_set_handler(0, 0, 0x55a78b57a410, …
      mongodb.so->mongoc_init(0, 0, 0x55a78b552bb8, 0x7f1941ec6320 …
      mongodb.so->_mongoc_openssl_init(0x7ffcf6e60460, …
      mongodb.so->bson_malloc0(40, 0, 0, 0x7f1940ecf8e0) …
      mongodb.so->bson_malloc0(40, 0x55a78b42a018, 0x55a78b554bc0, …
      mongodb.so->_mongoc_counters_init(0x55a78b5a4710, …
  22. Debugging Crashes (gdb)
    • The most common crashes are segfaults
      ◦ C makes it trivially easy to access memory incorrectly
    • Ideally, crashes produce core dumps, which can be inspected
      ◦ https://bugs.php.net/bugs-generating-backtrace.php
    • PHP source includes a .gdbinit file with helpful macros
  23. This’ll Do Just Fine

      <?php

      class Crash
      {
          public function __tostring()
          {
              return "".$this;
          }
      }

      "".(new Crash());
  24. Capturing a Core Dump

      $ php class-tostring-recursion.php
      Segmentation fault (core dumped)
      $ ls
      class-tostring-recursion.php  core
      $ gdb `which php` core
  25. Capturing a Core Dump

      $ gdb -q --args php class-tostring-recursion.php
      Reading symbols from php...done.
      (gdb) run
      Starting program: /usr/bin/php class-tostring-recursion.php
      Program received signal SIGSEGV, Segmentation fault.
      0x0000555555c9ab9e in zend_call_function (fci=<error reading variable: Cannot access memory at address 0x7fffff7fef78>, fci_cache=<error reading variable: Cannot access memory at address 0x7fffff7fef70>) at /tmp/build_php-7.2.6.sx8/php-7.2.6/Zend/zend_execute_API.c:659
      (gdb)
  26. Traversing the Backtrace

      (gdb) bt -20
      #45993 0x0000555555ce6e8a in zend_call_method (object=0x7ffff3423110,
      #45994 0x0000555555d13de3 in zend_std_cast_object_tostring (readobj=0x
      #45995 0x0000555555ca5c8f in _zval_get_string_func (op=0x7ffff3423110)
      #45996 0x0000555555cb4b68 in zend_make_printable_zval (expr=0x7ffff342
      #45997 0x0000555555cae0a8 in concat_function (result=0x7ffff3423120, o
      #45998 0x0000555555d4069a in ZEND_CONCAT_SPEC_CONST_TMPVAR_HANDLER ()
      #45999 0x0000555555db6336 in execute_ex (ex=0x7ffff34230c0) at /tmp/bu
      #46000 0x0000555555c9b9e1 in zend_call_function (fci=0x7fffffffb0b0, f
      #46001 0x0000555555ce6e8a in zend_call_method (object=0x7ffff3423090,
      #46002 0x0000555555d13de3 in zend_std_cast_object_tostring (readobj=0x
      #46003 0x0000555555ca5c8f in _zval_get_string_func (op=0x7ffff3423090)
  27. (backtrace, continued)

      #45999 0x0000555555db6336 in execute_ex (ex=0x7ffff34230c0) at /tmp/bu
      #46000 0x0000555555c9b9e1 in zend_call_function (fci=0x7fffffffb0b0, f
      #46001 0x0000555555ce6e8a in zend_call_method (object=0x7ffff3423090,
      #46002 0x0000555555d13de3 in zend_std_cast_object_tostring (readobj=0x
      #46003 0x0000555555ca5c8f in _zval_get_string_func (op=0x7ffff3423090)
      #46004 0x0000555555cb4b68 in zend_make_printable_zval (expr=0x7ffff342
      #46005 0x0000555555cae0a8 in concat_function (result=0x7ffff34230b0, o
      #46006 0x0000555555d4069a in ZEND_CONCAT_SPEC_CONST_TMPVAR_HANDLER ()
      #46007 0x0000555555db6336 in execute_ex (ex=0x7ffff3423030) at /tmp/bu
      #46008 0x0000555555dbaa7d in zend_execute (op_array=0x7ffff3489300, re
      #46009 0x0000555555cb8f1e in zend_execute_scripts (type=8, retval=0x0,
      #46010 0x0000555555befaa9 in php_execute_script (primary_file=0x7fffff
      #46011 0x0000555555dbd88f in do_cli (argc=2, argv=0x5555568f9380) at /
      #46012 0x0000555555dbed26 in main (argc=2, argv=0x5555568f9380) at /tm
  28. Resources and Further Reading
    • References about Maintaining and Extending PHP: https://wiki.php.net/internals/references
    • PHP Internals (Thomas Punt): https://phpinternals.net/
    • Derick Rethans’ Blog: https://derickrethans.nl/
    • Nikita Popov’s Blog: https://nikic.github.io/
    • PHP Internals Book (Nikita and Julien Pauli): http://www.phpinternalsbook.com/