PHP Internals for the Inquisitive Developer

Presented February 8, 2019 at Sunshine PHP.

Presented December 4, 2018 at Lisbon PHP meetup.

Presented October 11, 2018 at Symfony Loves PHP 2018 - San Francisco.

Jeremy Mikola

February 08, 2019

Transcript

  1. PHP Internals for the
    Inquisitive Developer
    Jeremy Mikola
    @jmikola

  2. A Little About Myself

  4. Some Topics to Cover
    ● Request lifecycle
    ● Types of PHP extensions
    ● Internal data structures
    ● Navigating PHP’s source
    ● Executing a PHP file
    ● Opcodes, caching, and optimization
    ● Examining how PHP interacts with C
    ● Debugging crashes

  6. Everything Starts Here*
    int main(int argc, char *argv[])
    {
    return 0;
    }

  11. Server APIs
    ● Apache mod_php
    ● Command Line Interface (CLI)
    ● Common Gateway Interface (CGI)
    ○ FastCGI allows for persistent processes
    ○ Apache mod_fcgid
    ● FastCGI Process Manager (FPM)
    ○ Apache mod_proxy_fcgi

  13. Types of Extensions
    ● Extensions (aka “modules”)
    ○ Allows new concepts to be added to PHP
    ○ Integration with C, system libraries
    ○ e.g. APCu, MongoDB, OpenSSL
    ● Zend Extensions
    ○ More hooks for changing PHP’s behavior
    ○ Commonly used for debuggers and profilers
    ○ May also register normal extensions for user APIs
    ○ e.g. OPCache, Xdebug, phpdbg, Blackfire

  14. Module Scope
    ● Module initialization
    ○ Allocate persistent read-only globals
    ○ Register INI entries, classes, constants, etc.
    ○ Initialize third-party libraries
    ● Module shutdown
    ○ Free persistent allocations
    ○ Unregister INI entries

  15. Request Scope
    ● Request initialization
    ○ Allocate request-bound memory
    ○ Reset globals as needed
    ○ Avoid unintentionally altering global state
    ● Request shutdown
    ○ Free request-bound allocations
    ○ Zend Memory Manager helps catch leaks
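
The module and request scopes described above can be sketched as a toy simulation. This is not PHP's API: `toy_worker` and every `toy_*` function are hypothetical names invented for illustration, and the 64-byte buffer merely stands in for request-bound memory. The sketch assumes one process that initializes its module once, serves several requests, and then shuts down.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical bookkeeping for one worker process; none of these names
 * come from PHP itself. */
typedef struct {
    int   module_inits;      /* times module initialization ran */
    int   module_shutdowns;  /* times module shutdown ran */
    int   requests_served;   /* completed request init/shutdown pairs */
    char *request_buf;       /* request-bound memory, NULL between requests */
} toy_worker;

void toy_module_init(toy_worker *w)     { w->module_inits++; }
void toy_module_shutdown(toy_worker *w) { w->module_shutdowns++; }

/* Request initialization: allocate request-bound memory. */
int toy_request_init(toy_worker *w) {
    assert(w->request_buf == NULL);  /* the previous request must have cleaned up */
    w->request_buf = calloc(1, 64);
    return w->request_buf != NULL;
}

/* Request shutdown: free everything the request allocated. */
void toy_request_shutdown(toy_worker *w) {
    free(w->request_buf);
    w->request_buf = NULL;
    w->requests_served++;
}

/* One process lifetime: module scope wraps any number of requests. */
void toy_run_worker(toy_worker *w, int requests) {
    toy_module_init(w);
    for (int i = 0; i < requests; i++) {
        if (!toy_request_init(w)) {
            break;
        }
        /* ... the script would execute here ... */
        toy_request_shutdown(w);
    }
    toy_module_shutdown(w);
}
```

Running `toy_run_worker` on a zero-initialized `toy_worker` with three requests leaves the module counters at one each and no request memory outstanding, which is the invariant the two scopes are meant to maintain.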

  16. Process Model

  17. Thread Model

  18. Global Scope
    ● Globals
    ○ Use persistent allocation, initialize to zero
    ○ Requests access globals via TSRM macros
    ○ Zend hash globals can be used as a cache
    ● Process model
    ○ GINIT is called once, before MINIT
    ○ GSHUTDOWN is called once, during MSHUTDOWN
    ● Thread model
    ○ Additionally called each time a thread spawns or dies
    ○ TSRM macros take care of thread local storage

  19. PHP Process Manager
    ● Uses ReactPHP to manage a pool of PHP processes (CLI)
    ● Bootstraps application once per worker process
    ● Each worker handles a series of HTTP requests
    ● Leverages request/response design in frameworks
    ● Operates entirely within one CLI “request”
    ● Issues with memory leaks, resetting state

  21. It looks like you're giving a presentation
    on PHP internals. Would you like help?
    Talk about zvals

  22. Zvals
    ● 16-byte struct consisting of three union fields
    ● value (8 bytes)
    ○ Integer and double values are stored inline
    ○ Pointers are used for other types (e.g. string, array, object)
    ○ Not used for null and boolean values (denoted by type_info)
    ● u1 contains type_info (4 bytes)
    ○ Type byte and various bit flags
    ● u2 is a multi-purpose union (4 bytes)
    ○ Used for hash tables, AST line numbers, foreach iteration, etc.
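
The layout described above can be approximated with a simplified stand-in. This is a sketch, not the definition from PHP's headers: `toy_value`, `toy_zval`, the `TOY_*` tags and their numeric values are all hypothetical, and the real unions carry more members than are shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the 8-byte value field: integers and doubles
 * live inline, other types are reached through a pointer. */
typedef union {
    int64_t lval;  /* integer stored inline */
    double  dval;  /* double stored inline */
    void   *ptr;   /* string, array, object, ... */
} toy_value;

/* Hypothetical stand-in for a zval: three fields, 16 bytes in total. */
typedef struct {
    toy_value value;                   /* 8 bytes */
    union { uint32_t type_info; } u1;  /* 4 bytes: type byte plus flags */
    union { uint32_t extra; } u2;      /* 4 bytes: multi-purpose slot */
} toy_zval;

/* Illustrative type tags; the numeric values are made up for this sketch. */
enum { TOY_NULL = 1, TOY_FALSE, TOY_TRUE, TOY_LONG, TOY_DOUBLE };

static_assert(sizeof(toy_zval) == 16, "toy_zval should occupy 16 bytes");

/* An integer is stored directly in the value field. */
toy_zval toy_long(int64_t n) {
    toy_zval z = { .value = { .lval = n }, .u1 = { .type_info = TOY_LONG }, .u2 = { .extra = 0 } };
    return z;
}

/* A boolean needs no value at all: the type tag alone encodes it. */
toy_zval toy_bool(int b) {
    toy_zval z = { .value = { .lval = 0 }, .u1 = { .type_info = b ? TOY_TRUE : TOY_FALSE }, .u2 = { .extra = 0 } };
    return z;
}
```

The compile-time assertion checks the 16-byte claim for this stand-in, and `toy_bool` illustrates why null and boolean values leave the value field unused.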

  23. Zval Improvements in PHP 7
    ● Zvals are no longer individually heap-allocated
    ○ Can now be directly embedded (e.g. hash buckets)
    ● Zvals are no longer refcounted
    ○ Refcounts stored on values themselves (e.g. zend_string)
    ○ Values can be shared independently of the zval struct
    ● Much less indirection and pointer traversal

  24. Improvements to Other Types
    ● New string representation
    ○ zend_string struct replaces char*
    ○ Encapsulates refcount, string length, and data
    ● Hash table redesigned
    ○ Buckets allocated in sequence (less pointer traversal)
    ○ Optimizations for “packed arrays”
    ● Objects are more lightweight
    ○ Declared properties embedded in zend_object
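
To make the string representation concrete, here is a minimal sketch of a refcounted string whose header and character data share one allocation. It is not the actual zend_string definition: `toy_string` and its helper functions are hypothetical, and details such as hashing and interning are left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical refcounted string: refcount, length, and data in one block. */
typedef struct {
    uint32_t refcount;
    size_t   len;
    char     val[];  /* flexible array member, kept NUL-terminated */
} toy_string;

/* Allocate header and data together, starting with a refcount of 1. */
toy_string *toy_string_new(const char *src, size_t len) {
    toy_string *s = malloc(sizeof(toy_string) + len + 1);
    if (s == NULL) {
        return NULL;
    }
    s->refcount = 1;
    s->len = len;
    memcpy(s->val, src, len);
    s->val[len] = '\0';
    return s;
}

/* Sharing the value only bumps the counter; no bytes are copied. */
toy_string *toy_string_addref(toy_string *s) {
    s->refcount++;
    return s;
}

/* Drop one reference; returns 1 if the string was freed, 0 otherwise. */
int toy_string_release(toy_string *s) {
    assert(s->refcount > 0);
    if (--s->refcount == 0) {
        free(s);
        return 1;
    }
    return 0;
}
```

Because the count lives on the value itself, two holders can share one `toy_string` without involving whatever container points at it, and the memory is released only when the last holder calls `toy_string_release`.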

  25. Types and Attributes
    Type             Refcounted  Collectable  Copyable  Immutable
    Simple Types
    String           ✘                        ✘
    Array            ✘           ✘            ✘
    Object           ✘           ✘
    Resource         ✘
    Reference        ✘
    Interned String
    Immutable Array                                     ✘
    (✘ marks an attribute that applies to the type.)

  26. Navigating PHP

  34. Executing PHP

  35. Interpretation vs. Compilation

  36. An Admittedly Contrived PHP Script
    function test()
    {
        echo "Hello!\n";
        return true;
        echo "Not executed.\n";
    }

  37. Parsing Tokens (token_get_all)
    T_OPEN_TAG
    T_WHITESPACE
    T_FUNCTION T_WHITESPACE T_STRING ( ) T_WHITESPACE
    { T_WHITESPACE
    T_ECHO T_WHITESPACE T_CONSTANT_ENCAPSED_STRING ; T_WHITESPACE
    T_RETURN T_WHITESPACE T_STRING ; T_WHITESPACE
    T_ECHO T_WHITESPACE T_CONSTANT_ENCAPSED_STRING ; T_WHITESPACE
    } T_WHITESPACE

  39. Vulcan Logic Dumper

  40. Examining Opcodes (vld)
    function name: test
    number of ops: 4
    compiled vars: none
    line  #*  E I O  op      fetch  ext  return  operands
    --------------------------------------------------------------
       5  0   E >    ECHO                        'Hello%21%0A'
       6  1       >  RETURN                      <true>
       8  2*         ECHO                        'Not+executed.%0A'
       9  3*      >  RETURN                      null

  41. Opcode Caching
    ● Various caching strategies
    ● OpArrays (i.e. opcode sequences)
    can be optimized
    ● OpArrays still interpreted at runtime
    ● JIT would allow machine code caching

  42. OpCode Optimizations
    Pass  Bit      Optimization
    1     1 << 0   Casts, operators, internal functions with constant and literal arguments
    2     1 << 1   Type coercion in expressions, conditional elimination (e.g. if statements)
    3     1 << 2   Optimize self-assignment, post-increment, and jumps
    5     1 << 4   Block optimization of control flow graph (CFG)
    9     1 << 8   Optimize usage of temporary variables (register allocation)
    10    1 << 9   Remove NOPs
    -     1 << 14  Collect constants for future replacement
    https://stackoverflow.com/a/21291587/162228
    https://phpinternals.net/categories/opcache
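
For the numbered rows in the table above, pass n corresponds to bit n - 1. The sketch below combines a set of passes into one mask, on the assumption that the bits are ORed together in the way OPcache's `opcache.optimization_level` setting is usually described. The `toy_*` helpers are hypothetical and are not part of OPcache.

```c
#include <assert.h>
#include <stddef.h>

/* Bit for a numbered pass: pass 1 is 1 << 0, pass 10 is 1 << 9. */
unsigned long toy_pass_bit(unsigned pass) {
    assert(pass >= 1 && pass <= 32);
    return 1UL << (pass - 1);
}

/* OR together the bits for every requested pass. */
unsigned long toy_level_for(const unsigned *passes, size_t count) {
    unsigned long level = 0;
    for (size_t i = 0; i < count; i++) {
        level |= toy_pass_bit(passes[i]);
    }
    return level;
}

/* Test whether a given pass is switched on in a mask. */
int toy_pass_enabled(unsigned long level, unsigned pass) {
    return (level & toy_pass_bit(pass)) != 0;
}
```

Enabling passes 1, 2, 3, 5, 9, and 10 from the table yields the mask 0x317. Pass 4 is not listed, so `toy_pass_enabled` reports it as disabled in that mask.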

  43. Approaching C

  44. System Calls
    ● open and close file descriptors
    ● read and write files, sockets, devices
    ● fork, exec, or wait on another process
    ● exit the current process
    ● Send signals to other processes (kill)
    ● Map files or devices to memory (mmap, munmap)
    ● Allocate process memory (brk, sbrk)
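
As a minimal sketch of issuing one of these system calls directly, the helper below wraps the POSIX `write` call and retries on partial writes or interruption. The names `toy_write_all` and `toy_say_hello` are hypothetical, and the sketch assumes a POSIX system.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

/* Write the whole buffer to fd, retrying after partial writes or EINTR.
 * Returns the number of bytes written, or -1 on error. */
ssize_t toy_write_all(int fd, const char *buf, size_t len) {
    assert(buf != NULL);
    size_t done = 0;
    while (done < len) {
        ssize_t n = write(fd, buf + done, len - done);
        if (n < 0) {
            if (errno == EINTR) {
                continue;  /* interrupted by a signal: try again */
            }
            return -1;
        }
        done += (size_t)n;
    }
    return (ssize_t)done;
}

/* Send a short greeting to the given descriptor (1 is standard output). */
ssize_t toy_say_hello(int fd) {
    const char *msg = "Hello!\n";
    return toy_write_all(fd, msg, strlen(msg));
}
```

Calling `toy_say_hello(1)` asks the kernel to write seven bytes to standard output. Running such a program under a tracer shows the library call turning into a `write` system call.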

  45. Tracing System Calls (strace)
    echo "Hello!\n";
    $ strace -e write php example.php > /dev/null
    write(1, "Hello!\n", 7) = 7
    +++ exited with 0 +++

  46. Tracing System Calls (strace)
    $ strace -e openat php example.php 2>&1 | grep ini
    openat(AT_FDCWD, "/usr/bin/php-cli.ini", O_RDONLY) = -1
    openat(AT_FDCWD, "/etc/php/7.2/cli/php-cli.ini", O_RDONLY) = -1
    openat(AT_FDCWD, "/usr/bin/php.ini", O_RDONLY) = -1
    openat(AT_FDCWD, "/etc/php/7.2/cli/php.ini", O_RDONLY) = 3
    openat(AT_FDCWD, "/etc/php/7.2/cli/conf.d/10-opcache.ini", …
    openat(AT_FDCWD, "/etc/php/7.2/cli/conf.d/10-pdo.ini", …

  47. Tracing Library Calls (ltrace)
    $ ltrace -l mongodb.so php example.php
    mongodb.so->mongoc_log_trace_disable(0, 1, 0x55a78b57a410, …
    mongodb.so->mongoc_log_set_handler(0, 0, 0x55a78b57a410, …
    mongodb.so->mongoc_init(0, 0, 0x55a78b552bb8, 0x7f1941ec6320 …
    mongodb.so->_mongoc_openssl_init(0x7ffcf6e60460, …
    mongodb.so->bson_malloc0(40, 0, 0, 0x7f1940ecf8e0) …
    mongodb.so->bson_malloc0(40, 0x55a78b42a018, 0x55a78b554bc0, …
    mongodb.so->_mongoc_counters_init(0x55a78b5a4710, …

  48. Debugging Crashes (gdb)
    ● The most common crashes are segfaults
    ○ C makes it trivially easy to access memory incorrectly
    ● Ideally, crashes produce core dumps, which can be inspected
    ○ https://bugs.php.net/bugs-generating-backtrace.php
    ● PHP source includes a .gdbinit file with helpful macros

  50. This’ll Do Just Fine
    class Crash
    {
        public function __toString()
        {
            return "".$this;
        }
    }
    "".(new Crash());

  51. Capturing a Core Dump
    $ php class-tostring-recursion.php
    Segmentation fault (core dumped)
    $ ls
    class-tostring-recursion.php core
    $ gdb `which php` core

  52. Capturing a Core Dump
    $ gdb -q --args php class-tostring-recursion.php
    Reading symbols from php...done.
    (gdb) run
    Starting program: /usr/bin/php class-tostring-recursion.php
    Program received signal SIGSEGV, Segmentation fault.
    0x0000555555c9ab9e in zend_call_function (fci=<error reading variable: Cannot access memory at address 0x7fffff7fef78>,
    fci_cache=<error reading variable: Cannot access memory at address 0x7fffff7fef70>) at
    /tmp/build_php-7.2.6.sx8/php-7.2.6/Zend/zend_execute_API.c:659
    (gdb)

  53. Traversing the Backtrace
    (gdb) bt -20
    #45993 0x0000555555ce6e8a in zend_call_method (object=0x7ffff3423110,
    #45994 0x0000555555d13de3 in zend_std_cast_object_tostring (readobj=0x
    #45995 0x0000555555ca5c8f in _zval_get_string_func (op=0x7ffff3423110)
    #45996 0x0000555555cb4b68 in zend_make_printable_zval (expr=0x7ffff342
    #45997 0x0000555555cae0a8 in concat_function (result=0x7ffff3423120, o
    #45998 0x0000555555d4069a in ZEND_CONCAT_SPEC_CONST_TMPVAR_HANDLER ()
    #45999 0x0000555555db6336 in execute_ex (ex=0x7ffff34230c0) at /tmp/bu
    #46000 0x0000555555c9b9e1 in zend_call_function (fci=0x7fffffffb0b0, f
    #46001 0x0000555555ce6e8a in zend_call_method (object=0x7ffff3423090,
    #46002 0x0000555555d13de3 in zend_std_cast_object_tostring (readobj=0x
    #46003 0x0000555555ca5c8f in _zval_get_string_func (op=0x7ffff3423090)

  54. #45999 0x0000555555db6336 in execute_ex (ex=0x7ffff34230c0) at /tmp/bu
    #46000 0x0000555555c9b9e1 in zend_call_function (fci=0x7fffffffb0b0, f
    #46001 0x0000555555ce6e8a in zend_call_method (object=0x7ffff3423090,
    #46002 0x0000555555d13de3 in zend_std_cast_object_tostring (readobj=0x
    #46003 0x0000555555ca5c8f in _zval_get_string_func (op=0x7ffff3423090)
    #46004 0x0000555555cb4b68 in zend_make_printable_zval (expr=0x7ffff342
    #46005 0x0000555555cae0a8 in concat_function (result=0x7ffff34230b0, o
    #46006 0x0000555555d4069a in ZEND_CONCAT_SPEC_CONST_TMPVAR_HANDLER ()
    #46007 0x0000555555db6336 in execute_ex (ex=0x7ffff3423030) at /tmp/bu
    #46008 0x0000555555dbaa7d in zend_execute (op_array=0x7ffff3489300, re
    #46009 0x0000555555cb8f1e in zend_execute_scripts (type=8, retval=0x0,
    #46010 0x0000555555befaa9 in php_execute_script (primary_file=0x7fffff
    #46011 0x0000555555dbd88f in do_cli (argc=2, argv=0x5555568f9380) at /
    #46012 0x0000555555dbed26 in main (argc=2, argv=0x5555568f9380) at /tm

  55. (take a breath)

  56. Resources and Further Reading
    References about Maintaining and Extending PHP
    https://wiki.php.net/internals/references
    PHP Internals (Thomas Punt)
    https://phpinternals.net/
    Derick Rethans’ Blog
    https://derickrethans.nl/
    Nikita Popov’s Blog
    https://nikic.github.io/
    PHP Internals Book (Nikita and Julien Pauli)
    http://www.phpinternalsbook.com/

  57. Thanks!
    Jeremy Mikola
    @jmikola

  58. Photo Credits
    ● https://imgur.com/gallery/0hcTtiW
    ● http://inception.wikia.com/wiki/Fischer_inception_job?file=Cobb_using_the_Mr._Charles_tactic.png
    ● http://www.phpinternalsbook.com/
    ● https://support.cloud.engineyard.com/hc/en-us/articles/205411888-PHP-Performance-I-Everything-You-Need-to-Know-About-OpCode-Caches
    ● http://www.leonarddavid.com/wp-content/uploads/2015/04/death-grip.jpg
    ● http://weca.mp/2016/images/coach/sara.png
