Testing Complex Applications for PHP7

Adam Baratz
February 17, 2017

Wayfair is one of the world's largest online destinations for the home. Our storefront is a very large PHP application -- 3.5M LoC interacting with a wide array of extensions -- that serves 2M daily visitors. So we were delighted when our upgrade to PHP7 went without a hitch. It worked so well because of a test plan that covered a wide range of strategies and tools. This case study will combine a walkthrough of this project with a practical tour of PHP testing tools, from PHPUnit to GDB.

Transcript

  1. Testing Complex Applications
    for PHP7

  2. What We Talk About
    When We Talk About Testing

  3. Test for Correctness?

  4. https://phpunit.de/getting-started.html

  5. Test for Correctness?

  6. Test for Correctness?

  7. Test for Correctness?

  8. Test for Correctness?

  9. Test for Correctness?

  10. Test for Correctness?

  11. Test for Correctness?
    puppet images via https://thenounproject.com/term/puppet/

  12. *yawn*

  13. Test for Correctness?

  14. Test for Evaluating and
    Managing Risk

  15. Wayfair and PHP7

  17. Wayfair and PHP7
    • Page execution time dropped by about 50%
    • CPU utilization dropped by about 30%
    For the details on why:
    https://nikic.github.io/2015/05/05/Internal-value-representation-in-PHP-7-part-1.html
    https://nikic.github.io/2015/06/19/Internal-value-representation-in-PHP-7-part-2.html

  20. Wayfair Before PHP7
    • 3.5M LoC over 28K files, mostly our own code, but some third-party
    • Coding conventions spanning several versions of PHP
    • 66 PHP extensions
        • Many officially supported
        • Some third-party
        • Some modified third-party
        • Some totally custom

  21. Risk Is What You Don’t Know

  23. Test for Evaluating and
    Managing Risk

  24. What We’ll Cover
    • Risk as “what you don’t know”
    • Identifying risk in complex applications
    • Common risk for infrastructure changes
    • Testing as risk management
    • Manual and automated tools
    • AKA software engineering as information gathering
    • Product management strategies
    • Identifying goals
    • Kicking off and running big projects
    • Choosing the most important thing to work on in service of that goal
    • AKA delivering value to the business

  25. Common Risk for Infrastructure Changes
    • Focused more on the desired outcome than on the path there
    • Divided attention among collaborators
    • Yours may not be the only such project in flight
    • Your systems will have in-between states
    • Something doesn’t work, it works once
    • You’ll live and die by your ability to monitor
    • Might not be worth it
    • Disaster is more likely than success

  26. Managing Common Risk
    • Always communicate in terms of value to the business
    • “Performance saves money,” not “spaceship operators!”
    • Don’t promise too much, too soon
    • Be in touch with broader org to understand potential collisions
    • Work the room
    • Share successes
    • Reinforce idea that project is headed in the right direction
    • Identify allies at different levels of the org
    • Harvest byproducts, opportunistically

  27. Planning
    • Never expect to account for everything
    • Map out the moving parts and dependencies
    • Distinguish the knowns from the unknowns
    • Guide future thinking

  28. Planning Tools
    • phpinfo() / php -i
    • Documentation
        • https://github.com/php/php-src/blob/PHP-7.0.0/UPGRADING
        • http://php.net/manual/en/migration70.incompatible.php
        • https://wiki.php.net/rfc/remove_deprecated_functionality_in_php7
        • https://github.com/gophp7/gophp7-ext/wiki/extensions-catalog

  30. Todo (30 October 2015)
    • Investigate apc extension.
    • Investigate cgi-fcgi extension.
    • Investigate whether bug with dba extension will affect us.
    • Remove dependency on ereg extension. Find repo references via /\b(ereg_replace|ereg|eregi_replace|eregi|split|spliti|sql_regcase)\(/. As of Oct 30, there were references to ereg() and split().
    • Update ketama extension.
    • Investigate memcache extension.
    • Remove mhash from puppet config. The only places in the php repo which could use it favor using the hash extension.
    • Investigate status of update to mongo extension.
    • Remove dependency on mssql extension.
    • Remove dependency on mysql extension.
    • Investigate whether open bugs with openssl extension will affect us.
    • Merge our PDO patches into master. It's not clear how much pdo_dblib has been tested by core maintainers, so we should test closely. Alternatively, SQL Relay may have supplanted pdo_dblib by the time we're ready to move forward with PHP7.
    • Investigate Zend OPcache extension.
    • Approve and commit compatibility updates for wfstring.

  31. Don’t Prematurely Discard Risk
    • Start simple, start naïve
    • Don’t assume anything until you do the research
    • Start big, start placeholder
    • You can reduce “Investigate apc extension” later
    • Start by overexplaining
    • Link to sources in case anyone needs to work backwards

  32. Prioritizing: ereg vs. mssql
    ereg
    • 7 functions
    • One-to-one replacements essentially in place via PCRE and string functions
    • No more than a dozen references, each of which could be updated easily
    mssql
    • 30 functions
    • Website would be largely unusable without DB access
    • We already started transitioning to PDO
    • Some functionality wasn’t replicated
    • Some behaviors were subtly different
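
A minimal sketch of what "one-to-one replacements" for the ereg family can look like. The function names and sample patterns below are hypothetical illustrations, not code from the talk; the only assumption is that each call site used a simple POSIX pattern.

```php
<?php
// Hypothetical call sites, each shown with a PHP 7-compatible replacement.

// ereg('^[0-9]+$', $sku)  ->  preg_match() with pattern delimiters added.
function isNumericSku(string $sku): bool
{
    return preg_match('/^[0-9]+$/', $sku) === 1;
}

// eregi('^sofa', $title)  ->  preg_match() with the case-insensitive "i" flag.
function startsWithSofa(string $title): bool
{
    return preg_match('/^sofa/i', $title) === 1;
}

// split(',', $csv)  ->  explode() when the delimiter is a literal string.
function splitCsv(string $csv): array
{
    return explode(',', $csv);
}

// split('[,;]', $list)  ->  preg_split() when the delimiter is a pattern.
function splitList(string $list): array
{
    return preg_split('/[,;]/', $list);
}

// ereg_replace('[[:space:]]+', ' ', $text)  ->  preg_replace().
function collapseWhitespace(string $text): string
{
    return preg_replace('/[[:space:]]+/', ' ', $text);
}
```

Each converted site still deserves a quick check: ereg() returned the match length where preg_match() returns 1 or 0, and POSIX and PCRE syntax differ in places.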

  33. Focus First On The Biggest Risk
    • Choose the big, hard, unknown thing
    • That’s where all the risk lives
    • i.e., everything that could derail your project
    • Get the clearest view of project scope, as soon as possible
    • Guide further future thinking

  35. Todo (30 October 2015)
    • Investigate apc extension.
    • Investigate cgi-fcgi extension.
    • Investigate whether bug with dba extension will affect us.
    • Remove dependency on ereg extension. Find repo references via /\b(ereg_replace|ereg|eregi_replace|eregi|split|spliti|sql_regcase)\(/. As of Oct 30, there were references to ereg() and split().
    • Update ketama extension.
    • Investigate memcache extension.
    • Remove mhash from puppet config. The only places in the php repo which could use it favor using the hash extension.
    • Investigate status of update to mongo extension.
    • Remove dependency on mssql extension.
    • Remove dependency on mysql extension.
    • Investigate whether open bugs with openssl extension will affect us.
    • Merge our PDO patches into master. It's not clear how much pdo_dblib has been tested by core maintainers, so we should test closely. Alternatively, SQL Relay may have supplanted pdo_dblib by the time we're ready to move forward with PHP7.
    • Investigate Zend OPcache extension.
    • Approve and commit compatibility updates for wfstring.

  36. Planning Fact Sheet
    Engineering Time: None
    Execution Time: Short
    Actionability: Varies
    Followup Time: Varies, but can be highly parallelizable

  37. Think Diminishing Risk, Not Deadlines
    • Use dates to say when something should be started, not finished
    • Calendar dates are a Procrustean bed
    • Opportunity for team to feel they messed up
    • Opportunity for stakeholders to dangle a knife
    • Team should focus on solving problems and delivering value
    • Stakeholders should know your pace through regular conversation
    • You want them to be partners in the process, not adversaries
    • Talk in rough terms: days, weeks, months
    • Talk in ranges

  38. Always Be Planning
    Continuously…
    • Review task list (single source of truth)
    • Add and reorder items
    • Add information as it’s obtained
    • Be specific, cite sources
    • Test certainty of previously made assertions
    • Ensure format of “map” exposes context and risk to group
    • Facilitate active participation across group

  39. Changes to variable handling (PHP7)
    Indirect variable, property and method references are now interpreted with left-to-right semantics.
    $$foo['bar']['baz'] // interpreted as ($$foo)['bar']['baz']
    $foo->$bar['baz'] // interpreted as ($foo->$bar)['baz']
    $foo->$bar['baz']() // interpreted as ($foo->$bar)['baz']()
    Foo::$bar['baz']() // interpreted as (Foo::$bar)['baz']()
    To restore the previous behavior add explicit curly braces:
    ${$foo['bar']['baz']}
    $foo->{$bar['baz']}
    $foo->{$bar['baz']}()
    Foo::{$bar['baz']}()
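
A small sketch of the explicit-brace forms in use. The variable names, the stdClass object, and the Greeter class are made up for illustration. Written this way, the expressions should resolve the same on PHP 5 and PHP 7, which is the point of adding the braces before upgrading.

```php
<?php
// Variable variable: the target variable's name is stored in a nested array.
$target = 'hello';
$foo = ['bar' => ['baz' => 'target']];

// Braces force $foo['bar']['baz'] to be evaluated first, then dereferenced.
$viaBraces = ${$foo['bar']['baz']};

// Dynamic property: the property name is stored in an array.
$bar = ['baz' => 'title'];
$product = new stdClass();
$product->title = 'Sofa';
$property = $product->{$bar['baz']};

// Dynamic method: the method name is stored in an array.
class Greeter
{
    public function hi()
    {
        return 'hi';
    }
}

$methods = ['baz' => 'hi'];
$greeter = new Greeter();
$result = $greeter->{$methods['baz']}();

var_dump($viaBraces, $property, $result);
```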

  40. Static Analysis
    • grep (e.g., for deprecated functions)
    • php -l (linter)
    • php7mar (detect potential compatibility issues)
    • Phan (AST)
    For many more: https://github.com/exakat/php-static-analysis-tools
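
As a rough illustration of the grep-style end of this list, the sketch below scans source text for calls to the removed ereg-family functions, using the pattern from the todo list with an optional-whitespace tweak. The function name findEregCalls() and the sample snippet are hypothetical. It is a text match, not a parser, so method calls such as $obj->split() would show up as false positives.

```php
<?php
// Returns one entry per suspicious call: the 1-based line and the function name.
function findEregCalls(string $source): array
{
    $pattern = '/\b(ereg_replace|ereg|eregi_replace|eregi|split|spliti|sql_regcase)\s*\(/';
    $hits = [];
    foreach (explode("\n", $source) as $index => $line) {
        if (preg_match_all($pattern, $line, $matches)) {
            foreach ($matches[1] as $name) {
                $hits[] = ['line' => $index + 1, 'function' => $name];
            }
        }
    }
    return $hits;
}

// Sample input: preg_split() must not be reported, because "_" is a word
// character and so there is no word boundary before "split" in that name.
$sample = <<<'PHP'
if (ereg('^a', $s)) { echo 'match'; }
$parts = preg_split('/,/', $s);
$old = split(',', $s);
PHP;

$hits = findEregCalls($sample);
foreach ($hits as $hit) {
    printf("line %d: %s()\n", $hit['line'], $hit['function']);
}
```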

  41. Sample php7mar output
    Scanning testcases.php
    Including file extensions: php
    Processed 148 lines contained in 1 files.
    Processing took 0.048094034194946 seconds.
    # critical
    #### testcases.php
    * variableInterpolation
    * Line 2: `$$foo['bar']['baz']; //Interpreted as ($$foo)['bar']['baz']`
    * Line 3: `$foo->$bar['baz']; //Interpreted as ($foo->$bar)['baz']`
    * Line 4: `$foo->$bar['baz'](); //Interpreted as ($foo->$bar)['baz']()`
    * Line 5: `Foo::$bar['baz'](); //Interpreted as (Foo::$bar)['baz']()`
    * Line 6: `global $$foo->bar; //The global keyword now only accepts simple variables.`
    * duplicateFunctionParameter
    * Line 15: `function foo($a, $b, $unused, $unused) { /*...*/ }`
    * reservedNames
    ...

  42. Evaluating Static Analysis Tools
    • Start with “weakest” analysis
    • https://github.com/etsy/phan/wiki/Tutorial-for-Analyzing-a-Large-Sloppy-Code-Base
    • Spot check output
    • Consider…
    • Output volume
    • Output value
    • Ownership cost (ad hoc runs or someone’s ongoing responsibility?)

  43. Static Analysis Fact Sheet
    Engineering Time: Short
    Execution Time: Short (consider as CI step)
    Actionability: Generates concrete punch list, but there could be false positives
    Followup Time: Varies, but can be highly parallelizable

  44. Finding the Next Biggest Risk
    • Be alert to momentum
    • Momentum implies reduced risk
    • Reduced risk implies the project is winding down
    • Managing execution is different from managing the project
    • Even if there’s a lot of execution to manage
    • Don’t confuse tasks with objectives
    • Invert any underlying assumptions
    • In this case: when can we start running code?
    • Look for simplest opportunity to start sizing next biggest risk

  45. Prioritizing: when can we start running code?
    What are all the ways we can run code?
    • Web requests (php-fpm)
    • Batch processes (i.e., php -f)
    • Automated tests
        • Unit tests
        • Integration tests
        • Acceptance tests (browser tests)

  46. From: PHP7 Engineer
    Sent: Mar 16, 2016
    To: PHP7 Team; Chief Architect
    Subject: Exciting PHP7 news
    Exciting proof of something everyone already assumes – Adam and I ran PHP5.6 and PHP7 head to head, and the results are that PHP7 is significantly faster than PHP5.6.
    Methodology
    We compared running times while unit testing feature_detect_test.php (for those who are unfamiliar, this is a regex-heavy set of tests that is notoriously long-running). PHP5.6 completed the tests in an average of 14.37 seconds, whereas PHP7 took an average of 2.9 seconds.
    Conclusions
    This is primarily a proof-of-concept while working in a vacuum, but it is promising that we are heading in the right direction.
    Other Observations
    PHP7 used up significantly more memory, as reported by PHPUnit (10MB vs 5.75MB).
    The initial iteration of PHP5.6 was quite the outlier – removing that reduces the average to about 13.44 seconds.
    Next Steps
    Our next project is to get PHP7 to render actual pages on our site – so far this has been throwing 502s while the server segfaults.
    Steps we had to take to get the unit tests to run:
    • Disable codeception + dependencies
    • Manually build DOM + ctype extensions
    • Update MPDF to latest development branch
    Below you’ll find a few charts outlining the runtimes of each individual iteration. […]
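
To put the email's figures in proportion, here is the arithmetic worked through. Only the five input numbers come from the email; the variable names are illustrative, and the rounding is a presentation choice.

```php
<?php
// Figures quoted in the email.
$php56Seconds = 14.37;         // PHP 5.6 average over all iterations
$php56SecondsTrimmed = 13.44;  // PHP 5.6 average without the outlying first iteration
$php7Seconds = 2.9;            // PHP 7 average
$php56MemoryMb = 5.75;
$php7MemoryMb = 10.0;

// How many times faster PHP 7 ran the test file.
$speedup = round($php56Seconds / $php7Seconds, 2);
$speedupTrimmed = round($php56SecondsTrimmed / $php7Seconds, 2);

// Share of the original running time that was saved.
$timeSavedPercent = round((1 - $php7Seconds / $php56Seconds) * 100, 1);

// How much more memory PHPUnit reported under PHP 7.
$memoryRatio = round($php7MemoryMb / $php56MemoryMb, 2);

printf("Speedup: %.2fx (%.2fx without the outlier)\n", $speedup, $speedupTrimmed);
printf("Running time saved: %.1f%%\n", $timeSavedPercent);
printf("Memory reported: %.2fx\n", $memoryRatio);
```

So the single regex-heavy test file ran roughly five times faster, at the cost of about 1.7 times the reported memory.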

  47. Learning From Tests
    • Accept small victories
    • Mostly not working is okay
    • Identify components that are not working the most
    • Rearticulate next most important goal
    • AKA Always Be Planning

  48. Managing Common Risk
    • Always communicate in terms of value to the business
    • “Performance saves money,” not “spaceship operators!”
    • Don’t promise too much, too soon
    • Be in touch with broader org to understand potential collisions
    • Work the room
    • Share successes
    • Reinforce idea that project is headed in the right direction
    • Identify allies at different levels of the org
    • Harvest byproducts, opportunistically

  49. Automated Testing
    • Unit tests
    • Acceptance tests (browser tests)
    • Integration tests
    Most automated tests can be built on the same framework (PHPUnit).

  50. https://phpunit.de/getting-started.html

  51. Sample PHPUnit output
    PHPUnit 6.0.0 by Sebastian Bergmann and contributors.
    ...F
    Time: 0 seconds, Memory: 5.75Mb
    There was 1 failure:
    1) DataTest::testAdd with data set #3 (1, 1, 3)
    Failed asserting that 2 matches expected 3.
    /home/sb/DataTest.php:9
    FAILURES!
    Tests: 4, Assertions: 4, Failures: 1.
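
A test along the following lines, modeled on the data-provider example in the PHPUnit manual, would produce a failure like the one above. Two things here are assumptions for illustration: the helper additionCases() and the class_exists() guard, which only lets the file load on a machine without PHPUnit. With PHPUnit 6 installed through Composer, the guard passes and DataTest runs as usual.

```php
<?php
// Plain function so the cases can be inspected without PHPUnit installed.
function additionCases(): array
{
    return [
        [0, 0, 0],
        [0, 1, 1],
        [1, 0, 1],
        [1, 1, 3], // deliberately wrong: 1 + 1 is 2, so data set #3 fails
    ];
}

// Only declare the test class when PHPUnit's base class can be autoloaded.
if (class_exists('PHPUnit\Framework\TestCase')) {
    class DataTest extends PHPUnit\Framework\TestCase
    {
        /**
         * @dataProvider additionProvider
         */
        public function testAdd($a, $b, $expected)
        {
            $this->assertEquals($expected, $a + $b);
        }

        public function additionProvider()
        {
            return additionCases();
        }
    }
}
```

The three dots and the F in the sample output correspond to the four data sets: three pass and the last fails.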

  52. Automated Testing Fact Sheet
    Engineering Time: Moderate – to create
    Execution Time: Short
    Actionability: Failures require some interpretation by engineers, but frameworks usually let you annotate them in test output
    Followup Time: Moderate to extensive – all engineers should monitor tests at build time and contribute to ongoing maintenance

  53. Test for Evaluating and Managing Risk
    • Identify risk by...
        • Understanding what you know
        • Admitting what you don’t know
        • Creating a shared “map” that your team can build on
        • Discussing concerns early and often
    • Tests should…
        • Tell you as much as possible
        • As soon as possible
        • About the biggest sources of risk
        • And be updated to cover what you learn along the way

  54. Are we good?

  55. Rollout
    • Project began on 30 October 2015
    • First test runs in March 2016
    • First production deploy planned for June 2016

  56. Fatal error: Allowed memory size of 536870912 bytes exhausted (tried to allocate 140729445144864 bytes) in ...
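
Converting the two numbers in this error shows why it looks like corruption rather than an ordinary leak. The sketch below assumes 64-bit PHP integers; reading the hex value as address-like is an interpretation, not something the error itself states.

```php
<?php
$limitBytes = 536870912;            // the configured memory_limit
$requestedBytes = 140729445144864;  // the attempted allocation (needs 64-bit integers)

$limitMib = $limitBytes / (1024 * 1024);       // limit in MiB
$requestedTib = $requestedBytes / (1024 ** 4); // request in TiB
$requestedHex = dechex($requestedBytes);       // same value in hexadecimal

printf("Limit: %d MiB\n", $limitMib);
printf("Requested: %.2f TiB (0x%s)\n", $requestedTib, $requestedHex);
```

The limit is 512 MiB, while the request is just under 128 TiB. In hex the requested size resembles a userspace address on 64-bit Linux more than a length, which is a hint, though not proof, that a size field was overwritten with unrelated data.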

  57. Identify Vague Hypotheses
    • Start with biggest components of system
    • Rank by least implausible
    • Identify means of validating hypotheses
    • Seek consistent reproducibility

  58. Extension Memory Management?
    • Type mismatches involving size_t
    • Side effect of incomplete changes for updated C APIs
    • Reread code to identify misses
    • Invalid read/writes that Valgrind would catch
    • Had previously tested extensions
    • Would suggest lacking test coverage
    • Shared memory corruption (very vague)
    • Had previously seen issues with PCRE JIT
    • Would have to reproduce error state with GDB

  59. Use Automated Tests To Find Memory Issues
    • Use php7dev vagrant box (or similar) for controlled environment
    • Use Gcov to evaluate test coverage
    • Create additional tests as needed
    • Run all tests with Valgrind

  60. Automated Testing For Extensions
    • Use make test or php run-tests.php
    • Failures recorded in test runner output
    • Failed tests will produce artifacts
    • Use .sh artifact to rerun test more easily
    • Or to run test with GDB
    • Tools documented at https://qa.php.net
    • Read run-tests.php to fill in any gaps

  61. Example .phpt Test
    --TEST--
    Mustache::parse() member function
    --SKIPIF--
    <?php
    if( !extension_loaded('mustache') ) die('skip ');
    ?>
    --FILE--
    <?php
    $m = new Mustache();
    $tmpl = $m->parse('{{test}}');
    var_dump(get_class($tmpl));
    ?>
    --EXPECT--
    string(11) "MustacheAST"

  62. Example Test Runner Output
    =====================================================================
    PHP : /home/vagrant/php-src/sapi/cli/php
    PHP_SAPI : cli
    PHP_VERSION : 7.2.0-dev
    ZEND_VERSION: 3.2.0-dev
    PHP_OS : Linux - Linux php7dev 3.2.0-4-amd64 #1 SMP Debian 3.2.68-1+deb7u1 x86_64
    INI actual : /home/vagrant/php-src/tmp-php.ini
    More .INIs :
    ---------------------------------------------------------------------
    PHP : /home/vagrant/php-src/sapi/phpdbg/phpdbg
    PHP_SAPI : phpdbg
    PHP_VERSION : 7.2.0-dev
    ZEND_VERSION: 3.2.0-dev
    PHP_OS : Linux - Linux php7dev 3.2.0-4-amd64 #1 SMP Debian 3.2.68-1+deb7u1 x86_64
    INI actual : /home/vagrant/php-src/tmp-php.ini
    More .INIs :
    ---------------------------------------------------------------------
    CWD : /home/vagrant/php-src
    Extra dirs :
    VALGRIND : Not used
    =====================================================================
    Running selected tests.
    PASS Mustache::parse() member function [001.phpt]
    =====================================================================
    Number of tests : 1 1
    Tests skipped : 0 ( 0.0%) --------
    Tests warned : 0 ( 0.0%) ( 0.0%)
    Tests failed : 0 ( 0.0%) ( 0.0%)
    Expected fail : 0 ( 0.0%) ( 0.0%)
    Tests passed : 1 (100.0%) (100.0%)
    ---------------------------------------------------------------------
    Time taken : 0 seconds
    =====================================================================

  63. Example With Memory Leak
    [...]
    ---------------------------------------------------------------------
    CWD : /home/vagrant/php-src
    Extra dirs :
    VALGRIND : valgrind-3.10.0
    =====================================================================
    Running selected tests.
    LEAK Mustache::parse() member function [001.phpt]
    =====================================================================
    Number of tests : 1 1
    Tests skipped : 0 ( 0.0%) --------
    Tests warned : 0 ( 0.0%) ( 0.0%)
    Tests failed : 0 ( 0.0%) ( 0.0%)
    Expected fail : 0 ( 0.0%) ( 0.0%)
    Tests leaked : 1 (100.0%) (100.0%)
    Tests passed : 0 ( 0.0%) ( 0.0%)
    ---------------------------------------------------------------------
    Time taken : 4 seconds
    =====================================================================
    =====================================================================
    LEAKED TEST SUMMARY
    ---------------------------------------------------------------------
    Mustache::parse() member function [001.phpt]
    =====================================================================

  64. Example Valgrind Output (001.mem)
    ==13733== Invalid write of size 4
    ==13733== at 0xF849E55: zim_Mustache_parse (mustache_mustache.cpp:850)
    ==13733== by 0x91ED13: ZEND_DO_FCALL_SPEC_RETVAL_USED_HANDLER (zend_vm_execute.h:1097)
    ==13733== by 0x8C60DA: execute_ex (zend_vm_execute.h:429)
    ==13733== by 0x92115F: zend_execute (zend_vm_execute.h:474)
    ==13733== by 0x87F7E3: zend_execute_scripts (zend.c:1474)
    ==13733== by 0x81EE9F: php_execute_script (main.c:2533)
    ==13733== by 0x9232E9: do_cli (php_cli.c:990)
    ==13733== by 0x44E7BB: main (php_cli.c:1378)

  65. Use Gcov To Evaluate Test Coverage
    1. phpize
    2. ./configure CFLAGS="--coverage" CXXFLAGS="--coverage" LDFLAGS="--coverage" # may vary slightly for your extension
    3. make clean all
    4. lcov --directory . --zerocounters # cleanup previous run, if any
    5. # run tests
    6. lcov --directory . --capture --output-file coverage.info
    7. genhtml --output-directory lcov_html coverage.info
    For php-src, use existing build targets: https://wiki.php.net/doc/articles/writing-tests

  67. Capture Error State In GDB
    • Find simplest means to reproduce
    • Ideally script: gdb --args php -f [.php file]
    • Or core file: gdb php-fpm [core file]
    • But maybe HTTP request (and live process): gdb php-fpm [pid]
    • But first you’ll need a debug build
    • May behave differently!
    • Script GDB to replay debugging steps
    Excellent tutorial: http://www.unknownroad.com/rtfm/gdbtut/

  68. The Culprit: APCu
    • Invalid write to memory region containing string size
    • Solved problem with latest extension version

  69. What We Could’ve Tried
    • More thorough review of known issues with third-party extensions
    • Better scrutiny of extensions that operated on shared memory
    • Had already disabled PCRE JIT
    • Assumed APCu extension was “safe” because we didn’t write it
    • Ensure we’d deployed latest version of all extensions
    • Strived earlier for better sense of “correctness”?

  70. Replay Testing
    • Sample access logs
    • Feed to cURL
    • Examine logs
    • Consider combining with stress testing (ApacheBench)
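
The steps above can be sketched as a small script. Everything named here is an assumption for illustration: the helper functions, the common/combined access log format, and the idea of replaying only GET requests against a test host. The replay() function needs PHP's curl extension and should only be pointed at a non-production environment.

```php
<?php
// Pull the request path out of each GET line in common/combined log format.
function extractGetPaths(array $logLines): array
{
    $paths = [];
    foreach ($logLines as $line) {
        if (preg_match('/"GET (\S+) HTTP\/[\d.]+"/', $line, $match)) {
            $paths[] = $match[1];
        }
    }
    return $paths;
}

// Keep every Nth path so a large log becomes a manageable sample.
function samplePaths(array $paths, int $every): array
{
    $every = max(1, $every);
    $sample = [];
    foreach (array_values($paths) as $index => $path) {
        if ($index % $every === 0) {
            $sample[] = $path;
        }
    }
    return $sample;
}

// Request each sampled path from a test host and record the HTTP status.
function replay(string $baseUrl, array $paths): array
{
    $statuses = [];
    foreach ($paths as $path) {
        $handle = curl_init($baseUrl . $path);
        curl_setopt($handle, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($handle, CURLOPT_TIMEOUT, 10);
        curl_exec($handle);
        $statuses[$path] = curl_getinfo($handle, CURLINFO_HTTP_CODE);
        curl_close($handle);
    }
    return $statuses;
}

// Made-up log lines; a real run would read them from an access log file.
$logLines = [
    '203.0.113.7 - - [17/Feb/2017:10:00:00 +0000] "GET /sofas HTTP/1.1" 200 5120',
    '203.0.113.8 - - [17/Feb/2017:10:00:01 +0000] "POST /cart HTTP/1.1" 302 0',
    '203.0.113.9 - - [17/Feb/2017:10:00:02 +0000] "GET /rugs?page=2 HTTP/1.1" 200 4096',
    '203.0.113.7 - - [17/Feb/2017:10:00:03 +0000] "GET /lamps HTTP/1.1" 200 2048',
];
$sampled = samplePaths(extractGetPaths($logLines), 2);
print_r($sampled);
```

Comparing the returned statuses, together with the PHP error logs, between a PHP 5.6 host and a PHP 7 host is one way to do the "examine logs" step.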

  71. Working Through Unknown Unknowns
    • Identify vague hypotheses
    • Break into smaller pieces
    • Overdocument, overcommunicate
    • Look for anything you recognize in the noise
    • Rotate through problem solving strategies
    • Pair problem solving
    • Keep multiple irons in the fire

  72. Managing Expectations Around Unknowns
    • Be sensitive to stress and frustration
    • Reframe around thrill of solving big problems
    • Call out indications that the problem is getting resolved
    • Be able to identify when you have more information than you used to
    • Keep working the room
    • Always be ready to pull the plug

  73. Fringe Benefits
    • Expand scope of what’s “known”
    • Develop new skills and tools
    • New tests?
    • New monitoring opportunities?
    • New things to automate?
    • Engineers like solving big problems!

  74. System Testing Fact Sheet
    Engineering Time: Extensive
    Execution Time: Extensive
    Actionability: Varies, but often low
    Followup Time: Extensive

  75. Are we good?...

  76. Rollout Considerations
    • Datacenter move was in-flight simultaneously
    • Testing focused on customer-facing website
    • No one wants to impact revenue

  77. Rollout
    • Schedule with lead time for hard dates around datacenter move
    • Be ready to pause project if the two will collide
    • Start with non-production environments
    • Then focus on customer-facing website
    • The most value would be in speeding up that experience
    • Use production load balancer to control traffic served by PHP7
    • But first agree on monitoring plan
    • Make changes from “war room”
    • Let changes bake in
    • Overcommunicate plan to broader org
    • Not impacting others is different from not surprising them

  78. Your Systems Will Have In-Between States
    • Embrace simultaneous realities
    • Think in terms of backwards- and forwards-compatibility
    • Attachment to a single state is a form of risk
    • Detachment facilitates separating workflows
    Harold Pinter: “A thing is not necessarily either true or false; it can be both true and false.”

  80. Rollout, Continued
    • Confirm metrics
    • “Well, now my capacity planning is done for the year!”
    • Celebrate!
    • Continue slow roll to cover all website traffic
    • Begin slow roll on other services

  81. Rollout, Continued?
    • Project began on 30 October 2015
    • First test runs in March 2016
    • First production deploy planned for June 2016
    • Last blocking issues resolved in July 2016
    • Development and staging environments updated in July 2016
    • First production deploy in August 2016
    • All servers on PHP7 on 7 February 2017

  82. Test for Evaluating and Managing Risk
    • Identify risk by...
        • Understanding what you know
        • Admitting what you don’t know
        • Creating a shared “map” that your team can build on
        • Discussing concerns early and often
    • Tests should…
        • Tell you as much as possible
        • As soon as possible
        • About the biggest sources of risk
        • And be updated to cover what you learn along the way

  83. Test to Continuously Challenge Worldview
    • Developers shouldn't just think about whether something works
    • Something doesn't work, it works just once
    • We initially assumed metrics were wrong!

  84. What We Covered
    • Risk as “what you don’t know”
    • Identifying risk in complex applications
    • Common risk for infrastructure changes
    • Testing as risk management
    • Manual and automated tools
    • AKA software engineering as information gathering
    • Product management strategies
    • Identifying goals
    • Kicking off and running big projects
    • Choosing the most important thing to work on in service of that goal
    • AKA delivering value to the business

  85. Questions?
    Adam Baratz
