Testing Complex Applications for PHP7

Adam Baratz
February 17, 2017

Wayfair is one of the world's largest online destinations for the home. Our storefront is a very large PHP application -- 3.5M LoC interacting with a wide array of extensions -- that serves 2M daily visitors. So we were delighted when our upgrade to PHP7 went without a hitch. It worked so well because of a test plan that covered a wide range of strategies and tools. This case study will combine a walkthrough of this project with a practical tour of PHP testing tools, from PHPUnit to GDB.


Transcript

  1. Testing Complex Applications for PHP7

  2. What We Talk About When We Talk About Testing

  3. Test for Correctness?

  4. https://phpunit.de/getting-started.html

  5. Test for Correctness?

  6. Test for Correctness?

  7. Test for Correctness?

  8. Test for Correctness?

  9. Test for Correctness?

  10. Test for Correctness?

  11. Test for Correctness? puppet images via https://thenounproject.com/term/puppet/

  12. *yawn*

  13. Test for Correctness?

  14. Test for Evaluating and Managing Risk

  15. Wayfair and PHP7

  17. Wayfair and PHP7
    • Page execution time dropped by about 50%
    • CPU utilization dropped by about 30%
    For the details on why:
    https://nikic.github.io/2015/05/05/Internal-value-representation-in-PHP-7-part-1.html
    https://nikic.github.io/2015/06/19/Internal-value-representation-in-PHP-7-part-2.html
  20. Wayfair Before PHP7
    • 3.5M LoC over 28K files, mostly our own code, but some third-party
    • Coding conventions spanning several versions of PHP
    • 66 PHP extensions
      • Many officially supported
      • Some third-party
      • Some modified third-party
      • Some totally custom
  21. Risk Is What You Don’t Know

  23. Test for Evaluating and Managing Risk

  24. What We’ll Cover
    • Risk as “what you don’t know”
    • Identifying risk in complex applications
    • Common risk for infrastructure changes
    • Testing as risk management
    • Manual and automated tools
    • AKA software engineering as information gathering
    • Product management strategies
    • Identifying goals
    • Kicking off and running big projects
    • Choosing the most important thing to work on in service of that goal
    • AKA delivering value to the business
  25. Common Risk for Infrastructure Changes
    • Focused more on the desired outcome than on the path there
    • Divided attention among collaborators
    • Yours may not be the only such project in flight
    • Your systems will have in-between states
    • Something doesn’t work, it works once
    • You’ll live and die by your ability to monitor
    • Might not be worth it
    • Disaster is more likely than success
  26. Managing Common Risk
    • Always communicate in terms of value to the business
    • “Performance saves money,” not “spaceship operators!”
    • Don’t promise too much, too soon
    • Be in touch with broader org to understand potential collisions
    • Work the room
    • Share successes
    • Reinforce idea that project is headed in the right direction
    • Identify allies at different levels of the org
    • Harvest byproducts, opportunistically
  27. Planning
    • Never expect to account for everything
    • Map out the moving parts and dependencies
    • Distinguish the knowns from the unknowns
    • Guide future thinking
  28. Planning Tools
    • phpinfo() / php -i
    • Documentation
      • https://github.com/php/php-src/blob/PHP-7.0.0/UPGRADING
      • http://php.net/manual/en/migration70.incompatible.php
      • https://wiki.php.net/rfc/remove_deprecated_functionality_in_php7
      • https://github.com/gophp7/gophp7-ext/wiki/extensions-catalog
  30. Todo (30 October 2015)
    • Investigate apc extension.
    • Investigate cgi-fcgi extension.
    • Investigate whether bug with dba extension will affect us.
    • Remove dependency on ereg extension. Find repo references via /\b(ereg_replace|ereg|eregi_replace|eregi|split|spliti|sql_regcase)\(/. As of Oct 30, there were references to ereg() and split().
    • Update ketama extension.
    • Investigate memcache extension.
    • Remove mhash from puppet config. The only places in the php repo which could use it favor using the hash extension.
    • Investigate status of update to mongo extension.
    • Remove dependency on mssql extension.
    • Remove dependency on mysql extension.
    • Investigate whether open bugs with openssl extension will affect us.
    • Merge our PDO patches into master. It's not clear how much pdo_dblib has been tested by core maintainers, so we should test closely. Alternatively, SQL Relay may have supplanted pdo_dblib by the time we're ready to move forward with PHP7.
    • Investigate Zend OPcache extension.
    • Approve and commit compatibility updates for wfstring.
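The ereg search in the todo list can be sketched as a small PHP scan. This is an illustration only: the sample lines are hypothetical stand-ins for repo code, and the pattern is the one from the slide with its final parenthesis escaped so that it compiles and matches a function call.

```php
<?php
// Pattern from the todo list; the trailing "\(" restricts matches to calls.
$pattern = '/\b(ereg_replace|ereg|eregi_replace|eregi|split|spliti|sql_regcase)\(/';

// Hypothetical source lines standing in for a file from the repo.
$lines = [
    'if (ereg("^[0-9]+$", $id)) {',
    '    $parts = split(",", $csv);',
    '    $safe = preg_split("/,/", $csv);',
    '    $words = str_split($word);',
    '}',
];

// Collect [line number, function name] pairs for every removed function call.
$hits = [];
foreach ($lines as $i => $line) {
    if (preg_match_all($pattern, $line, $m)) {
        foreach ($m[1] as $name) {
            $hits[] = [$i + 1, $name];
        }
    }
}

foreach ($hits as [$lineNo, $name]) {
    echo "line {$lineNo}: {$name}()\n";
}
```

The `\b` anchor is what keeps `preg_split()` and `str_split()` out of the results: an underscore counts as a word character, so there is no word boundary before `split` in those names.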
  31. Don’t Prematurely Discard Risk
    • Start simple, start naïve
    • Don’t assume anything until you do the research
    • Start big, start placeholder
    • You can reduce “Investigate apc extension” later
    • Start by overexplaining
    • Link to sources in case anyone needs to work backwards
  32. Prioritizing: ereg vs. mssql
    ereg
    • 7 functions
    • One-to-one replacements essentially in place via PCRE and string functions
    • No more than a dozen references, each of which could be updated easily
    mssql
    • 30 functions
    • Website would be largely unusable without DB access
    • We already started transitioning to PDO
    • Some functionality wasn’t replicated
    • Some behaviors were subtly different
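As a rough illustration of the "one-to-one replacements" for ereg, the sketch below pairs each removed call (shown in a comment) with a PCRE or string-function equivalent. The variable names and patterns are hypothetical, and POSIX and PCRE syntax differ in some corners, so each real replacement would still need checking.

```php
<?php
// Hypothetical input values, for illustration only.
$sku = 'AB1234';
$csv = 'sofa,lamp,rug';

// Before (removed in PHP 7): ereg('^[A-Z]{2}[0-9]+$', $sku)
$isSku = preg_match('/^[A-Z]{2}[0-9]+$/', $sku) === 1;

// Before: eregi('^ab', $sku); case-insensitivity becomes the /i modifier.
$startsWithAb = preg_match('/^ab/i', $sku) === 1;

// Before: ereg_replace('[0-9]+', '#', $sku)
$masked = preg_replace('/[0-9]+/', '#', $sku);

// Before: split(',', $csv); a literal delimiter needs no regex at all.
$items = explode(',', $csv);

var_dump($isSku, $startsWithAb, $masked, $items);
```

When the old `split()` call used a real regular expression rather than a literal delimiter, `preg_split()` would be the closer match than `explode()`.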
  33. Focus First On The Biggest Risk
    • Choose the big, hard, unknown thing
    • That’s where all the risk lives
    • i.e., everything that could derail your project
    • Get the clearest view of project scope, as soon as possible
    • Guide further future thinking
  35. Todo (30 October 2015)
    • Investigate apc extension.
    • Investigate cgi-fcgi extension.
    • Investigate whether bug with dba extension will affect us.
    • Remove dependency on ereg extension. Find repo references via /\b(ereg_replace|ereg|eregi_replace|eregi|split|spliti|sql_regcase)\(/. As of Oct 30, there were references to ereg() and split().
    • Update ketama extension.
    • Investigate memcache extension.
    • Remove mhash from puppet config. The only places in the php repo which could use it favor using the hash extension.
    • Investigate status of update to mongo extension.
    • Remove dependency on mssql extension.
    • Remove dependency on mysql extension.
    • Investigate whether open bugs with openssl extension will affect us.
    • Merge our PDO patches into master. It's not clear how much pdo_dblib has been tested by core maintainers, so we should test closely. Alternatively, SQL Relay may have supplanted pdo_dblib by the time we're ready to move forward with PHP7.
    • Investigate Zend OPcache extension.
    • Approve and commit compatibility updates for wfstring.
  36. Planning Fact Sheet
    Engineering Time: None
    Execution Time: Short
    Actionability: Varies
    Followup Time: Varies, but can be highly parallelizable
  37. Think Diminishing Risk, Not Deadlines
    • Use dates to say when something should be started, not finished
    • Calendar dates are a Procrustean bed
    • Opportunity for team to feel they messed up
    • Opportunity for stakeholders to dangle a knife
    • Team should focus on solving problems and delivering value
    • Stakeholders should know your pace through regular conversation
    • You want them to be partners in the process, not adversaries
    • Talk in rough terms: days, weeks, months
    • Talk in ranges
  38. Always Be Planning
    Continuously…
    • Review task list (single source of truth)
    • Add and reorder items
    • Add information as it’s obtained
    • Be specific, cite sources
    • Test certainty of previously made assertions
    • Ensure format of “map” exposes context and risk to group
    • Facilitate active participation across group
  39. Changes to variable handling (PHP7)
    Indirect variable, property and method references are now interpreted with left-to-right semantics.
      $$foo['bar']['baz']   // interpreted as ($$foo)['bar']['baz']
      $foo->$bar['baz']     // interpreted as ($foo->$bar)['baz']
      $foo->$bar['baz']()   // interpreted as ($foo->$bar)['baz']()
      Foo::$bar['baz']()    // interpreted as (Foo::$bar)['baz']()
    To restore the previous behavior add explicit curly braces:
      ${$foo['bar']['baz']}
      $foo->{$bar['baz']}
      $foo->{$bar['baz']}()
      Foo::{$bar['baz']}()
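A minimal sketch of the property case, assuming PHP 7 or later. The object, its property names, and the values are hypothetical and exist only to make the two readings visibly different.

```php
<?php
// Hypothetical object with two properties, chosen only for illustration.
$foo = new stdClass();
$foo->name = ['baz' => 'left-to-right'];
$foo->inner = 'explicit braces';

// PHP 7 reads this as ($foo->$bar)['baz']:
// fetch the property called "name", then index the resulting array.
$bar = 'name';
$leftToRight = $foo->$bar['baz'];

// Explicit braces restore the PHP 5 reading:
// evaluate $bar['baz'] first, then use the result as the property name.
$bar = ['baz' => 'inner'];
$braced = $foo->{$bar['baz']};

echo $leftToRight, "\n";
echo $braced, "\n";
```

Writing the braces explicitly gives code that behaves the same on PHP 5 and PHP 7, which matters while both versions are serving traffic.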
  40. Static Analysis
    • grep (e.g., for deprecated functions)
    • php -l (linter)
    • php7mar (detect potential compatibility issues)
    • Phan (AST)
    For many more: https://github.com/exakat/php-static-analysis-tools
  41. Sample php7mar output
    Scanning testcases.php
    Including file extensions: php
    Processed 148 lines contained in 1 files.
    Processing took 0.048094034194946 seconds.
    # critical
    #### testcases.php
    * variableInterpolation
     * Line 2: `$$foo['bar']['baz']; //Interpreted as ($$foo)['bar']['baz']`
     * Line 3: `$foo->$bar['baz']; //Interpreted as ($foo->$bar)['baz']`
     * Line 4: `$foo->$bar['baz'](); //Interpreted as ($foo->$bar)['baz']()`
     * Line 5: `Foo::$bar['baz'](); //Interpreted as (Foo::$bar)['baz']()`
     * Line 6: `global $$foo->bar; //The global keyword now only accepts simple variables.`
    * duplicateFunctionParameter
     * Line 15: `function foo($a, $b, $unused, $unused) { /*...*/ }`
    * reservedNames
    ...
  42. Evaluating Static Analysis Tools
    • Start with “weakest” analysis
      • https://github.com/etsy/phan/wiki/Tutorial-for-Analyzing-a-Large-Sloppy-Code-Base
    • Spot check output
    • Consider…
      • Output volume
      • Output value
      • Ownership cost (ad hoc runs or someone’s ongoing responsibility?)
  43. Static Analysis Fact Sheet
    Engineering Time: Short
    Execution Time: Short (consider as CI step)
    Actionability: Generates concrete punch list, but there could be false positives
    Followup Time: Varies, but can be highly parallelizable
  44. Finding the Next Biggest Risk
    • Be alert to momentum
    • Momentum implies reduced risk
    • Reduced risk implies the project is winding down
    • Managing execution is different from managing the project
    • Even if there’s a lot of execution to manage
    • Don’t confuse tasks with objectives
    • Invert any underlying assumptions
    • In this case: when can we start running code?
    • Look for simplest opportunity to start sizing next biggest risk
  45. Prioritizing: when can we start running code?
    What are all the ways we can run code?
    • Web requests (php-fpm)
    • Batch processes (i.e., php -f)
    • Automated tests
      • Unit tests
      • Integration tests
      • Acceptance tests (browser tests)
  46. From: PHP7 Engineer
    Sent: Mar 16, 2016
    To: PHP7 Team; Chief Architect
    Subject: Exciting PHP7 news
    Exciting proof of something everyone already assumes – Adam and I ran PHP5.6 and PHP7 head to head, and the results are that PHP7 is significantly faster than PHP5.6.
    Methodology
    We compared running times while unit testing feature_detect_test.php (for those who are unfamiliar, this is a regex-heavy set of tests that is notoriously long-running). PHP5.6 completed the tests in an average of 14.37 seconds, whereas PHP7 took an average of 2.9 seconds.
    Conclusions
    This is primarily a proof-of-concept while working in a vacuum, but it is promising that we are heading in the right direction.
    Other Observations
    PHP7 used up significantly more memory, as reported by PHPUnit (10MB vs 5.75MB). The initial iteration of PHP5.6 was quite the outlier – removing that reduces the average to about 13.44 seconds.
    Next Steps
    Our next project is to get PHP7 to render actual pages on our site – so far this has been throwing 502s while the server segfaults.
    Steps we had to take to get the unit tests to run:
    • Disable codeception + dependencies
    • Manually build DOM + ctype extensions
    • Update MPDF to latest development branch
    Below you’ll find a few charts outlining the runtimes of each individual iteration. […]
  47. Learning From Tests
    • Accept small victories
    • Mostly not working is okay
    • Identify components that are not working the most
    • Rearticulate next most important goal
    • AKA Always Be Planning
  48. Managing Common Risk
    • Always communicate in terms of value to the business
    • “Performance saves money,” not “spaceship operators!”
    • Don’t promise too much, too soon
    • Be in touch with broader org to understand potential collisions
    • Work the room
    • Share successes
    • Reinforce idea that project is headed in the right direction
    • Identify allies at different levels of the org
    • Harvest byproducts, opportunistically
  49. Automated Testing
    • Unit tests
    • Acceptance tests (browser tests)
    • Integration tests
    Most automated tests can be built on the same framework (PHPUnit).
  50. https://phpunit.de/getting-started.html

  51. Sample PHPUnit output
    PHPUnit 6.0.0 by Sebastian Bergmann and contributors.
    ...F
    Time: 0 seconds, Memory: 5.75Mb
    There was 1 failure:
    1) DataTest::testAdd with data set #3 (1, 1, 3)
    Failed asserting that 2 matches expected 3.
    /home/sb/DataTest.php:9
    FAILURES!
    Tests: 4, Assertions: 4, Failures: 1.
  52. Automated Testing Fact Sheet
    Engineering Time: Moderate – to create
    Execution Time: Short
    Actionability: Failures require some interpretation by engineers, but frameworks usually let you annotate them in test output
    Followup Time: Moderate to extensive – all engineers should monitor tests at build time and contribute to ongoing maintenance
  53. Test for Evaluating and Managing Risk
    • Identify risk by...
      • Understanding what you know
      • Admitting what you don’t know
      • Creating a shared “map” that your team can build on
      • Discussing concerns early and often
    • Tests should…
      • Tell you as much as possible
      • As soon as possible
      • About the biggest sources of risk
      • And be updated to cover what you learn along the way
  54. Are we good?

  55. Rollout
    • Project began on 30 October 2015
    • First test runs in March 2016
    • First production deploy planned for June 2016
  56. Fatal error: Allowed memory size of 536870912 bytes exhausted (tried to allocate 140729445144864 bytes) in ...
  57. Identify Vague Hypotheses
    • Start with biggest components of system
    • Rank by least implausible
    • Identify means of validating hypotheses
    • Seek consistent reproducibility
  58. Extension Memory Management?
    • Type mismatches involving size_t
      • Side effect of incomplete changes for updated C APIs
      • Reread code to identify misses
    • Invalid read/writes that Valgrind would catch
      • Had previously tested extensions
      • Would suggest lacking test coverage
    • Shared memory corruption (very vague)
      • Had previously seen issues with PCRE JIT
      • Would have to reproduce error state with GDB
  59. Use Automated Tests To Find Memory Issues
    • Use php7dev vagrant box (or similar) for controlled environment
    • Use Gcov to evaluate test coverage
    • Create additional tests as needed
    • Run all tests with Valgrind
  60. Automated Testing For Extensions
    • Use make test or php run-tests.php
    • Failures recorded in test runner output
    • Failed tests will produce artifacts
    • Use .sh artifact to rerun test more easily
    • Or to run test with GDB
    • Tools documented at https://qa.php.net
    • Read run-tests.php to fill in any gaps
  61. Example .phpt Test
    --TEST--
    Mustache::parse() member function
    --SKIPIF--
    <?php if( !extension_loaded('mustache') ) die('skip '); ?>
    --FILE--
    <?php
    $m = new Mustache();
    $tmpl = $m->parse('{{test}}');
    var_dump(get_class($tmpl));
    ?>
    --EXPECT--
    string(11) "MustacheAST"
  62. Example Test Runner Output
    =====================================================================
    PHP         : /home/vagrant/php-src/sapi/cli/php
    PHP_SAPI    : cli
    PHP_VERSION : 7.2.0-dev
    ZEND_VERSION: 3.2.0-dev
    PHP_OS      : Linux - Linux php7dev 3.2.0-4-amd64 #1 SMP Debian 3.2.68-1+deb7u1 x86_64
    INI actual  : /home/vagrant/php-src/tmp-php.ini
    More .INIs  :
    ---------------------------------------------------------------------
    PHP         : /home/vagrant/php-src/sapi/phpdbg/phpdbg
    PHP_SAPI    : phpdbg
    PHP_VERSION : 7.2.0-dev
    ZEND_VERSION: 3.2.0-dev
    PHP_OS      : Linux - Linux php7dev 3.2.0-4-amd64 #1 SMP Debian 3.2.68-1+deb7u1 x86_64
    INI actual  : /home/vagrant/php-src/tmp-php.ini
    More .INIs  :
    ---------------------------------------------------------------------
    CWD         : /home/vagrant/php-src
    Extra dirs  :
    VALGRIND    : Not used
    =====================================================================
    Running selected tests.
    PASS Mustache::parse() member function [001.phpt]
    =====================================================================
    Number of tests :    1                 1
    Tests skipped   :    0 (  0.0%) --------
    Tests warned    :    0 (  0.0%) (  0.0%)
    Tests failed    :    0 (  0.0%) (  0.0%)
    Expected fail   :    0 (  0.0%) (  0.0%)
    Tests passed    :    1 (100.0%) (100.0%)
    ---------------------------------------------------------------------
    Time taken      :    0 seconds
    =====================================================================
  63. Example With Memory Leak
    [...]
    ---------------------------------------------------------------------
    CWD         : /home/vagrant/php-src
    Extra dirs  :
    VALGRIND    : valgrind-3.10.0
    =====================================================================
    Running selected tests.
    LEAK Mustache::parse() member function [001.phpt]
    =====================================================================
    Number of tests :    1                 1
    Tests skipped   :    0 (  0.0%) --------
    Tests warned    :    0 (  0.0%) (  0.0%)
    Tests failed    :    0 (  0.0%) (  0.0%)
    Expected fail   :    0 (  0.0%) (  0.0%)
    Tests leaked    :    1 (100.0%) (100.0%)
    Tests passed    :    0 (  0.0%) (  0.0%)
    ---------------------------------------------------------------------
    Time taken      :    4 seconds
    =====================================================================
    =====================================================================
    LEAKED TEST SUMMARY
    ---------------------------------------------------------------------
    Mustache::parse() member function [001.phpt]
    =====================================================================
  64. Example Valgrind Output (001.mem)
    ==13733== Invalid write of size 4
    ==13733==    at 0xF849E55: zim_Mustache_parse (mustache_mustache.cpp:850)
    ==13733==    by 0x91ED13: ZEND_DO_FCALL_SPEC_RETVAL_USED_HANDLER (zend_vm_execute.h:1097)
    ==13733==    by 0x8C60DA: execute_ex (zend_vm_execute.h:429)
    ==13733==    by 0x92115F: zend_execute (zend_vm_execute.h:474)
    ==13733==    by 0x87F7E3: zend_execute_scripts (zend.c:1474)
    ==13733==    by 0x81EE9F: php_execute_script (main.c:2533)
    ==13733==    by 0x9232E9: do_cli (php_cli.c:990)
    ==13733==    by 0x44E7BB: main (php_cli.c:1378)
  65. Use Gcov To Evaluate Test Coverage
    1. phpize
    2. ./configure CFLAGS="--coverage" CXXFLAGS="--coverage" LDFLAGS="--coverage" # may vary slightly for your extension
    3. make clean all
    4. lcov --directory . --zerocounters # cleanup previous run, if any
    5. # run tests
    6. lcov --directory . --capture --output-file coverage.info
    7. genhtml --output-directory lcov_html coverage.info
    For php-src, use existing build targets: https://wiki.php.net/doc/articles/writing-tests
  67. Capture Error State In GDB
    • Find simplest means to reproduce
      • Ideally script: gdb --args php -f [.php file]
      • Or core file: gdb php-fpm [core file]
      • But maybe HTTP request (and live process): gdb php-fpm [pid]
    • But first you’ll need a debug build
      • May behave differently!
    • Script GDB to replay debugging steps
    Excellent tutorial: http://www.unknownroad.com/rtfm/gdbtut/
  68. The Culprit: APCu
    • Invalid write to memory region containing string size
    • Solved problem with latest extension version
  69. What We Could’ve Tried
    • More thorough review of known issues with third-party extensions
    • Better scrutiny of extensions that operated on shared memory
    • Had already disabled PCRE JIT
    • Assumed APCu extension was “safe” because we didn’t write it
    • Ensure we’d deployed latest version of all extensions
    • Strived earlier for better sense of “correctness”?
  70. Replay Testing
    • Sample access logs
    • Feed to cURL
    • Examine logs
    • Consider combining with stress testing (ApacheBench)
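The replay steps above could be sketched roughly as follows, assuming PHP 7.1 or later with the curl extension and access logs in Common or Combined Log Format. The function names and the target host are hypothetical; this is not the tooling used at Wayfair.

```php
<?php
// Extract the request path from one access-log line.
// Returns null for lines that are not GET requests or do not parse.
function pathFromLogLine(string $line): ?string
{
    if (preg_match('/"GET (\S+) HTTP\/[\d.]+"/', $line, $m)) {
        return $m[1];
    }
    return null;
}

// Keep every Nth line so the replay is a sample rather than the full log.
function sampleLines(array $lines, int $every): array
{
    $sampled = [];
    foreach (array_values($lines) as $i => $line) {
        if ($i % $every === 0) {
            $sampled[] = $line;
        }
    }
    return $sampled;
}

// Replay one path against a target host and return the HTTP status code.
// $baseUrl is a hypothetical PHP7 test host, e.g. "http://php7-test.example".
function replay(string $baseUrl, string $path): int
{
    $ch = curl_init(rtrim($baseUrl, '/') . $path);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    curl_exec($ch);
    return (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
}
```

A driver script would read the log with `file()`, pass it through `sampleLines()` and `pathFromLogLine()`, call `replay()` for each path, and then compare the returned status codes and the server's error logs against the PHP 5.6 baseline. Only GET requests are replayed here because they are usually safe to repeat.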
  71. Working Through Unknown Unknowns
    • Identify vague hypotheses
    • Break into smaller pieces
    • Overdocument, overcommunicate
    • Look for anything you recognize in the noise
    • Rotate through problem solving strategies
    • Pair problem solving
    • Keep multiple irons in the fire
  72. Managing Expectations Around Unknowns
    • Be sensitive to stress and frustration
    • Reframe around thrill of solving big problems
    • Call out indications that the problem is getting resolved
    • Be able to identify when you have more information than you used to
    • Keep working the room
    • Always be ready to pull the plug
  73. Fringe Benefits
    • Expand scope of what’s “known”
    • Develop new skills and tools
    • New tests?
    • New monitoring opportunities?
    • New things to automate?
    • Engineers like solving big problems!
  74. System Testing Fact Sheet
    Engineering Time: Extensive
    Execution Time: Extensive
    Actionability: Varies, but often low
    Followup Time: Extensive
  75. Are we good?...

  76. Rollout Considerations
    • Datacenter move was in-flight simultaneously
    • Testing focused on customer-facing website
    • No one wants to impact revenue
  77. Rollout
    • Schedule with lead time for hard dates around datacenter move
    • Be ready to pause project if the two will collide
    • Start with non-production environments
    • Then focus on customer-facing website
    • The most value would be in speeding up that experience
    • Use production load balancer to control traffic served by PHP7
    • But first agree on monitoring plan
    • Make changes from “war room”
    • Let changes bake in
    • Overcommunicate plan to broader org
    • Not impacting others is different from not surprising them
  78. Your Systems Will Have In-Between States
    • Embrace simultaneous realities
    • Think in terms of backwards- and forwards-compatibility
    • Attachment to a single state is a form of risk
    • Detachment facilitates separating workflows
    Harold Pinter: “A thing is not necessarily either true or false; it can be both true and false.”
  80. Rollout, Continued
    • Confirm metrics
    • “Well, now my capacity planning is done for the year!”
    • Celebrate!
    • Continue slow roll to cover all website traffic
    • Begin slow roll on other services
  81. Rollout, Continued?
    • Project began on 30 October 2015
    • First test runs in March 2016
    • First production deploy planned for June 2016
    • Last blocking issues resolved in July 2016
    • Development and staging environments updated in July 2016
    • First production deploy in August 2016
    • All servers on PHP7 on 7 February 2017
  82. Test for Evaluating and Managing Risk
    • Identify risk by...
      • Understanding what you know
      • Admitting what you don’t know
      • Creating a shared “map” that your team can build on
      • Discussing concerns early and often
    • Tests should…
      • Tell you as much as possible
      • As soon as possible
      • About the biggest sources of risk
      • And be updated to cover what you learn along the way
  83. Test to Continuously Challenge Worldview
    • Developers shouldn't just think about whether something works
    • Something doesn't work, it works just once
    • We initially assumed metrics were wrong!
  84. What We Covered
    • Risk as “what you don’t know”
    • Identifying risk in complex applications
    • Common risk for infrastructure changes
    • Testing as risk management
    • Manual and automated tools
    • AKA software engineering as information gathering
    • Product management strategies
    • Identifying goals
    • Kicking off and running big projects
    • Choosing the most important thing to work on in service of that goal
    • AKA delivering value to the business
  85. Questions? Adam Baratz abaratz@wayfair.com