Testing Complex Applications for PHP7

What We Talk About When We Talk About Testing

Test for Correctness?

https://phpunit.de/getting-started.html

Test for Correctness? puppet images via https://thenounproject.com/term/puppet/

*yawn*

Test for Evaluating and Managing Risk

Wayfair and PHP7

Wayfair and PHP7
• Page execution time dropped by about 50%
• CPU utilization dropped by about 30%
For the details on why:
https://nikic.github.io/2015/05/05/Internal-value-representation-in-PHP-7-part-1.html
https://nikic.github.io/2015/06/19/Internal-value-representation-in-PHP-7-part-2.html

Wayfair Before PHP7 • 3.5M LoC over 28K files, mostly
our own code, but some third-party • Coding conventions spanning several versions of PHP • 66 PHP extensions • Many officially supported • Some third-party • Some modified third-party • Some totally custom

Risk Is What You Don’t Know

Test for Evaluating and Managing Risk

What We’ll Cover • Risk as “what you don’t know”
• Identifying risk in complex applications • Common risk for infrastructure changes • Testing as risk management • Manual and automated tools • AKA software engineering as information gathering • Product management strategies • Identifying goals • Kicking off and running big projects • Choosing the most important thing to work on in service of that goal • AKA delivering value to the business

Common Risk for Infrastructure Changes • Focused more on the
desired outcome than on the path there • Divided attention among collaborators • Yours may not be the only such project in flight • Your systems will have in-between states • Something doesn’t work, it works once • You’ll live and die by your ability to monitor • Might not be worth it • Disaster is more likely than success

Managing Common Risk • Always communicate in terms of value
to the business • “Performance saves money,” not “spaceship operators!” • Don’t promise too much, too soon • Be in touch with broader org to understand potential collisions • Work the room • Share successes • Reinforce idea that project is headed in the right direction • Identify allies at different levels of the org • Harvest byproducts, opportunistically

Planning • Never expect to account for everything • Map
out the moving parts and dependencies • Distinguish the knowns from the unknowns • Guide future thinking

Planning Tools
• phpinfo() / php -i
• Documentation
  • https://github.com/php/php-src/blob/PHP-7.0.0/UPGRADING
  • http://php.net/manual/en/migration70.incompatible.php
  • https://wiki.php.net/rfc/remove_deprecated_functionality_in_php7
  • https://github.com/gophp7/gophp7-ext/wiki/extensions-catalog
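The inventory that phpinfo() and php -i provide can also be captured from a script, which makes it easier to diff the extension list between environments. This is a minimal sketch; the function name extensionInventory() is hypothetical, and it only uses get_loaded_extensions() and phpversion().

```php
<?php
// Hypothetical helper: map each loaded extension to its reported version.
function extensionInventory(): array
{
    $inventory = [];
    foreach (get_loaded_extensions() as $name) {
        // phpversion() returns false when an extension reports no version.
        $version = phpversion($name);
        $inventory[$name] = $version === false ? 'unknown' : $version;
    }
    ksort($inventory, SORT_STRING | SORT_FLAG_CASE);
    return $inventory;
}

foreach (extensionInventory() as $name => $version) {
    printf("%-20s %s\n", $name, $version);
}
```

Running this under each SAPI (CLI and php-fpm) matters, since they may load different ini files and therefore different extensions.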

Todo (30 October 2015)
• Investigate apc extension.
• Investigate cgi-fcgi extension.
• Investigate whether bug with dba extension will affect us.
• Remove dependency on ereg extension. Find repo references via /\b(ereg_replace|ereg|eregi_replace|eregi|split|spliti|sql_regcase)\(/. As of Oct 30, there were references to ereg() and split().
• Update ketama extension.
• Investigate memcache extension.
• Remove mhash from puppet config. The only places in the php repo which could use it favor using the hash extension.
• Investigate status of update to mongo extension.
• Remove dependency on mssql extension.
• Remove dependency on mysql extension.
• Investigate whether open bugs with openssl extension will affect us.
• Merge our PDO patches into master. It's not clear how much pdo_dblib has been tested by core maintainers, so we should test closely. Alternatively, SQL Relay may have supplanted pdo_dblib by the time we're ready to move forward with PHP7.
• Investigate Zend OPcache extension.
• Approve and commit compatibility updates for wfstring.
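The ereg search in the list above can be scripted. This sketch applies the same pattern line by line, assuming the final parenthesis is meant to be escaped so that only call sites match. The constant and function names are hypothetical.

```php
<?php
// Pattern from the task list; the trailing "\(" (an assumption about the
// intended pattern) limits matches to function-call syntax.
const EREG_PATTERN = '/\b(ereg_replace|ereg|eregi_replace|eregi|split|spliti|sql_regcase)\(/';

// Hypothetical helper: report the line number and function name of each match.
function findEregReferences(string $source): array
{
    $hits = [];
    foreach (explode("\n", $source) as $index => $line) {
        if (preg_match_all(EREG_PATTERN, $line, $matches)) {
            foreach ($matches[1] as $name) {
                $hits[] = ['line' => $index + 1, 'function' => $name];
            }
        }
    }
    return $hits;
}
```

Because "_" counts as a word character, \b keeps preg_split() and str_split() from matching. A regex search still reports method calls such as $obj->split() and matches inside comments, so the results need a spot check.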

Don’t Prematurely Discard Risk • Start simple, start naïve •
Don’t assume anything until you do the research • Start big, start placeholder • You can reduce “Investigate apc extension” later • Start by overexplaining • Link to sources in case anyone needs to work backwards

Prioritizing: ereg vs. mssql
ereg
• 7 functions
• One-to-one replacements essentially in place via PCRE and string functions
• No more than a dozen references, each of which could be updated easily
mssql
• 30 functions
• Website would be largely unusable without DB access
• We already started transitioning to PDO
• Some functionality wasn’t replicated
• Some behaviors were subtly different
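To illustrate why the ereg work was the low-risk item, here is a sketch of the kind of one-to-one replacement available through PCRE and string functions. The function names and patterns are hypothetical examples, not taken from the Wayfair codebase.

```php
<?php
// PHP 5: ereg('^[0-9]{5}$', $zip)
function isFiveDigitZip(string $zip): bool
{
    return preg_match('/^[0-9]{5}$/', $zip) === 1;
}

// PHP 5: eregi('^(yes|no)$', $answer), a case-insensitive match
function isYesOrNo(string $answer): bool
{
    return preg_match('/^(yes|no)$/i', $answer) === 1;
}

// PHP 5: split(',', $csv) with a literal delimiter
function splitOnComma(string $csv): array
{
    return explode(',', $csv);
}

// PHP 5: split('[,;] *', $list) with a regex delimiter
function splitOnCommaOrSemicolon(string $list): array
{
    return preg_split('/[,;] */', $list);
}

// PHP 5: ereg_replace('[[:space:]]+', ' ', $text)
function collapseWhitespace(string $text): string
{
    return preg_replace('/[[:space:]]+/', ' ', $text);
}
```

PCRE patterns need delimiters, so any "/" inside an old POSIX pattern has to be escaped, and edge cases such as trailing newlines are worth checking before assuming identical behavior.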

Focus First On The Biggest Risk • Choose the big,
hard, unknown thing • That’s where all the risk lives • i.e., everything that could derail your project • Get the clearest view of project scope, as soon as possible • Guide further future thinking

Planning Fact Sheet
Engineering Time: None
Execution Time: Short
Actionability: Varies
Followup Time: Varies, but can be highly parallelizable

Think Diminishing Risk, Not Deadlines • Use dates to say
when something should be started, not finished • Calendar dates are a Procrustean bed • Opportunity for team to feel they messed up • Opportunity for stakeholders to dangle a knife • Team should focus on solving problems and delivering value • Stakeholders should know your pace through regular conversation • You want them to be partners in the process, not adversaries • Talk in rough terms: days, weeks, months • Talk in ranges

Always Be Planning Continuously… • Review task list (single source
of truth) • Add and reorder items • Add information as it’s obtained • Be specific, cite sources • Test certainty of previously made assertions • Ensure format of “map” exposes context and risk to group • Facilitate active participation across group

Changes to variable handling (PHP7)
Indirect variable, property and method references are now interpreted with left-to-right semantics.
    $$foo['bar']['baz']    // interpreted as ($$foo)['bar']['baz']
    $foo->$bar['baz']      // interpreted as ($foo->$bar)['baz']
    $foo->$bar['baz']()    // interpreted as ($foo->$bar)['baz']()
    Foo::$bar['baz']()     // interpreted as (Foo::$bar)['baz']()
To restore the previous behavior add explicit curly braces:
    ${$foo['bar']['baz']}
    $foo->{$bar['baz']}
    $foo->{$bar['baz']}()
    Foo::{$bar['baz']}()
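A runnable sketch of the difference, for PHP 7 or later. All variable names and values are made up for illustration.

```php
<?php
$settings = ['mode' => 'strict'];

// Left-to-right: ($$name)['mode'] resolves $settings first, then indexes it.
$name = 'settings';
$leftToRight = $$name['mode'];

// Explicit braces: $lookup['mode'] is evaluated first and used as the variable name.
$lookup = ['mode' => 'settings'];
$braced = ${$lookup['mode']};

$object = new stdClass();
$object->handlers = ['save' => 'from the handlers array'];
$object->persist = 'from the persist property';

// Left-to-right: ($object->$property)['save']
$property = 'handlers';
$viaProperty = $object->$property['save'];

// Explicit braces: $map['save'] names the property to read.
$map = ['save' => 'persist'];
$viaBraces = $object->{$map['save']};

var_dump($leftToRight, $braced, $viaProperty, $viaBraces);
```

Code that relied on the PHP 5 reading parses without error under PHP 7 but resolves to something different, which is why this class of change is easier to find with static analysis than by waiting for a failure.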

Static Analysis
• grep (e.g., for deprecated functions)
• php -l (linter)
• php7mar (detect potential compatibility issues)
• Phan (AST)
For many more: https://github.com/exakat/php-static-analysis-tools
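Between grep and a full AST tool there is a middle step: PHP's tokenizer. The sketch below is not how php7mar or Phan work. It is a hypothetical helper built on token_get_all() that finds calls to named functions while ignoring comments, string literals, method calls, and function declarations.

```php
<?php
// Hypothetical helper: find calls to the given (lowercase) function names.
function findRemovedFunctionCalls(string $source, array $names): array
{
    // Drop whitespace tokens so neighbouring tokens are easy to inspect.
    $tokens = array_values(array_filter(
        token_get_all($source),
        function ($token) {
            return !(is_array($token) && $token[0] === T_WHITESPACE);
        }
    ));

    // A name that follows one of these is a method call or a declaration.
    $skipAfter = [T_OBJECT_OPERATOR, T_DOUBLE_COLON, T_FUNCTION];
    $hits = [];

    foreach ($tokens as $i => $token) {
        if (!is_array($token) || $token[0] !== T_STRING) {
            continue;
        }
        $name = strtolower($token[1]);
        $next = $tokens[$i + 1] ?? null;
        $prev = $tokens[$i - 1] ?? null;
        if (!in_array($name, $names, true) || $next !== '(') {
            continue;
        }
        if (is_array($prev) && in_array($prev[0], $skipAfter, true)) {
            continue;
        }
        $hits[] = ['function' => $name, 'line' => $token[2]];
    }
    return $hits;
}
```

This trades a little engineering time for fewer false positives than a plain grep. It still cannot see calls made through variable function names or call_user_func().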

Sample php7mar output
    Scanning testcases.php
    Including file extensions: php
    Processed 148 lines contained in 1 files.
    Processing took 0.048094034194946 seconds.

    # critical
    #### testcases.php
    * variableInterpolation
     * Line 2: `$$foo['bar']['baz']; //Interpreted as ($$foo)['bar']['baz']`
     * Line 3: `$foo->$bar['baz']; //Interpreted as ($foo->$bar)['baz']`
     * Line 4: `$foo->$bar['baz'](); //Interpreted as ($foo->$bar)['baz']()`
     * Line 5: `Foo::$bar['baz'](); //Interpreted as (Foo::$bar)['baz']()`
     * Line 6: `global $$foo->bar; //The global keyword now only accepts simple variables.`
    * duplicateFunctionParameter
     * Line 15: `function foo($a, $b, $unused, $unused) { /*...*/ }`
    * reservedNames
    ...

Evaluating Static Analysis Tools
• Start with “weakest” analysis
  • https://github.com/etsy/phan/wiki/Tutorial-for-Analyzing-a-Large-Sloppy-Code-Base
• Spot check output
• Consider…
  • Output volume
  • Output value
  • Ownership cost (ad hoc runs or someone’s ongoing responsibility?)

Static Analysis Fact Sheet
Engineering Time: Short
Execution Time: Short (consider as CI step)
Actionability: Generates concrete punch list, but there could be false positives
Followup Time: Varies, but can be highly parallelizable

Finding the Next Biggest Risk • Be alert to momentum
• Momentum implies reduced risk • Reduced risk implies the project is winding down • Managing execution is different from managing the project • Even if there’s a lot of execution to manage • Don’t confuse tasks with objectives • Invert any underlying assumptions • In this case: when can we start running code? • Look for simplest opportunity to start sizing next biggest risk

Prioritizing: when can we start running code? What are all
the ways we can run code? • Web requests (php-fpm) • Batch processes (i.e., php -f) • Automated tests • Unit tests • Integration tests • Acceptance tests (browser tests)

From: PHP7 Engineer
Sent: Mar 16, 2016
To: PHP7 Team; Chief Architect
Subject: Exciting PHP7 news

Exciting proof of something everyone already assumes – Adam and I ran PHP5.6 and PHP7 head to head, and the results are that PHP7 is significantly faster than PHP5.6.

Methodology
We compared running times while unit testing feature_detect_test.php (for those who are unfamiliar, this is a regex-heavy set of tests that is notoriously long-running). PHP5.6 completed the tests in an average of 14.37 seconds, whereas PHP7 took an average of 2.9 seconds.

Conclusions
This is primarily a proof-of-concept while working in a vacuum, but it is promising that we are heading in the right direction.

Other Observations
PHP7 used up significantly more memory, as reported by PHPUnit (10MB vs 5.75MB). The initial iteration of PHP5.6 was quite the outlier – removing that reduces the average to about 13.44 seconds.

Next Steps
Our next project is to get PHP7 to render actual pages on our site – so far this has been throwing 502s while the server segfaults.

Steps we had to take to get the unit tests to run:
• Disable codeception + dependencies
• Manually build DOM + ctype extensions
• Update MPDF to latest development branch

Below you’ll find a few charts outlining the runtimes of each individual iteration. […]

Learning From Tests • Accept small victories • Mostly not
working is okay • Identify components that are not working the most • Rearticulate next most important goal • AKA Always Be Planning

Automated Testing • Unit tests • Acceptance tests (browser tests)
• Integration tests Most automated tests can be built on the same framework (PHPUnit).

https://phpunit.de/getting-started.html

Sample PHPUnit output
    PHPUnit 6.0.0 by Sebastian Bergmann and contributors.

    ...F

    Time: 0 seconds, Memory: 5.75Mb

    There was 1 failure:

    1) DataTest::testAdd with data set #3 (1, 1, 3)
    Failed asserting that 2 matches expected 3.

    /home/sb/DataTest.php:9

    FAILURES!
    Tests: 4, Assertions: 4, Failures: 1.
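The output above comes from a data-driven test: four data sets, the last of which expects 1 + 1 to equal 3. The sketch below shows the same idea without PHPUnit so that it runs on its own. It is not PHPUnit's API, and the function names are hypothetical.

```php
<?php
// Each row is [a, b, expected sum]; the last expectation is deliberately wrong.
function additionProvider(): array
{
    return [
        [0, 0, 0],
        [0, 1, 1],
        [1, 0, 1],
        [1, 1, 3],
    ];
}

// Hypothetical mini-runner: "." for a pass, "F" for a failure.
function runAdditionCases(array $cases): array
{
    $progress = '';
    $failures = [];
    foreach ($cases as $index => [$a, $b, $expected]) {
        $actual = $a + $b;
        if ($actual === $expected) {
            $progress .= '.';
            continue;
        }
        $progress .= 'F';
        $failures[] = sprintf(
            'data set #%d (%d, %d, %d): Failed asserting that %d matches expected %d.',
            $index, $a, $b, $expected, $actual, $expected
        );
    }
    return [$progress, $failures];
}

[$progress, $failures] = runAdditionCases(additionProvider());
echo $progress, "\n", implode("\n", $failures), "\n";
```

The failure message names the data set, which is what lets an engineer tell a wrong expectation from a real regression.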

Automated Testing Fact Sheet
Engineering Time: Moderate – to create
Execution Time: Short
Actionability: Failures require some interpretation by engineers, but frameworks usually let you annotate them in test output
Followup Time: Moderate to extensive – all engineers should monitor tests at build time and contribute to ongoing maintenance

Test for Evaluating and Managing Risk • Identify risk by...
• Understanding what you know • Admitting what you don’t know • Creating a shared “map” that your team can build on • Discussing concerns early and often • Tests should… • Tell you as much as possible • As soon as possible • About the biggest sources of risk • And be updated to cover what you learn along the way

Are we good?

Rollout • Project began on 30 October 2015 • First
test runs in March 2016 • First production deploy planned for June 2016

Fatal error: Allowed memory size of 536870912 bytes exhausted (tried to allocate 140729445144864 bytes) in ...
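A quick sanity check on the numbers in that message: the limit is 512 MiB, while the requested allocation is close to 128 TiB. Printed in hexadecimal, the requested size looks more like a 64-bit address than a length. Reading it as a sign of a corrupted size field is an interpretation, not something the error message states. The sketch assumes a 64-bit PHP build.

```php
<?php
$limitBytes = 536870912;           // "Allowed memory size" from the error
$requestedBytes = 140729445144864; // "tried to allocate" from the error

printf("limit:     %d MiB\n", intdiv($limitBytes, 1024 ** 2));
printf("requested: %.2f TiB\n", $requestedBytes / 1024 ** 4);
printf("requested: 0x%s\n", dechex($requestedBytes));
```

No page render legitimately needs terabytes, so the question shifts from "what is using so much memory?" to "what wrote garbage where a size was expected?"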

Identify Vague Hypotheses • Start with biggest components of system
• Rank by least implausible • Identify means of validating hypotheses • Seek consistent reproducibility

Extension Memory Management? • Type mismatches involving size_t • Side
effect of incomplete changes for updated C APIs • Reread code to identify misses • Invalid read/writes that Valgrind would catch • Had previously tested extensions • Would suggest lacking test coverage • Shared memory corruption (very vague) • Had previously seen issues with PCRE JIT • Would have to reproduce error state with GDB

Use Automated Tests To Find Memory Issues • Use php7dev
vagrant box (or similar) for controlled environment • Use Gcov to evaluate test coverage • Create additional tests as needed • Run all tests with Valgrind

Automated Testing For Extensions • Use make test or php
run-tests.php • Failures recorded in test runner output • Failed tests will produce artifacts • Use .sh artifact to rerun test more easily • Or to run test with GDB • Tools documented at https://qa.php.net • Read run-tests.php to fill in any gaps

Example .phpt Test
    --TEST--
    Mustache::parse() member function
    --SKIPIF--
    <?php
    if( !extension_loaded('mustache') ) die('skip ');
    ?>
    --FILE--
    <?php
    $m = new Mustache();
    $tmpl = $m->parse('{{test}}');
    var_dump(get_class($tmpl));
    ?>
    --EXPECT--
    string(11) "MustacheAST"

Example Test Runner Output
    =====================================================================
    PHP         : /home/vagrant/php-src/sapi/cli/php
    PHP_SAPI    : cli
    PHP_VERSION : 7.2.0-dev
    ZEND_VERSION: 3.2.0-dev
    PHP_OS      : Linux - Linux php7dev 3.2.0-4-amd64 #1 SMP Debian 3.2.68-1+deb7u1 x86_64
    INI actual  : /home/vagrant/php-src/tmp-php.ini
    More .INIs  :
    ---------------------------------------------------------------------
    PHP         : /home/vagrant/php-src/sapi/phpdbg/phpdbg
    PHP_SAPI    : phpdbg
    PHP_VERSION : 7.2.0-dev
    ZEND_VERSION: 3.2.0-dev
    PHP_OS      : Linux - Linux php7dev 3.2.0-4-amd64 #1 SMP Debian 3.2.68-1+deb7u1 x86_64
    INI actual  : /home/vagrant/php-src/tmp-php.ini
    More .INIs  :
    ---------------------------------------------------------------------
    CWD         : /home/vagrant/php-src
    Extra dirs  :
    VALGRIND    : Not used
    =====================================================================
    Running selected tests.
    PASS Mustache::parse() member function [001.phpt]
    =====================================================================
    Number of tests :    1                 1
    Tests skipped   :    0 (  0.0%) --------
    Tests warned    :    0 (  0.0%) (  0.0%)
    Tests failed    :    0 (  0.0%) (  0.0%)
    Expected fail   :    0 (  0.0%) (  0.0%)
    Tests passed    :    1 (100.0%) (100.0%)
    ---------------------------------------------------------------------
    Time taken      :    0 seconds
    =====================================================================

Example With Memory Leak
    [...]
    ---------------------------------------------------------------------
    CWD         : /home/vagrant/php-src
    Extra dirs  :
    VALGRIND    : valgrind-3.10.0
    =====================================================================
    Running selected tests.
    LEAK Mustache::parse() member function [001.phpt]
    =====================================================================
    Number of tests :    1                 1
    Tests skipped   :    0 (  0.0%) --------
    Tests warned    :    0 (  0.0%) (  0.0%)
    Tests failed    :    0 (  0.0%) (  0.0%)
    Expected fail   :    0 (  0.0%) (  0.0%)
    Tests leaked    :    1 (100.0%) (100.0%)
    Tests passed    :    0 (  0.0%) (  0.0%)
    ---------------------------------------------------------------------
    Time taken      :    4 seconds
    =====================================================================
    =====================================================================
    LEAKED TEST SUMMARY
    ---------------------------------------------------------------------
    Mustache::parse() member function [001.phpt]
    =====================================================================

Example Valgrind Output (001.mem)
    ==13733== Invalid write of size 4
    ==13733==    at 0xF849E55: zim_Mustache_parse (mustache_mustache.cpp:850)
    ==13733==    by 0x91ED13: ZEND_DO_FCALL_SPEC_RETVAL_USED_HANDLER (zend_vm_execute.h:1097)
    ==13733==    by 0x8C60DA: execute_ex (zend_vm_execute.h:429)
    ==13733==    by 0x92115F: zend_execute (zend_vm_execute.h:474)
    ==13733==    by 0x87F7E3: zend_execute_scripts (zend.c:1474)
    ==13733==    by 0x81EE9F: php_execute_script (main.c:2533)
    ==13733==    by 0x9232E9: do_cli (php_cli.c:990)
    ==13733==    by 0x44E7BB: main (php_cli.c:1378)

Use Gcov To Evaluate Test Coverage
1. phpize
2. ./configure CFLAGS="--coverage" CXXFLAGS="--coverage" LDFLAGS="--coverage" # may vary slightly for your extension
3. make clean all
4. lcov --directory . --zerocounters # cleanup previous run, if any
5. # run tests
6. lcov --directory . --capture --output-file coverage.info
7. genhtml --output-directory lcov_html coverage.info
For php-src, use existing build targets: https://wiki.php.net/doc/articles/writing-tests

Capture Error State In GDB
• Find simplest means to reproduce
  • Ideally script: gdb --args php -f [.php file]
  • Or core file: gdb php-fpm [core file]
  • But maybe HTTP request (and live process): gdb php-fpm [pid]
• But first you’ll need a debug build
  • May behave differently!
• Script GDB to replay debugging steps
Excellent tutorial: http://www.unknownroad.com/rtfm/gdbtut/

The Culprit: APCu • Invalid write to memory region containing
string size • Solved problem with latest extension version

What We Could’ve Tried • More thorough review of known
issues with third-party extensions • Better scrutiny of extensions that operated on shared memory • Had already disabled PCRE JIT • Assumed APCu extension was “safe” because we didn’t write it • Ensure we’d deployed latest version of all extensions • Strived earlier for better sense of “correctness”?

Replay Testing
• Sample access logs
• Feed to cURL
• Examine logs
• Consider combining with stress testing (ApacheBench)
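The first two steps can be sketched in a few lines. This assumes access logs in the common/combined format and the curl extension. The function names are hypothetical, and only GET requests are replayed because replaying writes against a real backend can have side effects.

```php
<?php
// Hypothetical helper: take every Nth GET request path from access log lines.
function sampleGetPaths(array $logLines, int $every = 10): array
{
    $paths = [];
    $seen = 0;
    foreach ($logLines as $line) {
        // Matches the request portion, e.g. "GET /some/path HTTP/1.1"
        if (!preg_match('/"GET (\S+) HTTP\/[\d.]+"/', $line, $match)) {
            continue;
        }
        if ($seen++ % $every === 0) {
            $paths[] = $match[1];
        }
    }
    return $paths;
}

// Hypothetical helper: replay paths against a test host, recording status codes.
// Requires the curl extension; $baseUrl should point at a non-production server.
function replayPaths(string $baseUrl, array $paths): array
{
    $statuses = [];
    foreach ($paths as $path) {
        $handle = curl_init($baseUrl . $path);
        curl_setopt($handle, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($handle, CURLOPT_TIMEOUT, 10);
        curl_exec($handle);
        $statuses[$path] = curl_getinfo($handle, CURLINFO_HTTP_CODE);
        unset($handle);
    }
    return $statuses;
}
```

The status codes are only a first pass. The "Examine logs" step still matters, because a page can return 200 while logging warnings or fatal errors for an included component.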

Working Through Unknown Unknowns • Identify vague hypotheses • Break
into smaller pieces • Overdocument, overcommunicate • Look for anything you recognize in the noise • Rotate through problem solving strategies • Pair problem solving • Keep multiple irons in the fire

Managing Expectations Around Unknowns • Be sensitive to stress and
frustration • Reframe around thrill of solving big problems • Call out indications that the problem is getting resolved • Be able to identify when you have more information than you used to • Keep working the room • Always be ready to pull the plug

Fringe Benefits • Expand scope of what’s “known” • Develop
new skills and tools • New tests? • New monitoring opportunities? • New things to automate? • Engineers like solving big problems!

System Testing Fact Sheet
Engineering Time: Extensive
Execution Time: Extensive
Actionability: Varies, but often low
Followup Time: Extensive

Are we good?...

Rollout Considerations • Datacenter move was in-flight simultaneously • Testing
focused on customer-facing website • No one wants to impact revenue

Rollout • Schedule with lead time for hard dates around
datacenter move • Be ready to pause project if the two will collide • Start with non-production environments • Then focus on customer-facing website • The most value would be in speeding up that experience • Use production load balancer to control traffic served by PHP7 • But first agree on monitoring plan • Make changes from “war room” • Let changes bake in • Overcommunicate plan to broader org • Not impacting others is different from not surprising them

Your Systems Will Have In-Between States • Embrace simultaneous realities
• Think in terms of backwards- and forwards-compatibility • Attachment to a single state is a form of risk • Detachment facilitates separating workflows Harold Pinter: “A thing is not necessarily either true or false; it can be both true and false.”

Rollout, Continued • Confirm metrics • “Well, now my capacity
planning is done for the year!” • Celebrate! • Continue slow roll to cover all website traffic • Begin slow roll on other services

Rollout, Continued?
• Project began on 30 October 2015
• First test runs in March 2016
• First production deploy planned for June 2016
• Last blocking issues resolved in July 2016
• Development and staging environments updated in July 2016
• First production deploy in August 2016
• All servers on PHP7 on 7 February 2017

Test to Continuously Challenge Worldview • Developers shouldn't just think
about whether something works • Something doesn't work, it works just once • We initially assumed metrics were wrong!

What We Covered • Risk as “what you don’t know”
• Identifying risk in complex applications • Common risk for infrastructure changes • Testing as risk management • Manual and automated tools • AKA software engineering as information gathering • Product management strategies • Identifying goals • Kicking off and running big projects • Choosing the most important thing to work on in service of that goal • AKA delivering value to the business

Questions? Adam Baratz
