Slide 1

Slide 1 text

Testing Complex Applications for PHP7

Slide 2

Slide 2 text

What We Talk About When We Talk About Testing

Slide 3

Slide 3 text

Test for Correctness?

Slide 4

Slide 4 text

https://phpunit.de/getting-started.html

Slide 5

Slide 5 text

Test for Correctness?

Slide 6

Slide 6 text

Test for Correctness?

Slide 7

Slide 7 text

Test for Correctness?

Slide 8

Slide 8 text

Test for Correctness?

Slide 9

Slide 9 text

Test for Correctness?

Slide 10

Slide 10 text

Test for Correctness?

Slide 11

Slide 11 text

Test for Correctness? puppet images via https://thenounproject.com/term/puppet/

Slide 12

Slide 12 text

*yawn*

Slide 13

Slide 13 text

Test for Correctness?

Slide 14

Slide 14 text

Test for Evaluating and Managing Risk

Slide 15

Slide 15 text

Wayfair and PHP7

Slide 16

Slide 16 text

No content

Slide 17

Slide 17 text

Wayfair and PHP7
• Page execution time dropped by about 50%
• CPU utilization dropped by about 30%
For the details on why:
https://nikic.github.io/2015/05/05/Internal-value-representation-in-PHP-7-part-1.html
https://nikic.github.io/2015/06/19/Internal-value-representation-in-PHP-7-part-2.html

Slide 18

Slide 18 text

No content

Slide 19

Slide 19 text

No content

Slide 20

Slide 20 text

Wayfair Before PHP7 • 3.5M LoC over 28K files, mostly our own code, but some third-party • Coding conventions spanning several versions of PHP • 66 PHP extensions • Many officially supported • Some third-party • Some modified third-party • Some totally custom

Slide 21

Slide 21 text

Risk Is What You Don’t Know

Slide 22

Slide 22 text

No content

Slide 23

Slide 23 text

Test for Evaluating and Managing Risk

Slide 24

Slide 24 text

What We’ll Cover • Risk as “what you don’t know” • Identifying risk in complex applications • Common risk for infrastructure changes • Testing as risk management • Manual and automated tools • AKA software engineering as information gathering • Product management strategies • Identifying goals • Kicking off and running big projects • Choosing the most important thing to work on in service of that goal • AKA delivering value to the business

Slide 25

Slide 25 text

Common Risk for Infrastructure Changes • Focused more on the desired outcome than on the path there • Divided attention among collaborators • Yours may not be the only such project in flight • Your systems will have in-between states • Something doesn’t work, it works once • You’ll live and die by your ability to monitor • Might not be worth it • Disaster is more likely than success

Slide 26

Slide 26 text

Managing Common Risk • Always communicate in terms of value to the business • “Performance saves money,” not “spaceship operators!” • Don’t promise too much, too soon • Be in touch with broader org to understand potential collisions • Work the room • Share successes • Reinforce idea that project is headed in the right direction • Identify allies at different levels of the org • Harvest byproducts, opportunistically

Slide 27

Slide 27 text

Planning • Never expect to account for everything • Map out the moving parts and dependencies • Distinguish the knowns from the unknowns • Guide future thinking

Slide 28

Slide 28 text

Planning Tools
• phpinfo() / php -i
• Documentation
• https://github.com/php/php-src/blob/PHP-7.0.0/UPGRADING
• http://php.net/manual/en/migration70.incompatible.php
• https://wiki.php.net/rfc/remove_deprecated_functionality_in_php7
• https://github.com/gophp7/gophp7-ext/wiki/extensions-catalog
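
The inventory step can also be scripted instead of reading phpinfo() output by eye. The sketch below is a hypothetical helper, not something from the talk: it compares the loaded extensions against the four extensions the PHP 7.0 migration guide lists as removed (ereg, mssql, mysql, sybase_ct). The function name and constant are assumptions for illustration, and return type declarations are left out so the script can still run on a PHP 5.6 box.

```php
<?php
// Extensions listed as removed in the PHP 7.0 migration guide.
const REMOVED_IN_PHP7 = ['ereg', 'mssql', 'mysql', 'sybase_ct'];

/**
 * Return the loaded extensions that PHP 7.0 no longer ships.
 *
 * @param string[] $loaded Extension names, e.g. from get_loaded_extensions().
 * @return string[]
 */
function extensionsAtRisk(array $loaded)
{
    // Extension names are reported in mixed case, so normalize first.
    $normalized = array_map('strtolower', $loaded);
    return array_values(array_intersect($normalized, REMOVED_IN_PHP7));
}

// Inventory the running interpreter; on PHP 7+ the result is always empty.
$atRisk = extensionsAtRisk(get_loaded_extensions());
echo $atRisk ? implode(', ', $atRisk) : 'none', PHP_EOL;
```

Running this on each server role gives a first, rough list of "Remove dependency on ..." items; third-party and custom extensions still need the manual research described here.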

Slide 29

Slide 29 text

No content

Slide 30

Slide 30 text

Todo (30 October 2015)
• Investigate apc extension.
• Investigate cgi-fcgi extension.
• Investigate whether bug with dba extension will affect us.
• Remove dependency on ereg extension. Find repo references via /\b(ereg_replace|ereg|eregi_replace|eregi|split|spliti|sql_regcase)\(/. As of Oct 30, there were references to ereg() and split().
• Update ketama extension.
• Investigate memcache extension.
• Remove mhash from puppet config. The only places in the php repo which could use it favor using the hash extension.
• Investigate status of update to mongo extension.
• Remove dependency on mssql extension.
• Remove dependency on mysql extension.
• Investigate whether open bugs with openssl extension will affect us.
• Merge our PDO patches into master. It's not clear how much pdo_dblib has been tested by core maintainers, so we should test closely. Alternatively, SQL Relay may have supplanted pdo_dblib by the time we're ready to move forward with PHP7.
• Investigate Zend OPcache extension.
• Approve and commit compatibility updates for wfstring.

Slide 31

Slide 31 text

Don’t Prematurely Discard Risk • Start simple, start naïve • Don’t assume anything until you do the research • Start big, start placeholder • You can reduce “Investigate apc extension” later • Start by overexplaining • Link to sources in case anyone needs to work backwards

Slide 32

Slide 32 text

Prioritizing: ereg vs. mssql
ereg
• 7 functions
• One-to-one replacements essentially in place via PCRE and string functions
• No more than a dozen references, each of which could be updated easily
mssql
• 30 functions
• Website would be largely unusable without DB access
• We already started transitioning to PDO
• Some functionality wasn’t replicated
• Some behaviors were subtly different
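
To show why the ereg side of this comparison is low risk, here is a minimal sketch of the one-to-one replacements. The function names and patterns are made up for illustration; only the mapping itself (ereg/eregi to preg_match, split to explode or preg_split) is the point.

```php
<?php
// Hypothetical before/after pairs for the ereg family (removed in PHP 7.0).

// Before: ereg('^[0-9]{5}$', $zip)
function isZip(string $zip): bool
{
    return preg_match('/^[0-9]{5}$/', $zip) === 1;
}

// Before: eregi('^wayfair', $host), which matched case-insensitively
function startsWithBrand(string $host): bool
{
    return preg_match('/^wayfair/i', $host) === 1;
}

// Before: split(',', $csv). The delimiter is a literal, so explode() is enough.
function splitCsv(string $csv): array
{
    return explode(',', $csv);
}

// Before: split('[,;] *', $list). The delimiter is a pattern, so use preg_split().
function splitList(string $list): array
{
    return preg_split('/[,;] */', $list);
}

var_dump(isZip('02116'), splitList('a, b;c'));
```

Each call site needs a delimiter added around the pattern and, for eregi, the i modifier. That is mechanical work, which is why the mssql removal was the bigger unknown.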

Slide 33

Slide 33 text

Focus First On The Biggest Risk • Choose the big, hard, unknown thing • That’s where all the risk lives • i.e., everything that could derail your project • Get the clearest view of project scope, as soon as possible • Guide further future thinking

Slide 34

Slide 34 text

No content

Slide 35

Slide 35 text

Todo (30 October 2015)
• Investigate apc extension.
• Investigate cgi-fcgi extension.
• Investigate whether bug with dba extension will affect us.
• Remove dependency on ereg extension. Find repo references via /\b(ereg_replace|ereg|eregi_replace|eregi|split|spliti|sql_regcase)\(/. As of Oct 30, there were references to ereg() and split().
• Update ketama extension.
• Investigate memcache extension.
• Remove mhash from puppet config. The only places in the php repo which could use it favor using the hash extension.
• Investigate status of update to mongo extension.
• Remove dependency on mssql extension.
• Remove dependency on mysql extension.
• Investigate whether open bugs with openssl extension will affect us.
• Merge our PDO patches into master. It's not clear how much pdo_dblib has been tested by core maintainers, so we should test closely. Alternatively, SQL Relay may have supplanted pdo_dblib by the time we're ready to move forward with PHP7.
• Investigate Zend OPcache extension.
• Approve and commit compatibility updates for wfstring.

Slide 36

Slide 36 text

Planning Fact Sheet
Engineering Time: None
Execution Time: Short
Actionability: Varies
Followup Time: Varies, but can be highly parallelizable

Slide 37

Slide 37 text

Think Diminishing Risk, Not Deadlines • Use dates to say when something should be started, not finished • Calendar dates are a Procrustean bed • Opportunity for team to feel they messed up • Opportunity for stakeholders to dangle a knife • Team should focus on solving problems and delivering value • Stakeholders should know your pace through regular conversation • You want them to be partners in the process, not adversaries • Talk in rough terms: days, weeks, months • Talk in ranges

Slide 38

Slide 38 text

Always Be Planning Continuously… • Review task list (single source of truth) • Add and reorder items • Add information as it’s obtained • Be specific, cite sources • Test certainty of previously made assertions • Ensure format of “map” exposes context and risk to group • Facilitate active participation across group

Slide 39

Slide 39 text

Changes to variable handling (PHP7)
Indirect variable, property and method references are now interpreted with left-to-right semantics.
$$foo['bar']['baz'] // interpreted as ($$foo)['bar']['baz']
$foo->$bar['baz'] // interpreted as ($foo->$bar)['baz']
$foo->$bar['baz']() // interpreted as ($foo->$bar)['baz']()
Foo::$bar['baz']() // interpreted as (Foo::$bar)['baz']()
To restore the previous behavior add explicit curly braces:
${$foo['bar']['baz']}
$foo->{$bar['baz']}
$foo->{$bar['baz']}()
Foo::{$bar['baz']}()
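
A small runnable sketch of the two readings, using made-up variables. Every expression below uses explicit braces, so it should evaluate the same way on PHP 5 and PHP 7. The comments say which bare expression each form corresponds to under which version.

```php
<?php
// Variable variables: resolve first, or index first?
$data = ['bar' => 'value from $data'];
$name = 'data';                 // holds the name of another variable
$lookup = ['bar' => 'name'];    // array whose element is a variable name

// PHP 7 reading of $$name['bar']: resolve $$name, then index the result.
$leftToRight = ${$name}['bar'];     // same as $data['bar']

// PHP 5 reading of $$lookup['bar']: index $lookup, then resolve that name.
$rightToLeft = ${$lookup['bar']};   // same as $name

// Dynamic property names follow the same rule.
$obj = new stdClass();
$obj->items = ['baz' => 'element of $obj->items'];
$obj->total = 'value of $obj->total';
$prop = 'items';
$props = ['baz' => 'total'];

$propThenIndex = $obj->{$prop}['baz'];   // PHP 7 reading of $obj->$prop['baz']
$indexThenProp = $obj->{$props['baz']};  // PHP 5 reading of $obj->$props['baz']

var_dump($leftToRight, $rightToLeft, $propThenIndex, $indexThenProp);
```

When fixing a flagged line, adding the braces that match the old PHP 5 reading keeps behavior unchanged across both versions, which matters while systems are in an in-between state.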

Slide 40

Slide 40 text

Static Analysis • grep (e.g., for deprecated functions) • php -l (linter) • php7mar (detect potential compatibility issues) • Phan (AST) For many more: https://github.com/exakat/php-static-analysis-tools
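
The grep step can be sketched in PHP as well. This is a hypothetical scanner, not one of the tools named above: it reuses the function list from the ereg todo item and reports line numbers for calls to functions that PHP 7 removed. Being purely textual, it will also flag method calls such as $obj->split(...) and matches inside strings or comments, so expect false positives.

```php
<?php
// Function names come from the ereg todo item; \s*\( requires a call-like use.
const REMOVED_CALLS = '/\b(ereg_replace|eregi_replace|eregi|ereg|spliti|split|sql_regcase)\s*\(/';

/**
 * Return one entry per suspicious call, with its 1-based line number.
 */
function findRemovedCalls(string $source): array
{
    $hits = [];
    foreach (explode("\n", $source) as $index => $line) {
        if (preg_match_all(REMOVED_CALLS, $line, $matches)) {
            foreach ($matches[1] as $name) {
                $hits[] = ['line' => $index + 1, 'function' => $name];
            }
        }
    }
    return $hits;
}

// Made-up sample input: preg_split() must not match, split() and ereg() must.
$sample = <<<'PHP'
$ok = preg_split('/,/', $s);
$bad = split(',', $s);
if (ereg('^a', $s)) {}
PHP;

$hits = findRemovedCalls($sample);
foreach ($hits as $hit) {
    echo "Line {$hit['line']}: {$hit['function']}()", PHP_EOL;
}
```

The \b anchor is what keeps preg_split() and str_split() out of the results, since an underscore counts as a word character.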

Slide 41

Slide 41 text

Sample php7mar output
Scanning testcases.php
Including file extensions: php
Processed 148 lines contained in 1 files.
Processing took 0.048094034194946 seconds.
# critical
#### testcases.php
* variableInterpolation
 * Line 2: `$$foo['bar']['baz']; //Interpreted as ($$foo)['bar']['baz']`
 * Line 3: `$foo->$bar['baz']; //Interpreted as ($foo->$bar)['baz']`
 * Line 4: `$foo->$bar['baz'](); //Interpreted as ($foo->$bar)['baz']()`
 * Line 5: `Foo::$bar['baz'](); //Interpreted as (Foo::$bar)['baz']()`
 * Line 6: `global $$foo->bar; //The global keyword now only accepts simple variables.`
* duplicateFunctionParameter
 * Line 15: `function foo($a, $b, $unused, $unused) { /*...*/ }`
* reservedNames
...

Slide 42

Slide 42 text

Evaluating Static Analysis Tools
• Start with “weakest” analysis
• https://github.com/etsy/phan/wiki/Tutorial-for-Analyzing-a-Large-Sloppy-Code-Base
• Spot check output
• Consider…
• Output volume
• Output value
• Ownership cost (ad hoc runs or someone’s ongoing responsibility?)

Slide 43

Slide 43 text

Static Analysis Fact Sheet
Engineering Time: Short
Execution Time: Short (consider as CI step)
Actionability: Generates concrete punch list, but there could be false positives
Followup Time: Varies, but can be highly parallelizable

Slide 44

Slide 44 text

Finding the Next Biggest Risk • Be alert to momentum • Momentum implies reduced risk • Reduced risk implies the project is winding down • Managing execution is different from managing the project • Even if there’s a lot of execution to manage • Don’t confuse tasks with objectives • Invert any underlying assumptions • In this case: when can we start running code? • Look for simplest opportunity to start sizing next biggest risk

Slide 45

Slide 45 text

Prioritizing: when can we start running code? What are all the ways we can run code? • Web requests (php-fpm) • Batch processes (i.e., php -f) • Automated tests • Unit tests • Integration tests • Acceptance tests (browser tests)

Slide 46

Slide 46 text

From: PHP7 Engineer
Sent: Mar 16, 2016
To: PHP7 Team; Chief Architect
Subject: Exciting PHP7 news
Exciting proof of something everyone already assumes – Adam and I ran PHP5.6 and PHP7 head to head, and the results are that PHP7 is significantly faster than PHP5.6.
Methodology
We compared running times while unit testing feature_detect_test.php (for those who are unfamiliar, this is a regex-heavy set of tests that is notoriously long-running). PHP5.6 completed the tests in an average of 14.37 seconds, whereas PHP7 took an average of 2.9 seconds.
Conclusions
This is primarily a proof-of-concept while working in a vacuum, but it is promising that we are heading in the right direction.
Other Observations
PHP7 used up significantly more memory, as reported by PHPUnit (10MB vs 5.75MB). The initial iteration of PHP5.6 was quite the outlier – removing that reduces the average to about 13.44 seconds.
Next Steps
Our next project is to get PHP7 to render actual pages on our site – so far this has been throwing 502s while the server segfaults.
Steps we had to take to get the unit tests to run:
• Disable codeception + dependencies
• Manually build DOM + ctype extensions
• Update MPDF to latest development branch
Below you’ll find a few charts outlining the runtimes of each individual iteration. […]
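
For a sense of scale, the averages quoted in the email can be turned into ratios. The sketch below uses only the numbers stated there, and the helper name is made up for illustration. It works out to roughly a fivefold speedup, with or without the PHP5.6 outlier, at the cost of about 1.7 times the memory.

```php
<?php
// Figures quoted in the email: average seconds per run, and memory in MB.
$php56Seconds = 14.37;
$php56SecondsWithoutOutlier = 13.44;
$php7Seconds = 2.9;
$php56MemoryMb = 5.75;
$php7MemoryMb = 10.0;

// Ratio of two measurements, rounded to two decimals for reporting.
function ratio(float $numerator, float $denominator): float
{
    return round($numerator / $denominator, 2);
}

$speedup = ratio($php56Seconds, $php7Seconds);
$speedupWithoutOutlier = ratio($php56SecondsWithoutOutlier, $php7Seconds);
$memoryGrowth = ratio($php7MemoryMb, $php56MemoryMb);

printf("Speedup: %.2fx (%.2fx without the outlier)\n", $speedup, $speedupWithoutOutlier);
printf("Memory: %.2fx\n", $memoryGrowth);
```

Reporting both speedup figures keeps the claim honest: the conclusion holds even after discarding the slow first PHP5.6 iteration.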

Slide 47

Slide 47 text

Learning From Tests • Accept small victories • Mostly not working is okay • Identify components that are not working the most • Rearticulate next most important goal • AKA Always Be Planning

Slide 48

Slide 48 text

Managing Common Risk • Always communicate in terms of value to the business • “Performance saves money,” not “spaceship operators!” • Don’t promise too much, too soon • Be in touch with broader org to understand potential collisions • Work the room • Share successes • Reinforce idea that project is headed in the right direction • Identify allies at different levels of the org • Harvest byproducts, opportunistically

Slide 49

Slide 49 text

Automated Testing • Unit tests • Acceptance tests (browser tests) • Integration tests Most automated tests can be built on the same framework (PHPUnit).

Slide 50

Slide 50 text

https://phpunit.de/getting-started.html

Slide 51

Slide 51 text

Sample PHPUnit output
PHPUnit 6.0.0 by Sebastian Bergmann and contributors.
...F
Time: 0 seconds, Memory: 5.75Mb
There was 1 failure:
1) DataTest::testAdd with data set #3 (1, 1, 3)
Failed asserting that 2 matches expected 3.
/home/sb/DataTest.php:9
FAILURES!
Tests: 4, Assertions: 4, Failures: 1.

Slide 52

Slide 52 text

Automated Testing Fact Sheet
Engineering Time: Moderate – to create
Execution Time: Short
Actionability: Failures require some interpretation by engineers, but frameworks usually let you annotate them in test output
Followup Time: Moderate to extensive – all engineers should monitor tests at build time and contribute to ongoing maintenance

Slide 53

Slide 53 text

Test for Evaluating and Managing Risk • Identify risk by... • Understanding what you know • Admitting what you don’t know • Creating a shared “map” that your team can build on • Discussing concerns early and often • Tests should… • Tell you as much as possible • As soon as possible • About the biggest sources of risk • And be updated to cover what you learn along the way

Slide 54

Slide 54 text

Are we good?

Slide 55

Slide 55 text

Rollout • Project began on 30 October 2015 • First test runs in March 2016 • First production deploy planned for June 2016

Slide 56

Slide 56 text

Fatal error: Allowed memory size of 536870912 bytes exhausted (tried to allocate 140729445144864 bytes) in ...

Slide 57

Slide 57 text

Identify Vague Hypotheses • Start with biggest components of system • Rank by least implausible • Identify means of validating hypotheses • Seek consistent reproducibility

Slide 58

Slide 58 text

Extension Memory Management? • Type mismatches involving size_t • Side effect of incomplete changes for updated C APIs • Reread code to identify misses • Invalid read/writes that Valgrind would catch • Had previously tested extensions • Would suggest lacking test coverage • Shared memory corruption (very vague) • Had previously seen issues with PCRE JIT • Would have to reproduce error state with GDB

Slide 59

Slide 59 text

Use Automated Tests To Find Memory Issues • Use php7dev vagrant box (or similar) for controlled environment • Use Gcov to evaluate test coverage • Create additional tests as needed • Run all tests with Valgrind

Slide 60

Slide 60 text

Automated Testing For Extensions • Use make test or php run-tests.php • Failures recorded in test runner output • Failed tests will produce artifacts • Use .sh artifact to rerun test more easily • Or to run test with GDB • Tools documented at https://qa.php.net • Read run-tests.php to fill in any gaps

Slide 61

Slide 61 text

Example .phpt Test
--TEST--
Mustache::parse() member function
--SKIPIF--
--FILE--
<?php
[...] parse('{{test}}');
var_dump(get_class($tmpl));
?>
--EXPECT--
string(11) "MustacheAST"

Slide 62

Slide 62 text

Example Test Runner Output
=====================================================================
PHP : /home/vagrant/php-src/sapi/cli/php
PHP_SAPI : cli
PHP_VERSION : 7.2.0-dev
ZEND_VERSION: 3.2.0-dev
PHP_OS : Linux - Linux php7dev 3.2.0-4-amd64 #1 SMP Debian 3.2.68-1+deb7u1 x86_64
INI actual : /home/vagrant/php-src/tmp-php.ini
More .INIs :
---------------------------------------------------------------------
PHP : /home/vagrant/php-src/sapi/phpdbg/phpdbg
PHP_SAPI : phpdbg
PHP_VERSION : 7.2.0-dev
ZEND_VERSION: 3.2.0-dev
PHP_OS : Linux - Linux php7dev 3.2.0-4-amd64 #1 SMP Debian 3.2.68-1+deb7u1 x86_64
INI actual : /home/vagrant/php-src/tmp-php.ini
More .INIs :
---------------------------------------------------------------------
CWD : /home/vagrant/php-src
Extra dirs :
VALGRIND : Not used
=====================================================================
Running selected tests.
PASS Mustache::parse() member function [001.phpt]
=====================================================================
Number of tests : 1 1
Tests skipped : 0 ( 0.0%) --------
Tests warned : 0 ( 0.0%) ( 0.0%)
Tests failed : 0 ( 0.0%) ( 0.0%)
Expected fail : 0 ( 0.0%) ( 0.0%)
Tests passed : 1 (100.0%) (100.0%)
---------------------------------------------------------------------
Time taken : 0 seconds
=====================================================================

Slide 63

Slide 63 text

Example With Memory Leak
[...]
---------------------------------------------------------------------
CWD : /home/vagrant/php-src
Extra dirs :
VALGRIND : valgrind-3.10.0
=====================================================================
Running selected tests.
LEAK Mustache::parse() member function [001.phpt]
=====================================================================
Number of tests : 1 1
Tests skipped : 0 ( 0.0%) --------
Tests warned : 0 ( 0.0%) ( 0.0%)
Tests failed : 0 ( 0.0%) ( 0.0%)
Expected fail : 0 ( 0.0%) ( 0.0%)
Tests leaked : 1 (100.0%) (100.0%)
Tests passed : 0 ( 0.0%) ( 0.0%)
---------------------------------------------------------------------
Time taken : 4 seconds
=====================================================================
=====================================================================
LEAKED TEST SUMMARY
---------------------------------------------------------------------
Mustache::parse() member function [001.phpt]
=====================================================================

Slide 64

Slide 64 text

Example Valgrind Output (001.mem)
==13733== Invalid write of size 4
==13733==    at 0xF849E55: zim_Mustache_parse (mustache_mustache.cpp:850)
==13733==    by 0x91ED13: ZEND_DO_FCALL_SPEC_RETVAL_USED_HANDLER (zend_vm_execute.h:1097)
==13733==    by 0x8C60DA: execute_ex (zend_vm_execute.h:429)
==13733==    by 0x92115F: zend_execute (zend_vm_execute.h:474)
==13733==    by 0x87F7E3: zend_execute_scripts (zend.c:1474)
==13733==    by 0x81EE9F: php_execute_script (main.c:2533)
==13733==    by 0x9232E9: do_cli (php_cli.c:990)
==13733==    by 0x44E7BB: main (php_cli.c:1378)

Slide 65

Slide 65 text

Use Gcov To Evaluate Test Coverage
1. phpize
2. ./configure CFLAGS="--coverage" CXXFLAGS="--coverage" LDFLAGS="--coverage" # may vary slightly for your extension
3. make clean all
4. lcov --directory . --zerocounters # cleanup previous run, if any
5. # run tests
6. lcov --directory . --capture --output-file coverage.info
7. genhtml --output-directory lcov_html coverage.info
For php-src, use existing build targets: https://wiki.php.net/doc/articles/writing-tests

Slide 66

Slide 66 text

No content

Slide 67

Slide 67 text

Capture Error State In GDB
• Find simplest means to reproduce
• Ideally script: gdb --args php -f [.php file]
• Or core file: gdb php-fpm [core file]
• But maybe HTTP request (and live process): gdb php-fpm [pid]
• But first you’ll need a debug build
• May behave differently!
• Script GDB to replay debugging steps
Excellent tutorial: http://www.unknownroad.com/rtfm/gdbtut/

Slide 68

Slide 68 text

The Culprit: APCu • Invalid write to memory region containing string size • Solved problem with latest extension version

Slide 69

Slide 69 text

What We Could’ve Tried • More thorough review of known issues with third-party extensions • Better scrutiny of extensions that operated on shared memory • Had already disabled PCRE JIT • Assumed APCu extension was “safe” because we didn’t write it • Ensure we’d deployed latest version of all extensions • Strived earlier for better sense of “correctness”?

Slide 70

Slide 70 text

Replay Testing • Sample access logs • Feed to cURL • Examine logs • Consider combining with stress testing (ApacheBench)
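
A minimal sketch of the first two steps, assuming access logs in the common/combined log format. The function names, sample log lines, and host are placeholders rather than details from the talk. Only GET requests are extracted, since replaying POSTs could change data.

```php
<?php
/**
 * Extract the paths of GET requests from access-log lines.
 * Assumes the request appears quoted, as in "GET /path HTTP/1.1".
 */
function extractGetPaths(array $logLines): array
{
    $paths = [];
    foreach ($logLines as $line) {
        if (preg_match('/"GET (\S+) HTTP\/[\d.]+"/', $line, $m)) {
            $paths[] = $m[1];
        }
    }
    return $paths;
}

/**
 * Replay one path against a test host and return the HTTP status code.
 * Requires the curl extension; $baseUrl should be a non-production host.
 */
function replay(string $baseUrl, string $path): int
{
    $ch = curl_init(rtrim($baseUrl, '/') . $path);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true, // capture the body instead of printing it
        CURLOPT_TIMEOUT => 10,          // seconds
    ]);
    curl_exec($ch);
    return (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
}

// Made-up sample lines; real input would be a sample of production logs.
$sample = [
    '203.0.113.7 - - [16/Mar/2016:10:00:00 -0400] "GET /furniture/sofas HTTP/1.1" 200 5120',
    '203.0.113.9 - - [16/Mar/2016:10:00:01 -0400] "POST /cart HTTP/1.1" 302 0',
    '203.0.113.7 - - [16/Mar/2016:10:00:02 -0400] "GET /search?q=lamp HTTP/1.1" 200 2048',
];
$paths = extractGetPaths($sample);
print_r($paths);
```

To replay, loop over $paths and call replay() with the base URL of a PHP7 test server, then compare the returned status codes and the server's error logs against what the original log recorded.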

Slide 71

Slide 71 text

Working Through Unknown Unknowns • Identify vague hypotheses • Break into smaller pieces • Overdocument, overcommunicate • Look for anything you recognize in the noise • Rotate through problem solving strategies • Pair problem solving • Keep multiple irons in the fire

Slide 72

Slide 72 text

Managing Expectations Around Unknowns • Be sensitive to stress and frustration • Reframe around thrill of solving big problems • Call out indications that the problem is getting resolved • Be able to identify when you have more information than you used to • Keep working the room • Always be ready to pull the plug

Slide 73

Slide 73 text

Fringe Benefits • Expand scope of what’s “known” • Develop new skills and tools • New tests? • New monitoring opportunities? • New things to automate? • Engineers like solving big problems!

Slide 74

Slide 74 text

System Testing Fact Sheet
Engineering Time: Extensive
Execution Time: Extensive
Actionability: Varies, but often low
Followup Time: Extensive

Slide 75

Slide 75 text

Are we good?...

Slide 76

Slide 76 text

Rollout Considerations • Datacenter move was in-flight simultaneously • Testing focused on customer-facing website • No one wants to impact revenue

Slide 77

Slide 77 text

Rollout • Schedule with lead time for hard dates around datacenter move • Be ready to pause project if the two will collide • Start with non-production environments • Then focus on customer-facing website • The most value would be in speeding up that experience • Use production load balancer to control traffic served by PHP7 • But first agree on monitoring plan • Make changes from “war room” • Let changes bake in • Overcommunicate plan to broader org • Not impacting others is different from not surprising them

Slide 78

Slide 78 text

Your Systems Will Have In-Between States • Embrace simultaneous realities • Think in terms of backwards- and forwards-compatibility • Attachment to a single state is a form of risk • Detachment facilitates separating workflows Harold Pinter: “A thing is not necessarily either true or false; it can be both true and false.”

Slide 79

Slide 79 text

No content

Slide 80

Slide 80 text

Rollout, Continued • Confirm metrics • “Well, now my capacity planning is done for the year!” • Celebrate! • Continue slow roll to cover all website traffic • Begin slow roll on other services

Slide 81

Slide 81 text

Rollout, Continued? • Project began on 30 October 2015 • First test runs in March 2016 • First production deploy planned for June 2016 • Last blocking issues resolved in July 2016 • Development and staging environments updated in July 2016 • First production deploy in August 2016 • All servers on PHP7 on 7 February 2017

Slide 82

Slide 82 text

Test for Evaluating and Managing Risk • Identify risk by... • Understanding what you know • Admitting what you don’t know • Creating a shared “map” that your team can build on • Discussing concerns early and often • Tests should… • Tell you as much as possible • As soon as possible • About the biggest sources of risk • And be updated to cover what you learn along the way

Slide 83

Slide 83 text

Test to Continuously Challenge Worldview • Developers shouldn't just think about whether something works • Something doesn't work, it works just once • We initially assumed metrics were wrong!

Slide 84

Slide 84 text

What We Covered • Risk as “what you don’t know” • Identifying risk in complex applications • Common risk for infrastructure changes • Testing as risk management • Manual and automated tools • AKA software engineering as information gathering • Product management strategies • Identifying goals • Kicking off and running big projects • Choosing the most important thing to work on in service of that goal • AKA delivering value to the business

Slide 85

Slide 85 text

Questions? Adam Baratz