Continuous integration/Tutorials/Debugging PHPUnit Parallel Test Failures

Since parallel PHPUnit testing was added to CI, tests may sometimes fail in CI because of test sequencing issues that do not occur on local development machines. Many PHPUnit tests rely on or modify global state - this lack of isolation means that a random permutation of the test sequence may fail.
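
The effect of shared global state on test ordering can be sketched in a few lines. The example below is an analogy written in Python rather than PHP, and the names config, case_enables_feature, case_expects_default and run are hypothetical; it only shows that the same two checks pass in one order and fail in the other.

```python
# Hypothetical module-level state shared by two test cases.
config = {"feature_enabled": False}

def case_enables_feature():
    # Modifies global state and never restores it.
    config["feature_enabled"] = True
    assert config["feature_enabled"]

def case_expects_default():
    # Relies on the default value still being in place.
    assert config["feature_enabled"] is False

def run(order):
    """Run the cases in the given order from a fresh state; return the names of failures."""
    config["feature_enabled"] = False
    failures = []
    for case in order:
        try:
            case()
        except AssertionError:
            failures.append(case.__name__)
    return failures

print(run([case_expects_default, case_enables_feature]))  # prints []
print(run([case_enables_feature, case_expects_default]))  # prints ['case_expects_default']
```

Neither case is wrong in isolation; the failure only appears when the state-modifying case happens to run first, which is the situation a reshuffled split group can create.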

Reproducing the failure locally

The first step in debugging a failure you see in CI is to try to reproduce the same failure on your local machine.

Reproducing the failure in your IDE (trivial case)

If you are able to run the tests inside your development environment, first try simply running the failing test cases there and see if you can reproduce the failure. If so, the error is in the code or the test case - debug this as you normally would.

Running Quibble locally

If the tests pass when run in your development environment, or if you are unable to run the tests locally, try reproducing the CI / Quibble execution on your local machine. In this example, we will use a quibble-vendor-mysql-php74-noselenium job with the URL https://integration.wikimedia.org/ci/job/quibble-vendor-mysql-php74-noselenium/40047/console.

Using the `jenkins-run-analysis` tool

To help with the rapid debugging of CI jobs, you can use the jenkins-run-analysis tool. Clone the latest version of the repository and install the tool using pipenv:

$ pipenv install
$ pipenv shell

This will load the required Python settings into the environment of your terminal. If you start a new terminal, run pipenv shell again to reload this environment.

From a fresh directory (in this example, ~/quibble-run-folder), you can use the run_local.py script to launch a local version of the failed test run by passing in the URL of the failed run on Jenkins:

$ mkdir ~/quibble-run-folder
$ cd ~/quibble-run-folder
$ python ../jenkins-run-analysis/run_local.py https://integration.wikimedia.org/ci/job/quibble-vendor-mysql-php74-noselenium/40047/console

The tool will copy the environment and the docker arguments of the CI job and execute them on your local system. By default, a quibble folder will be created in the current working directory that contains the local state required by docker. This state can be reused, but if you notice that the local run is not identical to the server run, try removing the quibble/src folder - the local checkout may have different timestamps on the files compared to a fresh checkout, which might affect test ordering.

Debugging the failed run

At this point you hopefully have a failing local run of the CI job, where the failure you see locally is the failure you see in CI, for example:

> phpunit '--testsuite' 'split_group_4' '--exclude-group' 'Broken,ParserFuzz,Stub,Standalone' '--group' 'Database' '--cache-result-file=.phpunit_group_4_database.result.cache'
Using PHP 7.4.33
Running with MediaWiki settings because there might be integration tests
PHPUnit 9.6.19 by Sebastian Bergmann and contributors.

.............................................................   61 / 5520 (  1%)
.............................................................  122 / 5520 (  2%)
.............................................................  183 / 5520 (  3%)
...
...................................F......................... 5246 / 5520 ( 95%)
............................................................. 5307 / 5520 ( 96%)
............................................................. 5368 / 5520 ( 97%)
...
..............................                                5520 / 5520 (100%)

Time: 04:12.036, Memory: 996.00 MB

There was 1 failure:

1) MediaWiki\Extension\CentralAuth\Tests\Phpunit\Integration\Special\SpecialCentralAuthTest::testViewForExistingGlobalTemporaryAccount
Failed asserting that '<div class='mw-htmlform-oo...' contains "(centralauth-admin-info-expired".

/workspace/src/extensions/CentralAuth/tests/phpunit/integration/Special/SpecialCentralAuthTest.php:380
phpvfscomposer:///workspace/src/vendor/phpunit/phpunit/phpunit:106

Here we can see that split_group_4 contains the failing test.
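
For long console logs, the suite name and the failing tests can also be picked out programmatically. The following is a minimal sketch, assuming the log has the shape shown above (a quoted `'--testsuite'` argument and numbered failure entries); summarize_phpunit_log is a hypothetical helper and not part of jenkins-run-analysis.

```python
import re

def summarize_phpunit_log(log_text):
    """Return (testsuite_name, [failed test ids]) found in PHPUnit console output."""
    # The echoed command quotes each argument, e.g. '--testsuite' 'split_group_4'.
    suite_match = re.search(r"'--testsuite'\s+'([^']+)'", log_text)
    suite = suite_match.group(1) if suite_match else None
    # Failure entries look like "1) Namespace\Class::testMethod".
    failed = re.findall(r"^\d+\) (\S+::\S+)", log_text, flags=re.MULTILINE)
    return suite, failed

# Excerpt of the example output above.
sample = r"""> phpunit '--testsuite' 'split_group_4' '--exclude-group' 'Broken,ParserFuzz,Stub,Standalone' '--group' 'Database' '--cache-result-file=.phpunit_group_4_database.result.cache'
There was 1 failure:

1) MediaWiki\Extension\CentralAuth\Tests\Phpunit\Integration\Special\SpecialCentralAuthTest::testViewForExistingGlobalTemporaryAccount
Failed asserting that '<div class='mw-htmlform-oo...' contains "(centralauth-admin-info-expired".
"""

suite, failed = summarize_phpunit_log(sample)
print(suite)
print(failed)
```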

Reproducing the failure case in Quibble

To debug further, we need a shell inside the failing test environment. We can achieve this by adding --bash to the run_local.py invocation:

$ python ../jenkins-run-analysis/run_local.py --bash https://integration.wikimedia.org/ci/job/quibble-vendor-mysql-php74-noselenium/40047/console

Before the parallel run, the "split groups" need to be created. This can be reproduced locally by running the associated composer command in the bash terminal:

$ composer run phpunit:prepare-parallel:extensions

It should then be possible to reproduce the failing "split group" by running it directly in PHPUnit, taking the exact command from the output of the failed job above:

$ composer run -- phpunit --testsuite split_group_4 --exclude-group Broken,ParserFuzz,Stub,Standalone --group Database --cache-result-file=.phpunit_group_4_database.result.cache
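
The conversion from the logged line to the command above is mechanical: drop the leading `> ` marker, remove the quoting, and prefix `composer run --`. A minimal sketch using Python's standard shlex module follows; logged_command_to_composer is a hypothetical helper name, and shlex.join requires Python 3.8 or later.

```python
import shlex

def logged_command_to_composer(logged_line):
    """Turn a logged "> phpunit ..." line into a "composer run -- phpunit ..." command."""
    # Drop the leading "> " marker that appears in the job output,
    # then split the line the way a POSIX shell would.
    args = shlex.split(logged_line.lstrip("> "))
    # shlex.join re-quotes only the arguments that actually need quoting.
    return "composer run -- " + shlex.join(args)

logged = (
    "> phpunit '--testsuite' 'split_group_4' '--exclude-group' "
    "'Broken,ParserFuzz,Stub,Standalone' '--group' 'Database' "
    "'--cache-result-file=.phpunit_group_4_database.result.cache'"
)
print(logged_command_to_composer(logged))
```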

At this point, you should see the group of tests fail in exactly the way it failed in CI.

Reducing the failure case

Usually the cause of the failure will be an interaction between two specific tests, whereas there might be thousands of tests in a split group. To track down the cause of the failure, we need to reduce the size of the split group. Using your favourite editor (with undo function), open the phpunit.xml file on the host system:

$ sudo vi ~/quibble-run-folder/quibble/src/phpunit.xml

Find the <testsuite name="split_group_4"> tag, and then delete half of the tests that appear before the failing test in the suite. Re-run the suite and see if the failure persists. If it does, the test causing the failure is among those that remain; if it does not, undo the deletion and remove the other half instead. Repeating this binary search, you should fairly quickly arrive at a minimal reproduction set, which will usually be just two test classes - the failing test class and the class that causes it to fail:

    <testsuite name="split_group_4">
      <file>/workspace/src/extensions/CheckUser/tests/phpunit/integration/maintenance/PopulateCentralCheckUserIndexTablesTest.php</file>
      <file>/workspace/src/extensions/CentralAuth/tests/phpunit/integration/Special/SpecialCentralAuthTest.php</file>
    </testsuite>
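
The halving procedure can be written down as a short sketch. It assumes that exactly one earlier file triggers the failure, and that you supply a function still_fails(subset) which reports whether the failing test still fails when only subset runs before it. In practice that function would rewrite phpunit.xml and re-run the suite; here it is faked. Both find_polluter and still_fails are hypothetical names.

```python
def find_polluter(candidates, still_fails):
    """Binary-search `candidates` for the single file whose presence triggers the failure."""
    remaining = list(candidates)
    while len(remaining) > 1:
        half = len(remaining) // 2
        first, second = remaining[:half], remaining[half:]
        # Keep whichever half still reproduces the failure.
        remaining = first if still_fails(first) else second
    # Confirm the last candidate really reproduces the failure on its own.
    if remaining and still_fails(remaining):
        return remaining[0]
    return None

# Fake environment: 1000 files run before the failing test, one of them is the culprit.
files = [f"Test{i}.php" for i in range(1000)]
runs = []

def fake_still_fails(subset):
    runs.append(len(subset))  # record each simulated suite run
    return "Test617.php" in subset

print(find_polluter(files, fake_still_fails))  # prints Test617.php
print(len(runs), "simulated runs")
```

Because each step halves the candidate list, the number of suite runs grows with the logarithm of the group size, which is why even a split group with thousands of tests reduces in roughly a dozen runs. If more than one file contributes to the failure, the single-culprit assumption does not hold and the result should be checked by hand.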

Reproducing the failure in your IDE (parallel case)

Once you know which tests are causing the issue, you can try and reproduce the issue in your IDE. Copy the failing test group to your phpunit.xml file, updating the paths of the files to match your local environment:

    <testsuite name="failing_group">
      <file>extensions/CheckUser/tests/phpunit/integration/maintenance/PopulateCentralCheckUserIndexTablesTest.php</file>
      <file>extensions/CentralAuth/tests/phpunit/integration/Special/SpecialCentralAuthTest.php</file>
    </testsuite>
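
If the reduced group contains more than a couple of files, the path rewrite can be scripted. The sketch below assumes the CI paths all start with /workspace/src/, as in the example above, and that paths relative to the MediaWiki root suit your local setup; localize_suite is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

CI_PREFIX = "/workspace/src/"  # path prefix seen in the CI-generated phpunit.xml

def localize_suite(suite_xml, new_name="failing_group", prefix=CI_PREFIX):
    """Rename a <testsuite> snippet and make its <file> paths relative."""
    suite = ET.fromstring(suite_xml)
    suite.set("name", new_name)
    for file_el in suite.iter("file"):
        path = (file_el.text or "").strip()
        if path.startswith(prefix):
            file_el.text = path[len(prefix):]
    return ET.tostring(suite, encoding="unicode")

ci_snippet = """<testsuite name="split_group_4">
  <file>/workspace/src/extensions/CheckUser/tests/phpunit/integration/maintenance/PopulateCentralCheckUserIndexTablesTest.php</file>
  <file>/workspace/src/extensions/CentralAuth/tests/phpunit/integration/Special/SpecialCentralAuthTest.php</file>
</testsuite>"""

print(localize_suite(ci_snippet))
```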

Run the tests locally and observe the failure:

$ mw docker mediawiki exec -- MW_DB=wikidatawikidev composer run phpunit:entrypoint -- --testsuite failing_group

Once you have the failure locally and inside your IDE, you can further reduce the failure case by commenting out tests in both classes, and attach a debugger to help you find the source of the issue.