Core Platform Team/Initiatives/Add API integration tests

Initiative Description



Create a suite of end-to-end tests for all modules of MediaWiki’s action API.

Significance and Motivation

We need to ensure we don’t break anything while refactoring core code for the Decoupling and HTTP API projects. Unit test coverage of MediaWiki core is insufficient for this purpose. Rather than writing unit tests for code that we plan to dismantle, we should ensure that the behavior of the outward-facing API remains stable. Since all key functionality of MediaWiki can be accessed via that API, testing it could cover all critical code paths, assuming tests are written for all relevant actions and combinations of parameters.


Ultimately: Increase predictability and safety of deploying code changes

Immediately: Provide a standard framework for testing HTTP services and APIs.

Enables/unblocks: Allow confident refactoring by improving test coverage.

Baseline Metrics
  • No end-to-end API tests exist.
Target Metrics
  • Measure: Percentage of modules covered, percentage of parameters covered for each module, number of parameter permutations covered, number of cross-module use cases tested.
  • API actions that cause database updates have tests covering all relevant modes of operation (combinations of parameters).
  • End-to-end API tests can easily be run in a local development environment
  • Stretch/later: All API actions have tests covering all relevant modes of operation.
  • Stretch/later: End-to-end API tests are run automatically as part of CI
  • Stretch/later: End-to-end API tests are run periodically against deployment-prep ("beta") projects.
Stakeholders
  • API users
  • Core code developers
  • Release Engineering
Known Dependencies/Blockers


Epics, User Stories, and Requirements


Epic 1: Identify requirements for test runner

Features, MVP:

  • the test runner executes each test case against the given API and reports any failures to comply with the expected results
  • tests are specified in YAML. Each test contains an interaction sequence consisting of request/response pairs (aka steps).
  • for each step, the request as specified in the interaction step is sent to the server, and the response received is compared to the response specified in the step.
  • Responses with a JSON body are compared to a JSON/YAML structure in the response part of the step. All keys present in the expected response must be present in the actual response. For primitive values, the actual value must match the expected value.
  • Expected primitive values can either be given verbatim (so the actual value is expected to be exactly the same), or yaml tags can be used to specify the desired type of match.
  • For the MVP, the only kind of match supported beyond equivalence is one based on regular expressions.
  • the test runner generates human readable plain text output
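
The comparison rules above can be sketched as a small recursive function. This is a minimal illustration in Python, not the runner's implementation. The Regex class is a hypothetical stand-in for a YAML tag, full-string matching and the element-wise handling of lists are assumptions, and the sample response is invented.

```python
import re
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Regex:
    """Hypothetical stand-in for a YAML tag that requests a regex match."""
    pattern: str


def matches(expected: Any, actual: Any) -> bool:
    """Return True if `actual` satisfies `expected` under containment rules."""
    if isinstance(expected, Regex):
        # Assumption: the pattern must match the whole string value.
        return isinstance(actual, str) and re.fullmatch(expected.pattern, actual) is not None
    if isinstance(expected, dict):
        # Every expected key must be present; extra keys in `actual` are ignored.
        return isinstance(actual, dict) and all(
            key in actual and matches(value, actual[key])
            for key, value in expected.items()
        )
    if isinstance(expected, list):
        # Assumption: lists must have equal length and match element-wise.
        return (
            isinstance(actual, list)
            and len(expected) == len(actual)
            and all(matches(e, a) for e, a in zip(expected, actual))
        )
    # Primitive values given verbatim must be exactly equal.
    return expected == actual


expected = {"edit": {"result": "Success", "newtimestamp": Regex(r"\d{4}-\d{2}-\d{2}T.*Z")}}
actual = {"edit": {"result": "Success", "pageid": 7, "newtimestamp": "2019-06-01T12:00:00Z"}}
print(matches(expected, actual))  # True
```

Because only the keys named in the expected structure are checked, a test stays valid when the API adds new fields to a response.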

Features required to fulfill this project's goals:

  • support for variables
    • extracted from response
    • randomized
    • loaded from config file
  • support for cookies and sessions
  • discover tests by recursively examining directories
  • support fixtures
    • execute fixtures in order of dependency
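
Executing fixtures in order of dependency amounts to a topological sort. The following is a minimal sketch in Python using the standard library's graphlib module (Python 3.9 or later). The fixture names and the dependency mapping are hypothetical, and how a real runner would declare dependencies is not specified here.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def fixture_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return fixture names so that each fixture comes after its dependencies."""
    # TopologicalSorter expects {node: predecessors} and raises
    # graphlib.CycleError if the dependencies are circular.
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical fixtures: a page needs its author, a block needs an admin and a target.
deps = {
    "root_user": set(),
    "alice": {"root_user"},
    "admin": {"root_user"},
    "alice_page": {"alice"},
    "alice_block": {"admin", "alice"},
}
order = fixture_order(deps)
print(order)
```

The relative order of fixtures that do not depend on each other is not guaranteed, so tests should not rely on it.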

Features expected to become useful or necessary later, or for stretch goals:

(roughly in order of relevance)

  • discover tests specified by extensions (needs MW specific code?)
  • allow test suites to be filtered by tag
    • execute only fixtures needed by tests with the tag (or other fixtures that are needed)
  • allow for retries
  • allow tests to be skipped (preconditions)
  • support JSON output
  • Support running the same tests with several sets of parameters (test cases)
  • support for cookie jars
  • allow tests for interactions spanning multiple sites
  • support HTML output
  • parallel test execution
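
Filtering by tag while executing only the fixtures that are actually needed can be sketched as follows. This is an illustration in Python under assumed data shapes: tests carry a set of tags and a list of fixtures they need, and fixtures list the other fixtures they depend on. All names are hypothetical.

```python
def select_by_tag(tests, fixtures, tag):
    """Pick tests carrying `tag` plus every fixture they need, transitively."""
    selected = [t for t in tests if tag in t["tags"]]
    needed = set()
    stack = [name for t in selected for name in t["needs"]]
    while stack:
        name = stack.pop()
        if name not in needed:
            needed.add(name)
            stack.extend(fixtures[name])  # fixtures may depend on other fixtures
    return selected, needed


# Hypothetical fixtures (name -> fixtures it depends on) and tests.
fixtures = {
    "root_user": [],
    "alice": ["root_user"],
    "admin": ["root_user"],
    "alice_page": ["alice"],
    "alice_block": ["admin", "alice"],
}
tests = [
    {"name": "edit-basic", "tags": {"edit"}, "needs": ["alice_page"]},
    {"name": "block-user", "tags": {"block"}, "needs": ["alice_block"]},
]

selected, needed = select_by_tag(tests, fixtures, "edit")
print([t["name"] for t in selected], sorted(needed))
```

With the tag "edit", the admin and alice_block fixtures are skipped because no selected test depends on them.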

Requirements that seem particular to (or especially important for) the Wikimedia use case:

  • HTTP-centric paradigm, focusing on specifying the headers and body of requests, and running assertions against headers and body of the response.
  • Support for running assertions against parts of a structured (JSON) response (JSON-to-JSON comparison, with the ability to use the more human friendly YAML syntax)
  • filtering by tags (because we expect to have a large number of tests)
  • parallel execution (because we expect to have a large number of tests)
  • YAML-based declarative tests and fixtures: tests should be language agnostic, and writing them should be easy for people involved with different language ecosystems and code bases. This also avoids lock-in to a specific tool, since YAML is easy to parse and convert.
  • generalized fixture creation, entirely API based, without the need to write "code" other than specifying requests in YAML.
  • randomized fixtures, so we can create privileged users on potentially public test systems.
  • control over cookies and sessions
  • ease of running in dev environments without the need to install additional tools / infrastructure (this might be a reason to switch to python for implementation; node.js is also still in the race).
  • ease of running in WMF's CI environment (jenkins, quibble)
  • option to integrate with PHPUnit
  • discovery of tests defined by extensions.
  • dual use for monitoring a live system, in addition to testing against a dummy system

Usage of variables and fixtures:

  • have a well known root user with fixed credentials in config.
  • name and credentials for that user are loaded into variables that can be accessed from within the yaml files.
  • yaml files that create fixtures, such as pages and users, can read but also define variables.
  • Variable values are either random with a fixed prefix, or they are extracted from an http response.
  • Variable value extraction happens using the same mechanism we use for checking/asserting responses
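
The variable handling described above can be sketched as follows. This is a minimal illustration in Python, not a specification. The Grab marker, the ${name} placeholder syntax, the user name and the token value are all hypothetical. Extraction reuses the same recursive walk that a response assertion would use, as the last point suggests.

```python
import re
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class Grab:
    """Hypothetical marker: store the actual value under `name` instead of comparing."""
    name: str


def random_value(prefix: str) -> str:
    """Random value with a fixed prefix, e.g. for user names on shared test wikis."""
    return prefix + secrets.token_hex(4)


def match_and_grab(expected, actual, variables: dict) -> bool:
    """Check `actual` against `expected`, capturing values where Grab markers sit."""
    if isinstance(expected, Grab):
        variables[expected.name] = actual
        return True
    if isinstance(expected, dict):
        return isinstance(actual, dict) and all(
            key in actual and match_and_grab(value, actual[key], variables)
            for key, value in expected.items()
        )
    return expected == actual


def substitute(template, variables: dict):
    """Replace ${name} placeholders in every string of a request structure."""
    if isinstance(template, str):
        return re.sub(r"\$\{(\w+)\}", lambda m: str(variables[m.group(1)]), template)
    if isinstance(template, dict):
        return {k: substitute(v, variables) for k, v in template.items()}
    if isinstance(template, list):
        return [substitute(v, variables) for v in template]
    return template


variables = {"root_name": "WikiSysop"}         # would be loaded from the config file
variables["user"] = random_value("TestUser_")  # random with a fixed prefix
response = {"batchcomplete": True, "query": {"tokens": {"csrftoken": "0123abcd+\\"}}}
ok = match_and_grab({"query": {"tokens": {"csrftoken": Grab("token")}}}, response, variables)
request = substitute({"action": "edit", "title": "User:${user}", "token": "${token}"}, variables)
print(ok, request)
```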

Epic 2: Baseline implementation of Phester test runner

T221088: Baseline implementation of Phester test runner

Functional requirements:

  • Run the test suites in sequence, according to the order given on the command line
    • Execute the requests within each suite in sequence.
  • The runner can be invoked from the command line
    • Required input: the base URL of a MediaWiki instance
    • Required input: one or more test description files.
  • The test runner executes each test case against the given API and reports any failures to comply with the expected results
  • human readable plain text output
  • Support regular expression based value matching
  • Test definitions are declarative (YAML)

Rationale for using declarative test definition and YAML:

  • test definition not bound to a specific programming language (PHP, JS, python)
  • keeps tests simple and "honest", with well defined input and output, no hidden state, and no loops or conditionals
  • no binding to additional tools or libraries, tests stay "pure and simple"
  • Easy to parse and process, and thus to port away from, use as a front-end for something else, or analyze and evaluate.
  • YAML is JSON compatible. JSON payloads can just be copied in.
  • Creating a good DSL is hard, evolving a DSL is harder. YAML may be a bit ugly, but it's generic and flexible.

Implementation notes:

  • The test runner should be implemented in PHP. Rationale: It is intended to run in a development environment used to write PHP code. Also, we may want to pull this into the MediaWiki core project via composer at some point.
  • Use the Guzzle library for making HTTP requests
  • The test runner should not depend on MediaWiki core code.
  • The test runner should not hard code any knowledge about the MediaWiki action API, and should be designed to be usable for testing other APIs, such as RESTbase.
  • The test runner MUST ask for confirmation that it is ok for any information in the given target API to be damaged or lost (unless --force is specified)
  • No cleanup (tear-down) is performed between tests.
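
The command-line contract described in this epic (a base URL, one or more test description files run in the given order, and a confirmation prompt unless --force is given) can be sketched as follows. The notes above call for a PHP implementation; Python is used here only for illustration. The URL, the file names and the prompt wording are hypothetical, and the actual test execution is left out.

```python
import argparse


def parse_args(argv):
    parser = argparse.ArgumentParser(description="Run declarative API tests.")
    parser.add_argument("base_url", help="base URL of the target MediaWiki instance")
    parser.add_argument("suites", nargs="+",
                        help="test description files, run in the order given")
    parser.add_argument("--force", action="store_true",
                        help="skip the data-loss confirmation")
    return parser.parse_args(argv)


def confirmed(args, ask=input):
    """Require explicit consent before touching the target, unless --force is set."""
    if args.force:
        return True
    answer = ask(f"Data on {args.base_url} may be damaged or lost. Continue? [y/N] ")
    return answer.strip().lower() == "y"


def main(argv):
    args = parse_args(argv)
    if not confirmed(args):
        print("Aborted.")
        return 1
    for suite in args.suites:
        print(f"Would run suite: {suite}")  # placeholder for real execution
    return 0


# Example invocation with hypothetical arguments.
exit_code = main(["--force", "http://localhost:8080/w", "edit.yaml", "delete.yaml"])
```

Passing the prompt function as a parameter keeps the confirmation step testable without a terminal.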

Epic 3: Pick a test runner (our own, or an existing one)

Story: Try writing basic tests

T225614: Create an initial set of API integration tests

The tests below should work on the MVP, so they must not require variables. This rules out login.

For MediaWiki, relevant stories to test are:

In addition, it would be useful to see one or two basic tests for Kask and RESTbase.

It would be nice to see what these tests look like in our own runner (phester) and for some of the other candidates like tavern, behat, codeception, or dredd. It's not necessary to write all the tests for all the different systems, though.

Story: Try writing tests using variables

T228001: Create a set of API integration tests using variables

Without support for variables in phester, these tests have to be written "dry", with no way to execute them.

  • watchlist (watch/unwatch/auto-watch)
  • changing preferences
  • bot edits (interaction of bot permission and bot parameter)
  • undo
  • rollback
  • diffs with fixed revision IDs (test special case for last and first revision)
  • patrolling, patrol log
  • auto-patrolling
  • listing user contributions
  • listing users
  • page deletion/undeletion (effectiveness)
  • page protection (effectiveness, levels)
  • user blocking (effectiveness, various options)
  • MediaWiki namespace restrictions
    • user script protection (can only edit own)
    • site script protection (needs special permission)
  • minor edits
  • remaining core parser functions
  • remaining core magic words (in particular REVISIONID and friends)
  • Pre-save transform (PST), signatures, subst templates, subst REVISIONUSER.
  • Special pages transclusion
  • newtalk notifications
  • media file uploads (need to be enabled on the wiki) (needs file upload support in phester)
  • site stats (may need support for arithmetic functions)

It would be nice to see what these tests look like in our own runner (phester) and for some of the other candidates, like tavern, behat, codeception, or dredd.

Story: Decide whether to invest in Phester, and decide what to use instead

T222100: Decide whether creating Phester is actually worth while

Decision matrix:

Decision made: we will use SuperTest. DoR: Using SuperTest Rather Than Creating an API Integration Test Runner.


Candidates:

  • Phester
  • strst
  • tavern
  • behat
  • codeception
  • RobotFramework
  • SoapUI
  • dredd


Criteria:

  • runtime
  • execution model
  • ease of running locally
  • ease of running in CI
  • test language
  • ease of editing
  • ease of migration
  • scope/purpose fit
  • stability/support
  • cost to modify/maintain
  • control over development
  • documentation
  • license model
  • license sympathy
  • recursive body matches
  • regex matches
  • variables
  • variable injection
  • global fixtures
  • JSON output
  • scan for test files
  • filter tests by tag
  • parallel execution

Epic 4: Add any missing functionality to the selected runner

Phabricator Task: T227999: Extended implementation of test runner

Non-Functional Requirements:

# | Requirement Name | Requirement | Priority | Notes
1 | CI integrations | Test runner has a low cost integrating with CI infrastructure | Should Have |
2 | IDE integration | IDE integration including autocompletion and validation | Should Have |
3 | FLOSS | Test runner must be distributed under an open source license | Should Have |
4 | documentation and support | Test runner should be well-documented and well-supported | Should Have |
5 | familiar language | Tests are written in a well-documented, familiar language | Nice To Have |
6 | run from CLI | Tests should be runnable from the command line | Must Have |
7 | avoid vendor lock-in | Reduce vendor lock-in | Should Have |
8 | functional fit | Test runner should have a good fit in scope and purpose for running API tests over HTTP | Should Have |
9 | maintenance burden | Minimize the amount of code we have to maintain | Should Have |


User roles:

  • Developer of the functionality being tested
  • Test author
  • CI engineer
  • Operations engineer
  • Tester (anybody running tests)

User Stories:

# | User Story Title | User Story Description | Priority | Notes | Status
1 | Run Tests Locally | As a developer, I want to be able to quickly run tests against the functionality I'm writing | Must Have | ability to run locally; includes selecting the appropriate subset; includes running a container |
2 | Monitoring | As an operations engineer, I would like to be able to use the test runner in production to test if services are alive | Optional | needs support for retries |
3 | Parallel Testing | As a CI engineer, I want multiple tests to run in parallel | Should Have | need to learn how to use and apply it | Sprint 1
4 | Run All Tests | As a CI engineer, I want to be able to easily run all of the core tests | Must Have | | Resolved
5 | Machine-Readable Test Output | As a CI engineer, I want machine-readable output from my test framework | Must Have | | Resolved
6 | Resources | As a test author, I want an easy way to create resources | Must Have | resources include users and pages |
7 | Fixtures | As a test author, I want an easy way to create fixtures that are used across tests | Must Have | fixtures are singleton resources that can be used across tests |
8 | Chaining | As a test author, I want a way to use parts of a response in a subsequent request, or when validating responses | Must Have | for feeding responses into requests: variable extraction (grab) and variable reference or injection; for validation: string interpolation or regular expressions |
9 | Multi-Site Tests | As a test author, I need to be able to run tests that span multiple sites | Must Have | requires a retry mechanism |
10 | Parameterized Tests | As a test author, I would like to be able to run tests with multiple sets of parameters | Should Have | data provider |
11 | Validate Test Responses | As a test author, I want to be able to validate responses | Must Have | use of a standard assertion library would be a plus; must be able to use powerful assertions: support regular expressions, string interpolation, and JSON structure containment |
12 | Multiple Agents | As a test author, I want to write tests that combine requests made using different sessions | Must Have | control which cookie jar to use; emulates user login, browser context |
13 | Control HTTP Requests | As a test author, I need to be able to control all parts of an HTTP request | Must Have | must include headers and cookies; do not need to control HTTP auth |
14 | Unique Fake Values | As a test author, I need a way to create unique, fake values | Must Have | | Resolved
15 | Redirect Support | As a test author, I need the framework to support redirects | Must Have | | Resolved
16 | Configuration | As a test author, I want a way to access configuration values | Must Have | root user credentials; address of target system(s) |
17 | Known State | As a tester, I want to know that the wiki is empty before I begin to run tests | Must Have | | Sprint 1
18 | Example Tests | As a test author, I want a set of example tests to base my tests upon | Must Have | port example tests | Sprint 1
19 | Action API Test Utilities | As a test author, I want to have a set of utilities to help me test against the action API | Must Have | action API call: name and parameters; get tokens; login; edit; create a user; create a random user; create a random page | Sprint 1
20 | Minimal Testing Environment | As a test author, I want to be able to set up a local environment to write and run tests | Must Have | may just need a set of instructions | Sprint 1
21 | Run Extension Tests | As a tester, I want to be able to run all of the core and extension tests | Must Have | may require a change to extension.json to identify the test directory; needs a script to take the info from extension.json and build a Mocha config |
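
The script mentioned in the notes for story 21 could look roughly like the sketch below, written in Python for illustration. The extension.json key name (ApiTestDirectory) is hypothetical, since no such key is defined here, and so are the directory layout and the core test glob. Only the "spec" field of the Mocha configuration is used.

```python
import json
from pathlib import Path

# Hypothetical extension.json key naming the directory that holds API tests.
TEST_DIR_KEY = "ApiTestDirectory"


def build_mocha_config(core_spec: str, extensions_root: Path) -> dict:
    """Collect test globs from core and from every extension that declares them."""
    specs = [core_spec]
    for manifest in sorted(extensions_root.glob("*/extension.json")):
        data = json.loads(manifest.read_text(encoding="utf-8"))
        test_dir = data.get(TEST_DIR_KEY)
        if test_dir:
            # One glob per extension, relative to the extension's own directory.
            specs.append((manifest.parent / test_dir / "**" / "*.js").as_posix())
    return {"spec": specs}
```

The resulting dictionary can be written out with json.dumps as a .mocharc.json file. Extensions that do not declare a test directory are simply skipped.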

Epic 5: Test runner in production

Story: Implement initial set of tests

Implement the tests defined during the evaluation phase (Epic 3) for the actual runner. If we go with Phester, the tests should not have to change at all, or just need minor tweaks. If we go with a different framework, the experimental tests need to be ported to that framework.

Story: Deploy test runner

Story: Document test runner

Create documentation that allows others to create tests and run them.

Possible follow-up initiatives

Create a containerized version of the test runner for testing MediaWiki

Make API tests part of CI gateway tests

Write comprehensive suite of tests covering core actions that modify the database

Any API module that returns true from needsToken()

Write comprehensive suite of tests covering core query actions

Any API module extending ApiQueryModule

Build out test suite to cover extensions deployed by WMF

Time and Resource Estimates


Estimated Start Date

April 2019

Actual Start Date

April 2019

Estimated Completion Date

None given

Actual Completion Date

None given

Resource Estimates
  • Specifying test runner: 5 hours over two weeks, plus 5 hours of other people's time for review and discussion.
  • Implementing test runner: 20 hours over two weeks, plus 10 hours of other people's time for review and discussion. May need additional time to decide on technology choices.
  • Creating a docker environment to run tests against: 10 hours over two weeks. May need additional time to learn more about docker.
  • Writing tests for cross-module stories and all actions that modify the database: 100 to 200 hours. Most of this work is trivial, but some of it is rather involved. May be slow going at the beginning, while we figure out the fixtures we need.
  • Core Platform
  • Release Engineering

Open Questions


  • What testing framework shall be used for the end-to-end tests?
  • Should we use Selenium?

Documentation Links



None given


None given

Other Documents

Using our Integration Testing Framework for Monitoring