Add API integration tests
Create a suite of end-to-end tests for all modules of MediaWiki’s action API.
- Significance and Motivation
We need to ensure we don’t break anything while refactoring core code for the Decoupling and HTTP API projects. Unit test coverage of MediaWiki core is insufficient for this purpose. Instead of writing unit tests for code that we plan to dismantle, we should ensure that the behavior of the outward-facing API remains stable. Since all key functionality of MediaWiki can be accessed via that API, testing it can cover all critical code paths, provided tests are written for all relevant actions and combinations of parameters.
Ultimately: Increase predictability and safety of deploying code changes
Immediately: Provide a standard framework for testing HTTP services and APIs.
Enables/unblocks: Allow confident refactoring by improving test coverage.
- Baseline Metrics
- No end-to-end API tests exist.
- Target Metrics
- Measure: Percentage of modules covered, percentage of parameters covered for each module, number of parameter permutations covered, number of cross-module use cases tested.
- API actions that cause database updates have tests covering all relevant modes of operation (combinations of parameters).
- End-to-end API tests can easily be run in a local development environment
- Stretch/later: All API actions have tests covering all relevant modes of operation.
- Stretch/later: End-to-end API tests are run automatically as part of CI
- Stretch/later: End-to-end API tests are run periodically against deployment-prep ("beta") projects.
- API users
- Core code developers
- Release Engineering
- Known Dependencies/Blockers
Epics, User Stories, and Requirements
Epic 1: Identify requirements for test runner
- The test runner executes each test case against the given API and reports any failure of the actual results to match the expected results.
- Tests are specified in YAML. A test contains an interaction sequence consisting of request/response pairs (aka steps).
- For each step, the request specified in the interaction step is sent to the server, and the response received is compared to the response specified in the step.
- Responses with a JSON body are compared to a JSON/YAML structure in the response part of the step. All keys present in the expected response must be present in the actual response. For primitive values, the actual value must match the expected value.
- Expected primitive values can either be given verbatim (so the actual value is expected to be exactly the same), or YAML tags can be used to specify the desired type of match.
- For the MVP, the only kind of match supported beyond equality is one based on regular expressions.
- The test runner generates human-readable plain text output.
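Put together, a test file following these rules might look like the sketch below. All key names and the `!pcre` tag are illustrative assumptions based on the requirements above, not a finalized Phester syntax:

```yaml
# Hypothetical sketch of a declarative test file; key names are assumptions.
suite: site-info
tests:
  - description: siteinfo returns the general site metadata
    interaction:
      - request:
          method: GET
          path: api.php
          parameters:
            action: query
            meta: siteinfo
            format: json
        response:
          status: 200
          headers:
            content-type: !pcre "/json/"
          body:
            query:
              general:
                # verbatim match: the actual value must equal this exactly
                case: first-letter
                # type of match selected via a YAML tag, per the MVP requirements
                generator: !pcre "/^MediaWiki /"
```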
Features required to fulfill this project's goals:
- support for variables
- extracted from response
- loaded from config file
- support for cookies and sessions
- discover tests by recursively examining directories
- support fixtures
- execute fixtures in order of dependency
Features expected to become useful or necessary later, or for stretch goals:
(roughly in order of relevance)
- discover tests specified by extensions (needs MW specific code?)
- allow test suites to be filtered by tag
- execute only fixtures needed by tests with the tag (or other fixtures that are needed)
- allow for retries
- allow tests to be skipped (preconditions)
- support JSON output
- Support running the same tests with several sets of parameters (test cases)
- support for cookie jars
- allow tests for interactions spanning multiple sites
- support HTML output
- parallel test execution
Requirements that seem particular to (or especially important for) the Wikimedia use case:
- HTTP-centric paradigm, focusing on specifying the headers and body of requests, and running assertions against headers and body of the response.
- Support for running assertions against parts of a structured (JSON) response (JSON-to-JSON comparison, with the ability to use the more human friendly YAML syntax)
- filtering by tags (because we expect to have a large number of tests)
- parallel execution (because we expect to have a large number of tests)
- YAML-based declarative tests and fixtures: tests should be language agnostic; it should be easy for people involved with different language ecosystems and code bases to write tests. This also avoids lock-in to a specific tool, since YAML is easy to parse and convert.
- generalized fixture creation, entirely API based, without the need to write "code" other than specifying requests in YAML.
- randomized fixtures, so we can create privileged users on potentially public test systems.
- control over cookies and sessions
- ease of running in dev environments without the need to install additional tools or infrastructure (this might be a reason to switch to Python for the implementation; Node.js is also still in the race).
- ease of running in WMF's CI environment (Jenkins, Quibble)
- option to integrate with PHPUnit
- discovery of tests defined by extensions.
- dual use for monitoring a live system, in addition to testing against a dummy system
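The "randomized fixtures" requirement above amounts to generating prefixed but unguessable names. A minimal Python sketch (the helper name is an assumption, not part of any runner):

```python
import secrets

def unique_name(prefix):
    """Create a unique value with a fixed prefix, e.g. a fixture user name.

    Randomizing the names of privileged fixture accounts means they cannot
    be guessed on a potentially public test system.
    """
    return f"{prefix}-{secrets.token_hex(8)}"

# e.g. an admin account created as a fixture gets an unpredictable name
admin_user = unique_name("Phester-admin")
```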
Usage of variables and fixtures:
- have a well known root user with fixed credentials in config.
- name and credentials for that user are loaded into variables that can be accessed from within the yaml files.
- yaml files that create fixtures, such as pages and users, can read but also define variables.
- Variable values are either random with a fixed prefix, or they are extracted from an http response.
- Variable value extraction happens using the same mechanism we use for checking/asserting responses
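The shared comparison-and-extraction mechanism described above can be sketched in a few lines of Python (a hypothetical helper, not actual Phester code): expected structures are matched as subsets of the actual response, primitives are compared for equality unless a regex marker is used, and the same traversal can pull values out into variables.

```python
import re

class Regex:
    """Stands in for a YAML tag such as !pcre in a test file (an assumption)."""
    def __init__(self, pattern):
        self.pattern = pattern

def matches(expected, actual):
    # Regex marker: the actual value only has to match the pattern.
    if isinstance(expected, Regex):
        return isinstance(actual, str) and re.search(expected.pattern, actual) is not None
    # Dicts: every expected key must be present and match; extra actual keys are ignored.
    if isinstance(expected, dict):
        return (isinstance(actual, dict)
                and all(k in actual and matches(v, actual[k]) for k, v in expected.items()))
    if isinstance(expected, list):
        return (isinstance(actual, list) and len(expected) == len(actual)
                and all(matches(e, a) for e, a in zip(expected, actual)))
    # Primitives: verbatim equality.
    return expected == actual

def extract(path, actual):
    """Pull a value out of a response for use as a variable, e.g. 'query/general/case'."""
    value = actual
    for key in path.split("/"):
        value = value[int(key)] if isinstance(value, list) else value[key]
    return value

response = {"query": {"general": {"generator": "MediaWiki 1.34", "case": "first-letter"}}}
assert matches({"query": {"general": {"case": "first-letter",
                                      "generator": Regex(r"^MediaWiki ")}}}, response)
```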
Epic 2: Baseline implementation of Phester test runner
T221088: Baseline implementation of Phester test runner
- Run the test suites in sequence, according to the order given on the command line
- Execute the requests within each suite in sequence.
- The runner can be invoked from the command line
- Required input: the base URL of a MediaWiki instance
- Required input: one or more test description files.
- The test runner executes each test case against the given API and reports any failure of the actual results to match the expected results
- human readable plain text output
- Support regular expression based value matching
- Test definitions are declarative (YAML)
Rationale for using declarative test definition and YAML:
- test definition not bound to a specific programming language (PHP, JS, python)
- keeps tests simple and "honest", with well defined input and output, no hidden state, and no loops or conditionals
- no binding to additional tools or libraries, tests stay "pure and simple"
- Easy to parse and process, and thus to port away from, use as a front-end for something else, or analyze and evaluate.
- YAML is JSON compatible. JSON payloads can just be copied in.
- Creating a good DSL is hard, evolving a DSL is harder. YAML may be a bit ugly, but it's generic and flexible.
- The test runner should be implemented in PHP. Rationale: it is intended to run in a development environment used to write PHP code. Also, we may want to pull it into the MediaWiki core project via Composer at some point.
- Use the Guzzle library for making HTTP requests
- The test runner should not depend on MediaWiki core code.
- The test runner should not hard code any knowledge about the MediaWiki action API, and should be designed to be usable for testing other APIs, such as RESTbase.
- The test runner MUST ask for confirmation that it is acceptable for any information on the target system to be damaged or lost (unless --force is specified)
- No cleanup (tear-down) is performed between tests.
Epic 3: Pick a test runner (our own, or an existing one)
Story: Try writing basic tests
T225614:Create an initial set of API integration tests
The tests below should work with the MVP, so they must not require variables; hence, no login.
For MediaWiki, relevant stories to test are:
- anonymous page creation and editing, verify change in content (verifying access to old revisions requires variable support) via API:Edit
- re-parse of dependent pages (red links turning blue, missing templates getting used after being created) via API:Edit and then fetching the rendered page from the article path (index.php?title=xyz)
- page history with edit summary, size diff (testing minor edits and user names requires variables for login) see API:Revisions
- recent changes with edit summary, size diff, etc API:RecentChanges
- renaming/moving a page (basic) via API:Move
- pre-save transform (PST) (via API:Edit)
- template transclusion via API:Parsing wikitext
- parser functions via API:Parsing wikitext (some)
- magic words via API:Parsing wikitext (some)
- diffs (use relative revision ids) via API:Compare
- fetching different kinds of links / reverse links (see API:Query#Page_types)
- listing category contents (see API:Query#Page_types)
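For illustration, the first story above (anonymous page creation and editing via API:Edit) might look roughly like this in a declarative runner. The API parameters follow the documented action API, where `+\` is the anonymous CSRF token; the file layout is again a hypothetical sketch:

```yaml
# Hypothetical sketch: anonymous page creation and editing via API:Edit.
suite: anonymous-editing
tests:
  - description: an anonymous user can create a page and read it back
    interaction:
      - request:
          method: POST
          path: api.php
          form:
            action: edit
            title: Phester test page
            text: Hello, world!
            token: +\     # the fixed anonymous edit token, so no login is needed
            format: json
        response:
          body:
            edit:
              result: Success
      - request:
          method: GET
          path: api.php
          parameters:
            action: parse
            page: Phester test page
            format: json
        response:
          body:
            parse:
              text: !pcre "/Hello, world!/"
```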
In addition, it would be useful to see one or two basic tests for Kask and RESTbase.
It would be nice to see what these tests look like in our own runner (Phester) and in some of the other candidates, like Tavern, Behat, Codeception, or Dredd. It's not necessary to write all the tests for all the different systems, though.
Story: Try writing tests using variables
T228001: Create a set of API integration tests using variables
Without support for variables in phester, these tests have to be written "dry", with no way to execute them.
- watchlist (watch/unwatch/auto-watch)
- changing preferences
- bot edits (interaction of bot permission and bot parameter)
- diffs with fixed revision IDs (test special case for last and first revision)
- patrolling, patrol log
- listing user contributions
- listing users
- page deletion/undeletion (effectiveness)
- page protection (effectiveness, levels)
- user blocking (effectiveness, various options)
- MediaWiki namespace restrictions
- user script protection (can only edit own)
- site script protection (needs special permission)
- minor edits
- remaining core parser functions
- remaining core magic words (in particular REVISIONID and friends)
- Pre-save transform (PST), signatures, subst templates, subst REVISIONUSER.
- Special pages transclusion
- newtalk notifications
- media file uploads (need to be enabled on the wiki) (needs file upload support in phester)
- site stats (may need support for arithmetic functions)
It would be nice to see what these tests look like in our own runner (Phester) and in some of the other candidates, like Tavern, Behat, Codeception, or Dredd.
Story: Decide whether to invest in Phester, and decide what to use instead
T222100: Decide whether creating Phester is actually worthwhile
Decision matrix: https://docs.google.com/spreadsheets/d/1G50XPisubSRttq4QhakSij8RDF5TBAxrJBwZ7xdBZG0
Decision made: we will use SuperTest. DoR: Using SuperTest Rather Than Creating an API Integration Test Runner.
- execution model
- ease of running locally
- ease of running in CI
- test language
- ease of editing
- ease of migration
- scope/purpose fit
- cost to modify/maintain
- control over development
- license model
- license sympathy
- recursive body matches
- regex matches
- variable injection
- global fixtures
- JSON output
- scan for test files
- filter tests by tag
- parallel execution
Epic 4: Add any missing functionality to the selected runner
Phabricator Task: T227999: Extended implementation of test runner
| # | Requirement | Description | Priority |
|---|-------------|-------------|----------|
| 1 | CI integration | Test runner integrates with CI infrastructure at low cost | Should Have |
| 2 | IDE integration | IDE integration, including autocompletion and validation | Should Have |
| 3 | FLOSS | Test runner must be distributed under an open source license | Should Have |
| 4 | Documentation and support | Test runner should be well-documented and well-supported | Should Have |
| 5 | Familiar language | Tests are written in a well-documented, familiar language | Nice To Have |
| 6 | Run from CLI | Tests should be runnable from the command line | Must Have |
| 7 | Avoid vendor lock-in | Reduce vendor lock-in | Should Have |
| 8 | Functional fit | Test runner should be a good fit, in scope and purpose, for running API tests over HTTP | Should Have |
| 9 | Maintenance burden | Minimize the amount of code we have to maintain | Should Have |
- Developer of the functionality being tested
- Test author
- CI engineer
- Operations engineer
- Tester (anybody running tests)
| # | User Story Title | User Story Description | Priority | Notes/Status |
|---|------------------|------------------------|----------|--------------|
| 1 | Run Tests Locally | As a developer, I want to be able to quickly run tests against the functionality I'm writing | Must Have | |
| 2 | Monitoring | As an operations engineer, I would like to be able to use the test runner in production to test whether services are alive | Optional | |
| 3 | Parallel Testing | As a CI engineer, I want multiple tests to run in parallel | Should Have | |
| 4 | Run All Tests | As a CI engineer, I want to be able to easily run all of the core tests | Must Have | Resolved |
| 5 | Machine-Readable Test Output | As a CI engineer, I want machine-readable output from my test framework | Must Have | Resolved |
| 6 | Resources | As a test author, I want an easy way to create resources | Must Have | |
| 7 | Fixtures | As a test author, I want an easy way to create fixtures that are used across tests | Must Have | |
| 8 | Chaining | As a test author, I want a way to use parts of a response in a subsequent request, or when validating responses | Must Have | |
| 9 | Multi-Site Tests | As a test author, I need to be able to run tests that span multiple sites | Must Have | |
| 10 | Parameterized Tests | As a test author, I would like to be able to run tests with multiple sets of parameters | Should Have | |
| 11 | Validate Test Responses | As a test author, I want to be able to validate responses | Must Have | |
| 12 | Multiple Agents | As a test author, I want to write tests that combine requests made using different sessions | Must Have | |
| 13 | Control HTTP Requests | As a test author, I need to be able to control all parts of an HTTP request | Must Have | |
| 14 | Unique Fake Values | As a test author, I need a way to create unique, fake values | Must Have | Resolved |
| 15 | Redirect Support | As a test author, I need the framework to support redirects | Must Have | Resolved |
| 16 | Configuration | As a test author, I want a way to access configuration values | Must Have | |
| 17 | Known State | As a tester, I want to know that the wiki is empty before I begin to run tests | Must Have | Sprint 1 |
| 18 | Example Tests | As a test author, I want a set of example tests to base my tests upon | Must Have | |
| 19 | Action API Test Utilities | As a test author, I want to have a set of utilities to help me test against the action API | Must Have | |
| 20 | Minimal Testing Environment | As a test author, I want to be able to set up a local environment to write and run tests | Must Have | |
| 21 | Run Extension Tests | As a tester, I want to be able to run all of the core and extension tests | Must Have | |
Epic 5: Test runner in production
Story: Implement initial set of tests
Implement the tests defined during the evaluation phase (Epic 3) for the actual runner. If we go with Phester, the tests should not need to change at all, or should need only minor tweaks. If we go with a different framework, the experimental tests need to be ported to that framework.
Story: Deploy test runner
Story: Document test runner
Create documentation that allows others to create tests and run them.
Possible follow-up initiatives
Create a containerized version of the test runner for testing MediaWiki
Make API tests part of CI gateway tests
Write a comprehensive suite of tests covering core actions that modify the database
- any API module that returns true from needsToken()
Write a comprehensive suite of tests covering core query actions
- any API module extending ApiQueryModule
Build out the test suite to cover extensions deployed by WMF
Time and Resource Estimates
- Estimated Start Date
- Actual Start Date
- Estimated Completion Date
- Actual Completion Date
- Resource Estimates
- Specifying test runner: 5 hours over two weeks, plus 5 hours of other people's time for review and discussion.
- Implementing test runner: 20 hours over two weeks, plus 10 hours of other people's time for review and discussion. May need additional time to decide on technology choices.
- Creating a docker environment to run tests against: 10 hours over two weeks. May need additional time to learn more about docker.
- Writing tests for cross-module stories and all actions that modify the database: 100 to 200 hours. Most of this work is trivial, but some of it is rather involved. May be slow going at the beginning, while we figure out the fixtures we need.
- Core Platform
- Release Engineering
- What testing framework shall be used for the end-to-end tests?
- Should we use Selenium?
- Other Documents