Requests for comment/Proper command-line runner for maintenance tasks

Request for comment (RFC)
Proper command-line runner for maintenance tasks
Component General
Creation date
Author(s) Ori Livneh, Legoktm
Document status implemented
See Phabricator.

Background

MediaWiki maintenance scripts are used for a wide range of tasks, but writing them is tedious, and finding the right one to use is difficult.

Logs from 2015 IRC discussion.

Problem

maintenance/ is a mess: it contains over 150 PHP scripts, which range from essential to obscure or obsolete, with no sign-posts to guide the user to the one they need or to facilitate discovery. Were the README and online documentation comprehensive and current, they would still not be an adequate substitute for a good command-line interface.

It's also a problem on a technical level. Because each maintenance script is its own entry point, every script has to copy in the same boilerplate:

$IP = getenv( 'MW_INSTALL_PATH' );
if ( $IP === false ) {
        $IP = __DIR__ . '/../../..';
}
require_once "$IP/maintenance/Maintenance.php";

Extension scripts also have to check whether the extension is even installed ($this->requireExtension(...)), since wiki farms will often have extensions checked out, but not necessarily enabled on the specific wiki.

Proposal

This RFC proposes to introduce a top-level command-line entry point to MediaWiki which would provide access to individual maintenance tasks via subcommands. If the user does not specify a subcommand, the most commonly used maintenance scripts should be listed, with appropriate in-line help, along with a tip on how to access detailed help for a particular subcommand.

This proposal can be implemented simply and without breaking backward compatibility. The maintenance/ file hierarchy will stay, and users will continue to be able to execute existing maintenance scripts directly. (This guarantee should not extend to future maintenance tasks, as a way of gradually driving users to adopt the top-level entry point.) Eventually, the old entry points will be removed.

The scaffolding for this design is the existing Maintenance class hierarchy, which already does most of the heavy lifting we need.

Structure

The classes that power maintenance scripts will be moved out of the entry points, probably into includes/Maintenance, and probably gradually. The entry points will continue to function through a back-compat shim, but will eventually emit deprecation notices.

We'll extract a smaller interface out of Maintenance that doesn't include the option parsing and initialization logic, provisionally named MaintenanceTask. Existing classes will extend it, and should require minimal modification to work.
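As a rough illustration of what such a slimmed-down base class might look like, here is a minimal sketch. Only the name MaintenanceTask comes from this RFC; the method names (getDescription, execute, setOptions, getOption) and the ShowGreeting example task are assumptions made for illustration, not a settled design.

```php
<?php
// Hypothetical sketch: only the name MaintenanceTask comes from the RFC.
// Every method name below is an assumption made for illustration.
abstract class MaintenanceTask {
	/** @var array Options already parsed by the runner. */
	protected array $options = [];

	/** One-line description that a runner could show in its listing. */
	abstract public function getDescription(): string;

	/** Do the actual work; return true on success. */
	abstract public function execute(): bool;

	/** The runner, not the task, parses the command line and hands over the result. */
	public function setOptions( array $options ): void {
		$this->options = $options;
	}

	protected function getOption( string $name, $default = null ) {
		return $this->options[$name] ?? $default;
	}
}

// Toy task: note that it contains no bootstrap boilerplate and no option parsing.
class ShowGreeting extends MaintenanceTask {
	public function getDescription(): string {
		return 'Print a greeting (toy example)';
	}

	public function execute(): bool {
		echo 'Hello, ' . $this->getOption( 'name', 'world' ) . "\n";
		return true;
	}
}

$task = new ShowGreeting();
$task->setOptions( [ 'name' => 'wiki' ] );
$task->execute();
```

The point of the split is that a task only declares what it does; locating MediaWiki, parsing arguments, and deciding the exit code would all live in the runner.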

fatalError will throw an exception instead of immediately quitting, so that a parent script that runs other scripts (e.g. update.php) can tell whether each one succeeded. The script runner will catch that exception and emulate the old fatalError behavior (print to stderr and exit).
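The throw-and-catch arrangement described above could look roughly like this. The names MaintenanceFatalError, DemoTask and runTask() are hypothetical, and the runner here returns the exit code rather than calling exit() so that the sketch can be exercised in isolation.

```php
<?php
// Hypothetical names throughout: the RFC only says that fatalError should
// throw and that the runner should catch and emulate the old behavior.
class MaintenanceFatalError extends RuntimeException {
}

class DemoTask {
	/** Throws instead of calling exit(), so a parent task can react. */
	protected function fatalError( string $msg, int $exitCode = 1 ): void {
		throw new MaintenanceFatalError( $msg, $exitCode );
	}

	public function execute( bool $fail ): void {
		if ( $fail ) {
			$this->fatalError( 'Something went wrong' );
		}
	}
}

/**
 * Runner side: print the message to the given stream and report an exit code.
 * A real entry point would pass STDERR and hand the result to exit().
 */
function runTask( DemoTask $task, bool $fail, $stderr ): int {
	try {
		$task->execute( $fail );
		return 0;
	} catch ( MaintenanceFatalError $e ) {
		fwrite( $stderr, $e->getMessage() . "\n" );
		return $e->getCode();
	}
}
```

A top-level script would end with something like exit( runTask( $task, $fail, STDERR ) ), while a parent task could catch the same exception itself and carry on with its remaining steps.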

Registry

We'll maintain a registry of maintenance scripts.

I'm thinking that scripts can be classified into two types based on audience: sysadmins and developers. For example, createAndPromote.php is for sysadmins to use, while checkLess.php (verifies LESS syntax) is really intended for developers. By default, when you ask the runner for the list of scripts, it'll provide just the sysadmin ones, and there will be a command-line flag so you can ask for the developer ones as well. We could probably add a basic search over script descriptions.
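To make the audience split and the description search concrete, here is a small sketch. The array layout, the function names scriptsFor() and searchScripts(), and the description strings are all assumptions; the descriptions are placeholders rather than the scripts' real help text.

```php
<?php
// Hypothetical sketch: the array layout and function names are assumptions,
// and the descriptions are placeholders, not the scripts' real help text.
$scripts = [
	'createAndPromote' => [
		'audience' => 'sysadmin',
		'description' => 'Create a user account and promote it',
	],
	'checkLess' => [
		'audience' => 'developer',
		'description' => 'Verify LESS syntax',
	],
];

/** Names of scripts aimed at the given audience; sysadmin is the default view. */
function scriptsFor( array $scripts, string $audience = 'sysadmin' ): array {
	return array_keys( array_filter(
		$scripts,
		static function ( array $info ) use ( $audience ) {
			return $info['audience'] === $audience;
		}
	) );
}

/** Case-insensitive substring search over descriptions, across both audiences. */
function searchScripts( array $scripts, string $term ): array {
	return array_keys( array_filter(
		$scripts,
		static function ( array $info ) use ( $term ) {
			return stripos( $info['description'], $term ) !== false;
		}
	) );
}

print_r( scriptsFor( $scripts ) );
print_r( searchScripts( $scripts, 'less' ) );
```

A plain substring match is probably enough at this scale; anything fancier can wait until someone actually needs it.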

For core, the registry will be some static arrays in a class, and for extensions it'll be in extension.json. It'll map the human-readable name to the class name. For example, 'mysql' => 'MysqlMaintenance'.
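One possible way to combine the core map with extension-provided entries is sketched below. buildRegistry() and the shape of $extensionScripts are assumptions: how extension.json would expose its entries is not decided here, so the sketch just takes them as a plain array keyed by extension name. The 'SendMessages' class name is likewise illustrative.

```php
<?php
// Hypothetical sketch: buildRegistry() and the shape of $extensionScripts are
// assumptions, not a decided format for extension.json.
function buildRegistry( array $coreScripts, array $extensionScripts ): array {
	$registry = $coreScripts;
	foreach ( $extensionScripts as $extension => $scripts ) {
		foreach ( $scripts as $name => $class ) {
			// Prefix with the extension name so extension scripts cannot
			// collide with core ones, e.g. "MassMessage:sendMessages".
			$registry[$extension . ':' . $name] = $class;
		}
	}
	return $registry;
}

$registry = buildRegistry(
	[ 'mysql' => 'MysqlMaintenance' ],
	// Illustrative class name; the real one may differ.
	[ 'MassMessage' => [ 'sendMessages' => 'SendMessages' ] ]
);
print_r( $registry );
```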

Entry point

It'll be called maintenance.php in the root directory. If you pass it no arguments, it'll output a list of sysadmin scripts that can be run. Adding something like --type=developer will have it output developer ones instead.

Syntax: php maintenance.php update (for update.php), php maintenance.php MassMessage:sendMessages (for extension scripts). There will also need to be a way to run scripts that are not in the registry, e.g. for local hacks.
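Splitting the subcommand into an optional extension prefix and a script name is straightforward; a sketch follows. parseScriptName() is a hypothetical helper, since the RFC only gives the two example invocations.

```php
<?php
// Hypothetical helper: parseScriptName() is not part of the RFC.
function parseScriptName( string $arg ): array {
	// Split on the first colon only, so the script part may itself contain one.
	$parts = explode( ':', $arg, 2 );
	if ( count( $parts ) === 2 ) {
		return [ 'extension' => $parts[0], 'script' => $parts[1] ];
	}
	// No prefix means a core script.
	return [ 'extension' => null, 'script' => $arg ];
}

print_r( parseScriptName( 'update' ) );
print_r( parseScriptName( 'MassMessage:sendMessages' ) );
```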

To allow people to use locally hacked/modified maintenance scripts (a common use case in Wikimedia production), there will be an --extra-include=~/foo.php argument. The named file will be included before the core maintenance class is autoloaded, so the locally included copy should shadow it. If --extra-include is provided, the script name can also be a class name (verified with instanceof MaintenanceTask), for one-off scripts that aren't in the registry already.
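The resolution order this implies could be sketched as follows. resolveTask() and its $registry argument are assumptions, and MaintenanceTask is stubbed so the example stands alone. Because the extra file is loaded before any class lookup, a class it defines is already present and the autoloader is never asked for the core copy.

```php
<?php
// Hypothetical sketch: resolveTask() and the $registry argument are assumptions.
// MaintenanceTask is stubbed here so the example is self-contained.
abstract class MaintenanceTask {
	abstract public function execute(): void;
}

function resolveTask( string $name, array $registry, ?string $extraInclude ): MaintenanceTask {
	if ( $extraInclude !== null ) {
		// Load the local file first so its classes win over autoloaded ones.
		require_once $extraInclude;
	}
	// Registry names take priority; a bare class name is only accepted
	// when --extra-include was given.
	$class = $registry[$name] ?? ( $extraInclude !== null ? $name : null );
	if ( $class === null || !class_exists( $class ) ) {
		throw new InvalidArgumentException( "Unknown maintenance script: $name" );
	}
	$task = new $class();
	if ( !$task instanceof MaintenanceTask ) {
		throw new InvalidArgumentException( "$class is not a MaintenanceTask" );
	}
	return $task;
}
```

Restricting bare class names to runs that pass --extra-include keeps the normal path limited to registered scripts, which seems like the safer default.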