API/Architecture work/i18n

With Gerrit change 160798 merged in November 2014, API help messages can be internationalized. This does require some work on the part of extensions; the changes shown here can serve as an example.

Help messages

Message details

The help messages for API modules are namespaced using the "module path", which is the string used for action=help's "modules" parameter. For modules added to $wgAPIModules this is the same as the key used in that array, while for modules added to $wgAPIPropModules, $wgAPIListModules, or $wgAPIMetaModules it is that key prefixed with "query+".
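
The path rule above can be sketched in Python as follows. This is an illustration only: the function name module_path and the way registries are labelled are assumptions made for this example, and MediaWiki itself derives the path in PHP.

```python
# Illustrative sketch only; names here are hypothetical, not MediaWiki API.
# Registries whose modules are submodules of action=query.
QUERY_REGISTRIES = {"wgAPIPropModules", "wgAPIListModules", "wgAPIMetaModules"}


def module_path(registry: str, key: str) -> str:
    """Return the help "module path" for a module registered under `key`."""
    if registry == "wgAPIModules":
        return key
    if registry in QUERY_REGISTRIES:
        return "query+" + key
    raise ValueError(f"unknown registry: {registry}")


print(module_path("wgAPIModules", "edit"))           # edit
print(module_path("wgAPIPropModules", "revisions"))  # query+revisions
```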

  • The description message, formerly returned by the getDescription() method, is apihelp-$path-description. This may be overridden by implementing the getDescriptionMessage() method, but cases where that is needed are rare.
  • The parameter description messages, formerly returned by the getParamDescription() method, are apihelp-$path-param-$name (where $name is the key from getAllowedParams()). This may be overridden by setting a value for ApiBase::PARAM_HELP_MSG in the data structure returned from getAllowedParams().
    • Parameters with a description similar to "When more results are available, use this to continue" should use api-help-param-continue instead of redefining a duplicate message.
    • Sorting parameters taking values "newer" and "older" (with related "start" and "end" parameters) should use api-help-param-direction instead of redefining a duplicate message.
    • Modules using CSRF tokens by implementing needsToken() do not need to document the token parameter; this is automatically handled by ApiBase.
    • Modules using ApiBase::PARAM_HELP_MSG_PER_VALUE should still set it to array(), but do not need to specify the message for each possible value.
    • Several additional constants are available for use in getAllowedParams(); see ApiBase for details.
  • All examples must have a descriptive text. The method to implement here is getExamplesMessages(); see the inline documentation for details. Message names should be along the lines of apihelp-$path-example-$arbitrarySuffix.
    • Note that unlike the previous getExamples() function, the query strings should not start with api.php?.
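
As a rough illustration of the naming scheme in the list above, the following Python sketch builds the message keys for a module and strips the api.php? prefix from an old-style example query. The helper names and the module "query+foo" with its parameter "bar" are hypothetical.

```python
# Illustrative helpers; the names are assumptions, not part of MediaWiki.
def description_key(path: str) -> str:
    return f"apihelp-{path}-description"


def param_key(path: str, name: str) -> str:
    return f"apihelp-{path}-param-{name}"


def example_key(path: str, suffix: str) -> str:
    return f"apihelp-{path}-example-{suffix}"


def strip_api_prefix(query: str) -> str:
    """Drop a leading "api.php?" as used by the old getExamples()."""
    prefix = "api.php?"
    return query[len(prefix):] if query.startswith(prefix) else query


path = "query+foo"  # hypothetical module path
print(description_key(path))      # apihelp-query+foo-description
print(param_key(path, "bar"))     # apihelp-query+foo-param-bar
print(example_key(path, "simple"))  # apihelp-query+foo-example-simple
print(strip_api_prefix("api.php?action=query&list=foo"))  # action=query&list=foo
```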

Message formatting

Description and parameter description messages should be grammatical sentences and end with a period. For parameters passed to the messages by default, see the templates listed under Message documentation.

Examples should be sentences that could logically end with a colon, but should have no trailing punctuation.[check this]

Semantic markup should be used:

  • <var> should be used for mention of parameter keys, and also reference to variables like $wgMiserMode.
  • <kbd> should be used for the possible values of parameters, and the mention of the input values in example docs.
  • <samp> should be used for mention of keys or values in the API output.
  • <code> should be used for anything else that's computer code, e.g. "the max-age header" or "the page <head>".
  • When semantic markup is used, additional quotation marks should not be used.
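
A small checker can catch the most mechanical of these conventions before messages are submitted. The sketch below is not an existing MediaWiki tool; the function name lint_message and the "kind" labels are invented for this example, and it only covers trailing punctuation and quotation marks placed directly around semantic markup.

```python
import re

# A double quote immediately outside a semantic tag, e.g. "<kbd>newer</kbd>".
_QUOTED_TAG = re.compile(r'"<(?:var|kbd|samp|code)>|</(?:var|kbd|samp|code)>"')
_TRAILING_PUNCTUATION = (".", ":", ";", ",", "!", "?")


def lint_message(kind: str, text: str) -> list:
    """Return a list of convention problems; kind is 'description', 'param' or 'example'."""
    problems = []
    stripped = text.rstrip()
    if kind in ("description", "param") and not stripped.endswith("."):
        problems.append("descriptions should end with a period")
    if kind == "example" and stripped.endswith(_TRAILING_PUNCTUATION):
        problems.append("examples should have no trailing punctuation")
    if _QUOTED_TAG.search(text):
        problems.append("no quotation marks around semantic markup")
    return problems


print(lint_message("param", "Use <var>limit</var> to cap results."))  # []
print(lint_message("example", "Fetch the last five revisions:"))
```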

If reference to other API modules is needed, pipe a link to Special:ApiHelp and the help formatter will do the right thing. For example, "[[Special:ApiHelp/query+tokens|action=query&meta=tokens]]" is used in the documentation for action=edit's token parameter, and properly renders as an in-page anchored link if both are on the same help page. Similarly, reference to MediaWiki configuration variables such as $wgMiserMode should link to the documentation on mediawiki.org.

Message documentation

When documenting the messages in qqq.json, the templates {{doc-apihelp-description}}, {{doc-apihelp-param}}, {{doc-apihelp-paramvalue}}, and {{doc-apihelp-example}} are recommended.
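
For illustration, the following Python sketch generates qqq.json entries for one module. The template arguments (module path, then parameter name) follow the pattern used by the conversion script on this page; the function name and the module "query+foo" are hypothetical, and {{doc-apihelp-paramvalue}} is left out because its arguments are not described here.

```python
import json


def qqq_entries(path: str, param_names: list, example_count: int) -> dict:
    """Build qqq.json documentation entries for one module (illustrative only)."""
    entries = {
        "apihelp-" + path + "-description": "{{doc-apihelp-description|" + path + "}}"
    }
    for name in param_names:
        key = "apihelp-" + path + "-param-" + name
        entries[key] = "{{doc-apihelp-param|" + path + "|" + name + "}}"
    for number in range(1, example_count + 1):
        key = "apihelp-" + path + "-example-" + str(number)
        entries[key] = "{{doc-apihelp-example|" + path + "}}"
    return entries


print(json.dumps(qqq_entries("query+foo", ["bar"], 1), indent="\t"))
```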

Conversion scripts

It can be helpful to extract the existing messages and examples as a starting point, rather than copying everything from scratch. Please contribute any scripts to help with this task here.

Perl, AnomieBOT

With an existing test wiki, this script using the AnomieBOT framework will extract the existing data from the API using action=paraminfo and create various files with the data properly formatted.

#!/usr/bin/perl -w

use utf8;
use strict;
use lib '/usr/local/src/AnomieBOT/bot';
use AnomieBOT::API;
use Data::Dumper;
use JSON;

my $modules = shift or die "USAGE: $0 modules";

my $api = AnomieBOT::API->new('/usr/local/src/AnomieBOT/local.ini', 1);
$api->login();
$api->DEBUG(-1);

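# Encode JSON to character strings rather than UTF-8 bytes: every output
# handle used below applies its own UTF-8 layer, so ->utf8 would double-encode.
binmode STDOUT, ':utf8';
my $j = JSON->new->allow_nonref;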

my $res = $api->query(
	action => 'paraminfo',
	helpformat => 'wikitext',
	modules => $modules
);
if ( $res->{'code'} ne 'success' ) {
	die 'Failed: '.$res->{'error'};
}
if ( exists( $res->{'warning'} ) ) {
	die Dumper( $res->{'warning'} );
}

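# /tmp/en and /tmp/qqq collect entries for en.json and qqq.json. Each entry
# is written with a leading ",\n" so it can be pasted after an existing entry.
open EN, '>:utf8', '/tmp/en' or die "en: $!\n";
open QQQ, '>:utf8', '/tmp/qqq' or die "qqq: $!\n";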
for my $module (sort { $a->{'path'} cmp $b->{'path'} } @{$res->{'paraminfo'}{'modules'}}) {
	my $path = $module->{'path'};
	printf qq(\t"apihelp-%s-description": %s,\n), $path, $j->encode( $module->{'description'} );
	printf EN qq(,\n\n\t"apihelp-%s-description": %s), $path, $j->encode( $module->{'description'} );
	printf QQQ qq(,\n\n\t"apihelp-%s-description": "{{doc-apihelp-description|%s}}"), $path, $path;
	for my $p (@{$module->{'parameters'}}) {
		next if $p->{'name'} eq 'token';
		printf qq(\t"apihelp-%s-param-%s": %s,\n), $path, $p->{'name'}, $j->encode( $p->{'description'} );
		printf EN qq(,\n\t"apihelp-%s-param-%s": %s), $path, $p->{'name'}, $j->encode( $p->{'description'} );
		printf QQQ qq(,\n\t"apihelp-%s-param-%s": "{{doc-apihelp-param|%s|%s}}"), $path, $p->{'name'}, $path, $p->{'name'};
	}

	open EX, '>:utf8', "/tmp/ex-$path" or die "ex-$path: $!\n";
	printf EX <<EX1 ;
	/**
	 * \@see ApiBase::getExamplesMessages()
	 */
	protected function getExamplesMessages() {
		return array(
EX1
	my $ct = 1;
	for my $e (@{$module->{'examples'}}) {
		printf qq(\t"apihelp-%s-example-%d": %s,\n), $path, $ct, $j->encode( $e->{'description'} );
		printf EN qq(,\n\t"apihelp-%s-example-%d": %s), $path, $ct, $j->encode( $e->{'description'} );
		printf QQQ qq(,\n\t"apihelp-%s-example-%d": "{{doc-apihelp-example|%s}}"), $path, $ct, $path;
		printf EX qq(\t\t\t'%s'\n\t\t\t\t=> 'apihelp-%s-example-%d',\n), $e->{'query'}, $path, $ct;
		$ct++;
	}
	printf EX <<EX2 ;
		);
	}
EX2
	close EX;

	print "\n";
}

close EN;
close QQQ;

print "http://localhost/w/api.php?action=help&modules=$modules\n";
print "http://localhost/w2/api.php?action=help&modules=$modules\n";

Warnings and errors

With Gerrit change 321406 merged in December 2016, warnings and errors can be internationalized too.

Pre-i18n status

On the MediaWiki side, errors and warnings are generally hard-coded strings in English that are passed as-is to existing methods such as setWarning() and dieUsage(). ApiBase also has a large number of keys (resembling existing i18n messages) with hard-coded mappings typically accessed via dieUsageMsg(), which may be used directly or may be used when other MediaWiki code attempts to return internationalized messages. When extension code does have an actual i18n message to report as an error, it might break convention by reporting it in the uselang language or it might force English, it might or might not use MediaWiki-namespace customizations, and it might process the message as ->plain(), ->text(), ->escaped(), or ->parse() depending on the whim of the original developer.

Clients see warnings as a single string per module, and errors as an associative array with a few properties. In JSON, a response with an error and three warnings might look something like this:

{
    "error": {
        "code": "a-short-string",
        "info": "Some text that is probably in English"
    },
    "warnings": {
        "main": "A warning from the 'main' module, probably in English",
        "foo": "A warning from the 'foo' query module\nAnother warning"
    },
    "*": "See https://www.mediawiki.org/w/api.php for API usage"
}

If the underlying code produced multiple errors, only one of them can be returned to the client in this format.
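
A client reading this older format has to split the per-module warning string itself. The Python sketch below does so for a response shaped exactly like the example above; the function name is an invention for this example, and real responses may nest the string differently depending on the output format, so treat the shape as an assumption.

```python
# Response shaped like the example above (already decoded from JSON).
LEGACY_RESPONSE = {
    "error": {
        "code": "a-short-string",
        "info": "Some text that is probably in English",
    },
    "warnings": {
        "main": "A warning from the 'main' module, probably in English",
        "foo": "A warning from the 'foo' query module\nAnother warning",
    },
}


def legacy_warnings(response: dict) -> list:
    """Return (module, text) pairs, one per newline-separated warning."""
    pairs = []
    for module, text in response.get("warnings", {}).items():
        for line in text.split("\n"):
            pairs.append((module, line))
    return pairs


for module, text in legacy_warnings(LEGACY_RESPONSE):
    print(f"[{module}] {text}")
```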

MediaWiki extension changes

All of the existing ApiBase methods that deal with errors or warnings as strings have been deprecated: dieUsage(), dieUsageMsg(), dieUsageMsgOrDebug(), getErrorFromStatus(), parseMsg(), and setWarning(). New ApiBase methods have been introduced to replace them.

  • Instead of dieUsage() or dieUsageMsg(), code should generally use dieWithError() now.
    • When dealing with an array of error messages as from Title::getUserPermissionsErrors(), the error array should be passed to the new errorArrayToStatus() method to turn it into a Status object, which should then be passed to the existing dieStatus() method.
  • Warnings should be reported with addWarning() instead of setWarning(). Warnings, like errors, now have codes.
    • There's also an addDeprecation() method that can replace the combination of setWarning() and logFeatureUsage().
  • Uses of $this->getErrorFromStatus() should be replaced with $this->getErrorFormatter()->arrayFromStatus(), which has been available since MediaWiki 1.25.

For checking user rights, a new method checkUserRightsAny() has been added to ApiBase that will die with an appropriate error message along the lines of "You don't have permission to edit pages", taking advantage of the existing "action-*" messages. For checking permissions on a title, checkTitleUserPermissions() has been added for the same purpose.

ApiBase::$messageMap is no longer public (and its format has changed, anyway). Anything that was using it should probably be changed to use an ApiMessage object. Use of parseMsg() can often be replaced with dieWithError() or ApiMessage::create().

The ApiCheckCanExecute hook's $message parameter should be set to an ApiMessage, as a string or array return value is now interpreted as an i18n message key rather than a key for ApiBase::dieUsageMsg(). Using an ApiMessage for this hook has been supported since MediaWiki 1.27.

The UsageException class is deprecated in favor of ApiUsageException.

  • ApiUsageException holds a StatusValue object with warnings and errors rather than just a single error. It also records the module that threw the exception.
  • For the time being ApiUsageException is a subclass of UsageException, so code that attempts to catch and handle UsageException should continue to function as it did before.
  • ApiUsageException::getMessage() returns the backwards-compatible English plain text, as described below.
    • This should allow unit tests using the @expectedExceptionMessage annotation to continue working, although the text tested for may need changing. Also for unit tests, there's a new ApiTestCase::apiExceptionHasCode() method to check the exception by code rather than by text.

Client changes

By default, errors and warnings will continue to be returned in English in the familiar structure. The i18n messages will be processed to wikitext in English without using local MediaWiki-namespace customizations, i.e. ->inLanguage( 'en' )->useDatabase( false )->text(). The message's wikitext is then converted to something like plain text by replacing common semantic tags (<var>, <kbd>, <samp>, and <code>) with double-quotes ("), removing all other HTML tags without concern for semantics, and replacing HTML entities with the corresponding characters.
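
The conversion described above can be approximated in a few lines of Python. This is a sketch of the three steps as stated in the text, not MediaWiki's actual implementation; the function name is hypothetical and tags with attributes are handled only by the generic tag removal.

```python
import html
import re

_SEMANTIC_TAG = re.compile(r"</?(?:var|kbd|samp|code)>")
_ANY_TAG = re.compile(r"<[^>]+>")


def to_plaintext(wikitext: str) -> str:
    """Approximate the backwards-compatible plain-text form of a message."""
    text = _SEMANTIC_TAG.sub('"', wikitext)  # semantic tags become double quotes
    text = _ANY_TAG.sub("", text)            # all other tags are dropped
    return html.unescape(text)               # entities become characters


print(to_plaintext("Use <var>limit</var> &amp; <b>retry</b>."))
# Use "limit" & retry.
```

Entities are decoded last so that an escaped tag such as &lt;head&gt; survives as literal text instead of being removed as markup.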

Despite this attempt to keep the output generally the same, the actual text of the messages will likely be different. In addition, clients might notice the following changes to API output:

  • Error codes will be different in some situations. Most notably, errors from query submodules will no longer be prefixed like parameters, e.g. prop=revisions will now return badcontinue rather than rvbadcontinue.
  • action=emailuser might return a "Warnings" status now, in case the sending of the email succeeded with warnings. Also, instead of an error message in a string property named message, errors and warnings will be returned in array properties named errors and warnings.
  • action=imagerotate will now return errors as an array errors rather than a string errormessage.
  • action=move will report errors when moving the talk page as an array talkmove-errors instead of strings talkmove-error-code and talkmove-error-info. Reporting of errors when moving subpages is similarly changed.
  • action=rollback will no longer return a messageHtml property.
  • action=upload will report non-fatal stash errors as an array stasherrors rather than a string stashfailed.

To receive internationalized warnings and errors, the client must specify the new errorformat parameter. The different values of this parameter control whether the errors and warnings are returned as "plaintext" (as described above), wikitext, HTML, or unprocessed message keys and parameters. When error i18n is enabled in this manner, errors and warnings will be returned in the uselang language and will not use MediaWiki-namespace customizations by default. The language may be overridden by using the new errorlang parameter, and MediaWiki-namespace customizations may be enabled by using the errorsuselocal parameter.
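
As a minimal sketch, a client could add these parameters to a request like this. The parameter names come from the text above; the literal values "wikitext" and "de", the helper name, and the use of "1" for the boolean flag are assumptions for this example and should be checked against action=help on the target wiki.

```python
from urllib.parse import urlencode


def build_query(params: dict, errorformat: str = "wikitext",
                errorlang: str = None, errorsuselocal: bool = False) -> str:
    """Return a query string that opts in to internationalized errors."""
    query = dict(params)
    query["errorformat"] = errorformat
    if errorlang is not None:
        query["errorlang"] = errorlang
    if errorsuselocal:
        # Boolean API parameters are treated as set when present.
        query["errorsuselocal"] = "1"
    return urlencode(query)


print(build_query({"action": "query", "meta": "tokens"}, errorlang="de"))
# action=query&meta=tokens&errorformat=wikitext&errorlang=de
```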

Enabling error i18n will also change the format of the errors and warnings in the response. The example above might now look like this:

{
    "errors": [
        {
            "code": "a-short-string",
            "text": "Some text that is no longer necessarily in English.",
            "module": "query+foo"
        },
        {
            "code": "another-short-string",
            "text": "A second error.",
            "module": "query+foo"
        }
    ],
    "warnings": [
        {
            "code": "warning-code",
            "text": "A warning from the <kbd>main</kbd> module, no longer necessarily in English.",
            "module": "main"
        },
        {
            "code": "another-warning-code",
            "text": "A warning from the <kbd>foo</kbd> query module.",
            "module": "query+foo"
        },
        {
            "code": "a-3rd-warning-code",
            "text": "Another warning.",
            "module": "query+foo"
        }
    ],
    "docref": "See https://www.mediawiki.org/w/api.php for API usage"
}
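
A client that has to work with both response shapes might normalize them to one structure. The Python sketch below assumes responses shaped like the two examples on this page; the function name is hypothetical, and the older format carries no warning codes and uses short module names rather than module paths, so those fields are filled in only as far as the data allows.

```python
def normalize(response: dict) -> tuple:
    """Return (errors, warnings) as lists of dicts with code, text and module keys."""
    if "errors" in response or isinstance(response.get("warnings"), list):
        # Structure used when errorformat is specified: already lists of dicts.
        return response.get("errors", []), response.get("warnings", [])

    errors = []
    if "error" in response:
        err = response["error"]
        errors.append({"code": err["code"], "text": err["info"], "module": None})

    warnings = [
        {"code": None, "text": line, "module": module}
        for module, text in response.get("warnings", {}).items()
        for line in text.split("\n")
    ]
    return errors, warnings


old = {
    "error": {"code": "a-short-string", "info": "Some text"},
    "warnings": {"foo": "First warning\nSecond warning"},
}
errors, warnings = normalize(old)
print(len(errors), len(warnings))  # 1 2
```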

The internationalization of errors and warnings will also carry over to things such as action=move's talkpage error reporting.

API output (uselang)

ApiBase has been a subclass of ContextSource since MediaWiki 1.18. Now, in 1.25, uselang is officially recognized by ApiMain. API modules should generally prefer to use $this or $this->getMain() as the IContextSource, and $this->msg() or $msg->setContext( $this ) when translating messages.

Note that, for backwards compatibility, the API defaults to the user language. uselang=content may be used in requests when the site's content language ($wgContLang) is needed.