Translatable modules/Proposed solutions

The Translatable modules project is trying to build a new framework for module localization.

Several storage solutions are proposed here for discussion.

One of the main goals of this consultation is to decide which of these solutions to implement and recommend to all module developers.

Translatable page

Description

Use something similar to Module:ModuleMsg on Meta, but standardized:

  • Put all the messages on a regular wikitext page marked for translation.
  • Have standard Lua functions to load them (rather than a module like Module:ModuleMsg on each wiki).

Advantages

  • Marking pages for translation is familiar to translation administrators.
  • Can also work for templates.
  • Every translation is a wiki page. It’s good for credit, history, separate processing, etc.

Disadvantages

  • By default, translation unit markers are numbers. Using numbers as message keys makes code unreadable. It’s possible to replace the numbers with strings, but, as noted above, the way to do it in the Translate extension is not currently documented well. This can probably be addressed by proper documentation and standardization of this Translate extension feature.
  • It’s unclear how parameters ($1, etc.) and other wiki syntax i18n features (GENDER, PLURAL, etc.) will work. They are not necessarily compatible with wikitext content pages.
  • This may work for module localization within one wiki, but will not necessarily work when modules become global.
  • A solution will be needed for looking up pages for translation. Currently, the message group selector just shows all the translatable pages.
  • In the global modules and templates age, it’s unclear how it will be locally customized on wikis.
  • Possible performance issues if each message translation is loaded individually, and in handling of fallbacks. There is the messagecollection Action API but possibly no nice Lua or Wikitext API for loading the translations.
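
The performance concern in the last point can be illustrated with a minimal Python sketch. It assumes, hypothetically, that all translations of a message group have already been loaded in one batch into a dictionary keyed by language code, and that a fallback chain is known for each language. The names FALLBACKS, TRANSLATIONS and resolve, and all sample data, are invented for illustration and are not part of Translate or Scribunto.

```python
# Hypothetical data: in practice this would come from one batched load,
# not from one page fetch per message and per language.
FALLBACKS = {
    "pt-br": ["pt", "en"],
    "pt": ["en"],
    "en": [],
}

TRANSLATIONS = {
    "en": {"greeting": "Hello", "farewell": "Goodbye"},
    "pt": {"greeting": "Olá"},
    "pt-br": {},
}


def resolve(key, lang, translations=TRANSLATIONS, fallbacks=FALLBACKS):
    """Return the first available translation along the fallback chain."""
    for code in [lang] + fallbacks.get(lang, []):
        text = translations.get(code, {}).get(key)
        if text is not None:
            return text
    return None  # the caller decides how to display a missing message
```

With the data loaded once, each lookup is a few dictionary accesses. If every step of the chain were a separate page load instead, a single missing translation could cost several loads.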

JSON .tab file in the Data namespace on Commons

Description

Use something similar to what Module:TNT is doing, but formalized:

  • Store all the source messages in a JSON file in the Data namespace on Commons in the “banana” format, like in MediaWiki extensions, including qqq for documentation.
  • The same syntax can be used as for core and extension messages.
  • Enhance the Translate extension to load the source messages and write the translations to JSON files by language.
  • Add Lua functions to the standard Scribunto library to load the messages.
  • Use another JSON file to organize the translatable file for convenient display in Translate’s message group selector.
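
As a rough illustration of the “banana” format and of what a loading function might do, here is a minimal Python sketch. The message keys, texts and function names are hypothetical, and the parameter handling covers only $1-style placeholders, not PLURAL or GENDER. A real implementation would live in Scribunto and would differ.

```python
import json
import re

# Hypothetical source file in the "banana" format; keys and texts are
# invented for illustration.
EN_JSON = """
{
    "@metadata": {"authors": ["Example"]},
    "infobox-population": "Population: $1",
    "infobox-area": "Area: $1 km²"
}
"""

# The matching documentation file uses the pseudo-language code qqq.
QQQ_JSON = """
{
    "@metadata": {"authors": ["Example"]},
    "infobox-population": "Label for the population row. $1 is a number.",
    "infobox-area": "Label for the area row. $1 is the area in square kilometres."
}
"""


def load_messages(raw):
    """Parse a banana file and drop the @metadata entry."""
    data = json.loads(raw)
    return {key: text for key, text in data.items() if not key.startswith("@")}


def render(messages, key, *params):
    """Replace $1, $2, ... with positional parameters (no PLURAL or GENDER)."""
    def substitute(match):
        index = int(match.group(1)) - 1
        # Leave the placeholder untouched when no parameter was supplied.
        return str(params[index]) if 0 <= index < len(params) else match.group(0)
    return re.sub(r"\$(\d+)", substitute, messages[key])
```

For example, render(load_messages(EN_JSON), "infobox-population", 1200) substitutes 1200 for $1 in the English source message.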

Advantages

  • The file format is the same as in extensions.
  • These pages are already globally accessible from modules.
  • The raw pages are easily available to JavaScript gadgets, which can convert the JSON into a native object containing all translations if desired. An API might be needed to retrieve a single language or its fallback, which would reduce network traffic. JavaScript gadgets prefer a single query for all messages, with the result stored in the client cache.

Disadvantages

  • New file format support (FFS) will have to be developed and maintained in the Translate extension. A new type of MessageGroup, as well as MessageLoading, will be needed.
  • In the global modules and templates age, it’s unclear how it will be locally customized on wikis.

JSON file in an MCR slot

Description

Similar to “JSON .tab file in the Data namespace”, but:

  • Store all the source messages in a JSON file in the “banana” format, like in MediaWiki extensions, including qqq for documentation.
    • (Need to decide whether to store all the languages in one JSON structure, or a slot per language.)
  • The JSON file is stored as an MCR slot with the wiki page that stores the module’s code.
  • The same syntax can be used as for core and extension messages.
  • Enhance the Translate extension to load the source messages and write the translations to JSON files by language.
  • Add Lua functions to the standard Scribunto library to load the messages.
  • Use another JSON file to organize the translatable file for convenient display in Translate’s message group selector.

Advantages

  • Stored elegantly with the module.
  • If the module becomes global, the data becomes global with it.
  • Creating MCR slots may require some privileges, but that’s probably OK because creating new message files is not for total newbies anyway, while editing is still accessible to most editors.

Disadvantages

  • Will require some development to create slot support.
  • A new FFS will have to be developed and maintained in Translate. We will need a new type of MessageGroup, as well as MessageLoading.
  • In the global modules and templates age, it’s unclear how it will be locally customized on wikis.
  • A common disadvantage of the MCR slot approach: permissions and history are mixed together. The code has the same protection level as the translations, so every translator is permitted to modify the effective code. There is no separate history of the changes to the code; they are drowned among translation edits. If protection and history were separated, these would be individual pages rather than MCR slots.

TemplateData

Description

Similar to the JSON proposals above, but with the JSON stored inside TemplateData:

  • Store all the source messages as a JSON value in TemplateData associated with a template that uses the module. Other than being part of a larger JSON structure, the format is the same as the “banana” format used in MediaWiki extensions, including qqq for documentation.
  • The same syntax can be used as for core and extension messages.
  • Enhance the Translate extension to load the source messages and write the translations to JSON files by language.
  • Add Lua functions to the standard Scribunto library to load TemplateData and the messages.
  • Use another JSON file to organize the translatable file for convenient display in Translate’s message group selector.

Advantages

  • Continuity with existing TemplateData technology.
    • In particular, TemplateData already has some support for internationalization, e.g. template description can be in several languages.
  • Keys can be managed through the TemplateData editor (this will require updates to the editor UI, however).
  • The technology can be later shared with templates.
  • If TemplateData ever moves to MCR (task T56140), it will move there, too.

Disadvantages

  • There is a hard-coded limit of 64 KiB (gzipped) in the TemplateData extension’s code. While this is enough room for something like 700 messages in one language, we have about 400 languages to manage. At the rate at which MediaWiki core messages are localized, that leaves room for only 20 messages.
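
To check whether a given set of messages and translations would fit, one could measure its compressed size. The following Python sketch assumes that the limit applies to the gzip-compressed JSON text, as stated above; the exact serialization that TemplateData measures may differ, and the function names are hypothetical.

```python
import gzip
import json

LIMIT_BYTES = 64 * 1024  # the 64 KiB limit mentioned above


def gzipped_size(payload):
    """Size in bytes of the payload serialized as JSON and gzip-compressed."""
    raw = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    return len(gzip.compress(raw))


def fits_limit(payload, limit=LIMIT_BYTES):
    """True if the compressed payload is within the limit."""
    return gzipped_size(payload) <= limit
```

Running fits_limit on a dictionary that holds every language’s translations of a module’s messages gives a quick estimate of how many messages a module could afford under this approach.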
  • Requires adding TemplateData support to Scribunto (task T107119).
  • New file format support (FFS) will have to be developed and maintained in Translate.
  • In the global modules and templates age, it’s unclear how it will be locally customized on wikis.

Messages as pages in the MediaWiki space

Description

  • Store the translatable messages in the MediaWiki namespace, like core and extension messages.
  • Create message groups for Translate using a JSON or YAML file stored as a wiki page. This is already supported (WikiMessageGroup) as whitespace-separated lists; however, no mechanism is exposed for defining groups inside the wiki itself.
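
For orientation, the sketch below shows how message keys would map to page titles under this proposal. It assumes the usual convention for messages in the MediaWiki namespace, where the wiki’s content language lives at the bare title and other languages on subpages named by language code. The function name and the sample key are hypothetical.

```python
def message_page_title(key, lang, content_lang="en"):
    """Build the MediaWiki-namespace page title for a message key."""
    # Page titles start with an uppercase letter on most wikis.
    base = "MediaWiki:" + key[:1].upper() + key[1:]
    return base if lang == content_lang else f"{base}/{lang}"
```

Under these assumptions, the Hebrew translation of a message with the key infobox-population would be stored at MediaWiki:Infobox-population/he.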

Advantages

  • Mostly natural for Translate to process (but support for the message group organizer will probably have to be developed).
  • Mostly natural for Scribunto to process: functions for loading and parsing messages already exist.
  • Can be customized on local wikis when modules become global the same way that messages from core and extensions are customized.

Disadvantages

  • Double duty of listing messages as well as creating them separately.
  • Creating the messages will require sysop or edit-interface permissions, making comprehensive module development and bug fixing accessible to far fewer people.
  • Lack of packaging. Many distributed development teams will create and maintain packages consisting of a module, a global template accompanied by TemplateData, or a JavaScript gadget, and packages with similar purposes should not conflict in naming. Messages should be bundled per package.

Lua table

Description

Do it similarly to existing solutions in Module:I18n on Commons and Module:Wikidades/i18n on the Catalan Wikipedia, but:

  • Standardize the Lua table format: Decide whether it is one message key pointing to many translations indexed by language, or language codes pointing to many message keys, etc.
  • Add functions to the Scribunto standard library to load these messages.
  • Add support for reading and writing this file format to Translate.
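
The two candidate layouts mentioned in the first point carry the same information, so converting between them is mechanical. The sketch below models them as Python dictionaries rather than Lua tables, purely for illustration; the sample keys and the function name are hypothetical.

```python
# Layout A: one message key pointing to translations indexed by language.
BY_KEY = {
    "greeting": {"en": "Hello", "ca": "Hola"},
    "farewell": {"en": "Goodbye"},
}


def transpose(table):
    """Swap the two nesting levels: {a: {b: text}} becomes {b: {a: text}}."""
    result = {}
    for outer, inner in table.items():
        for inner_key, text in inner.items():
            result.setdefault(inner_key, {})[outer] = text
    return result


# Layout B: language codes pointing to message keys.
BY_LANGUAGE = transpose(BY_KEY)
```

Because the conversion is lossless in both directions, the choice between the layouts mainly affects how convenient the table is to edit by hand and how much of it must be loaded to render a page in a single language.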

Advantages

  • Natural for Lua.
  • Continuity with at least some existing solutions.

Disadvantages

  • This is actual code, which is error-prone and less safe. (We used to have messages in PHP arrays, and moved away from them.)
  • It’s natural for Lua, but what if Scribunto acquires support for other programming languages? There are recurring requests to support JavaScript, Python, Rexx, etc.
  • Language codes that have hyphens have to be written with square brackets and quotes (for example, ["pt-br"] rather than pt-br), which is non-obvious and error-prone.

Proposed solutions comparison table

| Feature | Translatable page | JSON .tab file in the Data namespace on Commons | JSON file in an MCR slot | TemplateData | Messages in the MediaWiki space | Lua table |
|---|---|---|---|---|---|---|
| Translate changes (see details in “Engineering considerations”) | Minor | Major | Major | Major | Minor | Major |
| Needs permission to edit source messages | To mark for translation | No | To create slots | No | Yes: sysop or edit-interface | No |
| Translate FFS | None | Major | Major | Major | None | Major |
| Customize on-wiki | Unclear | Unclear | Unclear | Unclear | Probably easy, but may have performance issues | Unclear |
| Similar to core and extensions | No | Very similar | Very similar | Mostly similar | Yes, but only for on-wiki editors | No |
| Readable message keys | Brittle, needs fixing in Translate | Yes | Yes | Yes | Yes | Yes |
| Importing and exporting | Not easy | Not easy | Easy | Easy | Not easy | Probably easy |
| Can also be used in templates on the same wiki | Directly | Through a module | Through a module | Through a module | Directly | Through a module |
| Handling of fuzzying | Probably already done | Needs non-trivial work | Needs non-trivial work | Needs non-trivial work | Probably already done | Needs non-trivial work |

More information
