Requests for comment/Shadow namespaces
This is a request for comments on implementing shadow namespaces: if a page does not exist on the local wiki, it is transparently fetched from a remote wiki.
Shadow namespaces | |
---|---|
Component | General |
Creation date | |
Author(s) | Legoktm, MZMcBride |
Document status | See Phabricator. |
For example, if Template:Hi does not exist on wiki A but exists on the linked wiki B, then adding {{Hi}} to a page on wiki A will show Template:Hi from wiki B.
This is just how InstantCommons and foreign file repos currently work: if File:Example.png does not exist on this wiki but exists on Wikimedia Commons, the Commons image is retrieved and used.
Background
Currently we have $wgEnableScaryTranscluding in MediaWiki core. The variable name is considered apt by some, although the feature is used without problems on some mid-sized wikis such as MITRE wikis and wiki.wikimedia.it (example).
Other methods for interwiki transclusion on content pages include, for Wikisource, Extension:DoubleWiki and InterWikiTransclusion.js.
MediaWiki also has a shallow concept of shadow namespaces via ForeignFileRepos. Currently, if you set up a foreign file repository, pages in the File namespace use the local version if it exists. If the local version does not exist, the foreign repo is queried. Foreign repos can be on the same wiki farm, accessed through database connections, or on remote wikis, accessed through the API.
Proposal
The work here would be to extend and improve the current shadow namespace implementation. Instead of applying only to the File namespace, shadow namespaces could be implemented for the User, Template, Module, and Help namespaces.
The term "global" here means across all MediaWiki instances, so any modern MediaWiki installation should be able to reference any other wiki (possibly requiring installation of extensions/setting up configuration). In addition, wiki-farms would be able to designate one of their wikis to be used as the central repository if they choose not to use the default one.
$wgEnableScaryTranscluding would be deprecated and/or removed from MediaWiki core.
Invocations (i.e., {{foo}}) would try the local version first before trying the foreign repo equivalent. Links (i.e., [[bar]]) would do the same, with appropriate coloring.
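The local-first lookup described above can be sketched as follows. This is a minimal illustration, not MediaWiki code: the `Resolution` type, the `resolve` function, the use of in-memory sets as stand-ins for the two wikis, and the `shadow` link class are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass(frozen=True)
class Resolution:
    title: str
    source: Optional[str]  # "local", "foreign", or None when the page is missing
    link_class: str        # hypothetical CSS class used to color the link


def resolve(title: str, local_pages: Set[str], foreign_pages: Set[str]) -> Resolution:
    """Try the local wiki first, then fall back to the foreign repo."""
    if title in local_pages:
        # A local page always wins, even if the foreign repo has the same title.
        return Resolution(title, "local", "")
    if title in foreign_pages:
        # "shadow" is an assumed class name for links served by the foreign repo.
        return Resolution(title, "foreign", "shadow")
    # Neither wiki has the page, so the link is rendered as missing.
    return Resolution(title, None, "new")


local = {"Template:Foo"}
foreign = {"Template:Foo", "Template:Hi"}
for name in ("Template:Foo", "Template:Hi", "Template:Nope"):
    print(resolve(name, local, foreign))
```

Keeping the source of the page in the result lets the caller decide how to color the link without a second lookup.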
First steps
First we would focus on the implementations that utilize "remote-parsing", where the text is parsed on the remote server, and the rendered HTML is displayed by the client wiki.
- Identify implementation differences between ForeignFileRepo and GlobalUserPage, and begin to reconcile them
- Switch ForeignFileRepo to use api.php?action=parse instead of index.php?action=render
- action=parse needs to support absolute URLs In progress
- Add top icons to GlobalUserPages
- Implement batch remote page existence lookups
- Start abstracting ForeignFileRepo remote-transcluding code into core to handle other namespaces
- Done Add WikiPage::isLocal()
- In progress Make content navigation URLs non-file specific ([1])
- Improve Title::isKnown() related code:
- GlobalUserPage has a hacky opt-out with <noinclude>/<includeonly> tags, which causes other problems and generally isn't great.
- Can we get rid of opt-out?
- Should we make opt-out more granular (some related discussion at phabricator:T90849)?
- How many users at Meta-Wiki are using opt-out? This search query shows about 1,597 results when logged in as an admin, but most of these pages are user subpages. If we exclude subpages, we're left with about 185 results. Some of these 185 pages only use "noinclude" incidentally, but even if we assumed all 185 were legitimate results, that's still a very small portion of users.
- Done gerrit:303912 switches to a __NOGLOBAL__ magic word as discussed during the IRC meeting.
- Need to announce deprecation of old opt-out method
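The "batch remote page existence lookups" item in the list above could look roughly like this sketch. It assumes the remote wiki exposes the standard Action API (`action=query` with `titles` and `formatversion=2`); the function names, the example URL, and the batch size of 50 titles per request are choices made for the illustration, and no network call is made here.

```python
from urllib.parse import urlencode


def chunked(items, size=50):
    """Yield successive batches so each request stays under the per-request title limit."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_existence_query(api_url, titles):
    """Build one api.php URL that asks about several titles at once."""
    params = {
        "action": "query",
        "format": "json",
        "formatversion": "2",
        "titles": "|".join(titles),  # multiple titles are pipe-separated
    }
    return api_url + "?" + urlencode(params)


def existing_titles(response):
    """Return the titles the remote wiki reports as existing.

    With formatversion=2, absent pages carry a "missing" flag.
    Note: the API may normalize titles, so returned titles can differ
    from the ones that were sent.
    """
    pages = response.get("query", {}).get("pages", [])
    return {p["title"] for p in pages if not p.get("missing") and not p.get("invalid")}


# Hypothetical response, shaped like a formatversion=2 reply.
sample = {"query": {"pages": [
    {"title": "Template:Hi", "pageid": 1},
    {"title": "Template:Foo", "missing": True},
]}}
print(build_existence_query("https://example.org/w/api.php", ["Template:Hi", "Template:Foo"]))
print(existing_titles(sample))
```

Batching matters because a single page can contain many links and transclusions, and one request per title would multiply the latency of every page render.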
Open questions
- Localization of templates and Scribunto/Lua modules
- Namespacing: should we have a "Global templates" namespace (i.e. gtemplate:), should it be transparent like InstantCommons, or both?
Applications
Shadow namespaces (global namespaces) are being used, or could be used, to address the following use cases:
- Global Scribunto/Lua modules; Module namespace; T41610
- Global templates; Template namespace; T6547
- Global user pages; User namespace; T16759 Done
- Global help pages; Help namespace; T14306
- Would allow sharing of global styles for T90914, T112991, and T105845 in general
Code quality
As previously discussed, reducing duplication is the only proven method to share best practices and increase the quality of all the elements above across the board (including on big wikis), in addition to making usage broader and cheaper (for instance on small wikis which currently lack some features).
Making gadgets global is the other process widely recognised to improve code quality, a goal that some feel strongly about.
Implementation considerations
Compatibility
MediaWiki installations using global scripts will not keep pace with Wikimedia Foundation deployments, so we need a method to continue to support older versions (or maybe just versions that are still supported with code/security updates?).
Licensing
Licensing is not a problem if the original page is under CC-BY-SA and linked from the transcluding page. Hence it is not a problem for Wikimedia at all.
- GFDL and CC-BY-SA are not recommended for software. Scribunto/Lua modules, JavaScript gadgets, CSS pages, and other content may be considered software.
Licensing might be a problem for some system administrators who misconfigure the shadow namespaces feature, for instance to include content from sources that specify a license with strict attribution requirements (e.g. GFDL) or a license incompatible with the target wiki (e.g. a NC license on a commercial wiki).
Search
When content starts living outside of the local wiki, interwiki search suddenly becomes a lot more important.
Recent changes and company
To implement T66474, the location where the namespace actually resides should be entirely transparent to the user.
The most important consideration is whether changes on the source wiki are reflected on the local wiki. Lack of such visibility into the transcluded content is usually considered a dealbreaker for anything content-related, at least on bigger Wikimedia wikis; see for instance how Wikidata change propagation was handled. T91192 requests a similar system for Wikimedia Commons.
Usage tracking
For media on Commons, we have GlobalUsage tracking. However, that is only Wikimedia-wide and captures no usage information about non-Wikimedia Foundation installations using InstantCommons. This means that Commons administrators have no idea whether a file is actually used or not when doing destructive operations like renaming (without leaving a redirect behind) or deleting.
Cache invalidation
When a new version of a file is uploaded on Commons, it will automatically update on InstantCommons sites since images are hotlinked.[citation needed] However, if a template is updated, it should cause HTMLCacheUpdate jobs to be queued so that all pages using it are updated.
See also: Mentorship programs/Possible projects#Build an interwiki notifications framework and implement it for InstantCommons (phab:T48525)
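One way to picture the link between usage tracking and cache invalidation is the sketch below. It is purely illustrative: the `UsageIndex` class and its methods are hypothetical, the index is held in memory rather than in a database, and the returned list only stands in for the HTMLCacheUpdate jobs mentioned above.

```python
from collections import defaultdict


class UsageIndex:
    """Hypothetical record of which local pages use which remote pages."""

    def __init__(self):
        # Maps a remote title to the set of local pages that transclude it.
        self._users = defaultdict(set)

    def record(self, remote_title, local_page):
        """Note that local_page transcludes remote_title."""
        self._users[remote_title].add(local_page)

    def pages_to_purge(self, remote_title):
        """Return the local pages whose cached HTML is stale after a remote edit."""
        # .get() avoids creating an empty entry for untracked titles.
        return sorted(self._users.get(remote_title, ()))


index = UsageIndex()
index.record("Template:Hi", "Main Page")
index.record("Template:Hi", "Help:Contents")
index.record("Module:Util", "Main Page")

# When the remote wiki reports that Template:Hi changed, one refresh job
# would be queued for each page returned here.
print(index.pages_to_purge("Template:Hi"))
```

The same index would also answer the usage-tracking question from the other direction, provided the client wiki reports its usage back to the source wiki.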
Chaining
Should we support chaining to multiple foreign repos? Basically, assuming we follow the zero one infinity rule with foreign repos, are we talking about one or infinity?
For example, imagine a wiki with two foreign repos configured. "Template:Baz" does not exist locally, but does exist on one of the foreign repos. Would there be some kind of fallback or order of precedence?
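If chaining were supported, one possible answer to the precedence question is "first configured repo wins". The sketch below shows only that option and is not a decision: `resolve_chain`, the repo names, and the use of sets as stand-ins for wikis are all assumptions for the example.

```python
from typing import Optional, Sequence, Set, Tuple


def resolve_chain(
    title: str,
    local_pages: Set[str],
    repos: Sequence[Tuple[str, Set[str]]],
) -> Optional[str]:
    """Return the name of the first source that has the page, or None.

    The local wiki is always checked first; foreign repos are then
    tried in the order in which they were configured.
    """
    if title in local_pages:
        return "local"
    for repo_name, repo_pages in repos:
        if title in repo_pages:
            return repo_name
    return None


# Two hypothetical foreign repos; "Template:Baz" exists on both.
repos = [
    ("farm-central", {"Template:Baz"}),
    ("global-default", {"Template:Baz", "Template:Qux"}),
]
print(resolve_chain("Template:Baz", set(), repos))
print(resolve_chain("Template:Qux", set(), repos))
print(resolve_chain("Template:Missing", set(), repos))
```

An ordered list keeps the behavior predictable, but it also means a page added later to a higher-precedence repo would silently replace the one previously shown.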