Interwiki cache/Setup for your own wiki

Setting up interwiki links on your own wiki

Since MediaWiki release 1.19, the Wikimedia projects do not use the interwiki table but rely instead on a cdb file which contains information about the way links to external projects work. This means that you can't just download the interwiki.sql.gz file from download.wikimedia.org for a given wiki and import it into your database to make interwiki links work.

For impatient readers

If you want links to external projects from your own wiki to work like they do on Wikimedia projects, download and run the script. It will retrieve the interwiki cdb file in use on the Wikimedia projects and update it for use with your wiki. It's alpha code, beware.

All the grody details

What's in the interwiki cdb file

A cdb file is a flat file database format containing key/value pairs.

The interwiki cdb file has the following types of key/value pairs in order to handle various sorts of links:

  • __global:wikiabbrev
    Some of these are used for 'absolute' interwiki links, where the wiki is available in only one language and site type, and the abbrev points to the same web site every time. You can add as many arbitrary external sites as you like by adding entries like these to the cdb file.
    Example entry from Wikimedia:
    key    __global:devmo
    value  0 https://developer.mozilla.org/en/docs/$1
  • The rest are used for interwiki links where the wiki is available in multiple languages. We choose one language as the reference point (usually en) so that interwiki links of the form wikt:el from fr.wikipedia (for example) will work by bouncing the user first to en.wiktionary.org and from there to el.wiktionary.org. (CHECK ME is that really how the link forwarding works?)
    If you have new multilingual site types, you should add a corresponding entry here. If there is some other language that each site type is guaranteed to have, rather than English, you need to change the url appropriately. And in any case, if you want the links to point to a wikifarm on your domain, you'll need to update the urls accordingly.
    Example entry from Wikimedia:
    key    __global:wiktionary
    value  1 //en.wiktionary.org/wiki/$1
  • _sitetype:langcode
    These map site type and language code (or other prefix) to the corresponding url. If you add new languages or site types you'll want to add entries here.
    Example entry from Wikimedia:
    key    _wiktionary:aa
    value  1 http://aa.wiktionary.org/wiki/$1
  • __sites:fullprojectname
    In these entries, 'fullprojectname' means the name of the wiki database, typically the langcode concatenated with the site type. These are used to map wiki database names to site types. If you add a new site you will need to have an entry here, mapping the wiki db name to the site type. Good defaults are wiki ('wikipedia' type), or wikimedia.
    Example entries from Wikimedia:
    key    __sites:guwikibooks
    value  wikibooks

    key    __sites:wikimaniateamwiki
    value  wiki
  • Note that *all* Wikimedia project wiki db names end in the site type, with one exception, wikidata, and you don't want to look too closely at that. If you don't follow this model, other things may be more annoying for you and require workarounds.
  • fullprojectname:iwabbrev
    These are used to map an abbreviation on a given project to a different site type in the same language. So for example the abbreviation 'q' when given on el.wikipedia should lead to el.wikiquote, which would be achieved by an appropriate entry here. If you add new site types you'll need entries here, and if you have a new fullprojectname, you need entries here for each known abbreviation. As of this writing, known Wikimedia abbreviations are w, wikt, q, b, n, s, v, chapter, voy
    Example entries from Wikimedia:
    key    aawiki:n
    value  1 http://aa.wikinews.org/wiki/$1

    key    liquidthreads_labswikimedia:q
    value  1 http://liquidthreads-labs.wikiquote.org/wiki/$1
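As a rough illustration of how the link-type values above can be read, the sketch below splits a value into its leading flag and its URL pattern, then substitutes a page title for $1. The function names are made up for this example. This page does not explain the leading 0/1, so it is treated as an opaque flag. Real MediaWiki also URL-encodes the title, which is skipped here. The __sites values carry no flag and are not handled.

```python
from typing import Tuple


def parse_link_value(value: str) -> Tuple[str, str]:
    """Split a value such as '1 //en.wiktionary.org/wiki/$1' into (flag, url_pattern)."""
    flag, url_pattern = value.split(" ", 1)
    return flag, url_pattern


def expand(url_pattern: str, title: str) -> str:
    """Substitute the page title for the $1 placeholder (no URL encoding in this sketch)."""
    return url_pattern.replace("$1", title)


flag, pattern = parse_link_value("0 https://developer.mozilla.org/en/docs/$1")
print(flag, expand(pattern, "HTML"))
```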

The keys which start with __list are used by getAllPrefixesCached, which at present is used only to retrieve the interwiki map when the MediaWiki API is queried for interwikimap site info. Example query: http://www.mediawiki.org/w/api.php?action=query&meta=siteinfo&siprop=interwikimap

  • __list:__global should contain all xxx for which there is an entry with key __global:xxx
  • __list:_wiktionary should contain all languagecodes for which there is an entry with key _wiktionary:langcode (and so on for the other site types)
  • __list:__sites should contain all fullprojectnames xxx for which there is an entry with key __sites:xxx
  • __list:fullprojectname should contain all abbreviations xxx for which there is an entry with key fullprojectname:xxx
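The __list rules above can be sketched as a small helper that derives every __list entry from the other keys. This is an illustration only: the function name is hypothetical, and the space-separated, alphabetically sorted value format is an assumption based on the example __list value shown later on this page.

```python
from typing import Dict, Set


def build_list_entries(entries: Dict[str, str]) -> Dict[str, str]:
    """Derive '__list:<scope>' entries from keys of the form '<scope>:<name>'."""
    groups: Dict[str, Set[str]] = {}
    for key in entries:
        if key.startswith("__list:"):
            continue  # ignore stale list entries, they are rebuilt below
        scope, sep, name = key.partition(":")
        if not sep:
            continue  # not a '<scope>:<name>' key
        groups.setdefault(scope, set()).add(name)
    return {
        "__list:" + scope: " ".join(sorted(names))
        for scope, names in groups.items()
    }


sample = {
    "__global:devmo": "0 https://developer.mozilla.org/en/docs/$1",
    "_wiktionary:aa": "1 http://aa.wiktionary.org/wiki/$1",
    "__sites:guwikibooks": "wikibooks",
    "aawiki:n": "1 http://aa.wikinews.org/wiki/$1",
}
for key, value in sorted(build_list_entries(sample).items()):
    print(key, "=>", value)
```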

Expanding an interwiki link

When we want to expand a piece of wikitext that might be an interwiki link, how does it work?

This depends on the value of the global $wgInterwikiScopes which has a default value of 3 and can be overridden in your wiki's LocalSettings.php file.

  • $wgInterwikiScopes = 1:
    There is no lookup in the interwiki cache cdb file at all.
  • $wgInterwikiScopes = 2:
    Check for the key __global:wikiabbrev and if it exists, use the corresponding value
  • $wgInterwikiScopes = 3:
    Check for the key __sites:fullprojectname in order to get the site type (is the current wiki a wikipedia, a wikiquote, etc). If that does not exist, we will use the value of $wgInterwikiFallbackSite, which by default is 'wiki', i.e. site type wikipedia.
    Check for the entry _<sitetype>:langcode where sitetype is the value we just retrieved. If there is no entry, we fall back to the $wgInterwikiScopes = 2 behaviour and try that.
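The lookup order listed above can be modelled with a plain dictionary standing in for the cdb file. This sketch follows the steps exactly as this page describes them and does not try to reproduce MediaWiki's actual implementation. The function and parameter names are hypothetical, and the __sites:aawiktionary entry in the sample data is made up for the example.

```python
from typing import Dict, Optional


def lookup(cache: Dict[str, str], prefix: str, wiki_id: str,
           scopes: int = 3, fallback_site: str = "wiki") -> Optional[str]:
    """Return the cache value for an interwiki prefix, or None if nothing matches."""
    if scopes <= 1:
        return None  # scope 1: the cache file is not consulted
    if scopes >= 3:
        # Find the site type of the current wiki, then try '_<sitetype>:<prefix>'.
        site_type = cache.get("__sites:" + wiki_id, fallback_site)
        value = cache.get("_" + site_type + ":" + prefix)
        if value is not None:
            return value
    # Scope 2, also the fallback for scope 3: global prefixes only.
    return cache.get("__global:" + prefix)


cache = {
    "__sites:aawiktionary": "wiktionary",  # hypothetical entry
    "_wiktionary:aa": "1 http://aa.wiktionary.org/wiki/$1",
    "__global:devmo": "0 https://developer.mozilla.org/en/docs/$1",
}
print(lookup(cache, "aa", "aawiktionary"))
print(lookup(cache, "devmo", "aawiktionary"))
```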

Setting up interwiki.cdb for your wiki

Adding entries to the cdb file by hand

If you are setting up a mirror of en wikipedia with wikidbname enwiki:

  • steal a copy of our interwikicache.cdb from [here]
  • copy it into cache/interwiki.cdb under the root of your MediaWiki installation
  • add $wgInterwikiCache = "$IP/cache/interwiki.cdb"; to your LocalSettings.php config file
  • You are done. All wikilinks will 'just work', as your wiki database name will be parsed into language code en and wiki type wiki, (i.e. wikipedia), both of which are fully specified in the cdb file already. Since interwiki links only affect links leading off of your wiki, you need to change nothing.

If you are setting up a mirror of en wikipedia with wikidbname enwiki and table name prefix (for example) mw_:

  • steal a copy of our interwikicache.cdb from [here]
  • add entries:
        key   enwiki-mw_:w
        value  1 http://en.wikipedia.org/wiki/$1
        key   enwiki-mw_:wikt
        value  1 http://en.wiktionary.org/wiki/$1
        key   enwiki-mw_:q
        value  1 http://en.wikiquote.org/wiki/$1
        key   enwiki-mw_:b
        value  1 http://en.wikibooks.org/wiki/$1
        key   enwiki-mw_:d
        value  1 http://en.wikidata.org/wiki/$1
        key   enwiki-mw_:n
        value  1 http://en.wikinews.org/wiki/$1
        key   enwiki-mw_:s
        value  1 http://en.wikisource.org/wiki/$1
        key   enwiki-mw_:v
        value  1 http://en.wikiversity.org/wiki/$1
        key   enwiki-mw_:voy
        value  1 http://en.wikivoyage.org/wiki/$1
        key   enwiki-mw_:chapter
        value  1 http://en.wikimedia.org/wiki/$1

        key    __sites:enwiki-mw_
        value  wiki

        key    __list:enwiki-mw_
        value  b d chapter n q s v voy w wikt

    and update the entry for the key __list:__sites so that it includes enwiki-mw_
  • copy it into cache/interwiki.cdb under the root of your MediaWiki installation
  • add $wgInterwikiCache = "$IP/cache/interwiki.cdb"; to your LocalSettings.php config file
  • You are done. Interwiki links will 'just work', since the entries you added tell MediaWiki that your wiki (enwiki-mw_) has site type wiki (i.e. wikipedia) and where each abbreviation should lead. Since interwiki links only affect links leading off of your wiki, you need to change nothing else.
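Typing the per-wiki entries by hand is error-prone, so here is a minimal sketch that generates them for a given wiki ID. The helper name and the abbreviation table are assumptions made for this example, built from the entries listed above. It assumes the same wiki ID string is used in every key. The order of abbreviations in the __list value is simply alphabetical here. It does not touch __list:__sites, which still has to be updated separately.

```python
from typing import Dict

# Abbreviation -> second-level domain, taken from the example entries above.
ABBREVIATIONS = {
    "w": "wikipedia", "wikt": "wiktionary", "q": "wikiquote", "b": "wikibooks",
    "d": "wikidata", "n": "wikinews", "s": "wikisource", "v": "wikiversity",
    "voy": "wikivoyage", "chapter": "wikimedia",
}


def entries_for_wiki(wiki_id: str, lang: str = "en",
                     site_type: str = "wiki") -> Dict[str, str]:
    """Build the '<wiki_id>:<abbrev>', '__sites:' and '__list:' entries for one wiki."""
    entries = {
        wiki_id + ":" + abbrev: "1 http://%s.%s.org/wiki/$1" % (lang, domain)
        for abbrev, domain in ABBREVIATIONS.items()
    }
    entries["__sites:" + wiki_id] = site_type
    entries["__list:" + wiki_id] = " ".join(sorted(ABBREVIATIONS))
    return entries


for key, value in entries_for_wiki("enwiki-mw_").items():
    print(key, "=>", value)
```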

If you are setting up a site which is not a wikipedia but you want to have interwiki links to all of the Wikimedia projects, follow the instructions above for enwiki with mw_ db table prefix, substituting in the name of your wiki db for enwiki-mw_ everywhere.

If you are setting up a site which is not a wikipedia and you have the db prefix xxxx_, follow the above instructions for enwiki with mw_ prefix but substitute in yourdbname-xxxx_ for enwiki-mw_ everywhere.

If you are setting up a mirror of several wikipedias with wikidbnames enwiki, frwiki, etc (and no special db table prefix):

  • steal a copy of our interwikicache.cdb from [here]
  • change the entry
        key    __global:wiki
        value  1 //en.wikipedia.org/wiki/$1
    to point to your domain
  • change the entry
        key    _wiki:en
        value  1 http://en.wikipedia.org/wiki/$1
    to point to your domain, and do the same for each language you are setting up
  • copy the cdb file into cache/interwiki.cdb under the root of your MediaWiki installation
  • add $wgInterwikiCache = "$IP/cache/interwiki.cdb"; to your LocalSettings.php config file
  • You are done. You have modified interwiki links between the wikis in your farm to point to your domain, the rest will lead offsite as they should, and no other modifications are necessary.
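The two "change the entry" steps above amount to rewriting the host part of a handful of values. A minimal sketch, assuming your farm keeps the same langcode.domain layout as Wikimedia: the function name and the example.org domain are placeholders, not anything provided by MediaWiki.

```python
from typing import Dict, Iterable


def retarget_farm(entries: Dict[str, str], languages: Iterable[str],
                  old_domain: str = "wikipedia.org",
                  new_domain: str = "example.org") -> Dict[str, str]:
    """Return a copy of entries with __global:wiki and _wiki:<lang> pointing at new_domain."""
    updated = dict(entries)
    keys = ["__global:wiki"] + ["_wiki:" + lang for lang in languages]
    for key in keys:
        if key not in updated:
            continue
        flag, url = updated[key].split(" ", 1)
        # Only the first occurrence of the domain is replaced, leaving the path alone.
        updated[key] = flag + " " + url.replace(old_domain, new_domain, 1)
    return updated


entries = {
    "__global:wiki": "1 //en.wikipedia.org/wiki/$1",
    "_wiki:en": "1 http://en.wikipedia.org/wiki/$1",
    "_wiki:fr": "1 http://fr.wikipedia.org/wiki/$1",
    "_wiki:de": "1 http://de.wikipedia.org/wiki/$1",
}
for key, value in retarget_farm(entries, ["en", "fr"]).items():
    print(key, "=>", value)
```

Languages you do not list (de in this example) keep pointing offsite.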

The easiest way to add entries to a cdb file is to use a set of command line cdb utilities (apt-get install freecdb for Ubuntu, yum install tinycdb for Fedora).

For freecdb, you can dump the existing cdb file to a flat text file using cdbdump < cache/interwiki.cdb > cache/interwiki.txt (cdbdump reads the cdb file from standard input). Then add entries in the format +nn,mm:keyname->value where nn is the length of the key and mm is the length of the value, both in bytes. Convert the text file back to cdb using cdbmake cache/interwiki-test.cdb cache/interwiki.cdb.tmp < cache/interwiki.txt, after which you can move the new cdb file into place and test it.

The commands for tinycdb are different but the procedure is the same: dump the cdb file into a flat text file, add entries of the above format and convert the text file back to cdb.
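Counting key and value lengths by hand is easy to get wrong, so the record format described above can be produced with a few lines of Python. This is a sketch only: the helper names are hypothetical, lengths are counted in UTF-8 bytes, and the blank line written at the end of the dump is what cdbmake expects as its end-of-input marker as far as I know, so check it against your cdb tools.

```python
from typing import Dict


def cdb_record(key: str, value: str) -> str:
    """Format one '+klen,vlen:key->value' line with lengths counted in bytes."""
    key_len = len(key.encode("utf-8"))
    value_len = len(value.encode("utf-8"))
    return "+%d,%d:%s->%s" % (key_len, value_len, key, value)


def cdb_text(entries: Dict[str, str]) -> str:
    """Join records one per line and finish with an empty line."""
    lines = [cdb_record(key, value) for key, value in entries.items()]
    return "\n".join(lines) + "\n\n"


print(cdb_record("__sites:enwiki-mw_", "wiki"))
```

The printed line can be appended to the dumped text file just before its final empty line.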

Use the script, Luke

There is a script [1] that can make this easier (tested only on Linux). It's designed for altering the Wikimedia interwikicache.cdb file for a single wiki. You will need to specify the site type, the db name including table prefix if any, the language code if any, or alternatively the path to your wiki's LocalSettings.php file. The script will do the rest, writing out a new cdb file with the desired entries, as needed. See the README file or run the script with the --help option for more information. It's alpha code, beware.