Manual:Categories/Sorting

Draft. Some assumptions remain to be checked and the content needs to be reorganised into a manual.

Wiki pages listed in categories follow a sorting order that is determined by at least two factors:

  • The sortkey of the wiki page, which can be customised on a per-page basis. The default category sortkey is the page name without its namespace prefix.
  • The collation algorithm selected for categories in the wiki.

Background

Database structure

page_props

The default sortkey of a wiki page is stored in rows of the page_props table where pp_propname equals 'defaultsort'.

  • The pp_value field stores the value of the default sortkey (type: BLOB).
  • It is not to be confused with pp_sortkey, which may be null.
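
As a rough illustration of reading this value from PHP, the sketch below asks MediaWiki's PageProps service for the defaultsort property of a page. The helper name getDefaultSortKey() is hypothetical, and the Title import path assumes a recent MediaWiki version (older releases use the global Title class).

```php
use MediaWiki\MediaWikiServices;
use MediaWiki\Title\Title; // assumption: older releases use the global \Title class

/**
 * Hypothetical helper: return the default sortkey stored for a page,
 * or null if the page has no 'defaultsort' row in page_props.
 */
function getDefaultSortKey( Title $title ): ?string {
	// With a single property name, getProperties() returns [ pageId => value ].
	$props = MediaWikiServices::getInstance()
		->getPageProps()
		->getProperties( $title, 'defaultsort' );

	return $props[ $title->getArticleID() ] ?? null;
}
```

A null result means that no DEFAULTSORT was set, in which case the page name without its namespace prefix is used as described above.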

categorylinks

Both collation and sortkeys are stored in rows of the categorylinks table.

  • The cl_collation field contains the short name of the collation algorithm, e.g. identity or uca-default-u-kn.
  • The cl_sortkey field contains the sortkey, regardless of whether it has been customised, encoded so that its byte order follows the chosen collation.
  • The cl_sortkey_prefix field contains a (human-readable) string representation of the custom category sortkey if any is provided. If none is provided, it defaults to an empty string.

When a wiki page is a member of multiple categories, there is more than one row for that page in the table.
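
To make the row-per-category layout concrete, here is a minimal sketch that lists the sortkey columns for one page. The helper name getCategorySortkeyRows() is hypothetical, the connection provider call assumes a recent MediaWiki version, and the column names assume the schema described above, which may differ between releases.

```php
use MediaWiki\MediaWikiServices;

/**
 * Hypothetical helper: fetch the sortkey columns of every categorylinks
 * row belonging to the page with the given ID (one row per category).
 */
function getCategorySortkeyRows( int $pageId ): array {
	$dbr = MediaWikiServices::getInstance()
		->getConnectionProvider()
		->getReplicaDatabase();

	$res = $dbr->newSelectQueryBuilder()
		->select( [ 'cl_sortkey', 'cl_sortkey_prefix' ] )
		->from( 'categorylinks' )
		->where( [ 'cl_from' => $pageId ] )
		->caller( __FUNCTION__ )
		->fetchResultSet();

	$rows = [];
	foreach ( $res as $row ) {
		$rows[] = [
			// Human-readable custom sortkey, or '' if none was given
			'prefix' => $row->cl_sortkey_prefix,
			// Binary sortkey; not meant for display
			'sortkey' => $row->cl_sortkey,
		];
	}
	return $rows;
}
```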

Customisation

Collation

Collation refers to the rules by which strings are ordered for a particular set of characters. Because different languages, or mixes of languages, have different needs, they are best served by bespoke criteria. MediaWiki supports collations based on the Unicode Collation Algorithm (UCA), which allow pages to be sorted in a language-aware, alphanumeric order.

A site admin can select the collation algorithm appropriate to the wiki by setting $wgCategoryCollation. After any change to this setting, you must run updateCollation.php, which updates the cl_collation and cl_sortkey columns in the categorylinks table.
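
As a configuration sketch, the fragment below selects one of the collation names mentioned earlier. The chosen value and the exact command line are illustrative assumptions; check which collations and script entry points your MediaWiki version provides.

```php
// LocalSettings.php
// Illustrative value only; pick the collation that suits the wiki's language.
$wgCategoryCollation = 'uca-default-u-kn';

// Then regenerate the stored sortkeys from the wiki's root directory, e.g.:
//   php maintenance/updateCollation.php
// (assumption: newer releases may expect "php maintenance/run.php updateCollation")
```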

Sortkeys

As explained on the help page about categories, users can change the category sortkey of a page in two main ways.

1. Default sortkey

If the appropriate permissions are set in the site's configuration, users can change the default sortkey by passing a value to the DEFAULTSORT magic word, e.g. {{DEFAULTSORT:Sort name here}}. This magic word is defined in CoreParserFunctions.php and relies on ParserOutput::setPageProperty(). Using DEFAULTSORT changes:

  • the page_props table, as explained above.
  • the cl_sortkey column of the categorylinks table, unless overridden by a per-category sortkey (see below).

2. Per-category sortkeys

To change the sortkey for a specific category, users can add the desired value to the category tag, e.g. [[Category:Category name|Sort name here]]. This method changes two columns of the categorylinks table:

  • cl_sortkey (overriding the value represented by DEFAULTSORT if any)
  • cl_sortkey_prefix

Collation

The Collation class is an abstract class with many subclasses, each geared to a different algorithm. Its methods transform strings according to the criteria of the selected collation.

Example
use MediaWiki\MediaWikiServices;

// Request the wiki's category collation object from the service container:
$categoryCollation = MediaWikiServices::getInstance()->getCollationFactory()->getCategoryCollation();

// Convert a string to a sortkey (a binary string, not meant for display)
$name = "Wikipédia est un projet d’encyclopédie collective";
$sortkey = $categoryCollation->getSortKey( $name );

// Get the first letter, e.g. for listing a page under its appropriate heading, such as Dž rather than D in Croatian.
$firstLetter = $categoryCollation->getFirstLetter( $name );

For an example of how MediaWiki calls it to build category listings, see CategoryViewer.php.
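
Because sortkeys are designed to be compared byte by byte, a list of strings can be put in collation order by sorting on their sortkeys. The following is a minimal sketch of that idea; sortBySortkey() is a hypothetical helper, not part of MediaWiki, and it takes the sortkey function as a parameter so that it can be tried outside a wiki.

```php
/**
 * Hypothetical helper (not part of MediaWiki): order strings by the
 * byte-wise comparison of their sortkeys.
 *
 * @param string[] $names Strings to sort
 * @param callable $getSortKey Maps a string to its sortkey
 * @return string[] The strings, ordered by sortkey
 */
function sortBySortkey( array $names, callable $getSortKey ): array {
	// Compute each sortkey once and keep it next to the original string.
	$keyed = [];
	foreach ( $names as $name ) {
		$keyed[] = [ $getSortKey( $name ), $name ];
	}

	// Compare the sortkeys as plain byte strings.
	usort( $keyed, static function ( array $a, array $b ): int {
		return strcmp( $a[0], $b[0] );
	} );

	// Return only the original strings, now in sorted order.
	return array_column( $keyed, 1 );
}
```

Inside MediaWiki the callable could be [ $categoryCollation, 'getSortKey' ], using a collation object obtained as in the example above. Outside it, a stand-in such as 'strtoupper' gives a crude case-insensitive ordering for ASCII strings.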

Sortkey

To retrieve sortkeys, extensions can:


Hooks

@todo

Information can also be requested from the API. For instance, see