Manual:$wgCategoryCollation

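Category: $wgCategoryCollation
Which collation categories should use for sorting
Introduced in version:	1.17.0 (r72308)
Removed in version:	Still in use
Allowed values:	(string)
Default value:	`'uppercase'`
Other settings: Alphabetical | By function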

Details

The setting determines what collation^[1] algorithm should be used to sort category listings.

As an example, to use the Spanish collation, you'd write $wgCategoryCollation = 'uca-es'; in LocalSettings.php and then run updateCollation.php for your change to take effect.
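As a minimal sketch, the corresponding fragment of LocalSettings.php might look like the following. The script path in the comment is an assumption based on a standard installation layout, where maintenance scripts live in the maintenance/ directory; adjust it to your setup.

```php
// LocalSettings.php (fragment)

// Sort category listings with the Spanish tailoring of the Unicode
// collation algorithm. uca-* collations require the PHP intl extension.
$wgCategoryCollation = 'uca-es';

// Afterwards, recompute the stored sort keys from the command line,
// for example (assuming the standard maintenance/ directory layout):
//   php maintenance/updateCollation.php
```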

The following collations are currently supported:

Collation	MediaWiki version	Description
`uppercase`	Default	Make everything uppercase, then sort by the binary value of the string when stored as UTF-8. Essentially a case-insensitive sort by code point.
`numeric`	MW 1.28+	Same as `uppercase`, but with numeric sorting.
`identity`	MW 1.18+	Sort by the binary value of the string when stored as UTF-8 (without converting to uppercase). Essentially a sort by code point.
`uca-default`	MW 1.17+	Unicode collation algorithm: a more complex, much more multilingual-friendly category collation.
`uca-default-u-kn`	MW 1.28+	`uca-default` with numeric sorting.
`uca-<langcode>`	MW 1.21+	`uca-default` with language-specific adjustments. See below.
`uca-<langcode>-u-kn`	MW 1.28+	`uca-<langcode>` with numeric sorting.
`xx-uca-ckb`	MW 1.23+	Central Kurdish
`xx-uca-et`	MW 1.24-1.31 (removed in 1.32)	Estonian, but with W and V considered separate letters.
`xx-uca-fa`	MW 1.30-1.31 (removed in 1.32)	Persian
`uppercase-ab`	MW 1.31+	Abkhaz
`uppercase-ba`	MW 1.30+	Bashkir
`uppercase-se`	MW 1.31 (removed in 1.32)	Northern Sami
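To make the difference between `identity` and `uppercase` concrete, here is a small illustration in plain PHP. It does not use MediaWiki's collation classes; the sample titles are made up, and strtoupper() stands in for the Unicode-aware uppercasing a real wiki needs.

```php
<?php
// Illustration only: plain PHP, not MediaWiki's Collation classes.
$titles = [ 'banana', 'Apple', 'cherry', 'Zebra' ];

// "identity"-style ordering: compare the raw bytes of each title.
$identity = $titles;
usort( $identity, 'strcmp' );

// "uppercase"-style ordering: uppercase first, then compare bytes.
// strtoupper() is sufficient for this ASCII-only sample.
$uppercase = $titles;
usort( $uppercase, static function ( string $a, string $b ): int {
	return strcmp( strtoupper( $a ), strtoupper( $b ) );
} );

echo implode( ', ', $identity ), "\n";  // Apple, Zebra, banana, cherry
echo implode( ', ', $uppercase ), "\n"; // Apple, banana, cherry, Zebra
```

With byte-wise comparison, every uppercase ASCII letter sorts before every lowercase one, which is why "Zebra" lands ahead of "banana" in the first result.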

Since MediaWiki 1.18, extensions can add extra collations via the Collation::factory hook.
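The following is a rough sketch of how such a hook handler could be registered. It rests on several assumptions that should be checked against your MediaWiki version: that the hook passes the collation name and a by-reference collation object, and that the abstract Collation class requires getSortKey() and getFirstLetter(). The class name ExampleLowercaseCollation and the collation name x-example-lowercase are hypothetical.

```php
// Illustration only; normally this would live in an extension.

// Hypothetical collation that sorts titles by their lowercase form.
class ExampleLowercaseCollation extends Collation {
	public function getSortKey( $string ) {
		return mb_strtolower( $string, 'UTF-8' );
	}

	public function getFirstLetter( $string ) {
		// Heading shown above each group in the category listing.
		return mb_substr( mb_strtolower( $string, 'UTF-8' ), 0, 1, 'UTF-8' );
	}
}

// Assumed hook signature: collation name, then the object by reference.
$wgHooks['Collation::factory'][] = static function ( $collationName, &$collationObject ) {
	if ( $collationName === 'x-example-lowercase' ) {
		$collationObject = new ExampleLowercaseCollation();
	}
	return true;
};

$wgCategoryCollation = 'x-example-lowercase';
```

As with any collation change, updateCollation.php would still need to be run afterwards.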

The value is also stored inside the categorylinks table to determine which rows need updating when the collation algorithm changes.

Setup instructions

After changing this option, you must run updateCollation.php to recompute sort keys for all pages, or your categories will be sorted inconsistently.

Updating collations is slow and may take several hours on large wikis.

uca-default/uca-xx collations require the PHP intl extension.
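A quick way to check this requirement is a short standalone script such as the sketch below. Run it with the same PHP binary and configuration that serve the wiki, since the command-line and web server setups can differ. The variable name $hasIntl is just for illustration.

```php
<?php
// Standalone check: is the intl extension available to this PHP binary?
$hasIntl = extension_loaded( 'intl' );

if ( $hasIntl ) {
	// INTL_ICU_VERSION is defined by the intl extension itself.
	echo "intl is loaded; ICU version: ", INTL_ICU_VERSION, "\n";
} else {
	echo "intl is not loaded; uca-* collations will not work\n";
}
```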

If you are using Varnish, Squid or file cache, you may have to purge category pages after running updateCollation.php to see the results.

If you update or recompile your version of PHP, you must run updateCollation.php --force.

Language-specific collations

MediaWiki also supports many collations designed for specific languages.

These are based on uca-default, the Unicode collation algorithm (UCA) collation, and have the same requirements; they are named uca-<langcode>, where <langcode> is one of:

af, am, ar, as, ast, az, be, be-tarask, bg, bn, bn@collation=traditional, bo, br, bs, bs-Cyrl, ca, chr, co, cs, cy, da, de, de-AT@collation=phonebook, dsb, ee, el, en, eo, es, et, eu, fa, fi, fil, fo, fr, fr-CA, fur, fy, ga, gd, gl, gu, ha, haw, he, hi, hr, hsb, hu, hy, id, ig, is, it, ka, kk, kl, km, kn, kok, ku, ky, la, lb, lkt, ln, lo, lt, lv, mk, ml, mn, mo, mr, ms, mt, nb, ne, nl, nn, no, oc, om, or, pa, pl, pt, rm, ro, ru, rup, sco, se, si, sk, sl, smn, sq, sr, sr-Latn, sv, sv@collation=standard, sw, ta, te, th, tk, tl, to, tr, tt, uk, uz, vi, vo, yi, yo, zu

For example, to use a collation for Spanish, one would use the uca-es collation.

Using these collations provides both the correct sorting order for the given language and proper headings for the first letters of article titles. Earlier versions of MediaWiki may not support all of these language codes.

Getting new collations added

There are two parts to having a new language supported:

First, it must be supported by the International Components for Unicode (ICU) library (the list of language codes it supports is available at [1]).

Note, however, that Wikimedia's production servers do not use the latest version of the ICU library. As of 2016, they use version 52.1, which supports a significantly smaller set of languages.

Second, it must also be supported by MediaWiki itself. This basically requires listing the additional characters, or character groups, that are considered separate letters in the given language, in addition to the basic alphabet. The always up-to-date list of currently supported languages is available at includes/collation/IcuCollation.php.

It might also be the case that the default ICU ordering (the uca-default collation) orders the titles correctly but does not separate the letters correctly; in that case it can be used as a first step. Sometimes the letter ordering of a different, related language might fit yours, and a custom collation can sometimes be provided in such a case. One already exists for Sorani Kurdish / Central Kurdish (ckb), called xx-uca-ckb; see includes/collation/Collation.php.

Numeric sorting

Figure: Comparison between regular sorting (top) and numeric sorting (bottom)

Under numeric sorting, pages are sorted by numeric value: 1, 2, 9, 10, 11, 20, 21, 99, 100. Under regular (non-numeric) sorting, pages are sorted as text: 1, 10, 100, 11, 2, 20, 21, 9, 99.

If numeric sorting is used, all pages starting with a number are sorted together under a single header: "0–9". If regular sorting is used, pages starting with a number are sorted under separate headers for whichever digit each title begins with: "0", "1", "2", etc.

Note that numeric sorting only works for unbroken sequences of digits. Digits separated by commas, periods, or spaces are treated as separate numbers.

For more information about numeric sorting, see the Unicode Technical Standard #10. To test numeric sorting, see the ICU Collation Demo.
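The two orderings can be reproduced with a short, self-contained illustration in plain PHP. This uses PHP's built-in sort flags as a stand-in and is not how MediaWiki implements numeric collation, which relies on ICU for the uca-* variants.

```php
<?php
// Illustration only: PHP's built-in sort flags, not MediaWiki's collations.
$titles = [ '1', '2', '9', '10', '11', '20', '21', '99', '100' ];

// Regular sorting: compare titles as text, character by character.
$regular = $titles;
sort( $regular, SORT_STRING );

// Numeric-style sorting: compare runs of digits by their numeric value.
$numeric = $titles;
sort( $numeric, SORT_NATURAL );

echo implode( ', ', $regular ), "\n"; // 1, 10, 100, 11, 2, 20, 21, 9, 99
echo implode( ', ', $numeric ), "\n"; // 1, 2, 9, 10, 11, 20, 21, 99, 100
```

SORT_STRING is given explicitly because PHP's default sort flag would otherwise compare numeric strings as numbers.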

External links

ICU Collation Demo

References

[1] Collation refers to how data is sorted according to its set of characters, applying defined sort criteria (e.g. alphabetic or reverse sort, case-dependent or not, etc.)