Language tools/Requirements/Indic language support
This page gives an overview of the parameters that are essential for proper support of a language in MediaWiki. It lists the requirements that should be fulfilled for each language in scope.
Requirements
Requirements consist of two parts:
- Languages in scope
- Support properties
Languages in scope
Current list (taken from en:Languages with official status in India[1]):
Language and community information
Language | ISO 639-3 code[2] | ISO 639-1 code[3] | MediaWiki code[4] | Alternate names[5] | Autonym | Speakers (in M)[6] | Language written? | Literacy rate of speakers | Wikimedia community[7] | Standard body | Other communities | Language contacts | Other sources |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Assamese | asm | as | as | Asambe, Asami, Asamiya | অসমীয়া | 17 | |||||||
Bengali | ben | bn | bn | Bānglā-Bhāshā, Bāngālā, Bānglā | বাংলা | 181 | 85% | ||||||
Bodo | brx | Bara, Bodi, Boro, Boroni, Kachari, Mech, Meche, Mechi, Meci | बोडो, Bodo, (Assamese script missing) | 1 | 61% | ||||||||
Chhattisgarhi | hne | Khaltahi, Laria | छत्तीसगढ़ी | 17 | |||||||||
Dogri | dgo | Dhogaryali, Dogari, Dogri Jammu, Dogri Pahari, Dogri-Kangri, Dongari, Hindi Dogri, Tokkaru | डोगरी or ڈوگرى | 4.7 | 18% | ||||||||
English | eng | en | en | Sekgoa, Anglit | English | 328 | en.wp | ||||||
French | fra | fr | fr | Français | Français | 68 | fr.wp | ||||||
Garo | grt | Garrow, Mande, Mandi | Mande, (Bengali missing) | 1 | 55%+ | ||||||||
Gujarati | guj | gu | gu | Gujarati | ગુજરાતી | 46 | 70% | gu.wp | |||||
Hindi | hin | hi | hi | Khadi Boli, Khari Boli | मानक हिन्दी | 181 | hi.wp | ||||||
Kannada | kan | kn | kn | Banglori, Canarese, Kanarese, Madrassi | ಕನ್ನಡ | 35 | 60% | kn.wp | |||||
Khasi | kha | Kahasi, Kassi, Khasa, Khashi, Khasiyas, Khuchia | Khasi | 1 | 63%+ | ||||||||
Konkani | knn | Bankoti, Concorinum, Cugani, Central Konkan, North Konkan, Konkan Standard, Konkanese, Konkani Mangalorean, Kunabi | कोंकणी | 4 | |||||||||
Kok Borok | trp | Kakbarak, Kokbarak, Tipura, Tripura, Tripuri, Usipi Mrung | Kok-borok, (Bengali missing) | 1 | 74%+ | ||||||||
Maithili | mai | Apabhramsa, Bihari, Maitili, Maitli, Methli, Tirahutia, Tirhuti, Tirhutia | मैथिली | 35 | 37% | ||||||||
Malayalam | mal | ml | ml | Alealum, Malayalani, Malayali, Malean, Maliyad, Mallealle, Mopla | മലയാളം | 36 | 100 | http://smc.org.in | Santhosh | ||||
Meitei | mni | mni | Kathe, Kathi, Manipuri, Meiteilon, Meiteiron, Meithe, Meithei, Menipuri, Mitei, Mithe, Ponna | ꯃꯤꯇꯩꯂꯣꯟ | 1 | 73% | Wp.mni.ꯋꯤꯀꯤꯄꯦꯗꯤꯌꯥ | Awangba | |||||
Marathi | mar | mr | mr | Maharashtra, Maharathi, Malhatee, Marthi, Muruthu | मराठी | 68 | 77% | mr.wp | |||||
Mizo | lus | Duhlian Twang, Dulien, Hualngo, Lukhai, Lusago, Lusai, Lusei, Lushai, Lushei, Sailau, Whelngo | Mizo | 1 | 82% | ||||||||
Nepali | nep | ne | ne | Eastern Pahari, Gorkhali, Gurkhali, Khaskura, Nepalese, Parbatiya | नेपाली | 14 | 65% | ne.wp (correct page for VP?) | |||||
Oriya | ori | or | or | Odri, Odrum, Oliya, Orissa, Uriya, Utkali, Vadiya, Yudhia | ଓଡ଼ିଆ | 32 | 64% | or.wp | |||||
Eastern Panjabi | pan | pa | pa | Gurmukhi, Gurumukhi, Punjabi | ਪੰਜਾਬੀ | 28 | |||||||
Malvi | mup | Malavi, Mallow, Malwada, Malwi, Ujjaini | (Devanagari script missing) | 10 | 58%+ | ||||||||
Sanskrit | san | sa | sa | — | संस्कृतम् | 0 | 80% | sa.wp | |||||
Santali | sat | Har, Hor, Samtali, Sandal, Sangtal, Santal, Santhali, Santhiali, Satar, Sentali, Sonthal | (Bengali script missing), (Devanagari script missing), Santhali, (Ol Chiki script missing), (Oriya script missing) | 6 | 20% | ||||||||
Sindhi | snd | sd | sd | Asambe, Asami, Asamiya | سنڌي सिन्धी | 21.3 | sd.wp | ||||||
Tamil | tam | ta | ta | Damulian, Tamal, Tamalsan, Tambul, Tamili | தமிழ் | 65 | ta.wp | http://www.thamizha.com | ta:User:Logicwiki | ||||
Telugu | tel | te | te | Andhra, Gentoo, Tailangi, Telangire, Telegu, Telgi, Tengu, Terangi, Tolangan | తెలుగు | 70 | te.wp (?) | ||||||
Urdu | urd | ur | ur | Bihari | اردو | 61 | ur.wp |
Language support status
Language | ISO 639-3 code[8] | ISO 639-1 code[9] | MediaWiki code[10] | ISO 15924 scripts | In Unicode 5.1? | In CLDR? | In glibc? | Trunk support? | Wikimedia support? | Active translators? | Most used translated? | Core 90%+ | Wikimedia 90%+ | Plural added? | Numerals added? | Date/time added? | Toolbar images? | Gender added? | Free fonts? | WebFonts? | Collection support? | Narayam mappings | Search working? | Search in Wikimedia? | Script conversion possible? | Conversion tables available? | Conversion implemented? | Supported in Kiwix? | Mobile support? |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Assamese | asm | as | as | Beng | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | No | No | Yes | No | No | No | Yes | Yes | Yes | No | |||||||
Bengali | ben | bn | bn | Beng | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | No | Yes | Yes | No | No | No | Yes | Yes | Yes | Yes | |||||||
Bodo | brx | No | Deva | Yes | Yes | No | No | No | No | No | No | No | No | No | No | No | No | No | |||||||||||
Chhattisgarhi | hne | No | Deva | Yes | Yes | No | No | No | No | No | No | No | No | No | No | No | No | ||||||||||||
Dogri | dgo | dgo | Deva, Arab | Yes | Done | Done | Done | Done | Done | Done | No | No | No | Done | No | No | No | Done | No | ||||||||||
English | eng | en | en | Latn | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | NA | Yes | Yes | |||||
French | fra | fr | fr | Latn | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | |||||||||
Garo | grt | No | Beng, Latn | No | No | No | No | No | No | No | No | No | No | No | No | No | |||||||||||||
Gujarati | guj | gu | gu | Gujr | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | No | Yes | Yes | No | No | No | Yes | Yes | Yes | No | |||||||
Hindi | hin | hi | hi | Deva | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | No | Yes | Yes | No | No | No | Yes | Yes | Yes | No | |||||||
Kannada | kan | kn | kn | Knda | Yes | Yes | Yes | No | No | No | No | No | Yes | No | No | No | Yes | Yes | Yes | No | |||||||||
Khasi | kha | No | Latn (modern texts), Beng (old texts) | No | No | No | No | No | No | No | No | No | No | No | No | ||||||||||||||
Konkani | knn | No | Deva, Knda, Mlym, Arab, Latn (depends on area) | Yes | No | No | No | No | No | No | No | Yes | Yes | No | No | No | |||||||||||||
Kok Borok | trp | No | Beng, Latn | No | No | No | No | No | No | No | No | No | No | No | No | ||||||||||||||
Maithili | mai | mai | Tirh, Kthi, Deva | Yes | Yes | Yes | Yes | Yes | No | No | No | No | No | No | No | ||||||||||||||
Malayalam | mal | ml | ml | Mlym | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | No | No | Yes | Yes | Yes | Yes | |||||||
Meitei | mni | mni | Mtei, Beng | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | No | Yes | No | No | No | Yes | Yes | Yes | Yes | Yes | Yes | ||||||
Marathi | mar | mr | mr | Deva (also "Modi", not encoded) | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | Yes | Yes | No | No | No | Yes | Yes | Yes | No | ||||||||
Mizo | lus | No | Latn | Yes | No | No | No | No | No | No | No | No | No | No | No | No | |||||||||||||
Nepali | nep | ne | ne | Deva. According to Wikipedia in older texts also Takr; and Bhujimol and Ranjana (not encoded) | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | Yes | Yes | No | No | No | Yes | Yes | ||||||||||
Oriya | ori | or | or | Orya | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | No | Yes | Yes | No | No | No | Yes | Yes | Yes | Yes | |||||||
Eastern Panjabi | pan | pa | pa | Guru, Deva | Yes | Yes | Yes | Yes | Yes | No | No | No | No | Yes | Yes | No | No | No | Yes | Yes | Yes | No | |||||||
Malvi | mup | No | Deva | No | No | No | No | No | No | No | No | No | No | No | No | ||||||||||||||
Sanskrit | san | sa | sa | Deva, Latn, others | Yes | Yes | Yes | Yes | Yes | No | No | No | Yes | Yes | Yes | No | |||||||||||||
Santali | sat | No | Olck | Yes | No | No | No | No | No | No | No | No | No | No | No | No | |||||||||||||
Tamil | tam | ta | ta | Taml | Yes | Yes | Yes | Yes | Yes | Yes | No | No | No | Yes | Yes | No | No | No | Yes | Yes | Yes | No | |||||||
Telugu | tel | te | te | Telu | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | No | Yes | Yes | No | No | No | Yes | Yes | Yes | No | |||||||
Urdu | urd | ur | ur | Arab | Yes | Yes | Yes | Yes | Yes | No | No | No | No | Yes | No | No | No | No | Yes | Yes | No |
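The "ISO 15924 scripts" column is meant to carry an RTL marker for right-to-left scripts. The following is a minimal sketch of how such a cell could be annotated, assuming a hand-maintained set of right-to-left script codes. The names `RTL_SCRIPTS` and `annotate_scripts` are hypothetical and not part of MediaWiki. Of the script codes used in the table above, only `Arab` is written right-to-left.

```python
# Hypothetical helper, not part of MediaWiki: marks right-to-left scripts
# in a comma-separated cell of ISO 15924 codes.

# Assumption: only the codes used in the table above are considered;
# of those, Arab is the only right-to-left script.
RTL_SCRIPTS = {"Arab"}


def annotate_scripts(cell: str) -> str:
    """Return the cell with ' RTL' appended to each right-to-left script code."""
    codes = [code.strip() for code in cell.split(",")]
    return ", ".join(f"{code} RTL" if code in RTL_SCRIPTS else code for code in codes)


print(annotate_scripts("Deva, Arab"))  # Deva, Arab RTL
print(annotate_scripts("Beng"))        # Beng
```

A set lookup keeps the sketch easy to extend if a language written in another right-to-left script comes into scope.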
Support properties
List to be discussed with the team. Current version created by Gerard, Niklas, Santhosh and Siebrand.
- Information to include in all overviews
  - What is the language name (in English)?
  - What is the ISO 639-3 code (with link to Wikipedia article and Ethnologue)?
  - What is the ISO 639-1 code (if any)?
  - What is the MediaWiki language code (if any)?
- Language and community information
  - What are the autonyms for the language in the scripts it can be written in?
  - Which alternate names does the language have?
  - What is the number of speakers (per Ethnologue)?
  - What is the literacy rate (L1 only; leave blank if not available)?
  - Does the language have a writing system?
  - Is there an active Wikimedia community for this language (links to project's village pump page/embassy)?
  - Is there a standard body for the language? If so, which?
  - If there is no active Wikimedia community, is there an active online community (links to projects)?
  - Who are the 1-3 go-to people for testing language support for this language?
  - Which initiatives can we use to speed up our language support?
- Language support information
  - In which ISO 15924 scripts is the language being written? Add RTL if right-to-left.
  - Is the language currently supported in MediaWiki trunk (1.19 alpha)?
  - Is the language currently supported in the Wikimedia deployment?
  - Are there active translators for the language/script combinations?
  - Have the most often used messages been translated?
  - Have more than 90% of MediaWiki core messages been translated?
  - Have more than 90% of the messages of the MediaWiki extensions used by Wikimedia been translated?
  - Is plural correctly defined for the language?
  - Are numerals, number grouping and separators implemented for all scripts?
  - Are time and date formatting implemented for all scripts/locales?
  - Are localised images for the toolbar needed and added?
  - If the language uses gender, has support been added (namespaces)?
  - Is collation correctly supported for all scripts?
  - Are there freely licensed fonts available that support the language/script combinations (add names/URLs)?
  - Is support for WebFonts needed? Explain why if not.
  - Is there support for the fonts in the WebFonts extension?
  - Are the scripts/fonts supported in PDF export/Collection extension?
  - Which keyboard[14]/script mappings in use need to be available in Narayam and as an on-screen keyboard?
  - If the language uses multiple scripts, is automated script conversion feasible?
  - Are script conversion tables available?
  - Has script conversion been implemented?
  - Is search working properly for the language's scripts in standard MediaWiki (confirmed by go-to person)?
  - Is search working properly for the language's scripts in the Wikimedia setup (confirmed by go-to person)?
  - Is the language supported offline (Kiwix 80%+, proper directionality support)?
  - Mobile support (which questions need answering?)
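To make the question about numerals, number grouping and separators concrete, here is a minimal sketch in Python. It is not MediaWiki code. It assumes Devanagari digits and the grouping convention common in India, where the last three digits form one group and the remaining digits are grouped in pairs. The names `DEVANAGARI_DIGITS`, `group_indian` and `format_devanagari` are hypothetical.

```python
# Illustrative sketch only: the helper names are hypothetical, and the comma
# as group separator is an assumption for this example.

# Translation table from ASCII digits to Devanagari digits.
DEVANAGARI_DIGITS = str.maketrans("0123456789", "०१२३४५६७८९")


def group_indian(n: int) -> str:
    """Group digits as 3 on the right, then pairs, e.g. 1234567 -> 12,34,567."""
    digits = str(abs(n))
    head, tail = digits[:-3], digits[-3:]
    groups = []
    # Peel two-digit groups off the right-hand end of the remaining digits.
    while len(head) > 2:
        groups.insert(0, head[-2:])
        head = head[:-2]
    if head:
        groups.insert(0, head)
    groups.append(tail)
    return ("-" if n < 0 else "") + ",".join(groups)


def format_devanagari(n: int) -> str:
    """Apply the grouping above and replace ASCII digits with Devanagari digits."""
    return group_indian(n).translate(DEVANAGARI_DIGITS)


print(group_indian(1234567))       # 12,34,567
print(format_devanagari(1234567))  # १२,३४,५६७
```

A language counts as supported on this point only when every script it is written in has its own digits, grouping and separators defined. The sketch covers a single script to show what needs checking.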
References
- List of Indian Language wiki projects
- Ethnologue
- List of ISO 639-1 codes
- MediaWiki language code definitions
- Ethnologue
- Ethnologue
- List of Wikimedian pubs
- Ethnologue
- List of ISO 639-1 codes
- MediaWiki language code definitions
- Unicode 5.1.0 standard
- Common Locale Data Repository
- glibc
- ISO/IEC 9995-3:2010