Language Onboarding and Development/Technical gaps
Technical Gaps in Workflows and Features
This category covers essential interventions that aim to address gaps in the technology, workflows, and features related to the development of language wikis. Wikimedia’s language communities have a wide range of needs around language technology. Their requests include adding languages to and removing them from projects (e.g., MediaWiki core), adding font and keyboard support for languages, help with translatewiki.net, and other miscellaneous language-related configuration changes such as RTL fixes and spelling and grammar adjustments.
In 2022, the Language Diversity Hub project interviewed 13 language communities across Europe, Africa, Asia, and Latin America to understand their challenges related to technology, education, economy, and social conditions. The top challenges identified were few contributors, language technology challenges, and the need for more training.
Most languages have similar issues: they are spoken by many people (many of them by several million people), but they are not written or used online much. The work of creating the wikis is done by activists who know and love their languages and want to contribute local knowledge in them (and in other, more established languages). They face several challenges, including a lack of modern terminology for both localization and encyclopedic article writing, and platform-level issues or the use of old devices on which scripts or fonts are not always up to date. As a support team, we currently help with the challenges below in one way or another; these languages face many more challenges that we can look into in the future.
Localisation activity
According to the current policy, the initial creation of a wiki requires translating the 500+ most basic MediaWiki user interface messages. This serves as an incentive to translate them. After this is done, however, there is practically nothing that incentivizes users to keep localizing the user interface.
In some languages, such as Tyap, localization activity continues even after graduating from the Incubator. However, this happens thanks to excellent volunteers who understand the importance of localization, and it's the exception and not the rule. Most languages have low localization activity after graduation, and they are stuck at about 12% localization completeness in core and even lower in extensions.
The people who are best equipped to make those translations usually also know English, French, or some other fallback language, and don't notice that something is untranslated. It would be good to develop incentives to keep completing the localization, as the set of messages to translate grows literally every day. Currently, we reach out to volunteers on an ad hoc basis and encourage them to translate, for example at community events, where we get a chance to meet community members who speak various languages. At the Wikimedia Hackathon 2024, a Swahili-speaking volunteer did a lot of localisation work on translatewiki.net, including completing the translation of the mobile front end.
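The completeness figures quoted above are, in essence, the share of interface messages that already have a translation. A minimal sketch of that arithmetic follows; the function names and the message counts are hypothetical, chosen only to illustrate the calculation, and are not taken from translatewiki.net.

```python
def completeness_percent(translated: int, total: int) -> float:
    """Share of messages that have a translation, as a percentage."""
    if total <= 0:
        raise ValueError("total must be a positive number of messages")
    if not 0 <= translated <= total:
        raise ValueError("translated must be between 0 and total")
    return 100.0 * translated / total


def remaining_messages(translated: int, total: int) -> int:
    """Number of messages that still need a translation."""
    return total - translated


# Hypothetical counts, for illustration only.
example_total = 1000
example_translated = 120
print(
    f"{completeness_percent(example_translated, example_total):.1f}% complete, "
    f"{remaining_messages(example_translated, example_total)} messages left"
)
```

Because new messages are added all the time, the total keeps growing, so the percentage for a language drops unless translation continues.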
Localisation quality
In some languages, such as Igbo, there were complaints that localization was done in the past by people who don't know the language well, who use a rare dialect that is unreadable to most other speakers, or who made mistakes because of limited wiki editing experience or poor understanding of the technical terminology. As a result, tasks about such problems may be reported to us. Broadly and collectively, we should encourage more localization activity in all languages and, specifically, encourage people to write localization guides and glossaries in each language. Currently, there are localization guides for fewer than twenty languages: https://translatewiki.net/w/i.php?title=Special:PrefixIndex/Localisation_guidelines/&namespace=0&stripprefix=1 .
Localisation terminology
Finding the right translations for some technical terms is a challenge in several communities. Many technical terms have no equivalent in these languages, and good processes for establishing new terminology on each language's own terms would benefit these languages as a whole. Some communities contribute in a language that has no standardized written form. Some languages have developed strategies for negotiating issues of dialect difference, spelling divergence, and the lack of an official language standardization guide (e.g., Scots). Connected to the lack of standardization is a limited ability to write texts, both short and long, about modern technology-oriented themes, which is caused by the generally low availability of such texts even outside the Wikimedia ecosystem. This can lead to difficulties when translating terminology and can affect the general quality of the content. A linguist in the network describes a situation where many languages borrow terms directly from English instead of finding their own terms (from Barriers_experienced_by_contributors_to_small_language_versions_of_Wikipedia.pdf).
(Basic glossary: https://translatewiki.net/wiki/Translating:MediaWiki/Basic_glossary )
Keyboard support
“A recurring issue among all the smaller language communities is that tools for supporting their language online are lacking; they have insufficient keyboards for typing in their own language, or there might be few or no online dictionaries or spell checking software. There are some on-wiki solutions for keyboards for some languages, and a few dedicated people in the movement have put a lot of time and effort into supporting those languages. However, some of the solutions only work on wiki platforms, and they require maintenance. Besides, these languages deserve to have keyboards available everywhere, on all devices and platforms. The lack of digital tools is not only a barrier for Wikimedia activities.” (from Barriers_experienced_by_contributors_to_small_language_versions_of_Wikipedia.pdf)
RTL
RTL languages are generally usable on the MediaWiki platform, and the technical support cost for them is relatively low thanks to CSSJanus.
In most features used in day-to-day reading and editing, minor RTL issues do occur, for example misplaced icons or form labels, misaligned text, or words appearing in the wrong order. However, they are either quickly fixed or deprioritized because they are barely noticeable. Many examples of currently open tasks can be found on the RTL board in Phabricator.
However, some important issues and concerns remain, including wikitext editing, the transition to Vue, Wikifunctions, and many more (see State of RTL support on Wikimedia Products).
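CSSJanus, mentioned above, keeps RTL support cheap by converting left-to-right stylesheets into right-to-left ones automatically. The sketch below illustrates only the core idea, swapping direction keywords within a single CSS declaration. It is a deliberately simplified illustration and not CSSJanus's actual implementation: the function name is hypothetical, and it does not handle four-value shorthands, selectors, URLs, or opt-out annotations.

```python
import re

# Direction keywords and their mirror images.
_OPPOSITE = {"left": "right", "right": "left", "ltr": "rtl", "rtl": "ltr"}

# Match the keywords only as whole words, so "bright" is left alone
# while "margin-left" and "float: right" are matched.
_DIRECTION_WORD = re.compile(r"\b(left|right|ltr|rtl)\b")


def flip_declaration(declaration: str) -> str:
    """Mirror the direction keywords in one CSS declaration (simplified)."""
    return _DIRECTION_WORD.sub(lambda m: _OPPOSITE[m.group(1)], declaration)


for css in ("margin-left: 1em", "float: right", "direction: ltr", "color: red"):
    print(f"{css!r} -> {flip_declaration(css)!r}")
```

Issues such as misplaced icons often come from styles or layout logic that a mechanical flip of this kind cannot see, which is one reason the remaining RTL problems need manual attention.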
Incomplete configuration
Some languages have incomplete configuration: no namespace and magic word translations (or incorrect translations), no date formats, no grammar rules (even though they are necessary), no digit conversion, and incorrect autonyms. This particularly affects languages that were added to the system long ago, in the mid-2000s: their codes and names were added, but some configuration details were not. This is slowly improving through personal outreach to people who speak those languages and through patches that complete the configuration (recent examples include Twi, Kinyarwanda, Hausa, Igbo, Xhosa, and Swazi). Many languages, however, are still far from complete. In the future, a systematic mapping of the missing configuration could be performed.
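As an illustration of two of the items above, the following sketch shows what digit conversion amounts to and how a systematic mapping of missing configuration might be approached. Everything here is an assumption made for the example: the item names, the dictionary layout, and the function names are hypothetical and do not reflect MediaWiki's actual configuration format. The digit example uses the Arabic-Indic digits (U+0660 to U+0669).

```python
# Checklist derived from the paragraph above. Not every language needs
# every item (for example, many languages use Western digits), so the
# list is passed in as a parameter rather than being fixed.
DEFAULT_ITEMS = (
    "namespace_names",
    "magic_words",
    "date_formats",
    "grammar_rules",
    "digit_table",
    "autonym",
)


def missing_items(config: dict, required=DEFAULT_ITEMS) -> list[str]:
    """Return the required items that are absent or empty in `config`."""
    return [item for item in required if not config.get(item)]


def make_digit_table(native_digits: str) -> dict:
    """Build a translation table from 0-9 to ten native digit characters."""
    if len(native_digits) != 10:
        raise ValueError("expected exactly ten digit characters")
    return str.maketrans("0123456789", native_digits)


def to_native_digits(text: str, table: dict) -> str:
    """Replace Western digits in `text` using a table from make_digit_table."""
    return text.translate(table)


# Arabic-Indic digits, built from their code points.
arabic_indic = make_digit_table("".join(chr(0x0660 + i) for i in range(10)))
print(to_native_digits("2024", arabic_indic))

# Hypothetical, partially configured language.
print(missing_items({"namespace_names": {"Talk": "..."}, "autonym": "..."}))
```

Running a check of this kind across all supported languages would give a per-language list of gaps that could then be handed to speakers for review.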
Missing features implemented as templates and modules
Thousands of features that are available in wikis in larger languages are implemented as templates and cannot be conveniently used in any other language. This includes infoboxes, formatted references, navigation boxes, and many others (see examples at https://www.mediawiki.org/wiki/Global_templates/Taxonomy). To be able to use them in their languages, editors have to implement them from scratch or manually copy each of them into their wiki, which is unsustainable because the number of those features grows constantly, and most of the smaller wikis have no people who are able to continuously do this technical work. This has been discussed for years in various forums (Movement strategy, Community wishlist, etc.). In 2024, it was confirmed again in the "Connecting Wikifunctions to Wikipedia Opportunities and Challenges" report (https://commons.wikimedia.org/wiki/File:Connecting_Wikifunctions_to_Wikipedia_Opportunities_and_Challenges.pdf).
Complaints about those issues come up very frequently in the context of Content Translation and general editing. The Support team can, at most, point the requesters to documentation about importing templates, but it has no capacity to help more. The problem must be addressed more systematically.
Issues related to language support, but handled in practice by other teams (if at all)
- Search - handled by Search Platform team
- Language Converter - occasionally handled by Content Transform team