User:SSethi (WMF)/Sandbox/Languages onboarding

In the 2024–2025 fiscal year, the Language Team will implement and test a technical recommendation, gathered through numerous conversations with various stakeholders, to support language onboarding. See Languages onboarding#Strategy and approach.

Below, you can find information about the goals of this project, the history that has informed it, and why the Wikimedia Foundation's Product Department is prioritizing this work.

Objectives

  • Communities are supported to effectively close knowledge gaps through tools and support systems that are easier to access, adapt, and improve, ensuring increased growth in trustworthy encyclopedic content. Part of the Wikimedia Foundation’s Annual Plan 2024-25.
  • There is a clear picture of the state of languages and the process of supporting existing and new languages to the Wikimedia movement. Part of the Wikimedia Foundation’s Annual Plan 2023-24.

Current status of languages onboarding

As of April 2024, Wikipedia has 326 active language editions. Yet there are many more living languages in the world (7,164 according to Ethnologue), some spoken by millions of people, that have no Wikipedia and no wiki of any kind. This is a blocker to fulfilling our vision that every single human being can freely share in the sum of all knowledge. Currently, Incubator serves as a centralized system for language creation within the Wikimedia ecosystem. It has been operating for 18 years without a platform owner. The current process of language onboarding is divided into phases, each involving a series of complex manual processes:

Before Incubation: Requesting a new language in Incubator involves various manual steps, including understanding project principles, creating Meta and Translatewiki.net accounts, confirming language eligibility, and translating essential messages. Language Committee members also track, approve, and reject requests by hand, which is labor-intensive.

During Incubation: Incubator faces technical limitations and lacks many of the modern features found on other Wikipedia wikis (e.g., ContentTranslation and Wikidata integration are missing). This leads to a poor editing experience for contributors, as highlighted in numerous Wikimedia convenings and previous research. Language wikis often remain in the Incubator for several years before graduating, primarily because of the poor editing experience, a small contributor base, and a shortage of native speakers. According to April 2024 statistics, the average time for a language wiki to graduate from the Incubator is 4.4 years (e.g., Fon Wikipedia). Other restrictions include the inability to search for content, add citations, and upload files. Additionally, many small and underserved languages lack essential tools such as appropriate keyboards, online dictionaries, spell checkers, and grammar tools, which hinders the editing process. Machine translation is not available for smaller languages.

After Incubation: Upon approval of a language by the Language Committee, the setup of the wiki site, content importing, and ongoing maintenance entail a series of manual steps carried out through collaboration among community members, the Language Committee, and server maintainers, which sometimes takes several days or even weeks.

 
Number of days each graduated Incubator project spent in the Incubator before graduating, per project type, with medians labelled (April 2024)
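The per-project-type medians described in this caption could be computed along the following lines. This is a minimal sketch, not the method used to produce the figure: the record layout, the function name median_days_by_type, and the sample rows are all assumptions made for illustration and are not real Incubator data.

```python
from collections import defaultdict
from datetime import date
from statistics import median

# Hypothetical records: (project type, date the test wiki started, date it graduated).
# These rows are placeholders for illustration, not real Incubator data.
records = [
    ("Wikipedia", date(2015, 1, 1), date(2019, 1, 1)),
    ("Wikipedia", date(2018, 6, 1), date(2020, 6, 1)),
    ("Wiktionary", date(2016, 3, 1), date(2017, 3, 1)),
]


def median_days_by_type(rows):
    """Return {project_type: median number of days between start and graduation}."""
    days = defaultdict(list)
    for project_type, started, graduated in rows:
        # Subtracting two dates gives a timedelta; .days is the whole-day count.
        days[project_type].append((graduated - started).days)
    return {ptype: median(values) for ptype, values in days.items()}


if __name__ == "__main__":
    for ptype, value in median_days_by_type(records).items():
        print(f"{ptype}: median {value} days in Incubator")
```

A median is less sensitive than a mean to the few projects that stay in the Incubator for a very long time, which is one reason a chart like this might label medians.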

Envisioned future of languages onboarding

In December 2023, the Language team initiated discussions on enhancing the language onboarding processes, documented here. Various stakeholders from the community and staff shared their insights, contributing to the recommendations listed here.

"Editing on Incubator should feel similar to editing on normal wikis, but we are far from achieving this goal."

"We should forget about Incubator completely. And, find another way of starting wiki. Because of the complexities around it, it might take time to improve the technical side of it."

These recommendations aim to establish a streamlined technical infrastructure for creating language wikis and improving the complex processes involved in each of the distinct phases of language onboarding: before, during and after incubation. The recommendations cover various approaches, such as automating the addition and approval of new languages within Incubator, extending access to modern wiki features beyond Incubator, enhancing the editing experience within Incubator, and streamlining backend site creation processes. On the social front, they focus on fostering community growth and inclusivity within Wikimedia projects. Additionally, they propose exploring social pathways for language onboarding, including enhancing the discoverability of Incubator, creating welcoming pages, and orienting communities to relevant Wikimedia projects.

Status

May 2024

In 2023-24, as part of the Language inclusion goals, efforts were made to identify key recommendations for improving the social and technical infrastructure to support existing and new languages. This was achieved by recruiting potential stakeholders for conversations about the various phases of the incubation process and by conducting one-on-one and asynchronous discussions with them on key questions. A list of potential recommendations was compiled to make relevant stakeholders aware of the findings. In the next fiscal year, the plan is to focus on testing some of these recommendations, analyzing feedback, and learning from the results.

Strategy and approach

After discussions with various stakeholders, the chosen recommendation for languages onboarding in the Wikimedia Foundation’s Annual Plan 2023-24 is Providing Access to Modern Wiki Features Beyond Incubator. This option appears to be the most feasible to implement and test within a short timeframe because it doesn’t require new implementation. Instead, it utilizes existing infrastructure to provide modern wiki features to language wikis. Additionally, conducting A/B testing will allow us to learn about their impact on communities and plan a direction for language onboarding.

Regarding other ideas: "Improving Editing Experience Within the Incubator" is the most challenging to achieve due to the technical complexities involved in integrating major features such as Content Translation and Wikidata with the Incubator. "Automating Backend Site Creation" would require collaboration from all teams and is a complex task to undertake in the first year of the project, as it would necessitate significant resources and time investment. "Automating New Language Addition and Approval in Incubator" can be addressed once the incubation experience itself is improved. However, implementing this would also require finding the right team owner.

Hypothesis statement

If we provide production wiki access to 5 new languages, with or without Incubator, we will learn whether access to a full-fledged wiki with modern features such as those available on English Wikipedia (including ContentTranslation and Wikidata support, advanced editing and search results) aids in faster editing. Ultimately, this will inform us if this approach can be a viable direction for language onboarding for new or existing languages, justifying further investigation.
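One way to read "faster editing" is as a comparison of monthly edit counts between the wikis given production access and wikis that remain in Incubator. The sketch below illustrates that reading only. The source does not define the metric, so the choice of median monthly edits, the function names, the wiki labels, and all numbers are assumptions made for illustration.

```python
from statistics import median

# Hypothetical monthly edit counts per wiki. Labels and numbers are placeholders,
# not measurements from any real wiki.
pilot_wikis = {"pilot-a": [120, 150, 180], "pilot-b": [40, 60, 50]}
incubator_wikis = {"inc-a": [30, 20, 40], "inc-b": [10, 15, 5]}


def median_monthly_edits(group):
    """Median, across wikis, of each wiki's mean monthly edit count."""
    per_wiki = [sum(counts) / len(counts) for counts in group.values()]
    return median(per_wiki)


def activity_ratio(treatment, comparison):
    """How many times more active the treatment group is than the comparison group."""
    return median_monthly_edits(treatment) / median_monthly_edits(comparison)


if __name__ == "__main__":
    ratio = activity_ratio(pilot_wikis, incubator_wikis)
    print(f"Pilot wikis show {ratio:.1f}x the median monthly edits of the comparison group")
```

With only 5 pilot languages, a ratio like this is descriptive rather than statistically conclusive, which fits the stated aim of learning whether the approach justifies further investigation.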

Timeline

October–December 2024

  • Analysis is conducted on the editing activity of wikis.
  • Next steps are discussed with relevant stakeholders for the following year.

July–September 2024

  • Production wikis are set up, and features are being added to them.
  • Selected wikis are invited to edit.
  • Wikis continue to be monitored for activity, and general onboarding support is provided.

March–June 2024

  • Gather feedback on the initial recommendations with WMF engineers and relevant stakeholders.
  • Refine and develop these recommendations further, propose solutions based on them, and identify small ideas that can be undertaken as experiments to learn from and assess the impact they have on language communities.
  • Identify key stakeholders for implementing these ideas.
  • Publish proposed recommendations on the wiki.
  • Define the selection criteria and possible outcomes for the wikis and gather feedback on them from relevant stakeholders.
  • Develop a list of default features (extensions, templates, gadgets) that wikis will receive.

December 2023–March 2024

  • Conduct discussions with stakeholders on key questions related to language onboarding.
  • Document initial recommendations for future discussions with WMF engineers and other stakeholders.

Resources

See also