Wikimedia Language engineering/Reports/2014-15 Q3 Report
Period: January-March 2015
Content Translation
Deployment and Availability
- Content Translation was first deployed on 8 Wikipedias. More languages were added gradually, growing to 20 by the end of the quarter. At the time of writing this report, Content Translation is available on 23 Wikipedias. Complete List.
- Machine Translation support via Apertium is provided for 12 languages. Among the languages without machine translation support, French has the highest number of translated articles.
- Initially source languages were separately mapped for each target language. Later, all available source languages were enabled for use by any target language, thus expanding the scope for wider use.
- New languages are deployed as per requests received from the user community, or opportunities as perceived by the Language Engineering team. Request queue.
- Deployments are handled by Language Engineering and Technical Ops. Prior to deployment, checks are done to ascertain any special requirements for the particular language and tests are done on beta-labs to check for any failures.
- Post-deployment issues may surface, especially on Wikipedias with special tools or templates. For instance, publishing on the Spanish Wikipedia failed repeatedly until it was discovered to be caused by an abusefilter.
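The change in language mapping described in the list above can be pictured as a move from a per-target configuration to a full cross product of sources and targets. The following is a minimal sketch only: the language codes, variable names and helper functions are hypothetical and are not taken from the Content Translation configuration.

```python
# Hypothetical language codes, chosen only for illustration.
source_languages = ["es", "en", "fr"]
target_languages = ["ca", "pt"]

# Earlier model: each target language has its own list of allowed sources.
per_target_sources = {
    "ca": ["es"],
    "pt": ["es", "en"],
}


def pairs_from_mapping(mapping):
    """Return the (source, target) pairs allowed by a per-target mapping."""
    return {(src, tgt) for tgt, sources in mapping.items() for src in sources}


def pairs_from_all_sources(sources, targets):
    """Return every (source, target) pair, skipping same-language pairs."""
    return {(src, tgt) for src in sources for tgt in targets if src != tgt}


mapped = pairs_from_mapping(per_target_sources)
opened = pairs_from_all_sources(source_languages, target_languages)
print(len(mapped), "pairs with per-target mapping")
print(len(opened), "pairs with all sources enabled")
```

In this illustration every pair allowed by the per-target mapping remains available after all sources are enabled, which is the sense in which the later model widens the scope of use.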
Bug fixes and other development
- Bug fixes for issues reported during this period were the major area of developer focus. These included publishing errors, section alignment, links and other problems that users faced while translating.
- Analytics related to Content Translation data needed special attention during the last few weeks of the quarter. This was particularly important to ascertain the adoption and usage trends that provided the success metrics for the tool overall and also for individual features.
- The highlight of the new feature initiatives is the addition of new entry points at several stages of the reading/editing workflow. This is part of a major UX design overhaul. During this quarter 3 entry points were made available: the contributions page, a popover menu from the contributions menu, and an invitation to try the tool during new article creation (for users who haven't enabled the beta-feature). Beta feature activations and published translations increased after the new entry points were made available.
- An API was developed that can be used to expose the modifications made by users on top of the machine translations. This is expected to help MT service providers improve their translation engines.
Usage Data
The following data is for the period 16 January (coinciding with the availability of Content Translation on Wikipedias for the first time) to 31 March 2015:
- Number of translators who have published at least one translation until 31 March: 204. This number includes translators who have also published articles on Wikimedia beta-labs, when Content Translation was being tested.
- Number of articles published across all Wikipedias using Content Translation until 31 March: 708. Of these, approximately 450 are on the Catalan Wikipedia, which was the first language where Content Translation was introduced. Additionally, approximately 325 articles were in progress.
- Users enabling the Content Translation beta feature increased after the release of new entry points. Users creating new articles were shown an invitation to try Content Translation.
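The proportions implied by the figures above can be checked with a few lines of Python. This is only a sketch: the variable names are illustrative, and the Catalan and in-progress counts are the approximate values quoted in this report, so the results are approximate too.

```python
# Figures quoted in this report (the last two are approximate).
published_total = 708      # articles published across all Wikipedias
published_catalan = 450    # approximate count on the Catalan Wikipedia
in_progress = 325          # approximate count of articles still in progress

# Share of published translations that are on the Catalan Wikipedia.
catalan_share = published_catalan / published_total

# Published translations on all other Wikipedias combined.
published_elsewhere = published_total - published_catalan

# In-progress articles as a fraction of all translations started
# (published plus in progress).
in_progress_share = in_progress / (published_total + in_progress)

print(f"Catalan share of published articles: {catalan_share:.0%}")
print(f"Published on other Wikipedias: about {published_elsewhere}")
print(f"Share of started translations still in progress: {in_progress_share:.0%}")
```

On these figures the Catalan Wikipedia accounts for a little under two thirds of published translations, and roughly a third of all started translations were still in progress at the end of the quarter.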
Comparisons
- During this quarter 163 new translators activated the Content Translation beta-feature on the Catalan Wikipedia. On 31 March 2015, a new campaign was started on the Catalan Wikipedia to display Content Translation as an option when users (who haven't enabled the beta-feature) tried to create a new article. As a result of this campaign, nearly 60 more users enabled the beta-feature within the next 10 days.
- Additional entry points into Content Translation were already in place or introduced within a week of the last day of the quarter. The most popular entry points were the Contributions menu and the users' Contributions page. The other entry points were red interlanguage link suggestions for missing languages, and invitation messages to use Content Translation when creating new articles from scratch.
- Publishing frequency varied widely among users and across Wikipedias. 92 users published only one article and fewer than 30 users published more than 5 articles. The highest number of articles published by an editor was 186 - User:19Tarrestnom65.
- Only 2 articles were deleted during this period because of sub-optimal quality.
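The publishing-frequency figures above also bound the size of the middle group of translators, those with between 2 and 5 published articles. The sketch below assumes that the 92 single-article users and the "fewer than 30" prolific users are counted within the 204 translators reported under Usage Data; the report does not state this explicitly, so the result is indicative only.

```python
# Figures quoted in this report.
total_translators = 204   # published at least one translation
single_article = 92       # published only one article
max_over_five = 29        # "fewer than 30" published more than 5 articles

# Translators with more than one published article.
more_than_one = total_translators - single_article

# Lower bound on translators with 2 to 5 published articles
# (assumption: all three figures describe the same population).
min_two_to_five = more_than_one - max_over_five

# Share of translators who published only one article.
single_share = single_article / total_translators

print(f"Published more than one article: {more_than_one}")
print(f"Published 2 to 5 articles: at least {min_two_to_five}")
print(f"Published only one article: {single_share:.0%}")
```

Under that assumption, somewhat under half of the translators published a single article, and at least 83 published between 2 and 5.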
Other Projects
Projects other than Content Translation that are in different stages of ongoing maintenance are:
- MediaWiki i18n (general improvements, bug fixes etc.)
- Extensions
- Babel
- CLDR
- LocalisationUpdate
- Translate
- TwnMainPage
- Universal Language Selector
- ULS Compact Language Links
- MediaWiki Language Extension Bundle (MLEB)
- Milkshake Libraries
Objective for 2014-15 Q3
For FY 2014-15 Q3, i.e. January to March 2015, the objective was to collect and analyse data that would indicate which extensions and tools needed to be prioritized in the development cycles. Data such as bug reports or open patchsets since January 1, 2014 (i.e. from the time CX was started) were considered key indicators of activity and attention. The next step was to prepare a plan for Q4 for consistent development attention to the focus areas.
Data and results
Component | Bugs | Description |
---|---|---|
MediaWiki Extension (i18n only) | | All i18n, l10n and language support in MediaWiki core. Partially supported by Language Engineering, with a lot of WMF and volunteer activity in terms of patch contributions. |
Visual Editor | | i18n, l10n and language support for VE, some of which is done by Language Engineering. |
Babel | | Language tags for user pages. Used by Wikidata. No planned development activity. Minimally supported by Language Engineering with occasional code reviews. Has some WMF and volunteer activity in terms of patch contributions. |
CLDR | | Locale data for core and extensions. No planned development activity. Supported mostly by Language Engineering. Has some WMF activity in terms of patch contributions. |
LocalisationUpdate | | No planned development activity. |
Translate | | For translation of software interface and many auxiliary content pages on Wikimedia sites. Partially supported by Language Engineering. Has volunteer activity in terms of patch contributions. Has a wide scope of functionality and also depends on the ElasticSearch service on WMF production. To be prioritized for development support. |
TwnMainPage | | The main page for translatewiki.net. Supported by Language Engineering. Very little WMF activity in terms of patch contributions. No active development planned. |
UniversalLanguageSelector | | Language selection, input methods and web fonts. Mostly supported by Language Engineering. Has some WMF and volunteer activity in terms of patch contributions. Heavily dependent on Project Milkshake and also used for Content Translation language selection. To be prioritized for development support. |
ULS Compact Links | | Language selection for ULS based on special criteria. Has some WMF and volunteer activity in terms of patch contributions. Currently a beta-feature. Can be useful as a default feature, but needs metrics for setting success goals. |