Content translation/Development Plan/Roadmap/CX03Release

Content Translation 0.03 release

See below for the detailed development plan for each of these features.

Increase language support (Labs->Beta->Prod)

  1. Languages with high-quality support through Machine Translation Engines

Feature Set

  1. Entry Point: Red interlanguage link
  2. Translation editing tools: dictionary, machine translation, link adaptation, category adaptation, reference adaptation, limited template adaptation
  3. Translation dashboard: selection of source and target languages, warning about trying to create an existing article, saving and loading of drafts
  4. Machine translation features: warning the user before publishing when the translation contains too much machine translation; tagging articles that were published with a proportion of machine translation above a high threshold.
  5. Infrastructure improvements
  6. Analytics: listing the number of published drafts and articles, and information about the users who published them.
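
The machine translation warning and tagging in item 4 could be sketched as follows. This is only an illustration under stated assumptions: the threshold values, the `Section` type, and the token-overlap measure are hypothetical and are not taken from the ContentTranslation code or from this plan.

```python
import difflib
from dataclasses import dataclass

# Hypothetical values: the plan does not state the actual thresholds.
WARN_THRESHOLD = 0.75   # warn the translator before publishing
TAG_THRESHOLD = 0.95    # tag the published article


@dataclass
class Section:
    mt_text: str      # text proposed by machine translation
    final_text: str   # text the translator is about to publish


def unmodified_mt_ratio(sections):
    """Share of published tokens that also appear, in order, in the MT output."""
    kept = 0
    total = 0
    for section in sections:
        mt_tokens = section.mt_text.split()
        final_tokens = section.final_text.split()
        matcher = difflib.SequenceMatcher(a=mt_tokens, b=final_tokens, autojunk=False)
        # Matching blocks are the runs of tokens the translator left untouched.
        kept += sum(block.size for block in matcher.get_matching_blocks())
        total += len(final_tokens)
    return kept / total if total else 0.0


def assess(sections):
    """Decide whether to warn before publishing and whether to tag the article."""
    ratio = unmodified_mt_ratio(sections)
    return {"ratio": ratio, "warn": ratio >= WARN_THRESHOLD, "tag": ratio >= TAG_THRESHOLD}
```

Using two separate thresholds keeps the warning (shown before publishing) independent from the tag (applied after publishing), which matches the two behaviours listed above.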

Production Deployment - Resources & Provisioning

Milestones

Completion Date/Milestones | Features | Sprints
November 26 - December 9, 2014 | | 79
December 10 - December 22, 2014 | | 80
December 23, 2014 - January 13, 2015 | | 81

Development Plan

Feature: Saving an unfinished translation
Details: The translation is auto-saved every minute to the same database as the previous row. Only the latest version is auto-saved. It can also be saved manually at any point.
Tracking story: https://wikimedia.mingle.thoughtworks.com/projects/language_engineering/cards/4693

Feature: Guided tour for the translation interface
Tracking story: https://wikimedia.mingle.thoughtworks.com/projects/language_engineering/cards/4044

Feature: Legal text about publishing the translation with the appropriate license
Details: Show it in all possible contexts before the user publishes, including when the user reaches the page by typing a direct URL.
Tracking story: https://wikimedia.mingle.thoughtworks.com/projects/language_engineering/cards/4620

Feature: Allow adding red links in translation
Tracking story: https://wikimedia.mingle.thoughtworks.com/projects/language_engineering/cards/4512

Feature: Fix any outstanding issues with aligning the source and the translation text
Tracking story: https://wikimedia.mingle.thoughtworks.com/projects/language_engineering/cards/4688

Feature: Smart warnings about publishing a page whose title already exists
Details: When the user renames the translation title, a warning is shown if the page already exists. When publishing, if the page already exists, a dialog is shown to the user with the following options:
  1. Publish anyway. Content will be replaced on the target Wikipedia.
  2. Publish as draft. Content will be published under the user namespace.
  3. Cancel. This allows the user to go back and rename the article or decide later.
Tracking story: https://wikimedia.mingle.thoughtworks.com/projects/language_engineering/cards/4491

Feature: Making all of ContentTranslation a beta feature
Details: Hide all the features: the red link, the special page, and the links to the special page. The change tag will remain visible in RecentChanges, History, etc.
Tracking story: https://wikimedia.mingle.thoughtworks.com/projects/language_engineering/cards/4685

Feature: A generic entry point
Details: An entry point from the user contributions page, which will open the dashboard.
Tracking story: https://wikimedia.mingle.thoughtworks.com/projects/language_engineering/cards/3933
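
The three dialog options described for publishing over an existing page can be sketched as a small decision function. This is a minimal illustration, not the extension's implementation: the function and enum names are hypothetical, and the `User:<name>/<title>` form for a draft in the user namespace is an assumption.

```python
from enum import Enum


class CollisionChoice(Enum):
    PUBLISH_ANYWAY = "publish_anyway"
    PUBLISH_AS_DRAFT = "publish_as_draft"
    CANCEL = "cancel"


def resolve_publish_target(title, username, page_exists, choice=None):
    """Return the page title to publish to, or None if nothing is published."""
    if not page_exists:
        # No collision: publish under the chosen title, no dialog needed.
        return title
    if choice is CollisionChoice.PUBLISH_ANYWAY:
        # Existing content on the target wiki is replaced.
        return title
    if choice is CollisionChoice.PUBLISH_AS_DRAFT:
        # Assumption: drafts go to a subpage of the translator's user page.
        return f"User:{username}/{title}"
    # Cancel (or no choice yet): the user can rename the article or decide later.
    return None
```

Returning `None` for the cancel case leaves the draft untouched, so the same function can be called again after the user renames the translation.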

CX Deployment Plan for 0.03 Release January 2015

Deployment date: TBD

Project: Content Translation Framework

Release: 0.03 (fourth release)

Long-term project roadmap: Content_translation/Roadmap

Language Pairs to be supported:

Release as: Beta Feature

Overall Plan

System Architecture

See: https://www.mediawiki.org/wiki/Content_translation/Technical_Architecture

https://www.mediawiki.org/wiki/Content_translation#Workflow_and_Technical_Architecture

https://www.mediawiki.org/wiki/Content_translation

Caching Architecture

The following diagram includes the caching requirements for the CX framework.

https://www.mediawiki.org/wiki/Content_translation/Server_communications_workflow

https://commons.wikimedia.org/wiki/File:CX_ArchitectureV1.svg

Components to be provisioned for production

CX server installation and configuration: https://phabricator.wikimedia.org/diffusion/GCXS/

See Setup (https://www.mediawiki.org/wiki/Content_translation/Setup) for detailed information about the components and for installation and configuration instructions.

  • Node.js
  • Apertium
  • Extension dependencies:
    • BetaFeatures
    • CLDR
    • EventLogging
  • Backend Services

Varnish:

  • External APIs called by CX
    • Wikidata
    • Parsoid API
  • Configuration Scripts

Upstart and Systemd scripts are at: https://www.mediawiki.org/wiki/Content_translation/Setup

Provisioning Plan

  • Storage Requirements

To be determined from discussion with ops

  • Hardware Requirements

To be determined from discussion with ops

  • Bandwidth Requirements

To be determined from discussion with ops

  • Performance expectations
    • MT TPS (Transactions per second)
    • User responsiveness
    • MT Round trip
    • General guidelines

Monitoring and metrics

  • EventLogging activity for CX
  • Number of users enabling the feature
  • Performance of Special:ContentTranslation (S:CX) and of backend calls?
  • Health checks for Node.js and Varnish? Who should be paged?
  • Graph showing requests or timings for the Wikidata API(s) we are calling
  • Graph showing requests or timings for the Parsoid API(s) we are calling
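
The request and timing graphs listed above, together with the transactions-per-second and round-trip figures under performance expectations, could be derived from raw timing samples along these lines. This is a hedged sketch: the sample format, the backend labels, and the function names are hypothetical, and it does not describe any existing Wikimedia monitoring setup.

```python
import math
from collections import defaultdict


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(pct / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def summarize(samples, window_seconds):
    """Summarize (backend_name, duration_ms) samples collected over a time window."""
    by_backend = defaultdict(list)
    for backend, duration_ms in samples:
        by_backend[backend].append(duration_ms)

    summary = {}
    for backend, durations in by_backend.items():
        durations.sort()
        summary[backend] = {
            "requests": len(durations),
            "tps": len(durations) / window_seconds,   # transactions per second
            "p50_ms": percentile(durations, 50),      # typical round trip
            "p95_ms": percentile(durations, 95),      # slow-tail round trip
        }
    return summary
```

For example, `summarize([("wikidata", 120), ("parsoid", 340)], window_seconds=60)` returns one entry per backend, which could feed one graph per external API.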

External Signoffs Required

  • Faidon - Ops
  • Gabriel - Infrastructure architecture
  • Ori - Performance
  • Chris Steipp - Security
  • Greg G - Release engineering
  • Mark - Ops
  • Tim - Platform

LE Team responsibilities

  • Kartik - Deployment, Engineer
  • Niklas - Engineer, Code Reviewer
  • Santhosh - Engineer, Code Reviewer
  • David - Engineer, Code Reviewer
  • Joel - Engineer, Code Reviewer
  • Runa - Team Scrum-Ninja / testing and communications
  • Pau - Feature UX reviewer, designer
  • Amir - Feature signoff
  • Alolita - Engineering coordination, Eng Manager