Content translation/Development Plan/Roadmap/CX02Release

Content Translation 0.02 release

The goal of this release is to make the translation process more fluent and to give translators more flexibility in how they start translating. See below for the detailed development plan for each of these features.

Increase language support (Labs->Beta->Prod)

  1. Languages with high-quality support through Machine Translation Engines
    1. Define criteria for enabling new language pairs. Done
    2. Selection is waiting on preliminary user testing of production-ready language pairs in Apertium.
    3. Blocked by technical issues in the infrastructure setup on Wikimedia betalabs.

Feature Set

  1. New entry points
    1. Translation dashboard to initiate and continue translations.
      1. Auto-saving translation drafts as users translate.
      2. Initiate translations from dashboard
      3. Notifications (pointing to the dashboard) about relevant translation-related events.
    2. Entry point to the dashboard from the contributions page.
  2. Editor: improved language tools
    1. Editing
      1. Keep focus on content for fluent editing.
      2. Warnings and options for existing translations.
      3. Avoid adding formatting when pasting content.
    2. Exploration and basic support for the Yandex, Google or Bing API
    3. Category adaptation
    4. Better support for links:
      1. Red links support
      2. Handle link adaptation for disambiguation pages
      3. Creating links and editing their target
  3. Infrastructure improvements
    1. Make it ready to be deployed.
  4. Analytics:
    1. Content Translation publishing data
    2. Visualization (basic)

Auto-saving translation drafts

From gerrit:172528: this feature concerns translation drafts. A translator can save a translation and resume it later. The draft content is annotated HTML with segmented sections and sentences, along with other data in the DOM that represents the state of the translation workflow. Drafts are not available as articles, but they can be opened in the translation editor, resumed, and published.

Drafts can be resumed from the central Content Translation dashboard on any OS, browser, wiki, or machine, and eventually by any other translator (this last part is futuristic).
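
The save-and-resume behaviour described above can be sketched roughly as follows. This is only an illustration, not the design from gerrit:172528: the `Draft` and `DraftStore` names, the choice of lookup key, and the in-memory dictionary standing in for central storage are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Hypothetical lookup key: (translator, source language, target language, source title).
DraftKey = Tuple[str, str, str, str]


@dataclass
class Draft:
    translator: str
    source_language: str
    target_language: str
    source_title: str
    # Annotated HTML holding segmented sections and sentences plus workflow state.
    annotated_html: str
    published: bool = False


class DraftStore:
    """In-memory stand-in for central draft storage (an assumption for this sketch)."""

    def __init__(self) -> None:
        self._drafts: Dict[DraftKey, Draft] = {}

    def save(self, draft: Draft) -> None:
        # Auto-save overwrites the previous draft stored under the same key.
        key = (draft.translator, draft.source_language,
               draft.target_language, draft.source_title)
        self._drafts[key] = draft

    def resume(self, translator: str, source_language: str,
               target_language: str, source_title: str) -> Optional[Draft]:
        # Returns None when no draft exists for this translator and article.
        return self._drafts.get(
            (translator, source_language, target_language, source_title))
```

Because the draft is keyed by translator and article rather than by browser or machine, the same draft can be reopened from anywhere the dashboard is reachable, which matches the intent stated above.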

Production Deployment - Resources & Provisioning

Milestones

Completion Date/Milestones | Features | Sprints
October 8 - October 21, 2014 | | 76
October 22 - November 4, 2014 | | 77
November 5 - November 18, 2014 | | 78

Development Plan

Mingle Story Board

Feature Details
Entry Points
  • Translation Dashboard (more below)
  • Entry point: New translation from Contributions page
  • "New translation" dialog improvements
  • Notifications (pointing to the dashboard) about relevant translation-related events.
Editor
  • Layout and Design
    • Top navigation bar adjustments
    • Keep text focus on content for fluent editing.
  • Editing
    • Handle red links in the source column
    • Adapt red links in the translation
    • Existing translation: warning and options
    • Link highlighting: distinguish active from connected links
    • Auto-save translations
  • Publishing
    • Mark articles published with a high amount of automatic translation
    • Warnings about existing articles and options to deal with them
Link and Category Adaptation
  • Auto-adapt categories
  • A keyboard shortcut for link adaptation
  • Support link adding with disambiguation pages
  • Link adaptation - edit links
  • Red link adaptation.
Translation Dashboard
  • Create a new translation from the Translation Center
  • Add content to existing articles
  • Translation-related notifications infrastructure
  • Continue an existing translation from the Translation Dashboard list
Machine Translation Support (MT)
  • Support for one additional translation service.
Dictionary Support
Templates Support
Architecture (technical feature)
Research and preliminary development
Analytics
  • Expose Content Translation publishing data
  • Update publishing data collection
  • Set up Limn instance in labs
Deployment

CX Deployment Plan for 0.02 Release November 2014

Deployment date: TBD

Project: Content Translation Framework

Release: 0.02 (third release)

Long-term project roadmap: Content_translation/Roadmap

Language Pairs to be supported:

Release as: Beta Feature

Overall Plan

System Architecture

See: https://www.mediawiki.org/wiki/Content_translation/Technical_Architecture

https://www.mediawiki.org/wiki/Content_translation#Workflow_and_Technical_Architecture

https://www.mediawiki.org/wiki/Content_translation

Caching Architecture

The following diagram includes the caching requirements for the CX framework.

https://www.mediawiki.org/wiki/Content_translation/Server_communications_workflow

https://commons.wikimedia.org/wiki/File:CX_ArchitectureV1.svg

Components to be provisioned for production

CX server installation and configuration: https://git.wikimedia.org/markdown/mediawiki%2Fservices%2Fcxserver.git/HEAD/README.md

See Setup (https://www.mediawiki.org/wiki/Content_translation/Setup) for detailed information about the components and for installation and configuration instructions.

  • Node.js
  • Apertium
  • Extension dependencies:
    • BetaFeatures
    • CLDR
    • EventLogging
  • Backend Services

Varnish:

  • External APIs called by CX
    • Wikidata
    • Parsoid API
  • Configuration Scripts

Upstart and Systemd scripts are at: https://www.mediawiki.org/wiki/Content_translation/Setup

Provisioning Plan

  • Storage Requirements

To be determined from discussion with ops

  • Hardware Requirements

To be determined from discussion with ops

  • Bandwidth Requirements

To be determined from discussion with ops

  • Performance expectations
    • MT TPS (Transactions per second)
    • User responsiveness
    • MT Round trip
    • General guidelines

Monitoring and metrics

  • EventLogging activity for CX
  • Number of users enabling the feature
  • Performance of S:CX, backend calls?
  • Check for node and varnish? Who to page?
  • Graph showing requests or timings for the Wikidata API(s) we are calling
  • Graph showing requests or timings for the Parsoid API(s) we are calling
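
The request graphs mentioned above need counts grouped by API and time bucket. A minimal sketch of that aggregation, assuming each logged event is a hypothetical `(api_name, unix_timestamp_seconds)` pair (the real EventLogging schema may differ):

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def requests_per_minute(
    events: Iterable[Tuple[str, float]]
) -> Dict[Tuple[str, int], int]:
    """Count requests per (API name, minute index) for plotting."""
    counts: Counter = Counter()
    for api_name, timestamp in events:
        minute = int(timestamp // 60)  # bucket timestamps into whole minutes
        counts[(api_name, minute)] += 1
    return dict(counts)
```

Each key in the result is one point on a per-API graph, so the Wikidata and Parsoid series can be plotted separately from the same event stream.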

External Signoffs Required

  • Faidon - Ops
  • Gabriel - Infrastructure architecture
  • Ori - Performance
  • Chris Steipp - Security
  • Greg G - Release engineering
  • Mark - Ops
  • Tim - Platform

LE Team responsibilities

  • Kartik - Deployment, Engineer
  • Niklas - Engineer, Code Reviewer
  • Santhosh - Engineer, Code Reviewer
  • David - Engineer, Code Reviewer
  • Joel - Engineer, Code Reviewer
  • Runa - Team Scrum-Ninja / testing and communications
  • Pau - Feature UX reviewer, designer
  • Amir - Feature signoff
  • Alolita - Engineering coordination, Eng Manager