Extension:Translate/Mass migration tools/Alignment

Assumptions:

  1. The translation has paragraphs in the same order as the original

Step 2 of this project requires importing the translations that were already present on the page before FuzzyBot's edit. These need to be aligned with the translation unit identifiers assigned by the Translate extension.

Mock-up design showing the source text and the imported translations for the selected language. The user can confirm or reject the imports.

As per the mock-up design, the only task is to fill the left-hand-side and right-hand-side blocks with the appropriate text. The left-hand-side blocks hold the source text (English), and the right-hand-side blocks hold the corresponding imported translations.

  1. Left-hand-side blocks: These contain the source text and are not editable. The source units are obtained through the web API using list=messagecollection.
  2. Right-hand-side blocks: These contain the translations of a given page for the specified language code. The logic is as follows:
    1. Find the timestamp of FuzzyBot's oldest edit on the page using the web API with prop=revisions and rvuser=FuzzyBot.
    2. Subtract 1 second from that timestamp and use the result as the starting timestamp (rvstart) for prop=revisions to get the last revision before FuzzyBot's first edit on the page.
    3. Fetch the text of that revision and split it on double newlines ('\n\n') to obtain paragraph-level translation units.
    4. Split the resulting units further so that no unit contains a mix of headers and other wikitext.
    5. Once the units are split on '\n\n' and headers are separated from other wikitext, align them at the h2 level, i.e., the first h2 header in the source should match the first h2 header in the target. Empty units can be added, or units can be collapsed, if there is not enough room to align.
    6. After this, the user can manually adjust the units using the features described below.
  3. Import features: Each translation (target) unit has a set of features attached. These include:
    1. Delete unit: This deletes the corresponding unit and shifts the target units up.
    2. Add a unit below: This adds an empty unit below the current unit.
    3. Swap with unit below: This swaps the text of the current unit with the unit below.
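
The splitting, header separation, h2 alignment, and the three import features described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the extension's actual code: all function names (split_units, sections, align_on_h2, delete_unit, add_unit_below, swap_with_below) are hypothetical, the header pattern is a simplified approximation of wikitext header syntax, and the alignment only pads with empty units on either side rather than collapsing units.

```python
import re
from itertools import zip_longest

# Simplified pattern for a wikitext header line such as "== Title ==".
HEADER_RE = re.compile(r"^(={1,6})\s*(.+?)\s*\1\s*$")


def header_level(unit):
    """Return the header level (2 for "== x =="), or 0 if unit is not a header."""
    match = HEADER_RE.match(unit)
    return len(match.group(1)) if match else 0


def split_units(text):
    """Split on double newlines, then give every header line its own unit."""
    units = []
    for paragraph in text.split("\n\n"):
        if not paragraph.strip():
            continue  # skip empty chunks left by runs of blank lines
        buffer = []  # consecutive non-header lines of this paragraph
        for line in paragraph.strip("\n").split("\n"):
            if header_level(line):
                if buffer:
                    units.append("\n".join(buffer))
                    buffer = []
                units.append(line.strip())
            else:
                buffer.append(line)
        if buffer:
            units.append("\n".join(buffer))
    return units


def sections(units):
    """Group units into sections; every h2 header starts a new section."""
    result = [[]]  # the first section holds anything before the first h2
    for unit in units:
        if header_level(unit) == 2:
            result.append([])
        result[-1].append(unit)
    return result


def align_on_h2(source_units, target_units):
    """Pair units so the n-th source h2 sits opposite the n-th target h2.

    Assumption: the shorter side of each section is padded with empty units;
    collapsing units is left to the user.
    """
    rows = []
    for src, tgt in zip_longest(sections(source_units),
                                sections(target_units), fillvalue=[]):
        rows.extend(zip_longest(src, tgt, fillvalue=""))
    return rows


def delete_unit(targets, index):
    """Delete the unit at index; the units after it shift up."""
    return targets[:index] + targets[index + 1:]


def add_unit_below(targets, index):
    """Insert an empty unit directly below the unit at index."""
    return targets[:index + 1] + [""] + targets[index + 1:]


def swap_with_below(targets, index):
    """Swap the unit at index with the one below it, if there is one."""
    swapped = list(targets)
    if index + 1 < len(swapped):
        swapped[index], swapped[index + 1] = swapped[index + 1], swapped[index]
    return swapped
```

Calling align_on_h2(split_units(source_text), split_units(old_translation_text)) returns a list of (source, target) pairs, one per row of the mock-up. A pair with an empty source side marks a target unit that has no counterpart, which the user can then fix with the delete, add, or swap operations. The operations return new lists instead of modifying their input, which keeps an undo step simple.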