Parsoid/Parser Unification/Confidence Framework/Reports

Questions to answer for report assessment

  • What’s the visual diff score? What will make us confident to deploy based on this score?
  • Are there specific extensions critical to Wikitech DiscussionTools that are not yet supported?
  • Are there missing functionalities in Parsoid that will impact this deployment?
  • What’s the rollback plan, if one is needed?
  • What would be the main concern and reason to postpone this deployment?
  • What’s your proposed confidence score?
    • Not confident at all
    • Not confident
    • Somewhat confident
    • Confident
    • Extremely confident

Deployment Readiness for Wikivoyages

For all wikis where Parsoid read views are being rolled out, some common considerations apply:

  • Extension support: All critical extensions are known to be functional on the wiki.
  • Testing considerations: Visual diff testing only identifies rendering issues and may not uncover problems with dynamic behavior such as JavaScript functionality. In some scenarios we may discover regressions after deployment; we aim to fix them quickly as they are found, or to roll back Parsoid if appropriate.
  • Rollback Strategy: A full rollback is planned if any significant issues are observed post-deployment.
  • Functionality Concerns:
    • Kartographer: The Wikivoyage projects use Kartographer heavily, so we are paying close attention to any interactions Parsoid may have with Kartographer.
    • Localization/RTL issues: For non-English wikis, localization of messages is a key concern, particularly messages from Kartographer. hewikivoyage was our first right-to-left (RTL) wiki.
    • Vertical whitespace issues: A relatively large number of pages display minor vertical whitespace issues. These are primarily due to newlines combining across template boundaries, which introduces paragraphs containing <br> tags that Parsoid does not introduce. In most cases the Parsoid output is actually better than the output of the legacy parser; in other cases, the extra whitespace might have been preferable. There is no "always-correct" solution here, and we believe Parsoid's output is reasonable. This issue is tracked in T355099.
Wiki, Confidence, Visual Diff Score (Tested Version, Tested pages, Pixel Perfect Rendering, Vertical WS shifts only, Potential issues), Deployment date
eswikivoyage High 1.43.0-wmf.20 890 90.79% (808) 98.88% (880) 1.12% (10) 2024-09-04
Remarks: A couple of edits fixed broken wikitext that was causing diffs. The remaining differences are mostly Vertical White Space shifts, plus a couple of cases with minor table padding differences in a few columns or gallery centering differences where the legacy parser output is actually worse.
eowikivoyage High 1.43.0-wmf.20 869 90.45% (786) 99.19% (862) 0.81% (7) 2024-08-29
Remarks: Vertical White Space shifts noted. Gallery rendering issues caused by a markup error were fixed.

See this page for detailed diff analysis.

fiwikivoyage High 1.43.0-wmf.20 1035 87.83% (909) 99.03% (1025) 0.97% (10) 2024-08-29
Only Vertical White Space issues and one minor issue with a list item not getting parsed.

See this page for detailed diff analysis.

svwikivoyage High 1.43.0-wmf.20 1154 82.06% (947) 98.61% (1138) 1.39% (16) 2024-08-29
Remarks: T371125 was identified in the visual diff, but it doesn't impact functionality and is not a blocker for the deployment. There were also VWS issues and, in some cases, an external link parsing problem that was fixed by editing the wikitext in the template.

See this page for detailed diff analysis.

rowikivoyage Very High 1.43.0-wmf.19 623 88.76% (553) 98.72% (615) 1.28% (8) 2024-08-22
Initially, we identified a difference on the main page that required a wikitext change, which can be found here.
cswikivoyage Very High 1.43.0-wmf.19 403 98.01% (395) 99.5% (401) 0.5% (2) 2024-08-22
We are working on an issue with the error messages displayed by the geodata extension at phab:T372608 but do not consider this a blocker at this time.
hewikivoyage Confidence in the rollout is very high, although we will take care due to the functionality concerns described. 1.43.0-wmf.14 4,799 53.0% (2,544) 99.3% (4768) 0.7% (31) 2024-07-30
46.3% of pages show only minor vertical whitespace shifts (2,224/4,799). The vertical whitespace shifts are primarily due to the interaction of invisible tags (usually template boundary markers) and paragraph wrapping (T355099/T368719), which the Content Transform Team considers a "known difference" between the parsers.

An additional page (1/4,799) displayed a rendering difference due to T368720 caused by a misnested <small> tag. A patch is in progress for this issue. The remaining 0.6% of pages (30/4,799) displayed false positives, such as images not completely loading, which are artifacts of the visual diff process.

enwikivoyage Very High 1.43.0-wmf.14 24,609 90.9% (22,363) 99.7% (24,524) 0.3% (85) 2024-07-30
The remaining 0.3% of pages (85/24,609) displayed false positives, such as images not completely loading, which are artifacts of the visual diff process.
bnwikivoyage High 1.43.0-wmf.20 811 97.04% (787) 98.64% (800) 1.36% (11) 2024-09-11
Mostly VWS and edge cases. Initially, some map banner differences were identified and reported at T373400; after closer investigation, they are considered a known issue rather than a blocker.
hiwikivoyage High 1.43.0-wmf.20 949 91.04% (864) 98.63% (936) 1.37% (13) 2024-09-11
Initially, some map banner differences were identified and reported at T373400; after closer investigation, they are considered a known issue rather than a blocker. We also identified a few pages with a hidden tracking category; this is likewise not considered a blocker, but it will require further investigation.
pswikivoyage High 1.43.0-wmf.20 800 95.0% (760) 98.38% (787) 1.62% (13) 2024-09-11
Table markup error leads to different error handling -- recommend fixing page. Fixed with this edit. Gallery rendering differences because of a markup error. Fixed with this edit.
trwikivoyage High 1.43.0-wmf.20 875 89.94% (787) 98.17% (859) 1.83% (16) 2024-09-11
Initially, some map banner differences were identified and reported at T373400; after closer investigation, they are considered a known issue rather than a blocker.
fawikivoyage High 1.43.0-wmf.22 1376 93.83% (1292) 98.62% (1358) 1.38% (18)
Mostly VWS and false positives. It also contains known issues with Tracking Categories that aren't roll-out blockers.
nlwikivoyage High 1.43.0-wmf.22 1495 84.49% (1264) 95.52% (1429) 4.48% (66)
Mostly VWS. T371125 is also identified but not considered a roll-out blocker.
plwikivoyage High 1.43.0-wmf.22 2303 90.67% (2090) 97.57% (2249) 2.43% (54)
Mostly VWS and horizontal shifts in galleries. Some pages had their wikitext fixed and became pixel perfect in the visual diff test. Also, T348722 and T374724 were identified but are not blockers to the roll-out.
ptwikivoyage High 1.43.0-wmf.22 1341 87.42% (1174) 97.62% (1311) 2.38% (30)
Mostly VWS and tracking categories (TC). Also, an issue with files not being marked as redlinks is tracked at T374868 but is not considered a roll-out blocker.
ukwikivoyage High 1.43.0-wmf.20 820 61.83% (507) 92.07% (755) 7.93% (65)
Mostly VWS, tracking categories (TC), and category sort order. Also, a few pages were affected by T368724 and were fixed by editing the wikitext markup; see Ганновер, Прага, or Аландські_острови.
elwikivoyage High 1.43.0-wmf.20 879 35.45% (312) 92.05% (810) 7.95% (69)
Mostly VWS and edge cases.
frwikivoyage High 1.43.0-wmf.20 2330 88.86% (2073) 98.93% (2308) 1.07% (22)
itwikivoyage High 1.43.0-wmf.20 1648 66.14% (1094) 97.64% (1615) 2.36% (33)
shnwikivoyage High 1.43.0-wmf.20 822 78.49% (646) 94.29% (776) 5.61% (46)
viwikivoyage High 1.43.0-wmf.20 1245 75.12% (936) 95.99% (1196) 4.01% (49)
ruwikivoyage High 1.44.0-wmf.4 2084 61.03% (1306) 96.87% (2073) 4.13% (11)
dewikivoyage Very High 1.44.0-wmf.4 3911 68.92% (2696) 99.82% (3905) 0.18% (6)
jawikivoyage High 1.44.0-wmf.4 1193 59.45% (711) 98.49% (1178) 1.51% (15)
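
The counts in the table can be cross-checked against the remarks with a little arithmetic. The sketch below is illustrative only: the `VisualDiffRow` type and its field names are hypothetical, and it assumes that the count under "Vertical WS shifts only" is cumulative (pixel-perfect pages plus pages with only vertical whitespace shifts), which is what the hewikivoyage figures suggest (2,544 + 2,224 = 4,768).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VisualDiffRow:
    """One table row as raw page counts (hypothetical helper type)."""
    wiki: str
    tested: int          # "Tested pages"
    pixel_perfect: int   # count under "Pixel Perfect Rendering"
    perfect_or_vws: int  # count under "Vertical WS shifts only" (assumed cumulative)

    def counts(self) -> dict[str, int]:
        # Split the tested pages into three disjoint buckets.
        return {
            "pixel_perfect": self.pixel_perfect,
            "vws_only": self.perfect_or_vws - self.pixel_perfect,
            "potential_issues": self.tested - self.perfect_or_vws,
        }

    def percentages(self) -> dict[str, float]:
        # Express each bucket as a percentage of the tested pages.
        return {k: round(100 * n / self.tested, 2) for k, n in self.counts().items()}


hewikivoyage = VisualDiffRow("hewikivoyage", 4799, 2544, 4768)
print(hewikivoyage.counts())
print(hewikivoyage.percentages())
```

For hewikivoyage this gives 2,224 pages with only vertical whitespace shifts and 31 pages with potential issues, which matches the remarks (1 page affected by T368720 plus 30 false positives).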

Deployment Readiness for Wikitech DiscussionTools

Visual Difference Score Assessment

Current Status:

  • Pixel Perfect Rendering: Achieved for 95.7% of approximately 6250 talk pages tested.
  • Minor Issues: 99.9% of pages are pixel perfect or show only minor vertical whitespace shifts.
  • Known Differences: 0.1% of pages have known differences (not planned for fixing).
  • Critical Differences: None.
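
To give a feel for what these percentages mean in pages, the sketch below converts them into approximate counts. It is a rough estimate only: the total of 6250 is itself approximate, and the helper name `estimate_pages` is hypothetical.

```python
def estimate_pages(total_pages: int, percent: float) -> int:
    """Approximate number of pages that a percentage of the total represents."""
    return round(total_pages * percent / 100)


APPROX_TALK_PAGES = 6250  # "approximately 6250 talk pages tested"

pixel_perfect = estimate_pages(APPROX_TALK_PAGES, 95.7)
perfect_or_vws = estimate_pages(APPROX_TALK_PAGES, 99.9)
known_differences = estimate_pages(APPROX_TALK_PAGES, 0.1)

# Pages that differ only by minor vertical whitespace shifts.
vws_only = perfect_or_vws - pixel_perfect

print(pixel_perfect, vws_only, known_differences)
```

Under these assumptions, roughly 5,981 pages are pixel perfect, about 263 show only vertical whitespace shifts, and about 6 have known differences.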

Considerations for Deployment:

  • Reliability: The high percentage of pixel-perfect rendering (95.7%) is a strong indicator of the system’s reliability.
  • Pending Reassessment: None
  • Blocker Evaluation: None

Extension and Functionality Support

Critical Extensions:

  • Assumption: Extensions critical for Wikitech DiscussionTools are presumed to be identified in the visual diff process.

Functionality Concerns:

  • Login vs. Logout Issue: None
  • Legacy Parser Compatibility: A strategy is needed to ensure that deployment of the new system does not disrupt the recording of metadata by the legacy parser.

Rollback Strategy

  • Procedure: A full rollback is planned if any significant issues are observed post-deployment.

Main Concerns and Potential Delay

  • Current Stance: As of now, there are no major concerns that would necessitate postponing the deployment.
  • Dependent Factors: None.

Proposed Confidence Score for Deployment

  • Score: Confidence in the rollout is extremely high.