Talk:Parsoid/Parser Unification/Known Issues

I found an issue when using Parsoid to read an article or discussion

Use this topic to discuss or report an issue you've found when using Parsoid to read an article or discussion page. Please check the known issues list first to ensure we don't already know about your issue! cscott (talk) 01:43, 15 November 2023 (UTC)

Parsoid has trouble with Template:code documentation on en.WP

Compare https://en.wikipedia.org/wiki/Template:Code?useparsoid=1 (very broken) and https://en.wikipedia.org/wiki/Template:Code?useparsoid=0

The latter renders fine, although many Linter errors are reported by LintHint in the rendered-page mode. LintHint does not report any error in Preview. This may be a GIGO situation that LintHint can't handle in Preview, and that causes Parsoid to expose a subtle syntax error. Jonesey95 (talk) 16:22, 6 March 2024 (UTC)

Strangely, I was able to resolve both the Parsoid problem and the Linter problem with this edit. I don't see anything technically wrong with the previous code, though. Someone might want to copy the old code into a sandbox to test it. Jonesey95 (talk) 16:29, 6 March 2024 (UTC)
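
For anyone who wants to repeat this kind of side-by-side comparison on another page, here is a minimal sketch that builds the two URLs. It assumes only the useparsoid=0 / useparsoid=1 query parameter already used in the links above; the function name parsoid_comparison_urls is made up for illustration and is not part of any MediaWiki tooling.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def parsoid_comparison_urls(page_url):
    """Return (legacy_url, parsoid_url) for a wiki page URL.

    Hypothetical helper: it only sets the useparsoid query parameter
    to 0 or 1 and keeps every other part of the URL unchanged.
    """
    parts = urlsplit(page_url)
    # Keep existing query parameters (e.g. title=...), minus any old useparsoid.
    query = [(k, v) for k, v in parse_qsl(parts.query) if k != "useparsoid"]

    def with_flag(flag):
        new_query = urlencode(query + [("useparsoid", flag)])
        return urlunsplit(
            (parts.scheme, parts.netloc, parts.path, new_query, parts.fragment)
        )

    return with_flag("0"), with_flag("1")


legacy, parsoid = parsoid_comparison_urls("https://en.wikipedia.org/wiki/Template:Code")
print(legacy)
print(parsoid)
```

The same helper also works on index.php-style URLs that already carry a title= parameter, since existing parameters are preserved.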

Footnote references are not copied

Because footnote references are implemented in CSS, they don’t get copied when copying the article text. Try copying https://hu.wikipedia.org/w/index.php?title=Teemu_Keisteri&useparsoid=0 and https://hu.wikipedia.org/w/index.php?title=Teemu_Keisteri&useparsoid=1 to a word processor: the footnotes themselves are copied from the bottom of the article, but the inline references are missing. This has a serious negative impact on certain Wikipedia content reuse scenarios (e.g. copying the article to a word processor for offline access, or to start creating a derivative work). —Tacsipacsi (talk) 15:57, 17 March 2024 (UTC)

Interesting. The inline refs aren't generated via CSS -- only updated via CSS post-load. The HTML does contain the inline refs in English (as default). So, curious what is causing them not to be copied. Needs investigation. SSastry (WMF) (talk) 19:33, 24 March 2024 (UTC)
I see the reference in two forms:
  • a <span> with display:none – doesn’t get copied because of this CSS rule;
  • an ::after pseudo-element with content: "[" counter(mw-Ref) "]" – doesn’t get copied because it’s not a real element.
I haven’t checked the source code before, and assumed that it’s CSS-only to allow out-of-order parses (which don’t know yet what number will be assigned). If it’s not the cause (the <span> contains the number, so it seems it’s not the cause), then why can’t it work the same way as in the legacy parser? —Tacsipacsi (talk) 22:09, 24 March 2024 (UTC)

I checked the page source: these lines of styles for the translate header

border-bottom: 1px solid #a2a9b1;
padding-bottom: 0.4em;
font-size: small;
text-align: center;

got replaced with

font-size: var(--font-size-medium);

Not a huge problem, but I lost a heartbeat when the much-needed link wasn't where I expected it to be 🤭 Ата (talk) 13:09, 9 August 2024 (UTC)

Interestingly, the "This page is a translated version of X" message, placed in the same position but on the translation subpages, renders without change. Ата (talk) 13:23, 9 August 2024 (UTC)
Hi @Ата, thanks for the report! Is this https://phabricator.wikimedia.org/T355664 or another issue? IHurbainPalatin (WMF) (talk) 08:42, 12 August 2024 (UTC)
IHurbainPalatin (WMF), yes, it is the same issue. Ата (talk) 09:16, 12 August 2024 (UTC)
ext.translate.css is missing. ABreault (WMF) (talk) 20:50, 12 August 2024 (UTC)

"Issue tracker" not shown in page title line

I'm looking at Edit check page and there should be "Issue tracker: #EditCheck" visible to the right of the page title line. Instead it reads Issue tracker: [[phab:tag/{{{1}}}/|#{{{1}}}]]. – Ата (talk) 08:29, 6 October 2024 (UTC)

Hi @Ата, thank you for the report! We're aware of the issue, documented in https://phabricator.wikimedia.org/T348722. IHurbainPalatin (WMF) (talk) 08:36, 7 October 2024 (UTC)
IHurbainPalatin (WMF), good to know! It's hard for me to look for this kind of issue on Phabricator because I'm terrible at tech language, but I still wish to point out what I see, so thanks for bearing with me 😇 Ата (talk) 11:35, 8 October 2024 (UTC)
@Ата We'd much rather have people report bugs we already know about than not report bugs we're not aware of :) And making sure that a given bug is the same as, or different from, another one can be surprisingly tricky too. So when in doubt, please let us know :) IHurbainPalatin (WMF) (talk) 11:47, 8 October 2024 (UTC)

References from Wikidata not shown correctly

See uk:Карл Бюхер: the infobox has some data with references from Wikidata. They are shown with Cite error: Invalid <ref> tag; name "xxx" defined multiple times with different content. The ref name is taken from P248 and the URL is taken from P854; URLs, however, can be different, so using P248 for the ref name is unwise.
I'm not sure whether this is necessarily a Parsoid error, but it isn't present with the normal settings, and I have no other ideas. Ата (talk) 17:28, 17 November 2024 (UTC)

Thank you for reporting the issue! I filed https://phabricator.wikimedia.org/T380152 to track and fix that. IHurbainPalatin (WMF) (talk) 09:59, 18 November 2024 (UTC)
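
To illustrate the naming concern raised above, here is a minimal sketch of a ref name derived from both the P248 value and the P854 URL, so that two references sharing a source but pointing to different URLs no longer collide. This is a Python illustration only, not the code of the infobox module; the function name ref_name, the hashing scheme and the example values are all assumptions.

```python
import hashlib


def ref_name(stated_in, reference_url):
    """Build a ref name from P248 (stated in) and P854 (reference URL).

    Hypothetical scheme: the P248 value stays readable as a prefix and a
    short hash of the (P248, P854) pair makes the name unique per URL.
    """
    key = f"{stated_in}|{reference_url or ''}"
    digest = hashlib.sha1(key.encode("utf-8")).hexdigest()[:8]
    return f"{stated_in}-{digest}"


# Same source (made-up item ID), two different URLs: the names differ,
# so the references would not be reported as conflicting definitions.
print(ref_name("Q123", "https://example.org/a"))
print(ref_name("Q123", "https://example.org/b"))
```

Identical (P248, P854) pairs still map to the same name, so genuinely repeated references would continue to be merged.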

Lists on translatable pages are messed up

Hi! Compare the page Extension:UniversalLanguageSelector with and without Parsoid; the ordered list in the "Usage" section is messed up. You can see the same with the unordered list in Extension:Translate#Features. What both pages have in common is that there is page translation markup in the wikitext, which probably messes with things. Jon Harald Søby (talk) 22:43, 3 December 2024 (UTC)

The display is not more broken than, for example, the Polish translation with either parser. The page uses markup that’s explicitly (with bold text!) discouraged by Help:Extension:Translate/Page translation administration#Markup examples – the hash marks and asterisks are within the translation units rather than outside of them. If they were outside, like they are on MediaWiki database policy, the Parsoid output would be fine. This kind of broken markup is unfortunately quite common, but this doesn’t make it less broken. —Tacsipacsi (talk) 19:57, 4 December 2024 (UTC)
Hi @Jon Harald Søby and thank you for reporting the issue! It looks similar to what happens in https://phabricator.wikimedia.org/T355667 - please let us know if you believe something else is happening!
As @Tacsipacsi mentions, there are ways to improve the translation markup so that it doesn't interfere with the rendering of the page.
Note that the rollout of Parsoid as a default parser for multi-lingual wikis like this one is currently planned for a later phase, precisely because we have things to fix around the Translate extension markup, which we intend to tackle before that point. IHurbainPalatin (WMF) (talk) 11:42, 5 December 2024 (UTC)
@IHurbainPalatin (WMF): Thanks! Yeah, that's definitely the same issue. Seems like everything's covered in that task already. :) Jon Harald Søby (talk) 21:43, 5 December 2024 (UTC)

Unexpected </tvar>

Compare this and this. In the list, there are some (seemingly) stray </tvar> tags (but no starting tags) in the links.

Looking at the wikitext, I think this might be solved by changing [https://url.com/ label] into either [https://url.com/ label] or <tvar name="1">[https://url.com/ label], but I haven't tested it (I am not a translation administrator on this wiki). Jon Harald Søby (talk) 12:16, 12 December 2024 (UTC)Reply

Hi @Jon Harald Søby,
Thanks for the report. I filed https://phabricator.wikimedia.org/T382131 for this. My sense is that this is not a high-priority issue (let me know if you disagree) as this wikitext is effectively trying to enclose something that will be rendered as an href attribute and something that will be rendered as text in a single tvar context, and that's not a construction we'd support a priori (enclosing a full link with attributes is perfectly ok!).
I believe that these two markups would be valid: [<tvar name="link">https://url.com</tvar> <tvar name="label">label</tvar>] (because that would separate what's in the href attribute from what's text) and <tvar name="link">[http://url.com example]</tvar> (because the whole link is now considered as a single untranslatable entity), and my tests seem to confirm that.
I would additionally argue that having only the [ markup be part of the translation unit is probably suboptimal for translators. IHurbainPalatin (WMF) (talk) 09:47, 16 December 2024 (UTC)
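
As a rough aid for finding similar cases, the sketch below flags a tvar that sits inside an external link and covers both the URL and the label, which is the construction described above as unsupported. It is a heuristic based on regular expressions, not how Parsoid or the Translate extension actually parse wikitext; the function name and the sample strings are assumptions, and only the name="..." form of tvar is handled.

```python
import re

# External-link brackets with no nested brackets inside (heuristic).
LINK = re.compile(r"\[([^\[\]]*)\]")
# <tvar name="...">...</tvar>, the only tvar form this sketch handles.
TVAR = re.compile(r'<tvar name="[^"]*">(.*?)</tvar>', re.DOTALL)
# Content that looks like the start of an external URL.
URL_START = re.compile(r"(?:https?:)?//", re.IGNORECASE)


def tvars_spanning_target_and_label(wikitext):
    """Return tvar snippets that wrap both a link target and its label."""
    flagged = []
    for link in LINK.finditer(wikitext):
        for tvar in TVAR.finditer(link.group(1)):
            content = tvar.group(1).strip()
            # A URL followed by whitespace means the label is inside too.
            if URL_START.match(content) and re.search(r"\s", content):
                flagged.append(tvar.group(0))
    return flagged


samples = [
    '[<tvar name="1">https://url.com/ label</tvar>]',
    '[<tvar name="link">https://url.com</tvar> <tvar name="label">label</tvar>]',
    '<tvar name="link">[http://url.com example]</tvar>',
]
for sample in samples:
    print(bool(tvars_spanning_target_and_label(sample)), sample)
```

Only the first sample is flagged; the two markups suggested above as valid pass the check.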