Talk:Parsing

Parser problem? ticket in phabricator.

1
Summary by SSastry (WMF)

Followups on the ticket.

Herzi Pinki (talkcontribs)

Hi folks, I come here because of https://phabricator.wikimedia.org/T209236 (and because AKlapper recommended it). I guessed that this issue should be assigned to MediaWiki-Parser, but I might be wrong. Could you please have a look at the ticket and assign a priority to it? If my guess is wrong, please move it to the right project. Best, --Herzi Pinki (talk) 18:49, 23 November 2018 (UTC)

197.235.56.207 (talkcontribs)

This seems to have been caused by a recent change; searching Phabricator keywords, the most likely cause is https://phabricator.wikimedia.org/T206940.

Until it is fixed or reverted, these links are easy to fix manually, especially if they are generated by a template. Simply reorder the parameters, replacing this:

[[File:Erioll world.svg|15px|alt=Welt-Icon|link=//tools.wmflabs.org/geohack/geohack.php?pagename=Hirtenberg&language=de&params=47_N_16_E]]


With this:

[[File:Erioll world.svg|15px|alt=Welt-Icon|link=//tools.wmflabs.org/geohack/geohack.php?params=47_N_16_E&pagename=Hirtenberg&language=de]]

The link below should be working: Welt-Icon
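
For templates that build such links programmatically, the same reordering can be sketched in Python. This is only an illustration, assuming the breakage comes from the `&para` prefix of `&params` being read as a character reference; the helper name `move_param_first` is hypothetical and not part of MediaWiki or GeoHack. It moves the named parameter to the front of the query string so that it follows `?` instead of `&`.

```python
def move_param_first(url: str, name: str = "params") -> str:
    """Return url with the query parameter `name` moved to the front.

    Plain string handling is used on purpose so that nothing in the
    URL gets re-encoded. Fragments ("#...") are not handled here.
    """
    base, sep, query = url.partition("?")
    if not sep:
        return url  # no query string, nothing to reorder
    pairs = query.split("&")
    first = [p for p in pairs if p.split("=", 1)[0] == name]
    rest = [p for p in pairs if p.split("=", 1)[0] != name]
    return base + "?" + "&".join(first + rest)


broken = ("//tools.wmflabs.org/geohack/geohack.php"
          "?pagename=Hirtenberg&language=de&params=47_N_16_E")
fixed = move_param_first(broken)
print(fixed)
```

The result has `params` directly after the `?`, which matches the manually fixed link above.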

Herzi Pinki (talkcontribs)

facepalm. Thanks for the hint. Workaround works like a charm. The issue is still open. --Herzi Pinki (talk) 21:06, 23 November 2018 (UTC)

SSastry (WMF) (talkcontribs)

Thanks for flagging it and filing the phab task. And thanks, anonymous, for the workaround. Since this is being tracked in Phabricator, I am going to close the discussion here so that we track it in one place.

Wikitext vs VisualEditor lint errors

8
197.218.91.124 (talkcontribs)

I'm sure this has been asked thousands of times, and previously ignored because there was no way of checking automatically. However, now that the Parsoid / Linter extension combo is able to detect and report markup errors, this might be a perfect time to create an automatic log / graph of the errors introduced by each of these tools, making it possible to measure, at least roughly, whether VisualEditor really lives up to its aspiration of reducing mistakes.

This may also help find previously unknown errors caused by either of these tools, so having such logging will certainly be a good idea anyway. This data could also be mined by automated tools such as bots and ORES to facilitate its cleanup.

Considering that smaller wikis don't have bots to regularly clean up these errors, it might also be useful to evaluate how long these errors tend to stick on articles, especially for errors that prevent the page from being viewed properly.

Legoktm (talkcontribs)

What kind of errors are you specifically thinking of?

197.218.91.124 (talkcontribs)

Mostly errors that either break the whole page or prevent it from working properly; some of these are quite severe:

There are plenty more, such as an unclosed html comment affecting page rendering under certain conditions (https://phabricator.wikimedia.org/T30939).

Some are quite easy to miss when writing wikitext or when section editing since markup in one section may be perfectly fine and preview well enough, but break the whole page when saved. Most of them however are quite visible in VE.

197.218.82.68 (talkcontribs)

Here's a classic and pretty nasty one, nested extension tags, e.g. (https://phabricator.wikimedia.org/T22707):

<ref>
<ref> Eureka
</ref>
</ref>
<Gallery> file 2.png
<gallery> file1.png <gallery/>
</gallery>
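
A rough sketch of how this kind of nesting could be flagged automatically. It is an illustration only, not how Parsoid or the Linter extension detects it: the function name `find_nested_tags` and the tag list are assumptions, and the regex-based scan ignores `<nowiki>`, HTML comments and templates.

```python
import re

# Opening, closing and self-closing forms of the tags we care about.
TAG_RE = re.compile(r"<(/?)(ref|gallery)\b[^>]*?(/?)>", re.IGNORECASE)


def find_nested_tags(wikitext: str) -> list[tuple[str, int]]:
    """Return (tag name, offset) for every opening tag that appears
    while a tag of the same name is still open."""
    depth: dict[str, int] = {}
    nested = []
    for m in TAG_RE.finditer(wikitext):
        closing, name, self_closing = m.group(1), m.group(2).lower(), m.group(3)
        if self_closing:
            continue  # e.g. <ref name="x" /> opens nothing
        if closing:
            depth[name] = max(0, depth.get(name, 0) - 1)
        else:
            if depth.get(name, 0) > 0:
                nested.append((name, m.start()))
            depth[name] = depth.get(name, 0) + 1
    return nested


sample = """<ref>
<ref> Eureka
</ref>
</ref>
<Gallery> file 2.png
<gallery> file1.png <gallery/>
</gallery>
"""
for name, offset in find_nested_tags(sample):
    print(f"nested <{name}> at offset {offset}")
```

On the markup above this reports one nested `ref` and one nested `gallery`; the self-closing `<gallery/>` is skipped.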

If I had to rate the most common issues (along with the others above) they'd probably be in this order:

Issue                        Task    Frequency
Misnested tables             T64323  High
Unclosed tags                T59196  High
Misnested or broken lists    T3581   High
Links being misnested        T13239  Low

Most of these occurred on a non-WMF wiki, so if this happens on a low-traffic wiki, it must be much worse on a wiki used by thousands of editors. Just look through Phabricator and you'll find hundreds more cases. Lists in particular are pretty problematic because they rarely work the way people hope they will. The team might also want to look into wiki validator (https://github.com/Wikia/wiva).

Perhaps there might be something useful in the code.

Jdforrester (WMF) (talkcontribs)

Hey there.

One of the design philosophies of VisualEditor as a tool is that it helps editors make edits "directly", and never mysteriously does things just because we think the user should do them. We felt very strongly that doing otherwise would be a fundamental blocker to adoption on Wikimedia wikis, as it has to live in an environment with people using other tools to edit via wikitext. Consequently we've spent a lot of effort to avoid changing things unexpectedly, especially in the Parsoid service, even when we know that something is wrong. For example, when you edit an image we'll fix up its syntax, as an expert wikitext user would, but we won't ever change the syntax of things the editor didn't touch.

Because of this, measuring the before/after impact of parsing errors from VisualEditor edits probably wouldn't be very helpful – you'd expect them to go down, but you wouldn't see a nullification of all such errors.

I'd be pretty worried about tools to do mass-level changes inside VisualEditor, as people might struggle to understand exactly what they're changing and why. In the future, we're planning (T128511) to provide a "prompt the user to do something", but that would still be a per-item-fix interface.

Hope that helps explain my thinking.

197.218.82.95 (talkcontribs)
 Because of this, measuring the before/after impact of parsing errors from VisualEditor edits probably wouldn't be very helpful – you'd expect them to go down, but you wouldn't see a nullification of all such errors.

The design philosophy is pretty reasonable, and you've raised a very valid point about the error detection not really proving much for existing articles. However, it probably would provide very useful data for page creation, e.g.:

  1. On average how many pages created have markup errors
  2. How many errors are introduced by new editors vs experienced editors
  3. Which types of errors occur most often
  4. What page components generate most errors
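
As a rough illustration of how the first three questions could be answered from logged data, here is a Python sketch. Everything in it is an assumption: the `PageCreation` record, its field names, the editor labels and the sample rows are made up for the example and do not reflect any existing log format. The fourth question would additionally need the affected page component recorded per error.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class PageCreation:
    editor: str             # hypothetical label, e.g. "visualeditor" or "wikitext"
    new_editor: bool        # True if the page creator is a new editor
    lint_errors: list[str]  # lint category names found on the new page


def summarize(creations: list[PageCreation]) -> dict:
    """Aggregate lint errors on newly created pages."""
    total = len(creations)
    with_errors = sum(1 for c in creations if c.lint_errors)
    by_editor, by_experience, by_type = Counter(), Counter(), Counter()
    for c in creations:
        by_editor[c.editor] += len(c.lint_errors)
        by_experience["new" if c.new_editor else "experienced"] += len(c.lint_errors)
        by_type.update(c.lint_errors)
    return {
        "share_with_errors": with_errors / total if total else 0.0,
        "errors_by_editor": dict(by_editor),
        "errors_by_experience": dict(by_experience),
        "errors_by_type": by_type.most_common(),
    }


# Made-up sample rows, only to show the shape of the report.
sample = [
    PageCreation("wikitext", True, ["missing-end-tag", "misnested-tag"]),
    PageCreation("visualeditor", True, []),
    PageCreation("wikitext", False, ["missing-end-tag"]),
    PageCreation("visualeditor", False, []),
]
report = summarize(sample)
print(report)
```

Counting raw errors per group is the simplest choice; normalising by the number of pages or by the amount of markup added would probably give a fairer comparison between tools.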

VE is very intuitive, but it does sometimes have surprising interactions when things are pasted, dragged, or when templates are mixed with other extensions.

I'd be pretty worried about tools to do mass-level changes inside VisualEditor, as people might struggle to understand exactly what they're changing and why

I agree entirely. Magic-like behaviour can be very problematic, and could make even simple edits create complex revisions.

In the future, we're planning (T128511) to provide a "prompt the user to do something", but that would still be a per-item-fix interface.

Long live Microsoft Clippy!

Anyway, this seems like a very good way to warn users about things they've overlooked or that may cause problems. The wiva tool has some functionality that fits in nicely with that task, e.g. it flags huge images that may not render properly on mobile.

This is useful because currently neither VE nor the WTE makes any attempt to warn the user about markup issues, article readability, or usability problems.

Jdforrester (WMF) (talkcontribs)
However, it probably would provide very useful data for page creation

OK, you've convinced me; I've created https://phabricator.wikimedia.org/T162958 and hopefully we'll have a moment to measure it soon enough.

Anyway, this seems like a very good way to warn users about things they've overlooked or that may cause problems. The wiva tool has some functionality that fits in nicely with that task, e.g. it flags huge images that may not render properly on mobile.

Yeah, accessibility and content scale/size/depth hints are one of the things on my wishlist for this prompt tool.

197.218.88.255 (talkcontribs)
OK, you've convinced me; I've created https://phabricator.wikimedia.org/T162958 and hopefully we'll have a moment to measure it soon enough.

Great. My hunch is that on new pages, novices don't add as much markup (on average) with the "source editor", yet introduce more errors than VisualEditor users. When they do add complicated markup they probably cause errors by imperfectly copying the markup from existing pages.

Yeah, accessibility and content scale/size/depth hints are one of the things on my wishlist for this prompt tool.

Indeed, this is a very hard problem when one considers the variety of devices. Most editors are not trained to understand usability, and so it is something that is hard for them to grasp. "Online help"/ tooltips about why these are bad might be helpful to users.

There is probably a big issue with the perception of VisualEditor vs the wikitext editor. As all VisualEditor edits are tagged but wikitext edits aren't, it gives the impression that everything that isn't tagged is automatically a wikitext edit. That isn't the case, because one could conceivably be editing using a customized wiki application, the API, a fridge, a bot, or something else.

Maybe a new tag "API edit" or "unknown editing tool" should be added.

Evolving wikitext

Rogol Domedonfors (talkcontribs)

This is a rather important topic in terms of its potential to break existing content and impact on the ways contributors and (human) editors work. It merits a project page here with a broad description of the goals and the ways it is proposed to achieve them. A Phabricator search page is not an adequate substitute. In particular I note that it is proposed to reduce the power of templates to make it easier for software (I assume VE is a main driver here) to handle them. This really needs community engagement -- there's a huge investment in template technology closely adapted to the current software and there needs to be a clear engagement with the community before major impacts on their ways of working. As I point out in another topic, the absence of a locus for community engagement is expected to be a blocker in and of itself.

SSastry (WMF) (talkcontribs)

Understood. The evolving wikitext ideas are, at this time, still at an idea phase and you might have seen the notes at Parsing/Notes/Wikitext 2.0, Parsing/Notes/Wikitext 2.0/Strawman Spec ... those are not accepted ideas of what will happen, but more a proposal of what we might want to work towards in the long run. Even if that were the plan (which it is not, it is just my personal high-level proposal), it is not something that we will just flip the switch one day ... but something that we will incrementally work towards. And yes, I agree about the need for community conversation and engagement.

But, https://phabricator.wikimedia.org/T114445 is one of the first steps that is actually in an RFC phase.

Rogol Domedonfors (talkcontribs)

Thank you. I see the Phabricator task has been open for a year now. Phabricator is not a suitable venue for a community engagement exercise, although it may be capable of sustaining informal staff discussions around an existing task. You really need to hold that engagement somewhere more suitable like here or Meta. Presumably you have the stakeholder mapping and communications plan ready to go?

Parser unification project

3
Rogol Domedonfors (talkcontribs)

Since this is sufficiently important to merit its own mention in the Annual Plan (Product Programme 4 Objective 1 Goal 1), I suggest that it should have a page here to describe the goals, roadmap and so forth, as a locus for community engagement. After all, the discussion around the Technical Collaboration Guideline/Community decisions makes it clear that the absence of such a resource would be a blocker for community adoption in and of itself.

SSastry (WMF) (talkcontribs)

We are still figuring out the specifics of all of this and it has been one of the topics of our offsite. The unification has been long overdue and we will publish more details of our thinking as it comes into clearer focus. But, till such time, you can see some thoughts at Parsing/Notes/Two Systems Problem.

Rogol Domedonfors (talkcontribs)

Thank you for that reference, but it does rather sound as if you take the view that this is purely a matter for internal discussion. I suggest that you involve the community of people who use the editor(s) and parser(s) on a regular basis who will have some useful insights as to what does and does not work well in the various systems at present, and where they might want to see the tools that they use going. Having a prominent pointer here to Parsing/Notes/Two Systems Problem or wherever you choose to host that community engagement, with a framework to structure the discussions, would undoubtedly be of value to the project at this stage.
