The Linter extension identifies wikitext patterns that must or can be fixed in pages, along with some guidance about what the issues with those patterns are and how to fix them.
Special:LintErrors groups pages with errors by type. Some of these issues may be easier to find with Special:Expandtemplates. On this page, we classify lint issues according to the severity of each issue with respect to the goals it blocks. More information and discussion about this is provided below.
We will continue to improve the functionality to eliminate noise, fix bugs, and make the linter output more actionable, but the current output is ready to be used and acted upon.
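The per-type grouping shown on Special:LintErrors can also be read through the MediaWiki action API. The following is a minimal sketch that only builds the request URL and does not contact any wiki. The helper name `lint_errors_url` and the example endpoint are hypothetical, and the `list=linterrors` module with its `lnt`-prefixed parameters is stated here as we understand it, so check the API help page of your own wiki before relying on it.

```python
from urllib.parse import urlencode


def lint_errors_url(api_base, category, namespace=0, limit=50):
    """Build a query URL listing pages that have lint errors in one category.

    The parameter names below are assumptions about the Linter API module;
    confirm them against your wiki's api.php help output.
    """
    params = {
        "action": "query",
        "list": "linterrors",
        "lntcategories": category,   # e.g. "obsolete-tag"
        "lntnamespace": namespace,   # 0 is the main (article) namespace
        "lntlimit": limit,
        "format": "json",
    }
    return f"{api_base}?{urlencode(params)}"


# Hypothetical endpoint, used only to show the shape of the URL.
print(lint_errors_url("https://example.org/w/api.php", "obsolete-tag"))
```

Fetching the URL with any HTTP client should return the matching entries as JSON, provided the wiki has the Linter extension installed.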
Documentation of lint issues
Why and what to fix
Going forward, the parsing team plans to use the Linter extension to identify wikitext patterns:
- that are erroneous (e.g. bogus image options, usually caused by typos or because media option parsing in MediaWiki is fragile)
- that are deprecated (e.g. self-closing tags)
- that can break because of changes to the parsing pipeline (e.g. replacing Tidy with RemexHTML)
- that are no longer valid in HTML5 (e.g. obsolete tags like center, font)
- that are potentially broken and can be misinterpreted by the parser compared to what the editor intended (e.g. unclosed HTML tags, misnested HTML tags)
Not all of them need to be fixed promptly or even ever (depending on your tolerance for lint). Different goals are advanced by fixing different subsets of the above lint issues. We (the parsing team) will try to be transparent about these goals and will provide guidance about which goals are advanced by fixing which issues.
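To make two of the pattern kinds above concrete, here is a rough Python sketch that flags self-closed non-void HTML tags and obsolete tags in a piece of wikitext. It is only an illustration: the Linter extension itself relies on Parsoid's parse of the page, not on regular expressions, and the function name `rough_lint` and the tag sets used here are assumptions chosen for the example.

```python
import re

# Illustrative subsets only; the real lint categories cover more tags.
OBSOLETE_TAGS = {"center", "font", "big"}
# Non-void HTML elements for which a self-closing form such as <div/> is a problem.
# Extension tags such as <ref /> are deliberately left out.
NON_VOID_TAGS = {"div", "span", "b", "i", "p", "small", "center", "font", "big"}

SELF_CLOSED = re.compile(r"<([a-zA-Z][a-zA-Z0-9]*)\b[^<>]*/>")
OPEN_TAG = re.compile(r"<([a-zA-Z][a-zA-Z0-9]*)\b[^<>]*>")


def rough_lint(wikitext):
    """Return (category, tag name, offset) tuples for two simple patterns."""
    findings = []
    for match in SELF_CLOSED.finditer(wikitext):
        name = match.group(1).lower()
        if name in NON_VOID_TAGS:
            findings.append(("self-closed-tag", name, match.start()))
    for match in OPEN_TAG.finditer(wikitext):
        name = match.group(1).lower()
        if name in OBSOLETE_TAGS:
            findings.append(("obsolete-tag", name, match.start()))
    return findings


print(rough_lint("<div/><center>x</center><br />"))
```

The category strings mirror the lint category names used on this page, but the matching rules are far cruder than what the parser does, so treat any hits as hints rather than as the linter's verdict.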
Simplified instructions are provided on the FAQ page.
Goal: Replacing Tidy
As part of addressing technical debt in the parsing pipeline of MediaWiki, we replaced Tidy with an HTML5-based tool. However, doing so would have broken the rendering of a small subset of pages unless certain wikitext patterns were fixed. Specifically, issues found in the tidy-font-bug categories. In order to replace Tidy in a timely manner, we classified all these issues as high priority.
Right now, the HTML generated by the PHP parser is used for read views, and the HTML generated by Parsoid is used by editing tools and the Android app, among others. One of the parsing team's long-term objectives is to enable the use of Parsoid's output for both read views and editing. Since Parsoid and RemexHTML are both HTML5-based tools, the lint categories that affect RemexHTML's rendering also affect Parsoid's rendering. We haven't yet identified any newer lint issues that affect Parsoid's rendering, but we will update this list as we identify any.
This is a somewhat complex goal, and we haven't yet arrived at an understanding of how important it is to pursue or how far we should go with it. Additionally, it is not yet clear what mechanisms we wish to use towards this goal. For example, based on discussions in different venues, User:Legoktm/HTML+MediaWiki outlines a proposal for handling the HTML5-deprecated big tag. In any case, fixing issues in the self-closed-tag categories advances this goal. Given the lack of clarity around this goal, we have accordingly marked the obsolete-tag category as low priority.
Goal: Clarifying editor intent
Getting markup right is hard. Errors creep in inadvertently. While the parser does its best to recover from these errors, in many cases what the parser does might not truly reflect the editor's original intent. Given that, we recommend fixing the issues identified here to clarify the editor's intention. Issues in the missing-end-tag categories seem to affect this goal. Since this is a fairly important goal, we have marked most of them with medium priority. However, we have marked the missing-end-tag category with low priority since, in the vast majority of cases, the parser does seem to recover fairly accurately. Nevertheless, we recommend fixing whatever can be fixed without too much effort, if only to assist comprehension by other human editors and tools.
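As an illustration of what a missing end tag looks like to a tool, the sketch below walks the HTML-style tags in a snippet with a stack and reports the ones that were opened but never closed. It is a simplified stand-in and not the recovery logic of Parsoid or any other MediaWiki parser; the function name `unclosed_tags` and the short list of void elements are assumptions made for the example.

```python
import re

TAG = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)\b[^<>]*?(/?)>")
# Elements that never take an end tag (illustrative subset).
VOID = {"br", "hr", "img", "wbr"}


def unclosed_tags(wikitext):
    """Return the names of tags that were opened but never closed."""
    unclosed = []
    stack = []
    for closing, name, self_closed in TAG.findall(wikitext):
        name = name.lower()
        # Simplification: void and self-closed tags are treated as balanced.
        if name in VOID or self_closed:
            continue
        if not closing:
            stack.append(name)
        elif name in stack:
            # Everything opened after the matching opener was left unclosed.
            while stack[-1] != name:
                unclosed.append(stack.pop())
            stack.pop()
        # A stray closing tag with no opener is ignored in this sketch.
    return unclosed + stack


print(unclosed_tags("<div><b>bold</div>"))
```

In the example input the b element is never closed before its parent div ends, which is the kind of case where a reader or tool cannot be sure how far the editor meant the bold formatting to extend.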
Goal: Clean markup
Getting markup right is hard. Even in the presence of errors, the parser does a fairly decent job in most cases of figuring out how a piece of markup is supposed to render. But, in much the same way that typos, punctuation slips and minor grammatical errors can feel unsettling, some editors or those with a developer mindset might find lint issues in these categories unsettling. We don't recommend spending an inordinate amount of time fixing these issues and, in many scenarios, bots might be able to fix them as well.
stripped-tag lint categories affect this goal.
When are lint errors for a page updated?
Currently, all lint categories are populated by errors that Parsoid identifies while parsing a page. When a page (or a template transcluded on it) is edited, ChangeProp requests a re-parse of that page from Parsoid, which then sends the fresh results to the Linter extension.
This means that when a new category is introduced (or a correction is made to an existing category), it may take a while for all the results to be updated (if ever, for pages that are rarely touched). Making a null edit speeds up the process for an individual page. However, in phab:T161556, we're exploring ways to reprocess all pages.
Should pages in X namespace (e.g. talk) be fixed?
- WPCleaner – a Java program that interfaces with Linter and can also detect some of the errors
- ja:User:MawaruNeko/ShowPageLintError.js – a user script that shows all lint errors on a page
- Bot by User:星耀晨曦 that can fix multiple-unclosed-formatting-tags errors.