Parsoid/Round-trip testing/Diffs
False positive reports (mostly fixed with a better rt-testing diffing strategy)
In many cases, this seems to happen because the double-rt diffing compares mismatched sections, probably because of the diffs that the wt-diff algorithm returns.
In some cases, it could be because of DSR inaccuracies.
- http://localhost:8000/_rt/itwiki/British_Grand_Prix_2009
- http://localhost:8000/_rt/eswiki/Festival_de_la_Canci%C3%B3n_de_Eurovisi%C3%B3n_1963
- http://localhost:8000/_rt/arwiki/%D8%AD%D8%B3%D9%86_%D8%A7%D9%84%D8%AF%D9%85%D8%B3%D8%AA%D8%A7%D9%86%D9%8A
- http://localhost:8000/_rt/hiwiki/%E0%A4%B5%E0%A4%BE%E0%A4%AF%E0%A5%81_%E0%A4%AA%E0%A5%8D%E0%A4%B0%E0%A4%A6%E0%A5%82%E0%A4%B7%E0%A4%A3
- http://localhost:8000/_rt/hewiki/%D7%9E%D7%99%D7%9B%D7%90%D7%9C_%D7%94%D7%A0%D7%A7%D7%94
- http://localhost:8000/_rt/itwiki/Vanessa_Branch
- http://localhost:8000/_rt/ruwiki/%D0%A8%D0%BB%D0%B8%D0%B2%D0%B8%D1%87%2C_%D0%9D%D0%B5%D0%BD%D0%B0%D0%B4
- http://localhost:8000/_rt/hewiki/%D7%90%D7%9C%D7%9B%D7%A1%D7%A0%D7%93%D7%A8_%D7%90%D7%98%D7%99%D7%99%D7%9F_%D7%A9%D7%95%D7%A8%D7%95%D7%9F
- http://localhost:8000/_rt/kowiki/%ED%99%8D%EC%BD%A9_%EC%A7%80%ED%95%98%EC%B2%A0 (plus <references />)
- http://localhost:8000/_rt/itwiki/Take_Me_Home_%28One_Direction%29
- http://localhost:8000/_rt/svwiki/Lista_%C3%B6ver_avsnitt_av_Game_of_Thrones
- http://localhost:8000/_rt/hiwiki/%E0%A4%95%E0%A5%8D%E0%A4%B0%E0%A4%BF%E0%A4%95%E0%A5%87%E0%A4%9Fa (??)
- http://localhost:8000/_rt/kowiki/%EC%9D%B4%EC%98%81%ED%95%9C
- http://localhost:8000/_rt/hewiki/%D7%9E%D7%A8%D7%99%D7%95%D7%9F_%D7%A0%D7%A1%D7%98%D7%9C
- http://localhost:8000/_rt/arwiki/%D9%85%D8%AA%D9%84%D8%A7%D8%B2%D9%85%D8%A9_%D8%A7%D9%84%D8%B9%D9%88%D8%B2_%D8%A7%D9%84%D9%85%D9%86%D8%A7%D8%B9%D9%8A_%D8%A7%D9%84%D9%85%D9%83%D8%AA%D8%B3%D8%A8 (because of stray </ref>s)
- http://localhost:8000/_rt/plwiki/Rotterdam
- http://localhost:8000/_rt/eswiki/El_%C3%BAltimo_vals_%28canci%C3%B3n%29
- http://localhost:8000/_rt/frwiki/Championnat_du_monde_de_hockey_sur_glace_1938
Auto <references /> insertion (most false reports now fixed with a better rt-testing diffing strategy)
Many pages that lack a <references /> tag, and so get an auto-generated references section, show RT diffs when the <references /> tag is serialized. These diffs are being classified (incorrectly) as semantic diffs.
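One way a test harness could discount this kind of diff is to compare the original and round-tripped wikitext with any <references /> tag ignored. The sketch below only illustrates that idea: the helper names are hypothetical, it handles just the self-closing form of the tag, and it is not Parsoid's actual rt-testing code.

```python
import re

# Hypothetical helper: matches only the self-closing <references /> form.
REFS_TAG = re.compile(r"<references\s*/\s*>", re.IGNORECASE)

def _normalize(wikitext):
    """Drop <references />, trailing spaces per line, and trailing blank lines."""
    without_tag = REFS_TAG.sub("", wikitext)
    lines = [line.rstrip() for line in without_tag.splitlines()]
    while lines and lines[-1] == "":
        lines.pop()
    return "\n".join(lines)

def differs_only_by_references(original, round_tripped):
    """True if the two texts differ, but only by an added or dropped <references />."""
    if original == round_tripped:
        return False
    return _normalize(original) == _normalize(round_tripped)
```

A diff for which this returns True could then be reported as syntactic rather than semantic, under the assumption that the auto-generated references section renders the same either way.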
Link in links
- http://localhost:8000/_rt/zhwiki/%E5%8D%9A%E5%98%8E%E5%B0%94%E6%96%B9%E8%A8%80
- http://localhost:8000/_rt/hiwiki/%E0%A4%9C%E0%A5%89%E0%A4%A8_%E0%A4%B8%E0%A5%80%E0%A4%A8%E0%A4%BE (as well as empty list items)
- http://localhost:8000/_rt/zhwiki/%E6%9D%8E%E5%85%86%E8%89%AF (specifically <ref> in an ext-link) Ex: [http://google.com X<ref>foo</ref>]
- http://localhost:8000/_rt/arwiki/%D8%AC%D8%A7%D8%B3%D8%AA%D9%88%D9%86_%D9%81%D9%8A%D9%8A%D8%AA (+ <references />)
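The <ref>-in-ext-link pattern from the example above can be searched for in raw wikitext with a small script. This is a minimal sketch, assuming a simple regex is good enough for triage: the function name and pattern are hypothetical, and the pattern does not handle labels that themselves contain "]" (for example a wikilink inside the <ref>).

```python
import re

# Hypothetical pattern: a bracketed external link such as
# [http://example.com label]; group 1 captures the label text.
EXT_LINK = re.compile(r"\[(?:https?:)?//[^\s\]]+[ \t]+([^\]\n]*)\]")

# Matches an opening <ref> tag but not <references />.
REF_OPEN = re.compile(r"<ref[\s>/]", re.IGNORECASE)

def ext_links_with_ref(wikitext):
    """Return every ext-link in wikitext whose label contains a <ref> tag."""
    return [
        match.group(0)
        for match in EXT_LINK.finditer(wikitext)
        if REF_OPEN.search(match.group(1))
    ]
```

Running it over the wikitext of the pages listed above would be one way to confirm which of them contain this construct.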
{{lang|..}} template in plwiki
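Several pages on plwiki seem to be affected by the use of this template inside ext-link text, as in: [http://google.com Foo {{lang|en}}]. The template expands to a wikilink, which ends the ext-link early, as the output below shows.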
[subbu@earth lib] echo "[http://google.com foo {{lang|en}}]" | node parse --normalize --prefix plwiki --dump tplsrc
=================================
Szablon:Lang
---------------------------------
<span style="color:#009">([[język angielski|<span style="color:#005" title="Treść w języku angielskim (English)">ang.</span>]])</span>
---------------------------------
<p><a href="http://google.com">foo (</a><a href="Język_angielski" title="Język angielski"><span style="color:#005" title="Treść w języku angielskim (English)">ang.</span></a>)</p>
This should now be fixed: MatmaRex used a bot to repair 1000+ plwiki pages that had this broken wikitext.
Empty list items lost in RTing
- http://localhost:8000/_rt/hiwiki/%E0%A4%85%E0%A4%A3%E0%A5%81%E0%A4%B5%E0%A5%8D%E0%A4%B0%E0%A4%A4
- http://localhost:8000/_rt/jawiki/%E5%B0%91%E5%B9%B4%E9%AD%94%E6%B3%95%E5%A3%AB
- http://localhost:8000/_rt/arwiki/%D8%AC%D9%88%D8%A7%D8%B2_%D8%B3%D9%81%D8%B1_%D9%87%D9%86%D8%AF%D9%8A
- http://localhost:8000/_rt/arwiki/%D8%A8%D9%84%D9%8A%D8%BA_%D8%AD%D9%85%D8%AF%D9%8A
- http://localhost:8000/_rt/plwiki/Karel_Dobbelaere
- http://localhost:8000/_rt/hiwiki/%E0%A4%AC%E0%A5%8D%E0%A4%B0%E0%A5%89%E0%A4%95_%E0%A4%B2%E0%A5%87%E0%A4%B8%E0%A4%A8%E0%A4%B0
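To check whether a page's source actually contains empty list items before blaming the round trip, the wikitext can be scanned line by line. The sketch below assumes that an "empty list item" means a line made up only of list markers (*, #, : or ;) and optional trailing spaces; the function name is hypothetical.

```python
import re

# Assumption: an empty list item is a line of list markers and nothing else.
EMPTY_ITEM = re.compile(r"[*#:;]+[ \t]*")

def empty_list_items(wikitext):
    """Return the 1-based line numbers of list items that have no content."""
    return [
        number
        for number, line in enumerate(wikitext.splitlines(), 1)
        if EMPTY_ITEM.fullmatch(line)
    ]
```

Comparing the result for the original and the round-tripped wikitext would show which empty items were dropped.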
Fostered content from tables
- http://localhost:8000/_rt/arwiki/%D8%AF%D8%A7%D8%A1_%D8%A7%D9%84%D8%A3%D9%85%D8%B9%D8%A7%D8%A1_%D8%A7%D9%84%D8%A7%D9%84%D8%AA%D9%87%D8%A7%D8%A8%D9%8A
- http://localhost:8000/_rt/jawiki/%E3%82%AA%E3%83%AA%E3%82%B3%E3%83%B3%E3%83%81%E3%83%A3%E3%83%BC%E3%83%88
- http://localhost:8000/_rt/kowiki/%EC%9D%B4%EC%98%81%ED%95%9C
- http://localhost:8000/_rt/zhwiki/%E6%96%90%E8%BF%AA%E5%8D%97%E4%B8%80%E4%B8%96_%28%E8%91%A1%E8%90%84%E7%89%99%29
Fostering of lists from tables
- http://localhost:8000/_rt/jawiki/%E3%82%B1%E3%83%B3%E3%83%89%E3%83%BC%E3%83%AB%E9%83%A1_%28%E3%82%A4%E3%83%AA%E3%83%8E%E3%82%A4%E5%B7%9E%29
- http://localhost:8000/_rt/jawiki/%E8%95%83%E5%B1%B1%E4%B8%98%E9%99%B5
- http://localhost:8000/_rt/zhwiki/%E8%89%BE%E7%88%BE%E4%B9%8B%E5%85%89
- http://localhost:8000/_rt/nlwiki/Heinz_Hellmich
- http://localhost:8000/_rt/nlwiki/Wilhelm_List
Loss of duplicate transclusion params
This seems to show up on multiple pages in rt-testing.
After the fixes to mimic newline suppression before categories, these are now properly recognized as syntactic diffs.
Paragraph-wrapping related false-positive semantic error reports (see https://phabricator.wikimedia.org/T89628) (now fixed with rt-diff fixes)
Lots of reports that should really be syntactic diffs.
Block-tag generating transclusions with leading whitespace introduce conservative nowiki protection around whitespace during RTing
Weird partial {{! output in rt-ing
This turned out to be a bug in DSR computation. A patch is now in Gerrit.
Implicit <td> insertion
Nowiki-ing of bad transclusion
- http://localhost:8000/_rt/arwiki/%D8%A5%D8%AF%D8%A7%D8%B1%D8%A9_%D8%A7%D9%84%D8%BA%D8%B6%D8%A8 (+ <references />) -- the semantic diff report is caused by the <references /> diff.
Bad tokenization of !! in <td> (https://phabricator.wikimedia.org/T91411)
Multi-line xml tag parsing
http://localhost:8000/_rt/kowiki/%EC%9B%A8%EB%AF%BC%EC%A5%94 -- "<foo\nbar\n<baz>" parses as an xml tag with a tag name spanning 3 lines. Not sure if multi-line xml tag names are valid.
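For comparison, the HTML tokenizer ends a tag name at the first whitespace character, "/" or ">", so under those rules a tag name cannot span lines. The sketch below applies that convention with a regex to the input above; it is an illustration only, with a hypothetical function name, and is not how Parsoid's tokenizer is written.

```python
import re

# Assumption: follow the HTML convention that a tag name starts with an ASCII
# letter and ends at the first whitespace character, "/" or ">".
TAG_NAME = re.compile(r"<([A-Za-z][^\s/>]*)")

def tag_name(source):
    """Return the tag name at the start of source, or None if there is none."""
    match = TAG_NAME.match(source)
    return match.group(1) if match else None

print(tag_name("<foo\nbar\n<baz>"))  # prints: foo
```

Under this rule the name is just "foo", and "bar" and "<baz" would be left to whatever handles attributes.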
Other
http://localhost:8000/_rt/enwiki/Markus_Fagervall -- seems to be fixed
http://localhost:8000/_rt/kowiki/%ED%8C%A8%EB%B9%84%EC%BD%98 -- the following snippet demonstrates the issue
[subbu@earth tests] echo '<link rel="shortcut icon" href="<nowiki>http://www.example.com/myicon.ico</nowiki>" />' | node parse --wt2wt
<link rel="shortcut icon" href="<nowiki>http://www.example.com/myicon.ico</nowiki>" />
Bad quoting (<ref name="foo'>..</ref>)
http://localhost:8000/_rt/nlwiki/Fred_Vargas -- Edited the page and fixed the quoting.
Bad rt-ing of chess table
Semantic errors now fixed -- these are all syntactic errors now.