I think you're making some serious mistakes there. I have also been thinking about such a live editor and semantic autoformatting, and instead of starting to hack (OK, I did, and rewrote Preprocessor_DOM in JavaScript) I pondered a lot about parsing and editing. I would have loved to join the hackathon, but I had to study for my exams.
- First of all you're completely forgetting "inclusion zones" (<includeonly> etc.), which are even used in the article namespace (w:de:Wikipedia:WikiProjekt Begriffsklärungsseiten/FAQ#Wie funktioniert das nochmal mit dem Per-Vorlage-Einbinden und dem noinclude.2Fonlyinclude.3F) and on talk pages as a workaround for Extension:Labeled Section Transclusion (see the first sketch below this list).
- The problems described in the section #Constraints are more common than you seem to believe. Lots of things are based on the use of templates as and in attributes, e.g.
{| {{orangetable}} ...
and doing without them is unthinkable (see the second sketch below this list).
Templates are never complete documents; some are even designed to be table starters or new-liners. How would you parse a structure like {{#if:...| {{!-}} ...}}?
And some templates even need to be invalid on their own, because making them valid would make them much, much larger. Examples are de:Wikipedia:Formatvorlage Bahnstrecke#Beispielanwendung or de:Vorlage:Infobox Schiff/DokuOhneTyp#Beispiel.
- Another topic is the editing of template pages. A preview with test parameters (and a test environment) would be nice, as would dynamic nesting of templates etc. I can't see how to deal with such requests in the proposed DOM.
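To make the inclusion-zone point concrete, here is a minimal sketch of what the first parsing pass has to do with <includeonly> and <noinclude>. This is my own simplification, not the real Preprocessor_DOM code; <onlyinclude> is left out and {{ColoredBox}} is an invented template:

<syntaxhighlight lang="javascript">
// Minimal sketch of inclusion-zone handling (my own simplification, not
// the real Preprocessor_DOM logic; <onlyinclude> is ignored). When a page
// is transcluded, <noinclude> bodies are dropped and <includeonly> tags
// are unwrapped; when the page is rendered directly, it is the other way
// round.
function applyInclusionZones(wikitext, isTranscluded) {
    if (isTranscluded) {
        return wikitext
            .replace(/<noinclude>[\s\S]*?<\/noinclude>/g, '')
            .replace(/<\/?includeonly>/g, '');
    }
    return wikitext
        .replace(/<includeonly>[\s\S]*?<\/includeonly>/g, '')
        .replace(/<\/?noinclude>/g, '');
}

// The same source produces two different token streams, so an editor's
// document model cannot pretend those zones are ordinary text.
// ({{ColoredBox}} is an invented template used only for this demo.)
var src = 'Intro<noinclude> (documentation)</noinclude>' +
          '<includeonly>{{ColoredBox|orange}}</includeonly>';
console.log(applyInclusionZones(src, false)); // direct page view
console.log(applyInclusionZones(src, true));  // transcluded view
</syntaxhighlight>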
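And for the attribute and table-fragment point: the template bodies below are invented examples (though {{!-}} is commonly defined as a bare row separator), but they show why block structure can only be decided after expansion:

<syntaxhighlight lang="javascript">
// Sketch of why templates in attributes and table fragments defeat a
// top-down block model (the template bodies are invented examples).
var templates = {
    'orangetable': 'class="wikitable" style="background:orange;"',
    '!-': '|-'   // a row separator, invalid markup on its own
};

function expandTemplates(wikitext) {
    return wikitext.replace(/\{\{([^{}|]+)\}\}/g, function (m, name) {
        var body = templates[name.trim()];
        return body === undefined ? m : body;
    });
}

// Before expansion "{| {{orangetable}}" is not a complete table start, and
// whether {{#if:...|{{!-}}...}} adds a table row is unknown until the #if
// is evaluated. A model that froze the block structure earlier would have
// nowhere to put what the templates contribute.
console.log(expandTemplates('{| {{orangetable}}'));
console.log(expandTemplates('{{!-}}\n| colspan=2 | note'));
</syntaxhighlight>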
At first I also thought about a top-down document model, but I quickly came to the conclusion that this is only doable for very, very simple pages. An autoformatter that sees an unclosed table/div/whatever never knows what is hidden in the following templates. A live parser/autoformatter/semantic lexer has to use a bottom-up model, just like the current parser. The steps would be (a toy sketch follows the list):
- Getting the XML-like tag hooks, comments and inclusion handlers (what to do if they are malformed? Currently: they run to the end)
- Parsing headings, templates and template arguments
- Expanding the templates
- Parsing the wikitext into tables/blocks/images/whatever and doing the text annotations
- Tidying the generated HTML for output
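To make that ordering concrete, here is a deliberately tiny, runnable toy. It is mine and nothing like the real parser: it knows only comments, one hard-wired invented template ({{Hello}}), headings and bold text. The point is only the ordering, i.e. that block structure and annotations are parsed after expansion:

<syntaxhighlight lang="javascript">
// Toy sketch of the bottom-up pass ordering (my own, not MediaWiki code).
function stripCommentsAndTagHooks(text) {            // step 1
    return text.replace(/<!--[\s\S]*?-->/g, '');
}
function expandTemplates(text) {                      // steps 2 and 3
    // a single hard-wired, invented template so the toy runs on its own
    return text.replace(/\{\{Hello\}\}/g, "'''Hello'''");
}
function parseBlocksAndAnnotate(text) {               // step 4
    return text.split('\n').map(function (line) {
        var h = line.match(/^==\s*(.*?)\s*==$/);
        if (h) { return '<h2>' + h[1] + '</h2>'; }
        return '<p>' + line.replace(/'''(.+?)'''/g, '<b>$1</b>') + '</p>';
    }).join('\n');
}
function tidy(html) {                                 // step 5
    return html.replace(/<p><\/p>\n?/g, '');          // drop empty paragraphs
}

var page = "== Demo ==\n<!-- note -->{{Hello}}, this is '''bold''' text.";
console.log(tidy(parseBlocksAndAnnotate(expandTemplates(stripCommentsAndTagHooks(page)))));
// The bold markup coming out of {{Hello}} is only annotated because
// expansion ran before the block/annotation pass.
</syntaxhighlight>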
The current parser does the first two steps together; semantically they could be separated. I'm not sure about the fourth step, since I haven't dived into the source code yet, so maybe I'm writing nonsense about it.
My conclusion is that a semantic lexer has to start at the bottom, while an autoformatter or editing transaction needs to run back down from the top (the generated result). Anything else would narrow the syntax possibilities that are needed today.
Of course, I think it's right to have the document-block-annotatedText model as a data format for saving pages, with the possibility of rendering to HTML4, HTML5, PDF, RSS etc., for quickly generating cached content and, most of all, for creating diffs (a rough sketch follows). But for editing we will have to go deeper into wikitext, which will have to stay as uncomfortable as it is today, and templates should not be part of the DOM.
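Just to illustrate what I mean by that data format, here is a rough sketch of such a saved record and a block-level diff. Every field name is invented by me and is not a proposal for the actual format:

<syntaxhighlight lang="javascript">
// Invented sketch of a document-block-annotatedText record. The same
// structure could feed an HTML4/HTML5/PDF/RSS renderer, the parser cache,
// and a block-level diff.
var doc = {
    blocks: [
        { type: 'heading', level: 2,
          text: { content: 'Demo', annotations: [] } },
        { type: 'paragraph',
          text: { content: 'This is bold text.',
                  annotations: [ { type: 'bold', start: 8, end: 12 } ] } }
    ]
};

// Diffing becomes a comparison of serialised blocks instead of a line
// diff of raw wikitext.
function diffBlocks(a, b) {
    var changed = [];
    var n = Math.max(a.blocks.length, b.blocks.length);
    for (var i = 0; i < n; i++) {
        if (JSON.stringify(a.blocks[i]) !== JSON.stringify(b.blocks[i])) {
            changed.push(i);
        }
    }
    return changed;
}
</syntaxhighlight>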