Topic on Talk:VisualEditor on mobile/VE mobile default/Flow

How should we measure which editing interface is "better" for newer contributors?

PPelberg (WMF)

In running this A/B test, we are trying to figure out which editing interface is a “better” default for newer contributors.

To answer this question, we first need to define what “better” means so we can compare the two test groups and, ultimately, decide whether to explore making the mobile visual editor the default mobile editing interface for more contributors on more wikis.

How do you think we should measure which editing interface is "better" for newer contributors?

Alsee

The true goal is the health and productivity of our projects. In simpler and more measurable terms, the goal is to maximize contributions. The number of edits and the size of edits are poor measures, because work patterns may differ between the two editors. So as a practical matter, the metric we want is user retention and sustained contributions. Sustained contributions are particularly significant, as an edit by a knowledgeable, experienced user is far more reliable and valuable than edits by newbies. We want to look at new user retention over as long a time span as practicable. If you want to expand on that, you could count the number of days on which the user has been active. That would avoid any small-scale differences in editing style between the two editing environments.
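
As a rough illustration of the retention and active-days measures described above, here is a minimal Python sketch. The function names, the input shape (a mapping from user to the dates of their saved edits), and the 30-to-60-day retention window are all assumptions made for the example, not anything defined by the test plan.

```python
from datetime import date

def active_days(edit_dates):
    """Count distinct calendar days on which the user saved at least one edit."""
    return len(set(edit_dates))

def is_retained(edit_dates, window_start=30, window_end=60):
    """True if the user edited again window_start..window_end days after
    their first edit (start inclusive, end exclusive). Window is an assumption."""
    if not edit_dates:
        return False
    first = min(edit_dates)
    return any(window_start <= (d - first).days < window_end for d in edit_dates)

def retention_rate(edits_by_user):
    """Fraction of users in a test group who count as retained."""
    if not edits_by_user:
        return 0.0
    kept = sum(is_retained(dates) for dates in edits_by_user.values())
    return kept / len(edits_by_user)

# Hypothetical test group: user "a" returns after 35 days, user "b" does not.
group = {
    "a": [date(2019, 1, 1), date(2019, 1, 1), date(2019, 2, 5)],
    "b": [date(2019, 1, 1), date(2019, 1, 2)],
}
```

Counting distinct days rather than edits means one user who saves ten small edits in a sitting and another who saves one large edit are treated the same, which is the point of the measure.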

Metrics such as edit completion rate may be more convenient to measure, and may be useful for catching certain glaring issues. However, past studies on VE have demonstrated that there are complexities in defining and interpreting that metric. If edit completion rates were to point in the opposite direction from user retention and contributions, then obviously we should disregard the completion rate as irrelevant. For example, it's a not-uncommon part of the wikitext workflow to open additional throwaway edit sessions just to view or copy wikitext from a page. Closing such a session without saving does not indicate any sort of failure. The original VE research project explicitly excluded any session where the user closed the editor without any content change. Failing to account for that issue will result in invalidly low figures for the wikitext success rate.
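
To make the effect of that exclusion concrete, here is a small Python sketch of a completion rate computed with and without throwaway sessions. The session fields `changed` and `saved` are hypothetical names chosen for the example; the real instrumentation may record this differently.

```python
def completion_rate(sessions, exclude_unchanged=True):
    """Share of counted edit sessions that ended in a save.

    sessions: iterable of dicts with boolean keys 'changed' (the user altered
    the content) and 'saved' (the edit was published). Both keys are
    hypothetical names for this sketch.
    Returns None when no sessions are left to count.
    """
    sessions = list(sessions)
    if exclude_unchanged:
        # Drop sessions opened only to view or copy wikitext.
        sessions = [s for s in sessions if s["changed"]]
    if not sessions:
        return None
    return sum(1 for s in sessions if s["saved"]) / len(sessions)

# Hypothetical data: one saved edit, one abandoned edit, two view-only sessions.
sample = [
    {"changed": True, "saved": True},
    {"changed": True, "saved": False},
    {"changed": False, "saved": False},
    {"changed": False, "saved": False},
]
```

With this made-up sample, counting every session gives a rate of 0.25, while excluding the two view-only sessions gives 0.5.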

Another metric you want to look at is whether there is any preferential direction in users switching away from one editor and into the other. Assuming retention and contributions are roughly equal between the editors, obviously we should not be forcing new users to switch away from a bad initial default.
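
One simple way to look at switching direction is to compute, for each default, the share of users who went on to use the other editor. The sketch below assumes a hypothetical input of `(default_editor, editors_used)` pairs; the names and data shape are illustrative only.

```python
def switch_away_rates(users):
    """For each default editor, the fraction of users given that default
    who saved at least one edit with a different editor.

    users: iterable of (default_editor, editors_used) pairs, where
    editors_used is a collection of editor names. Hypothetical data shape.
    """
    totals = {}
    switched = {}
    for default_editor, editors_used in users:
        totals[default_editor] = totals.get(default_editor, 0) + 1
        if set(editors_used) - {default_editor}:
            switched[default_editor] = switched.get(default_editor, 0) + 1
    return {ed: switched.get(ed, 0) / n for ed, n in totals.items()}

# Hypothetical example: two users per default.
example = [
    ("visual", {"visual"}),
    ("visual", {"visual", "wikitext"}),
    ("wikitext", {"wikitext"}),
    ("wikitext", {"wikitext"}),
]
```

A large gap between the two rates would suggest a preferential direction of switching; similar rates would suggest none.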

Shifting to a related subject, I'd like to note that the Foundation has spent years trying to get positive results for VE. I'd also like to note that some of the documentation here continues to indicate a significant bias seeking a specific outcome on this research. The Foundation previously attempted to roll out a Visual-Editor-default as part of the SingleEditTab project. I'd also like to note that your graph on Per-interface retention rates shows that mobile and desktop retention rates for VE are roughly half of the rate for wikitext. When users are defaulted into VE, they generally either quit or switch to wikitext. I'd also like to note that your table for Use of the visual and wikitext editing interfaces shows that when wikitext is the default, nearly one hundred percent of editors stick with it, and when VE is the default the overwhelming majority of editors flee VE and switch to wikitext. For the last several years the Foundation has been battling the community trying to push VE, and the community has been continually fighting back and insisting that wikitext is the better tool for the job. When the Foundation attempted an unannounced rollout of a VE-default, there was a unanimous Polish demand that it be rolled back. Two other wikis (including EnWiki) went so far as to write hacks to the sitewide JavaScript to reverse the VE-default. A constant flow of new users into the community is extremely important to us. We are strongly motivated to protect their on-ramp to success.

And finally, I'd like to note that the test scenarios for positive test results lay out "a proposal to make the VisualEditor the default mobile editing interface on all wikis", but if the test results are negative they instead direct analysis to figure out why you didn't get the desired results. Can we please get that changed? If the research finds that a VE-default is actively harmful to new users, that obviously warrants an equal-and-opposite proposal to make wikitext the default mobile editing interface on all wikis.
