Talk:Growth/Personalized first day/Structured tasks


MMiller (WMF) (talkcontribs)

We can think of several editing workflows that could be structured, along with the help of algorithms. Here are some examples. Which of these workflows do you think have the most potential to be structured? Which ones would be useful for the wiki and which ones not useful? Are there others you can think of?

  • Add a link: algorithm recommends words or phrases that should be blue links, on articles that don't have many blue links. Newcomer decides whether the link really should be added and adds it.
  • Add an image: algorithm recommends images from Commons that might belong in the article. Newcomer decides if it is a good fit for the article and adds it.
  • Add a reference: algorithm recommends sentences or sections that need references. Newcomer goes out to find references and adds them in.
  • Add a section: algorithm recommends section headers that could be used to expand a short article. Newcomer finds sources and adds content.
Galendalia (talkcontribs)

I think adding a table, and the aspects that go with it, should be an advanced task, as a lot of articles have tables, some basic and some advanced.

MMiller (WMF) (talkcontribs)

@Galendalia -- interesting. Do you know of some way to identify articles that need tables but don't have them?

John Broughton (talkcontribs)

Adding wikilinks is not particularly useful (also, a "link" can be either a wikilink - internal - or an external [http] link; the latter are generally undesirable, at least in the English Wikipedia; and it is helpful to distinguish between the two). Adding maintenance templates is generally not useful.

John Broughton (talkcontribs)

Every task that is listed consists of two things - (a) changing an article, and (b) finishing the edit by publishing it (ideally, adding an edit summary). Starting out (as an editor) by making a minor change, such as fixing a typo, is a good way for editors to learn that second thing, which they will be using every single time that they edit. By contrast, adding a section involves (1) adding content (sentences), (2) adding citations, and (3) finishing by publishing.

In other words, "fix a typo" or "make a minor change" should, ideally, be the first structured task that an editor learns, because it incorporates the "finishing the edit by publishing it" micro-task. And once the editor has learned to do that micro-task, other tasks will be easier.

MMiller (WMF) (talkcontribs)

@John Broughton -- I think this is a good point, that every task teaches wiki skills (e.g. adding an edit summary) that are not part of the core task itself (e.g. adding wikilinks). We should keep in mind that as we structure the experience of editing, we may also be teaching other universal wiki skills and concepts. Other examples might be teaching users that their edit is immediately public (except in wikis with flagged revisions), or that they can see their edit on the history page.

Galendalia (talkcontribs)

I thought this was what the Wikipedia Adventure was for? It shows the basics of using WP; however, there is no obligation to go through it. If there were, 3/4 of our Teahouse questions would stop coming in. Galendalia (talk) 06:50, 20 May 2020 (UTC)

MMiller (WMF) (talkcontribs)

@Galendalia -- good question! Our team looked at the Wikipedia Adventure (and many other attempts at onboarding newcomers), and we've learned a lot. In summary, our current theory is that a good way to help newcomers stick around Wikipedia is to help them quickly have a positive editing experience. We think that if they can make a good contribution within minutes and understand its value, they will be excited and want to keep going. Whereas if they have to go through a long tutorial, they might lose patience and not stick around. So this idea, "structured tasks", is about how we can give newcomers a real editing experience, but with guardrails so that the experience is positive for them and for the wiki.

More background information: In a study on the Wikipedia Adventure, while a lot of users claimed to enjoy the experience, it unfortunately didn't statistically increase their retention, or any other important metrics. But in a study about the Teahouse, it was shown that being invited to the Teahouse does statistically increase retention. So our team took this all to mean that there is something valuable in the personal connection that happens with getting a question answered (although we know it takes a lot of time from experienced editors). That's why we decided to build the mentorship module for the newcomer homepage. And, to your point, as we deploy the mentorship module on more wikis, we are continually trying to strike the balance of giving newcomers a personal connection, while not overburdening the mentors who answer the questions.

Galendalia (talkcontribs)

I think the reason newcomers don't stick around is bullying by admins and the failure to follow the "don't bite the newcomer" rule. Since I started, and even today, I get admins telling me what to do and what not to do, as well as adding their own POV on why I should or should not be doing something. Two recent examples: last night I asked a question on IRC about BLP, for clarification from someone who I thought would have the answer, and their response was "You should find something else to do as you have bitten off more than you can chew as a newcomer." The second was today: an editor pinged me about removing the gnome and fairies tags from indefinitely blocked users' pages, which I did to clean up the active user lists, as they contained some 50 or so blocked users from years back to the present. That editor opened an ANI against me because he/she didn't get the answer they wanted. I think if admins and other people were to stay away from new members with their in-your-face routines (does not apply to all, but to some) and let normal editors be mentors, this would work great. There are definitely cliques in the admin and sysop teams that seem out to get newbies, and instead of being helpful they are rude. When I first joined I went into IRC to the en-help channel and got chastised because I did not have a cloak and had not been a Wikipedian for 3 months. When I asked about these I was pointed to 2 links, neither of which was helpful. I watched this same user in the IRC and they are rude to everyone in the tone of their messages. I even PM'd them to let them know I felt they were being hostile, not only towards me but towards others as well, and the response I got was "Deal with it". Then I got kicked from the room. I requested a courtesy vanish on Friday last week. Before I knew it, those I have worked with on various things were posting messages for me to come back and continue my contributions. So I decided to come back and, again, I met the same hostility towards me.
So in short, I would recommend that the mentors not be admins, sysops, clerks, ARBs, etc. Just normal everyday Wikipedians who volunteer to take on someone. How we would define who is an experienced editor would, I guess, be my next question.

Waggie (talkcontribs)

I was the person Galendalia asked "about BLP for clarification". They had asked for help in private message to me with a dispute resolution case they were mediating for on en-wiki. It was a particularly complex case and they had already pinged two others on-wiki for assistance with it. The "quote" that Galendalia is posting here is not an accurate quote. My response to them was actually: "It's a pretty involved situation you're asking for advice on, you may have bitten off more than you can chew right now." and "I see that you've pinged Robert McClenon and Nightenbelle, I would await their responses." As you can see, the tone of my reply is quite a bit different than the "quote" they are offering here.

They are also complaining about us asking them to not idle in the help channel until they meet the requirements for idling in the channel as specified at en:Wikipedia:IRC/wikipedia-en-help. They were repeatedly pestering numerous people about getting a WM cloak and were pretty upset that they were not getting a cloak despite not meeting the minimal criteria specified at m:IRC/Cloaks. They kept obtaining various different cloaks, trying to get past the channel rules regarding idling in -help without meeting the criteria for idling or helping. Honestly, I think I was pretty patient and polite given the level of intensity from them regarding this.

This rudeness to helpees they speak of, and this quote of "Deal with it", I do not know what they are referring to. If this is referring to me, this is entirely inaccurate and they never PMed me with anything of the sort. I'm actually very patient and polite with helpees, even ones who are difficult and/or UPE.

Frankly, I'm not appreciative of this blatant mischaracterization of my actions.

MMiller (WMF) (talkcontribs)

Thanks for sharing that perspective, @Galendalia. We know for a fact from research that hostility toward newcomers drives them away. Here is one of the most important papers about it, and here is another influential research project. I think it's definitely hard to improve the culture of a wiki, and I think it's great that you're trying to be a force for positivity in your work. So far, the mentors that we've recruited seem to be generally encouraging to newcomers, and I think you have a good idea that we should make sure it's clear that many people can be a mentor -- it doesn't only have to be the most experienced and involved editors on the wiki.

Nick Moyes (talkcontribs)

I can feel Galendalia’s pain. Shortly after becoming an Administrator earlier this year, I thought I’d go and try out IRC.chat as I’d never used it and thought I ought to get a feel for the place. Not only did I find it incomprehensible, but I was also permanently blocked by a so-called ‘helper’ whose manner towards me was appallingly unwelcoming. There is no accountability or complaints system at IRC, so I will never recommend that any newcomer on en-wiki go there unless major changes happen there, or unpleasant/unhelpful editors are kicked out. The person who I encountered wasn’t an admin, so unpleasant attitudes to newcomers aren’t something unique to those with extended rights. Finding mentors/helpers with the right interpersonal skills to be able to deal with inexperienced users is critically important.

Waggie (talkcontribs)

I'm sorry that Nick Moyes had a bad experience, although I must say that it was somewhat self-inflicted for them. There is accountability on IRC, and there is a process for complaints and appeals. For a more complete and accurate explanation of what actually happened here, please read the thread at en:User_talk:Waggie#Your_attitude_on_IRC. I go into great detail about why this happened. I am also willing, with Nick Moyes' and Jeske's (as the other involved person here) permission, to publicly release the logs of the encounter. First, there was no "permanent block"; bans in -help are for 24 hours by default. Secondly, as soon as they were identified to a known "good" user, I lifted the ban immediately.

Sdkb (talkcontribs)

Looking through the list of tasks at https://en.wikipedia.org/wiki/Wikipedia:Task_Center...

As I've mentioned at a previous stage, I still think anti-vandalism has a ton of potential to be a structured task for newcomers (it somewhat already is with WikiLoop Battlefield). Categories and copy editing both sound good. There are also some more niche tasks that could be easily structured, such as fixing links to disambiguation pages that pop up in mainspace.

MMiller (WMF) (talkcontribs)

@Sdkb -- I remember when you mentioned that, and @Zoozaz1 brought up WikiLoop Battlefield as an example of how reverting vandalism is like a structured task. I guess my open question is still whether newcomers would do a good job of judging vandalism, given their low wiki experience. You recommended that we check in with some Wikipedians who do a lot of edit patrolling. I can go seek some out -- is there anyone in particular who you would recommend or tag?

Sdkb (talkcontribs)
Galendalia (talkcontribs)

My issues with the rollback that everyone gets are:

1. Inexperienced

2. Not trained

3. Causes edit wars

I recommend one or all of the following:

   A. IP users are not allowed to use the rollback feature
   B. Only the people who have graduated from the CVUA should have rollback rights (I see a lot of new users getting the right without any type of training).
   C. To use the rollback built in it must be a registered user with 3 months experience.
MMiller (WMF) (talkcontribs)

Hi @Galendalia -- thanks for thinking about this. We've been talking a lot about easy editing tasks for newcomers to do, and we wanted to hear from someone in CVU because of the idea that maybe reverting simple vandalism is something newcomers could help with. It seems like an interesting idea, because on the one hand, some vandalism is really obvious, but on the other hand, newcomers know little about Wikipedia or vandalism, and might not have the judgment required. What's your take? Could you imagine newcomers being given something like a very simple version of Huggle, and asked to revert obvious vandalism? If I'm reading your previous comment correctly, it sounds like maybe you would say it's not a good idea.

Galendalia (talkcontribs)

Hi @MMiller (WMF): Even though I have been on WP just over a month, I feel the inexperience would be a major hindrance. Like I stated above, they need to complete the CVUA and be on WP for at least 3 months. This will allow new editors time to process the policies and learn from their mistakes rather than reverting a valid entry. There are sometimes subtle entries which would probably not be noticed unless you are looking for them, like no source listed in the diffs. Wait, what is a diff? That is a question I see users asking a lot.

MMiller (WMF) (talkcontribs)

Thanks, @Galendalia. It sounds like your general advice is that reverting vandalism takes some experience and knowledge. Got it. But it also sounds like you have an interesting story, if I may ask -- how did you find your way to reverting vandalism so soon after joining Wikipedia? What caused you to try that type of editing in the first place? What were the very first edits you did?

Galendalia (talkcontribs)

Honestly, it seemed like the only thing I can do without having someone revert anything I did or go on a tangent about questions I asked that ends up not even answering the question I posed in the first place. I pretty much do 2 things: CVU and Dispute Resolution. I am also in the process of rebooting Spoken Wikipedia, as there is plenty of interest in it. That will be the 3rd thing. I’ve been trying to maintain the active user lists, and I’m getting a lot of flak for that because in one instance it requires removing the tag or userbox from someone’s user page, and I only did this to those who are permanently blocked. However, as soon as I did it, people were all over me and reported me to ANI, and I’m getting nothing but crap for housekeeping.

MMiller (WMF) (talkcontribs)

Also pinging @Revi (WMF), who has a perspective on this from Korean Wikipedia, which doesn't have any sort of bots for reverting simple vandalism.

NickK (talkcontribs)

I would very much like to have one more: correcting typos / improving language. Wikipedias have a lot of articles that are labelled as needing proofreading. If we can use some spellchecker or dictionary (e.g. for identifying words that are very similar to the dictionary ones but possibly misspelled) or some style problems (e.g. common stop words like 'outstanding' or 'interestingly'), that would give us a good task for a simple first edit. Beyond that, Ukrainian Wikipedia also has a good list of problems at uk:Вікіпедія:Проект:Якість.
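
To make the idea above concrete, here is a minimal sketch of how near-dictionary words and flagged style words could be surfaced for a human to review. It is only an illustration under stated assumptions: `DICTIONARY`, `STYLE_WORDS`, and `flag_words` are hypothetical names, the tiny word sets stand in for a real dictionary and a community-maintained style list, and the similarity cutoff of 0.8 is an arbitrary choice.

```python
import difflib
import re

# Hypothetical stand-ins; a real tool would load a full dictionary
# and a style word list maintained by the wiki's community.
DICTIONARY = {"a", "article", "has", "in", "one", "reference", "this"}
STYLE_WORDS = {"interestingly", "outstanding"}


def flag_words(text, cutoff=0.8):
    """Return (word, kind, suggestions) tuples for a human to review."""
    flags = []
    # [^\W\d_]+ matches runs of letters in any script.
    for word in re.findall(r"[^\W\d_]+", text):
        lower = word.lower()
        if lower in STYLE_WORDS:
            flags.append((word, "style", []))
        elif lower not in DICTIONARY:
            # Only flag unknown words that closely resemble a dictionary word.
            close = difflib.get_close_matches(
                lower, sorted(DICTIONARY), n=3, cutoff=cutoff
            )
            if close:
                flags.append((word, "possible typo", close))
    return flags


flags = flag_words("Interestingly, this artcle has one refrence in Kyiv")
```

Unknown words with no close dictionary match (here, "Kyiv") are deliberately left unflagged, which is one crude way to avoid suggesting changes to names. The output is a list of candidates, not a list of edits, so the newcomer still makes the decision.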

Sdkb (talkcontribs)
MMiller (WMF) (talkcontribs)

@NickK -- I agree that would be a perfect task for newcomers. And I think you've hit on the main problem: how to automatically generate lists of potential spelling and grammar corrections across dozens of languages? @John Broughton pointed me towards the Typo Team's "moss" tool, which does this for English. Also, engineers on the Growth team pointed out the aspell and hunspell libraries, which have many languages. Do you know if Ukrainian Wikipedia already does anything like that? Where do the problems listed at uk:Вікіпедія:Проект:Якість come from? Are they from maintenance templates placed by users, or from some automation?

NickK (talkcontribs)

@MMiller (WMF): We had multiple discussions about libraries, we have several bot owners who are maintaining their own lists. There are some lists at uk:Вікіпедія:Список найтиповіших мовних помилок internally or Неправильно — правильно externally (it cannot be completely copied as some might still be accepted in some context, so a human check will be needed). If this is the only issue, I think we can come up with some solution.

Regarding uk:Вікіпедія:Проект:Якість, yes, they are maintenance templates placed by users.

Galendalia (talkcontribs)

I know AutoWikiBrowser had this feature and I was going to start in on some of them; however, I was informed today that that feature is long gone. I know there is a db source somewhere that contains dictionary words. This does not necessarily resolve synonyms or other word choices. It would be great to have a bot that could make those changes based on the article language tag, and also fix dates based on the article date format tag.

Barkeep49 (talkcontribs)

I also agree with adding categories and typos as potential tasks. Bigger picture, I'm wondering if individual communities could input something into a template to generate these tasks, rather than everything having to be done uniformly on the backend, perhaps through categories which this tool could render in nice forms.

MMiller (WMF) (talkcontribs)

@Barkeep49 -- thanks for weighing in. I think that the way we have started to build newcomer tasks is in-line with how you're thinking about it. Right now, the feed that newcomers get runs off of maintenance templates, like these. Most wikis have big backlogs of these templates, but maybe one day in the future, newcomers (or others using this feature) could churn through the backlogs, and communities would be incentivized to keep tagging articles with them. That said, the idea we're talking about here, "structured tasks", is about these tasks coming from an algorithm, as opposed to from maintenance templates. Perhaps both sources could continue to be options, and communities could regulate which ones of the pipes (so to speak) they turn on and off into these tasks feeds.

Galendalia (talkcontribs)

To go off of this, it would also be dependent on the user's grasp of the language. There is a small difference between British English and American English. The same goes for Spanish, where I believe there are 3 versions. I know of a few editors from other countries who try to correct what they assume are typos but which, in the context of the sentence, are not. That may pose a potential problem with this being automated or templated.

Nick Moyes (talkcontribs)

I've just posted my support for Typo-fixing in the General Thoughts section above, but I'd like to reiterate it as a preferred first task, and to try to understand why it is that fixing typos as a structured task is seen as so difficult to implement across different language sites.

English Wikipedia already has Wikipedia:Lists of common misspellings; Wikipedia:AutoWikiBrowser/Typos and even an article on Commonly misspelled English words, plus a list of variations of acceptable spellings that should NOT be corrected like colour>color and vice versa (Wikipedia:List of spelling variants).

Even if other language Wikipedias don't currently have any such similar internal lists, surely these spell-check lists are available from elsewhere? And it could even be an ideal opportunity to engage with wider editing communities to start building up such a list of common errors themselves, which could be incorporated into this task?

I do tend to feel that anti-vandalism might not be an ideal structured task as it does require some understanding of what is and isn't bad faith editing, and is possibly also prone to being abused if bad edits are let through. English Wikipedia already has edit filters and Cluebot for removing the worst of the worst - but what about other languages? Does manual input here have a role to play?

Galendalia (talkcontribs)

Hey Nick, I just wanted to point out, as I stated earlier, they removed that function from AWB. Also, you have to have a really good reason to gain access to the application to use it. I got denied a couple times, but then they accepted my reasoning. I think part of the difficulty may be the language format in which symbols/characters are used. That would require every language to have their own version of spell check.

John Broughton (talkcontribs)

I'm sure that there are spell checkers already in existence that cover the majority of Wikipedia languages - see https://webspellchecker.com/ , for example.

Pelagic (talkcontribs)

Grammar and punctuation fixing came to my mind. Most educated native speakers have an intuitive sense of wrongness when they see ungrammatical sentences. Having a feel for encyclopaedic tone is a more uncommon skill, but the improvements don’t have to be perfect.

Acquiring the software to identify problem sentences for non-major languages would be harder for grammar than spelling, I imagine.

MMiller (WMF) (talkcontribs)

@John Broughton @Sdkb @Nick Moyes @NickK @Pelagic @Barkeep49 -- since we were all talking about how it would be valuable to have copyediting as a structured task, @Tgr (WMF) and I did some research to look into it. We talked to @Beland, the creator of "moss", a typo-detection script on English Wikipedia. We learned how the tool works, and talked about prospects for doing similar things in other languages. You can see our notes here (@Beland, please add to or correct them!) We're going to keep thinking, learning, and posting about the possibilities around copyediting.

Sdkb (talkcontribs)

Sounds great; thanks for the update!

LittlePuppers (talkcontribs)

@MMiller (WMF): I would make spelling correction a separate task from copyediting and label it as such; I personally think of copyediting as more of a grammar/structure/clarity thing than spelling correction. That's not to say that fixing typos is unimportant or something we shouldn't do, but it might be more clear for other editors (and you should probably deal with categories differently for each). LittlePuppers (talk) 01:37, 4 June 2020 (UTC)

MMiller (WMF) (talkcontribs)

Hi @LittlePuppers -- thanks for weighing in. That distinction is not something I had thought about. And I think you're right -- the more we've thought about how we might build a structured task that would recommend spelling corrections, the more we think that such a task would only recommend spelling corrections, and not other kinds of grammar edits, which would require totally different algorithms to identify. Where would you say that the phrase "typos" fits in? Do you think typos are more about spelling, or more about punctuation or something else?

LittlePuppers (talkcontribs)

Thanks MMiller (WMF). I'd say that spelling is solidly within the realm of typos, and something like phrasing is solidly within the realm of grammar, while punctuation is somewhere in between. It's a bit harder to say, but I think that punctuation would fit into the category of typos if it's an obvious and entirely unambiguous error (for example, putting two periods instead of one at the end of a sentence), but more in the category of grammar when it's something less clear-cut (such as over or underuse of commas, or a period vs. a semicolon).

To generalize a bit more, typos are unambiguous mistakes based on basic rules (be it a misspelled word, or some other typographical error), while copyediting or grammar (whatever you decide to call it) focuses on improving language (be it sentence or article structure, phrasing, punctuation, etc.) in a way that makes it more clear or easier to understand, even if it wasn't strictly "wrong" before. To link to two projects on en.wp I think you're familiar with, typos are in the realm of the MOSS project and grammar/copyediting is in the realm of the Guild of Copyeditors. LittlePuppers (talk) 02:08, 24 June 2020 (UTC)
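
The "two periods instead of one" example above is the kind of unambiguous error that a simple pattern can find. The following is a minimal sketch, not any existing tool's rule; `DOUBLE_PERIOD` and `find_double_periods` are hypothetical names chosen for illustration.

```python
import re

# Exactly two consecutive periods: not preceded or followed by another
# period, so a three-dot ellipsis ("...") is left alone.
DOUBLE_PERIOD = re.compile(r"(?<!\.)\.\.(?!\.)")


def find_double_periods(text):
    """Return character offsets of doubled periods for a human to confirm."""
    return [match.start() for match in DOUBLE_PERIOD.finditer(text)]


offsets = find_double_periods("It ended.. Then it began... again.")
```

Comma usage or a period versus a semicolon, by contrast, cannot be reduced to a pattern like this, which fits the distinction drawn above between typos and copyediting.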

MMiller (WMF) (talkcontribs)

Thank you, @LittlePuppers. This actually helps a lot, especially where you said "typos are unambiguous mistakes". This has implications for our prioritization and design of different structured tasks. For "unambiguous mistakes", we can probably create a very confident algorithm that can feed easy edits to newcomers, which they could accept or reject. Copyediting or grammar is a more advanced task, requiring the newcomer to create/produce the change on their own. It's like the difference between a true/false question ("This word should actually be spelled this way. True or false?") and an open-ended question ("What is a better way to phrase this sentence?").

MMiller (WMF) (talkcontribs)

Hello @جار الله -- I'm the product manager for the WMF Growth team, and I work with @Dyolf77 (WMF). He said it would be okay if I ping you here, where we are having a conversation about "structured tasks". In this conversation, we have been talking about automated ways to identify spelling errors in the wikis, so that we can point them out to newcomers to fix. We've talked about the moss tool in English Wikipedia, and I've learned that you built something similar in Arabic Wikipedia with JarBot. We're trying to figure out how possible it would be to build similar things in many wikis. I'm hoping you can answer some questions about your work. Thank you!

  • Which dictionaries/spellcheckers does JarBot use, and which one is best?
  • Does JarBot scan every revision when it is made? Or does it follow its own path through the articles?
  • Approximately how many spelling corrections does it make per day?
  • How does JarBot avoid making changes to people's names or names of locations, or other words that cannot be found in a dictionary?
  • Does it assign a score for how likely something is to be a misspelling, with some having higher scores and some lower? Or does it simply decide that a word is either misspelled or not?
  • Does JarBot automatically make the corrections? How accurate is it? In other words, how often are its corrections reverted?
  • How easily do you think something like this could be made for another language?
جار الله (talkcontribs)

Hello @MMiller (WMF)

Which dictionaries/spellcheckers does JarBot use, and which one is best? I use a list of the most common mistakes in Arabic; the list is made by arwiki editors.

Does JarBot scan every revision when it is made? Or does it follow its own path through the articles? It depends on the task: sometimes it works from new revisions and sometimes by scanning the database.

Approximately how many spelling corrections does it make per day? I don't know, maybe 50-100.

How does JarBot avoid making changes to people's names or names of locations, or other words that cannot be found in a dictionary? There is a list of words that the bot must avoid, but our common mistakes list doesn't include names and locations.

Does it assign a score for how likely something is to be a misspelling, with some having higher scores and some lower? Or does it simply decide that a word is either misspelled or not? The script doesn't use AI to make decisions (maybe in the future).

Does JarBot automatically make the corrections? How accurate is it? In other words, how often are its corrections reverted? Yes, the bot automatically makes the corrections, and 99.99% are correct.

How easily do you think something like this could be made for another language? I don't know about other languages, but in Arabic and maybe the languages of the Middle East, the work would start from scratch and would be difficult because there are no valid word lists or comprehensive dictionaries.

Best regards.
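
For readers following along, the approach described in these answers (a curated list of common mistakes, a list of words to avoid, and no scoring) can be sketched roughly as below. This is not JarBot's code: `COMMON_MISTAKES`, `PROTECTED_WORDS`, and `correct` are hypothetical names, and the English entries are made-up examples standing in for an editor-maintained list.

```python
import re

# Hypothetical entries; a real list would be curated by the wiki's editors.
COMMON_MISTAKES = {"teh": "the", "recieve": "receive"}
# Words the bot must never touch, e.g. a surname that looks like a typo.
PROTECTED_WORDS = {"Teh"}


def correct(text):
    """Apply exact-match corrections; return (new_text, list_of_changes)."""
    changes = []

    def fix(match):
        word = match.group(0)
        if word in PROTECTED_WORDS:
            return word
        replacement = COMMON_MISTAKES.get(word.lower())
        if replacement is None:
            return word
        if word[0].isupper():
            replacement = replacement.capitalize()
        changes.append((word, replacement))
        return replacement

    # [^\W\d_]+ matches runs of letters in any script, including Arabic.
    return re.sub(r"[^\W\d_]+", fix, text), changes


new_text, changes = correct("Teh Tiong did recieve teh award")
```

Because every decision is a plain list lookup, a word is either on the mistakes list or it is not; there is no likelihood score. The protected list here is case-sensitive, which is a simplification.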


MMiller (WMF) (talkcontribs)

Thank you for the quick reply, @جار الله. These answers are helpful for now, and I will get back in touch if we decide to work on a project around spelling.

HLHJ (talkcontribs)

Typo-fixing seems like a task that would fit well in a mobile interface. Subtitling movies on Commons and translating subtitles also spring to mind. Adding "lang" templates would also be very useful and make the Typo Team's life easier (flagging that these words are Latin, these are Japanese, and so on).

More creatively, the WikiProject Guild of Copy Editors is always looking for volunteers to read through select articles and review and fix. This is not as readily done on a small interface.

You are building this into a reader app. Maybe link it to what the reader is doing? If they are confused, help them add a "clarify" inline tag. If it needs a citation, have them add that inline tag (everyone knows that tag, even if they never edit). If it is US-centric, let them add "globalize" inline tags.

A good simple interface for this might be OpenStreetMap-style comments to articles; "I got lost here, because you did not define this mathematical term" and suchlike. Scan the text and suggest some inline tags in which the comment could fit as a "reason=" parameter, in this example "clarify". There's a related project for doing something similar in Huggle.

And then let them resolve tags. If a section is templated as needing expansion, ask them to submit a comment suggesting sources that could be used to expand the section, as plain URLs. If they spend time on a "citation needed", the app could tell them to click the tag for guidance on adding a reference (a few times only). Or a banner saying: "This article has a photo request. If you have or could take a photo to donate to this article, please [add it]".

Reply to "Task types"
MMiller (WMF) (talkcontribs)

What are your main reactions to structured tasks? Do you think this could be helpful to newcomers and to your communities?

John Broughton (talkcontribs)

It would be nice to present a newcomer (perhaps after he/she indicates interest, say by clicking a link) with a set of structured tasks that he/she can do, as a way of becoming a more experienced Wikipedia editor. So: add an image, add a citation, do a copy edit, add a (wiki)link, add a category, add or change an article assessment (on a talk page), revert vandalism, and so on.

My concern is that the team is jumping into this process by building a structured task for something with a relatively low added value to Wikipedia - converting words in articles into internal links (wikilinks). Articles with too few internal links (or none at all - orphans) are typically articles that are rarely read. And while teaching someone to create a wikilink is straightforward, teaching them when - and when not - to create such a link is another matter - see https://en.wikipedia.org/wiki/Wikipedia:Manual_of_Style/Linking#Overlinking_and_underlinking .

More generally, I would urge the team to revisit the table at Growth/Personalized first day/Newcomer tasks#Sourcing the tasks, to think about what the first handful of structured tasks should be. And by "revisit", I mean getting the community actively involved in revising and expanding the table, since it looks like this is going to be a roadmap of sorts.



MMiller (WMF) (talkcontribs)

@John Broughton @Sdkb -- thank you for reading the project page and weighing in. I'm learning a lot from your input, and I'm glad to hear that your general impressions are positive.

I understand your doubts about whether "add a link" is the best place to start. We know that it's a lower-value edit than other kinds. Let me tell you why we're leaning there, and you tell me what you think:

  • In the long run, we hope there could be many different kinds of structured tasks, like @John Broughton says (add an image, add a reference, etc.). But in the short run, the most important thing we would want to do first is to prove the concept that such a feature could work. Therefore, we would want to build the simplest one, so that we can get some software out there and see how it goes, without having to invest too much in the first version. If the first version goes well, then we would have the confidence to invest in types of tasks that are more difficult to build.
  • When thinking through which one would be the simplest for us to build, we gravitated toward "add a link" because:
    • There already exists an algorithm built by the WMF Research team that seems to do a good job of suggesting wikilinks. We tested it in Czech, Korean, Arabic, and Vietnamese, and it looks like about 70-80% of suggestions it makes are good ones. That number puts it into the right range for a "human-machine partnership" -- in other words, the algorithm is smart enough to point the human to the right spots, but not so smart that it doesn't need the human to confirm its work. What this means is that the user can hopefully make good edits and absorb good linking habits from the algorithm's recommendations without having to understand all the concepts right away. For details on the algorithm, see this Phabricator task.
    • Adding a wikilink doesn't usually require the newcomer to type anything of their own, which we think will make it particularly simple for us to design and build -- and for the newcomer to accomplish.
  • Part of building this well will be figuring out how to teach concepts to the user. We're actually going to be taking our first step in that direction in the next few weeks when we deploy "guidance", which makes simple help content available to the user while they're doing their suggested edits. We worked hard on the content so that it is brief, explanatory, and contains examples.
  • The plus side of "add a link" being a low-value edit might be that it is also low-risk -- in other words, perhaps an article can't be too damaged by overlinking, whereas mistakes with adding references or images would be worse.

Regarding other task types, I think @John Broughton is right that the list here could, in an ideal world, become a sort of roadmap of future structured tasks (but only if the first ones we try show promise). I actually think "add an image" would be a good one to think about next, because a really simple algorithm could be: for an article with no images, see if that article in another language has images from Commons. If so, recommend that image. The Research team has confirmed with me that that is possible, plus more sophistication using machine-learning.
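A minimal sketch of the simple cross-language image idea described above, for illustration only. It assumes a hypothetical mapping from language code to the list of Commons file names used in that language's version of the article; how that mapping is obtained, and the function name `suggest_images`, are assumptions and not the Research team's implementation.

```python
from typing import Dict, List


def suggest_images(article_images: Dict[str, List[str]], target_lang: str) -> List[str]:
    """Suggest Commons images used in other languages when the target article has none."""
    if article_images.get(target_lang):
        return []  # the target article already has images, so nothing is suggested

    seen = set()
    suggestions: List[str] = []
    # Iterate in sorted language order so the result is deterministic.
    for lang, images in sorted(article_images.items()):
        if lang == target_lang:
            continue
        for image in images:
            if image not in seen:  # skip images already collected from another language
                seen.add(image)
                suggestions.append(image)
    return suggestions


# Hypothetical data: file names are made up for the example.
example = {
    "cs": [],
    "en": ["Prague_skyline.jpg"],
    "de": ["Prague_skyline.jpg", "Charles_Bridge.jpg"],
}
```

Here `suggest_images(example, "cs")` collects the images from the German and English versions once each, and the newcomer would still decide whether each one really fits the article.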

What do you think of all this?

Sdkb (talkcontribs)

That makes sense that it's best to do a simple proof of concept-type thing first. Thanks for the explanation!

Bluetpp (WMF) (talkcontribs)

@John Broughton Hi, I'm Phuong, I'm the Vietnamese Ambassador of the Growth team. I want to share my perspective on your opinion that "Articles with too few internal links (or none at all - orphans) are typically articles that are rarely read". Well, I think that's only the case for big Wikipedias like English Wikipedia. In medium Wikipedias like the Vietnamese, we have a lot of articles that are fairly important but have few to no links. The article quality in Vietnamese Wikipedia is not very high as we don't have enough manpower to polish every article. So a structured task for adding links would be a great start for us medium Wikipedias : )

Pelagic (talkcontribs)

Thanks, Phuong, I was planning to ask whether small–medium Wikipedias have more underlinking than overlinking.

It would be great if the tool could also detect and suggest removal of excessive links, so that we’re not training a cohort of newcomers to overlink indiscriminately.

Nick Moyes (talkcontribs)

@Pelagic I would have envisaged that the software would only identify the first example of a word that needed wikilinking, and would not offer a user the chance to link to an already-linked word.

I think that inviting a new editor to remove overlinked words would require a greater understanding of why this isn't a good idea. Wouldn't they just say to themselves "why do I need to do that - links are good, surely?". So under-linking ought to be easier for a new editor to fix than over-linking.

MMiller (WMF) (talkcontribs)

@Pelagic -- interesting that you should mention this idea of turning the algorithm on its head to point out links that should not be present! The researcher working on the algorithm suggested this as a possibility, and we didn't think it would be a very compelling task for newcomers. But I do like the idea that by mixing in some suggested removals with the additions, that might go a long way toward teaching newcomers that there can be too many links, and that judgment is required -- perhaps it might be more effective than just telling them that in the instructions. Tagging our team's designer, @RHo (WMF), so she can think about this.

@Nick Moyes -- yes, the algorithm will be able to include the rule of only adding a link for the first occurrence of a phrase, so I think we should be covered there. We've also discovered through testing that there are a host of other conventions that we want to respect, which we're trying to incorporate into the algorithm. Things like:

  • Not linking years and dates (some wikis have this best practice).
  • Linking the largest possible phrase, e.g. "London School of Economics" instead of just "London".
  • Not linking in the References section of an article.

If you can help think of other "gotchas", we would love to add them to the list (we're keeping them in this Phabricator task). I know we can also draw on the manual of style to sharpen the algorithm.
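The conventions listed above could be sketched as a post-filter on the algorithm's candidates. This is only an illustration under stated assumptions: the `Candidate` shape, the section name "References", the crude year pattern, and the `link_years` switch (for wikis that do link dates) are all hypothetical, not how the actual link recommendation algorithm is built.

```python
import re
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Candidate:
    phrase: str   # text the algorithm proposes to link
    start: int    # character offset of the phrase in the article
    section: str  # heading of the section that contains it


YEAR = re.compile(r"^\d{3,4}$")  # crude stand-in for "years and dates"


def filter_candidates(
    candidates: Iterable[Candidate],
    already_linked: FrozenSet[str] = frozenset(),
    excluded_sections: FrozenSet[str] = frozenset({"References"}),
    link_years: bool = False,
) -> List[Candidate]:
    kept: List[Candidate] = []
    seen = {p.lower() for p in already_linked}
    # Longest phrases first, so overlaps resolve to the largest phrase;
    # ties go to the earliest position, which keeps only the first occurrence.
    for c in sorted(candidates, key=lambda c: (-len(c.phrase), c.start)):
        if c.section in excluded_sections:
            continue  # no links inside the References section
        if not link_years and YEAR.match(c.phrase):
            continue  # skip bare years unless the wiki links them
        if c.phrase.lower() in seen:
            continue  # phrase is already linked earlier in the article
        end = c.start + len(c.phrase)
        if any(c.start < k.start + len(k.phrase) and k.start < end for k in kept):
            continue  # overlaps a longer phrase that was already kept
        kept.append(c)
        seen.add(c.phrase.lower())
    return sorted(kept, key=lambda c: c.start)


# Hypothetical candidates for one article.
SAMPLE = [
    Candidate("London", 10, "History"),
    Candidate("London School of Economics", 10, "History"),
    Candidate("1895", 40, "History"),
    Candidate("London School of Economics", 80, "History"),
    Candidate("Fabian Society", 120, "References"),
]
```

With the defaults, only the first "London School of Economics" survives from `SAMPLE`; passing `link_years=True` would also keep "1895", which is the kind of per-wiki switch the dates convention seems to call for.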

Galendalia (talkcontribs)

To add to what John said, I think it would be much easier to have a filter somewhere where we can select the new users in a certain period with the number of contributions in the main space. We have a conglomerate of tools to use; however, the keyword being "tools", I feel we should have one tool that would pull this information for us, or edit the "User Creation" special page to allow us to do that. That will help in reaching the newcomers.

It would also be great to have an automated script push out the welcome template (I have one I particularly like due to the content) to the new users after a certain threshold is met (say since account was created & number of edits & in time frame & no reverts).

MMiller (WMF) (talkcontribs)

@Galendalia -- I think what you're saying is along the lines of how a few things work already. For instance, Teahouse invites automatically go to the user talk pages of newcomers who have shown that they are generally acting in good faith. The main way that newcomers currently find their way to Growth team features is through the newcomer homepage. After a user creates their account, a few different things (popups, buttons) point the user to their homepage as a place to get started. This works well so far -- about half or a bit more of newcomers wind up checking out their homepage (in wikis that have the feature).

Galendalia (talkcontribs)

Maybe I am wrong here, but an editor has to post an invite to the TeaHouse in en wiki. I know I’ve had to do it for a couple of new editors. It’s the same with the welcome page: it has to be added manually to an editor’s talk page.

Sdkb (talkcontribs)
Galendalia (talkcontribs)

Thanks for that, @Sdkb. Looks like they only send out once a day; that’s why I don’t see them.

Sdkb (talkcontribs)

Seconded about wikilinks being more difficult than it might appear; that's something I previously mentioned.

Sdkb (talkcontribs)

Overall, this is something that I'm excited to see implemented. It seems like it has good potential to help newcomers feel comfortable making edits.

NickK (talkcontribs)

I would say automated tasks should be clearly useful and quickly make newbies understand what they are doing. The worst thing that can happen is an automatic tool suggesting a wrong edit and a newbie gladly accepting it (e.g. adding a link cs:rozvoje in cs:Société Générale is wrong), it would be frustrating both for the community (some WMF guys did it wrong again...) and for the newbies (they might get reverted on their first edit).

MMiller (WMF) (talkcontribs)

@NickK -- thank you for joining the conversation! I'm glad to hear you think structured tasks would be useful. Your comment reminds me that in the Android app, they keep track of how many times a newcomer's structured tasks are reverted (there, they are mostly adding Wikidata title descriptions). If the newcomer gets reverted too many times, they are no longer allowed to do structured tasks. This is an idea we can think about for our version of structured tasks.

NickK (talkcontribs)

@MMiller (WMF): I think there is a misunderstanding. Structured tasks would be useful indeed. My point is that automated tasks should not suggest something that is not clearly useful. I don't want newbies to be victims of bad suggestions in tools. If a newbie is suggested some link that in reality is wrong, adds it and gets reverted, that would be a major frustration.

MMiller (WMF) (talkcontribs)

@NickK -- you're right, I did misunderstand. Thank you for clarifying. I definitely agree that if the system urges newcomers to make edits that then end up getting reverted, both the newcomer and the patroller will be frustrated. I think that anytime we are using an algorithm, there are going to be some mistakes, no matter how hard we work to improve it. We know this already from ORES models, which are usually right, but sometimes wrong (which is why humans need to make the final decision, instead of reverts happening automatically based on ORES). I think we can try to prevent the frustrating outcome in two ways:

  1. Try to make the algorithm as good as possible, of course. For the link recommendation algorithm we have now, this is the Phab task about making improvements.
  2. Design the interface so that the newcomer knows that the suggestions come from an algorithm that can be wrong sometimes, and understands that the important part is that they use their judgment with the suggestions. And we will have to show them the concepts to use so that they develop good judgment.

Does that sound right?

NickK (talkcontribs)

Thank you, this is the approach I was thinking about.

Regarding 1, it is too English-centric. For instance, French Wikipedia recommends linking dates, and I believe concepts of names and common words will be different in Chinese. In Slavic languages, for instance, grammar and indirect cases will be the main problem, i.e. we will need to use a dictionary. E.g. to add a link to the article France (uk:Франція) in uk:Société Générale, you would have to link the word французьких (genitive plural adjective).

On the second point, I would suggest adding one-line description of the article the link is being added to, this should eliminate the most irrelevant ones.

MMiller (WMF) (talkcontribs)

Got it, thanks @NickK. So far we have gotten strong performance out of the algorithm in Czech, Korean, Arabic, and English. But we saw only moderate performance in Vietnamese. I definitely have concerns that there will be languages that have issues. As we try to expand it, could we ask for your help in evaluating how it performs in Ukrainian? Perhaps that can be our first Cyrillic language.

For the second point about adding a one-line description of the suggested article, that is probably something that our team's designer, @RHo (WMF), is thinking about, but tagging her just in case.

NickK (talkcontribs)

@MMiller (WMF): Thanks, that's interesting. Ukrainian grammar should be overall similar to the Czech one, so the approach should work the same way. I think the Slavic factor should be more important than the Cyrillic one, but I might be wrong. Is it possible to test the Ukrainian algorithm if it is already available, or the Czech one if it is not? Thanks

MMiller (WMF) (talkcontribs)

@NickK -- you're welcome to look at the Czech results that we already have. The materials are in the task description here: T245330. The way it works is that we uploaded 20 Czech articles to Test Wikipedia with links created by the algorithm (they're all red because none of those articles actually exist in Test Wikipedia). Then our Czech ambassador, @Martin Urbanec (WMF), went through and added green or red templates to indicate whether they look like links that should be made. The counts of those are in the task description, and the total was that it looks like 86% of the suggestions should be links. Does that make sense?

As we continue to improve the algorithm (happening in T253279), maybe we can try Ukrainian as the next language. I will get back to you about it.
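For readers wondering how a figure like the one above is derived from the green and red templates, here is a minimal sketch. The per-article tallies below are invented for illustration and are not the counts from the task description; the function name is also hypothetical.

```python
from typing import Iterable, Tuple


def share_of_good_suggestions(tallies: Iterable[Tuple[int, int]]) -> float:
    """Return good / (good + bad) over (green_count, red_count) pairs, one pair per article."""
    good = bad = 0
    for green, red in tallies:
        good += green
        bad += red
    total = good + bad
    if total == 0:
        raise ValueError("no reviewed suggestions")
    return good / total


# Made-up tallies for three articles: (green templates, red templates).
illustrative = [(8, 2), (9, 1), (7, 3)]
```

Pooling the counts across articles, as done here, weights every suggestion equally; averaging per-article percentages instead would give short articles more influence.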

RHo (WMF) (talkcontribs)

Yes, my current thinking is we would re-use the same pattern of showing the wikidata description under the linked article title much like when it is shown in VE when a link is selected (as shown in this screenshot) or search results on mobile.

Sdkb (talkcontribs)

I like that idea. It'd be good to have a softer warning, though, after the editor's first or second reversion, giving a gentle nudge toward reviewing the pertinent "learn how to do this" material.

Nick Moyes (talkcontribs)

Having now managed to read through the project page and these discussions, I'd like to offer my support.

As I understand, WMF want to develop and test a simple but effective way for newcomers to make quick, efficient edits (of one particular type to begin with) that will be:

  • Easy to develop and trial
  • Easy for newcomers to enhance Wikipedias without having to learn the complexities of editing processes
  • Not too damaging to articles if misused
  • Expandable (if successful) to include other types of structured task.
  • Encouraging and motivating for newcomers to use.


I welcome this idea wholeheartedly. Whilst the opportunity to make very small improvements in this way without having to directly edit pages won't teach someone how to open and edit a page, it will immediately give someone the feeling that they have made a small, but worthwhile contribution to the encyclopaedia that "anyone can edit". I suspect that if they're able to get feedback and satisfaction from that task, they may move on to other structured tasks, and may eventually become interested enough to do more in-depth editing.

A by-product of this approach will be to offer experienced editors who are "on the go" to be able to make small, easy contributions on a mobile, whether in a bus queue, doctor's waiting room or (ahem) under the covers at night, unable to get to sleep.

I don't understand the explanation in the table of Newcomer Tasks that offering discrete copyedits such as spelling checks as one of the early structured tasks is too difficult. It seems that if an article already contains a probable error, very little harm can be done by changing it to another spelling variant. And I would have thought it would have been quite easy to integrate spelling dictionaries from a multitude of languages into a structured task. I have spent many a lunch hour using either Lupin's spellchecker or AutoWikiBrowser to fix things like 1980's > 1980s, or recieved>received, and derived quite some satisfaction of time well spent when I couldn't do more technical work. Doing that on the go via Structured Tasks would have been brilliant. If it's not too much trouble, could I ask for a (simple!) explanation why finding and presenting probable spelling errors is too technical a task to achieve? It seems counter-intuitive not to offer spelling errors. On en-wiki we even have our own list of common errors (see here).

Failing that, I would have thought that adding recommended wikilinks would be slightly less rewarding but, again, relatively little harm would be done if the wrong word were linked to. So I would support this as a trial Structured Task if spell-checking is really off the cards.

MMiller (WMF) (talkcontribs)

@Nick Moyes -- wow, thanks for engaging deeply with this content and for summarizing it. I think your summary is really clear -- clearer than what I've written -- and I'm sharing it with others. Following the suggestions of many community members on this page, we went out and learned more about the prospects for copyedit and spellchecking tasks, and I'll post about it under a different heading so we can discuss further.

Galendalia (talkcontribs)

I have to admit I am not satisfied with the application or with using the mobile web browser on a standard phone. A phablet or tablet would work, but I, for instance, do not always carry my iPad with me, and editing is difficult on my phone.

Nick Moyes (talkcontribs)

I use a tiny iPhone 5S for a lot of my editing (like replying right now) but rarely for activities requiring more than 4 or 5 sentences. I see Structured Tasks as ideally suited to a mobile phone.

Galendalia (talkcontribs)

@Nick Moyes You need a bigger phone then lol. I can see that for the tasks, but I am speaking to the end results of editing a full article to bring the status up to a new level. I find it hard to add citations, references, wikilinks, external links, etc. Just typing out a few sentences, yeah that is ok. When you look at the tasks overall though, that is about the only thing the mobile is good for. So I guess in a nutshell what I am trying to say is the mobile experience would not be good for the task list of things to be completed.

HLHJ (talkcontribs)

@NickK has reason to worry about poor suggestions: example.

The "editing is hard" list on the associated project page does not quite reflect how I began to edit. My flow was more:

I was already reading an article on a topic of interest, and found an informational omission (not an editing-goal-directed behaviour, happens spontaneously). Then:

  1. Click to start editing (one option).
  2. Type the sentence in the right place (easy, it's a markup like HTML). Preview until it looks right (a great confidence-builder).
  3. Fill out an edit summary (because it asks).
  4. Click to publish the edit.

Note the lack of citations; when I did later add citations, they were usually bare URLs inside ref tags. I was also attracted to the informational content; had I been offered semi-automated tasklets, I would have found it very boring. The app may be good at recruiting wikignomes (and I really value gnomes), but I'm not sure it will recruit editors with other interests effectively. While we can never have too many gnomes, I fear that the app may put off other potential editors, who might get the idea that the app-recommended tasks are all of what editors do.

Reply to "General thoughts"
MMiller (WMF) (talkcontribs)

What are your main concerns about this idea? We could imagine that ideas like these could lead to vandalism or a high volume of bad edits. We want to think these challenges through.

Nick Moyes (talkcontribs)

I'd like to ask whether edits made via Structured tasks will be automatically tagged, or indicated in some way, in Edit Summaries? The suggestion has been made that this could offer an easy way to do some subtle vandalism by, for example, wrongly wikilinking to another page. It struck me that it could be helpful to see by what route a new editor has made certain edits. I've not seen any mention of this in the discussions thus far. Maybe ORES could even have some role to play in determining whether an editor is able to continue being offered some or all of these structured tasks if they appear to be misusing them?

MMiller (WMF) (talkcontribs)

Good question, @Nick Moyes. Yes, we are able to tag edits that come through this feature using "edit tags". For instance, here's the RecentChanges feed on French Wikipedia filtered to just edits coming through the feature (deployed to French Wikipedia three days ago). That would certainly allow people to keep an eye on how the edits are going, and to specifically patrol the ones coming from this feature if they like.

I think you're bringing up a good point about how to prevent the features from being abused. The way it's handled in the Android app's suggested edits feature is that after a certain number of reverts, the user is prevented from making more edits out of the feed. But it would be even better if, as you say, something like ORES could warn a newcomer before they save an edit that they are probably doing something wrong. That's a cool idea that we should keep in mind.
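The revert-based gating described above, combined with the softer warning suggested earlier in this thread, might look roughly like this. It is a sketch only: the thresholds, the function name, and the three return values are assumptions for illustration and do not describe how the Android app or any Growth feature actually works.

```python
def task_access(revert_count: int, warn_after: int = 1, block_after: int = 3) -> str:
    """Decide what a newcomer sees based on how many of their structured-task edits were reverted.

    Returns "allow", "warn" (show a gentle pointer to the help material),
    or "block" (stop offering structured tasks). Thresholds are hypothetical.
    """
    if revert_count < 0:
        raise ValueError("revert_count cannot be negative")
    if revert_count >= block_after:
        return "block"
    if revert_count >= warn_after:
        return "warn"
    return "allow"
```

With these default thresholds, a newcomer with no reverts continues as normal, one or two reverts trigger the gentle nudge, and a third revert stops the task feed. An ORES-style check before saving would be a separate step and is not covered here.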