Topic on Talk:Growth/Personalized first day/Structured tasks

MMiller (WMF) (talkcontribs)

We can think of several editing workflows that could be structured, along with the help of algorithms. Here are some examples. Which of these workflows do you think have the most potential to be structured? Which ones would be useful for the wiki and which ones not useful? Are there others you can think of?

  • Add a link: algorithm recommends words or phrases that should be blue links, on articles that don't have many blue links. Newcomer decides whether the link really should be added and adds it.
  • Add an image: algorithm recommends images from Commons that might belong in the article. Newcomer decides if it is a good fit for the article and adds it.
  • Add a reference: algorithm recommends sentences or sections that need references. Newcomer goes out to find references and adds them in.
  • Add a section: algorithm recommends section headers that could be used to expand a short article. Newcomer finds sources and adds content.
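
To make the "add a link" idea concrete, here is a minimal Python sketch. It is not the Growth team's algorithm, only an illustration under stated assumptions: `KNOWN_TITLES` is a made-up set of existing article titles, the threshold `max_existing_links` is arbitrary, and links are counted with a simple regular expression on wikitext. The sketch only proposes candidates; the newcomer would still decide whether each link belongs.

```python
import re

# Hypothetical data: titles of articles that already exist on the wiki.
KNOWN_TITLES = {"Mount Everest", "Nepal", "Himalayas"}

LINK_PATTERN = r"\[\[[^\]]+\]\]"  # matches [[...]] internal links


def count_wikilinks(wikitext: str) -> int:
    """Count internal [[...]] links in a piece of wikitext."""
    return len(re.findall(LINK_PATTERN, wikitext))


def suggest_links(wikitext: str, titles: set[str], max_existing_links: int = 3) -> list[str]:
    """Suggest known titles that appear as plain, unlinked text.

    Articles that already have more than max_existing_links links
    (an arbitrary threshold for this sketch) get no suggestions.
    """
    if count_wikilinks(wikitext) > max_existing_links:
        return []
    # Drop existing links so already-linked phrases are not suggested again.
    plain = re.sub(LINK_PATTERN, " ", wikitext)
    return sorted(
        t for t in titles if re.search(r"\b" + re.escape(t) + r"\b", plain)
    )


text = "Mount Everest is a peak in the [[Himalayas]], on the border of Nepal."
suggestions = suggest_links(text, KNOWN_TITLES)
print(suggestions)
```

Each suggestion would be shown to the newcomer as a yes/no decision rather than applied automatically.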
Galendalia (talkcontribs)

I think adding a table, and the aspects that go with it, should be an advanced task, as a lot of articles have tables, some basic and some advanced.

MMiller (WMF) (talkcontribs)

@Galendalia -- interesting. Do you know of some way to identify articles that need tables but don't have them?

John Broughton (talkcontribs)

Adding wikilinks is not particularly useful. (Also, a "link" can be either a wikilink, which is internal, or an external [http] link; the latter are generally undesirable, at least in the English Wikipedia, and it is helpful to distinguish between the two.) Adding maintenance templates is generally not useful.

John Broughton (talkcontribs)

Every task that is listed consists of two things - (a) changing an article, and (b) finishing the edit by publishing it (ideally, adding an edit summary). Starting out (as an editor) by making a minor change, such as fixing a typo, is a good way for editors to learn that second thing, which they will be using every single time that they edit. By contrast, adding a section involves (1) adding content (sentences), (2) adding citations, and (3) finishing by publishing.

In other words, "fix a typo" or "make a minor change" should, ideally, be the first structured task that an editor learns, because it incorporates the "finishing the edit by publishing it" micro-task. And once the editor has learned to do that micro-task, other tasks will be easier.

MMiller (WMF) (talkcontribs)

@John Broughton -- I think this is a good point, that every task teaches wiki skills (e.g. adding an edit summary) that are not part of the core task itself (e.g. adding wikilinks). We should keep in mind that as we structure the experience of editing, we may also be teaching other universal wiki skills and concepts. Other examples might be teaching users that their edit is immediately public (except in wikis with flagged revisions), or that they can see their edit on the history page.

Galendalia (talkcontribs)

I thought this was what the Wikipedia Adventure was for? It shows the basics of using WP; however, there is no obligation to go through it. If there were, 3/4 of our Teahouse questions would stop coming in. Galendalia (talk) 06:50, 20 May 2020 (UTC)

MMiller (WMF) (talkcontribs)

@Galendalia -- good question! Our team looked at the Wikipedia Adventure (and many other attempts at onboarding newcomers), and we've learned a lot. In summary, our current theory is that a good way to help newcomers stick around Wikipedia is to help them quickly have a positive editing experience. We think that if they can make a good contribution within minutes and understand its value, they will be excited and want to keep going. Whereas if they have to go through a long tutorial, they might lose patience and not stick around. So this idea, "structured tasks", is about how we can give newcomers a real editing experience, but with guardrails so that the experience is positive for them and for the wiki.

More background information: In a study on the Wikipedia Adventure, while a lot of users claimed to enjoy the experience, it unfortunately didn't statistically increase their retention, or any other important metrics. But in a study about the Teahouse, it was shown that being invited to the Teahouse does statistically increase retention. So our team took this all to mean that there is something valuable in the personal connection that happens with getting a question answered (although we know it takes a lot of time from experienced editors). That's why we decided to build the mentorship module for the newcomer homepage. And, to your point, as we deploy the mentorship module on more wikis, we are continually trying to strike the balance of giving newcomers a personal connection, while not overburdening the mentors who answer the questions.

Galendalia (talkcontribs)

I think that the reason newcomers don't stick around is bullying by admins and people not following the "don't bite the newcomer" rule. Many times, from my start until today, I have had admins telling me what to do and what not to do, as well as adding their own POV about why I should or should not be doing something. Here are two recent examples. Last night I asked a question on IRC about BLP, for clarification from someone who I thought would have the answer, and their response was "You should find something else to do as you have bitten off more than you can chew as a newcomer." The second was today: an editor pinged me about removing the gnome and fairies tags from indefinitely blocked users' pages to clean up the active user lists, as they contained some 50 or so blocked users from years back to the present. That editor opened an ANI against me because he/she didn't get the answer they wanted. I think if admins and other people were to stay away from the new members with their in-your-face routines (this does not apply to all, but to some) and let normal editors be mentors, this would work great. There are definitely cliques in the admin and sysop teams that seem out to get newbies, and instead of being helpful they are rude. When I first joined, I went into IRC, to the en-help channel, and got chastised because I did not have a cloak and had not been a Wikipedian for 3 months. When I asked about these requirements, I was pointed to 2 links, neither of which was helpful. I watched this same user in IRC, and they are rude to everyone in the tone of their messages. I even PM'd them to let them know I felt they were being hostile, not only towards me but towards others as well, and the response I got was "Deal with it". Then I got kicked from the room. I requested a courtesy vanish on Friday last week. Before I knew it, those I have worked with on various things were posting messages for me to come back and continue my contributions. So I decided to come back, and again I met the same hostility.
So in short, I would recommend that the mentors not be admins, sysops, clerks, ARBs, etc., just normal everyday Wikipedians who volunteer to take on someone. How we would define who is an experienced editor would, I guess, be my next question.

Waggie (talkcontribs)

I was the person Galendalia asked "about BLP for clarification". They had asked for help in private message to me with a dispute resolution case they were mediating for on en-wiki. It was a particularly complex case and they had already pinged two others on-wiki for assistance with it. The "quote" that Galendalia is posting here is not an accurate quote. My response to them was actually: "It's a pretty involved situation you're asking for advice on, you may have bitten off more than you can chew right now." and "I see that you've pinged Robert McClenon and Nightenbelle, I would await their responses." As you can see, the tone of my reply is quite a bit different than the "quote" they are offering here.

They are also complaining about us asking them to not idle in the help channel until they meet the requirements for idling in the channel as specified at en:Wikipedia:IRC/wikipedia-en-help. They were repeatedly pestering numerous people about getting a WM cloak and were pretty upset that they were not getting a cloak despite not meeting the minimal criteria specified at m:IRC/Cloaks. They kept obtaining various different cloaks, trying to get past the channel rules regarding idling in -help without meeting the criteria for idling or helping. Honestly, I think I was pretty patient and polite given the level of intensity from them regarding this.

As for the rudeness to helpees they speak of, and the quote of "Deal with it", I do not know what they are referring to. If this is referring to me, it is entirely inaccurate, and they never PMed me with anything of the sort. I'm actually very patient and polite with helpees, even ones who are difficult and/or UPE.

Frankly, I'm not appreciative of this blatant mischaracterization of my actions.

MMiller (WMF) (talkcontribs)

Thanks for sharing that perspective, @Galendalia. We know for a fact from research that hostility toward newcomers drives them away. Here is one of the most important papers about it, and here is another influential research project. I think it's definitely hard to improve the culture of a wiki, and I think it's great that you're trying to be a force for positivity in your work. So far, the mentors that we've recruited seem to be generally encouraging to newcomers, and I think you have a good idea that we should make sure it's clear that many people can be a mentor -- it doesn't only have to be the most experienced and involved editors on the wiki.

Nick Moyes (talkcontribs)

I can feel Galendalia’s pain. Shortly after becoming an Administrator earlier this year, I thought I’d go and try out IRC.chat as I’d never used it and thought I ought to get a feel for the place. I not only found it incomprehensible, but I was also permanently blocked by a so-called ‘helper’ whose manner towards me was appallingly unwelcoming. There is no accountability or complaints system at IRC, so I will never recommend that any newcomer on en-wiki go there unless major changes happen there, or unpleasant/unhelpful editors are kicked out. The person who I encountered wasn’t an admin, so unpleasant attitudes to newcomers aren’t unique to those with extended rights. Finding mentors/helpers with the right interpersonal skills to be able to deal with inexperienced users is critically important.

Waggie (talkcontribs)

I'm sorry that Nick Moyes had a bad experience, although I must say that it was somewhat self-inflicted. There is accountability on IRC, and there is a process for complaints and appeals. For a more complete and accurate explanation of what actually happened here, please read the thread at en:User_talk:Waggie#Your_attitude_on_IRC. I go into great detail about why this happened. I am also willing, with Nick Moyes' and Jeske's (as the other involved person here) permission, to publicly release the logs of the encounter. First, there was no "permanent block"; bans in -help are for 24 hours by default. Secondly, as soon as they were identified to a known "good" user, I lifted the ban immediately.

Sdkb (talkcontribs)

Looking through the list of tasks at https://en.wikipedia.org/wiki/Wikipedia:Task_Center...

As I've mentioned at a previous stage, I still think anti-vandalism has a ton of potential to be a structured task for newcomers (it somewhat already is with WikiLoop Battlefield). Categories and copy editing both sound good. There are also some more niche tasks that could be easily structured, such as fixing links to disambiguation pages that pop up in mainspace.

MMiller (WMF) (talkcontribs)

@Sdkb -- I remember when you mentioned that, and @Zoozaz1 brought up WikiLoop Battlefield as an example of how reverting vandalism is like a structured task. I guess my open question is still whether newcomers would do a good job of judging vandalism, given their low wiki experience. You recommended that we check in with some Wikipedians who do a lot of edit patrolling. I can go seek some out -- is there anyone in particular who you would recommend or tag?

Sdkb (talkcontribs)
Galendalia (talkcontribs)

My issues with the rollback that everyone gets are:

1. Inexperienced

2. Not trained

3. Causes edit wars

I recommend one or all of the following:

   A. IP users are not allowed to use the rollback feature.
   B. Only the people who have graduated from the CVUA should have rollback rights (I see a lot of new users getting the right without any type of training).
   C. To use the built-in rollback, a user must be registered and have 3 months of experience.
MMiller (WMF) (talkcontribs)

Hi @Galendalia -- thanks for thinking about this. We've been talking a lot about easy editing tasks for newcomers to do, and we wanted to hear from someone in CVU because of the idea that maybe reverting simple vandalism is something newcomers could help with. It seems like an interesting idea, because on the one hand, some vandalism is really obvious, but on the other hand, newcomers know little about Wikipedia or vandalism, and might not have the judgment required. What's your take? Could you imagine newcomers being given something like a very simple version of Huggle, and asked to revert obvious vandalism? If I'm reading your previous comment correctly, it sounds like maybe you would say it's not a good idea.

Galendalia (talkcontribs)

Hi @MMiller (WMF): Even though I have been on WP just over a month, I feel the inexperience would be a major hindrance. Like I stated above, they need to complete the CVUA and be on WP for at least 3 months. This will allow new editors time to process the policies and learn from their mistakes rather than reverting a valid entry. There are sometimes subtle entries which would probably not be noticed unless you are looking for them, like no source listed in the diffs. Wait, what is a diff? That is a question I see users asking a lot.

MMiller (WMF) (talkcontribs)

Thanks, @Galendalia. It sounds like your general advice is that reverting vandalism takes some experience and knowledge. Got it. But it also sounds like you have an interesting story, if I may ask -- how did you find your way to reverting vandalism so soon after joining Wikipedia? What caused you to try that type of editing in the first place? What were the very first edits you did?

Galendalia (talkcontribs)

Honestly, it seemed like the only thing I could do without someone reverting what I did, or going on a tangent about questions I asked without ever answering the question I posed in the first place. I pretty much do 2 things: CVU and Dispute Resolution. I am also in the process of rebooting Spoken Wikipedia, as there is plenty of interest in it. That will be the 3rd thing. I’ve been trying to clean up the places where active user lists are maintained, and I’m getting a lot of flak for that because in one instance it requires removing the tag or userbox from someone’s user page, and I only did this for those who are permanently blocked. However, as soon as I did it, people were all over me and reported me to ANI, and I’m getting nothing but crap for housekeeping.

MMiller (WMF) (talkcontribs)

Also pinging @Revi (WMF), who has a perspective on this from Korean Wikipedia, which doesn't have any sort of bots for reverting simple vandalism.

NickK (talkcontribs)

I would very much like to have one more: correcting typos / improving language. Wikipedias have a lot of articles that are labelled as needing proofreading. If we can use some spellchecker or dictionary (e.g. for identifying words that are very similar to the dictionary ones but possibly misspelled) or some style problems (e.g. common stop words like 'outstanding' or 'interestingly'), that would give us a good task for a simple first edit. Beyond that, Ukrainian Wikipedia also has a good list of problems at uk:Вікіпедія:Проект:Якість.
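
As a rough illustration of the dictionary idea, here is a minimal Python sketch using the standard library's `difflib.get_close_matches` to flag words that are not in a word list but are very similar to one that is, plus a separate check for style words. The tiny `DICTIONARY`, the `STYLE_WORDS` set, and the `cutoff` of 0.8 are assumptions made up for this example; a real wiki would need a full word list for its language, and every flagged word would still need a human check.

```python
import difflib
import re

# Hypothetical word list; a real one would come from a dictionary for the wiki's language.
DICTIONARY = sorted({
    "the", "article", "describes", "a", "famous", "mountain",
    "it", "is", "very", "high",
})
# Hypothetical list of words that often signal non-neutral style.
STYLE_WORDS = {"outstanding", "interestingly"}


def find_candidates(text, dictionary, style_words, cutoff=0.8):
    """Return (possible_typos, style_hits) for a piece of text.

    possible_typos maps each unknown word to its closest dictionary word,
    but only when the similarity ratio is at least `cutoff`.
    """
    possible_typos = {}
    style_hits = []
    for word in re.findall(r"[a-z]+", text.lower()):
        if word in style_words:
            style_hits.append(word)
        elif word not in dictionary:
            close = difflib.get_close_matches(word, dictionary, n=1, cutoff=cutoff)
            if close:
                possible_typos[word] = close[0]
    return possible_typos, style_hits


sample = "The articel describes a famous mountian. Interestingly, it is very high."
typos, style = find_candidates(sample, DICTIONARY, STYLE_WORDS)
print(typos)
print(style)
```

Words with no close match (names, places, foreign terms) are simply left alone in this sketch, which is one crude way to avoid false positives.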

Sdkb (talkcontribs)
MMiller (WMF) (talkcontribs)

@NickK -- I agree that would be a perfect task for newcomers. And I think you've hit on the main problem: how to automatically generate lists of potential spelling and grammar corrections across dozens of languages? @John Broughton pointed me towards the Typo Team's "moss" tool, which does this for English. Also, engineers on the Growth team pointed out the aspell and hunspell libraries, which have many languages. Do you know if Ukrainian Wikipedia already does anything like that? Where do the problems listed at uk:Вікіпедія:Проект:Якість come from? Are they from maintenance templates placed by users, or from some automation?

NickK (talkcontribs)

@MMiller (WMF): We have had multiple discussions about libraries, and we have several bot owners who maintain their own lists. There are some lists at uk:Вікіпедія:Список найтиповіших мовних помилок internally or Неправильно — правильно externally (the latter cannot be completely copied, as some forms might still be accepted in some contexts, so a human check will be needed). If this is the only issue, I think we can come up with some solution.

Regarding uk:Вікіпедія:Проект:Якість, yes, they are maintenance templates placed by users.

Galendalia (talkcontribs)

I know AutoWikiBrowser had this feature, and I was going to start in on some of them; however, I was informed today that the feature is long gone. I know there is a db source somewhere that contains dictionary words. This does not necessarily resolve synonyms or other word choices. It would be great to have a bot that could make those changes based on the article language tag and also fix dates based on the article date format tag.

Barkeep49 (talkcontribs)

I also agree with adding categories and typos as potential tasks. Looking at the bigger picture, I'm wondering if individual communities could input something into a template to generate these tasks, rather than everything having to be done uniformly on the backend, perhaps through categories which this tool could render in nice forms.

MMiller (WMF) (talkcontribs)

@Barkeep49 -- thanks for weighing in. I think that the way we have started to build newcomer tasks is in line with how you're thinking about it. Right now, the feed that newcomers get runs off of maintenance templates, like these. Most wikis have big backlogs of these templates, but maybe one day in the future, newcomers (or others using this feature) could churn through the backlogs, and communities would be incentivized to keep tagging articles with them. That said, the idea we're talking about here, "structured tasks", is about these tasks coming from an algorithm, as opposed to from maintenance templates. Perhaps both sources could continue to be options, and communities could regulate which of the pipes (so to speak) they turn on and off into these task feeds.

Galendalia (talkcontribs)

To go off of this, it would also depend on the user's grasp of the language. There is a small difference between British English and American English. The same goes for the Spanish language, where I believe there are 3 versions. I know of a few editors from other countries who try to correct what they assume are typos but which, in the context of the sentence, are not. That may pose a potential problem with this being automated or templated.

Nick Moyes (talkcontribs)

I've just posted my support for typo-fixing in the General Thoughts section above, but I'd like to reiterate it as a preferred first task, and to try to understand why fixing typos as a structured task is seen as so difficult to implement across different language sites.

English Wikipedia already has Wikipedia:Lists of common misspellings; Wikipedia:AutoWikiBrowser/Typos and even an article on Commonly misspelled English words, plus a list of variations of acceptable spellings that should NOT be corrected like colour>color and vice versa (Wikipedia:List of spelling variants).

Even if other language Wikipedias don't currently have any similar internal lists, surely such spell-check lists are available from elsewhere? And it could even be an ideal opportunity to engage with wider editing communities to start building up such a list of common errors themselves, which could be incorporated into this task?

I do tend to feel that anti-vandalism might not be an ideal structured task as it does require some understanding of what is and isn't bad faith editing, and is possibly also prone to being abused if bad edits are let through. English Wikipedia already has edit filters and Cluebot for removing the worst of the worst - but what about other languages? Does manual input here have a role to play?

Galendalia (talkcontribs)

Hey Nick, I just wanted to point out, as I stated earlier, they removed that function from AWB. Also, you have to have a really good reason to gain access to the application to use it. I got denied a couple times, but then they accepted my reasoning. I think part of the difficulty may be the language format in which symbols/characters are used. That would require every language to have their own version of spell check.

John Broughton (talkcontribs)

I'm sure that there are spell checkers already in existence that cover the majority of Wikipedia languages - see https://webspellchecker.com/ , for example.

Pelagic (talkcontribs)

Grammar and punctuation fixing came to my mind. Most educated native speakers have an intuitive sense of wrongness when they see an ungrammatical sentence. Having a feel for encyclopaedic tone is a less common skill, but the improvements don’t have to be perfect.

Acquiring the software to identify problem sentences for non-major languages would be harder for grammar than spelling, I imagine.

MMiller (WMF) (talkcontribs)

@John Broughton @Sdkb @Nick Moyes @NickK @Pelagic @Barkeep49 -- since we were all talking about how it would be valuable to have copyediting as a structured task, @Tgr (WMF) and I did some research to look into it. We talked to @Beland, the creator of "moss", a typo-detection script on English Wikipedia. We learned how the tool works, and talked about prospects for doing similar things in other languages. You can see our notes here (@Beland, please add to or correct them!). We're going to keep thinking, learning, and posting about the possibilities around copyediting.

Sdkb (talkcontribs)

Sounds great; thanks for the update!

LittlePuppers (talkcontribs)

@MMiller (WMF): I would make spelling correction a separate task from copyediting and label it as such; I personally think of copyediting as more of a grammar/structure/clarity thing than spelling correction. That's not to say that fixing typos is unimportant or something we shouldn't do, but it might be more clear for other editors (and you should probably deal with categories differently for each). LittlePuppers (talk) 01:37, 4 June 2020 (UTC)

MMiller (WMF) (talkcontribs)

Hi @LittlePuppers -- thanks for weighing in. That distinction is not something I had thought about. And I think you're right -- the more we've thought about how we might build a structured task that would recommend spelling corrections, the more we think that such a task would only recommend spelling corrections, and not other kinds of grammar edits, which would require totally different algorithms to identify. Where would you say that the phrase "typos" fits in? Do you think typos are more about spelling, or more about punctuation or something else?

LittlePuppers (talkcontribs)

Thanks MMiller (WMF). I'd say that spelling is solidly within the realm of typos, and something like phrasing is solidly within the realm of grammar, while punctuation is somewhere in between. It's a bit harder to say, but I think that punctuation would fit into the category of typos if it's an obvious and entirely unambiguous error (for example, putting two periods instead of one at the end of a sentence), but more in the category of grammar when it's something less clear-cut (such as over or underuse of commas, or a period vs. a semicolon).

To generalize a bit more, typos are unambiguous mistakes based on basic rules (be it a misspelled word, or some other typographical error), while copyediting or grammar (whatever you decide to call it) focuses on improving language (be it sentence or article structure, phrasing, punctuation, etc.) in a way that makes it more clear or easier to understand, even if it wasn't strictly "wrong" before. To link to two projects on en.wp I think you're familiar with, typos are in the realm of the MOSS project and grammar/copyediting is in the realm of the Guild of Copyeditors. LittlePuppers (talk) 02:08, 24 June 2020 (UTC)

MMiller (WMF) (talkcontribs)

Thank you, @LittlePuppers. This actually helps a lot, especially where you said "typos are unambiguous mistakes". This has implications for our prioritization and design of different structured tasks. For "unambiguous mistakes", we can probably create a very confident algorithm that can feed easy edits to newcomers, which they could accept or reject. Copyediting or grammar is a more advanced task, requiring the newcomer to create/produce the change on their own. It's like the difference between a true/false question ("This word should actually be spelled this way. True or false?") and an open-ended question ("What is a better way to phrase this sentence?").
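
To illustrate the true/false framing, here is a minimal Python sketch of what an accept-or-reject spelling task could look like as data. The class name `SpellingSuggestion` and its fields are hypothetical, invented for this example; this is not a description of how any Growth feature is built.

```python
from dataclasses import dataclass


@dataclass
class SpellingSuggestion:
    """A hypothetical accept/reject task: one sentence, one proposed fix."""
    sentence: str
    wrong: str
    proposed: str

    def question(self) -> str:
        # The true/false style prompt shown to the newcomer.
        return f'Should "{self.wrong}" be spelled "{self.proposed}"? (yes/no)'

    def apply(self, accepted: bool) -> str:
        # Accepting replaces the first occurrence; rejecting changes nothing.
        if not accepted:
            return self.sentence
        return self.sentence.replace(self.wrong, self.proposed, 1)


task = SpellingSuggestion(
    sentence="The mountian is very high.",
    wrong="mountian",
    proposed="mountain",
)
print(task.question())
print(task.apply(accepted=True))
```

An open-ended copyediting task would not fit this shape, because the newcomer would have to write the replacement text themselves instead of choosing yes or no.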

MMiller (WMF) (talkcontribs)

Hello @جار الله -- I'm the product manager for the WMF Growth team, and I work with @Dyolf77 (WMF). He said it would be okay if I ping you here, where we are having a conversation about "structured tasks". In this conversation, we have been talking about automated ways to identify spelling errors in the wikis, so that we can point them out to newcomers to fix. We've talked about the moss tool in English Wikipedia, and I've learned that you built something similar in Arabic Wikipedia with JarBot. We're trying to figure out how possible it would be to build similar things in many wikis. I'm hoping you can answer some questions about your work. Thank you!

  • Which dictionaries/spellcheckers does JarBot use, and which one is best?
  • Does JarBot scan every revision when it is made? Or does it follow its own path through the articles?
  • Approximately how many spelling corrections does it make per day?
  • How does JarBot avoid making changes to people's names or names of locations, or other words that cannot be found in a dictionary?
  • Does it assign a score for how likely something is to be a misspelling, with some having higher scores and some lower? Or does it simply decide that a word is either misspelled or not?
  • Does JarBot automatically make the corrections? How accurate is it? In other words, how often are its corrections reverted?
  • How easily do you think something like this could be made for another language?
جار الله (talkcontribs)

Hello @MMiller (WMF)

Which dictionaries/spellcheckers does JarBot use, and which one is best? I use a list of the most common mistakes in Arabic; the list is made by arwiki editors.

Does JarBot scan every revision when it is made? Or does it follow its own path through the articles? It depends on the task: sometimes it works from new revisions and sometimes by scanning the database.

Approximately how many spelling corrections does it make per day? I don't know, maybe 50-100.

How does JarBot avoid making changes to people's names or names of locations, or other words that cannot be found in a dictionary? There is a list of words that the bot must avoid, but our common mistakes list doesn't include names and locations.

Does it assign a score for how likely something is to be a misspelling, with some having higher scores and some lower? Or does it simply decide that a word is either misspelled or not? The script doesn't use AI to make decisions (maybe in the future).

Does JarBot automatically make the corrections? How accurate is it? In other words, how often are its corrections reverted? Yes, the bot automatically makes the corrections, and 99.99% are correct.

How easily do you think something like this could be made for another language? I don't know about other languages, but in Arabic, and maybe the languages of the Middle East, the start will be from scratch and the work will be difficult because there are no valid word lists or comprehensive dictionaries.

Best regards.
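
For readers following along, the list-based approach described in these answers can be sketched roughly as below. This is not JarBot's code; it is a minimal Python illustration, with made-up English entries standing in for the community-maintained mistakes list (`COMMON_MISTAKES`) and the list of words to avoid (`AVOID`). Each word is either in the mistakes list or it is not, so there is no scoring.

```python
import re

# Hypothetical stand-ins for community-maintained lists.
COMMON_MISTAKES = {"recieve": "receive", "teh": "the"}
AVOID = {"Teh"}  # words the bot must never touch, e.g. a name


def correct(text, mistakes, avoid):
    """Replace listed mistakes word by word, skipping words in `avoid`.

    Returns the corrected text and the number of replacements made.
    """
    changes = 0

    def fix(match):
        nonlocal changes
        word = match.group(0)
        if word in avoid or word not in mistakes:
            return word  # leave unknown or protected words unchanged
        changes += 1
        return mistakes[word]

    # \w+ matches whole words, including non-Latin scripts in Python 3.
    return re.sub(r"\w+", fix, text), changes


fixed, count = correct("Teh Tarik fans recieve teh news.", COMMON_MISTAKES, AVOID)
print(fixed)
print(count)
```

Because only exact, listed mistakes are replaced, accuracy depends entirely on how carefully the two lists are curated.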


MMiller (WMF) (talkcontribs)

Thank you for the quick reply, @جار الله. These answers are helpful for now, and I will get back in touch if we decide to work on a project around spelling.

HLHJ (talkcontribs)

Typo-fixing seems like a task that would fit well in a mobile interface. Subtitling movies on Commons and translating subtitles also spring to mind. Adding "lang" templates would also be very useful and make the Typo Team's life easier (flagging that these words are Latin, these are Japanese, and so on).

More creatively, the WikiProject Guild of Copy Editors is always looking for volunteers to read through select articles and review and fix. This is not as readily done on a small interface.

You are building this into a reader app. Maybe link it to what the reader is doing? If they are confused, help them add a "clarify" inline tag. If it needs a citation, have them add that inline tag (everyone knows that tag, even if they never edit). If it is US-centric, let them add "globalize" inline tags.

A good simple interface for this might be OpenStreetMap-style comments to articles; "I got lost here, because you did not define this mathematical term" and suchlike. Scan the text and suggest some inline tags in which the comment could fit as a "reason=" parameter, in this example "clarify". There's a related project for doing something similar in Huggle.

And then let them resolve tags. If a section is templated as needing expansion, ask them to submit a comment suggesting sources that could be used to expand the section, as plain URLs. If they spend time on a "citation needed", the app could tell them to click the tag for guidance on adding a reference (a few times only). Or a banner saying: "This article has a photo request. If you have or could take a photo to donate to this article, please [add it]".
