Topic on Talk:Reading/Web/Projects/Related pages/Flow

Integrate with 'See also' section

Wittylama (talkcontribs)

It is not normal to have "content" outside of the editable window. Since this is a Beta Feature, could you test showing it as part of the "See also" section that most (English Wikipedia) articles have? This is the logical place to put suggestions for other things. You'd need to differentiate them visually, and carefully, from the human-created suggestions (see the 'manual of style consistency' section here). There would need to be a fallback option for when there is no 'see also' section, of course. If you integrated these 'automatic' suggestions into the manually created section, then you wouldn't even need to have a separate heading called "related articles" - it would be just another facet of the "See also" section.

Jkatz (WMF) (talkcontribs)

As @Jdlrobson pointed out in https://phabricator.wikimedia.org/T121398, this would require a re-architecture of the page. I agree there is some redundancy for most articles. I had thought that this would be an effective backup for articles with no 'see also' section, or for someone who might be overwhelmed by the sometimes large number of 'see also' links. There is a purpose to having this at the end of the article, which is to catch those users who have reached the end and may be looking for something else to do. Putting it higher up in the article is certainly something we have considered and might test on the iOS app, but we wanted to see the results on web before promoting the feature any further.

He7d3r (talkcontribs)

I would prefer it not to be moved inside the real content (which is generated by the wikitext users add to the page).

Wittylama (talkcontribs)

I understand the idea of providing a 'now what' feature at the very bottom of the page if the reader has reached that far and found nothing that interests them to read onwards... But it feels like this feature has "escaped" from mobile to desktop. In mobile, the team has seen fit to hide both the categories and the navboxes that the community creates with so much effort - both of which serve the 'now what' use-case. [I find it irritating that all that good work is just hidden in mobile, but that's a different story... ]

However, in the desktop mode we DO have navboxes and categories (and the See also section) at the bottom of an article. Adding an EXTRA section that duplicates the purpose of the see-also section - just with a different method of selecting the items listed - seems to be adding confusion to the layout rather than adding a new/different service to the reader.

Considering that "integrating" the manual (see also) and automatic (related articles) would require changing the architecture of a mediawiki page itself, I'm not sure how to solve the problem though...

Tacsipacsi (talkcontribs)

I also think that this section must be at least above categories, navboxes etc. At the bottom of hu:Nyolcvan nap alatt a Föld körül (regény), there is the following structure:

<table class="navbox" /><!-- navbox -->
<table class="navbox" /><!-- authority control -->
<div /><!-- portals, unfortunately without any class -->
<div id="catlinks" /><!-- categories -->

The suggestions must be above them.

GoEThe (talkcontribs)

I also think it should be better integrated with the See also section, by either using the items suggested in the See also section (if it exists) or by merging the two (and explicitly saying that the Read more ones are automatically generated; not user generated).

Jey (talkcontribs)

I agree with this comment. I think that this is a must-have requirement if we want to release this feature in the future.

Jdlrobson (talkcontribs)

FYI technically if we wanted to control the placement we'd have to get editors to agree to generate the "See also" section via some kind of hook e.g. {{#SeeAlso:Title1|Title2|Title3}}. Note in current form the see also section is an unstructured chunk of text with no semantic value - there is no way to know it lists related articles.
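To illustrate why a hook would make the section machine-readable, here is a minimal Python sketch that pulls titles out of wikitext. It assumes the hypothetical `{{#SeeAlso:...}}` syntax from the example above; no such parser function exists today, and the function name `extract_see_also` is made up for illustration.

```python
import re

# Hypothetical syntax taken from the example above:
#   {{#SeeAlso:Title1|Title2|Title3}}
SEE_ALSO_RE = re.compile(r"\{\{#SeeAlso:([^{}]*)\}\}")


def extract_see_also(wikitext: str) -> list[str]:
    """Return the titles listed in every {{#SeeAlso:...}} call, in order."""
    titles = []
    for match in SEE_ALSO_RE.finditer(wikitext):
        for part in match.group(1).split("|"):
            title = part.strip()
            if title:  # skip empty parameters such as "a||b"
                titles.append(title)
    return titles


print(extract_see_also("Text.\n{{#SeeAlso:Title1|Title2|Title3}}\n"))
# prints ['Title1', 'Title2', 'Title3']
```

An ordinary `== See also ==` heading followed by a bulleted list yields nothing here, which is the point: without structure, software cannot tell that the list holds related articles.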

Before proposing such a big change, we could run an A/B test on a handful of pages (which show the same data) to compare click rates of putting it at the bottom vs in the article.

Given it's common to see these kinds of things at the bottom of the page, I suspect there is data out there somewhere showing it's better at the end than within the article, but if we want to show it ourselves it's possible...

Nihiltres (talkcontribs)

+1 for Jdlrobson's point of generating "See also" from some sort of hook; the catch there is that such a hook would have to accommodate structure where "see also" sections include it, e.g. with descriptions appended to links, or subsectioning. Nevertheless a good idea!

I think it's important that we distinguish some of the issues that come from this feature's differences with "vanilla" Wikipedia form. Some examples:

  • Placement: Our default is an order that's presumably "most to least relevant": "See also", "References", "Further reading", "External links", and then possibly navboxes. This feature effectively moves "See also" to the bottom. If this matches/improves user behaviour, the data might actually support changing the default section-ordering more than it supports adding this feature. We should compare "related pages" results with the results of moving "see also" sections to the end.
  • Visibility: Our default is text-only, while this feature adds images, making it obviously more eye-grabbing. We should compare "related pages" results with the results of adding preview images to an otherwise human-curated "See also" section.
  • Control: Editors want to be able to specify articles and value being able to pick the most relevant results (compared with the vagaries of a search algorithm), while automatic updates have been repeatedly cited as an advantage with "related pages". We should compare the results of manual "see also" sections vs. automatic "related pages" with each using the same format, to confirm the assumption that human-curated "See also" sections are significantly more useful than automatic "related pages". "Automatic updates" might better be done as a suggestion tool within VisualEditor support for a {{#SeeAlso:}} tag, or a maintenance list showing articles omitting high-relevance related pages that someone can rush through with AWB.

If we don't address each of these points, data showing better CTR for "related pages" is irrelevant, as improvements in other areas might be as good or better, while omitting its downsides.

In any event, it might be useful to stress that "related pages" are generated automatically as a contrast from the rest of the content, because we have enough problems with people being ignorant of Wikipedia's editable, user-generated nature despite the "[edit]" links all over the place.

Ruud Koot (talkcontribs)

Nihiltres raises some very good points.

  1. Moving the current See also section to the bottom might well make sense. The navigation boxes are also at the bottom, and the bottom of the article might well be the most visible piece of screen estate after the top. On the other hand, in relatively short articles with many references, the See also section may be more visible above the References than below. This would need proper A/B testing first. And in the end this is a content decision, and thus one for the community (not the developers) to make.
  2. The inclusion of images is an orthogonal issue. Adding images in front of links will most certainly increase the click-through rate for those links. But this doesn't measure anything useful for several reasons:
    1. The links in the body of the article are likely to be much more valuable to a reader than those in a See also section (especially if they are low-quality automatically generated links). So you would also have to take the opportunity cost of this decision into account (that is, more useful links that now no longer get clicked).
    2. Wikipedia is here to provide information to people on a very specific topic they have likely decided on in advance. Not to provide entertainment to the bored, or accumulate the greatest number of page views to generate advertisement revenue. I'm not convinced click-through rate has a positive correlation with user satisfaction.
    The community has made a very conscious decision to use non-functional/decorative and non-free images only very sparingly. We want to focus on the (free) content first and foremost. It's again not up to the developers to override the community on this.
  3. The selection of the links is another orthogonal issue. Hand-curated content is one of the things that makes Wikipedia among the most lauded websites on the net. The community has always been very sceptical of for example bot-generated or machine-translated articles. The current selection of related articles produced by this extension seems to be based on the results of a single call to a built-in CirrusSearch function which uses a very simplistic ranking metric (word frequency). From the examples I've seen the quality of this selection is unacceptably low. I have little faith that the developers will be able to come up with a system that can approach the quality of manually selected content. But in the end, whether we want automatically generated See also sections should be up to the community and not pushed through by the developers.
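For anyone who wants to judge the quality of these results on their own articles, the sketch below builds a request URL for the CirrusSearch `morelike:` keyword through the standard MediaWiki search API. It is an assumption that this approximates what the extension requests; the exact call the extension makes is not confirmed here, and `morelike_url` is a made-up helper name.

```python
from urllib.parse import urlencode


def morelike_url(title: str, limit: int = 3,
                 api: str = "https://en.wikipedia.org/w/api.php") -> str:
    """Build a search API URL asking for pages similar to `title`."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": f"morelike:{title}",  # CirrusSearch "more like this" keyword
        "srlimit": limit,                 # three results, like the widget
        "format": "json",
    }
    return f"{api}?{urlencode(params)}"


print(morelike_url("Dog"))
```

Opening the printed URL in a browser returns the candidate titles as JSON, which can then be compared by hand with the article's existing See also section.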

(Obligatory xkcd: "The problem with Wikipedia". I don't think it's the lack of click-throughs.)

Wittylama (talkcontribs)

Your final sentence - about how WP has many problems but a lack of things to click on isn't one of them - reminded me of a related frustration I have with the mobile view: that it hides the navboxes at the bottom of articles. These are manually curated collections of articles that are related to the topic at hand. They would seem to fit into the same type of purpose that the "read more" extension is trying to solve: show people things they might like to continue reading after they've finished an article. It seems strange to me that work is being done to build a new, and automated, system when the mobile version deliberately hides the existing solution the community has built.

So, I filed a phabricator ticket in the hope of addressing this: https://phabricator.wikimedia.org/T124168

Jdlrobson (talkcontribs)

User:Yanpas would be great to have you in this conversation! :)

Jdlrobson (talkcontribs)

Practically, if there is sufficient editor buy-in we could set up Template:See also and run an A/B test on a collection of, say, 1000 high-traffic pages and explore whether related articles gives better click-through rates for the same editor-created results. We could then explore using the algorithmic results to compare click-through rates to editor-curated links.

We could hide the see also section on 50% of pages and hide the related pages widget on the other 50% (of course we should fix the image issue before doing that :))
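One possible way to do the 50/50 split is to hash the page title, so that a given page always lands in the same arm without storing any assignment table. This is only a sketch: the arm names, the salt, and the function `ab_bucket` are assumptions for illustration, not an existing tool.

```python
import hashlib


def ab_bucket(page_title: str, salt: str = "see-also-test") -> str:
    """Deterministically assign a page to one of two test arms."""
    digest = hashlib.sha256(f"{salt}:{page_title}".encode("utf-8")).digest()
    # The first byte is effectively uniform, so its parity gives a ~50/50 split.
    if digest[0] % 2 == 0:
        return "hide_see_also"
    return "hide_related_pages"


# The same title always maps to the same arm.
print(ab_bucket("Dog") == ab_bucket("Dog"))
# prints True
```

Changing the salt reshuffles the assignment, which is handy if the test needs to be rerun on a fresh split of the same pages.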

Is this something we could explore? Is it an offer anyone would like to take me up on?

Ruud Koot (talkcontribs)

I'm not exactly sure what the test is you are proposing here. To make sure you test for only one variable the A/B test would look something like this:

  • A: Use Related Pages as-is, populated with three pages from the morelike query.
  • B: Use Related Pages as-is, populated with three hand-picked pages.

But that means we would really have to hand-pick three related page links for 1000 articles (quite a bit of work). A priori I'd say that B would result in a higher CTR. But of what use is it if that's true? We're not going to hand-pick three links for Related Pages on all 5 million articles afterwards.

So another test could be:

  • A: Use Related Pages as-is, populated with three pages from the morelike query.
  • B: Use Related Pages as-is, populated with three links extracted from the See also section.

A priori I'd say that A would result in a higher CTR: links in the See also sections aren't necessarily the highest-quality links in the article, but rather the links that couldn't be incorporated in the text naturally.

Another test:

  • A: Use Related Pages as-is, populated with three pages from the morelike query.
  • B: Use Related Pages as-is, populated with three links extracted from any part of the article (text, See also section, navboxes, ...)

Obvious problem: depends on how you select three out of n links, so this is not really controlling for one variable. If you can come up with a good way to pick those links this may be an interesting test. Limiting the Related Pages to links to those already present in the article may also solve a number of issues raised here (but probably not completely).

And none of those three tests really measures something very meaningful. I'd say a more interesting thing to measure first would be the non-cannibalizing click-through rate. If that's very low or negative then that would be a good point to give up. If that number is large enough, then you could do a more complicated measurement to see if those extra clicks resulted in higher user satisfaction. (E.g. was the article really read, or did the user quickly click through to yet another article? This is one of the dangers of focussing on CTR too much, of course: it may well be that low-relevance links encourage people to click on more links, because they now need to click on more links before they find something relevant. Thus CTR could be negatively correlated with user satisfaction.)
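One possible reading of "non-cannibalizing click-through rate" is the net change in total clicks per page view once Related Pages is shown, so that clicks merely diverted from body links cancel out. The sketch below assumes that reading; the class, the field names, and the numbers are all made up for illustration and are not measurements.

```python
from dataclasses import dataclass


@dataclass
class ArmStats:
    page_views: int
    body_link_clicks: int    # clicks on links in the text, See also, navboxes
    related_clicks: int = 0  # clicks on the Related Pages widget (0 if hidden)

    def total_ctr(self) -> float:
        """All internal-link clicks per page view."""
        return (self.body_link_clicks + self.related_clicks) / self.page_views


def net_ctr_gain(control: ArmStats, treatment: ArmStats) -> float:
    """Extra clicks per view after subtracting clicks taken from other links."""
    return treatment.total_ctr() - control.total_ctr()


# Made-up numbers: the widget gets 300 clicks, but body links lose 200.
control = ArmStats(page_views=10_000, body_link_clicks=3_000)
treatment = ArmStats(page_views=10_000, body_link_clicks=2_800, related_clicks=300)
print(net_ctr_gain(control, treatment))  # roughly 0.01, i.e. one extra click per 100 views
```

A value near zero or below would mean the widget mostly moves clicks around rather than adding any, which is the "good point to give up" described above.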

Yanpas (talkcontribs)

Sorry for the long response. Here is how I see adoption of this feature:

For each See also entry, the article should include a template. For example, the page Dog could include something like:

{{SeeAlso|Pudel}}
{{SeeAlso|BullDog|YoungBulldog.jpg|Bulldogs are a medium-sized breed of dog...}}

Any fields that were not written manually (image, piece of text) are filled in by the program. If there are 0-2 "See also" templates, the program generates the needed number of entries so that the See also section includes 3 entries. But users may include as many entries as they want. Placement is important. There is an option to place them in the side panel. The bottom of the page is an option too, but I would like to see them where "See also" is currently located. That is very hard technically; I currently do not have an idea how to do it.
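The top-up rule described here can be sketched in a few lines of Python. This is a minimal illustration under assumptions: `fill_see_also` is a hypothetical name, `suggested` stands for whatever ranked list the automatic system would return, and the titles are placeholders.

```python
def fill_see_also(manual: list[str], suggested: list[str],
                  minimum: int = 3) -> list[str]:
    """Keep every manual entry; add automatic ones until `minimum` is reached."""
    entries = list(manual)  # manual entries are never dropped or reordered
    seen = set(manual)
    for title in suggested:
        if len(entries) >= minimum:
            break
        if title not in seen:  # do not repeat something an editor already listed
            entries.append(title)
            seen.add(title)
    return entries


print(fill_see_also(["Pudel"], ["BullDog", "Pudel", "Wolf", "Cat"]))
# prints ['Pudel', 'BullDog', 'Wolf']
```

With three or more manual entries the suggestions are ignored entirely, which matches the idea that editors may list as many entries as they want.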

Converting ==See also== sections to templates can be done by bots or users. Parsing the article is a bad idea, but I guess that is obvious.

Jdlrobson (talkcontribs)

@Yanpas exactly. If see also was template driven we can run any a/b test the community would like around comparing this feature with see also.

Nihiltres (talkcontribs)

@Yanpas, Jdlrobson: The path forward seems straightforward, so here's some notes on a potential plan:

  1. Build a template that creates basic see-also-list functionality, preferably using Lua for some functionality. Some feature ideas for said template:
    • a classed div wrapper for the list
    • default-on (but optional) alphabetical list sorting
    • optional columnization (as, say, {{reflist}} has)
    • optional grouping into subsections
    • prettified section linking (Foo#Bar → Foo § Bar)
    • items populated by link parameters, and some other optional parameters:
      • display-text
      • parent (allowing sub-lists)
      • description (optional suffixed, unlinked text)
  2. Get community consensus to implement these templates per wiki
  3. Get a bot to convert basic see-also sections across each wiki and flag edge cases for manual review
  4. Do A/B-testing on variations based on designer input, using structure provided via steps 1–3
  5. Iterate A/B-testing and proposed designs a bit based on initial results. Repeat as necessary.
  6. Develop see-also-list tag as MediaWiki extension
  7. Implement extension on each consenting wiki
  8. Migrate see-also-list templates to see-also-list tags, using a bot
  9. High-fives all around.

Notably, #1 and #2 probably should be transposed; consider consulting the community before doing tons of template-coding work. Although tempting, #1 should be careful about implementing too many features on-by-default: a minimal change from the unstructured status quo will be easier to get consensus on. Also worth considering: have the staff responsible for the "Related pages" project present the overall plan in context so that the community knows where things are heading.
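A few of the feature ideas in step 1 (classed wrapper, optional alphabetical sorting, prettified section links, optional descriptions) can be prototyped outside the wiki before any Lua is written. The Python sketch below is only a mock-up under assumptions: the class name `see-also-list`, the function names, and the output layout are all hypothetical.

```python
from typing import Optional


def prettify_target(target: str) -> str:
    """Turn 'Foo#Bar' into 'Foo § Bar'; leave plain titles unchanged."""
    page, sep, section = target.partition("#")
    return f"{page} § {section}" if sep and section else page


def render_see_also(items: list[tuple[str, Optional[str]]],
                    sort: bool = True) -> str:
    """Render (target, description) pairs as a wikitext list in a classed div."""
    if sort:
        ordered = sorted(items, key=lambda item: item[0].casefold())
    else:
        ordered = list(items)
    lines = ['<div class="see-also-list">']  # hypothetical class name
    for target, description in ordered:
        label = prettify_target(target)
        link = f"[[{target}]]" if label == target else f"[[{target}|{label}]]"
        suffix = f": {description}" if description else ""
        lines.append(f"* {link}{suffix}")
    lines.append("</div>")
    return "\n".join(lines)


print(render_see_also([("Wolf", None), ("Dog#Breeds", "list of breeds")]))
```

Keeping sorting as a flag rather than hard-coding it reflects the caution above about on-by-default features: passing `sort=False` reproduces the editor's original order.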
