Topic on Help talk:VisualEditor/User guide

Elitre (WMF) (talkcontribs)

After a conversation on de.wiki, a need has emerged to have up-to-date documentation about all the identifiers that are supported by citoid and the supported ways to specify all of them (for example, you can write 10.1016/j.neunet.2004.05.005 or DOI 10.1016/j.neunet.2004.05.005 or DOI:10.1016/j.neunet.2004.05.005, but apparently this isn't true for the other identifiers).

Elitre (WMF) (talkcontribs)
Mvolz (WMF) (talkcontribs)

Ok - where should this list go? In the User Guide?

A lot of input forms will be *possible*, just to be more user-friendly, but you'll get the best results if you do the following:

  • Include the protocol with the link (i.e. http:// or https://). If you leave it out and the input doesn't match any of the identifier searches, we'll guess http. If the input contains a DOI and no protocol prefix is given, for instance, it will actually select DOI as the identifier. If a site is only available over https, we can follow a redirect if there is one, although a few sites don't redirect http -> https.
  • Include the PMC prefix, e.g. PMC12345. We had to require this because of collisions with PMIDs; previously, input with or without the prefix was allowed. Once we allow multiple results, the prefix will no longer be required: both matches will be returned and you can insert the correct citation.
  • PMIDs have to be just the integers. If there's other notation that commonly goes around PMIDs, let me know and we can include it.
  • For DOIs, just the DOI with no prefix is preferred, but as you note, many forms are acceptable with DOI - it can even be picked up from within a URL. This is because DOI has a reasonably distinct pattern, so we can use a regex to find it in the middle of a string.
  • For ISBNs, hopefully any correctly formatted ISBN will work.
  • If we are unable to identify the input, we stick an http:// on the front of it and assume it is a link and try to access it. If there's nothing there, we return a "couldn't make a citation for you" result.
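
To make the order of checks above concrete, here is a rough sketch in Python. It is not citoid's actual code: the function name `classify`, every regex, the 8-digit cap on PMIDs, and the exact order of the checks are assumptions made for illustration, based only on the rules listed above.

```python
import re

# All patterns below are illustrative assumptions, not citoid's real ones.
DOI_RE = re.compile(r"10\.\d{4,9}/[^\s\"<>]+")          # loose DOI shape
PMC_RE = re.compile(r"PMC\d+", re.IGNORECASE)            # prefix is required
PMID_RE = re.compile(r"\d{1,8}")                         # bare integer (cap assumed)
ISBN_RE = re.compile(r"(?:97[89]\d{10}|\d{9}[\dXx])")    # ISBN-13 or ISBN-10 shape


def classify(raw):
    """Return a (kind, value) guess for a free-form citation input."""
    text = raw.strip()
    # An explicit protocol means the input is treated as a link.
    if re.match(r"https?://", text, re.IGNORECASE):
        return ("url", text)
    # A DOI can be found anywhere in the input, with or without a prefix.
    doi = DOI_RE.search(text)
    if doi:
        return ("doi", doi.group(0))
    if PMC_RE.fullmatch(text):
        return ("pmc", text.upper())
    if PMID_RE.fullmatch(text):
        return ("pmid", text)
    # Hyphens and spaces are dropped before checking the ISBN shape.
    compact = re.sub(r"[\s-]", "", text)
    if ISBN_RE.fullmatch(compact):
        return ("isbn", compact.upper())
    # Nothing matched: prepend http:// and try it as a link.
    return ("url", "http://" + text)
```

With this sketch, `classify("DOI:10.1016/j.neunet.2004.05.005")` and `classify("10.1016/j.neunet.2004.05.005")` both come back as a DOI, while `classify("12345")` is a PMID and `classify("PMC12345")` is a PMC ID.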


Most of this stuff is an artefact of how easily the identifier can be identified. DOI has a reasonably unique pattern, although there are definitely false positives and false negatives. PMIDs are terrible because a PMID is just an integer of indefinite length, so we have to be as strict as we can with it. URLs are actually surprisingly difficult to identify - just google "url regex" and you'll find bunches of people doing it a bunch of different ways. So if you find anything unexpected in how things are being identified, please report it on Phabricator; there are a surprisingly large number of edge cases with these things.
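
A few of those edge cases can be reproduced with a deliberately loose pattern. This is only a sketch: `DOI_RE`, `find_doi` and `looks_like_pmid` are hypothetical names, and the patterns are assumptions, not the ones citoid uses.

```python
import re

# Loose DOI shape, assumed for illustration only.
DOI_RE = re.compile(r"10\.\d{4,9}/\S+")


def find_doi(text):
    """Return the first DOI-shaped substring, or None."""
    match = DOI_RE.search(text)
    return match.group(0) if match else None


def looks_like_pmid(text):
    """A PMID is just digits, so any bare integer passes."""
    return re.fullmatch(r"\d+", text) is not None


# Sentence punctuation gets glued onto the match.
trailing = find_doi("See 10.1016/j.neunet.2004.05.005.")

# A path that merely looks like a DOI is picked up (false positive).
lookalike = find_doi("http://example.org/v10.2004/notes")

# A DOI wrapped across a line break is cut short (false negative).
wrapped = find_doi("10.1016/j.neunet.\n2004.05.005")

# A year is indistinguishable from a PMID.
year_is_pmid = looks_like_pmid("2004")
```

Here `trailing` keeps the final full stop, `lookalike` is `"10.2004/notes"`, `wrapped` stops at `"10.1016/j.neunet."`, and `year_is_pmid` is `True`, which is why the stricter rules above exist.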

Elitre (WMF) (talkcontribs)

Neither PMCs nor ISBNs are currently listed as options on the interface, and this is an initial source of confusion. The other one is that inconsistency isn't very user-friendly :) Expectations are that the system will work in the same way with any prefix. So it should either accept prefix-plus-number combinations for every identifier, or for none - allowing them in some cases and not in others looks confusing even if we document it well. When you say "multiple results", what will that look like? Here's a mockup a user provided - having a separate field for the identifier. You can put the list wherever you want, and then we can take care of placing it in relevant places. Thank you!

Whatamidoing (WMF) (talkcontribs)

Elitre, do you see this as being Phab:108980 or a separate project?

Elitre (WMF) (talkcontribs)

Well, of course it could/should go there, but I was hoping that a mere list, even here, wouldn't take so long to produce.
