Page Content Service/References
The references endpoint provides a structured output of reference lists found on a particular page (phab:T170690).
Structure
The HTML and JSON in the examples below are formatted for easier reading. Some attributes (some ids and data-mw) have been removed for the same reason.
The output contains two main objects:
- reference_lists: a list of reference lists, useful for building a reference list UI. It could later carry more information about sections, for example when a section was omitted because it became empty after its ref list was stripped
- references_by_id: a map of reference details
The reference_lists object
The reference_lists object is included to build a native view of reference lists. It contains the reference lists and, in the examples below, the section headings that precede them. Other things could be added in the future, such as other text (HTML) objects. These could be needed if a section towards the end of the article became empty after its reference list was stripped. See T170690#3467608.
The references_by_id object
The references_by_id object contains a hash of reference details, keyed by reference id. Each references_by_id entry has an array of back_links and a content object holding the HTML content and a type that describes the kind of citation.
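The relationship between the two objects can be sketched in a few lines of Python. This is only an illustration, not the service's own code: the helper name `resolve_reference_lists` is hypothetical, and the sample dictionary is a trimmed copy of the output shown in Example 1 below.

```python
from typing import Any, Dict, List


def resolve_reference_lists(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return one entry per reference list, with its references resolved in display order."""
    by_id = response["references_by_id"]
    resolved = []
    for item in response["reference_lists"]:
        if item["type"] != "reference_list":
            continue  # skip other item types, e.g. section_heading
        resolved.append({
            "id": item["id"],
            # "order" holds keys into references_by_id
            "references": [by_id[ref_id] for ref_id in item["order"]],
        })
    return resolved


# Trimmed copy of the Example 1 output (revision and tid omitted).
sample = {
    "reference_lists": [
        {"type": "section_heading", "id": "References", "html": "References"},
        {"type": "reference_list", "id": "#mwt4", "order": ["ref2-1"]},
    ],
    "references_by_id": {
        "ref2-1": {
            "back_links": [{"href": "#cite_ref-ref2_1-0", "text": "↑"}],
            "content": {"html": "source 1", "type": "generic"},
        }
    },
}

lists = resolve_reference_lists(sample)
print(lists[0]["references"][0]["content"]["html"])  # prints: source 1
```

Keeping the details in a map and the ordering in the lists means a reference is stored once, however the lists are laid out.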
Example 1: one ref list, one ref
Markup | Renders as |
---|---|
Bar<ref name="ref2">source 1</ref>. == References == <references/> |
Bar[1].
References
|
Parsoid HTML | References output |
---|---|
|
{
"revision": "2640831",
"tid": "ab21dbfa-f23b-11e7-9ffb-8e725cd7335b",
"reference_lists": [
{
"type": "section_heading",
"id": "References",
"html": "References"
},
{
"type": "reference_list",
"id": "#mwt4",
"order": [
"ref2-1"
]
}
],
"references_by_id": {
"ref2-1": {
"back_links": [
{
"href": "./Page_Content_Service/References/SimpleReference#cite_ref-ref2_1-0",
"text": "↑"
}
],
"content": {
"html": "source 1",
"type": "generic"
}
}
}
}
|
Live examples: Page_Content_Service/References/SimpleReference, the Parsoid HTML, production response, and the local MCS response for the same.
Example 2: two simple ref lists, one ref with two backlinks
Markup | Renders as |
---|---|
Foo<ref group="notes" name="ref1">note 1</ref><ref group="notes" name="ref1"></ref>. Bar<ref name="ref2">source 1</ref>. == Notes == <references group="notes"/> == References == <references/> |
Foo[notes 1][notes 1]. Bar[1].
Notes
References
|
Parsoid HTML | References output |
---|---|
|
{
"revision": "2640615",
"tid": "830e4743-f238-11e7-ab56-48e0735b1d90",
"reference_lists": [
{
"type": "section_heading",
"id": "Notes",
"html": "Notes"
},
{
"type": "reference_list",
"id": "#mwt8",
"order": [
"ref1-1"
]
},
{
"type": "section_heading",
"id": "References",
"html": "References"
},
{
"type": "reference_list",
"id": "#mwt10",
"order": [
"ref2-2"
]
}
],
"references_by_id": {
"ref1-1": {
"back_links": [
{
"href": "./Page_Content_Service/References/MultipleReflists#cite_ref-ref1_1-0",
"text": "1"
},
{
"href": "./Page_Content_Service/References/MultipleReflists#cite_ref-ref1_1-1",
"text": "2"
}
],
"content": {
"html": "note 1",
"type": "generic"
}
},
"ref2-2": {
"back_links": [
{
"href": "./Page_Content_Service/References/MultipleReflists#cite_ref-ref2_2-0",
"text": "↑"
}
],
"content": {
"html": "source 1",
"type": "generic"
}
}
}
}
|
Live examples: Page_Content_Service/References/MultipleReflists, the Parsoid HTML, production JSON output, and the local MCS response for the same.
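A client that builds a native view from this output probably wants to pair each heading with the list that follows it and to know how many backlinks each reference has. The sketch below shows one way to do that. The function names `group_by_heading` and `back_link_texts` are hypothetical, the sample data is a trimmed copy of the output above, and a list that appears before any heading ends up under the key `None`, which is an assumption of this sketch rather than documented behaviour.

```python
from typing import Any, Dict, List, Optional


def group_by_heading(response: Dict[str, Any]) -> Dict[Optional[str], List[str]]:
    """Map each section heading's html to the ordered reference ids listed under it."""
    groups: Dict[Optional[str], List[str]] = {}
    current: Optional[str] = None  # no heading seen yet
    for item in response["reference_lists"]:
        if item["type"] == "section_heading":
            current = item["html"]
            groups.setdefault(current, [])
        elif item["type"] == "reference_list":
            groups.setdefault(current, []).extend(item["order"])
    return groups


def back_link_texts(response: Dict[str, Any], ref_id: str) -> List[str]:
    """Return the link text of every backlink of one reference."""
    return [link["text"] for link in response["references_by_id"][ref_id]["back_links"]]


# Trimmed copy of the Example 2 output (hrefs shortened).
sample = {
    "reference_lists": [
        {"type": "section_heading", "id": "Notes", "html": "Notes"},
        {"type": "reference_list", "id": "#mwt8", "order": ["ref1-1"]},
        {"type": "section_heading", "id": "References", "html": "References"},
        {"type": "reference_list", "id": "#mwt10", "order": ["ref2-2"]},
    ],
    "references_by_id": {
        "ref1-1": {
            "back_links": [
                {"href": "#cite_ref-ref1_1-0", "text": "1"},
                {"href": "#cite_ref-ref1_1-1", "text": "2"},
            ],
            "content": {"html": "note 1", "type": "generic"},
        },
        "ref2-2": {
            "back_links": [{"href": "#cite_ref-ref2_2-0", "text": "↑"}],
            "content": {"html": "source 1", "type": "generic"},
        },
    },
}

groups = group_by_heading(sample)
print(groups)  # {'Notes': ['ref1-1'], 'References': ['ref2-2']}
print(back_link_texts(sample, "ref1-1"))  # ['1', '2']
```

In this example a reference with a single backlink uses "↑" as the link text, while a reference with several backlinks numbers them.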
Example 3: Various citation types
Now let's look at more complex reference content. Reference content can contain cite elements, which determine the type value in the output. The type is usually one of the following values: book, journal, news, web, note. It is derived from any cite elements in the HTML content and is generic when no specific type can be determined.
Markup | Renders as |
---|---|
There are nearly 20,000 known species of bees in seven recognized biological families.<ref name="Danforthetal2006">{{cite journal |vauthors=Danforth BN, Sipes S, Fang J, Brady SG |title=The history of early bee diversification based on five genes plus morphology |journal=Proc. Natl. Acad. Sci. U.S.A. |volume=103 |issue=41 |pages=15118–23 |date=October 2006 |pmid=17015826 |pmc=1586180 |doi=10.1073/pnas.0604033103 }}</ref> ==References== {{Reflist}} |
There are nearly 20,000 known species of bees in seven recognized biological families.[1]
References
|
Parsoid HTML | References output |
---|---|
|
{
"revision": "814255996",
"tid": "5ca31a2e-f23b-11e7-bb72-3927404169a7",
"reference_lists": [
{
"type": "section_heading",
"id": "References",
"html": "References"
},
{
"type": "reference_list",
"id": "#mwt5",
"order": [
"Danforthetal2006-1"
]
}
],
"references_by_id": {
"Danforthetal2006-1": {
"back_links": [
{
"href": "./User:BSitzmann_(WMF)/CiteExample#cite_ref-Danforthetal2006_1-0",
"text": "↑"
}
],
"content": {
"html": "<cite class=\"citation journal\" id=\"mwBQ\">Danforth BN, Sipes S, Fang J, Brady SG (October 2006). <a rel=\"mw:ExtLink\" href=\"//www.ncbi.nlm.nih.gov/pmc/articles/PMC1586180\">\"The history of early bee diversification based on five genes plus morphology\"</a>. <i>Proc. Natl. Acad. Sci. U.S.A</i>. <b>103</b> (41): 15118–23. <a href=\"./Digital_object_identifier\" title=\"Digital object identifier\">doi</a>:<a rel=\"mw:ExtLink\" href=\"//doi.org/10.1073%2Fpnas.0604033103\">10.1073/pnas.0604033103</a>. <a href=\"./PubMed_Central\" title=\"PubMed Central\">PMC</a><span> </span><span class=\"plainlinks\"><a rel=\"mw:ExtLink\" href=\"//www.ncbi.nlm.nih.gov/pmc/articles/PMC1586180\">1586180</a><span> </span><figure-inline><span><img src=\"//upload.wikimedia.org/wikipedia/commons/thumb/6/65/Lock-green.svg/9px-Lock-green.svg.png\" data-file-type=\"drawing\" height=\"14\" width=\"9\" srcset=\"//upload.wikimedia.org/wikipedia/commons/thumb/6/65/Lock-green.svg/18px-Lock-green.svg.png 2x, //upload.wikimedia.org/wikipedia/commons/thumb/6/65/Lock-green.svg/14px-Lock-green.svg.png 1.5x\"></span></figure-inline></span>. <a href=\"./PubMed_Identifier\" title=\"PubMed Identifier\" class=\"mw-redirect\">PMID</a><span> </span><a rel=\"mw:ExtLink\" href=\"//www.ncbi.nlm.nih.gov/pubmed/17015826\">17015826</a>.</cite>",
"type": "journal"
}
}
}
}
|
Live examples on enwiki since MW.org doesn't have all the templates: (enwiki) CiteExample, the Parsoid HTML, production response, and the local MCS response for the same.
TODOs
Most recent changes
- [x] backlinks: make them objects with the backlink href and the link content (phab:T182647)
- [x] citations became type, a single (enum) value of "web", "news", "journal", "book", or "generic" (phab:T182652)
- [x] added a type "note" for footnotes (phab:T274343)
Decisions
- Citations:
  - Cite elements usually have some kind of type indicator in the class list, like "citation web" or "citation book".
  - We show a single value in the type field. If there is exactly one cite tag anywhere in the reference content, or there are multiple cite tags with the same value, we show that value. If there are no cite tags, or multiple cite tags with different values, we show "generic".
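The decision above can be sketched with Python's standard `html.parser`. This is not the service's implementation. The `KNOWN_TYPES` tuple, the function name `reference_type`, and the choice to count a cite element without a recognised class as "generic" are all assumptions made for the illustration.

```python
from html.parser import HTMLParser
from typing import List

# Assumption: the class tokens that count as a citation type.
KNOWN_TYPES = ("web", "news", "journal", "book")


class _CiteCollector(HTMLParser):
    """Collects one type value per <cite> element found in the fed HTML."""

    def __init__(self) -> None:
        super().__init__()
        self.cite_types: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "cite":
            return
        classes = (dict(attrs).get("class") or "").split()
        matched = [c for c in classes if c in KNOWN_TYPES]
        # Assumption: a cite element without a known class counts as "generic".
        self.cite_types.append(matched[0] if matched else "generic")


def reference_type(content_html: str) -> str:
    """Apply the rule above: one shared cite type wins, anything else is generic."""
    collector = _CiteCollector()
    collector.feed(content_html)
    collector.close()
    distinct = set(collector.cite_types)
    if len(distinct) == 1:
        return distinct.pop()
    return "generic"  # no cite elements, or cite elements that disagree


print(reference_type('<cite class="citation journal" id="mwBQ">Danforth BN</cite>'))  # journal
print(reference_type("source 1"))  # generic
```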
Open questions
- Can be added later: Should we keep the bibliographic metadata? See also the COinS syntax, a machine-readable format for bibliographic metadata. COinS spans appear right after cite elements and are not displayed (see style="display:none;"). Historically MCS has stripped them out of mobile-sections (removing elements matching 'span.Z3988'). Currently the new references endpoint strips them out as well.
Examples: Look above for class="Z3988" or here:
<span title="ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=unknown&rft.jtitle=Merriam-Webster+Dictionary&rft.atitle=Barak&rft_id=https%3A%2F%2Fwww.merriam-webster.com%2Fdictionary%2FBarak&rfr_id=info%3Asid%2Fen.wikipedia.org%3ABarack+Obama" class="Z3988" about="#mwt14" id="mwBP0">
<span style="display:none;">
<span typeof="mw:Entity"> </span>
</span>
</span>
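To give a sense of what would be kept if this metadata were exposed, the sketch below decodes the title attribute of such a span using only the Python standard library. The function name `parse_coins_title` is hypothetical, and the title string is a shortened copy of the one in the example above. If the value is read from raw HTML source rather than through an HTML parser, the ampersands may still be encoded as `&amp;` and would need decoding first.

```python
from typing import Dict
from urllib.parse import parse_qs


def parse_coins_title(title: str) -> Dict[str, str]:
    """Decode the key/value pairs stored in a COinS span's title attribute."""
    # parse_qs percent-decodes the values and turns "+" into a space.
    pairs = parse_qs(title, keep_blank_values=True)
    # Keep only the first value per key; repeated keys are ignored in this sketch.
    return {key: values[0] for key, values in pairs.items()}


# Shortened copy of the title attribute from the example above.
title = (
    "ctx_ver=Z39.88-2004&rft.genre=unknown"
    "&rft.jtitle=Merriam-Webster+Dictionary&rft.atitle=Barak"
    "&rft_id=https%3A%2F%2Fwww.merriam-webster.com%2Fdictionary%2FBarak"
)

metadata = parse_coins_title(title)
print(metadata["rft.jtitle"])  # prints: Merriam-Webster Dictionary
```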