Specs/HTML/1.2.0


This page defines a MediaWiki-specific DOM based on HTML5 and RDFa. The semantics of MediaWiki-specific functionality are encoded using RDFa.

RDFa structures

Global prefix mappings:

  • prefix="mw: http://mediawiki.org/rdf/ dc: http://purl.org/dc/terms/"
  • Convention: Capital for types, lowercase for attributes.
  • Generally use the prefix instead of vocab definitions to avoid clashes (and allow mixing) with user-supplied RDFa. User-supplied RDFa with the mw prefix is moved to a non-clashing prefix in Parsoid.

Versioning

An integer version number is set in the head section of the returned HTML document. This version is incremented whenever this DOM spec or any other important aspect of the Parsoid HTML output changes. See bug 52937 for details.

<meta property="mw:parsoidVersion" content="0">
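A client that wants to check the version before relying on a particular DOM feature could read this element as sketched below. This is only an illustration using Python's standard html.parser module; the names VersionFinder and parsoid_version are hypothetical and not part of Parsoid.

```python
from html.parser import HTMLParser


class VersionFinder(HTMLParser):
    """Records the content of <meta property="mw:parsoidVersion">."""

    def __init__(self):
        super().__init__()
        self.version = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and attributes.get("property") == "mw:parsoidVersion":
            # The spec describes the version as an integer.
            self.version = int(attributes.get("content", "0"))


def parsoid_version(html):
    """Return the integer version, or None if the meta element is absent."""
    finder = VersionFinder()
    finder.feed(html)
    return finder.version
```

Returning None for a missing element leaves it to the caller to decide whether an unversioned document should be rejected or treated as the oldest format.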

mw:Placeholder and general client behavior

A typeof="mw:Placeholder" protects DOM structures from any editing. Clients are expected to preserve and protect subtrees marked as such. Clients are also expected to preserve any DOM subtrees marked up with typeof, rel, or property values in the http://mediawiki.org/rdf/ namespace that they don't understand. This decouples clients from Parsoid development and lets them concentrate on editing constructs whose special semantics they understand, without having to implement all possible content elements.
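The preservation rule can be sketched as a simple predicate over an element's attributes. This is a minimal illustration, assuming the mw: prefix is bound to http://mediawiki.org/rdf/ as in the global prefix mappings; the function name must_preserve and the contents of the KNOWN set are hypothetical.

```python
# Hypothetical set of mw: constructs that this particular client knows how to edit.
KNOWN = {"mw:WikiLink", "mw:ExtLink"}


def must_preserve(attrs, known=KNOWN):
    """attrs maps attribute names to values for a single element.

    Returns True if the element's subtree should be left untouched.
    """
    tokens = []
    for name in ("typeof", "rel", "property"):
        tokens.extend(attrs.get(name, "").split())
    mw_tokens = [token for token in tokens if token.startswith("mw:")]
    if "mw:Placeholder" in mw_tokens:
        return True
    # Any mw: token the client does not understand protects the subtree.
    return any(token not in known for token in mw_tokens)
```

Under these assumptions, a client that only understands links would treat an mw:Extension/Math span as read-only while still editing plain wiki links.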

Media

Images

Status: Implemented. Follow-up work remains to be done; see the tracking bug.

In the examples below, the original size of the example image is 1941 × 220 pixels (these are the dimensions of the Foobar.jpg used in parserTests). The width and height in the DOM represent the actual scaled image size (not the bounding box dimensions specified in the wikitext). When image dimensions are modified or images with a non-default size are created, we will serialize to a square bounding box around the given width and/or height attributes. In the future, when using a (possibly scaled) version of the default thumbnail size, we will serialize using the scale or square option to enforce a square thumbnail bounding box (see task T64671).
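The relation between a wikitext bounding box and the scaled width and height in the DOM can be sketched as follows. The rounding rules here are an assumption (aspect-preserving scaling with half-up rounding and a width fitted to the height limit), not something this spec defines; they are chosen because they reproduce figures quoted in this section, such as 89 × 10 for a 500x10px box and 50 × 6 for 50px. All function names are illustrative.

```python
import math


def round_half_up(value):
    """Round a non-negative number half up (Python's round() rounds half to even)."""
    return math.floor(value + 0.5)


def scale_height(src_width, src_height, width):
    """Height that keeps the aspect ratio for the given width."""
    return round_half_up(src_height * width / src_width)


def fit_box_width(src_width, src_height, max_height):
    """Largest width whose scaled height still fits within max_height."""
    ideal = src_width * max_height / src_height
    rounded_up = math.ceil(ideal)
    if round_half_up(rounded_up * src_height / src_width) > max_height:
        return math.floor(ideal)
    return rounded_up


def scaled_size(src_width, src_height, box_width, box_height=None):
    """Return (width, height) of the image scaled into the bounding box."""
    width = box_width
    if box_height is not None and width * src_height > box_height * src_width:
        # The height limit is the tighter constraint, so derive the width from it.
        width = fit_box_width(src_width, src_height, box_height)
    return width, scale_height(src_width, src_height, width)
```

For the 1941 × 220 test image, scaled_size(1941, 220, 500, 50) gives (442, 50), which matches the frameless examples.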

The basic tree structure of all images, regardless of formatting options, alignment, or thumbnails, is:

<figure or span typeof="mw:Image"> <!-- or mw:Image/Thumb, mw:Image/Frame etc -->
 <a or span><img resource="..." src="..."></a or span>
 <figcaption (optional)>....</figcaption>
</figure or span>

The outer <figure> element needs to become a <span> element when the figure is rendered inline, since otherwise the HTML5 parser will interrupt a surrounding block context. The inner <figcaption> element is rendered as a data-mw attribute in this case (since block content in an invisible caption would otherwise break parsing). The inner <a> element needs to become a <span> if there is no link; see task T46627. An "alt" attribute on the <img> is present if (and only if) the "alt=" option is present in the wikitext markup. If the "lang=" option is present, the <img> tag will have a "lang" attribute. The "resource" attribute on the <img> tag specifies the wiki title and namespace for the image (so it doesn't have to be reverse-engineered from the "src" attribute); it should point to a relative URL based on the image title. The "link=" option will be present in generated wikitext if and only if the "resource" attribute of <img> differs from the "href" attribute of the <a> tag.

The <img> tag will have data-file-width, data-file-height, and data-file-type attributes indicating the original (unscaled) size and type of the image. See task T64881.

See phab:118520 for a proposal to replace the <span> with <figure-inline> when the figure is rendered inline.
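The structural rules above can be read back from the markup as sketched below. This is an illustration only, built on Python's standard html.parser and assuming the input contains a single image wrapper; the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class ImageInfo(HTMLParser):
    """Collects the wrapper tag, typeof, link href and image resource."""

    def __init__(self):
        super().__init__()
        self.wrapper = None   # "figure" (block) or "span" (inline)
        self.typeof = None
        self.href = None      # stays None when the inner wrapper is a <span>
        self.resource = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        types = attributes.get("typeof", "").split()
        is_image = any(t.split("/")[0] == "mw:Image" for t in types)
        if self.wrapper is None and is_image:
            self.wrapper, self.typeof = tag, attributes["typeof"]
        elif tag == "a" and self.href is None:
            self.href = attributes.get("href")
        elif tag == "img":
            self.resource = attributes.get("resource")


def describe_image(html):
    info = ImageInfo()
    info.feed(html)
    return {
        "inline": info.wrapper == "span",
        "typeof": info.typeof,
        # "link=" is serialized only when the link target differs from the image page.
        "needs_link_option": info.href != info.resource,
    }
```

With no <a> element at all, href stays None and therefore differs from the resource, which corresponds to an explicitly empty "link=" option.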

Summary of semantic info for images

Summary of semantic info that is present in the HTML generated for images:

  • wrapper node: <figure> for block images and <span> for inline images
  • typeof attribute on the wrapper: mw:Image, mw:Image/Thumb, mw:Image/Frame, mw:Image/Frameless for different image uses
  • figure classes: mw-valign-{baseline,middle,sub,super,text-top,text-bottom,top,bottom}, mw-halign-{left,right,center,none}, optionally mw-image-border, and mw-default-size for full-size images and for thumbs scaled to the wiki's and user's default thumb size
  • figcaption sub-element: the caption
  • resource attribute on image: link to the image resource page. TODO: what to use for images from commons?
  • width and/or height on image: scaled image size. Specifying only one of width or height is fine, and allows easier client-side scaling without aspect ratio issues.
  • alt attribute on image: alt property
  • src attribute on image: thumb governed by explicit thumb option or implicit from image
  • href attribute on the <a> around the image: link target, normally just the image page. The <a> element can be absent if the link is explicitly empty.

Specific image examples

[[Image:Foobar.jpg]] (Note 1)

<span typeof="mw:Image" class="mw-default-size">
 <a href="./File:Foobar.jpg">
  <img resource="./File:Foobar.jpg" src="http://upload.wikimedia.org/wikipedia/commons/3/3a/Foobar.jpg"
       width="1941" height="220">
 </a>
</span>

Without a link, we use the same basic DOM structure, but use a span instead of an a wrapper (see bug 44627):
[[Image:Foobar.jpg|link=]] (Note 1)

<span typeof="mw:Image" class="mw-default-size">
 <span>
  <img resource="./File:Foobar.jpg" src="http://upload.wikimedia.org/wikipedia/commons/3/3a/Foobar.jpg"
       width="1941" height="220">
 </span>
</span>

Adding 'left' causes the image to be rendered in block context, so the outer <span> becomes a <figure>:
[[Image:Foo.jpg|left|<p>caption</p>]] (Note 2, Note 5)

<figure typeof="mw:Image" class="mw-default-size">
 <a href="./File:Foo.jpg">
  <img resource="./File:Foo.jpg" src="http://upload.wikimedia.org/wikipedia/commons/3/3a/Foo.jpg"
       width="1941" height="220">
 </a>
 <figcaption><p>caption</p></figcaption>
</figure>

Scaling, vertical alignment of an inline image:
[[Image:Foobar.jpg|50px|middle]] (Note 1)

<span typeof="mw:Image" class="mw-valign-middle">
 <a href="./File:Foobar.jpg">
  <img resource="./File:Foobar.jpg" src="http://upload.wikimedia.org/wikipedia/commons/3/3a/Foobar.jpg"
       width="50" height="6">
 </a>
</span>

Caption (containing disallowed markup) on an inline image:
[[Image:Foobar.jpg|500x10px|baseline|cap<div></div>tion]] (Note 2, Note 5)

<span typeof="mw:Image" class="mw-valign-baseline"
    data-mw='{"caption":"cap<div></div>tion"}'>
 <a href="./File:Foobar.jpg">
  <img resource="./File:Foobar.jpg" src="http://upload.wikimedia.org/wikipedia/commons/3/3a/Foobar.jpg"
       width="89" height="10">
 </a>
</span>

[[Image:Foobar.jpg|50px|border|caption]] (Note 2)

<span typeof="mw:Image" class="mw-image-border"
    data-mw='{"caption":"caption"}'>
 <a href="./File:Foobar.jpg">
  <img resource="./File:Foobar.jpg" src="http://upload.wikimedia.org/wikipedia/commons/3/3a/Foobar.jpg"
       width="50" height="6">
 </a>
</span>

[[Image:Foobar.jpg|thumb|left|baseline|caption content]] (Note 3, Note 4)

<figure typeof="mw:Image/Thumb" 
  class="mw-halign-left mw-valign-baseline mw-default-size">
   <a href="./File:Foobar.jpg">
     <img src="http://upload.wikimedia.org/wikipedia/commons/3/3a/Foobar.jpg" width="180" height="20" 
        resource="./Image:Foobar.jpg" />
   </a>
   <figcaption>caption content</figcaption>
</figure>

[[Image:Foobar.jpg|thumb|50x50px|right|middle|caption]] (Note 3)

<figure typeof="mw:Image/Thumb" class="mw-halign-right mw-valign-middle">
   <a href="./File:Foobar.jpg">
     <img src="http://upload.wikimedia.org/wikipedia/commons/3/3a/Foobar.jpg" width="50" height="6" 
        resource="./Image:Foobar.jpg" />
   </a>
   <figcaption>caption</figcaption>
</figure>

[[Image:Foobar.jpg|frame|caption]]

<figure typeof="mw:Image/Frame" class="mw-default-size">
   <a href="./File:Foobar.jpg">
     <img src="http://upload.wikimedia.org/wikipedia/commons/3/3a/Foobar.jpg" width="1941" height="220" 
        resource="./Image:Foobar.jpg" />
   </a>
   <figcaption>caption</figcaption>
</figure>

[[Image:Foobar.jpg|500x50px|frame|left|baseline|caption]]

<figure typeof="mw:Image/Frame" class="mw-halign-left mw-valign-baseline">
   <a href="./File:Foobar.jpg">
     <img src="http://upload.wikimedia.org/wikipedia/commons/3/3a/Foobar.jpg" width="442" height="50" 
        resource="./Image:Foobar.jpg" />
   </a>
   <figcaption>caption</figcaption>
</figure>

[[Image:Foobar.jpg|frameless|500x50px|caption]] (Note 5)

<figure typeof="mw:Image/Frameless">
   <a href="./File:Foobar.jpg">
     <img src="http://upload.wikimedia.org/wikipedia/commons/3/3a/Foobar.jpg" width="442" height="50" 
        resource="./Image:Foobar.jpg" />
   </a>
   <figcaption>caption</figcaption>
</figure>

Note that "border" can be combined with "frameless".
[[Image:Foobar.jpg|frameless|500x50px|border|caption]] (Note 5)

<figure typeof="mw:Image/Frameless" class="mw-image-border">
   <a href="./File:Foobar.jpg">
     <img src="http://upload.wikimedia.org/wikipedia/commons/3/3a/Foobar.jpg" width="442" height="50" 
        resource="./Image:Foobar.jpg" />
   </a>
   <figcaption>caption</figcaption>
</figure>

Manual thumbnails; note that the resource attribute points at the original image, the src attribute points to the manually specified thumbnail image, and the data-mw attribute indicates the resource name of the thumbnail (so it doesn't have to be inferred from the img src):
[[File:Foobar.jpg|thumb=Thumb.png|Title]]

<figure class="mw-default-size" typeof="mw:Image/Thumb" data-mw='{"thumb":"Thumb.png"}'>
  <a href="File:Foobar.jpg">
    <img src="//example.com/images/e/ea/Thumb.png" height="135" width="135"
         resource="./File:Foobar.jpg" />
  </a>
  <figcaption>Title</figcaption>
</figure>

Resizing images with the "scale" option:
[[File:Foobar.jpg|scale=0.5]]

<span typeof="mw:Image" class="mw-default-size" data-mw='{"scale":0.5}'>
 <a href="./File:Foobar.jpg">
  <img resource="./File:Foobar.jpg" src="//upload.wikimedia.org/wikipedia/commons/3/3a/Foobar.jpg"
       width="971" height="110">
 </a>
</span>

Resizing thumbs with the "scale" option (this is a square 220x220px bounding box, see bugzilla:62671):
[[File:Foobar.jpg|thumb|scale=1]]

<figure class="mw-default-size" typeof="mw:Image/Thumb" data-mw='{"scale":1}'>
  <a href="File:Foobar.jpg">
    <img src="//upload.wikimedia.org/wikipedia/commons/thumb/3/3a/Foobar.jpg/220px-Foobar.jpg"
         height="26" width="220" resource="./File:Foobar.jpg" />
  </a>
</figure>

Resizing with the "upright" option (note that this is converted to an appropriate "scale" option, see above):
[[File:Foobar.jpg|thumb|upright=1]]

<figure class="mw-default-size" typeof="mw:Image/Thumb" data-mw='{"scale":1}'>
  <a href="File:Foobar.jpg">
    <img src="//upload.wikimedia.org/wikipedia/commons/thumb/3/3a/Foobar.jpg/220px-Foobar.jpg"
         height="26" width="220" resource="./File:Foobar.jpg" />
  </a>
</figure>

See the enwiki help for all options, and the mediawiki.org documentation for inline/float details.

Note 1: The PHP parser adds a default alt attribute to the <img> tag, with content "Foobar.jpg". Client-side post-processing will need to add this for compatibility. (Parsoid does not add this attribute because it does not correspond to anything in the wikitext.)

Note 2: In this case the PHP parser adds a title attribute to the <a> and an alt attribute to the <img>, both with the value "caption". Note that this is a markup-stripped version of the supplied caption in some cases. Client-side post-processing will need to add these.

Note 3: The PHP parser adds a <a href="./File:Foo.jpg" class="internal sprite details magnify" title="View photo details"></a> element inside the <figure>. Post-processing can add this if needed by a client.

Note 4: The default thumbnail width is a user-specified preference for the PHP parser. Parsoid uses a fixed 220px thumbnail width. The "mw-default-size" class indicates "no size given" and can be used to resize thumbs according to user preferences.

Note 5: In this example, the caption is not visible in PHP output, so there should be a rule like the following in the default stylesheet (IE7+ and other modern browsers):

figure[typeof~="mw:Image/Frameless"] > figcaption,
figure[typeof~="mw:Image"] > figcaption { display: none }

In the PHP parser output, the caption does appear as a title attribute on the <a> and an alt attribute on the <img>; client side post-processing should add these (unless there are existing title and alt attributes, resulting from "title=" and "alt=" properties in the wikitext).

Audio/Video (Proposal)

Status: To be finalized and implemented (See Tracking bug for details and progress.)

The basic <figure> wrapper for audio and video media is identical to that for images, described in the section above, including provisions for inline players and captions. (Note that the PHP implementation does not properly render manual thumbnails or inline players.)

The inner <video> element tracks the elements emitted by the video.js implementation in phab:T100106.

[[File:Folgers.ogv|thumb|50x50px|right|middle|caption]]

<figure typeof="mw:Image/Thumb" class="mw-halign-right mw-valign-middle">
   <span>
     <video poster="https://upload.wikimedia.org/wikipedia/commons/thumb/9/94/Folgers.ogv/352px--Folgers.ogv.jpg"
	controls=""
	preload="none"
	class="video-js"
	width="352"
	height="264"
        data-playerlayout="badge|mini|full"
	data-durationhint="60"
	data-startoffset="0"
	resource="Folgers.ogv"
	data-mwprovider="wikimediacommons">
		<source src="https://upload.wikimedia.org/wikipedia/commons/transcoded/9/94/Folgers.ogv/Folgers.ogv.360p.webm"
		type="video/webm; codecs=&quot;vp8, vorbis&quot;"
		transcodekey="360p.webm"
		data-title="Web streamable WebM (360P)"
		data-shorttitle="WebM 360P"
		data-file-width="352"
		data-file-height="264"
		data-bandwidth="574352"
		data-framerate="29.97002997003" />
		<source src="https://upload.wikimedia.org/wikipedia/commons/9/94/Folgers.ogv"
			type="video/ogg; codecs=&quot;theora, vorbis&quot;"
			data-title="Original Ogg file, 352 × 264 (637 kbps)"
			data-shorttitle="Ogg source"
			data-width="352"
			data-height="264"
			data-bandwidth="636645"
			data-framerate="29.97002997003" />
		[...]

		<track kind="subtitles" data-mwtitle="TimedText:Folgers.ogv.de.srt" data-mwprovider="wikimediacommons" type="text/x-srt" src="https://commons.wikimedia.org/w/index.php?title=TimedText:Folgers.ogv.de.srt&amp;action=raw&amp;ctype=text%2Fx-srt" srclang="de" data-dir="ltr" label="Deutsch (de) subtitles" />
                [...]
		<track kind="subtitles" data-mwtitle="TimedText:Folgers.ogv.en.srt" data-mwprovider="wikimediacommons" type="text/x-srt" src="https://commons.wikimedia.org/w/index.php?title=TimedText:Folgers.ogv.en.srt&amp;action=raw&amp;ctype=text%2Fx-srt" srclang="en" data-dir="ltr" label="English (en) subtitles" />
		Sorry, your browser either has JavaScript disabled or does not have any supported player.<br /> You can <a href="https://upload.wikimedia.org/wikipedia/commons/9/94/Folgers.ogv">download the clip</a> or <a href="https://www.mediawiki.org/wiki/Special:MyLanguage/Extension:TimedMediaHandler/Client_download">download a player</a> to play the clip in your browser.
     </video>
   </span>
   <figcaption>caption</figcaption>
</figure>

Notes:

  • "thumbtime needs to be matched" (CSA: I forget exactly what this means; presumably we need to represent this parameter in the HTML.)
  • Some of the data-* attributes should probably be data-file-* attributes for consistency. As a general rule, attributes derived from inspection of the original media file (original size, bandwidth, framerate, etc) should get data-file- prefixes. Attributes of derived/transcoded media can be plain data- attributes.
  • The wikitext alt option does not exist for video. It can be specified but is not added to the output; the spec says it should not be present, since accessibility for video is provided via captions specified by the <track> element. We probably need to represent this hidden attribute in data-mw.
  • The wikitext link option does not exist for video (it can be specified but is not added to output) -- videos always produce figure > span, never figure > a. We probably need to represent this hidden attribute in data-mw.
  • The "Sorry, your browser either has JavaScript disabled..." text can/should be radically minimized, perhaps eliminated. In particular, HTML5 video support means that "JavaScript disabled" shouldn't prevent the video from being viewed.
  • The <source> and <track> tags are ignored during HTML-to-wikitext serialization; all information encoded in wikitext is represented on the <figure>, <span>, <video>, and <figcaption> elements.

Wiki links

  • The href attribute is UTF-8 (like everything else), with a relative link prefix that always navigates up to the top of the wiki namespace; this matters especially in subpages and in pages containing slashes in the title. Example: './Foo', or (in a subpage) './../Foo'. We percent-encode percent signs and question marks in hrefs to support following links to wiki pages with question marks in their name. On the way in (when posting HTML to Parsoid), we assume href values to be URL-encoded and decode them during serialization. Modified link hrefs without a ./ or ../ prefix are for now assumed to be absolute to the wiki namespace, but will soon be interpreted as relative to the page, to support relative links in other HTML content. After that change, the equivalent of an absolute wikilink [[Foo]] would need to return an href="/Foo" instead.

[[Main Page|alternate linked content]]

<a rel="mw:WikiLink" href="./Main_Page">alternate linked content</a>


[[Main Page]]

<a rel="mw:WikiLink" href="./Main_Page">Main Page</a>

Link with tail: [[Potato]]es

<a rel="mw:WikiLink" href="./Potato">Potatoes</a>
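The href convention described at the start of this section can be sketched as below. This is a simplified illustration: it assumes that spaces become underscores (as in the Main_Page examples), that one '../' is added per slash in the current page title, and it ignores other title normalization. The function name title_to_href is hypothetical.

```python
def title_to_href(target, current_page):
    """Build a relative href for a wiki link from the page current_page."""
    # Climb back to the top of the wiki namespace from a subpage.
    prefix = "./" + "../" * current_page.count("/")
    # Percent signs must be encoded first so the other escapes are not double-encoded.
    encoded = target.replace(" ", "_").replace("%", "%25").replace("?", "%3F")
    return prefix + encoded
```

For example, title_to_href("Foo", "User:Example/Draft") yields './../Foo', matching the subpage form given above.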

Category links

[[Category:Foo]]

<link rel="mw:PageProp/Category" href="./Category:Foo">

[[Category:Foo|Bar baz#quux]]

<link rel="mw:PageProp/Category" href="./Category:Foo#Bar baz%23quux">
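In the second example the sort key travels in the fragment of the href, with a literal '#' encoded as %23. A minimal sketch of that mapping follows; encoding '%' as %25 is an assumption made here so that the mapping can be reversed, and both function names are hypothetical.

```python
def category_href(category, sort_key=None):
    """Build the href of a mw:PageProp/Category link."""
    href = "./Category:" + category.replace(" ", "_")
    if sort_key is not None:
        href += "#" + sort_key.replace("%", "%25").replace("#", "%23")
    return href


def split_category_href(href):
    """Return (target, sort_key); sort_key is None when there is no fragment."""
    target, separator, fragment = href.partition("#")
    if not separator:
        return target, None
    return target, fragment.replace("%23", "#").replace("%25", "%")
```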

Language links

[[en:Foo]]

<link rel="mw:PageProp/Language" href="http://en.wikipedia.org/wiki/Foo">

Interwiki non-language links

[[:en:Foo]]

<a rel="mw:ExtLink" href="http://en.wikipedia.org/wiki/Foo">en:Foo</a>

External links

Autolinked URLs

http://example.com

<a rel="mw:ExtLink" href="http://example.com">http://example.com</a>

Numbered external link

[http://example.com]

<a rel="mw:ExtLink" href="http://example.com"></a>

Named external link

[http://example.com Link content]

<a rel="mw:ExtLink" href="http://example.com">Link content</a>

Magic links

ISBN link

ISBN 978-1413304541

<a rel="mw:WikiLink"
   href="./Special:BookSources/9781413304541">
  ISBN 978-1413304541
</a>

RFC link

RFC 1945

<a rel="mw:ExtLink" 
   href="http://tools.ietf.org/html/rfc1945">
  RFC 1945
</a>

PMID link

PMID 20610307

<a rel="mw:ExtLink"
   href="//www.ncbi.nlm.nih.gov/pubmed/20610307?dopt=Abstract">
  PMID 20610307
</a>
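The three magic link forms above map to a rel value and an href as in the following sketch. The URL patterns are copied from the examples in this section; the matching rules (a single space after the keyword, hyphens stripped from ISBNs) are simplifying assumptions, and the function name magic_link is hypothetical.

```python
import re


def magic_link(text):
    """Return (rel, href) for an ISBN, RFC or PMID magic link, else None."""
    match = re.fullmatch(r"(ISBN|RFC|PMID) ([0-9Xx-]+)", text.strip())
    if match is None:
        return None
    kind, number = match.groups()
    if kind == "ISBN":
        # ISBNs resolve to a local special page, so they are wiki links.
        return "mw:WikiLink", "./Special:BookSources/" + number.replace("-", "")
    if not number.isdigit():
        return None
    if kind == "RFC":
        return "mw:ExtLink", "http://tools.ietf.org/html/rfc" + number
    return "mw:ExtLink", "//www.ncbi.nlm.nih.gov/pubmed/" + number + "?dopt=Abstract"
```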


Nowiki blocks

There are two options to handle nowiki editing:

  1. Strip the tags from the DOM and let the serializer add those that are needed after each edit
  2. Keep them in the DOM for more accurate round-tripping of manually created nowiki blocks, and prevent non-text content from being entered into these blocks in the editor (TODO)

We picked option 2 for now. The nowiki content remains editable. If the content is modified in a way that makes the nowiki unnecessary, Parsoid can remove the wrapper in the serializer.

<nowiki>[[foo]]</nowiki>

<span typeof="mw:Nowiki">[[foo]]</span>

HTML entities

œ

<span typeof="mw:Entity">œ</span>

Behavior switches

__NOTOC__

<meta property="mw:PageProp/notoc">

__FORCETOC__

<meta property="mw:PageProp/forcetoc">

__NEWSECTIONLINK__

<meta property="mw:PageProp/newsectionlink">

__NONEWSECTIONLINK__

<meta property="mw:PageProp/nonewsectionlink">

__NOGALLERY__

<meta property="mw:PageProp/nogallery">

__HIDDENCAT__

<meta property="mw:PageProp/hiddencat">

__NOCONTENTCONVERT__

<meta property="mw:PageProp/nocontentconvert">

__NOCC__

<meta property="mw:PageProp/nocontentconvert">

__NOTITLECONVERT__

<meta property="mw:PageProp/notitleconvert">

__NOTC__

<meta property="mw:PageProp/notitleconvert">

__NOEDITSECTION__

<meta property="mw:PageProp/noeditsection">

__NOINDEX__

<meta property="mw:PageProp/noindex">

__INDEX__

<meta property="mw:PageProp/index">

__STATICREDIRECT__

<meta property="mw:PageProp/staticredirect">
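The examples above follow a regular pattern: the property name is the lowercased magic word, and the short forms __NOCC__ and __NOTC__ map to the same property as their long forms. A small sketch of that mapping, with hypothetical names:

```python
# Short magic words that share a property with their long form (from the list above).
ALIASES = {"nocc": "nocontentconvert", "notc": "notitleconvert"}


def behavior_switch_meta(magic_word):
    """Return the <meta> element for a behavior switch such as __NOTOC__."""
    name = magic_word.strip("_").lower()
    name = ALIASES.get(name, name)
    return '<meta property="mw:PageProp/%s">' % name
```

Because the aliases collapse onto one property, the original spelling cannot be recovered from the property alone.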

Category default sort key

See bug 46470. Status: ready for implementation.

{{DEFAULTSORT:foo}}

<meta property="mw:PageProp/categorydefaultsort" content="foo">

Displaytitle

{{DISPLAYTITLE:foo}}

<meta property="mw:PageProp/displaytitle" content="foo">

Redirects

#REDIRECT [[foo]]

<link rel="mw:PageProp/redirect" href="./Foo">

#REDIRECT [[:Category:Foo]]

<link rel="mw:PageProp/redirect" href="./Category:Foo">

#REDIRECT [[Category:Foo]]

<link rel="mw:PageProp/redirect" href="./Category:Foo">

(T104502: This no longer creates a category.)

#REDIRECT [[meatball:Foo]]

<link rel="mw:PageProp/redirect" href="http://www.usemod.com/cgi-bin/mb.pl?Foo"/>

Note that interwiki links generate redirect tags; the client is responsible for not doing an HTTP 301 or 302 redirect to an external site.

#REDIRECT [[:en:File:Wiki.png]]

<link rel="mw:PageProp/redirect" href="//en.wikipedia.org/wiki/File:Wiki.png"/>

Note that, unlike the PHP parser, using language links still generates correct redirect tags in Parsoid. The client is again responsible for not doing an HTTP redirect to an external wiki.
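Since the client is responsible for not following redirects to external sites, it needs some way to tell local targets from external ones. One possible check, assuming that local targets always carry the relative './' or '../' prefix used for wiki links and that everything else is external, is sketched here with a hypothetical function name:

```python
def is_local_redirect(href):
    """True if a mw:PageProp/redirect target stays on this wiki."""
    return href.startswith("./") or href.startswith("../")
```

A client could issue an HTTP redirect only when this returns True and render a plain link otherwise.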

Transclusion content

Many transclusion parameters contain arbitrary wikitext, styles, template names and other non-semantic / DOM strings. We also have very little information about which attributes are semantic and which are presentational. So for now, we will expose all attributes in the "wt" (wikitext) format:

{{foo|unused value|paramname=used value}}

<body prefix="mw: http://mediawiki.org/rdf/
      mwns10: http://en.wikipedia.org/wiki/Template%3A">
  
<span typeof="mw:Transclusion" about="#mwt1"
  data-mw='{"parts": [{"template":{"target":{"wt":"foo","href":"./Template:Foo"},"params":{"1":{"wt":"unused value"},"paramname":{"wt":"used value"}},"i":0}}]}'>
  Some text content
</span>
<table about="#mwt1">
  <tr>
    <td>used value</td>
  </tr>
</table>
</body>

The i property is used to associate additional information with each transclusion or extension fragment. This lets us support inline editing of things like infobox parameters in the future without changes to the JSON data structure.

Parameter names are represented by their index, if not explicitly named, or by the name that will be used when replacing them. If the normalized parameter name differs from the actual parameter name in the text, a key.wt attribute is used to keep the name as it appears in the text. For example:

{{foo|param<!--comment-->name=value}}

<body prefix="mw: http://mediawiki.org/rdf/
      mwns10: http://en.wikipedia.org/wiki/Template%3A">
  
<span typeof="mw:Transclusion" about="#mwt1"
  data-mw='{"parts": [{"template":{"target":{"wt":"foo","href":"./Template:Foo"},"params":{"paramname":{"wt":"value", "key":{"wt":"param&lt;!--comment-->name"}}},"i":0}}]}'>
  Some text content
</span>
<table about="#mwt1">
  <tr>
    <td>value</td>
  </tr>
</table>
</body>

Compound content blocks that include output from several transclusions, like this football table, are represented by interspersing wikitext strings with transclusion information in the data-mw.parts array:

{{table-start}}
{{cell|unused value|param=used value}}
|-
{{cell|unused value|param=used value}}
|-
|<math>1+1</math>
|}

<span typeof="mw:Transclusion" about="#mwt1"
  data-mw='{"parts":
[
  {"template":{"target":{"wt":"Template:Table-start"}},"i":0},
  "\n",
  {"template":{"target":{"wt":"Template:Cell"},"params":{"1":{"wt":"unused value"},"param":{"wt":"used value"}}},"i":1},
  "\n|-\n",
  {"template":{"target":{"wt":"Template:Cell"},"params":{"1":{"wt":"unused value"},"param":{"wt":"used value"}}},"i":2},
  "\n|-\n|",
  {"extension":{"name":"math","attrs": {}, "body":{"extsrc":"1+1","mathml":"<maybelater/>"}},"i":3},
  "|}"
]}'>
  Some text content
</span>
<table about="#mwt1">
  <tr>
    <td>used value</td>
  </tr>
</table>


Editing support for the interspersed wikitext is difficult to implement on the server side, as those wikitext edits need to be restricted in their effect to the original DOM range. A potential solution to this could be to wrap the multi-template compound block into a template hook that expands its content to a well-balanced DOM structure. Arbitrary wikitext edits within this tag would still only affect the original DOM range, both in Parsoid and the PHP parser. This is lower priority though, so for now the interspersed wikitext will be read-only.
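To illustrate how the data-mw.parts array relates to the source, the sketch below turns template parts and interspersed strings back into wikitext. It is not Parsoid's serializer: it assumes positional parameters are stored in order under numeric names, it ignores extension parts, and the function names are hypothetical.

```python
import json


def template_to_wikitext(template):
    """Rebuild {{target|params}} from one "template" part."""
    pieces = [template["target"]["wt"]]
    for name, value in template.get("params", {}).items():
        if name.isdigit() and "key" not in value:
            # Assumed to be a positional parameter, so the name is omitted.
            pieces.append(value["wt"])
        else:
            # key.wt preserves the parameter name exactly as written in the source.
            key = value.get("key", {}).get("wt", name)
            pieces.append(key + "=" + value["wt"])
    return "{{" + "|".join(pieces) + "}}"


def parts_to_wikitext(data_mw):
    """Concatenate template parts and literal wikitext strings."""
    output = []
    for part in data_mw["parts"]:
        if isinstance(part, str):
            output.append(part)
        elif "template" in part:
            output.append(template_to_wikitext(part["template"]))
        else:
            raise ValueError("part type not handled in this sketch")
    return "".join(output)


example = json.loads(
    '{"parts": [{"template": {"target": {"wt": "foo"}, "params": '
    '{"1": {"wt": "unused value"}, "paramname": {"wt": "used value"}}, "i": 0}}]}'
)
print(parts_to_wikitext(example))  # {{foo|unused value|paramname=used value}}
```

Treating every numeric name as positional is lossy: a parameter written explicitly as 1=value would come back without its name, so a real serializer needs more information than this sketch uses.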

Parameter Substitution at the top-level

This section specifies wrapping for parameter uses in the top-level namespace where all parameter substitutions evaluate to a null value.

{{{foo|''some italic'' plain text '''some bold'''}}}

<body prefix="mw: http://mediawiki.org/rdf/ mwns10: http://en.wikipedia.org/wiki/">

<p typeof="mw:Param" about="#mwt0">
  <i>some italic</i> plain text <b>some bold</b>
</p>

Transclusion-affected attributes

Status: Implemented. See bug 52913.

This is the representation of attributes in links, tables, and HTML tags whose keys and/or values are fully or partially generated by transclusions. When only attributes are affected, the element is assigned an "mw:ExpandedAttrs" typeof attribute and the data-mw JSON object provides additional specific information about the keys or values that are fully or partially generated by templates. If other parts of the content are also transclusion-affected, the element is marked up as a general transclusion instead.

There are plausible use cases where part of an attribute value is generated by a template (for example, the color in the background-color of a style attribute), but fewer for attribute keys. This spec also assumes that a template can only generate one attribute rather than multiple attributes.

data-mw = {
    "attribs": [
        [{"txt": "href","html": "..." }, {"html": "..."}],
        ["sortKey", { "html": "..." }],
        ["1", { "html": "..." }],
        ["2", { "html": "..." }]
    ]
}

A few examples are worked out below.

Example 1: [[F{{echo|o}}o|bar]]

<a href="./Foo"
  typeof="mw:ExpandedAttrs"
  rel="mw:WikiLink"
  about="#mwt2"
  data-mw='{
    "attribs": [
       [
          "href",
          { 
             "html": "F<span about=\"#mwt1\" typeof=\"mw:Transclusion\" 
                data-mw=\'{\"target\":{\"wt\":\"echo\",\"href\":\"./Template:Echo\"},\"params\":{\"1\":{\"wt\":\"o\"}},\"i\":0}\'>o</span>o" 
          }
       ] 
    ] 
  }'>
bar</a>

Example 2: <div style="{{echo|color:red;}}">...</div>

<div style="color:red;"
  typeof="mw:ExpandedAttrs"
  about="#mwt2"
  data-mw="{
    'attribs': [
      [
        'style',
        { 'html': '<span about=\"#mwt1\" typeof=\"mw:Transclusion\"
                data-mw=\'{\"target\":{\"wt\":\"echo\",\"href\":\"./Template:Echo\"},\"params\":{\"1\":{\"wt\":\"color:red;\"}},\"i\":0}\'>color:red;</span>' }
      ]
    ]
  }">
...
</div>

Example 3: [[File:foo.jpg|{{echo|thumb}}|{{echo|160px}}]]

<figure
  typeof="mw:Image/Thumb mw:ExpandedAttrs"
  about="#mwt2"
  data-mw="{
    'attribs': [
      [ '1',
        { 'html': '<span about=\"#mwt1\" typeof=\"mw:Transclusion\"
                data-mw=\'{\"target\":{\"wt\":\"echo\",\"href\":\"./Template:Echo\"},\"params\":{\"1\":{\"wt\":\"thumb\"}},\"i\":0}\'>thumb</span>' }
      ],
      [ '2',
        { 'html': '<span about=\"#mwt1\" typeof=\"mw:Transclusion\"
                data-mw=\'{\"target\":{\"wt\":\"echo\",\"href\":\"./Template:Echo\"},\"params\":{\"1\":{\"wt\":\"160px\"}},\"i\":0}\'>160px</span>' }
      ]
    ]
  }" ... >
    ... Rest of image HTML here ...
</figure>
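A client can use the "attribs" array to find out which attributes it should not edit directly. The following sketch assumes the data-mw structure shown at the start of this section, where each entry is a [key, value] pair and a key is either a plain string or an object with "txt" and "html" members; the function name is hypothetical.

```python
def expanded_attributes(data_mw):
    """Return (attribute name, key_is_generated) for each affected attribute."""
    result = []
    for key, _value in data_mw.get("attribs", []):
        if isinstance(key, dict):
            # The key itself was (partially) produced by a transclusion.
            result.append((key["txt"], True))
        else:
            result.append((key, False))
    return result
```

For the structure given above, this lists href (with a generated key), sortKey, and the positional entries "1" and "2".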


Extension content

<ref group='x' name='y'>{{Cite|foo|bar=baz}}</ref>

<span id="cite_ref-0-0" class="reference" about="#mwt1" typeof="mw:Extension/Ref"
  data-mw='{"name":"ref",
            "attrs": {
               "group": "x",
               "name": "y"
            },
            "body":{
               "html":"&lt;span typeof=\"mw:Transclusion\" about=\"#mw-t2\" id=\"mw-t2\"
                           data-mw=&apos;{ \"parts\": [ { \"template\": {
                               \"target\": {\"wt\":\"Cite\"},
                               \"params\": {\"1\":{\"wt\":\"foo\"},\"bar\":{\"wt\":\"baz\"}}
                           } } ] }&apos; &gt;
                       The citation content
                       &lt;/span&gt;"
                }
           }'>
  <a data-type="hashlink" href="#cite_note-0">[1]</a>
</span>

<math>1+1</math>

<span about="#mwt1" typeof="mw:Extension/Math"
  data-mw='{"name": "math", "attrs": {}, "body": {"extsrc":"1+1"}}'>
  1 + 1
</span>

The data-mw attribute is a JSON object. It is meant as an extensible public interface, so more top-level members can be added. The top-level structure depends on the content type, with the main types being transclusions and extensions. See also the transclusion content section.

At present, Parsoid has few native extension handlers. As a fallback, it places the raw extension body text in body.extsrc for clients to edit in a raw-text editor. See the implemented extensions below for details on editing their content.
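For the raw-text fallback, the extension tag can be rebuilt from the name, attrs and body.extsrc members of data-mw. The sketch below is illustrative only: the attribute quoting style and the self-closing form for a missing body are assumptions, and the function name is hypothetical.

```python
def extension_wikitext(data_mw):
    """Rebuild <name attrs>source</name> from an extension's data-mw object."""
    name = data_mw["name"]
    attrs = "".join(" %s='%s'" % (key, value)
                    for key, value in data_mw.get("attrs", {}).items())
    body = data_mw.get("body")
    if body is None:
        # No body at all, for example a reused named ref.
        return "<%s%s />" % (name, attrs)
    if "extsrc" not in body:
        raise ValueError("body has no raw source; it needs HTML serialization")
    return "<%s%s>%s</%s>" % (name, attrs, body["extsrc"], name)
```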

Ref and References

First one <ref>One</ref>
Second one <ref>Two <p>p1</p> <p>p2</p> </ref>
Named one <ref name='three'>Three</ref>
Reused <ref name='three' />
Reused again <ref name='three' />

<references />

The HTML output for this wikitext is shown below. Content with block content is wrapped in a div and content with inline content is wrapped in a span.

As of February 2015, Parsoid provides the element id of the reference content via the data-mw.body.id property, rather than copying the HTML DOM into the data-mw.body.html property as it did earlier. Both formats are considered valid, and Parsoid accepts both when serializing HTML to wikitext.

If data-mw.body.id is specified, it is the client's responsibility to make sure that the element id is present in the DOM. If both data-mw.body.html and data-mw.body.id are specified, Parsoid uses the html property and ignores the id property.

<p>
First one
<span about="#mwt2" class="reference" id="cite_ref-1" rel="dc:references" typeof="mw:Extension/Ref" 
    data-mw='{"name": "ref", "attrs": {}, "body":{"id":"mw-reference-text-cite_note-1"}}'>
  <a href="#cite_note-1">[1]</a>
</span>
Second one
<span about="#mwt4" class="reference" id="cite_ref-2" rel="dc:references" typeof="mw:Extension/Ref" 
    data-mw='{"name": "ref", "attrs": {}, "body":{"id":"mw-reference-text-cite_note-2"}}'>
  <a href="#cite_note-2">[2]</a>
</span>
Named one
<span about="#mwt6" class="reference" id="cite_ref-three_3-0" rel="dc:references" typeof="mw:Extension/Ref"
    data-mw='{"name": "ref", "attrs": {"name": "three"}, "body":{"id":"mw-reference-text-cite_note-three-3"}}'>
  <a href="#cite_note-three-3">[3]</a>
</span>
Reused
<span about="#mwt8" class="reference" id="cite_ref-three_3-1" rel="dc:references" typeof="mw:Extension/Ref" 
    data-mw='{"name":"ref", "attrs":{"name":"three"}}'>
  <a href="#cite_note-three-3">[3]</a>
</span>
Reused again
<span about="#mwt10" class="reference" id="cite_ref-three_3-2" rel="dc:references" typeof="mw:Extension/Ref" 
    data-mw='{"name":"ref","attrs":{"name":"three"}}'>
  <a href="#cite_note-three-3">[3]</a>
</span>
</p>

<ol about="#mwt11" typeof="mw:Extension/References" data-mw='{"name":"references","attrs":{}}'>
  <li about="#cite_note-1" id="cite_note-1">
    <span rel="mw:referencedBy"><a href="#cite_ref-1"></a>
    </span> <span id="mw-reference-text-cite_note-1" class="mw-reference-text" data-parsoid="{}">
    One</span>
  </li>
  <li about="#cite_note-2" id="cite_note-2">
    <span rel="mw:referencedBy"><a href="#cite_ref-2"></a>
    </span> <span id="mw-reference-text-cite_note-2" class="mw-reference-text" data-parsoid="{}">
    Two <p data-parsoid='{"stx":"html","dsr":[45,54,3,4]}'>p1</p> <p data-parsoid='{"stx":"html","dsr":[55,64,3,4]}'>p2</p> </span>
  </li>
  <li about="#cite_note-three-3" id="cite_note-three-3">
    <span rel="mw:referencedBy"><a href="#cite_ref-three_3-0">3.0</a> <a href="#cite_ref-three_3-1">3.1</a> <a href="#cite_ref-three_3-2">3.2</a>
    </span> <span id="mw-reference-text-cite_note-three-3" class="mw-reference-text" data-parsoid="{}">Three</span>
  </li>
</ol>

[Figure: RDF graph of the markup above, rendered with http://rdfa.info/play/]
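
Since the client is responsible for keeping every data-mw.body.id target present in the DOM, a check along the following lines could be run over markup like the reference example before handing it back to Parsoid. This is a sketch using Python's standard `html.parser`; the class and function names are hypothetical.

```python
import json
from html.parser import HTMLParser


class RefIdCollector(HTMLParser):
    """Collect all element ids and all ids referenced via data-mw.body.id."""

    def __init__(self):
        super().__init__()
        self.element_ids = set()
        self.referenced_ids = set()

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if attributes.get("id"):
            self.element_ids.add(attributes["id"])
        data_mw = attributes.get("data-mw")
        if data_mw:
            body = json.loads(data_mw).get("body") or {}
            # Only the id format needs a matching element; html is inline
            # and takes precedence when both are present.
            if "id" in body and "html" not in body:
                self.referenced_ids.add(body["id"])


def missing_reference_ids(html):
    """Return the set of data-mw.body.id values with no matching element."""
    collector = RefIdCollector()
    collector.feed(html)
    collector.close()
    return collector.referenced_ids - collector.element_ids
```

An empty result means every referenced body element is present.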

noinclude / includeonly / onlyinclude

Not yet implemented, tracked in bugzilla:40305. We only care about these in the actual page context, not in transcluded pages / templates.

foo<noinclude>bar</noinclude>baz

<body prefix="mw: http://mediawiki.org/rdf/
      mwt0: http://en.wikipedia.org/wiki/Template%3A">
<p>foo<meta typeof="mw:NoInclude">bar<meta typeof="mw:NoInclude/End">baz</p>
</body>

foo<onlyinclude>bar</onlyinclude>baz

<body prefix="mw: http://mediawiki.org/rdf/
      mwt0: http://en.wikipedia.org/wiki/Template%3A">
<p>foo<meta typeof="mw:OnlyInclude">bar<meta typeof="mw:OnlyInclude/End">baz</p>
</body>


foo<includeonly>bar</includeonly>baz

<body prefix="mw: http://mediawiki.org/rdf/
      mwt0: http://en.wikipedia.org/wiki/Template%3A">
<p>foo<meta typeof="mw:IncludeOnly">baz</p>
</body>
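
A client could locate these regions by pairing each opening meta marker with its matching /End marker, as in the sketch below. Since the markup is not yet implemented, the marker names are taken from the examples above and may change; the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class IncludeMarkerParser(HTMLParser):
    """Collect text enclosed by <meta typeof="mw:X"> ... <meta typeof="mw:X/End">."""

    PAIRED = {"mw:NoInclude", "mw:OnlyInclude"}

    def __init__(self):
        super().__init__()
        self.open_type = None  # marker type currently open, if any
        self.buffer = []       # text seen since the opening marker
        self.ranges = []       # (marker type, enclosed text) pairs
        self.standalone = []   # unpaired markers such as mw:IncludeOnly

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        typeof = dict(attrs).get("typeof") or ""
        if typeof in self.PAIRED:
            self.open_type, self.buffer = typeof, []
        elif self.open_type and typeof == self.open_type + "/End":
            self.ranges.append((self.open_type, "".join(self.buffer)))
            self.open_type = None
        elif typeof == "mw:IncludeOnly":
            # In page context the includeonly content is dropped, so the
            # example shows only a single marker with no /End partner.
            self.standalone.append(typeof)

    def handle_data(self, data):
        if self.open_type:
            self.buffer.append(data)


def include_ranges(html):
    """Return (paired ranges, standalone markers) found in the markup."""
    parser = IncludeMarkerParser()
    parser.feed(html)
    parser.close()
    return parser.ranges, parser.standalone
```

The sketch only gathers text nodes; a real client would more likely record DOM node ranges so that nested elements are preserved.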

Language conversion blocks

Status: provisional / strawman. See bug 41716 and /Language conversion blocks.

Error handling

See bug 48900.

  • For API errors caused by a nonexistent image, data-mw.errors.key is set to "missing-image".
  • For API errors while fetching image info, data-mw.errors.key is set to "api-error", and data-mw.errors.message carries more information about the specific error.
  • For image wikitext that specifies a manual thumbnail which does not exist, data-mw.errors.key is set to "missing-thumbnail" and data-mw.errors.message is set to "This thumbnail does not exist.".

Ex: [[File:Nonexisting.jpg|thumb|caption content]]

<figure typeof="mw:Image/Thumb mw:Error" 
  data-mw='{"errors":[{"key":"missing-image"}]}'
  class="mw-halign-right mw-default-size">
   <a href="./File:Nonexisting.jpg">
     <img alt="Nonexisting.jpg"
        resource="./File:Nonexisting.jpg" />
   </a>
   <figcaption>caption content</figcaption>
</figure>

Ex: [[File:Blah.jpg|thumb|caption content]]

<figure typeof="mw:Image/Thumb mw:Error" 
  data-mw='{"errors":[{"key":"api-error", "message": "... whatever the API returns here ..."}]}'
  class="mw-halign-right mw-default-size">
....
</figure>
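
A client might read these error records as in the following sketch. It assumes, as in the examples above, that data-mw.errors is a JSON array of objects with a key and an optional message; the function name `media_errors` is hypothetical.

```python
import json


def media_errors(typeof, data_mw_json):
    """Return (key, message) pairs for a media element, or [] if it has no error.

    `typeof` is the element's typeof attribute value and `data_mw_json`
    is its data-mw attribute value.
    """
    if "mw:Error" not in typeof.split():
        return []
    errors = json.loads(data_mw_json or "{}").get("errors", [])
    # message is optional: "missing-image" carries only a key in the example.
    return [(error["key"], error.get("message")) for error in errors]
```

For the first figure above, this yields a single pair with the key "missing-image" and no message.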

Recent changes

Some cleanup is needed (from bug 53432):

  1. mw:WikiLink/Interwiki became mw:ExtLink, and we automatically detect interwiki prefixes in new/modified mw:ExtLink content.
  2. mw:ExtLink/* became just mw:ExtLink. The information can mostly be extracted by matching the href prefix. Client-side rendering of numbered external links can be handled with CSS as discussed in bug 53505.
  3. mw:WikiLink/Category became mw:PageProp/Category, as these are not really links: they don't render at all in the page, don't accept a caption, etc.
  4. mw:WikiLink/Language became mw:PageProp/Language, for the same reason as categories.
  5. Template-affected attributes were moved to data-mw (see #Transclusion-affected attributes).
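
For a client that still holds markup in the older vocabulary, the renames in items 1 to 4 could be applied with a small lookup like the sketch below. The names `RENAMED_TYPES` and `upgrade_link_type` are hypothetical, and the sketch covers only the renames, not the move of template-affected attributes.

```python
# Old value -> new value, taken from items 1, 3 and 4 of the list above.
RENAMED_TYPES = {
    "mw:WikiLink/Interwiki": "mw:ExtLink",
    "mw:WikiLink/Category": "mw:PageProp/Category",
    "mw:WikiLink/Language": "mw:PageProp/Language",
}


def upgrade_link_type(old_type):
    """Map a pre-change link type to its current name; others pass through."""
    if old_type in RENAMED_TYPES:
        return RENAMED_TYPES[old_type]
    if old_type.startswith("mw:ExtLink/"):
        # Item 2: all mw:ExtLink/* subtypes collapsed into plain mw:ExtLink.
        return "mw:ExtLink"
    return old_type
```

Values that were not renamed, such as plain mw:WikiLink, are returned unchanged.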

In https://gerrit.wikimedia.org/r/232284 the following was changed:

  1. ISBNs became mw:WikiLink instead of mw:ExtLink. More discussion at https://phabricator.wikimedia.org/T63558#1217861

Upcoming changes

ID attributes on all elements

We will also assign ID attributes to all elements and use them to associate external metadata with those elements (see /Element IDs). We will eventually move data-parsoid (private, so this should not matter to users) and likely also data-mw (public) from the DOM into JSON objects keyed on the ID.

TODO

The following constructs still need an RDFa markup definition. Initially, they will only be marked with typeof="mw:Placeholder" for simple read-only round-tripping.

  • template parameter references (implemented, but not tested much)
  • __TOC__
  • <section> extension as HTML5 sections (see bug 47936).