Parsoid/API/v2

The v2 API is now deprecated. These incomplete docs have been archived from the main API docs page.

Parsoid gained a "v2" API in December 2014 to act as middleware behind RESTBase entry points. The spec for it is at docs/specs/apiv2.yaml.

The main addition is a new format, pagebundle, which responds to a request with a JSON structure containing two parts: a data-parsoid key holding the data needed for clean roundtripping between HTML and wikitext, and an html key holding slimmer HTML of {title} without data-parsoid attributes.

For example, compare the value of html.body in

http://parsoid-lb.eqiad.wikimedia.org/v2/en.wikipedia.org/pagebundle/Siskiwit_Lake/650380826

with the HTML of

http://parsoid-lb.eqiad.wikimedia.org/enwiki/html/Siskiwit_Lake?oldid=650380826

Path Parameters

The entry points are of the form /v2/{domain}/{format}/{title}/{revision}

{domain}
The hostname of the wiki. This replaces the wiki database {prefix} used in v1 entry points.
{format}
Content format returned by the API.
{title}
Page title.
{revision}
Page revision number.
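
As a rough illustration of the path layout above, the following sketch splits an entry-point path into its four named parameters. The helper name parse_v2_path is hypothetical and not part of Parsoid; the sketch also assumes that the title is percent-encoded so that it contains no literal slashes, and that {revision} may be omitted.

```python
from urllib.parse import unquote


def parse_v2_path(path):
    """Split a /v2/{domain}/{format}/{title}/{revision} path into its parts.

    Hypothetical helper for illustration only. Assumes the title is
    percent-encoded (no literal "/") and that the revision is optional.
    """
    parts = path.strip("/").split("/")
    if len(parts) not in (4, 5) or parts[0] != "v2":
        raise ValueError(f"not a v2 entry point: {path!r}")
    domain, fmt, title = parts[1:4]
    # The revision is numeric when present; None means "not specified".
    revision = int(parts[4]) if len(parts) == 5 else None
    return {
        "domain": domain,
        "format": fmt,
        "title": unquote(title),
        "revision": revision,
    }


parsed = parse_v2_path(
    "/v2/en.wikipedia.org/pagebundle/Siskiwit_Lake/650380826"
)
```

Here parsed["domain"] is the wiki hostname en.wikipedia.org, not a database prefix such as enwiki.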

Formats

Content formats returned by the API.

html
Parsoid's XHTML5 + RDFa output, which includes inlined data-parsoid attributes. Content type is text/html.
pagebundle
A JSON blob containing the above html with the data-parsoid attributes split out and ids added to each node. Content type is application/json.
wt
Wikitext. Content type is text/plain.
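
The format list above amounts to a small lookup table from format name to content type. A minimal sketch of that table follows; the names CONTENT_TYPES and content_type_for are hypothetical, chosen here for illustration.

```python
# Format name -> content type, as listed in the Formats section.
CONTENT_TYPES = {
    "html": "text/html",
    "pagebundle": "application/json",
    "wt": "text/plain",
}


def content_type_for(fmt):
    """Return the content type for a v2 format, or raise ValueError."""
    try:
        return CONTENT_TYPES[fmt]
    except KeyError:
        raise ValueError(f"unknown v2 format: {fmt!r}") from None
```

A client could use such a table to decide whether to parse a response as JSON (pagebundle) or treat it as text (html, wt).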

Payloads

The payloads should be delivered as JSON.

wikitext original

previous

For wt2html conversion

wikitext
The wikitext to convert.
body (optional)
A boolean flag. If true, returns only the <body> element of the result.
Note that this differs subtly from the RESTBase "bodyOnly" option, which returns the *children* of the <body> element.
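
A minimal sketch of a wt2html payload, serialized as JSON as the Payloads section requires. It assumes that wikitext is passed as a plain string and that body is omitted entirely when not requested; neither detail is confirmed by these docs. The function name make_wt2html_payload is hypothetical.

```python
import json


def make_wt2html_payload(wikitext, body=None):
    """Build a JSON payload for a wt2html conversion.

    Assumption: "wikitext" is a plain string, and the optional "body"
    flag is left out of the payload unless the caller sets it.
    """
    payload = {"wikitext": wikitext}
    if body is not None:
        payload["body"] = bool(body)
    return json.dumps(payload)


# Ask for only the <body> element of the converted output.
request_body = make_wt2html_payload("''Hello'' world", body=True)
```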

Keys

data-parsoid 
Internal data for clean roundtripping between HTML and wikitext.
html
Parsoid's XHTML5 + RDFa output.
wikitext
Everyone's favourite markup language.

Each of these should contain:

{
  "headers": { "content-type": "..." },
  "body": "..."
}
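
To make the shape above concrete, here is a small sketch that builds and checks one such headers/body entry. The helpers make_part and is_valid_part are hypothetical, and the content type used in the example is taken from the Formats section rather than from a real response.

```python
def make_part(content_type, body):
    """Wrap a body in the headers/body structure shown above."""
    return {"headers": {"content-type": content_type}, "body": body}


def is_valid_part(part):
    """Check that a value has a content-type header and a body key."""
    return (
        isinstance(part, dict)
        and "body" in part
        and isinstance(part.get("headers"), dict)
        and "content-type" in part["headers"]
    )


# Illustrative value for the "html" key; the markup is made up.
html_part = make_part("text/html", "<body><p>Hello</p></body>")
```

With this shape, the value that the earlier example refers to as html.body is html_part["body"].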

wt2html

GET requests

The acceptable format parameters for GET requests are html and pagebundle.

GET /v2/{domain}/{format}/{title}

Redirects to the latest revision.

GET /v2/{domain}/{format}/{title}/{revision}

Returns {format} for a given revision.
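
The two GET forms above differ only in whether {revision} is present. A minimal sketch of building either path follows, rejecting formats that GET does not accept. The function name build_get_path is hypothetical, and percent-encoding the title is an assumption of this sketch.

```python
from urllib.parse import quote

# Formats accepted for GET requests, per the text above.
GET_FORMATS = ("html", "pagebundle")


def build_get_path(domain, fmt, title, revision=None):
    """Build a v2 GET path, with or without a revision.

    Without a revision, the docs say the server redirects to the
    latest revision.
    """
    if fmt not in GET_FORMATS:
        raise ValueError(f"format not accepted for GET: {fmt!r}")
    # Assumption: the title is percent-encoded, including any "/".
    path = f"/v2/{domain}/{fmt}/{quote(title, safe='')}"
    if revision is not None:
        path += f"/{revision}"
    return path


latest = build_get_path("en.wikipedia.org", "html", "Siskiwit_Lake")
pinned = build_get_path(
    "en.wikipedia.org", "pagebundle", "Siskiwit_Lake", 650380826
)
```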

POST requests

The acceptable format parameters for POST requests are html and pagebundle.

html2wt

POST requests