User:GWicke/Notes/MW service URI layout considerations

Deterministic URIs for caching vs. page sub-resources

Deterministic URIs let us cache responses and purge them reliably. REST-style paths are generally deterministic, but using them for page-related sub-resources as well is complicated by the literal slashes that appear in page names and therefore in public URIs.

We could percent-encode slashes in the page name.

/pages/Foo%2FBar/rev/latest/html
- relative links from content within the API would not work as expected (or, worse, would seem to work at first but fail on links to pages with slashes)
- manually deriving an API URI from a page view URI would require manual encoding of slashes
+ encodeURIComponent or the like to escape the path fragment corresponding to the title is simple and standardized
+ valid subresources and listings easy to discover (REST path)
(+) used by swift
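As a rough illustration of the percent-encoding option, the sketch below builds such a path in Python. `urllib.parse.quote` with `safe=""` stands in for `encodeURIComponent` here (the two differ slightly in which punctuation they leave unescaped), and the function name and its default arguments are hypothetical, not part of any existing API.

```python
from urllib.parse import quote, unquote


def page_resource_uri(title: str, revision: str = "latest", fmt: str = "html") -> str:
    """Build a REST-style path with the page title as a single path segment."""
    # safe="" makes quote() escape "/" as %2F too; by default "/" is left alone.
    encoded_title = quote(title, safe="")
    return f"/pages/{encoded_title}/rev/{revision}/{fmt}"


uri = page_resource_uri("Foo/Bar")
print(uri)

# Splitting on "/" now yields the title as one segment, which decodes back.
segments = uri.split("/")
print(unquote(segments[2]))
```

Because the slash inside the title is escaped, the remaining slashes can be treated purely as separators between the title and its sub-resources.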

We can instead use a query string to mark up the sub-resource. The easiest method to make those deterministic is to use only a single key-value pair, so that ordering of query parameters cannot introduce non-determinism.
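To make the ordering problem concrete, here is a minimal sketch of two URIs that name the same resource but differ as cache keys, together with the kind of key-order normalization a cache would otherwise need. The function name is hypothetical, and sorting parameters alphabetically is just one possible normalization.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def normalize_query_order(uri: str) -> str:
    """Return the URI with its query parameters sorted by key, then value."""
    parts = urlsplit(uri)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    sorted_query = urlencode(sorted(pairs))
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, sorted_query, parts.fragment)
    )


a = "/pages/Foo/Bar?rev=latest&format=html"
b = "/pages/Foo/Bar?format=html&rev=latest"

# Same resource, but two distinct strings from the cache's point of view.
print(a == b)
# After normalization both map to a single cache key.
print(normalize_query_order(a) == normalize_query_order(b))
```

With a single key-value pair (or a single bare path in the query string) there is nothing to reorder, so this step is not needed.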

The options considered for encoding the sub-resource in the query string are:

/pages/Foo/Bar?rev=latest/html
- Sounds odd, as the key does not really match the sub-resource on the right.
+ single key-value pair makes URI deterministic
- listing URI for revisions not clear (rev/ in REST)
+ listing URI for formats clearish (?rev=latest/)
/pages/Foo/Bar?rev=latest&format=html
+ natural to work with on clients, reads well
- Would require query string key order normalization (alphabetic ordering) in caches as many client libraries don't let users control the order of parameters.
- invalid parameter combinations would need to be rejected (don't want to encode that in VCL)
- listing URIs hard to derive and discover
/pages/Foo/Bar?rev/latest/html
+ Short
+ Deterministic
+ REST path makes it easy to discover valid parameter combinations and listing URIs
- A bit less natural to read and work with for people used to key=value query strings.
- Client library might add a trailing =, but that is easy to strip in VCL
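The last option can be sketched on the receiving side as follows. This is only an illustration in Python (the notes anticipate doing the trailing `=` cleanup in VCL), and the function name and return shape are assumptions.

```python
from urllib.parse import urlsplit


def split_sub_resource(uri: str) -> tuple[str, list[str]]:
    """Split a URI like /pages/Foo/Bar?rev/latest/html into (path, segments)."""
    parts = urlsplit(uri)
    query = parts.query
    # Some client libraries serialize a bare key as "key=", so drop one
    # trailing "=" before interpreting the query string as a path.
    if query.endswith("="):
        query = query[:-1]
    segments = [segment for segment in query.split("/") if segment]
    return parts.path, segments


print(split_sub_resource("/pages/Foo/Bar?rev/latest/html"))
print(split_sub_resource("/pages/Foo/Bar?rev/latest/html="))
```

The page name keeps its literal slashes in the path, while the sub-resource is read as a REST-style path from the query string.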

Public content API entry points

We have both page-related resources (revisions, metadata) and page-independent resources. Both can be handled by a generic entry point.

Main options:

/api/v1/pages/Main_Page?rev/latest/html
separate entry point
- does not work well for wikis without a /wiki/ style prefix.
/wiki/_api/v1/pages/Main_Page?rev/latest/html
Stay within the wiki namespace, but don't collide with articles as those can't start with an underscore or double colon.
+ Works well with or without /wiki/ style prefix.
(-) /wiki/_api is currently redirected to /wiki/Api, so could potentially break some incoming links. Unlikely to matter in practice.
(-) leading underscore has 'private' connotation
(-) _api prefix won't be used in other ways to access the API, so less consistent
/wiki/::1/pages/Main_Page?rev/latest/html
Stay within the wiki namespace, but don't collide with articles as those can't start with an underscore or double colon.
+ very compact
+ can be used consistently internally (in the PHP virtual REST service interface) and externally
(+) currently an invalid title, so won't break existing incoming links
(+) consistent with use of colons in wiki links and namespaces
- possibly slightly more cryptic than /wiki/_api/v1/
This is my current favorite.
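A minimal sketch of how a front end might tell the two in-namespace prefixes apart from ordinary article paths. The regular expression, the function name and the returned tuple are all hypothetical; the sketch assumes a `/wiki/` prefix and only looks at the path, not the query string.

```python
import re

# Hypothetical matcher for the two in-namespace API prefixes discussed above:
# /wiki/::<version>/... and /wiki/_api/v<version>/...
API_PREFIX = re.compile(r"^/wiki/(?:::|_api/v)(?P<version>\d+)/(?P<rest>.*)$")


def classify(path: str):
    """Return ("api", version, rest) or ("article", None, title)."""
    match = API_PREFIX.match(path)
    if match is None:
        return ("article", None, path.removeprefix("/wiki/"))
    return ("api", int(match.group("version")), match.group("rest"))


print(classify("/wiki/::1/pages/Main_Page"))
print(classify("/wiki/_api/v1/pages/Main_Page"))
print(classify("/wiki/Main_Page"))
```

Since article titles cannot start with an underscore or a double colon, neither prefix can be mistaken for a real page.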

For page-related sub-resources like revisions or page metadata we could additionally provide an entry point that reuses the established page URIs as a base (example: /wiki/Foo?rev/latest/html). The general disadvantage of having an additional entry point for the same content is reduced consistency and more potential for confusion. One option would be to treat the global API entry point as the canonical version, and only redirect there from the page-related 'convenience' entry point. It is not clear that the small convenience is worth it, though.

Options:

/wiki/Main_Page?rev/latest/html
+ consistent with global API
- not versioned
/wiki/Main_Page?v1/rev/latest/html
+ also versioned
- inconsistent with global API
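If the convenience entry point only redirected to the canonical global one, the mapping could look roughly like this sketch. It assumes the /wiki/::1/ style global entry point, treats an unversioned query as version 1, and leaves ordinary key=value query strings alone. All of these choices, and the function name, are assumptions made for illustration.

```python
import re
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical matcher for a leading version segment such as "v1/".
VERSIONED = re.compile(r"^v(?P<version>\d+)/(?P<rest>.+)$")


def canonical_api_uri(page_uri: str, default_version: int = 1) -> Optional[str]:
    """Map a page-based convenience URI to the global API URI, or None."""
    parts = urlsplit(page_uri)
    if not parts.path.startswith("/wiki/") or not parts.query:
        return None
    # Assumption: queries containing "=" are ordinary key=value requests.
    if "=" in parts.query:
        return None
    title = parts.path[len("/wiki/"):]
    match = VERSIONED.match(parts.query)
    if match:
        version, sub_resource = int(match.group("version")), match.group("rest")
    else:
        version, sub_resource = default_version, parts.query
    return f"/wiki/::{version}/pages/{title}?{sub_resource}"


print(canonical_api_uri("/wiki/Main_Page?rev/latest/html"))
print(canonical_api_uri("/wiki/Main_Page?v1/rev/latest/html"))
```

Both convenience forms resolve to the same canonical URI, so caches would only need to store and purge the canonical one.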