API:Page info in search results
Use the MediaWiki API to provide more context when searching for Wikipedia pages.
Introduction
This page shows the API queries that search results use to display additional information about articles, including a lead image and a description of the article's subject from Wikidata.
Showing useful page information
When you search in the Wikipedia mobile apps, they show a drop-down list of matching pages as you type. They also show the lead image for each article and a description of it.
The lead image comes from Extension:PageImages, which adds a page_image property to each page, holding its best guess at an appropriate image for the page. The description comes from Wikidata, which maintains a localized description of the subject of each wiki page.
A slow way to do this would be to query for pages matching what the user types, then make an API `action=query` request for the `pageimages` property of the set of titles, and then make another API query to wikidata.org requesting the Wikidata description.
This works, but it involves multiple API queries.
How it works on Wikimedia wikis
Instead, WMF changed most wikis (but as of May 2015, not www.mediawiki.org) to load the Wikibase client extension for accessing Wikidata.
This allows you to query for `prop=pageterms` along with `prop=pageimages` on the source wiki, instead of making a second request to www.wikidata.org.
Example:
Result |
---|
{
"query": {
"pages": [
{
"pageid": 736,
"ns": 0,
"title": "Albert Einstein",
"thumbnail": {
"source": "https://upload.wikimedia.org/wikipedia/commons/thumb/3/3e/Einstein_1921_by_F_Schmutzer_-_restoration.jpg/38px-Einstein_1921_by_F_Schmutzer_-_restoration.jpg",
"width": 38,
"height": 50
},
"pageimage": "Einstein_1921_by_F_Schmutzer_-_restoration.jpg",
"terms": {
"alias": [
"Einstein"
],
"description": [
"German-American physicist and founder of the theory of relativity"
],
"label": [
"Albert Einstein"
]
}
}
]
}
}
|
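A result shaped like the one above can be reduced to the few fields a search list needs. The following Python sketch shows one way to do that; the helper name `summarize_page` and the dictionary it returns are illustrative choices, not part of the API, and it assumes that pages without an image or description simply omit the `thumbnail` or `terms` keys.

```python
import json


def summarize_page(page):
    """Pick out the title, thumbnail URL and first description of one page.

    Assumption: a page with no lead image or no Wikidata description
    lacks the corresponding key, so both lookups fall back to None.
    """
    thumbnail = page.get("thumbnail") or {}
    descriptions = page.get("terms", {}).get("description", [])
    return {
        "title": page["title"],
        "thumbnail_url": thumbnail.get("source"),
        "description": descriptions[0] if descriptions else None,
    }


# Trimmed copy of the sample result shown above.
sample = json.loads("""
{"query": {"pages": [{
  "pageid": 736, "ns": 0, "title": "Albert Einstein",
  "thumbnail": {
    "source": "https://upload.wikimedia.org/wikipedia/commons/thumb/3/3e/Einstein_1921_by_F_Schmutzer_-_restoration.jpg/38px-Einstein_1921_by_F_Schmutzer_-_restoration.jpg",
    "width": 38, "height": 50},
  "terms": {"description": [
    "German-American physicist and founder of the theory of relativity"]}
}]}}
""")

for page in sample["query"]["pages"]:
    print(summarize_page(page))
```

Returning `None` for missing fields lets the calling code decide whether to show a placeholder image or leave the description line empty.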
Use `formatversion=2` in API requests if you are requesting results in JSON format. It returns results in a structure that is easier to process, and uses UTF-8 encoding by default.

If you have a set of page titles, you can request their information all at once. Set `pilimit` to the number of titles you are querying; otherwise the query returns only one thumbnail, from the first article in the set that has a plausible image. You should also reduce the API response size by specifying only the properties you want the API modules to supply, in this case only the thumbnail (`piprop=thumbnail`) and the Wikidata description (`wbptterms=description`). Finally, you may want the query to resolve pages that are redirects.
Example:
Result |
---|
{
"query": {
"pages": [
{
"pageid": 736,
"ns": 0,
"title": "Albert Einstein",
"thumbnail": {
"source": "https://upload.wikimedia.org/wikipedia/commons/thumb/3/3e/Einstein_1921_by_F_Schmutzer_-_restoration.jpg/38px-Einstein_1921_by_F_Schmutzer_-_restoration.jpg",
"width": 38,
"height": 50
},
"terms": {
"description": [
"German-American physicist and founder of the theory of relativity"
]
}
},
{
"pageid": 243597,
"ns": 0,
"title": "Albert Ellis",
"thumbnail": {
"source": "https://upload.wikimedia.org/wikipedia/en/thumb/3/3e/Albert_Ellis.jpg/50px-Albert_Ellis.jpg",
"width": 50,
"height": 33
},
"terms": {
"description": [
"American psychologist"
]
}
},
{
"pageid": 4463732,
"ns": 0,
"title": "Albert Estopinal",
"thumbnail": {
"source": "https://upload.wikimedia.org/wikipedia/commons/thumb/4/4a/EstopinalOfLouisiana.jpg/35px-EstopinalOfLouisiana.jpg",
"width": 35,
"height": 50
},
"terms": {
"description": [
"American politician"
]
}
}
]
}
}
|
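A request along the lines described above could be assembled as in the following Python sketch. The endpoint (English Wikipedia), the function name `build_page_info_url` and the default thumbnail size are assumptions made for illustration; the parameter names are the ones used on this page, plus the standard `redirects` flag.

```python
from urllib.parse import urlencode

# Assumed endpoint: English Wikipedia. Other wikis that load PageImages
# and the Wikibase client should accept the same parameters.
API_URL = "https://en.wikipedia.org/w/api.php"


def build_page_info_url(titles, thumb_size=50):
    """Build one query URL for the thumbnails and descriptions of `titles`."""
    params = {
        "action": "query",
        "format": "json",
        "formatversion": "2",
        "prop": "pageimages|pageterms",
        "titles": "|".join(titles),      # several titles in one request
        "redirects": "1",                # resolve titles that are redirects
        "piprop": "thumbnail",           # only the thumbnail, not the file name
        "pithumbsize": str(thumb_size),
        "pilimit": str(len(titles)),     # one thumbnail per requested title
        "wbptterms": "description",      # only the Wikidata description
    }
    return API_URL + "?" + urlencode(params)


url = build_page_info_url(["Albert Einstein", "Albert Ellis", "Albert Estopinal"])
print(url)
```

The URL can be pasted into a browser or passed to any HTTP client. Keeping the URL construction separate from the network call makes it easy to inspect the parameters before sending anything.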
Querying query results in one request
The above example is incomplete, since the set of page titles whose properties we are querying (Albert Einstein|Albert Ellis|Albert Estopinal) must have come from another query.
In many situations you can combine getting the page properties you want with the initial query for a set of pages, using the MediaWiki API's generator feature. The list of pages from the generator becomes the set of pages for the other part of the query, all in a single API request.
The MediaWiki API's query module has a `prefixsearch` submodule that queries for a list of pages starting with the prefix you specify ("Albert Ei"), and list queries such as this one can act as a generator.
The MobileFrontend extension and mobile apps do this. If you look at MobileFrontend's API query in SearchApi.js, you can see that it combines `generator=prefixsearch` with a query for the `pageimages` property. We can do the same, asking for the Wikidata description as well with `prop=pageimages|pageterms`.
Example:
The `prefixsearch` generator provides an `index` for each page in the `pages` array in the result; you can use this to sort the page titles, each with its thumbnail image and description, in the correct order.
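The generator request and the sort by `index` might look like the following Python sketch. The function names and the default limit are hypothetical, and the response fragment is hand-written for illustration (its `index` values are made up); only the parameter names and the `index` field come from this page.

```python
def prefix_search_params(prefix, limit=10):
    """Parameters for a single request: prefix search plus page properties."""
    return {
        "action": "query",
        "format": "json",
        "formatversion": "2",
        "generator": "prefixsearch",
        "gpssearch": prefix,             # what the user has typed so far
        "gpslimit": str(limit),
        "prop": "pageimages|pageterms",
        "piprop": "thumbnail",
        "pilimit": str(limit),
        "wbptterms": "description",
    }


def pages_in_search_order(response):
    """Return the pages of a generator response sorted by their index."""
    # A search with no matches may come back without a "query" key at all.
    pages = response.get("query", {}).get("pages", [])
    return sorted(pages, key=lambda page: page["index"])


# Hand-written fragment, deliberately out of order (illustrative values).
fake_response = {"query": {"pages": [
    {"title": "Albert Ellis", "index": 2},
    {"title": "Albert Einstein", "index": 1},
]}}

for page in pages_in_search_order(fake_response):
    print(page["index"], page["title"])
```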
Without the `index` values, a separate `list=prefixsearch` query would be needed to get the titles in the correct order (phab:T98125).
Further niceties
If the set of articles that start with what the user types does not fill the search results list, the Wikimedia mobile apps go on to search for in-page matches that you would get from Special:Search.
The Wikipedia Android and iOS mobile apps combine `generator=prefixsearch` with querying for the pageterms and pageimages properties and requesting a search suggestion (`list=search` with `srinfo=suggestion`). From the iOS app's implementation file:
@"action": @"query",
@"generator": @"prefixsearch",
@"gpssearch": self.searchTerm,
@"gpsnamespace": @0,
@"gpslimit": @(SEARCH_MAX_RESULTS),
@"prop": @"pageterms|pageimages",
@"piprop": @"thumbnail",
@"wbptterms": @"description",
@"pithumbsize" : @(SEARCH_THUMBNAIL_WIDTH),
@"pilimit": @(SEARCH_MAX_RESULTS),
// -- Parameters causing prefix search to efficiently return suggestion.
@"list": @"search",
@"srsearch": self.searchTerm,
@"srnamespace": @0,
@"srwhat": @"text",
@"srinfo": @"suggestion",
@"srprop": @"",
@"sroffset": @0,
@"srlimit": @1,
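A client receiving the combined response has two things to decide: whether a spelling suggestion came back, and whether the prefix matches fill the list. The Python sketch below illustrates both. The function names are hypothetical, the sample response is hand-written, and the location of the suggestion under `query.searchinfo.suggestion` is an assumption about the `list=search` output that is worth confirming in Special:ApiSandbox.

```python
def extract_suggestion(response):
    """Return the search suggestion from a combined response, or None.

    Assumption: with srinfo=suggestion the suggestion is reported under
    query.searchinfo.suggestion and is absent when there is none.
    """
    search_info = response.get("query", {}).get("searchinfo", {})
    return search_info.get("suggestion")


def needs_full_text_search(response, max_results):
    """True when the prefix matches do not fill the results list."""
    pages = response.get("query", {}).get("pages", [])
    return len(pages) < max_results


# Hand-written response fragment for illustration only.
example = {"query": {
    "searchinfo": {"suggestion": "albert einstein"},
    "pages": [{"title": "Albert Einstein", "index": 1}],
}}

print(extract_suggestion(example))
print(needs_full_text_search(example, max_results=10))
```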
Going beyond
The Wikipedia iOS mobile app uses face detection to detect the focal region of the image!
Alternatives
As an alternative, the Popups extension behind the Hovercards beta feature uses the `extracts` query submodule of the TextExtracts extension to show two sentences from the lead text of an article from the local wiki (together with its image, "last edited", etc.) when you hover over a link. This text from the local wiki is usually longer and less definitive than the Wikidata description. Its API request is in resources/ext.popups.renderer.article.js.
Example:
Result |
---|
{
"query": {
"pages": [
{
"pageid": 736,
"ns": 0,
"title": "Albert Einstein",
"extract": "Albert Einstein (/ˈælbərt ˈaɪnʃtaɪn/; German: [ˈalbɐrt ˈaɪnʃtaɪn]; 14 March 1879 – 18 April 1955) was a German-born theoretical physicist. He developed the general theory of relativity, one of the two pillars of modern physics (alongside quantum mechanics).",
"thumbnail": {
"source": "https://upload.wikimedia.org/wikipedia/commons/thumb/3/3e/Einstein_1921_by_F_Schmutzer_-_restoration.jpg/228px-Einstein_1921_by_F_Schmutzer_-_restoration.jpg",
"width": 228,
"height": 300
},
"revisions": [
{
"timestamp": "2015-06-24T12:17:17Z"
}
]
}
]
}
}
|
The query returns an array of pages; if successful, this will have one element, the single matching page. The request also asks for the last-changed timestamp (`prop=revisions&rvprop=timestamp`) to display "Edited N days/hours ago."
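A request of this kind, and the "Edited N days/hours ago" label derived from its timestamp, can be sketched in Python as follows. This is not the Popups extension's own code: the function names, the default sizes, and the rounding rule (whole days, otherwise whole hours) are assumptions made for illustration.

```python
from datetime import datetime, timezone


def hovercard_params(title, sentences=2, thumb_size=300):
    """Parameters for a lead extract, a thumbnail and the last-edit time."""
    return {
        "action": "query",
        "format": "json",
        "formatversion": "2",
        "prop": "extracts|pageimages|revisions",
        "titles": title,
        "redirects": "1",
        "exintro": "1",                  # only text before the first heading
        "exsentences": str(sentences),
        "explaintext": "1",              # plain text rather than HTML
        "piprop": "thumbnail",
        "pithumbsize": str(thumb_size),
        "rvprop": "timestamp",
    }


def edited_ago(timestamp, now=None):
    """Turn an API timestamp such as 2015-06-24T12:17:17Z into a label."""
    edited = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")
    edited = edited.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    elapsed = now - edited
    if elapsed.days >= 1:
        count, unit = elapsed.days, "day"
    else:
        count, unit = elapsed.seconds // 3600, "hour"
    plural = "" if count == 1 else "s"
    return f"Edited {count} {unit}{plural} ago"


# Fixed reference time so the output does not depend on the current date.
reference = datetime(2015, 6, 26, 12, 17, 17, tzinfo=timezone.utc)
print(edited_ago("2015-06-24T12:17:17Z", now=reference))  # Edited 2 days ago
```

Passing `now` explicitly keeps the function easy to test; in an application it can be left out so that the current time is used.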
Next steps
Try these API requests in the Special:ApiSandbox page, then make the same API requests from your own applications.
See also
- Introducing lead images to Wikipedia’s Android beta app – a blog post on lead images