API:Page info in search results


This shows the API results that search results use to display additional information about articles, including a lead image and a description of the article's subject from Wikidata.

Showing useful page information

Search results in the Wikipedia Android app

When you search in the Wikipedia mobile apps, as you type they show a drop-down list of matching pages. They also show the lead image for the article and a description of it.

The lead image comes from Extension:PageImages, which adds a page_image property to each page giving its best guess at an appropriate image for that page. The description comes from Wikidata, which maintains a localized description of the subject of each wiki page.

A slow way to do this would be to query for pages matching what the user types, then make an API action=query request for the pageimages property of the set of titles, and make another API query to wikidata.org requesting the Wikidata description. This works but involves multiple API queries.

How it works on Wikimedia wikis

Instead, WMF changed most wikis (but as of May 2015, not www.mediawiki.org) to load the Wikibase client extension for accessing Wikidata. This allows you to query for prop=pageterms along with prop=pageimages on the source wiki, instead of making a second request to www.wikidata.org.

Example: Simple query for pageimages and pageterms properties of Albert Einstein:

Result
{
    "query": {
        "pages": [
            {
                "pageid": 736,
                "ns": 0,
                "title": "Albert Einstein",
                "thumbnail": {
                    "source": "https://upload.wikimedia.org/wikipedia/commons/thumb/3/3e/Einstein_1921_by_F_Schmutzer_-_restoration.jpg/38px-Einstein_1921_by_F_Schmutzer_-_restoration.jpg",
                    "width": 38,
                    "height": 50
                },
                "pageimage": "Einstein_1921_by_F_Schmutzer_-_restoration.jpg",
                "terms": {
                    "alias": [
                        "Einstein"
                    ],
                    "description": [
                        "German-American physicist and founder of the theory of relativity"
                    ],
                    "label": [
                        "Albert Einstein"
                    ]
                }
            }
        ]
    }
}
You should always use formatversion=2 in API requests if you are requesting results in JSON format. It returns results in a structure that is easier to process, and uses UTF-8 encoding by default.
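
The request that produced the result above is not reproduced on this page. As a minimal sketch, such a request URL could be assembled as follows. The endpoint (English Wikipedia) and the helper name build_page_info_url are assumptions made for illustration; the parameters are the ones named in the text.

```python
from urllib.parse import urlencode

# Assumed endpoint: any Wikimedia wiki with PageImages and the
# Wikibase client enabled should accept the same parameters.
API_URL = "https://en.wikipedia.org/w/api.php"


def build_page_info_url(title):
    """Build a URL asking for the pageimages and pageterms properties of one page.

    Hypothetical helper for illustration; it only builds the URL and does
    not send the request.
    """
    params = {
        "action": "query",
        "format": "json",
        "formatversion": "2",            # easier-to-process JSON structure
        "prop": "pageimages|pageterms",  # lead image + Wikidata terms
        "titles": title,
    }
    return API_URL + "?" + urlencode(params)


url = build_page_info_url("Albert Einstein")
print(url)
```

Opening the printed URL in a browser, or pasting the same parameters into Special:ApiSandbox, should return a structure like the one shown above.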

If you have a set of page titles, you can request their information all at once. Set pilimit to the number of titles you are querying; otherwise the API returns only one thumbnail, from the first article in the set that has a plausible image. You should also reduce the API response size by specifying only the properties you want the API modules to supply, in this case only the thumbnail (piprop=thumbnail) and the Wikidata description (wbptterms=description). Finally, you may want the query to handle pages that are redirects.

Example: Query for pageterms and pageimage thumbnails of several pages

Result
{
    "query": {
        "pages": [
            {
                "pageid": 736,
                "ns": 0,
                "title": "Albert Einstein",
                "thumbnail": {
                    "source": "https://upload.wikimedia.org/wikipedia/commons/thumb/3/3e/Einstein_1921_by_F_Schmutzer_-_restoration.jpg/38px-Einstein_1921_by_F_Schmutzer_-_restoration.jpg",
                    "width": 38,
                    "height": 50
                },
                "terms": {
                    "description": [
                        "German-American physicist and founder of the theory of relativity"
                    ]
                }
            },
            {
                "pageid": 243597,
                "ns": 0,
                "title": "Albert Ellis",
                "thumbnail": {
                    "source": "https://upload.wikimedia.org/wikipedia/en/thumb/3/3e/Albert_Ellis.jpg/50px-Albert_Ellis.jpg",
                    "width": 50,
                    "height": 33
                },
                "terms": {
                    "description": [
                        "American psychologist"
                    ]
                }
            },
            {
                "pageid": 4463732,
                "ns": 0,
                "title": "Albert Estopinal",
                "thumbnail": {
                    "source": "https://upload.wikimedia.org/wikipedia/commons/thumb/4/4a/EstopinalOfLouisiana.jpg/35px-EstopinalOfLouisiana.jpg",
                    "width": 35,
                    "height": 50
                },
                "terms": {
                    "description": [
                        "American politician"
                    ]
                }
            }
        ]
    }
}
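
The multi-title request described above can be sketched as a parameter dictionary plus a small function that pulls out only the fields a search list needs. The helper names are hypothetical, and the sample response is an abbreviated copy of the result shown above; nothing here contacts the network.

```python
def build_multi_page_params(titles):
    """Parameters for thumbnails and Wikidata descriptions of several pages."""
    return {
        "action": "query",
        "format": "json",
        "formatversion": "2",
        "prop": "pageimages|pageterms",
        "piprop": "thumbnail",          # only the thumbnail, not the image name
        "pilimit": str(len(titles)),    # one thumbnail per requested title
        "wbptterms": "description",     # only the description, no alias/label
        "redirects": "1",               # resolve titles that are redirects
        "titles": "|".join(titles),
    }


def summarize_pages(response):
    """Reduce a formatversion=2 response to title, thumbnail URL and description."""
    summaries = []
    for page in response.get("query", {}).get("pages", []):
        descriptions = page.get("terms", {}).get("description", [])
        summaries.append({
            "title": page["title"],
            "thumbnail": page.get("thumbnail", {}).get("source"),
            "description": descriptions[0] if descriptions else None,
        })
    return summaries


params = build_multi_page_params(["Albert Einstein", "Albert Ellis", "Albert Estopinal"])

# Abbreviated sample in the shape of the result above (thumbnail omitted on purpose).
sample = {"query": {"pages": [
    {"pageid": 243597, "ns": 0, "title": "Albert Ellis",
     "terms": {"description": ["American psychologist"]}},
]}}
print(summarize_pages(sample))
```

Pages without an image or without a Wikidata description simply lack the corresponding key, which is why the sketch falls back to None instead of assuming both are present.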

Querying query results in one request

The above example is incomplete, since the set of page titles whose properties we are querying – Albert Einstein|Albert Ellis|Albert Estopinal – must have come from another query.

In many situations you can combine getting the page properties you want with the initial query for a set of pages, using the MediaWiki API's generator feature. The list of pages from the generator becomes the set of pages for the other part of the query, all in a single API request. The MediaWiki API's query module has a prefixsearch submodule that queries for a list of pages starting with the prefix you specify ("Albert Ei"), and list queries can act as a generator. The MobileFrontend extension and the mobile apps do this. If you look at MobileFrontend's API query in SearchApi.js, you can see that it combines generator=prefixsearch with a query for the pageimages property. We can do the same, asking for the Wikidata description as well with prop=pageimages|pageterms.

Example: A query feeding a prefixsearch generator into useful property queries.


The prefixsearch generator provides an index for each page in the pages array in the result; you can use this to sort the page titles, each with its thumbnail image and description, in the correct order.

You used to have to append a separate list=prefixsearch query to get the titles in the correct order (phab:T98125).
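
A minimal sketch of the generator approach, assuming the same parameters as in the earlier examples: the prefixsearch generator takes its own parameters with a "g" prefix (gpssearch, gpslimit), and the pages are sorted by the index field before display. The helper names and the index values in the sample are illustrative, not taken from a real response.

```python
def build_prefix_search_params(prefix, limit=5):
    """Parameters feeding generator=prefixsearch into pageimages and pageterms."""
    return {
        "action": "query",
        "format": "json",
        "formatversion": "2",
        "generator": "prefixsearch",
        "gpssearch": prefix,            # what the user has typed so far
        "gpslimit": str(limit),
        "prop": "pageimages|pageterms",
        "piprop": "thumbnail",
        "pilimit": str(limit),
        "wbptterms": "description",
        "redirects": "1",
    }


def pages_in_search_order(response):
    """Return the pages sorted by the index supplied by the generator."""
    pages = response.get("query", {}).get("pages", [])
    return sorted(pages, key=lambda page: page.get("index", 0))


params = build_prefix_search_params("Albert Ei")

# Illustrative response fragment: pages may arrive in any order.
sample = {"query": {"pages": [
    {"pageid": 4463732, "title": "Albert Estopinal", "index": 3},
    {"pageid": 736, "title": "Albert Einstein", "index": 1},
    {"pageid": 243597, "title": "Albert Ellis", "index": 2},
]}}
ordered_titles = [page["title"] for page in pages_in_search_order(sample)]
print(ordered_titles)
```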

Further niceties

If the set of articles that start with what the user types does not fill the search results list, the Wikimedia mobile apps go on to search for in-page matches that you would get from Special:Search. The Wikipedia Android and iOS mobile apps combine generator=prefixsearch with querying for the pageterms and pageimages properties and with a list=search query that returns a search suggestion. From the iOS app's implementation file:

	 @"action": @"query",
	 @"generator": @"prefixsearch",
	 @"gpssearch": self.searchTerm,
	 @"gpsnamespace": @0,
	 @"gpslimit": @(SEARCH_MAX_RESULTS),
	 @"prop": @"pageterms|pageimages",
	 @"piprop": @"thumbnail",
	 @"wbptterms": @"description",
	 @"pithumbsize" : @(SEARCH_THUMBNAIL_WIDTH),
	 @"pilimit": @(SEARCH_MAX_RESULTS),
	 // -- Parameters causing prefix search to efficiently return suggestion.
	 @"list": @"search",
	 @"srsearch": self.searchTerm,
	 @"srnamespace": @0,
	 @"srwhat": @"text",
	 @"srinfo": @"suggestion",
	 @"srprop": @"",
	 @"sroffset": @0,
	 @"srlimit": @1,


Going beyond

The Wikipedia iOS mobile app uses face detection to detect the focal region of the image!

Alternatives

As an alternative, the Popups extension behind the Hovercards beta feature uses the extracts query submodule of the TextExtracts extension to show two sentences from the lead text of an article from the local wiki (together with its image, "last edited", etc.) when you hover over a link. This text from the local wiki is usually longer and less definitive than the Wikidata description. Its API request is in resources/ext.popups.renderer.article.js.

Example: Sample Hovercards API request including textextracts properties of Albert Einstein:

Result
{
    "query": {
        "pages": [
            {
                "pageid": 736,
                "ns": 0,
                "title": "Albert Einstein",
                "extract": "Albert Einstein (/ˈælbərt ˈaɪnʃtaɪn/; German: [ˈalbɐrt ˈaɪnʃtaɪn]; 14 March 1879 – 18 April 1955) was a German-born theoretical physicist. He developed the general theory of relativity, one of the two pillars of modern physics (alongside quantum mechanics).",
                "thumbnail": {
                    "source": "https://upload.wikimedia.org/wikipedia/commons/thumb/3/3e/Einstein_1921_by_F_Schmutzer_-_restoration.jpg/228px-Einstein_1921_by_F_Schmutzer_-_restoration.jpg",
                    "width": 228,
                    "height": 300
                },
                "revisions": [
                    {
                        "timestamp": "2015-06-24T12:17:17Z"
                    }
                ]
            }
        ]
    }
}

The query returns an array of pages. If the request succeeds, the array has one element: the single matching page. The request also asks for the last-changed timestamp (`prop=revisions&rvprop=timestamp`) to display "Edited N days/hours ago."
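
As a rough sketch of the Hovercards-style request and of turning the revision timestamp into an "Edited N days/hours ago" label: the parameter set below is an assumption reconstructed from the description and the result above (two plain-text intro sentences, a 300px thumbnail, the revision timestamp) and may differ from what the Popups extension actually sends. The helper names are hypothetical.

```python
from datetime import datetime, timezone


def build_hovercard_params(title):
    """Assumed parameters for an extract, thumbnail and last-edit timestamp."""
    return {
        "action": "query",
        "format": "json",
        "formatversion": "2",
        "prop": "extracts|pageimages|revisions",
        "exintro": "1",          # only text from the lead section
        "exsentences": "2",      # two sentences
        "explaintext": "1",      # plain text instead of HTML
        "piprop": "thumbnail",
        "pithumbsize": "300",
        "rvprop": "timestamp",
        "titles": title,
    }


def edited_ago(timestamp, now=None):
    """Format an API timestamp such as 2015-06-24T12:17:17Z as a relative label."""
    edited = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    delta = now - edited
    if delta.days >= 1:
        return f"Edited {delta.days} days ago"
    return f"Edited {delta.seconds // 3600} hours ago"


params = build_hovercard_params("Albert Einstein")
print(edited_ago("2015-06-24T12:17:17Z"))
```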

Next steps

Try these API requests in the Special:ApiSandbox page, then make the same API requests from your own applications.


See also