API:Parsing wikitext

This page is part of the MediaWiki Action API documentation.

GET/POST request to parse the content of a page and obtain the output.

API documentation

The following documentation is the output of Special:ApiHelp/parse, automatically generated by the pre-release version of MediaWiki that is running on this site (MediaWiki.org).

action=parse

(main | parse)

This module requires read rights.
Source: MediaWiki
License: GPL-2.0-or-later

Parses content and returns parser output.

See the various prop-modules of action=query to get information from the current version of a page.

There are several ways to specify the text to parse:

Specify a page or revision, using page, pageid, or oldid.
Specify content explicitly, using text, title, revid, and contentmodel.
Specify only a summary to parse. prop should be given an empty value.
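
As a rough sketch of the three options above, the parameter sets might look like the following. The helper names (`from_page`, `from_text`, `from_summary`) and the `BASE` dictionary are illustrative assumptions, not part of the API; only the parameter names come from this page.

```python
# Hypothetical helpers; only the parameter names come from the documentation.
BASE = {"action": "parse", "format": "json"}


def from_page(page):
    """Option 1: parse an existing page (pageid or oldid work the same way)."""
    return {**BASE, "page": page}


def from_text(text, contentmodel="wikitext"):
    """Option 2: parse explicitly supplied content."""
    return {**BASE, "text": text, "contentmodel": contentmodel}


def from_summary(summary):
    """Option 3: parse only an edit summary; prop is sent with an empty value."""
    return {**BASE, "summary": summary, "prop": ""}


print(from_page("Project:Sandbox"))
print(from_text("{{Project:Sandbox}}"))
print(from_summary("Some [[link]]"))
```

Each dictionary can be passed as the `params` argument of an HTTP client such as `requests`.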

https://www.mediawiki.org/wiki/Special:MyLanguage/API:Parsing_wikitext

Specific parameters:

Other general parameters are available.

title

Title of page the text belongs to. If omitted, contentmodel must be specified, and API will be used as the title.

text

Text to parse. Use title or contentmodel to control the content model.

revid

Revision ID, for {{REVISIONID}} and similar variables.

Type: integer

summary

Summary to parse.

page

Parse the content of this page. Cannot be used together with text and title.

pageid

Parse the content of this page. Overrides page.

Type: integer

redirects

If page or pageid is set to a redirect, resolve it.

Type: boolean (details)

oldid

Parse the content of this revision. Overrides page and pageid.

Type: integer
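
The override rules for page, pageid, and oldid can be pictured with a small sketch. `effective_source` is a hypothetical client-side helper that only mirrors the precedence described above; the server applies these rules itself.

```python
def effective_source(params):
    """Return the (name, value) pair that decides which content is parsed.

    Precedence as documented: oldid overrides page and pageid,
    and pageid overrides page.
    """
    for name in ("oldid", "pageid", "page"):
        if name in params:
            return name, params[name]
    return None


print(effective_source({"page": "Pet door", "pageid": 3276454}))
print(effective_source({"page": "Pet door", "oldid": 852892138}))
```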

prop

Which pieces of information to get:

text: Gives the parsed text of the wikitext.
langlinks: Gives the language links in the parsed wikitext.
categories: Gives the categories in the parsed wikitext.
categorieshtml: Gives the HTML version of the categories.
links: Gives the internal links in the parsed wikitext.
templates: Gives the templates in the parsed wikitext.
images: Gives the images in the parsed wikitext.
externallinks: Gives the external links in the parsed wikitext.
sections: Gives the sections in the parsed wikitext.
revid: Adds the revision ID of the parsed page.
displaytitle: Adds the title of the parsed wikitext.
subtitle: Adds the page subtitle for the parsed page.
headhtml: Gives parsed doctype, opening <html>, <head> element and opening <body> of the page.
modules: Gives the ResourceLoader modules used on the page. To load, use mw.loader.using(). Either jsconfigvars or encodedjsconfigvars must be requested jointly with modules.
jsconfigvars: Gives the JavaScript configuration variables specific to the page. To apply, use mw.config.set().
encodedjsconfigvars: Gives the JavaScript configuration variables specific to the page as a JSON string.
indicators: Gives the HTML of page status indicators used on the page.
iwlinks: Gives interwiki links in the parsed wikitext.
wikitext: Gives the original wikitext that was parsed.
properties: Gives various properties defined in the parsed wikitext.
limitreportdata: Gives the limit report in a structured way. Gives no data when disablelimitreport is set.
limitreporthtml: Gives the HTML version of the limit report. Gives no data when disablelimitreport is set.
parsetree: The XML parse tree of revision content (requires content model wikitext)
parsewarnings: Gives the warnings that occurred while parsing content (as wikitext).
parsewarningshtml: Gives the warnings that occurred while parsing content (as HTML).
headitems: Deprecated. Gives items to put in the <head> of the page.

Values (separate with | or alternative): categories, categorieshtml, displaytitle, encodedjsconfigvars, externallinks, headhtml, images, indicators, iwlinks, jsconfigvars, langlinks, limitreportdata, limitreporthtml, links, modules, parsetree, parsewarnings, parsewarningshtml, properties, revid, sections, subtitle, templates, text, wikitext, headitems
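
A minimal sketch of how a client might assemble the prop value, assuming the list of values shown above. `build_prop` is a hypothetical helper; raising an exception for `modules` without a config-variables prop is a client-side choice made here for illustration, not necessarily how the server reacts.

```python
# Values copied from the list above (headitems is marked deprecated there).
VALID_PROPS = set(
    "categories categorieshtml displaytitle encodedjsconfigvars externallinks "
    "headhtml images indicators iwlinks jsconfigvars langlinks limitreportdata "
    "limitreporthtml links modules parsetree parsewarnings parsewarningshtml "
    "properties revid sections subtitle templates text wikitext headitems".split()
)


def build_prop(*props):
    """Join prop values with '|' after some basic client-side checks."""
    unknown = sorted(set(props) - VALID_PROPS)
    if unknown:
        raise ValueError("unknown prop value(s): " + ", ".join(unknown))
    # modules is documented as needing jsconfigvars or encodedjsconfigvars.
    if "modules" in props and not {"jsconfigvars", "encodedjsconfigvars"} & set(props):
        raise ValueError("request modules together with jsconfigvars or encodedjsconfigvars")
    return "|".join(props)


print(build_prop("text", "sections", "displaytitle"))
```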

wrapoutputclass

CSS class to use to wrap the parser output.

Default: mw-parser-output

usearticle

Use the ArticleParserOptions hook to ensure the options used match those used for article page views

Type: boolean (details)

parsoid

Generate HTML conforming to the MediaWiki DOM spec using Parsoid.

Type: boolean (details)

pst

Do a pre-save transform on the input before parsing it. Only valid when used with text.

Type: boolean (details)

onlypst

Do a pre-save transform (PST) on the input, but don't parse it. Returns the same wikitext, after a PST has been applied. Only valid when used with text.

Type: boolean (details)
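
The difference between pst and onlypst could be expressed in request parameters roughly as follows. `pst_params` is a hypothetical helper and the sample text is an assumption; the check simply mirrors the note that both flags are only valid together with text.

```python
def pst_params(text, only=False):
    """Build parameters for a pre-save transform of explicit text.

    only=False -> pst: transform the input, then parse it.
    only=True  -> onlypst: transform the input and return the wikitext.
    """
    if not text:
        raise ValueError("pst and onlypst are only valid when used with text")
    params = {
        "action": "parse",
        "format": "json",
        "text": text,
        "contentmodel": "wikitext",
    }
    # Boolean API parameters count as true whenever they are present.
    params["onlypst" if only else "pst"] = 1
    return params


print(pst_params("{{subst:PAGENAME}}", only=True))
```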

effectivelanglinks

Deprecated.

Includes language links supplied by extensions (for use with prop=langlinks).

Type: boolean (details)

section

Only parse the content of the section with this identifier.

When new, parse text and sectiontitle as if adding a new section to the page.

new is allowed only when specifying text.

sectiontitle

New section title when section is new.

Unlike page editing, this does not fall back to summary when omitted or empty.
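
As an illustration of the two uses of section, the sketch below builds one request for an existing section and one for section=new. Both helper names and the sample titles are assumptions made for this example.

```python
def section_params(page, section):
    """Parse only one existing section of a page, by its identifier."""
    return {"action": "parse", "format": "json", "page": page, "section": section}


def new_section_params(title, text, sectiontitle):
    """Parse text as if it were added to the page as a new section."""
    if not text:
        raise ValueError("section=new is allowed only when specifying text")
    return {
        "action": "parse",
        "format": "json",
        "title": title,
        "text": text,
        "section": "new",
        # Sent explicitly: there is no fallback to summary here.
        "sectiontitle": sectiontitle,
    }


print(section_params("Project:Sandbox", 1))
print(new_section_params("Project:Sandbox", "Some new text.", "A new heading"))
```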

disablepp

Deprecated.

Use disablelimitreport instead.

Type: boolean (details)

disablelimitreport

Omit the limit report ("NewPP limit report") from the parser output.

Type: boolean (details)

disableeditsection

Omit edit section links from the parser output.

Type: boolean (details)

disablestylededuplication

Do not deduplicate inline stylesheets in the parser output.

Type: boolean (details)

showstrategykeys

Whether to include internal merge strategy information in jsconfigvars.

Type: boolean (details)

generatexml

Deprecated.

Generate XML parse tree (requires content model wikitext; replaced by prop=parsetree).

Type: boolean (details)

preview

Parse in preview mode.

Type: boolean (details)

sectionpreview

Parse in section preview mode (enables preview mode too).

Type: boolean (details)

disabletoc

Omit table of contents in output.

Type: boolean (details)
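
The boolean switches above (for example disablelimitreport, disableeditsection, and disabletoc) can be combined in one request. The sketch below uses a hypothetical `with_flags` helper that sends a flag only when it is enabled, because a boolean API parameter is treated as true whenever it is present in the request.

```python
def with_flags(params, **flags):
    """Return a copy of params with the enabled boolean flags added.

    Disabled flags are left out entirely rather than sent as 0 or false.
    """
    out = dict(params)
    for name, enabled in flags.items():
        if enabled:
            out[name] = 1
    return out


base = {"action": "parse", "format": "json", "page": "Pet door"}
print(with_flags(base, disabletoc=True, disableeditsection=True, preview=False))
```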

useskin

Apply the selected skin to the parser output. May affect the following properties: text, langlinks, headitems, modules, jsconfigvars, indicators.

One of the following values: apioutput, authentication-popup, cologneblue, fallback, json, minerva, modern, monobook, timeless, vector, vector-2022

contentformat

Content serialization format used for the input text. Only valid when used with text.

One of the following values: application/json, application/octet-stream, application/unknown, application/x-binary, text/css, text/javascript, text/plain, text/unknown, text/x-wiki, unknown/unknown

contentmodel

Content model of the input text. If omitted, title must be specified, and default will be the model of the specified title. Only valid when used with text.

One of the following values: Chart.JsonConfig, GadgetDefinition, Json.JsonConfig, JsonSchema, Map.JsonConfig, MassMessageListContent, NewsletterContent, Scribunto, SecurePoll, Tabular.JsonConfig, css, flow-board, javascript, json, sanitized-css, text, translate-messagebundle, unknown, wikitext

mobileformat

Return parse output in a format suitable for mobile devices.

Type: boolean (details)

templatesandboxprefix

Template sandbox prefix, as with Special:TemplateSandbox.

Separate values with | or alternative.

Maximum number of values is 50 (500 for clients that are allowed higher limits).

templatesandboxtitle

Parse the page using templatesandboxtext in place of the contents of the page named here.

templatesandboxtext

Parse the page using this page content in place of the page named by templatesandboxtitle.

templatesandboxcontentmodel

Content model of templatesandboxtext.

One of the following values: Chart.JsonConfig, GadgetDefinition, Json.JsonConfig, JsonSchema, Map.JsonConfig, MassMessageListContent, NewsletterContent, Scribunto, SecurePoll, Tabular.JsonConfig, css, flow-board, javascript, json, sanitized-css, text, translate-messagebundle, unknown, wikitext

templatesandboxcontentformat

Content format of templatesandboxtext.

One of the following values: application/json, application/octet-stream, application/unknown, application/x-binary, text/css, text/javascript, text/plain, text/unknown, text/x-wiki, unknown/unknown
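
A hedged sketch of how the templatesandbox parameters fit together: one request substitutes the text of a single template, the other redirects template lookups to sandbox prefixes. The helper names and all page titles are hypothetical; the limit of 50 is the one documented above for ordinary clients.

```python
MAX_PREFIXES = 50  # documented limit for clients without higher limits


def sandbox_text_params(page, template_title, template_text):
    """Parse page, using template_text in place of the page template_title."""
    return {
        "action": "parse",
        "format": "json",
        "page": page,
        "templatesandboxtitle": template_title,
        "templatesandboxtext": template_text,
        "templatesandboxcontentmodel": "wikitext",
    }


def sandbox_prefix_params(page, prefixes):
    """Parse page with one or more sandbox prefixes, separated by '|'."""
    if len(prefixes) > MAX_PREFIXES:
        raise ValueError("too many templatesandboxprefix values")
    return {
        "action": "parse",
        "format": "json",
        "page": page,
        "templatesandboxprefix": "|".join(prefixes),
    }


print(sandbox_text_params("Project:Sandbox", "Template:Example", "Draft text"))
print(sandbox_prefix_params("Project:Sandbox", ["User:Example/sandbox"]))
```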

Examples:

Parse a page: api.php?action=parse&page=Project:Sandbox
Parse wikitext: api.php?action=parse&text={{Project:Sandbox}}&contentmodel=wikitext
Parse wikitext, specifying the page title: api.php?action=parse&text={{PAGENAME}}&title=Test
Parse a summary: api.php?action=parse&summary=Some+[[link]]&prop=
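
The example requests above contain braces, brackets, and spaces, which should be percent-encoded when a query string is built by hand. The following sketch encodes the same four parameter sets with Python's standard library and shows that the empty prop value survives a round trip; the variable names are arbitrary.

```python
from urllib.parse import parse_qs, urlencode

examples = [
    {"action": "parse", "page": "Project:Sandbox"},
    {"action": "parse", "text": "{{Project:Sandbox}}", "contentmodel": "wikitext"},
    {"action": "parse", "text": "{{PAGENAME}}", "title": "Test"},
    {"action": "parse", "summary": "Some [[link]]", "prop": ""},
]

# urlencode percent-encodes reserved characters and turns spaces into '+'.
queries = [urlencode(params) for params in examples]
for query in queries:
    print("api.php?" + query)

# Decoding again; keep_blank_values preserves the empty prop parameter.
decoded = parse_qs(queries[3], keep_blank_values=True)
print(decoded)
```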

Example 1: Parse the content of a page

GET request

api.php?action=parse&page=Pet_door&format=json

Response

{
    "parse": {
        "title": "Pet door",
        "pageid": 3276454,
        "revid": 852892138,
        "text": {
            "*": "<div class=\"mw-parser-output\"><div class=\"thumb tright\"><div class=\"thumbinner\" style=\"width:222px;\"><a href=\"/wiki/File:Doggy_door_exit.JPG\" class=\"image\"><img alt=\"\" src=\"//upload.wikimedia.org/wikipedia/commons/thumb/7/71/Doggy_door_exit.JPG/220px-Doggy_door_exit.JPG\" width=\"220\" height=\"165\" class=\"thumbimage\" srcset=\"//upload.wikimedia.org/wikipedia/commons/thumb/7/71/Doggy_door_exit.JPG/330px-Doggy_door_exit.JPG 1.5x, 
            ...
        }
    }
}
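
To show where the HTML sits in a response shaped like the one above, here is a small sketch. `extract_html` is a hypothetical helper and the sample dictionary uses a shortened placeholder instead of the real page HTML. The fallback branch assumes that a request made with formatversion=2 returns the text as a plain string rather than under a "*" key.

```python
def extract_html(data):
    """Return the parsed HTML from an action=parse JSON response."""
    text = data["parse"]["text"]
    # Default format: {"*": "<html...>"}; assumed formatversion=2: plain string.
    return text["*"] if isinstance(text, dict) else text


sample = {
    "parse": {
        "title": "Pet door",
        "pageid": 3276454,
        "revid": 852892138,
        # Shortened placeholder, not the real page HTML.
        "text": {"*": '<div class="mw-parser-output"></div>'},
    }
}

print(extract_html(sample))
```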

Sample code

Python

#!/usr/bin/python3

"""
    parse.py

    MediaWiki API Demos
    Demo of `Parse` module: Parse content of a page

    MIT License
"""

import requests

S = requests.Session()

URL = "https://en.wikipedia.org/w/api.php"

PARAMS = {
    "action": "parse",
    "page": "Pet door",
    "format": "json"
}

R = S.get(url=URL, params=PARAMS)
DATA = R.json()

print(DATA["parse"]["text"]["*"])

PHP

<?php
/*
    parse.php

    MediaWiki API Demos
    Demo of `Parse` module: Parse content of a page

    MIT License
*/

$endPoint = "https://en.wikipedia.org/w/api.php";
$params = [
    "action" => "parse",
    "page" => "Pet door",
    "format" => "json"
];

$url = $endPoint . "?" . http_build_query( $params );

$ch = curl_init( $url );
curl_setopt( $ch, CURLOPT_RETURNTRANSFER, true );
$output = curl_exec( $ch );
curl_close( $ch );

$result = json_decode( $output, true );

echo( $result["parse"]["text"]["*"] );

JavaScript

/**
 * parse.js
 *
 * MediaWiki API Demos
 * Demo of `Parse` module: Parse content of a page
 *
 * MIT License
 */
 
const url = "https://en.wikipedia.org/w/api.php?" +
    new URLSearchParams({
        origin: "*",
        action: "parse",
        page: "Pet door",
        format: "json",
    });

try {
    const req = await fetch(url);
    const json = await req.json();
    console.log(json.parse.text["*"]);
} catch (e) {
    console.error(e);
}

MediaWiki JS

/**
 * parse.js
 *
 * MediaWiki API Demos
 * Demo of `Parse` module: Parse content of a page
 * MIT License
 */

const params = {
	action: 'parse',
	page: 'Pet door',
	format: 'json'
};
const api = new mw.Api();

api.get(params).done(data => {
	console.log(data.parse.text['*']);
});

Example 2: Parse a section of a page and fetch its table data

GET request

api.php?action=parse&page=Wikipedia:Unusual_articles/Places_and_infrastructure&prop=wikitext&section=5&format=json

Response

{
    "parse": {
        "title": "Wikipedia:Unusual articles/Places and infrastructure",
        "pageid": 38664530,
        "wikitext": {
            "*": "===Antarctica===\n<!--[[File:Grytviken church.jpg|thumb|150px|right|A little church in [[Grytviken]] in the [[Religion in Antarctica|Antarctic]].]]-->\n{| class=\"wikitable\"\n|-\n| '''[[Emilio Palma]]'''\n| An Argentine national who is the first person known to be born on the continent of Antarctica.\n|-\n| '''[[Scouting in the Antarctic]]'''\n| Always be prepared for glaciers and penguins.\n|}"
        }
    }
}

Sample code

parse_wikitable.py

#!/usr/bin/python3

"""
    parse_wikitable.py

    MediaWiki Action API Code Samples
    Demo of `Parse` module: Parse a section of a page, fetch its table data and save
    it to a CSV file

    MIT license
"""

import csv
import requests

S = requests.Session()

URL = "https://en.wikipedia.org/w/api.php"

TITLE = "Wikipedia:Unusual_articles/Places_and_infrastructure"

PARAMS = {
    'action': "parse",
    'page': TITLE,
    'prop': 'wikitext',
    'section': 5,
    'format': "json"
}


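def get_table():
    """ Parse a section of a page, fetch its table data and save it to a CSV file
    """
    res = S.get(url=URL, params=PARAMS)
    res.raise_for_status()
    data = res.json()
    wikitext = data['parse']['wikitext']['*']
    # Table rows are separated by "|-" in wikitext
    lines = wikitext.split('|-')
    entries = []

    for line in lines:
        line = line.strip()
        # Only chunks that start with a cell marker are table rows
        if line.startswith("|"):
            table = line[2:].split('||')
            cells = table[0].split("|")
            if len(cells) < 2:
                # Skip rows that do not have a title and a description cell
                continue
            # First cell: drop bold and link markup; second cell: description
            entry = cells[0].strip("'[]\n "), cells[1].strip()
            entries.append(entry)

    # newline="" lets the csv module control the line endings
    with open("places_and_infrastructure.csv", "w", newline="", encoding="utf-8") as file:
        writer = csv.writer(file)
        writer.writerows(entries)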

if __name__ == '__main__':
    get_table()

Possible errors


Code	Information
missingtitle	The page you specified doesn't exist.
nosuchsection	There is no section section in page.
pagecannotexist	Namespace doesn't allow actual pages.
invalidparammix	The parameters `page`, `pageid`, `oldid`, `text` can not be used together. The parameters `page`, `pageid`, `oldid`, `title` can not be used together. The parameters `page`, `pageid`, `oldid`, `revid` can not be used together.
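
Error responses carry the code and information from the table above under a top-level `error` key. A minimal way to surface them on the client side could look like this; the `ParseError` class, the `unwrap` function, and the sample dictionary are assumptions made for illustration.

```python
class ParseError(Exception):
    """Raised when the API response contains an error object."""

    def __init__(self, code, info):
        super().__init__(f"{code}: {info}")
        self.code = code
        self.info = info


def unwrap(data):
    """Return the 'parse' part of a response, or raise ParseError."""
    error = data.get("error")
    if error is not None:
        raise ParseError(error.get("code"), error.get("info"))
    return data["parse"]


sample_error = {
    "error": {
        "code": "missingtitle",
        "info": "The page you specified doesn't exist.",
    }
}

try:
    unwrap(sample_error)
except ParseError as exc:
    print(exc.code, "-", exc.info)
```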

Parameter history

v1.38: Introduced showstrategykeys
v1.32: Deprecated disabletidy
v1.31: Introduced disablestylededuplication
v1.30: Introduced revid, useskin, wrapoutputclass

See also