Mediawiki-utilities

MediaWiki-utilities is a collection of simple, sharp tools for extracting and processing MediaWiki data using Python. The libraries are inspired by the Unix philosophy: each one is designed to do one thing and do it well, and they are designed to work together. Where applicable, they also include Unix-style command-line utilities that handle text streams, because that is a universal interface.
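As a rough illustration of the text-stream style described above, here is a minimal sketch of a filter that reads one JSON revision record per line and writes one tab-separated line per record. It uses only the Python standard library and is not taken from any of the libraries listed on this page. The field names `page_title` and `text` and the function names are assumptions made for the example.

```python
import json
import sys


def revisions_to_tsv(lines):
    """Yield one tab-separated row (title, text length) per JSON record."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines in the stream
        rev = json.loads(line)
        # "page_title" and "text" are hypothetical field names for this sketch.
        yield "{0}\t{1}".format(rev["page_title"], len(rev["text"]))


def main(instream=sys.stdin, outstream=sys.stdout):
    """Read JSON lines from instream and write TSV rows to outstream."""
    for row in revisions_to_tsv(instream):
        outstream.write(row + "\n")
```

To use this in a shell pipeline, call `main()` at the bottom of the script. Because input and output are plain text streams, the script can then be chained with standard tools such as `sort` or `head`.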

  • mwparserfromhell (source, docs) -- an easy-to-use and outrageously powerful parser for MediaWiki wikicode.
  • mwparserfromhtml (source, docs) -- a parser for Wikimedia Enterprise (Parsoid) HTML dumps inspired by mwparserfromhell.
  • mwedittypes (source, docs) -- a diff engine that generates structured details about changes between two revisions of wikitext.
  • mwtokenizer (source, docs) -- a tokenizer for splitting plain text into sentences and words that works for (almost) all languages in Wikipedia.
  • pywikibase (source) -- a set of types for handling the Wikibase data model (item, property, claim, etc.)

See also

  • pywikibot (source, docs) -- a monolithic collection of tools that automate work on MediaWiki sites.
  • wikicmd (source, docs) -- a command-line utility for working with MediaWiki websites.

Resources
