Extension:PandocUltimateConverter

MediaWiki extensions manual
PandocUltimateConverter
Release status: beta
Implementation: Special page
Description: Pandoc converter extension for MediaWiki that imports not only text but also images
Author(s): Urfiner (Nikolai Kochkin)
Latest version: 0.2.1
MediaWiki: 1.39+
License: MIT License

PandocUltimateConverter is a MediaWiki extension that converts document files (.docx, .odt, and other formats) to wikitext. It imports images as well as text. It is heavily inspired by Microsoft's PandocUpload extension but was written from scratch to support MediaWiki 1.41 and to include image import.

You can see a demo here.

Installation

Installation takes a few more steps than usual:

  1. Install pandoc.
  2. Download the extension.
  3. Load the extension in LocalSettings.php: wfLoadExtension( 'PandocUltimateConverter' );
  4. (Optional) Configure the path to the pandoc binary: $wgPandocUltimateConverter_PandocExecutablePath = 'C:\Program Files\Pandoc\pandoc.exe'; The extension works without this parameter if pandoc is in PATH.
  5. (Optional) Configure the path to a temp folder where pandoc stores images before upload: $wgPandocUltimateConverter_TempFolderPath = 'D:\_TMP'; If this parameter is not specified, the extension tries to use the default system temp folder.
  6. Allow the additional file extensions to be uploaded to MediaWiki:
    $wgFileExtensions[] = 'docx';
    $wgFileExtensions[] = 'odt';
    // You can specify other required extensions as well
    
  7. Enable uploads if they are not already enabled:
    $wgEnableUploads = true;
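
The configuration steps above can be collected into a single LocalSettings.php fragment. This is a minimal sketch, not a definitive setup: the Linux paths `/usr/bin/pandoc` and `/tmp/pandoc-images` are assumptions for illustration and should be replaced with the locations on your server (or omitted, since both path settings are optional).

```php
// LocalSettings.php fragment (sketch)

// Load the extension.
wfLoadExtension( 'PandocUltimateConverter' );

// Optional: path to the pandoc binary. Not needed if pandoc is in PATH.
// '/usr/bin/pandoc' is an assumed location, adjust it for your system.
$wgPandocUltimateConverter_PandocExecutablePath = '/usr/bin/pandoc';

// Optional: temp folder where pandoc stores images before upload.
// '/tmp/pandoc-images' is an assumed location; the system temp folder
// is used when this setting is left out.
$wgPandocUltimateConverter_TempFolderPath = '/tmp/pandoc-images';

// Allow the source document types to be uploaded.
$wgFileExtensions[] = 'docx';
$wgFileExtensions[] = 'odt';

// Uploads must be enabled for the converter to work.
$wgEnableUploads = true;
```

On Windows, single-quoted PHP strings such as 'C:\Program Files\Pandoc\pandoc.exe' can be used as-is, because a backslash inside single quotes is only special before another backslash or a single quote.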
    

Usage

Follow these steps:

  1. Go to the Special:PandocUltimateConverter page.
  2. Choose the source type: file or URL.
  3. Specify the file (or URL) to convert and the target page name.
  4. After the conversion finishes, you are redirected to the target page.
    • The source file is automatically removed from the wiki.
    • All images are automatically uploaded to MediaWiki under the name "Pandocultimateconverter-{guid}-{imageOriginalNameAndExtension}".
    • If an image is already present on the wiki, the duplicate is not uploaded; the existing image is used instead.
    • All images are automatically removed from the temp folder.
The target page and all images will be overwritten if they already exist.

Limitations

Consider the following limitations:

  1. The extension was tested on Windows and Linux.
  2. The extension was tested on MediaWiki 1.39, 1.40, 1.41, and 1.42.
  3. The list of supported formats can be found on the Pandoc website. For example, pandoc supports the docx and odt formats as input but does not support pdf.
  4. When converting from a URL, pandoc may add a lot of extraneous data to the target page; this is known pandoc behavior.
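
To check which formats the locally installed pandoc can read (limitation 3), one option is to query the binary itself. The following is a standalone sketch, not part of the extension; it assumes pandoc is reachable through PATH and relies on pandoc's `--list-input-formats` option.

```php
<?php
// Sketch: ask the local pandoc which input formats it supports.
// Assumes the pandoc binary is reachable through PATH.
$output = shell_exec( 'pandoc --list-input-formats' );

if ( !is_string( $output ) ) {
    echo "Could not run pandoc; check that it is installed and in PATH.\n";
    exit( 1 );
}

// Pandoc prints one format name per line; trim() also strips "\r" on Windows.
$formats = array_filter( array_map( 'trim', explode( "\n", $output ) ) );

// Report on a few formats mentioned in this manual.
foreach ( [ 'docx', 'odt', 'pdf' ] as $candidate ) {
    $status = in_array( $candidate, $formats, true ) ? 'supported' : 'not supported';
    echo "$candidate: $status\n";
}
```

Run it from the command line with `php` on the server that hosts the wiki, so that the result reflects the same pandoc installation the extension uses.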

Advanced configuration

Additional configuration options:

  1. $wgPandocUltimateConverter_MediaFileExtensionsToSkip = [ 'emf' ]; An array of file extensions that should not be uploaded to MediaWiki. For example, EMF images are not supported on the web, so there is no reason to upload them. Matching is case-insensitive.
  2. The legacy global settings $wgPandocExecutablePath and $wgPandocTmpFolderPath still work, but we recommend switching to $wgPandocUltimateConverter_PandocExecutablePath and $wgPandocUltimateConverter_TempFolderPath.
  3. You can restrict access to the extension by setting $wgPandocUltimateConverter_PandocCustomUserRight to the required permission. For example, $wgPandocUltimateConverter_PandocCustomUserRight = 'nominornewtalk'; should prohibit access for non-bots.
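
The options above might be combined in LocalSettings.php as follows. This is a hedged sketch: the extra 'wmf' entry and the right name 'pandoc-convert' are hypothetical examples, and it is an assumption that the extension accepts a right defined this way in the same manner as the built-in 'nominornewtalk' right shown above. $wgAvailableRights and $wgGroupPermissions are standard MediaWiki settings.

```php
// LocalSettings.php fragment (sketch)

// Skip media files that browsers cannot display. Matching is
// case-insensitive, so 'emf' also covers 'EMF'.
// 'wmf' is a hypothetical extra entry, added only for illustration.
$wgPandocUltimateConverter_MediaFileExtensionsToSkip = [ 'emf', 'wmf' ];

// Restrict the special page to users holding a given right.
// 'pandoc-convert' is a hypothetical right name, not one defined
// by the extension.
$wgPandocUltimateConverter_PandocCustomUserRight = 'pandoc-convert';

// Register the hypothetical right and grant it to administrators.
$wgAvailableRights[] = 'pandoc-convert';
$wgGroupPermissions['sysop']['pandoc-convert'] = true;
```

Using a dedicated right rather than an existing one keeps access to the converter independent of other permissions, at the cost of having to grant it explicitly to each group.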

See also
