Extension:Collection/PDF Writer

mwlib.rl is a Python library for writing PDF documents from MediaWiki articles that were parsed by the mwlib library.

See the press release "Wikis Go Printable" for more information on this project.

No installation required!

The PDF Writer can run standalone on a server and provide PDF generation for multiple MediaWiki instances. A server for public testing and low-traffic wikis runs at http://tools.pediapress.com.

All you need is the Collection extension, which is configured to use this server by default.

Example

Solar system, example article from the English language Wikipedia, rendered as PDF using the PediaPress technology.

Technical

The PDF Writer uses the Python ReportLab libraries to generate PDF output from a DOM that the mwlib parser builds from MediaWiki markup. The Collection extension can be used to select and manage the articles that make up the resulting PDF.

Source

mwlib.rl is copyrighted by PediaPress and is distributed under a BSD license (see the included README.txt for details).

Install

Using easy_install

Make sure you have the required environment. On Debian systems:

apt-get install g++ perl python python-dev python-setuptools python-imaging python-lxml libevent-dev

Then download and install mwlib and mwlib.rl with easy_install (rehash is a csh/zsh builtin; in bash, use hash -r instead):

easy_install mwlib && rehash && easy_install mwlib.rl

RPM

On RPM-based distributions that provide yum, run yum search mwlib, then yum install mwlib.

Note: mwlib has several dependencies, which make it harder to compile from scratch.

Alternate Installation Instructions (works on Ubuntu)

The following commands install mwlib on Ubuntu (see http://mwlib.readthedocs.org/en/latest/installation.html).

Run the following as root:

apt-get install -y gcc g++ make python python-dev python-virtualenv libjpeg-dev libz-dev libfreetype6-dev liblcms-dev libxml2-dev libxslt-dev ocaml-nox git-core python-imaging python-lxml texlive-latex-recommended ploticus dvipng imagemagick pdftk

For Ubuntu 16.04.1 (Xenial Xerus) the above command becomes:

apt-get install -y gcc g++ make python python-dev python-virtualenv libjpeg-dev zlib1g-dev libfreetype6-dev libxml2-dev libxslt1-dev ocaml-nox git-core python-imaging python-lxml texlive-latex-recommended ploticus dvipng imagemagick pdftk liblcms2-dev

After that, switch to a regular user account and run:

virtualenv --distribute --no-site-packages ~/pp
export PATH=~/pp/bin:$PATH
hash -r
export PIP_INDEX_URL=http://pypi.pediapress.com/simple/
pip install pyfribidi mwlib mwlib.rl

Install texvc:

git clone https://github.com/pediapress/texvc
cd texvc; make; make install PREFIX=~/pp
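After the steps above, it can be useful to confirm that the command-line programs ended up on the PATH of the virtualenv. The following is a minimal sketch, not part of mwlib: the helper names missing_programs and report are hypothetical, and the list of program names is simply taken from the commands that appear elsewhere on this page, so adjust it to whatever your installed version provides.

```python
import shutil

# Program names taken from the commands shown on this page (an assumption
# about what your mwlib version installs); adjust the list as needed.
EXPECTED_PROGRAMS = ["mw-render", "nserve", "mw-qserve", "nslave", "postman"]


def missing_programs(names, which=shutil.which):
    """Return the names that cannot be found on the current PATH."""
    return [name for name in names if which(name) is None]


def report(names=EXPECTED_PROGRAMS, which=shutil.which):
    """Build a one-line, human-readable summary of the PATH check."""
    missing = missing_programs(names, which)
    if missing:
        return "missing from PATH: " + ", ".join(missing)
    return "all expected programs found"
```

Calling print(report()) from a shell in which ~/pp/bin is on the PATH shows whether anything is missing. The which parameter exists only so the check can be exercised without a real installation.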

Custom render server

To run a custom render server (for example, when you have a local MediaWiki instance), install mwlib as described above and then follow the instructions here [1].

According to one contributor, the following commands must be executed for the server to run:

nslave.py --cachedir ~/cache/

Alternatively, you can execute all of these commands in one line (tested on Ubuntu):

nserve & mw-qserve & nslave --cachedir ~/cache/ & postman &

Once the above commands are running on your render server and your LocalSettings.php is correct, you should be able to print to PDF using the Collection extension. As another contributor suggested, this may be voodoo, but if errors occur, restarting the Linux server and re-entering these commands sometimes helps.

You can put them in a shell script to make startup easier. If you use these default commands, you'll have a render server listening, but configuring LocalSettings.php is not that simple if you are running your MediaWiki instance on localhost.
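As an alternative to a shell script, the four processes from the one-line command could be started from a small Python launcher. This is only a sketch under stated assumptions: the function names service_commands, start_services, and stop_services are hypothetical, the program names and the --cachedir flag are copied from the command line shown earlier, and nothing here restarts a process that crashes.

```python
import os
import subprocess


def service_commands(cachedir="~/cache/"):
    """Return argv lists for the four processes named in the one-line command."""
    return [
        ["nserve"],
        ["mw-qserve"],
        # Popen does not expand "~" the way a shell does, so expand it here.
        ["nslave", "--cachedir", os.path.expanduser(cachedir)],
        ["postman"],
    ]


def start_services(commands, launcher=subprocess.Popen):
    """Start every command in the background and return the process handles."""
    return [launcher(argv) for argv in commands]


def stop_services(processes):
    """Ask each started process to terminate, then wait for it to exit."""
    for proc in processes:
        proc.terminate()
    for proc in processes:
        proc.wait()
```

A typical use would be processes = start_services(service_commands()), followed later by stop_services(processes). Keeping the handles makes a clean restart possible without rebooting the machine.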

For PDF generation to work, you have to set the following variables in LocalSettings.php:

// Your MediaWiki server (near the beginning of the file)
$wgServer="http://LAN_IP | PUBLIC_IP | HOST_NAME";
// localhost, 127.0.xx, and 192.168.xx won't work, as far as I know (from the mailing list), for security reasons
// I successfully tested with $wgServer="";

// This goes after the Collection extension is included, usually at the bottom of the file (localhost is also allowed here)
$wgCollectionMWServeURL = '';
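Going by the note above that localhost, 127.0.xx, and 192.168.xx values reportedly do not work for $wgServer, a quick check of a candidate URL might look like the following. This is a hedged sketch: looks_local is a hypothetical helper, it only reflects that note (plus Python's own definition of loopback and private address ranges, which covers more than 192.168.x.x), and it says nothing about how the render server itself decides what to accept.

```python
import ipaddress
from urllib.parse import urlparse


def looks_local(server_url):
    """Return True if the URL's host is localhost, loopback, or a private IP."""
    host = urlparse(server_url).hostname
    if not host:
        return False  # no host present, so there is nothing to judge
    if host == "localhost":
        return True
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # A host name rather than a literal IP address; this sketch does
        # not resolve names, so it cannot classify them.
        return False
    return address.is_loopback or address.is_private
```

For example, looks_local("http://192.168.0.10/") flags the address, while a public host name such as http://wiki.example.org/ (a placeholder) passes through unflagged.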

Mailing List

We have set up a Google group for discussion of mwlib.rl. You can subscribe to it by email: mwlib-subscribe@googlegroups.com.

Help Needed

Please help us translate some strings used in the generated PDF. The process of internationalisation is done at translatewiki.net. We appreciate your help there.

Programs

mwlib installs the following programs:

generates documents in formats like PDF or ODF from MediaWiki articles
generates ZIP files from MediaWiki articles that contain all information to produce some output document like a PDF file
starts a render server that allows the Collection extension to render documents from article collections

Configuration

If your MediaWiki has the MediaWiki API enabled, you just specify the base URL of the wiki as the configuration. For example, using the English Wikipedia, the command

$ mw-render --config http://en.wikipedia.org/w/ --username='xxxx' --password='yyyy' --output test.pdf --writer rl Physics

will produce a PDF document containing the article Physics.
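When rendering is driven from a script, the same command line can be assembled and run from Python. The sketch below is not an mwlib API: build_render_command and render are hypothetical helpers that only wrap the mw-render invocation shown above, using the flags from that example (--config, --username, --password, --output, --writer). Passing more than one article title is an assumption about mw-render that you should verify against your installed version.

```python
import subprocess


def build_render_command(base_url, articles, output, writer="rl",
                         username=None, password=None):
    """Return the mw-render argv for the given wiki and article titles."""
    argv = ["mw-render", "--config", base_url]
    # Credentials are optional; they are only added when supplied.
    if username is not None:
        argv.append("--username=" + username)
    if password is not None:
        argv.append("--password=" + password)
    argv += ["--output", output, "--writer", writer]
    argv += list(articles)
    return argv


def render(base_url, articles, output, **options):
    """Run mw-render; raises CalledProcessError if it exits with an error."""
    command = build_render_command(base_url, articles, output, **options)
    subprocess.run(command, check=True)
```

For instance, render("http://en.wikipedia.org/w/", ["Physics"], "test.pdf") runs the same command as the example above, minus the credentials. Building the argument list separately from running it keeps the command easy to log or inspect.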

Customization

It is possible to customize the resulting PDFs. For more information, see README.rst.

See also