Parsoid/JS/Setup

Before you begin

If you are using Parsoid in conjunction with VisualEditor, note that they are developed in parallel and upgrades to one often require a corresponding upgrade to the other. If in doubt, please check the Extension:VisualEditor page and follow its setup instructions first.

Of course, Parsoid can be used stand-alone (to convert wikitext to HTML DOM and vice-versa) and if you don't need VisualEditor, you can ignore the above.

If you are a developer, or if you lack access to sudo apt-get because you're on shared hosting, you probably want to follow the Developer Setup instructions. This page documents setup for a typical user of Parsoid using the native software packaging for your operating system. (Although, if your preferred operating system is not listed here, you might try the developer setup instructions — good luck!)

If your shared hosting doesn't allow you to install Parsoid, you can also try this workaround: Installation on a shared host.

If you run into problems, consult the troubleshooting pages. If you'd like to contribute your problems and suggested solutions to others, we encourage you to add that information to the troubleshooting pages, in order to keep this page of typical installation instructions as clear and simple as possible.

First you will need to install Parsoid. After this is done, skip to the #Configuration section of this page to ensure that Parsoid can talk to your MediaWiki instance.

Please note that Parsoid 0.10.0 may not work with node 12, as described on Phabricator.
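
The version notes on this page can be turned into a quick pre-install check. The sketch below is not part of Parsoid: checkNodeVersion is a hypothetical helper, the minimum of 8 comes from the "nodejs ≥ v8.x" requirement mentioned for the packages, and treating every major version from 12 upwards as risky is an assumption (this page only mentions node 12).

```javascript
// Sketch only: thresholds are taken from this page, not from Parsoid itself.
const MIN_MAJOR = 8;    // this page asks for nodejs >= v8.x
const RISKY_MAJOR = 12; // Parsoid 0.10.0 may not work with node 12 (assumed: nor with later majors)

// checkNodeVersion is a hypothetical helper, not a Parsoid API.
function checkNodeVersion(version) {
  const major = parseInt(String(version).replace(/^v/, '').split('.')[0], 10);
  if (Number.isNaN(major)) {
    return { ok: false, message: 'Unrecognised version string: ' + version };
  }
  if (major < MIN_MAJOR) {
    return { ok: false, message: 'Node.js ' + major + '.x is older than ' + MIN_MAJOR + '.x' };
  }
  if (major >= RISKY_MAJOR) {
    return { ok: true, message: 'Node.js ' + major + '.x may not work with Parsoid 0.10.0' };
  }
  return { ok: true, message: '' };
}

// Check the interpreter that is running this script.
const result = checkNodeVersion(process.version);
console.log(result.ok ? 'OK' : 'Problem', result.message);
```

Run it with the same node binary that will run Parsoid, since a system may have several Node.js versions installed.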

Installation

Ubuntu / Debian

Parsoid switched package repositories on 2015-09-17. If you installed Parsoid before that date, follow the instructions below to add the new Parsoid repository and get updated packages. New installations follow the same instructions.
Parsoid updated its GPG key on 2016-07-27 and again on 2019-06-13. If you installed Parsoid before the most recent of these dates, follow the instructions below to add the new Parsoid GPG key before you can get updated packages. New installations follow the same instructions.

These packages work on all architectures and with current distros: Ubuntu 14+ and Debian testing, unstable or wheezy (stable) with backports enabled. See the manual installation on Linux or Mac OS X instructions if your distribution is older and doesn't have nodejs ≥ v8.x available.

Import the repository GPG key (created on 2019-06-13, valid until 2029-04-23):

sudo apt install dirmngr
sudo apt-key advanced --keyserver keys.gnupg.net --recv-keys AF380A3036A03444

If the last command above doesn't work (gpgkeys: key AF380A3036A03444 cannot be retrieved), you can try another key server. This one should work:

sudo apt-key advanced --keyserver pgp.mit.edu --recv-keys AF380A3036A03444

or, if a firewall is blocking the keyserver port, use port 80:

sudo apt-key advanced --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys AF380A3036A03444

Add the Wikimedia repository:

  • Ubuntu
sudo apt-add-repository "deb https://releases.wikimedia.org/debian jessie-mediawiki main"

On Ubuntu 16.04 you may need to install software-properties-common in order to run apt-add-repository:

sudo apt-get install software-properties-common
  • Debian

With Jessie and Stretch, first make sure you have backports enabled, because nodejs 4.x is too old to run the latest release of Parsoid. Then add the Parsoid repository:

echo "deb https://releases.wikimedia.org/debian jessie-mediawiki main" | sudo tee /etc/apt/sources.list.d/parsoid.list

Install:

sudo apt install apt-transport-https
sudo apt update && sudo apt install parsoid

Then open the config file at /etc/mediawiki/parsoid/config.yaml and update it to reflect your API URL. See the #Configuration section below for details.

Notes:

  • Changes to the configuration file only take effect after you restart the service with service parsoid restart.
  • The repository will contain the latest available version of Parsoid. Older versions can be installed manually.
  • The default port is 8142, not 8000, so you will need to change, for example, $wgVirtualRestConfig['modules']['parsoid']['url'] in LocalSettings.php to match.
  • The log file is /var/log/parsoid/parsoid.log, and is automatically rotated.

Caveats about the deb:

  • With nodejs version 4.x (the default version included in Debian), Parsoid may fail to start. Upgrading nodejs may be required to solve the issue (version 10.x worked in one case).
  • If you are on an older distribution and nodejs >= v4 is not available, see the Nodejs installation instructions. You might be able to get a recent packaged version of nodejs. If you have to install node.js from source (we recommend nave), you'll need to use the Parsoid/Developer Setup instructions.
  • Some users report that you should also ensure that curl is installed: sudo apt-get install curl. Please add more details here if you find this to be true on your setup.
    • 2018-09-20: during an upgrade from MediaWiki 1.29 to 1.31, with curl already installed, testing VisualEditor produced this error:
Error loading data from server: internal_api_error_Exception: [0db2f13b5ceecfae5a4c1a98] Exception caught: PHP cURL extension missing. Check https://www.mediawiki.org/wiki/Manual:CURL. Would you like to retry?

This was solved with:

apt install php-curl
systemctl reload apache2

The troubleshooting section at https://www.mediawiki.org/wiki/Extension:VisualEditor#Troubleshooting mentions a similar issue.

Arch

Parsoid is available in AUR under aur/parsoid (release version) or aur/parsoid-git (development version). Install however you would usually install AUR packages. Enable and start the parsoid service (systemctl enable parsoid; systemctl start parsoid) and configure per below. Remember to restart the service for changes to take effect.

This installs to /usr/share/webapps/parsoid/ by default.

RedHat/CentOS 7

Parsoid is available from the Git repository.

Create an empty directory and check out a copy of Parsoid into it:

yum install git npm
git clone --recursive https://gerrit.wikimedia.org/r/mediawiki/services/parsoid/deploy
git clone https://gerrit.wikimedia.org/r/mediawiki/services/parsoid
cd parsoid
npm install

If you want to use a specific branch, use something like

git clone --single-branch --branch v0.9.0 --recursive https://gerrit.wikimedia.org/r/mediawiki/services/parsoid/deploy
git clone --single-branch --branch v0.9.0 https://gerrit.wikimedia.org/r/mediawiki/services/parsoid

This should dump everything you need into the current directory. Copy the default configuration (which configures the Parsoid server to listen at http://localhost:8000):

cp config.example.yaml config.yaml

Edit this file per #Configuration (below), then start the server with:

node bin/server.js

At this point, opening localhost:8000 in a browser should display a page with links to the Parsoid documentation on www.mediawiki.org.

As one final step, change your startup files (init.d) to add a task to relaunch node bin/server.js on server startup.

RedHat/CentOS 8

dnf install git npm make python2 gcc-c++ -y
cd /opt
git clone --recursive https://gerrit.wikimedia.org/r/mediawiki/services/parsoid/deploy
git clone https://gerrit.wikimedia.org/r/mediawiki/services/parsoid
cd parsoid
PYTHON=python2 npm install
cp config.example.yaml config.yaml
chown apache:apache /opt/parsoid

Edit config.yaml. You may need to change the uri parameter, although the default localhost URI often works as is.

Create this systemd unit file:

# /etc/systemd/system/parsoid.service
[Unit]
Description=Mediawiki Parsoid web service on node.js
Documentation=http://www.mediawiki.org/wiki/Parsoid
Wants=local-fs.target network-online.target
After=local-fs.target network-online.target

[Install]
WantedBy=multi-user.target

[Service]
Type=simple
User=apache
Group=apache
WorkingDirectory=/opt/parsoid
ExecStart=/usr/bin/node /opt/parsoid/bin/server.js
KillMode=process
Restart=on-failure
PrivateTmp=true
StandardOutput=syslog

Create /etc/firewalld/services/parsoid.xml:

<?xml version="1.0" encoding="utf-8"?>
<service>
  <short>Parsoid</short>
  <description>Wikitext converter for Mediawiki</description>
  <port protocol="tcp" port="8000"/>
</service>

Publish, add and activate the newly defined firewalld service.

firewall-cmd --reload
firewall-cmd --add-service=parsoid --permanent
firewall-cmd --reload

Enable the systemd service and verify that Parsoid is reachable:

systemctl enable parsoid --now
curl http://localhost:8000

Vagrant

If you are using the MediaWiki-Vagrant virtual machine, the parsoid role sets up a working Parsoid. The visualeditor role enables Parsoid as well.

Windows

Requirements:

  • Install Nodejs x86
  • Install Git x86

Once Node.js is installed, install the build tools from a 32-bit PowerShell session running as administrator (this may take a while):

npm install --global --production windows-build-tools

You may need to update npm to avoid errors:

npm -g install npm@latest

npm@next is broken. See https://github.com/npm/npm/issues/16037

If your current directory is C:\windows\system32, run the following command:

cd $env:userprofile

Install Parsoid:

npm install parsoid

Note: if you get the error 'no such file or directory' for 'c:\user\{username}\package.json', run the following:

npm init

or enter the directory in which the package.json file should be located:

%USERPROFILE%\node_modules\parsoid\

Copy the default config and configure Parsoid:

copy %USERPROFILE%\node_modules\parsoid\localsettings.example.js %USERPROFILE%\node_modules\parsoid\localsettings.js

If you use the config.yaml configuration file, copy that file from the example:

copy %USERPROFILE%\node_modules\parsoid\config.example.yaml %USERPROFILE%\node_modules\parsoid\config.yaml

and edit the content according to your wiki installation.

Run Parsoid:

C:\Users\USERNAME\node_modules\parsoid>node bin/server.js

or on Windows 10+

cd %APPDATA%\npm\node_modules\parsoid
node bin/server.js

Open your php.ini file and uncomment the following PHP extensions:

  • extension=php_curl.dll
  • extension=php_openssl.dll

Otherwise you will run into problems with Parsoid.

Docker

This is a community-created version of Parsoid. The original repository can be found at TheNets's GitHub.

Versions available: 0.8, 0.9, 0.10, 0.11.

The images are built on the Alpine image.

Requirements

How to run

To start Parsoid run the command below. Just pay attention to the MediaWiki version and choose a compatible Parsoid version.

# For MediaWiki <= 1.30
docker run -d -p 8080:8000 -e PARSOID_DOMAIN_localhost=http://localhost/w/api.php thenets/parsoid:0.8

# For MediaWiki >= 1.31 & <= 1.32
docker run -d -p 8080:8000 -e PARSOID_DOMAIN_localhost=http://localhost/w/api.php thenets/parsoid:0.10

# For MediaWiki >= 1.33
docker run -d -p 8080:8000 -e PARSOID_DOMAIN_localhost=http://localhost/w/api.php thenets/parsoid:0.11
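
The version pairing in the commands above can be expressed as a small lookup. This is only a sketch: parsoidTagFor is a hypothetical helper, and the mapping simply mirrors the three comments above; it says nothing about MediaWiki versions released after this page was written.

```javascript
// parsoidTagFor is a hypothetical helper; the mapping mirrors the commands above.
function parsoidTagFor(mediawikiVersion) {
  const parts = String(mediawikiVersion).split('.').map(Number);
  const major = parts[0];
  const minor = parts[1];
  if (!Number.isInteger(major) || !Number.isInteger(minor)) {
    throw new Error('Expected a version like "1.31", got: ' + mediawikiVersion);
  }
  // Compare as (major, minor) numbers so that 1.9 sorts before 1.30.
  const atLeast = (maj, min) => major > maj || (major === maj && minor >= min);
  if (atLeast(1, 33)) return '0.11'; // MediaWiki >= 1.33
  if (atLeast(1, 31)) return '0.10'; // MediaWiki 1.31 and 1.32
  return '0.8';                      // MediaWiki <= 1.30
}

console.log('thenets/parsoid:' + parsoidTagFor('1.31'));
```

Comparing the version components as numbers rather than as strings avoids ordering mistakes such as treating "1.9" as newer than "1.30".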

Examples

How to add more than one domain:

docker run -d -p 8080:8000 \
            -e PARSOID_DOMAIN_foobar=http://foobar.com/w/api.php \
            -e PARSOID_DOMAIN_example=http://example.com/w/api.php \
            -e PARSOID_DOMAIN_localhost=http://localhost/w/api.php \
            thenets/parsoid:0.11

How to expose Parsoid on a specific port (you can use any port number that is not already in use):

# Expose port 8081
docker run -d -p 8081:8000 -e PARSOID_DOMAIN_localhost=http://localhost/w/api.php thenets/parsoid:0.11

# Expose port 8142
docker run -d -p 8142:8000 -e PARSOID_DOMAIN_localhost=http://localhost/w/api.php thenets/parsoid:0.11
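
When several wikis are involved, the docker run arguments shown in the examples above can be generated rather than typed by hand. The sketch below only assembles the argument list; buildDockerArgs is a hypothetical helper, and restricting domain names to letters, digits and underscores is an assumption made so that the result is a valid environment variable name.

```javascript
// buildDockerArgs is a hypothetical helper that assembles the arguments
// for the `docker run` commands shown above. It does not run Docker.
function buildDockerArgs(domains, hostPort, tag) {
  // The container listens on 8000; hostPort is the port exposed on the host.
  const args = ['run', '-d', '-p', hostPort + ':8000'];
  for (const [name, apiUrl] of Object.entries(domains)) {
    // Assumption: names must be usable inside an environment variable name.
    if (!/^[A-Za-z0-9_]+$/.test(name)) {
      throw new Error('Unsupported domain name: ' + name);
    }
    args.push('-e', 'PARSOID_DOMAIN_' + name + '=' + apiUrl);
  }
  args.push('thenets/parsoid:' + tag);
  return args;
}

const args = buildDockerArgs(
  { foobar: 'http://foobar.com/w/api.php', localhost: 'http://localhost/w/api.php' },
  8080,
  '0.11'
);
console.log('docker ' + args.join(' '));
```

Returning an array instead of a single string makes it straightforward to pass the arguments to a process launcher without shell quoting issues.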

For more information about Docker setup, check the GitHub page.

Configuration

Parsoid should be configured via edits to the static config.yaml file. If you are using a version of Parsoid older than v0.6.0, you will have to edit localsettings.js, which also allows non-static configs for the time being.

You can find an example configuration file on GitHub.

Starting with Parsoid 0.6.0, the configuration file is located here:

  • /etc/mediawiki/parsoid/config.yaml

If the api.php file for your wiki is not at the default 'http://localhost/w/api.php', edit the config.yaml file and modify the uri parameter to point to the correct location:

services:
  - module: lib/index.js
    entrypoint: apiServiceWorker
    conf:
        mwApis:
          # This is the only required parameter,
          # the URL of your MediaWiki API endpoint.
        - uri: 'http://yoursite.com/w/api.php'
          # The "domain" is used for communication with Visual Editor
          # and RESTBase.  It defaults to the hostname portion of
          # the `uri` property above, but you can manually set it
          # to an arbitrary string.
          domain: 'yoursite.com'  # optional

The uri property gives the API path to your local wiki. The domain property is optional; it defaults to the hostname used in the uri property if not explicitly set, but it can be an arbitrary string (it doesn't actually have to resolve in DNS).
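
The defaulting rule for domain can be illustrated with a few lines of JavaScript. This is a sketch of the behaviour described above, not Parsoid's own code: effectiveDomain is a hypothetical helper, and whether Parsoid keeps a port number in the default is not covered here (this sketch uses the hostname without the port).

```javascript
// effectiveDomain is a hypothetical helper that mimics the documented default:
// use `domain` when it is set, otherwise the hostname portion of `uri`.
function effectiveDomain(entry) {
  if (!entry || !entry.uri) {
    throw new Error('Each mwApis entry needs a uri');
  }
  // Assumption: the default is the hostname without any port number.
  return entry.domain || new URL(entry.uri).hostname;
}

// No explicit domain: falls back to the hostname in the uri.
console.log(effectiveDomain({ uri: 'http://yoursite.com/w/api.php' }));
// Explicit domain: an arbitrary string that does not need to resolve in DNS.
console.log(effectiveDomain({ uri: 'http://localhost/w/api.php', domain: 'wiki2' }));
```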

If your wiki is behind a reverse proxy or similar, you can set the hostname in uri to an internal hostname or IP address that points directly at your wiki, so that requests are not sent to the public IP address and then routed back to the internal server. However, make sure that your web server actually routes requests with that hostname to the wiki (if the server is configured to serve different content for multiple sites or subdomains).

For example: you have a wiki at www.example.com with a public IP behind a proxy, and the actual application server is on a second internal server which only serves requests to the wiki when accessed from the www.example.com hostname but not with other hostnames (because it serves various different websites). You may need to set an /etc/hosts alias of internal.www.example.com and set up your web server to have internal.www.example.com as an alias to www.example.com. Then you can use that hostname in the uri property.

By default, Parsoid opens a UDP socket and sends some metrics about the Parsoid heap to a statsD server every minute. If you would rather send these metrics to the logging backend (with the "trace" log level), add the following to config.yaml:

metrics:
    type: log

localsettings.js (or settings.js) as configuration file

If you prefer to use localsettings.js as your configuration file, in the config.yaml file uncomment the localsettings path like this:

services:
  - module: lib/index.js
    entrypoint: apiServiceWorker
    conf:
        # For backwards compatibility, and to continue to support non-static
        # configs for the time being, optionally provide a path to a
        # localsettings.js file.  See localsettings.example.js
        localsettings: ./localsettings.js

and comment out the mwApis, uri and domain parameters like this:

        #mwApis:
        #- # This is the only required parameter,
          # the URL of your MediaWiki API endpoint.
          #uri: 'http://localhost/w/api.php'
          # The "domain" is used for communication with Visual Editor
          # and RESTBase.  It defaults to the hostname portion of
          # the `uri` property above, but you can manually set it
          # to an arbitrary string.
          #domain: 'localhost'  # optional

In this approach the configuration file lives in one of the following locations:

  • /etc/mediawiki/parsoid/settings.js (if you have installed from our Linux packages)
  • <parsoid directory>/api/localsettings.js (if you have followed the developer setup instructions)

Most configuration options are described in the file itself. The only required edit is to update it to reflect your API URL, something like:

parsoidConfig.setMwApi({ uri: 'http://yoursite.com/w/api.php', domain: 'yoursite.com', prefix: 'myspecialwiki' });

The prefix is an arbitrarily selected short string identifying your local wiki, used in log messages. It is also optional: an arbitrary unique string will be generated if it is omitted. Make sure that the VisualEditor configuration uses the same "domain" and/or "prefix" values as Parsoid. (See the VisualEditor configuration instructions.)
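
As a rough sketch of where the setMwApi call sits, the snippet below assumes that localsettings.js exports a setup function which receives the parsoidConfig object, as in localsettings.example.js. Treat that shape as an assumption and check the example file shipped with your Parsoid version. The stub object at the end is purely illustrative and only records what was registered.

```javascript
// Assumed shape of localsettings.js: a `setup` function that receives
// the parsoidConfig object. Verify against localsettings.example.js.
function setup(parsoidConfig) {
  parsoidConfig.setMwApi({
    uri: 'http://yoursite.com/w/api.php', // required: your MediaWiki API endpoint
    domain: 'yoursite.com',               // optional: must match the VisualEditor configuration
    prefix: 'myspecialwiki'               // optional: short identifier used in log messages
  });
}

// Export it the way a CommonJS settings file would (skipped outside CommonJS).
if (typeof exports !== 'undefined') {
  exports.setup = setup;
}

// Illustration only: a stand-in for parsoidConfig that records each call.
const registered = [];
setup({ setMwApi: (api) => registered.push(api) });
console.log(registered);
```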


Multiple wikis sharing the same Parsoid service

If you have multiple wikis, make sure the domain string (and/or prefix if you are using localsettings.js as config file) is unique for each. Multiple wikis sharing a single host might need to explicitly set the domain property to an arbitrary unique string for each wiki.

Example of configuration in config.yaml for multiple wikis:

services:
  - module: lib/index.js
    entrypoint: apiServiceWorker
    conf:
        mwApis:
        - # First wiki
          uri: 'http://yoursite.com/w/api.php'
          domain: 'yoursite.com'  # optional
        - # If you have another wiki on a different domain
          uri: 'http://yourothersite.com/w/api.php'
          domain: 'yourothersite.com'  # optional
        - # If you have another wiki on the same domain
          uri: 'http://yoursite.com/w2/api.php'
          domain: 'wiki2'  # optional

Example of configuration in localsettings.js for multiple wikis:

parsoidConfig.setMwApi({ uri: 'http://yoursite.com/w/api.php', domain: 'yoursite.com', prefix: 'myspecialwiki' });
// If you have another wiki on a different domain:
parsoidConfig.setMwApi({ uri: 'http://yourothersite.com/w/api.php', domain: 'yourothersite.com', prefix: 'myotherspecialwiki' });
// If you have another wiki on the same domain:
// Note that the domain and prefix can be arbitrary, they just need to be unique to this wiki.
// (And they shouldn't contain slashes.)
parsoidConfig.setMwApi({ uri: 'http://yoursite.com/w2/api.php', domain: 'wiki2', prefix: 'wiki2' });
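
The uniqueness requirement described in this section can be checked before starting the service. The sketch below is not part of Parsoid: findDuplicateDomains is a hypothetical helper, the mwApis array stands in for the list from config.yaml, and it assumes the default domain is the hostname portion of the uri.

```javascript
// findDuplicateDomains is a hypothetical helper, not a Parsoid API.
// It returns every domain string that is used by more than one wiki.
function findDuplicateDomains(mwApis) {
  const seen = new Set();
  const duplicates = new Set();
  for (const api of mwApis) {
    // Assumed default: the hostname portion of the uri.
    const domain = api.domain || new URL(api.uri).hostname;
    if (seen.has(domain)) {
      duplicates.add(domain);
    }
    seen.add(domain);
  }
  return Array.from(duplicates);
}

const mwApis = [
  { uri: 'http://yoursite.com/w/api.php' },
  { uri: 'http://yourothersite.com/w/api.php' },
  // No explicit domain here, so this entry clashes with the first one.
  { uri: 'http://yoursite.com/w2/api.php' }
];
console.log(findDuplicateDomains(mwApis));
```

Giving the third entry an explicit domain such as 'wiki2', as in the examples above, removes the clash.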

See Parsoid/JS/Troubleshooting#Configuration for additional troubleshooting help.

See Parsoid/Setup/RESTBase for information on how to configure a local RESTBase instance between your local VisualEditor and local Parsoid.

The Parsoid/Setup/RESTBase/Arbitrary domains page describes advanced RESTBase setup, but it may offer additional insight about the purpose of the prefix and domain properties.