Parsoid/Setup/RESTBase

This page describes Parsoid/JS, which has been replaced by Parsoid/PHP in MW 1.35 and newer.

This page documents how to configure RESTBase to point at a local Parsoid instance.

This is intended for developers only. This does not describe production configuration of RESTBase.

Setting up a RESTBase storage backend

First you will need to set up a storage backend for RESTBase. When that is done, you'll need to configure RESTBase, Parsoid, and VisualEditor to point at each other.

Typical small installations

Unless you are operating at Wikipedia's scale, we recommend that you use the SQLite backend. Its installation is described at https://github.com/wikimedia/restbase-mod-table-sqlite.

Large installations

Large installations can use Cassandra for scalable storage.

Tweaking Cassandra for testing

RESTBase can use Cassandra for backend storage. However, the default configuration of Cassandra (on Debian, at least) is more suited for production than local development. For example, Cassandra's default configuration attempts to consume between 800M and 4G of memory for its heap, which can be a substantial fraction of total memory available on a developer's local machine.

After installing Cassandra, I recommend making the following configuration changes to reduce its memory footprint.

Add the following to /etc/cassandra/cassandra-env.sh:

MAX_HEAP_SIZE="128M"
HEAP_NEWSIZE="20M"

(There are probably already commented-out lines in cassandra-env.sh defining these; uncomment the lines and tweak the values.)

Make the following changes to /etc/cassandra/cassandra.yaml:

key_cache_size_in_mb: 0
concurrent_reads: 2
concurrent_writes: 2
rpc_server_type: hsha
rpc_min_threads: 1
rpc_max_threads: 1
concurrent_compactors: 1
compaction_throughput_mb_per_sec: 0

Pointing RESTBase at a local Parsoid

Check out RESTBase and, if you have not already done so, copy config.example.yaml to config.yaml inside the restbase directory.

In your config.yaml there is a clause like:

      /{module:parsoid}:
        x-modules:
          - name: parsoid
            version: 1.0.0
            type: file
            options:
              parsoidHost: http://parsoid-lb.eqiad.wikimedia.org
              # For local testing, use:
              # parsoidHost: http://localhost:8000

As the comment suggests, change the parsoidHost value to:

parsoidHost: http://localhost:8000

Make sure the port matches the value of serverPort set in /etc/mediawiki/parsoid/settings.js (or <parsoid directory>/api/localsettings.js if you have followed the developer setup instructions). The default port is 8142 if using the Debian/Ubuntu packages, and 8000 if running Parsoid from a source checkout.
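A port mismatch between the two files is easy to overlook. The following is a minimal sketch of that check, assuming you have already read the parsoidHost value out of config.yaml and the serverPort value out of the Parsoid settings file by hand. The function name parsoidPortMatches is hypothetical and not part of RESTBase or Parsoid.

```javascript
// Hypothetical helper for illustration; not part of RESTBase or Parsoid.
// parsoidHost: the value from RESTBase's config.yaml.
// serverPort: the value from Parsoid's settings.js or localsettings.js.
function parsoidPortMatches(parsoidHost, serverPort) {
  const url = new URL(parsoidHost);
  // URL.port is the empty string when the URL uses its scheme's default port.
  const defaultPort = url.protocol === 'https:' ? 443 : 80;
  const port = url.port === '' ? defaultPort : Number(url.port);
  return port === Number(serverPort);
}
```

With the values used on this page, parsoidPortMatches('http://localhost:8000', 8000) is true for a source checkout, while the same parsoidHost checked against the packaged default of 8142 is false.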

Now you need to add a domain for your local wiki. It is most convenient to make this match the hostname in your wiki's api.php URL, but it can actually be an arbitrary string. In config.yaml there is a section like:

spec: &spec
  title: "The RESTBase root"
  # Some more general RESTBase info
  paths:
    /{domain:en.wikipedia.org}: *wp/default/1.0.0
#   /{domain:de.wikipedia.org}: *wp/default/1.0.0
#   /{domain:es.wikipedia.org}: *wp/default/1.0.0
#   /{domain:nl.wikipedia.org}: *wp/default/1.0.0
#
    # test domain
    /{domain:en.wikipedia.test.local}: *wp/default/1.0.0

You want to edit it to read:

spec: &spec
  title: "The RESTBase root"
  # Some more general RESTBase info
  paths:
    /{domain:localhost}: *wp/default/1.0.0

This configures RESTBase to use the domain "localhost". Again, it's most convenient if the part after domain: matches the hostname of the api.php URL you are going to specify below.

If it does not, see the /Arbitrary domains page for configuration information.
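To decide which case applies to you, compare the domain with the hostname of your api.php URL. This is a small sketch of that comparison; the function name domainMatchesApiHost is hypothetical, and the check is something you would run by hand, not something RESTBase does for you.

```javascript
// Hypothetical helper for illustration; not part of RESTBase.
// Returns true when the RESTBase domain equals the hostname of api.php.
function domainMatchesApiHost(domain, apiUrl) {
  // The URL parser lowercases the hostname and strips any port and path.
  return new URL(apiUrl).hostname === domain;
}
```

For example, the domain "localhost" matches http://localhost/w/api.php, whereas "somedomain" does not match http://localhost/~cananian/mediawiki/api.php and therefore needs the extra configuration.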

Now let's configure that api.php URL. Find a section like:

      /{module:action}:
        x-modules:
          - name: action
            type: file
            options:
              apiRequest:
                method: post
                uri: 'http://{domain}/w/api.php'
                headers:
                  host: '{$.request.params.domain}'
                body: '{$.request.body}'

Again, if your chosen "domain" matches the domain of your local wiki's api.php endpoint, you might not have to change anything here. But you can also edit this if needed. For example, if your Parsoid localsettings.js contains:

    parsoidConfig.setMwApi({ domain: 'somedomain', uri: 'http://localhost/~cananian/mediawiki/api.php' });

Then you need:

      /{module:action}:
        x-modules:
          - name: action
            type: file
            options:
              apiRequest:
                method: post
                uri: http://localhost/~cananian/mediawiki/api.php
                headers:
                  host: 'localhost'
                body: '{$.request.body}'
...
spec: &spec
  title: "The RESTBase root"
  # Some more general RESTBase info
  paths:
    /{domain:somedomain}: *wp/default/1.0.0

Note that I've changed the "host" header here too. In this case RESTBase uses the "somedomain" domain, and sending "somedomain" as the host header in the HTTP request would confuse my web server, which thinks it is serving "localhost".

The apiRequest endpoint is very flexible; you can perform some substitutions on the URL in order to configure a multiwiki setup. See the /Arbitrary domains page for more complicated multiwiki configurations.
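To illustrate the idea of substitution, here is a deliberately simplified sketch of how a placeholder such as {domain} in the uri could be filled in. It is not RESTBase's actual template engine, which supports richer expressions such as {$.request.params.domain}; the function name expandSimpleTemplate is hypothetical.

```javascript
// Simplified illustration only; this is NOT RESTBase's template engine.
// Replaces {name} placeholders with the matching property of params.
// Placeholders that are not plain identifiers, or that have no matching
// property, are left untouched.
function expandSimpleTemplate(template, params) {
  return template.replace(/\{([A-Za-z_]\w*)\}/g, (match, name) =>
    Object.prototype.hasOwnProperty.call(params, name)
      ? String(params[name])
      : match);
}
```

Under these assumptions, expanding 'http://{domain}/w/api.php' with a domain of "localhost" yields 'http://localhost/w/api.php', which is why the default uri works unchanged when the domain is also the wiki's hostname.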

Lastly, you may also wish to change the default RESTBase port, in the services clause at the bottom of config.yaml. RESTBase starts up on port 7231 by default.

Note that RESTBase doesn't care what "prefix" Parsoid uses to describe your wiki (one of the optional fields in the call to parsoidConfig.setMwApi in the Parsoid configuration). It just cares about the "domain". By default Parsoid uses the host portion of the api.php URL as the domain, in which case all four of RESTBase, Parsoid, VisualEditor, and the web server actually serving api.php must agree on this. This is a common cause of setup issues, since there are usually multiple different domain names for the same host, and you might inadvertently use different names in different places. When in doubt, explicitly specify the "domain" separate from the api.php URL when you configure RESTBase, Parsoid, and VisualEditor. Remember that the "domain" can be an arbitrary string, so it can be helpful to set it to something unique (like "this-is-not-a-dns-domain") when debugging to avoid confusion.
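When debugging a mismatch, it can help to write down the domain each component is configured with and compare them side by side. A minimal sketch of that comparison follows; the function name distinctDomains and the keys of the input object are hypothetical labels chosen for this example.

```javascript
// Hypothetical helper for illustration.
// configured maps a component name to the "domain" string it was given.
// Returns the sorted list of distinct domains; a correctly configured
// setup yields exactly one entry.
function distinctDomains(configured) {
  return [...new Set(Object.values(configured))].sort();
}
```

If the result has more than one entry, for example ['localhost', 'somedomain'], at least one component is using a different name for the same wiki.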

Configuring VisualEditor

Configuring VisualEditor to point to your local RESTBase is easy!

Add the following to your LocalSettings.php:

$wgVirtualRestConfig['modules']['restbase'] = array(
	'url' => 'http://localhost:7231',
	'domain' => 'somedomain', # matches the "domain" used above
	'forwardCookies' => false,
	'parsoidCompat' => false
);

And, optionally, for direct access to the RESTBase server from client-side code:

$wgVisualEditorRestbaseURL = 'http://localhost:7231/somedomain/v1/page/html/';

Note that the portion before /v1/ should match the "domain" used above.
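The shape of that URL can be sketched as a small function, assuming the RESTBase base URL and domain shown above. The function name restbaseHtmlUrl is hypothetical and only illustrates how the pieces fit together.

```javascript
// Hypothetical helper for illustration; not part of VisualEditor.
// Builds the value for $wgVisualEditorRestbaseURL from the RESTBase
// base URL and the "domain" configured in RESTBase.
function restbaseHtmlUrl(restbaseUrl, domain) {
  // Drop trailing slashes so the result never contains "//" before the domain.
  const base = restbaseUrl.replace(/\/+$/, '');
  return `${base}/${encodeURIComponent(domain)}/v1/page/html/`;
}
```

Calling restbaseHtmlUrl('http://localhost:7231', 'somedomain') gives 'http://localhost:7231/somedomain/v1/page/html/', the value used in the example above.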