Wikidata Query Service/Implementation/Standalone
WDQS can be run as a service for any Wikibase instance, not just Wikidata. You can still follow the instructions in the documentation, with the changes described below.
To generate the dump of your database, use the dumpRdf.php script in the repo/maintenance directory of the Wikibase extension. Depending on your requirements, you may still want to run the munge.sh script, or you may load the resulting RDF directly into the database.
For development, you may also consider using the Docker-based setup at https://github.com/wmde/wikibase-release-pipeline.
Note that Blazegraph and Updater require a significant amount of memory to run. When running on a VM, a VM-like setup such as Docker, or any other memory-restricted environment, make sure to allocate enough memory; 4 to 8 GB should be a good guideline.
Required setup
So far, the following conditions should be fulfilled by the Wikibase instance for WDQS to work properly. Given the Wikibase install top URL as WIKIBASE_URL:
- The RecentChanges API should be accessible at WIKIBASE_URL/w/api.php
- The entity data dump should be accessible at WIKIBASE_URL/wiki/Special:EntityData/Q123.ttl for entity Q123.
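The two URL rules above can be sketched as a pair of small shell helpers. The function names api_url and entity_data_url are hypothetical and are not part of WDQS, and https://my-wikibase:8081 is only a placeholder host; the helpers simply append the expected paths to the top URL.

```shell
#!/bin/sh
# Hypothetical helpers (not part of WDQS): build the URLs that WDQS
# expects to find relative to the Wikibase top URL.

# RecentChanges API endpoint: WIKIBASE_URL/w/api.php
api_url() {
    # ${1%/} drops a single trailing slash, if present
    printf '%s/w/api.php\n' "${1%/}"
}

# Entity data URL: WIKIBASE_URL/wiki/Special:EntityData/<ID>.ttl
entity_data_url() {
    printf '%s/wiki/Special:EntityData/%s.ttl\n' "${1%/}" "$2"
}

WIKIBASE_URL="https://my-wikibase:8081"   # placeholder, replace with your host
api_url "$WIKIBASE_URL"
entity_data_url "$WIKIBASE_URL" Q123
```

To check a live instance, the printed URLs can be passed to a client such as curl, for example curl -sf "$(entity_data_url "$WIKIBASE_URL" Q123)", which should return Turtle data if the instance follows the expected URL scheme.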
If your Wikibase instance has a different URL scheme, the recommended way is to create web server redirects for these two, although these parts will be customizable as of WDQS 0.3.69 (via --apiPath /w/api.php and --entityDataPath /wiki/Special:EntityData/). See below about the URL customization.
You can also separately set the Wikibase entity concept URL. The assumptions are, given the base URL as CONCEPT_URL:
- The entity prefix is CONCEPT_URL/entity/
- The data prefix is CONCEPT_URL/wiki/Special:EntityData/
You can verify those by looking at the wd: and wdata: prefixes in the entity dump URL above, e.g. https://www.wikidata.org/wiki/Special:EntityData/Q4.ttl
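As a rough illustration of the prefix rules above, the following sketch prints the wd: and wdata: prefix declarations one would expect to see in the TTL export for a given concept URL. The helper names entity_prefix and data_prefix are hypothetical, and https://my-wikibase:8081 is a placeholder.

```shell
#!/bin/sh
# Hypothetical helpers: derive the expected RDF prefixes from CONCEPT_URL.

# Entity prefix: CONCEPT_URL/entity/
entity_prefix() {
    printf '%s/entity/\n' "${1%/}"
}

# Data prefix: CONCEPT_URL/wiki/Special:EntityData/
data_prefix() {
    printf '%s/wiki/Special:EntityData/\n' "${1%/}"
}

CONCEPT_URL="https://my-wikibase:8081"   # placeholder, replace with your concept URL

# Print the prefix lines to compare against the top of an entity's .ttl export
printf '@prefix wd: <%s> .\n' "$(entity_prefix "$CONCEPT_URL")"
printf '@prefix wdata: <%s> .\n' "$(data_prefix "$CONCEPT_URL")"
```

If the lines printed here differ from the ones in your instance's export, the concept URL passed to the WDQS tools probably needs adjusting.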
WDQS Configurations
The two main things you may need to change are the Wikibase endpoint (the URL at which your Wikibase instance is accessible) and the concept URI (the URI which prefixes the RDF URIs describing data in your instance). Note that by default these are related, but they are controlled independently and do not have to match. By default both settings are set up to match Wikidata data.
If you're running a copy of Wikidata but on your own domain, you may need to change the Wikibase endpoint. If you are running your own dataset, you also need to change the concept URI.
Setting Wikibase endpoint
This setting controls the URL at which the Wikibase instance is found. See above for the list of URLs that are expected to work relative to this URL.
For Updater:
- Use the --wikibaseUrl URL option when running Updater to set up the Wikibase URL.
For Munger:
- No changes are needed since Munger does not communicate with Wikibase
For Blazegraph:
- No changes are needed since Blazegraph does not communicate with Wikibase
Setting concept URI
For Munger:
- Use the --conceptUri URL option when running Munger. The rules for the URL are the same as for Updater below.
Example:
bash munge.sh -f mydump.ttl.gz -d data/split -- --conceptUri https://my-wikibase:8081
For Updater:
- Use the --conceptUri URL option when running Updater. The URL should match the one seen in the TTL export in the wd: prefix. For example, if the prefix is defined as
@prefix wd: <http://test.wikidata.org/entity/> .
then the URL will be http://test.wikidata.org
Example:
bash runUpdate.sh -- --wikibaseUrl https://my-wikibase:8081 --conceptUri https://my-wikibase:8081
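The rule for deriving the concept URI from the wd: prefix can be sketched with sed. The function name concept_uri_from_prefix is hypothetical, and the pattern assumes the prefix line is formatted exactly as in the example above (single spaces, trailing " ."); exports with different whitespace would need a looser pattern.

```shell
#!/bin/sh
# Hypothetical helper: read TTL on stdin and print the concept URI,
# i.e. the wd: prefix with the trailing /entity/ removed.
concept_uri_from_prefix() {
    # -n with the p flag prints only lines where the substitution matched
    sed -n 's|^@prefix wd: <\(.*\)/entity/> \.$|\1|p'
}

# Feed the prefix line from the example above through the helper
printf '%s\n' '@prefix wd: <http://test.wikidata.org/entity/> .' | concept_uri_from_prefix
```

In practice the input could come from the entity's .ttl export, for example by piping the output of curl for WIKIBASE_URL/wiki/Special:EntityData/Q123.ttl into the function.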
For Blazegraph:
- Set the wikibaseConceptUri Java property when running Blazegraph. If you only change the hostname, you can use wikibaseHost instead. Example:
BLAZEGRAPH_OPTS="-DwikibaseConceptUri=https://my-wikibase:8081" bash ./runBlazegraph.sh
BLAZEGRAPH_OPTS="-DwikibaseHost=www.my-wikibasehost.org" bash ./runBlazegraph.sh
Setting entity namespaces
For Updater:
The updater looks for changes in namespaces 0 and 120 by default, which are the Item and Property namespaces on Wikidata.
In a default Wikibase installation, Item and Property are instead namespaces 120 and 122.
If your installation follows this setup, add the option --entityNamespaces 120,122 when running the updater.
(If you have other entity namespaces, e.g. for lexicographical data, make sure to add them to the list.)
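Putting the Updater options from this page together, a dry-run sketch might look as follows. The function updater_command is a hypothetical helper that only prints the command line rather than running it; the host is a placeholder, and the namespace check is an added convenience, not something WDQS itself requires.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the Updater command line instead of running it.
# Arguments: Wikibase URL, concept URI, comma-separated entity namespaces.
updater_command() {
    wikibase_url=$1
    concept_uri=$2
    namespaces=$3
    # Accept only a comma-separated list of namespace numbers, e.g. 120,122
    if ! printf '%s\n' "$namespaces" | grep -Eq '^[0-9]+(,[0-9]+)*$'; then
        echo "invalid namespace list: $namespaces" >&2
        return 1
    fi
    echo "bash runUpdate.sh -- --wikibaseUrl $wikibase_url --conceptUri $concept_uri --entityNamespaces $namespaces"
}

# Default Wikibase installation: Item is namespace 120, Property is 122
updater_command "https://my-wikibase:8081" "https://my-wikibase:8081" "120,122"
```

Printing the command first makes it easy to review the URLs and namespace list before starting the Updater for real.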
GUI configurations
In order to configure the GUI, you have to use the source (not built/minimized) version of the GUI: https://github.com/wikimedia/wikidata-query-gui
In the wikibase/config.js file there are two settings you may want to change:
- api.sparql.uri: The URL of the SPARQL endpoint (can be relative to the GUI main URL)
- api.wikibase.uri: The URL of the Wikidata API endpoint (including path, e.g. https://www.wikidata.org/w/api.php).