Manual:Database access

This article provides an overview of database access and general database tasks in MediaWiki.

When coding in MediaWiki, you will normally access the database only through the MediaWiki functions provided for this purpose.

Database layout

For information about the MediaWiki database layout, such as a description of the tables and their contents, see Manual:Database layout and $tables. Historically this was documented in maintenance/tables.sql; starting with MediaWiki 1.35 it has gradually been moved to maintenance/tables.json as part of the Abstract Schema initiative. This means that maintenance/tables.json is turned into maintenance/tables-generated.sql by a maintenance script, which makes it easier to generate schema files for different database engines.

Logging in to MySQL

Using sql.php

MediaWiki provides a maintenance script to access the database. From the maintenance directory run:

php sql.php

You can then write out database queries. Alternatively, you can provide a filename and MediaWiki will execute it, substituting any MW special variables as appropriate. For more information, see Manual:Sql.php.

This will work for all database backends. However, the prompt is not as full-featured as the command line clients that come with your database.

Using the mysql command line client

In LocalSettings.php you will find your wiki's MySQL username and password, for example:

## Database settings
$wgDBtype           = "mysql";
$wgDBserver         = "localhost";
$wgDBname           = "your-database-name";
$wgDBuser           = "your-database-username";  // Default: root
$wgDBpassword       = "your-password";

Over SSH, log in by entering the following:

mysql -u <$wgDBuser> -p --database=<$wgDBname>

Replace <$wgDBuser> and <$wgDBname> with the values from LocalSettings.php. You will then be prompted for the password ($wgDBpassword), after which you will see the mysql> prompt.

Database abstraction layer

MediaWiki provides a database abstraction layer. Unless you are working on the abstraction layer itself, you should never call PHP's database functions (such as mysql_query() or pg_send_query()) directly.

The abstraction layer is accessed through the Wikimedia\Rdbms\Database class. An instance of this class can be acquired by calling getConnectionRef() (preferred) or getConnection() on an injected ILoadBalancer. The function wfGetDB() is being phased out and should not be used in new code. Typically, wfGetDB() was called with one parameter, which can be the constant DB_REPLICA (for read queries) or DB_PRIMARY, formerly DB_MASTER (for write queries and for read queries that need absolutely up-to-date information). The distinction between primary and replica is important in a multi-database environment such as Wikimedia. See the Wrapper functions section below to find out what you can do with the returned Database object.
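As a rough illustration of acquiring a connection from an injected load balancer, the sketch below passes the ILoadBalancer into a constructor rather than fetching it from global state. The class name CategoryCounter and its method are hypothetical, and the parameter type is left to a doc comment so that the snippet also parses outside MediaWiki; getConnectionRef() and selectField() are the calls named on this page.

```php
<?php
/**
 * Hypothetical service that receives its load balancer through the constructor.
 */
class CategoryCounter {
	/** @var \Wikimedia\Rdbms\ILoadBalancer (assumed; not enforced here) */
	private $loadBalancer;

	public function __construct( $loadBalancer ) {
		$this->loadBalancer = $loadBalancer;
	}

	/**
	 * Return the page count stored for a category, or 0 if there is no row.
	 */
	public function countPagesInCategory( string $title ): int {
		// Read-only work, so ask for a replica connection.
		$dbr = $this->loadBalancer->getConnectionRef( DB_REPLICA );
		$count = $dbr->selectField(
			'category',
			'cat_pages',
			[ 'cat_title' => $title ],
			__METHOD__
		);
		// selectField() returns false when no row matches.
		return $count === false ? 0 : (int)$count;
	}
}
```

Inside MediaWiki, the constructor argument would normally be supplied by service wiring; that part is omitted here.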

The result wrappers of select queries behave like arrays whose keys are integers starting at 1. To make a read query, something like this usually suffices:

use MediaWiki\MediaWikiServices;

$lb = MediaWikiServices::getInstance()->getDBLoadBalancer();
$dbr = $lb->getConnectionRef( DB_REPLICA );
$res = $dbr->select( /* ...see docs... */ );
foreach ( $res as $row ) {
	// ...process $row...
}

For a write query, use something like this:

$dbw = $lb->getConnectionRef( DB_PRIMARY );
$dbw->insert( /* ...see docs... */ );

We use the convention $dbr for read and $dbw for write to help you keep track of whether the database object is a replica (read-only) or a primary (read/write). If you write to a replica, the world will explode. Or to be precise, a subsequent write query which succeeded on the primary server may fail when replicated to the replica due to a unique key collision. Replication on the replica will stop and it may take hours to repair the database and get it back online. Setting read_only in my.cnf on the replica will avoid this scenario, but given the dire consequences, we prefer to have as many checks as possible.

Wrapper functions

We provide a query() function for raw SQL, but the wrapper functions like select() and insert() are usually more convenient. They can take care of things like table prefixes and escaping for you under some circumstances. If you really need to make your own SQL, please read the documentation for tableName() and addQuotes(). You will need both of them. Please keep in mind that failing to use addQuotes() properly can introduce severe security holes into your wiki.

Another important reason to use the high level methods rather than constructing your own queries is to ensure that your code will run properly regardless of the database type. Currently the best support is for MySQL/MariaDB. There is also good support for SQLite, however it is much slower than MySQL or MariaDB. There is support for PostgreSQL, but it is not as stable as MySQL.

In the following, the available wrapper functions are listed. For a detailed description of the parameters of the wrapper functions, please refer to class Database's docs. Particularly see Database::select for an explanation of the $table, $vars, $conds, $fname, $options, and $join_conds parameters that are used by many of the other wrapper functions.

The parameters $table, $vars, $conds, $fname, $options, and $join_conds must not be null or false (this worked up to REL1_35); pass an empty string '' or an empty array [] instead.

function select( $table, $vars, $conds = '', $fname = 'Database::select', $options = [], $join_conds = [] );
function selectField( $table, $var, $cond = '', $fname = __METHOD__, $options = [] );
function selectRow( $table, $vars, $conds = '', $fname = 'Database::select', $options = [] );
function insert( $table, $a, $fname = 'Database::insert', $options = [] );
function insertSelect( $destTable, $srcTable, $varMap, $conds, $fname = 'Database::insertSelect', $insertOptions = [], $selectOptions = [] );
function update( $table, $values, $conds, $fname = 'Database::update', $options = [] );
function delete( $table, $conds, $fname = 'Database::delete' );
function deleteJoin( $delTable, $joinTable, $delVar, $joinVar, $conds, $fname = 'Database::deleteJoin' );
function buildLike(/*...*/);

Wrapper function: select()

The select() function provides the MediaWiki interface for a SELECT statement. The components of the SELECT statement are coded as parameters of the select() function. An example is

$dbr = $lb->getConnectionRef( DB_REPLICA );
$res = $dbr->select(
	'category',                              // $table The table to query FROM (or array of tables)
	[ 'cat_title', 'cat_pages' ],            // $vars (columns of the table to SELECT)
	'cat_pages > 0',                         // $conds (The WHERE conditions)
	__METHOD__,                              // $fname The current __METHOD__ (for performance tracking)
	[ 'ORDER BY' => 'cat_title ASC' ]        // $options = []
);

This example corresponds to the query

SELECT cat_title, cat_pages FROM category WHERE cat_pages > 0 ORDER BY cat_title ASC

JOINs are also possible; for example:

$res = $dbr->select(
	[ 'watchlist', 'user_properties' ],      // $table
	[ 'wl_user' ],                           // $vars
	[                                        // $conds
		'wl_user != 1',
		'wl_namespace' => '0',
		'wl_title' => 'Main_page',
		'up_property' => 'enotifwatchlistpages',
	],
	__METHOD__,                              // $fname
	[],                                      // $options
	[                                        // $join_conds
		'user_properties' => [ 'INNER JOIN', [ 'wl_user=up_user' ] ]
	]
);

This example corresponds to the query

SELECT wl_user FROM `watchlist` INNER JOIN `user_properties` ON ((wl_user=up_user)) WHERE (wl_user != 1) AND wl_namespace = '0' AND wl_title = 'Main_page'
AND up_property = 'enotifwatchlistpages'

Extension:OrphanedTalkPages provides an example of how to use table aliases in queries.

Arguments are either single values (such as 'category' and 'cat_pages > 0') or arrays, if more than one value is passed for an argument position (such as ['cat_pages > 0', $myNextCond]). If you pass in strings to the third or fifth argument, you must manually use Database::addQuotes() on your values as you construct the string, as the wrapper will not do this for you. The values for table names (1st argument) or field names (2nd argument) must not be user controlled. The array construction for $conds is somewhat limited; it can only do equality and IS NULL relationships (i.e. WHERE key = 'value').
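To make the difference between the two condition styles concrete, here is a minimal sketch. The helper name buildCategoryConds() is hypothetical and the database handle is assumed to be a MediaWiki Database object; only addQuotes() is taken from the text above.

```php
<?php
/**
 * Build a $conds array for the category table.
 *
 * @param object $dbr Database handle (assumed to provide addQuotes())
 * @param string $fromTitle User-supplied title to start listing from
 * @return array
 */
function buildCategoryConds( $dbr, string $fromTitle ): array {
	return [
		// key => value pairs express equality; the wrapper escapes the value.
		'cat_subcats' => 0,
		// Anything else has to be a raw SQL string, so the value is
		// quoted by hand with addQuotes() before it is concatenated.
		'cat_title >= ' . $dbr->addQuotes( $fromTitle ),
	];
}
```

The returned array can be passed as the third argument of select(); the wrapper joins the entries with AND.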

You can access individual rows of the result using a foreach loop. Once you have a row object, you can use the -> operator to access a specific field. A full example might be:

$dbr = $lb->getConnectionRef( DB_REPLICA );
$res = $dbr->select(
	'category',                              // $table
	[ 'cat_title', 'cat_pages' ],            // $vars (columns of the table)
	'cat_pages > 0',                         // $conds
	__METHOD__,                              // $fname = 'Database::select',
	[ 'ORDER BY' => 'cat_title ASC' ]        // $options = []
);
$output = '';
foreach ( $res as $row ) {
	$output .= 'Category ' . $row->cat_title . ' contains ' . $row->cat_pages . " entries.\n";
}

This puts an alphabetical list of categories, with the number of entries in each, into the variable $output. If you are outputting HTML, be sure to escape values from the database with htmlspecialchars().
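A small sketch of the HTML case follows. The function name formatCategoryList() is hypothetical; it accepts any iterable of row objects with cat_title and cat_pages fields, such as the result of the select() call above.

```php
<?php
/**
 * Render category rows as an HTML list, escaping database values.
 *
 * @param iterable $rows Row objects with cat_title and cat_pages
 * @return string HTML
 */
function formatCategoryList( iterable $rows ): string {
	$html = "<ul>\n";
	foreach ( $rows as $row ) {
		// Titles come from the database and may contain markup characters.
		$html .= '<li>' . htmlspecialchars( $row->cat_title ) . ': '
			. (int)$row->cat_pages . "</li>\n";
	}
	return $html . "</ul>\n";
}
```

Casting the count to int is a second, cheap safeguard: whatever the column contains, only a number reaches the page.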

Convenience functions

MediaWiki version:

For compatibility with PostgreSQL, insert ids are obtained using nextSequenceValue() and insertId(). The parameter for nextSequenceValue() can be obtained from the CREATE SEQUENCE statement in maintenance/postgres/tables.sql and always follows the format of x_y_seq, with x being the table name (e.g. page) and y being the primary key (e.g. page_id), e.g. page_page_id_seq. For example:

$id = $dbw->nextSequenceValue( 'page_page_id_seq' );
$dbw->insert( 'page', [ 'page_id' => $id ] );
$id = $dbw->insertId();

For some other useful functions, e.g. affectedRows(), numRows(), etc., see Manual:Database.php.

Basic query optimization

MediaWiki developers who need to write DB queries should have some understanding of databases and the performance issues associated with them. Patches containing unacceptably slow features will not be accepted. Unindexed queries are generally not welcome in MediaWiki, except in special pages derived from QueryPage. It's a common pitfall for new developers to submit code containing SQL queries which examine huge numbers of rows. Remember that COUNT(*) is O(N): counting rows in a table is like counting beans in a bucket.

Backward compatibility

Often, due to design changes to the DB, different DB accesses are necessary to ensure backward compatibility. This can be handled, for example, with the global variables $wgDBprefix and $wgVersion:

use MediaWiki\MediaWikiServices;

$res = WrapperClass::getQueryFoo();

class WrapperClass {

	public static function getQueryFoo() {
		global $wgDBprefix, $wgVersion;

		// Pick the query description that matches the running MediaWiki version.
		if ( version_compare( $wgVersion, '1.33', '<' ) ) {
			$param = self::getQueryInfoFooBefore_v1_33( $wgDBprefix );
		} else {
			$param = self::getQueryInfoFoo( $wgDBprefix );
		}

		// This is a read query, so a replica connection is used.
		$dbr = MediaWikiServices::getInstance()->getDBLoadBalancer()
			->getConnectionRef( DB_REPLICA );
		return $dbr->select(
			$param['tables'],
			$param['fields'],
			$param['conds'],
			__METHOD__,
			$param['options'],
			$param['join_conds']
		);
	}

	private static function getQueryInfoFoo( $prefix ) {
		return [
			'tables' => [ 'table1', 'table2', 'table3' ],
			'fields' => [
				'field_name1' => $prefix . 'table1.field1',
				'field_name2' => 'field2',
				// ...
			],
			'conds' => [
				// ...
			],
			'join_conds' => [
				'table2' => [
					'INNER JOIN',
					// ...
				],
				'table3' => [
					'LEFT JOIN',
					// ...
				],
			],
			'options' => [
				// ...
			],
		];
	}

	private static function getQueryInfoFooBefore_v1_33( $prefix ) {
		return [
			'tables' => [ 'table1', 'table2', 'table3_before' ],
			'fields' => [
				'field_name1' => $prefix . 'table1.field1',
				'field_name2' => 'field2_before',
				// ...
			],
			'conds' => [
				// ...
			],
			'join_conds' => [
				'table2' => [
					'INNER JOIN',
					// ...
				],
				'table3_before' => [
					'LEFT JOIN',
					// ...
				],
			],
			'options' => [
				// ...
			],
		];
	}
}


Replication

Large installations of MediaWiki, such as Wikipedia, use a large set of replica MySQL servers that replicate writes made to a primary MySQL server. It is important to understand the issues associated with this setup if you want to write code destined for Wikipedia.

It is often the case that the best algorithm to use for a given task depends on whether or not replication is in use. Due to our unabashed Wikipedia-centrism, we often just use the replication-friendly version, but if you like, you can use MediaWikiServices::getInstance()->getDBLoadBalancer()->getServerCount() > 1 to check whether replication is in use.

Lag

Lag primarily occurs when large write queries are sent to the primary server. Writes on the primary server are executed in parallel, but they are executed in serial when they are replicated to the replicas. The primary server writes the query to the binlog when the transaction is committed. The replicas poll the binlog and start executing the query as soon as it appears. They can service reads while they are performing a write query, but will not read anything more from the binlog and thus will perform no more writes. This means that if the write query runs for a long time, the replicas will lag behind the primary server for the time it takes for the write query to complete.

Lag can be exacerbated by high read load. MediaWiki's load balancer will stop sending reads to a replica when it is lagged by more than 30 seconds. If the load ratios are set incorrectly, or if there is too much load generally, this may lead to a replica permanently hovering around 30 seconds lag.

If all replicas are lagged by more than 30 seconds (according to $wgDBservers), MediaWiki will stop writing to the database. All edits and other write operations will be refused, with an error returned to the user. This gives the replicas a chance to catch up. Before we had this mechanism, the replicas would regularly lag by several minutes, making review of recent edits difficult.

In addition to this, MediaWiki attempts to ensure that the user sees events occurring on the wiki in chronological order. A few seconds of lag can be tolerated, as long as the user sees a consistent picture from subsequent requests. This is done by saving the primary binlog position in the session, and then at the start of each request, waiting for the replica to catch up to that position before doing any reads from it. If this wait times out, reads are allowed anyway, but the request is considered to be in "lagged replica mode". Lagged replica mode can be checked by calling getLaggedReplicaMode() on the load balancer. The only practical consequence at present is a warning displayed in the page footer.

Shell users can check replication lag with getLagTimes.php; other users can use the siteinfo API.

Databases often have their own monitoring systems in place as well, see for instance wikitech:MariaDB#Replication lag (Wikimedia) and wikitech:Help:Toolforge/Database#Identifying lag (Wikimedia Cloud VPS).

Lag avoidance

To avoid excessive lag, queries that write large numbers of rows should be split up, generally to write one row at a time. Multi-row INSERT ... SELECT queries are the worst offenders and should be avoided altogether. Instead do the select first and then the insert.
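The "select first, then insert" pattern might look roughly like the sketch below. The function name and the tables example_source and example_dest with their columns are hypothetical; $dbr and $dbw are assumed to be replica and primary Database handles, and only select() and insert() from the wrapper list above are used.

```php
<?php
/**
 * Copy rows between two hypothetical tables without INSERT ... SELECT.
 *
 * @param object $dbr Replica handle (assumed to provide select())
 * @param object $dbw Primary handle (assumed to provide insert())
 * @param int $limit Maximum number of rows to copy in this call
 * @return int Number of rows written
 */
function copyExampleRows( $dbr, $dbw, int $limit ): int {
	// Step 1: read the source rows with a plain SELECT.
	$res = $dbr->select(
		'example_source',
		[ 'src_id', 'src_text' ],
		[],
		__FUNCTION__,
		[ 'LIMIT' => $limit ]
	);

	// Step 2: write them back one row at a time, so that each
	// replicated write stays short.
	$written = 0;
	foreach ( $res as $row ) {
		$dbw->insert(
			'example_dest',
			[ 'dst_id' => $row->src_id, 'dst_text' => $row->src_text ],
			__FUNCTION__
		);
		$written++;
	}
	return $written;
}
```

Keeping $limit small bounds the work done per call; a caller would repeat the call until it returns 0 or fewer rows than requested.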

Even small writes can cause lag if they are done at a very high speed and replication is unable to keep up. This most commonly happens in maintenance scripts. To prevent it, you should call LBFactory::waitForReplication() after every few hundred writes. Most scripts make the exact number configurable:

class MyMaintenanceScript extends Maintenance {
    public function __construct() {
        // ...
        $this->setBatchSize( 100 );
    }

    public function execute() {
        $lbFactory = MediaWikiServices::getInstance()->getDBLoadBalancerFactory();
        $limit = $this->getBatchSize();
        while ( true ) {
             // ...select up to $limit rows to write, break the loop if there are no more rows...
             // ...do the writes...
             // Let the replicas catch up before starting the next batch.
             $lbFactory->waitForReplication();
        }
    }
}

Working with lag

Despite our best efforts, it's not practical to guarantee a low-lag environment. Replication lag will usually be less than one second, but may occasionally be up to 30 seconds. For scalability, it's very important to keep load on the primary server low, so simply sending all your queries to the primary server is not the answer. So when you have a genuine need for up-to-date data, the following approach is advised:

  1. Do a quick query to the primary server for a sequence number or timestamp
  2. Run the full query on the replica and check if it matches the data you got from the primary server
  3. If it doesn't, run the full query on the primary server
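The three steps above can be sketched as follows, using the page table as an example. The function name loadCurrentPageRow() is hypothetical, page_latest serves as the sequence number, and $dbr and $dbw are assumed to be replica and primary Database handles providing selectField() and selectRow().

```php
<?php
/**
 * Load a page row that is known to be current.
 *
 * @param object $dbr Replica handle
 * @param object $dbw Primary handle
 * @param int $pageId
 * @return object|false Row object, or false if the page does not exist
 */
function loadCurrentPageRow( $dbr, $dbw, int $pageId ) {
	$conds = [ 'page_id' => $pageId ];
	$fields = [ 'page_id', 'page_namespace', 'page_title', 'page_latest' ];

	// 1. Quick query to the primary for a sequence number.
	$latest = $dbw->selectField( 'page', 'page_latest', $conds, __FUNCTION__ );
	if ( $latest === false ) {
		return false;
	}

	// 2. Full query on the replica, checked against the primary's value.
	$row = $dbr->selectRow( 'page', $fields, $conds, __FUNCTION__ );
	if ( $row && (int)$row->page_latest === (int)$latest ) {
		return $row;
	}

	// 3. The replica is behind: run the full query on the primary.
	return $dbw->selectRow( 'page', $fields, $conds, __FUNCTION__ );
}
```

The primary only ever serves the single-column lookup unless the replica is actually stale, which keeps its load low in the common case.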

To avoid swamping the primary server every time the replicas lag, use of this approach should be kept to a minimum. In most cases you should just read from the replica and let the user deal with the delay.

Lock contention

Due to the high write rate on Wikipedia (and some other wikis), MediaWiki developers need to be very careful to structure their writes to avoid long-lasting locks. By default, MediaWiki opens a transaction at the first query, and commits it before the output is sent. Locks will be held from the time when the query is done until the commit. So you can reduce lock time by doing as much processing as possible before you do your write queries. Update operations which do not require database access can be delayed until after the commit by adding an object to $wgPostCommitUpdateList.

Often this approach is not good enough, and it becomes necessary to enclose small groups of queries in their own transaction. Use the following syntax:

$factory = \MediaWiki\MediaWikiServices::getInstance()->getDBLoadBalancerFactory();
$factory->beginPrimaryChanges( __METHOD__ );
/* Do queries */
$factory->commitPrimaryChanges( __METHOD__ );

Use of locking reads (e.g. the FOR UPDATE clause) is not advised. They are poorly implemented in InnoDB and will cause regular deadlock errors. It's also surprisingly easy to cripple the wiki with lock contention.

Instead of locking reads, combine your existence checks into your write queries, by using an appropriate condition in the WHERE clause of an UPDATE, or by using unique indexes in combination with INSERT IGNORE. Then use the affected row count to see if the query succeeded.
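A minimal sketch of both variants follows. The tables example_job and example_name, their columns, and the two function names are hypothetical; $dbw is assumed to be a primary Database handle, with update(), insert() with the IGNORE option, and affectedRows() as provided by the abstraction layer.

```php
<?php
/**
 * Claim a job only if nobody else holds it. The existence check lives in
 * the WHERE clause of the UPDATE, so no locking read is needed.
 */
function claimExampleJob( $dbw, int $jobId, string $token ): bool {
	$dbw->update(
		'example_job',
		[ 'job_token' => $token ],
		// Matches only while the job is still unclaimed.
		[ 'job_id' => $jobId, 'job_token' => '' ],
		__FUNCTION__
	);
	// Zero affected rows means another request got there first.
	return $dbw->affectedRows() > 0;
}

/**
 * Register a name once, relying on a unique index on en_name.
 */
function registerExampleName( $dbw, string $name ): bool {
	// With IGNORE, a duplicate-key conflict is skipped instead of failing.
	$dbw->insert( 'example_name', [ 'en_name' => $name ], __FUNCTION__, [ 'IGNORE' ] );
	return $dbw->affectedRows() > 0;
}
```

In both cases the database decides atomically who wins, and the caller only inspects the affected row count afterwards.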

Database schema

Don't forget about indexes when designing databases: queries may run smoothly on your test wiki with a dozen pages, but without suitable indexes they will bring a real wiki to a halt. See above for details.

For naming conventions, see Manual:Coding conventions/Database.

SQLite compatibility

When writing MySQL table definitions or upgrade patches, it is important to remember that SQLite shares MySQL's schema, but that works only if definitions are written in a specific way:

  • Primary keys must be declared within the main table declaration, but normal keys should be added separately with CREATE INDEX:

Wrong:

CREATE TABLE /*_*/foo (
    foo_id INT NOT NULL AUTO_INCREMENT,
    foo_text VARCHAR(256),
    PRIMARY KEY(foo_id),
    KEY(foo_text)
);

Right:

CREATE TABLE /*_*/foo (
    foo_id INT NOT NULL PRIMARY KEY AUTO_INCREMENT,
    foo_text VARCHAR(256)
) /*$wgDBTableOptions*/;

CREATE INDEX /*i*/foo_text ON /*_*/foo (foo_text);

However, primary keys spanning over more than one field should be included in the main table definition:

CREATE TABLE /*_*/foo (
    foo_id INT NOT NULL,
    foo_text VARCHAR(256),
    PRIMARY KEY(foo_id, foo_text)
) /*$wgDBTableOptions*/;

CREATE INDEX /*i*/foo_text ON /*_*/foo (foo_text);

Note: /*i*/ has been removed in MediaWiki 1.35; see Variable replacement.

  • Don't add more than one column per statement:

Wrong:

ALTER TABLE /*_*/foo
    ADD foo_bar BLOB,
    ADD foo_baz INT;

Right:

ALTER TABLE /*_*/foo ADD foo_bar BLOB;
ALTER TABLE /*_*/foo ADD foo_baz INT;

  • Set explicit defaults when adding NOT NULL columns:

Wrong:

ALTER TABLE /*_*/foo ADD COLUMN foo_bar varchar(32) BINARY NOT NULL;

Right:

ALTER TABLE /*_*/foo ADD COLUMN foo_bar varchar(32) BINARY NOT NULL DEFAULT '';

You can run basic compatibility checks with:

Or, if you need to test an update patch, both:

  • php sqlite.php --check-syntax tables.sql (with the new tables.sql)
  • php sqlite.php --check-syntax tables.sql filename.sql
    • Since DB patches update the tables.sql file as well, for this one you should pass in the pre-commit version of tables.sql (the file with the full DB definition). Otherwise, you can get an error if you e.g. drop an index (since it already doesn't exist in tables.sql because you just removed it).

The above assumes you're in $IP/maintenance/, otherwise, pass the full path of the file. For extension patches, use the extension's equivalent of these files.

See also