Příručka:Databázová tabulka searchindex

This page is a translated version of the page Manual:Searchindex table and the translation is 46% complete.
Manuál:Obsah MediaWiki schéma databáze Tabulka searchindex

S tabulkou searchindex se pracuje při fulltextovém vyhledávání (viz searches ). U MySQL nužších verzí než 5.6, bylo možné fulltextové indexování jen pokud byla tabulka typu MyISAM, ale ne InnoDB. Jenže tabulka 'text' (do verze 1.4 'cur') je však typu InnoDB, protože je výhodnější z hlediska výkonu při souběžném přístupu k datům, takže aby bylo možné využít indexování na straně databáze, bylo nutné pro každou stránku vytvořit kopii. A to je důvod proč existuje tahle tabulka. Pokud používáte Postgres, tak vaše databáze tuhle tabulku nemá: informace pro fulltextové vyhledávání má v tabulce page a pagecontent svůj vlastní sloupec, do kterého se zapisuje rovnou.

This table is populated when edits are saved on pages, unless $wgDisableSearchUpdate is set to true, or an extension that implements its own search index (like CirrusSearch extension) is installed. The rebuildtextindex.php maintenance script can be used to populate it from scratch, or updateSearchIndex.php to update it for recently changed pages.

Pole

si_page

Key to page_id .

si_title

Munged version of page title.

si_text

Munged version of the current text of the page.

Podrobnosti

Indexes for this, using en wikipedia for cardinality:

+-------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table       | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| searchindex |          0 | si_page  |            1 | si_page     | A         |      797275 |     NULL | NULL   |      | BTREE      |         |
| searchindex |          1 | si_title |            1 | si_title    | A         |      265758 |     NULL | NULL   |      | FULLTEXT   |         |
+-------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+

A common search query is of the form:

SELECT page_id, page_namespace, page_title FROM `page`,`searchindex`
WHERE page_id=si_page
AND MATCH(si_text) AGAINST('+''searchterm''' IN BOOLEAN MODE)
AND page_is_redirect=0
AND page_namespace IN (0)
LIMIT 20

In MediaWiki version 1.4 and earlier:

SELECT cur_id,cur_namespace,cur_title,cur_text FROM cur,searchindex
WHERE cur_id=si_page
AND (
  MATCH(si_text) AGAINST('+''searchterm''' IN BOOLEAN MODE)
  AND cur_is_redirect=0
)
AND cur_namespace IN (0,9,11)
LIMIT 0, 20;

Explain shows the following:

+-------------+--------+-------------------------------------------+---------------+---------+------------+--------+-------------+
| table       | type   | possible_keys                             | key           | key_len | ref        | rows   | Extra       |
+-------------+--------+-------------------------------------------+---------------+---------+------------+--------+-------------+
| cur         | range  | cur_id,cur_namespace,name_title_timestamp | cur_namespace |       1 | NULL       | 317499 | Using where |
| searchindex | eq_ref | si_page                                   | si_page       |       4 | cur.cur_id |      1 | Using where |
+-------------+--------+-------------------------------------------+---------------+---------+------------+--------+-------------+

Známé problémy

A range scan with namespaces containing a large number of records requires that a large portion of the searchindex table be loaded into RAM. This is very costly, in part because the searchindex table contains the full text of the article. A long term plan is to remove the text from the searchindex table. Temporary measures may involve adding fields to searchindex and indexing them (and using a self join to allow both that index and the fulltext index to be used) or adding a searchindex index to decrease the number of complete searchindex records which must be read to match the query.

Přehled změn ve schématu tabulky

Verze MediaWiki:
1.10

DESCRIBE searchindex;

+----------+------------------+------+-----+---------+-------+
| Field    | Type             | Null | Key | Default | Extra |
+----------+------------------+------+-----+---------+-------+
| si_page  | int(10) unsigned | NO   | PRI | NULL    |       |
| si_title | varchar(255)     | NO   | MUL |         |       |
| si_text  | mediumtext       | NO   | MUL | NULL    |       |
+----------+------------------+------+-----+---------+-------+
Verze MediaWiki:
1.1 – 1.9

DESCRIBE searchindex;

+----------+-----------------+------+-----+---------+-------+
| Field    | Type            | Null | Key | Default | Extra |
+----------+-----------------+------+-----+---------+-------+
| si_page  | int(8) unsigned | NO   | PRI | NULL    |       |
| si_title | varchar(255)    | NO   | MUL | NULL    |       |
| si_text  | mediumtext      | NO   | MUL | NULL    |       |
+----------+-----------------+------+-----+---------+-------+

Indexy

Verze MediaWiki:
1.4

SHOW INDEX IN searchindex;

+-------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table       | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| searchindex |          0 | si_page  |            1 | si_page     | A         |           0 |     NULL | NULL   |      | BTREE      |         |               |
| searchindex |          1 | si_title |            1 | si_title    | NULL      |           0 |     NULL | NULL   |      | FULLTEXT   |         |               |
| searchindex |          1 | si_text  |            1 | si_text     | NULL      |           0 |     NULL | NULL   |      | FULLTEXT   |         |               |
+-------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+