扩展:CirrusSearch

This page is a translated version of the page Extension:CirrusSearch and the translation is 40% complete.
Outdated translations are marked like this.
MediaWiki的扩展的手冊
CirrusSearch
發佈狀態: 穩定版本
实现 搜索, API , 钩子
描述 使用Elasticsearch实现搜索MediaWiki
作者 Nik Everett, Chad Horohoe, Erik Bernhardson
最新版本 持续更新
兼容性政策 快照跟随MediaWiki发布。 master分支不向後兼容。
MediaWiki >= 1.41.0
Composer mediawiki/cirrussearch
许可协议 GNU通用公眾授權條款2.0或更新版本
下載
README
  • $wgCirrusSearchLanguageWeight
  • $wgCirrusSearchAutomationCIDRs
  • $wgCirrusSearchUseIcuFolding
  • $wgCirrusSearchStemmedWeight
  • $wgCirrusSearchQueryStringMaxDeterminizedStates
  • $wgCirrusSearchCrossClusterSearch
  • $wgCirrusSearchExtraIndexSettings
  • $wgCirrusSearchAutomationUserAgentRegex
  • $wgCirrusSearchTalkNamespaceWeight
  • $wgCirrusSearchPrefixWeights
  • $wgCirrusSearchPrefixSearchRescoreProfile
  • $wgCirrusSearchDisableUpdate
  • $wgCirrusSearchActiveTest
  • $wgCirrusSearchExtraFieldsInSearchResults
  • $wgCirrusSearchMoreLikeThisMaxQueryTermsLimit
  • $wgCirrusSearchUseIcuTokenizer
  • $wgCirrusSearchCompletionBannedPageIds
  • $wgCirrusSearchOptimizeIndexForExperimentalHighlighter
  • $wgCirrusSearchRescoreProfiles
  • $wgCirrusSearchPhraseRescoreBoost
  • $wgCirrusSearchInterwikiProv
  • $wgCirrusSearchDefaultCluster
  • $wgCirrusSearchQueryStringMaxWildcards
  • $wgCirrusSearchElasticQuirks
  • $wgCirrusSearchMaxFileTextLength
  • $wgCirrusSearchFallbackProfiles
  • $wgCirrusSearchMoreLikeThisTTL
  • $wgCirrusSearchAllowLeadingWildcard
  • $wgCirrusSearchInterwikiPrefixOverrides
  • $wgCirrusSearchMaintenanceTimeout
  • $wgCirrusSearchReplicas
  • $wgCirrusSearchPhraseSlop
  • $wgCirrusSearchBoostOpening
  • $wgCirrusSearchWriteBackoffExponent
  • $wgCirrusSearchUserTesting
  • $wgCirrusSearchShardCount
  • $wgCirrusSearchUseCompletionSuggester
  • $wgCirrusSearchPhraseSuggestReverseField
  • $wgCirrusSearchFallbackProfile
  • $wgCirrusSearchFragmentSize
  • $wgCirrusSearchUnlinkedArticlesToUpdate
  • $wgCirrusSearchCustomPageFields
  • $wgCirrusSearchClientSideUpdateTimeout
  • $wgCirrusSearchIgnoreOnWikiBoostTemplates
  • $wgCirrusSearchRegexMaxDeterminizedStates
  • $wgCirrusSearchInterwikiHTTPConnectTimeout
  • $wgCirrusSearchExtraIndexes
  • $wgCirrusSearchCategoryDepth
  • $wgCirrusSearchMergeSettings
  • $wgCirrusSearchClusters
  • $wgCirrusSearchAllFields
  • $wgCirrusSearchBannedPlugins
  • $wgCirrusSearchMoreLikeThisConfig
  • $wgCirrusSearchClusterOverrides
  • $wgCirrusSearchCrossProjectBlockScorerProfiles
  • $wgCirrusSearchEnableIncomingLinkCounting
  • $wgCirrusSearchNearMatchWeight
  • $wgCirrusSearchWriteIsolateClusters
  • $wgCirrusSearchIndexedRedirects
  • $wgCirrusSearchIndexAllocation
  • $wgCirrusSearchNumCrossProjectSearchResults
  • $wgCirrusSearchLanguageDetectors
  • $wgCirrusSearchUpdateShardTimeout
  • $wgCirrusSearchEnableCrossProjectSearch
  • $wgCirrusSearchFullTextQueryBuilderProfiles
  • $wgCirrusSearchCompletionDefaultScore
  • $wgCirrusSearchWriteClusters
  • $wgCirrusSearchCompletionSuggesterHardLimit
  • $wgCirrusSearchRecycleCompletionSuggesterIndex
  • $wgCirrusSearchLogElasticRequests
  • $wgCirrusSearchConnectionAttempts
  • $wgCirrusSearchElasticaWritePartitionCounts
  • $wgCirrusSearchWikiToNameMap
  • $wgCirrusSearchMaxFullTextQueryLength
  • $wgCirrusSearchLogElasticRequestsSecret
  • $wgCirrusSearchEnableRegex
  • $wgCirrusSearchClientSideSearchTimeout
  • $wgCirrusSearchDeduplicateAnalysis
  • $wgCirrusSearchWMFExtraFeatures
  • $wgCirrusSearchExtraBackendLatency
  • $wgCirrusSearchNamespaceMappings
  • $wgCirrusSearchNamespaceResolutionMethod
  • $wgCirrusSearchPreferRecentUnspecifiedDecayPortion
  • $wgCirrusSearchCategoryMax
  • $wgCirrusSearchDocumentSizeLimiterProfiles
  • $wgCirrusSearchSearchShardTimeout
  • $wgCirrusSearchCategoryEndpoint
  • $wgCirrusSearchPrivateClusters
  • $wgCirrusSearchSimilarityProfiles
  • $wgCirrusSearchCrossProjectShowMultimedia
  • $wgCirrusSearchWeights
  • $wgCirrusSearchPoolCounterKey
  • $wgCirrusSearchCompletionProfiles
  • $wgCirrusSearchMaxShardsPerNode
  • $wgCirrusSearchEnableArchive
  • $wgCirrusSearchIndexDeletes
  • $wgCirrusSearchSimilarityProfile
  • $wgCirrusSearchInterwikiThreshold
  • $wgCirrusSearchDocumentSizeLimiterProfile
  • $wgCirrusSearchFiletypeAliases
  • $wgCirrusSearchDevelOptions
  • $wgCirrusSearchBoostTemplates
  • $wgCirrusSearchPrefixSearchStartsWithAnyWord
  • $wgCirrusSearchUpdateConflictRetryCount
  • $wgCirrusSearchInterwikiHTTPTimeout
  • $wgCirrusSearchFetchConfigFromApi
  • $wgCirrusSearchPrefixIds
  • $wgCirrusSearchExtraIndexBoostTemplates
  • $wgCirrusSearchFullTextQueryBuilderProfile
  • $wgCirrusSearchStripQuestionMarks
  • $wgCirrusSearchIndexBaseName
  • $wgCirrusSearchMoreLikeThisFields
  • $wgCirrusSearchTextcatConfig
  • $wgCirrusSearchMasterTimeout
  • $wgCirrusSearchSanityCheck
  • $wgCirrusSearchTextcatModel
  • $wgCirrusSearchNamespaceWeights
  • $wgCirrusSearchCrossProjectOrder
  • $wgCirrusSearchRescoreProfile
  • $wgCirrusSearchRescoreFunctionChains
  • $wgCirrusSearchMoreAccurateScoringMode
  • $wgCirrusSearchMaxPhraseTokens
  • $wgCirrusSearchCrossProjectSearchBlockList
  • $wgCirrusSearchCrossProjectProfiles
  • $wgCirrusSearchLanguageToWikiMap
  • $wgCirrusSearchMaxIncategoryOptions
  • $wgCirrusSearchEnableAltLanguage
  • $wgCirrusSearchWikimediaExtraPlugin
  • $wgCirrusSearchCompletionSuggesterUseDefaultSort
  • $wgCirrusSearchLinkedArticlesToUpdate
  • $wgCirrusSearchCompletionSuggesterSubphrases
  • $wgCirrusSearchPreferRecentDefaultHalfLife
  • $wgCirrusSearchSlowSearch
  • $wgCirrusSearchFunctionRescoreWindowSize
  • $wgCirrusSearchEnablePhraseSuggest
  • $wgCirrusSearchCompletionSettings
  • $wgCirrusSearchUseExperimentalHighlighter
  • $wgCirrusSearchFeedbackLink
  • $wgCirrusSearchUpdateDelay
  • $wgCirrusSearchRefreshInterval
  • $wgCirrusSearchInterleaveConfig
  • $wgCirrusSearchPhraseRescoreWindowSize
  • $wgCirrusSearchMoreLikeThisAllowedFields
  • $wgCirrusSearchPreferRecentDefaultDecayPortion
  • $wgCirrusSearchClientSideConnectTimeout
  • $wgCirrusSearchPhraseSuggestUseText
  • $wgCirrusSearchPhraseSuggestProfiles
  • $wgCirrusSearchDefaultNamespaceWeight
  • $wgCirrusSearchInterwikiSources
  • $wgCirrusSearchPhraseSuggestUseOpeningText
  • $wgCirrusSearchICUNormalizationUnicodeSetFilter
  • $wgCirrusSearchICUFoldingUnicodeSetFilter
  • $wgCirrusSearchReplicaGroup
季度下載量 447 (Ranked 23rd)
公開的wiki使用 1,226 (Ranked 212nd)
翻譯CirrusSearch的扩展,若在translatewiki.net可用
Vagrant角色 cirrussearch
問題 尚未完成的工作 · 报告錯誤

CirrusSearch扩展使用Elasticsearch实现搜索MediaWiki。

Elastic Search is a standalone third-party software that must be installed in advance. It's a database system that provides search and indexing functionality, and where the current text of all pages of your wiki will be indexed for faster search results. The communication between MediaWiki and ElasticSearch is done through web services.

这个页面的内容与插件的安装有关。 安装成功后,参看帮助:CirrusSearch 以了解用法


目标

  • 去除使该扩展难以安装的本地依赖关系
    • 仅有的依赖项是纯PHP MediaWiki扩展和Elasticsearch本身
  • 为其他MediaWiki扩展可扩展的Wiki页面提供近实时搜索索引

提供MWSearch 为用户提供的所有查询选项等

依赖性

PHP 和 cURL
  • Note that Elasticsearch versions prior to 6.8 are not compatible with PHP 8.
Elasticsearch

Every version of ElasticSearch change how web services work, and cause compatibility problems. You must install the version of Elastic Search compatible with the version of MediaWiki you are currently using:

  • MediaWiki 1.29.x - 1.30.x需要Elasticsearch 5.3.x - 5.4.x
  • MediaWiki 1.31.x - 1.32.x需要Elasticsearch 5.5.x - 5.6.x
  • MediaWiki 1.33.x - 1.38.x需要Elasticsearch 6.5.x - 6.8.x (建议6.8.23+)
  • MediaWiki 1.39+ require Elasticsearch 7.10.2 (6.8.23+ is possible using a compatibility layer )

请注意,还需要像OpenJDK这样的Java安装。 最好使用官方的Elasticsearch Docker镜像或自托管版本。 A managed product like Amazon OpenSearch (formerly Amazon Elasticsearch) can work but may require additional configuration depending on its specifics. For example, Amazon OpenSearch only listens for Elasticsearch API requests over HTTPS on port 443 (i.e. it does not expose the default Elasticsearch port 9200), so a TLS-enabled proxy (e.g. Nginx) can enable CirrusSearch to communicate with an Amazon OpenSearch cluster.

Elastica
  • Elastica是一个与Elasticsearch对话的PHP库。按照下面的说明安装Elastica。

其他
  • 由于CirrusSearch扩展实际处理作业,建议在Redis中设置作业以防止Notice: unserialize(): Error at offset 64870 of 65535 bytes in JobQueueDB.php之类的消息和Unsupported operand types之类的后续错误。 参见任务T157759

安裝

扩展:Elastica

Even though the instructions below tell you to only run Composer when installing from git, it may be necessary to issue it anyway in order to install all PHP dependencies.

  • 下载文件,并将其放置在您extensions/文件夹中的Elastica目录内。
  • 只有從git安裝才运行Composer来安装PHP依赖,通过发行composer install --no-dev至扩展目录。 (参见任务T173141了解潜在问题。)
  • 将下列代码放置在您的LocalSettings.php 的底部:
    wfLoadExtension( 'Elastica' );
    
  •   完成 – 在您的wiki上导航至Special:Version,以验证已成功安装扩展。

CirrusSearch

  • 下载文件,并将其放置在您extensions/文件夹中的CirrusSearch目录内。
  • 只有從git安裝才运行Composer来安装PHP依赖,通过发行composer install --no-dev至扩展目录。 (参见任务T173141了解潜在问题。)
  • 将下列代码放置在您的LocalSettings.php 的底部:
    wfLoadExtension( 'CirrusSearch' );
    
  • Now follow the setup instructions in the CirrusSearch README delivered with your extension i.e. $IP/extensions/CirrusSearch/README. Note that all info in it might not apply to your version of the extension, especially the version of Elasticsearch supported.
  • 按照以下要求进行配置
  •   完成 – 在您的wiki上导航至Special:Version,以验证已成功安装扩展。

升级

请遵循在CirrusSearch UPGRADE中的升级指南。

配置

The configuration parameters of CirrusSearch are documented at the "settings.txt" file. See also documentation on CirrusSearch configuration profiles.

Elasticsearch will fail to index for CirrusSearch if one is using a database name for MySQL which contains a capital character, e.g. "MyWikiDatabaseName". To mitigate this CirrusSearch provides the $wgCirrusSearchIndexBaseName configuration parameter which one needs to set, e.g. $wgCirrusSearchIndexBaseName = 'mywikidatabasename';.

钩子

CirrusSearch extension defines a number of hooks that other extensions can make use of to extend the core schema and modify documents. 以下钩子可用:

API

CirrusSearch features can be used in API queries. Searching happens via the normal search API, action=query&list=search; you can use CirrusSearch-specific features, such as the morelike: special prefix to find pages related to Marie Curie and radium: api.php?action=query&list=search&srsearch=morelike:Marie_Curie%7Cradium&srlimit=10&srprop=size&formatversion=2 Custom APIs and parameters are provided for querying CirrusSearch configuration and debug information:

參見

General links
Debugging

Local development

Elastic Search service can be run with the Vagrant role (cirrussearch) and MediaWiki Vagrant.

For Docker, you can use a command like docker run -d --name elasticsearch -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" elasticsearch:6.8.2. Then follow the installation and configuration directions. If your web host is in a container you'll want to make sure the above container is on the same network, and in LocalSettings.php you will want to reference elasticsearch as the host name. This will not have the WMF plugins but can be sufficient for basic testing.