Extension:Wikispeech/zh

This is the documentation for installing and configuring Wikispeech. If you are looking for help on how to use it, see Help:Extension:Wikispeech.
MediaWiki extensions manual
Wikispeech
Release status: beta
Implementation Page action, Ajax, API, Special page, Database
Description Reads page text out loud using text-to-speech
Author(s) Sebastian Berlin, André Costa, Karl Wettin and Igor Leturia
Latest version 0.1.10 (2023-03-08)
MediaWiki >= 1.39
Database changes
wikispeech_utterance
License GNU General Public License 2.0 or later
Download
Help Help:Extension:Wikispeech/zh
Example
  • $wgWikispeechSkipBackRewindsThreshold
  • $wgWikispeechSpeechoidUrl
  • $wgWikispeechSpeechoidHaproxyFrontendSvName
  • $wgWikispeechSpeechoidHaproxyOverloadFactor
  • $wgWikispeechRemoveTags
  • $wgWikispeechFeedbackPage
  • $wgWikispeechSpeechoidHaproxyStatsUrl
  • $wgWikispeechSymbolSetUrl
  • $wgWikispeechListenMetricsJournalFile
  • $wgWikispeechMinimumMinutesBetweenFlushExpiredUtterancesJobs
  • $wgWikispeechUtteranceFileBackendContainerName
  • $wgWikispeechKeyboardShortcuts
  • $wgWikispeechPronunciationLexiconConfiguration
  • $wgWikispeechSpeechoidResponseTimeoutSeconds
  • $wgWikispeechSpeechoidHaproxyFrontendPxName
  • $wgWikispeechUtteranceUseSwiftFileBackendExpiring
  • $wgWikispeechSpeechoidHaproxyQueueUrl
  • $wgWikispeechUtteranceFileBackendName
  • $wgWikispeechSpeechoidHaproxyBackendSvName
  • $wgWikispeechContentSelector
  • $wgWikispeechListenDoJournalMetrics
  • $wgWikispeechVoices
  • $wgWikispeechSpeechoidHaproxyBackendPxName
  • $wgWikispeechUtteranceTimeToLiveDays
  • $wgWikispeechProducerMode
  • $wgWikispeechHelpPage
  • $wgWikispeechNamespaces
  • $wgWikispeechListenMaximumInputCharacters
  • $wgWikispeechSegmentBreakingTags
  • wikispeech-listen
  • wikispeech-read-lexicon
  • wikispeech-edit-lexicon
Quarterly downloads 7 (Ranked 132nd)
Translate the Wikispeech extension at translatewiki.net
Vagrant role wikispeech
Issues Open tasks · Report a bug

The Wikispeech project aims to create an open source text-to-speech tool that makes Wikimedia's projects more accessible to people who have difficulty reading, whatever the reason. Wikispeech will be available as a MediaWiki extension. More information can be found on the project page; this page covers only the Wikispeech extension itself.

Speechoid

Diagram of some typical interactions between the components in Wikispeech.

Documentation

Installation instructions

The extension uses a service called Speechoid for text-to-speech (TTS) operations, such as creating audio for utterances. Speechoid consists of a main server, a lexicon server, TTS engines and any additional components that may be required for certain languages.

To prepare an utterance for playing, the extension sends a request to the service. This request contains the utterance as text, which language it is in and which voice to use. The service processes the text using a lexicon and one of the installed TTS engines, depending on what voice is being used. Once the audio has been generated, a response is returned with audio data along with some information that will enable highlighting and skipping. This is then used by the extension to actually play the utterance to the user and the process is repeated for the following utterances as needed.
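The round trip described above can be sketched roughly as follows. This is a minimal illustration, not the extension's implementation: the function names and the field names (`text`, `language`, `voice`, `audio`, `tokens`, `endMs`) are assumptions made up for this sketch and are not taken from the Speechoid API. It only shows how timing information returned with the audio could be used to decide which word to highlight.

```php
<?php
// Illustrative only: all function and field names below are assumptions
// made for this sketch; they are not the real Speechoid API.

/** Collect what the extension sends for one utterance. */
function buildUtteranceRequest( string $text, string $language, string $voice ): array {
	return [
		'text' => $text,         // the utterance as text
		'language' => $language, // which language the text is in
		'voice' => $voice,       // which voice (and thereby which TTS engine) to use
	];
}

/**
 * Find the token being spoken at a playback position, using the timing
 * information returned along with the audio. Returns null past the end.
 */
function tokenAt( array $tokens, int $positionMs ): ?string {
	foreach ( $tokens as $token ) {
		if ( $positionMs < $token['endMs'] ) {
			return $token['text'];
		}
	}
	return null;
}

$request = buildUtteranceRequest( 'Hello world', 'en', 'cmu-slt-hsmm' );

// A made-up response: audio data plus an end time for each token.
$response = [
	'audio' => 'BASE64-ENCODED-AUDIO',
	'tokens' => [
		[ 'text' => 'Hello', 'endMs' => 400 ],
		[ 'text' => 'world', 'endMs' => 900 ],
	],
];

// 500 ms into playback the second token is being spoken.
echo tokenAt( $response['tokens'], 500 ), "\n"; // prints "world"
```

The same per-token timing could in principle drive skipping, by seeking to the end time of the previous or current token.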

Main Wikispeech Server

Repository

The main server has a web API that includes an endpoint for generating speech. It handles internal communication between the underlying servers, listed below.

Pronlex

Repository

A lexicon server with its own API. Holds information about lexicon entries and has endpoints for lookup and manipulation of them. When processing an utterance, words are looked up in the lexicon and if there is a matching entry it is used for the pronunciation.

TTS engines

The server supports having multiple TTS engines. Which one is used for a certain utterance depends on which voice is given in the request.

MaryTTS

Repository

Comes with support for Arabic, English and Swedish.

Additional Components

Mishkal

Repository

Symbolset

Repository

Symbolset is a repository for handling phonetic symbol sets and mappers/converters between different symbol sets and languages.

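Installation

  • Download the extension and move the extracted Wikispeech folder to your extensions/ directory.
    Developers and code contributors should install the extension from Git instead, using:
    cd extensions/
    git clone https://gerrit.wikimedia.org/r/mediawiki/extensions/Wikispeech
  • Add the following code at the bottom of your LocalSettings.php:
    wfLoadExtension( 'Wikispeech' );

  • Done: navigate to Special:Version on your wiki to verify that the extension is successfully installed.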

Setting up Speechoid

The Wikispeech extension requires Speechoid to generate audio. Detailed instructions for installing Speechoid can be found on Installing Speechoid.

Basic configuration

For the Wikispeech extension to be able to communicate with Speechoid, you need to specify the service's URL. You can do this by adding the following line to LocalSettings.php:

$wgWikispeechSpeechoidUrl = 'URL';

where URL is the URL to your Speechoid instance.
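As a concrete illustration, a minimal LocalSettings.php fragment might look like the sketch below. The host name `speechoid.example.org` is a placeholder and the timeout of 30 seconds is an arbitrary example value; only the setting names come from the option list on this page.

```php
// LocalSettings.php
wfLoadExtension( 'Wikispeech' );

// Placeholder address: replace with the URL of your own Speechoid instance.
$wgWikispeechSpeechoidUrl = 'https://speechoid.example.org';

// Optional: wait at most 30 seconds for an HTTP response from Speechoid.
// Leaving this unset (null) falls back to the MediaWiki default.
$wgWikispeechSpeechoidResponseTimeoutSeconds = 30;
```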

Running as producer

If you want to use your wiki to enable Wikispeech on other wikis, you can enable "producer mode". One use case for this is to run Wikispeech as a gadget on other wikis.

Normally Wikispeech gets the text to synthesise from pages on the wiki it is installed on. If WikispeechProducerMode is true, the wikispeech-listen action can take the parameter consumer-url. consumer-url should be set to the script path of the consumer wiki, e.g. https://www.mediawiki.org/w for this wiki. When the request is made, Wikispeech gets the content from the consumer wiki and synthesises it as normal. The utterance is stored in the database with the extra parameter wsu_remote_wiki_hash to keep track of which wiki it was generated from.
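A rough sketch of the two sides follows. The producer address `https://producer.example.org/w` is a placeholder, and the request deliberately leaves out the other parameters that wikispeech-listen needs (they are not described on this page); it only shows where consumer-url fits in.

```php
<?php
// On the producer wiki, in LocalSettings.php:
$wgWikispeechProducerMode = true;

// Elsewhere, a client builds a request to the producer's API.
// The producer script path is a placeholder for this sketch.
$producerScriptPath = 'https://producer.example.org/w';

$query = http_build_query( [
	'action' => 'wikispeech-listen',
	'format' => 'json',
	// Script path of the consumer wiki whose page content should be read.
	'consumer-url' => 'https://www.mediawiki.org/w',
	// Further wikispeech-listen parameters are omitted here.
] );

echo $producerScriptPath . '/api.php?' . $query, "\n";
```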

To enable Wikispeech with a gadget or user script see Help:Extension:Wikispeech#As gadget or user script.

Complete list of configuration options

Option Default value Documentation
WikispeechSpeechoidUrl
""
The URL to use for the Speechoid service.
WikispeechSymbolSetUrl
""
The URL to use for the Symbol set service.
WikispeechSpeechoidResponseTimeoutSeconds
null
Default number of seconds to await an HTTP response from Speechoid. A falsy value falls back to the MediaWiki default.
WikispeechListenMaximumInputCharacters
2048
Maximum number of characters in the input (a segment) sent to the Speechoid service.
WikispeechRemoveTags
{
    "span": "mw-editsection",
    "table": true,
    "sup": "reference",
    "div": [
        "thumb",
        "toc"
    ]
}
Map of HTML tags that should be removed completely, i.e. including any content. Keys are tag names and the values determine whether a tag should be removed, as follows:
  • If true, remove all tags of that type.
  • If an array, remove tags whose class matches any of the strings in the array.
  • If false, tags of that type will not be removed. This can be used in "LocalSettings.php" to override default criteria.
WikispeechSegmentBreakingTags
[
    "h1",
    "h2",
    "h3",
    "h4",
    "h5",
    "h6",
    "p",
    "br",
    "li"
]
HTML tags that break text into segments. This ensures that, for example, a header text without trailing punctuation is not merged into the same segment as the text content of a preceding paragraph.
WikispeechNamespaces
[
    0
]
List of the namespace indices for which Wikispeech is activated.
WikispeechKeyboardShortcuts
{
    "playStop": {
        "key": 13,
        "modifiers": [
            "alt",
            "shift"
        ]
    },
    "skipAheadSentence": {
        "key": 39,
        "modifiers": [
            "alt",
            "shift"
        ]
    },
    "skipBackSentence": {
        "key": 37,
        "modifiers": [
            "alt",
            "shift"
        ]
    },
    "skipAheadWord": {
        "key": 40,
        "modifiers": [
            "alt",
            "shift"
        ]
    },
    "skipBackWord": {
        "key": 38,
        "modifiers": [
            "alt",
            "shift"
        ]
    }
}
Shortcuts for Wikispeech commands. Each shortcut defines the key pressed (as key code[1]) and any modifier keys (ctrl, alt or shift).
WikispeechSkipBackRewindsThreshold
3.0
If an utterance has played longer than this (in seconds), skipping back will rewind to the start of the current utterance instead of skipping to the previous utterance.
WikispeechHelpPage
"Help:Wikispeech"
Help page for Wikispeech. If defined, a button that takes the user here is added next to the player buttons.
WikispeechFeedbackPage
"Wikispeech feedback"
Feedback page for Wikispeech. If defined, a button that takes the user here is added next to the player buttons.
WikispeechContentSelector
"#mw-content-text"
The selector for the element that contains the text of the page. Used internally, but may change with MediaWiki version.
WikispeechVoices
{
    "ar": [
        "ar-nah-hsmm"
    ],
    "en": [
        "dfki-spike-hsmm",
        "cmu-slt-hsmm"
    ],
    "sv": [
        "stts_sv_nst-hsmm"
    ]
}
Registered voices per language. System default voice falls back on the first registered voice for a language if not defined by Speechoid.
WikispeechMinimumMinutesBetweenFlushExpiredUtterancesJobs
30
Minimum number of minutes between queuing jobs that automatically flush expired utterances from the utterance store. The job is queued during creation of a new utterance, provided that enough minutes have passed since the job was last queued. Disable automatic flushing by setting this to a falsy value (0, false, null, etc.). To avoid running the flush job too often, see the MW job documentation.
WikispeechUtteranceTimeToLiveDays
31
Minimum number of days for an utterance to live before being automatically flushed from the utterance store. More or less the cache flush setting for synthesized text. Setting this value too low will save disk space but cause frequently requested text segments to be re-synthesized more often with a CPU cost. Setting this value too high will block improvements to the voice synthesis. Setting this value to 0 will in effect turn off the cache and thus flush all utterances as soon as possible.
WikispeechUtteranceFileBackendName
""
FileBackend group defined in LocalSettings.php used for utterance audio and metadata files. If not defined in LocalSettings.php, an FSBackend will be created that works against a temporary directory. See log warnings for the exact path.
WikispeechUtteranceFileBackendContainerName
"wikispeech_utterances"
Container name used in FileBackend for utterance audio and metadata files.
WikispeechUtteranceUseSwiftFileBackendExpiring
false
If the file backend is Swift and this value is set to true, Wikispeech will set the 'X-Delete-After' header when creating files in Swift, and the utterance flushing mechanism will not invoke the delete command in Swift. That is, the actual flushing of utterances is moved to the Swift layer. For this to make sense, the Swift file backend must be configured to accept these headers. For more information on how to do this, see https://docs.openstack.org/swift/latest/overview_expiring_objects.html. This feature will be officially supported by Wikispeech as of the first LTS release of MediaWiki after 1.35 (i.e. probably 1.39).
WikispeechPronunciationLexiconConfiguration
"Wiki+Speechoid"
Controls how the pronunciation lexicon is persisted and accessed. 'Speechoid' must be part of the chain for the lexicon to have any effect on the speech synthesis. Possible values are:
  • 'Speechoid': access only the underlying lexicon in Speechoid. No revision history.
  • 'Wiki+Speechoid': access the lexicon stored as articles in NS_PRONUNCIATION_LEXICON, which provides revision history, and pass it down to Speechoid.
  • 'Wiki': access only the lexicon stored as articles in NS_PRONUNCIATION_LEXICON.
  • 'Cache': transient storage in the MediaWiki WAN cache. For development only.
  • 'Cache+Speechoid': transient storage in the WAN cache, passed down to Speechoid. For development only.
WikispeechProducerMode
false
Run Wikispeech in producer mode. This allows other wikis (consumers) to use this wiki to generate utterances. When an API request includes the parameter `consumer-url`, page content is retrieved from the consumer wiki at that URL.
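To show how a few of these options might be overridden together, here is a hedged LocalSettings.php sketch. The specific values (the extra namespace, 7 days, 60 minutes) are arbitrary examples, and the RemoveTags line assumes, as the description of that option suggests, that a value set in LocalSettings.php is combined with the defaults per tag rather than replacing them all.

```php
// LocalSettings.php, together with wfLoadExtension( 'Wikispeech' );

// Activate Wikispeech in the main namespace (0) and the Help namespace (12).
$wgWikispeechNamespaces = [ 0, 12 ];

// Keep tables in the recited text. Assumption: this overrides only the
// "table" criterion and leaves the other default criteria in place.
$wgWikispeechRemoveTags = [ 'table' => false ];

// Re-synthesise utterances after a week instead of the default 31 days,
// trading CPU time for less disk usage.
$wgWikispeechUtteranceTimeToLiveDays = 7;

// Queue the job that flushes expired utterances at most once per hour.
$wgWikispeechMinimumMinutesBetweenFlushExpiredUtterancesJobs = 60;
```

After changing these values, Special:Version only confirms that the extension is loaded; checking the effect of each option requires listening to a page in an affected namespace.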

CSS

This is a subset of the CSS rules that are most interesting for a non-developer.

Selector Default value Documentation
.ext-wikispeech-highlight-sentence
background-color: rgb( 200, 170, 255 );
The visual highlighting for the sentence that is currently being recited.
.ext-wikispeech-highlight-word
background-color: rgb( 255, 200, 140 );
The visual highlighting for the word that is currently being recited.

References