Help:Extension:Wikispeech/Lexicon editor
This page is currently a draft.
Wikispeech uses a pronunciation lexicon when reading text. Every time a sentence is read, its words are looked up in the lexicon. If a word has a pronunciation defined, that pronunciation is used; if not, the pronunciation is guessed from the spelling. A predefined pronunciation is therefore not required, but it helps, especially where pronunciation is irregular, as with loan words and proper nouns.
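The lookup-with-fallback behaviour described above can be pictured with a minimal Python sketch. It is not Wikispeech's actual code: the function `pronounce`, the dictionary `lexicon`, the sample entry and the stand-in `naive_guess` are all hypothetical names chosen for illustration.

```python
from typing import Callable, Dict


def pronounce(word: str, lexicon: Dict[str, str],
              guess: Callable[[str], str]) -> str:
    """Return the lexicon transcription if one exists, else a guess."""
    if word in lexicon:
        return lexicon[word]
    return guess(word)


# Hypothetical lexicon with a single entry (illustrative transcription).
lexicon = {"yacht": "ˈjɒt"}


def naive_guess(word: str) -> str:
    # Stand-in only: a real system would apply spelling-to-sound rules.
    return word.lower()


in_lexicon = pronounce("yacht", lexicon, naive_guess)
not_in_lexicon = pronounce("Wasp", lexicon, naive_guess)
```

Here "yacht" gets its stored transcription because spelling alone would be a poor guide, while "Wasp" falls through to the guesser.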
This lexicon can be edited through Special:EditLexicon. With this special page you can look up words in the lexicon and edit their pronunciations. You can also add new pronunciations for words that don't have any. Pronunciations are entered using the International Phonetic Alphabet (IPA).
Edit a word
Select the language, enter the word and click next.
On the next page you will see an id field and, at the bottom, a list of existing entries matching the word. Select the id of the entry that you want to edit and click next. If the entry you're looking for doesn't exist, see "Add a word" instead.
The next page has fields for modifying the selected entry. Transcription contains the phonetic transcription in IPA, see "Transcriptions". After entering a transcription you can click preview to have it read out. Note that it may take a few seconds to generate the preview. If there is an error during preview generation you will see a popup dialogue, see "Transcription preview errors". You can also select whether the entry should be preferred or not. A preferred entry will be prioritised when the lexicon is used. Click save to save the changes to the lexicon.
Add a word
Select the language, enter the word and click next.
On the next page you will see an id field and, at the bottom, a list of existing entries matching the word. Select "New" as the id and click next.
The next page has fields for the new entry. Transcription contains the phonetic transcription in IPA, see "Transcriptions". After entering a transcription you can click preview to have it read out. Note that it may take a few seconds to generate the preview. If there is an error during preview generation you will see a popup dialogue, see "Transcription preview errors". You can also select whether the entry should be preferred or not. A preferred entry will be prioritised when the lexicon is used. Click save to save the changes to the lexicon.
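To illustrate what "preferred" means when a word has several entries, here is a small Python sketch. The `Entry` class, the `pick_entry` function and the sample transcriptions are hypothetical. The fallback to the first entry when none is preferred is an assumption, as this page does not describe that case.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    id: int
    transcription: str
    preferred: bool = False


def pick_entry(entries: List[Entry]) -> Optional[Entry]:
    """Return a preferred entry if there is one."""
    for entry in entries:
        if entry.preferred:
            return entry
    # Assumption: with no preferred entry, take the first one listed.
    return entries[0] if entries else None


# Two hypothetical entries for the same word.
entries = [
    Entry(id=1, transcription="ˈrɒŋ"),
    Entry(id=2, transcription="ˈrɔŋ", preferred=True),
]
chosen = pick_entry(entries)
```

With these two entries, the one marked as preferred is chosen even though it is listed second.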
Transcriptions
Transcriptions tell Wikispeech how a word should be pronounced. IPA is used when entering the transcription in the lexicon editor.
If the Universal Language Selector is installed, you can enter a transcription using the keyboard. In that case you will briefly see a keyboard icon next to the text field when you type. Clicking it and selecting "International Phonetic Alphabet - X-SAMPA", or pressing Ctrl+M, will switch the input method.
You can also copy a transcription from e.g. an article or a Wiktionary entry.
Transcription preview errors
If the transcription that you entered is invalid, you'll get an error popup that tells you what went wrong.
ERROR: failed mapping transcription : found unknown phonemes in transcription...
One or more of the phonemes used are not in the symbol set. A symbol set is the list of phonemes that can be used; it varies from language to language. At the end of the message there will be a list of the unknown phonemes inside square brackets, e.g. [a e].
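If you need to process such a message programmatically, for example in a script that checks many transcriptions, the bracketed list at the end can be pulled out with a regular expression. This is only a sketch. The function name `unknown_phonemes` is hypothetical, and the sample message assumes that the list is space-separated and is the last thing in the message, as in the example above.

```python
import re
from typing import List


def unknown_phonemes(message: str) -> List[str]:
    """Extract the phonemes listed in square brackets at the end."""
    match = re.search(r"\[([^\[\]]*)\]\s*$", message)
    if match is None:
        return []
    return match.group(1).split()


# Assumed message shape; the middle of the real message is elided here.
sample = ("ERROR: failed mapping transcription : "
          "found unknown phonemes in transcription... [a e]")
found = unknown_phonemes(sample)
```

A message without a trailing bracketed list yields an empty list.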
Solution
Replace the unknown phonemes with ones from the symbol set. A list of the available phonemes can be found under "Symbol sets".
Symbol sets
English
Phoneme | Example word | Unicode |
---|---|---|
p | pin | U+0070 |
t | tin | U+0074 |
k | kin | U+006B |
b | bin | U+0062 |
d | din | U+0064 |
g | give | U+0067 |
t⁀ʃ | chin | U+0074 U+2040 U+0283 |
d⁀ʒ | gin | U+0064 U+2040 U+0292 |
f | fin | U+0066 |
v | vim | U+0076 |
θ | thin | U+03B8 |
ð | this | U+00F0 |
s | sin | U+0073 |
z | zing | U+007A |
ʃ | shin | U+0283 |
ʒ | measure | U+0292 |
h | hit | U+0068 |
l | long | U+006C |
m | mock | U+006D |
n | knock | U+006E |
ŋ | thing | U+014B |
r | wrong | U+0072 |
w | wasp | U+0077 |
j | yacht | U+006A |
ɒ | pot | U+0252 |
ɔ | cause | U+0254 |
u | lose | U+0075 |
i | ease | U+0069 |
æ | pat | U+00E6 |
ʌ | cut | U+028C |
ɛ | pet | U+025B |
ɪ | pit | U+026A |
ʊ | put | U+028A |
ə | allow | U+0259 |
ɝ | furs | U+025D |
a⁀ʊ | rouse | U+0061 U+2040 U+028A |
ɔ⁀ɪ | noise | U+0254 U+2040 U+026A |
o⁀ʊ | nose | U+006F U+2040 U+028A |
e⁀ɪ | raise | U+0065 U+2040 U+026A |
a⁀ɪ | rise | U+0061 U+2040 U+026A |
. | syllable delimiter | U+002E |
ˈ | primary stress | U+02C8 |
ˌ | secondary stress | U+02CC |
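
A transcription can be checked against the table above before it is entered in the editor. The Python sketch below is not Wikispeech's own validation. It assumes the transcription is a plain string, and it matches the longest symbol first so that tied symbols such as t⁀ʃ count as one phoneme. The names `ENGLISH_SYMBOLS`, `find_unknown` and `codepoints` are hypothetical.

```python
from typing import List, Set

TIE = "\u2040"  # the tie character used in the table above

# Symbols copied from the English table.
ENGLISH_SYMBOLS: Set[str] = set("ptkbdgfvθðszʃʒhlmnŋrwjɒɔuiæʌɛɪʊəɝ.ˈˌ") | {
    "t" + TIE + "ʃ", "d" + TIE + "ʒ",
    "a" + TIE + "ʊ", "ɔ" + TIE + "ɪ", "o" + TIE + "ʊ",
    "e" + TIE + "ɪ", "a" + TIE + "ɪ",
}


def find_unknown(transcription: str,
                 symbols: Set[str] = ENGLISH_SYMBOLS) -> List[str]:
    """Return characters that match no symbol, using longest match first."""
    longest = max(len(s) for s in symbols)
    unknown: List[str] = []
    i = 0
    while i < len(transcription):
        if transcription[i].isspace():
            i += 1
            continue
        for size in range(longest, 0, -1):
            if transcription[i:i + size] in symbols:
                i += size
                break
        else:
            # No symbol starts here: record the character and move on.
            unknown.append(transcription[i])
            i += 1
    return unknown


def codepoints(text: str) -> str:
    """Format text like the Unicode column, e.g. 'U+0074 U+2040 U+0283'."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)
```

For example, `find_unknown("kat")` reports `a`, because `a` is only in the set as part of a⁀ʊ and a⁀ɪ. Text copied from other sources may use the combining tie U+0361 instead of U+2040. Under the assumptions here that character is reported as unknown, and `codepoints` helps spot which one a string contains.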