This is a MediaWiki.org user page.

If you find this page on any site other than MediaWiki.org, you are viewing a mirror site. Be aware that the page may be outdated, and that the user this page belongs to may have no personal affiliation with any site other than MediaWiki.org itself. The original page is located at https://www.mediawiki.org/wiki/User_talk:Shirayuki.

The MediaWiki logo
Previous page history was archived for backup purposes at User talk:Shirayuki/LQT Archive 1 on 2015-07-10.

Splitting translatable paragraphs

15
Amire80 (talkcontribs)

Shirayuki,

I'm asking you one last time.

Please stop splitting paragraphs in translatable pages into many units. Especially if they are already translated, and even if they are not.

The page Global templates/Proposed specification was completely translated to several languages. You messed it up with your "translation tweaks".

You didn't bother fixing the translations. You didn't even translate the page into your language. You are just doing it over and over again, and you call it "tweaks".

It's not a "tweak". A tweak is supposed to improve something. This is not an improvement. This is making a mess.

You contribute a lot to translatable pages, which is good, but this incessant splitting is not desirable. I'll propose to revoke your translation administrator rights if you keep doing this.

Want (talkcontribs)

Sorry, but I feel the need to defend Shirayuki's actions. As a translator, developer and editor of translation wiki pages, I am fully aware of why he does this. Technical texts, if they are longer, are much more difficult to translate in large blocks.

Next.

Do you know what a revision looks like in the database? If so, you know that it is much better for the database if the page content consists of many small TUs rather than a few large ones.

Why?

  • A paragraph composed of several TUs is easier to update. Usually only one sentence is added or changed, not the whole paragraph.
  • Short text is easier to understand. Easy TUs can be translated quickly, leaving only the complicated TUs to the code experts.
  • Databases work more efficiently with repetitive data, and short TUs on MW are often repeated.
  • When you need to change a link in a TU on the source page, you don't have to invalidate the entire paragraph (which is a big change). Someone can then update the translation without even understanding the language, just by looking at the diff.
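
A toy illustration of the first point, assuming a translation is flagged as outdated whenever the source text of its unit changes. This is a simplification for the sake of the argument, not the Translate extension's actual logic, and the example sentences are made up:

```python
# Toy model (not the Translate extension's code): a unit's translation is
# treated as outdated whenever that unit's source text changes.

def outdated_units(old_units, new_units):
    """Return the indexes of units whose source text differs between revisions."""
    return [i for i, (old, new) in enumerate(zip(old_units, new_units)) if old != new]

# Hypothetical three-sentence paragraph; only the second sentence is edited.
old_sentences = [
    "Install the extension.",
    "Enable it in LocalSettings.php.",
    "Run the update script.",
]
new_sentences = list(old_sentences)
new_sentences[1] = "Enable it in your LocalSettings.php file."

# One unit per sentence: only the edited sentence needs retranslation.
per_sentence = outdated_units(old_sentences, new_sentences)

# One unit for the whole paragraph: that single unit is outdated as a whole.
per_paragraph = outdated_units([" ".join(old_sentences)], [" ".join(new_sentences)])

print(per_sentence, per_paragraph)
```

With per-sentence units, two of the three existing translations stay valid; with one paragraph-sized unit, the whole translation has to be reviewed.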

I could give more reasons, but a talk page reply isn't the right place for it because of the character limit.

Amire80 (talkcontribs)

I am one of the developers of the Translate extension, so yes, I know what it looks like in the database, and I have no reason to think that what you say is technically correct.

Want (talkcontribs)

In that case, you should explain how I'm wrong. I also manage the server my wiki runs on, so I'm looking at this as root as well.

As far as I know, after a page is marked for translation, a template is created for each language version, and the translation messages are incorporated into it and then interpreted. Each edit is a unique revision. Not all of them are loaded, only the currently valid ones. If I change a small TU, the revision is minimal. That must have an effect on how the database works.

Want (talkcontribs)
Shirayuki (talkcontribs)

I generally avoid translating long translation units composed of multiple sentences. This is because there is a high risk that they will be modified later, rendering the translations invalid.

Additionally, there are translation administrators who invalidate the entire paragraph’s translation for trivial edits, such as changing quotation marks.

Whether a change is major or trivial, investigating each modification and reflecting it in the translation is painstaking work.

Amire80 (talkcontribs)

This quotation marks example is not so good for two reasons:

  1. I'm also updating the translations myself, so no one actually has to worry about it. (I haven't finished it yet for all languages because there's a lot of other mess to clean up.)
  2. Even if I didn't, addressing those updates is trivial: just look at the list of outdated messages and check the diffs.

Updating translations after splitting a paragraph is much harder. You do this splitting, but you don't bother to update the translations. Please stop.

Want (talkcontribs)

Sorry, but you obviously have no idea what the effect of marking a page for translation on a multilingual wiki is. I intend to make it clear in that essay, but I'll be off the internet in a little while.

BenyElbis (talkcontribs)

I fully agree with both Shirayuki and Want.

I am a translator. It is much easier for me to work with a shorter text than with a long one containing several sentences. I am more likely to notice incorrect text or subtleties of the language. A shorter text is also more advantageous when using the text hints in the translation tool. For long texts the hints are not displayed at all, so their advantage is largely lost. I don't understand at all why a text made up of several long sentences should be divided three times by semicolons. Another problem is the marking of untranslatable texts ('shape'). I get lost in it in long paragraphs, just like marking with different texts used to be. It just doesn't suit me and I like to use shorter sentences. I propose a very extensive discussion on this topic with a large number of people involved in translations and preparation for translations. It is a really complex topic and solutions cannot be made with regard to already translated texts. That's the development tax.

Amire80 (talkcontribs)

I agree that translatable units should be short. The question is how short should they be and how to achieve it technically.

The right way to do it is to write shorter paragraphs from the start. The responsibility for this should be on the original page's author. A ten-sentence paragraph is a bad idea in any case, but a five-sentence paragraph is OK.

Taking a paragraph that is already short and splitting it into even smaller parts is a bad idea. I haven't seen anyone except Shirayuki doing this. It's an exaggeration that does more harm than good.

I started a discussion about this a couple of weeks ago on this page: m:Talk:Translatability

Shirayuki (talkcontribs)

Several edits prior to my revision Special:Diff/6624825/6626346 added variables to translation units consisting of multiple sentences (as well as splitting them into multiple translation units).

I believe it is better to split them before adding variables and making them complex.

Shirayuki (talkcontribs)

BenyElbis: Please refrain from adding variables to translation units consisting of multiple sentences.

BenyElbis (talkcontribs)

I don't understand, please give an example. Thank you

Shirayuki (talkcontribs)

Thank you for reading the discussion. See the reply immediately above (at this point in time).

This is related to your addition of variables to complex translation units, for example in Special:Diff/6624825/6626346, and my subsequent partial reversion and splitting of those translation units in Special:Diff/6627645.

I had thought it would be better to split them first before adding variables, but if you hadn't added the variables, I wouldn't have felt the need to actively split them.

Amire80 (talkcontribs)

No, there is nothing wrong about adding variables to translation units consisting of multiple sentences.

There is something wrong about adding <translate> tags in the middle of paragraphs, especially if they were already translated to multiple languages, and forcing people to re-translate them for no reason at all.

Reply to "Splitting translatable paragraphs"

Question about code of the Template:Languages

3
Want (talkcontribs)

Hi Shirayuki. I found that on 17.7.2019 you made the change Special:Diff/3319554/3319557 to Template:Languages, and I don't understand how it works.

I don't know anything about attributes for the 'languages' tag, but the code uses an inline attribute 'exists'. Why? The #ifeq function accepts four arguments, but the result is still false (because a 'languages' tag element with that attribute does not exist). Could you please clarify this for me? Thank you. -- Want (talk) 15:20, 15 July 2024 (UTC)

Shirayuki (talkcontribs)
Want (talkcontribs)

Your belief is wrong. I prepared three example pages for you on my wiki:

Note that the #ifeq function always returns FALSE when the languages tag is used, regardless of whether the page is being translated or not.

Only a test for the existence of a language subpage of the marked page works. It returns TRUE for 'test-translate' because the page test-translate/cs exists (the page has been marked for translation), but it returns FALSE for 'test-insert' because the subpage test-insert/cs does not exist. It's exactly the same for the translated template. -- Want (talk) 05:07, 16 July 2024 (UTC)

Reply to "Question about code of the Template:Languages"

Splitting paragraphs for translation

17
Amire80 (talkcontribs)

Hi,

Sometimes you do it by splitting the paragraph into several lines like here, and sometimes you add <translate> tags in the middle of the paragraph like here.

Is there a suggestion documented anywhere to split paragraphs into smaller translation units? Or is it just something that you do yourself?

Shirayuki (talkcontribs)

I thought I wrote it down somewhere, but I couldn't find it.

  • By making translation units as small and simple as possible, translations become easier and are less likely to be left untranslated.
  • Additionally, fewer types of variables ($1, $2, etc.) are needed.
  • Moreover, simplifying variable names increases the likelihood of hitting the translation memory, thereby making the translation process more efficient.
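
The last point can be sketched with a toy exact-match translation memory. This is an assumption for illustration only: real translation memories also offer fuzzy matches, and the `suggest` helper, the `$manual` variable name and the sample sentence are hypothetical.

```python
# Toy exact-match translation memory (illustrative assumption; real
# translation memories also return fuzzy matches).
memory = {
    "See $1 for details.": "詳細は $1 を参照してください。",
}

def suggest(source_text):
    """Return a stored translation only when the source text matches exactly."""
    return memory.get(source_text)

hit = suggest("See $1 for details.")        # same wording, same variable name
miss = suggest("See $manual for details.")  # same wording, different variable name

print(hit is not None, miss is None)
```

In this model, the same sentence written with a differently named variable no longer matches the stored entry.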
Shirayuki (talkcontribs)
Amire80 (talkcontribs)

> By making translation units as small and simple as possible, translations become easier and are less likely to be left untranslated.

Have you measured it? Are pages where you did this split actually more likely to have more translations?

> Moreover, simplifying variable names increases the likelihood of hitting the translation memory, thereby making the translation process more efficient.

It's not significant. Sentences within paragraphs of body text are not likely to be in translation memory anyway.

Making translation units shorter is generally a good idea, but the way in which you do it is not great. It adds a lot of markup, which makes the page hard to edit.

It's better to make the paragraphs shorter and rely on the Translate extension's capability to mark each paragraph for translation.

Shirayuki (talkcontribs)
  • MediaWiki.org pages are frequently updated, and shorter translation units are less susceptible to changes. Longer translation units can lead to the invalidation of entire translations.
Amire80 (talkcontribs)

That's OK, but is it worth adding so much markup to achieve that? Just making sure that paragraphs are no longer than four sentences achieves a much better balance of translation ease and markup heaviness than splitting everything to single sentences. Besides, keeping paragraphs shorter is a good thing in general for readability in the source language.

Want (talkcontribs)

Yes. Markup is only a signal to the parser. Every text unit is numbered. Experienced editors have no problem with orientation in wikicode of this type. You must first understand how this concept of translation works. A text unit is really a wiki page in the Translations namespace. Every change has its own ID. If even one character changes, the TU waits for a new revision by a volunteer translator. And more sentences in one TU complicate it.

Amire80 (talkcontribs)

Experienced editors have no problem with orientation in wikicode of this type - actually, they do. I do, and lots of other people do. It's way too much code, and it makes editing the source page harder.

And more sentences in one TU complicate it. - a paragraph of five sentences is not too much. More than that is, and when I prepare pages for translation, I try to reduce paragraphs to five sentences or fewer. But splitting a paragraph into single sentences doesn't help much.

Want (talkcontribs)

I am sorry, but the page Extension:DynamicPageList_(disambiguation) isn't a simple page, because it combines very complex wikicode:

  • list (outline) text, where a paragraph can't be split in the usual way, because a new line is not a continuation of the list item's paragraph but a new paragraph at the same list level, without the list character and indentation (a colon must be used for that)
  • parameterized multilingual links created by Template:ll
  • and syntax example code, which must be protected from interpretation by the parser

I have a multilingual wiki and use this a lot; see my Main Page, which is generated by it. That is why I know the specifics of wiki markup for multilingual pages very well. I don't use any Lua modules for it.

I have a question. Do you edit pages with the visual editor, or do you edit plain wikicode? You speak as an editor who wants simple code, but a translator wants simple TUs. Text with several sentences demands a lot more knowledge from the translator than you think. The consequence is that translators are few and far between, because translating large chunks of text with minor changes is not easy. They do not have time to update such pages, which therefore often remain untranslated for a long time.

Amire80 (talkcontribs)

I am sorry, but the page Extension:DynamicPageList_(disambiguation) isn't a simple page, because it combines very complex wikicode - I'd argue that it's not so important to translate this page in the first place. I only used it as an example because it was easy to find. People who install extensions are more likely to know English. Some of them don't know English, so it's still useful to translate them, but it's less important than the Visual Editor user guide, for example.

For editing pages, I use the visual editor when I can and wikitext when the visual editor wouldn't work well. It's obvious that translators want simple translation units; what I say is that a translation unit with five sentences is usually simple enough.

Want (talkcontribs)
Want (talkcontribs)

No, you are wrong. Read the m:Translatability page, please. But use translate markup as you see fit. For me, translating and preparing pages for translation on MediaWiki.org is a peripheral matter, for which no one pays me and which I do when I have the mood and the time, so that I don't have to create duplicate manuals on my wikis.

it's still useful to translate them, but it's less important than the Visual Editor user guide, for example. – The Visual Editor user guide is important only for wikis where it is used. My wikis don't use it. In my experience, users who use Visual Editor (the default on Wikimedia wikis) don't know basic wiki markup and have problems using templates. My wikis don't have many users, and I don't have time to review other users' code. I disabled VE, and users now understand the code better.

Amire80 (talkcontribs)

The Visual Editor user guide is important only for wikis where it is used. My wikis don't use it. - I'm not sure which wiki you are referring to when you say that your wiki doesn't use Visual Editor. If you are talking about the Czech Wikipedia, then it definitely does use the Visual Editor: since January 2023, the level of Visual Editor usage in cs.wikipedia.org is above 20% most of the time. On mediawiki.org and meta, it's only about 1%, but that's understandable because they are very technical.

As for the page m:Translatability, it says: "Don't put too much text within one translation unit. Create more translation units instead." I generally agree with this, but the way it's written now is too vague and ambiguous: it doesn't say how more translation units should be created. I started a discussion about it on its talk page.

Want (talkcontribs)

cs:Wikipedia isn't a multilingual wiki. MediaWiki.org supports only English as the source language. Combining source pages in several languages is more complicated. For that I created several special templates, unfortunately unused here, because here it's solved by Lua, not by wiki code and PHP extensions as is commonly done.

However, I have the advantage that I am not only a sysop, but also a host administrator.

Want (talkcontribs)

I meant wikis https://www.thewoodcraft.org and https://wiki.control.fel.cvut.cz

When I do the marking, I follow my intuition. Some languages say in one sentence what others say in two. A TU must be logical and created in such a way that the translator is not in doubt. I noticed your change on m:Translatability, marked it for translation, and translated it into my language. It's logical and to the point. But I split TU No. 3 and created two new ones. Why?

  1. You expanded TU No. 3. That's OK and easy to understand.
  2. The new TU No. 16 contains information about code that does not belong in a TU.
  3. And vice versa, TU No. 17 is now about wiki code that can be part of a TU.
Shirayuki (talkcontribs)

Don't you think enabling a syntax highlighter would improve understanding of wiki markup? It's indispensable for me when editing complex sources with translation tags. I don't like the visual editor and prefer to edit the source directly.

Amire80 (talkcontribs)

@Shirayuki, if you're asking me, then no, syntax highlighter doesn't really help. Splitting a single paragraph into many translation units makes it look too much like code. It shouldn't be like that. It's not code, it's text.

Reply to "Splitting paragraphs for translation"
Kaganer (talkcontribs)

About your revert of my edit.

If you have specific objections to my edit, please explain them. If not, please restore my version of the template. Currently, the phrase "Other languages:" on the main page is always displayed in English. After my edit, it is synchronized with the displayed language version (if a translation is provided), or displayed in the user interface language.

Shirayuki (talkcontribs)

When I checked the source, the Template:Languages did not have a parameter '2'. That is why I reverted your changes and suggested using sandbox pages for prototypes of such high-impact pages.

However, I agree with the idea of synchronizing the phrase with the user interface language.

Kaganer (talkcontribs)

This change is on the one hand small and on the other hand difficult to verify outside of the current usage chain. That's why I did it on the Main Page directly, and also in the real templates.

Using {{ll|Project:Language policy|{{int:tpt-languages-legend/{{SUBPAGENAME}}}}}} in Template:Languages is also justified. IMHO, it is necessary to first use a translation that matches the current language version of the subpage being called (regardless of the interface language), and then, if there is no such translation, fall back to the user interface language.

Shirayuki (talkcontribs)
Reply to "Template:Main page"
JWBTH (talkcontribs)

Hello, thanks for marking the article for translation. Why did you remove <translate>...</translate> tags from the first sentences though?

Shirayuki (talkcontribs)
JWBTH (talkcontribs)
Reply to "Translation tags on OOjs"

Renaming MediaWiki Users and Developers Conference 2024 page

3
MyWikis-JeffreyWang (talkcontribs)

Hello,

Per the consensus from conference organizers, we need to rename the "MediaWiki Users and Developers Conference 2024" page into "MediaWiki Users and Developers Conference Spring 2024" because there will probably be a MediaWiki Users and Developers Conference Fall 2024. Your help would be much appreciated.

CC: @CCicalese (WMF) in case a +1 is needed.

Shirayuki (talkcontribs)

Done

MyWikis-JeffreyWang (talkcontribs)

Thanks!

Reply to "Renaming MediaWiki Users and Developers Conference 2024 page"
WikiForMen (talkcontribs)

Is there a handout to understand this stuff?

Shirayuki (talkcontribs)
WikiForMen (talkcontribs)

OK, in that diff I cannot see any fragmented or "patchwork" messages, and it is also not related to the <translate> stuff.

One thing I don't understand is when a <translate> tag is set and when it is skipped.

Shirayuki (talkcontribs)
  • When you use empty lines within the <translate>...</translate> tags, it is split into different translation units at those positions.
<translate>
== heading1 ==

sentence1.

== heading2 ==

sentence2.
</translate>
  • If you want each sentence to be a separate translation unit, you cannot omit the <translate> tags.
<translate>
== heading ==
</translate>
<translate>sentence1.</translate>
<translate>sentence2.</translate>
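
The empty-line rule from the first example can be sketched as follows. This is a rough model of the behaviour described above, not the Translate extension's parser; `split_units` is a hypothetical helper name.

```python
import re

def split_units(translate_body):
    """Split the text inside one <translate>...</translate> pair on empty lines."""
    parts = re.split(r"\n\s*\n", translate_body.strip())
    return [part.strip() for part in parts if part.strip()]

# Body of the first example above (the text between the <translate> tags).
body = """
== heading1 ==

sentence1.

== heading2 ==

sentence2.
"""

units = split_units(body)
for unit in units:
    print(repr(unit))
```

Each heading and each sentence ends up as its own unit because an empty line separates them. Two sentences on adjacent lines with no empty line between them would stay in one unit, which is why the second example wraps each sentence in its own pair of tags.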
WikiForMen (talkcontribs)

"If you want each sentence to be a separate translation unit, you cannot omit the <translate>...</translate> tags."

This is my point. In many parts there are no <translate>...</translate> tags, but only <!-- T:xy--> tags. But you told me

Do not add unwanted html comments like <!---->.

so I get confused. :-(

Shirayuki (talkcontribs)

Below, there is my reply beginning with "Additional explanation," but does it answer your question?

WikiForMen (talkcontribs)

So far, it has turned out that every time I thought I had understood it, I was wrong again.

The whole thing is far too complicated, error-prone and contains too many "extra tricks", especially when it comes to the links to be translated.

I guess I'll keep making mistakes. But your explanations have brought some clarity. Thank you!

For today at least, the goal has been achieved and the documentation for an - in my opinion important - extension has been successfully made translatable. Also thanks to a lot of help from your side. :-)

Shirayuki (talkcontribs)

Additional explanation seems necessary to answer your question in Special:Diff/6296870.

In translation pages, untranslated or fuzzy translation units are enclosed with either <div> or a <span>.

If there is no newline character immediately before the translation unit marker (<!--T:XX-->), it is considered inline and enclosed in a <span>; otherwise, it is enclosed in a <div>.
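
A minimal sketch of that rule, applied literally: the wrapper depends only on the character immediately before the marker. `wrapper_for` is a hypothetical helper, and the sample wikitext (including the "and later" fragment) is made up for illustration.

```python
def wrapper_for(source, marker):
    """Return 'div' if a newline immediately precedes the marker, else 'span'."""
    pos = source.index(marker)
    # Assumption: a marker at the very start of the text has no newline
    # before it, so the rule as stated treats it as inline.
    preceded_by_newline = pos > 0 and source[pos - 1] == "\n"
    return "div" if preceded_by_newline else "span"

# Marker on its own line: block-level unit.
block = "<translate>\n<!--T:1-->\nThe extension was tested with MediaWiki.</translate>"

# Marker directly after the opening tag, on the same line as the preceding text: inline unit.
inline = "The extension was tested with MediaWiki <translate><!--T:2--> and later.</translate>"

block_wrapper = wrapper_for(block, "<!--T:1-->")
inline_wrapper = wrapper_for(inline, "<!--T:2-->")
print(block_wrapper, inline_wrapper)
```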

In my edit, inline style was used to display the version numbers on the same line as the preceding text.

Shirayuki (talkcontribs)
  1. When the original text is cut off mid-sentence, as in "The extension was tested with MediaWiki", it is unclear that version numbers follow. This leads to difficulties in accurately translating it into many languages.
  2. Punctuation marks should also be translatable.
WikiForMen (talkcontribs)

The idea is not to touch the existing translations if only the version numbers are updated.

WikiForMen (talkcontribs)
Reply to "wrong markup"

New Page to be Marked for Translation

13
Joris Darlington Quarshie (talkcontribs)

Hello Shirayuki, could you please mark this header for translation: Wikimedia Tech Safari Program/Header. Also, the final GIF banner I uploaded is currently shown only on the overview page and the header page; the other subpages still have the old GIF banner. If that can be fixed, I will be grateful.

Shirayuki (talkcontribs)

I was wondering about the margin of this image, but it turns out to be an animated GIF. In my environment, the animation doesn't play, and only the outline of the African continent is displayed.

Joris Darlington Quarshie (talkcontribs)

From my end it works perfectly, tried viewing it on commons and it also works perfectly. The African image is static and the other content is animated.

Shirayuki (talkcontribs)
Joris Darlington Quarshie (talkcontribs)

Oh okay i can now see it via this URL. I don't know how this can be fixed.

Shirayuki (talkcontribs)
Joris Darlington Quarshie (talkcontribs)

oh okay i do understand it now.

WikiForMen (talkcontribs)
Joris Darlington Quarshie (talkcontribs)

Hello @Shirayuki please can the pages be marked for translation again

Right now the only changes I will make are to the schedule, discussion, organizers and team pages. So the overview and participate pages can be marked for translation.

Joris Darlington Quarshie (talkcontribs)
Joris Darlington Quarshie (talkcontribs)
Shirayuki (talkcontribs)

Pages that need to be marked for translation are listed on Special:PageTranslation, and in most cases, they are marked by one of the translation administrators within 24 hours.

So, there's no need to notify each time.

Joris Darlington Quarshie (talkcontribs)

Oh okay well noted with lots of thanks

Reply to "New Page to be Marked for Translation"
Aaron Liu (talkcontribs)

Recently, you undid my revision to {{main}} to make it use {{ll}} instead to automatically localize link names, with the summary "It does not localize". I am confused. As you can see on {{localized link/ja}}, the pagenames are automatically changed into their Japanese translations. The changes also hadn't been propagated to the translated versions yet.

Shirayuki (talkcontribs)

Passing a second parameter to {{ll}} overrides the automatic localization of link names, so they are not automatically localized. Using that template compromises readability.
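
The override can be pictured with a small model. It is only a sketch of the behaviour described here, not the code of {{ll}}; `ll_label` and the `LOCALIZED_TITLES` table with its sample entry are hypothetical.

```python
# Hypothetical stand-in for the localized page titles a link template could look up.
LOCALIZED_TITLES = {("Help:Links", "ja"): "Help:リンク"}

def ll_label(page, lang, label=None):
    """Pick the link text: an explicit label wins, otherwise the localized title."""
    if label is not None:
        # A second parameter was passed, so it overrides the localization.
        return label
    return LOCALIZED_TITLES.get((page, lang), page)

auto = ll_label("Help:Links", "ja")                  # no label: localized title
forced = ll_label("Help:Links", "ja", "Help:Links")  # label passed: stays as given
print(auto)
print(forced)
```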

Aaron Liu (talkcontribs)

But if the corresponding parameters are empty, as {{main}} is often used, the second parameter would not be passed. By readability, if you mean that of the template code, that can easily be fixed by enabling syntax highlighting.

Aaron Liu (talkcontribs)

I'll go ahead with restoring the edit in a day.

Reply to "Would not localize"
W like wiki (talkcontribs)

Hi Shirayuki, thank you for your edit. May I ask whether it is generally possible to use something like 5, 10, 15, 20 instead of 1, 2, 3, 4? Then there would be room to insert passages later. Maybe it doesn't matter in this case, but in more complex texts it makes it a bit difficult for an editor to insert a passage later. E.g. in this edit I was happy that T:39 was still free for use, but if not... T:170? Another idea could be to start each chapter at a new multiple of 10, e.g. 1, 2, 3, - 10, 11, 12, - 20, 21, 22, ... or with more space in between: 2, 4, 6, - 10, 12, 14, 16, - 20, 22, 24, 26, ... That would keep enough flexibility for future changes. Is this a useful idea? Regards

Shirayuki (talkcontribs)

Do NOT manually add translation unit markers (T:NN). They are automatically added by the system, not by translation administrators.

For example, even if T:39 appears to be free, it may have been previously used for an entirely different English sentence, or translations may exist for it, causing confusion.

See .

Someone has assigned the translation unit T:39 for "Description" to the unrelated "Maps is a single data structure..." text!

As a result, incorrect translations for "Description" exist in languages other than English, causing inconvenience to readers and translators in other languages!!
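
A toy model of why a reused marker goes wrong, assuming translations are stored per unit number and language, independently of what the English source for that number currently says (a simplification, not the extension's storage code):

```python
# Toy model: translations are keyed by (unit id, language).
source_units = {"T:39": "Description"}
translations = {("T:39", "ja"): "説明"}  # existing Japanese translation of "Description"

def render(unit_id, lang):
    """Show the stored translation if one exists, else the English source."""
    return translations.get((unit_id, lang), source_units[unit_id])

# A manual edit reuses marker T:39 for unrelated English text.
source_units["T:39"] = "Maps is a single data structure..."

english = render("T:39", "en")
japanese = render("T:39", "ja")
print(english)
print(japanese)
```

The English page shows the new sentence, while the Japanese page still shows the old translation of "Description" in its place.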

W like wiki (talkcontribs)

hmm, ok. So using T:39 was wrong, I had to use T:172 ?

By the way: Did we accidentally edit the same time or do you not like the Capital Letters? They were intended as a navigational aid. Regards

Shirayuki (talkcontribs)

I just didn't want the braces to be included in the link, similar to {{Mbox templates}}.

Feel free to add the capital letters if needed.

W like wiki (talkcontribs)

ok, done. funny by the way, "layout!" - "style!" :-D Best Regards By the way, 2009/10 I was living for one year in Osaka, was nice! My beginning was the total solar eclipse in 2009 Amami Oshima. Where are you from or living?

Shirayuki (talkcontribs)
Warning: NEVER add translation unit markers (T:NN) manually!

See .

Instead of the correct English text "Maps is a single data structure...", "説明" (the Japanese translation of "Description") has been mistakenly inserted.

W like wiki (talkcontribs)

hmm, ok. Sry for the chaos!!

Reply to "T:1, T:2, T:3, T:4"