Help:Content translation/Translating/Translation quality

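When creating a translation, it is essential to review the content before publishing it. You need to make sure that the content produced does not alter the meaning of the original in unintended ways, and check that it reads naturally in the target language. Content translation provides an initial machine translation as a helpful tool to speed up the translation process, and the tool encourages users to review and heavily edit that initial content.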

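Several mechanisms are used to ensure that translators properly edit the initial machine translation. The translation editor tracks how much of the initial machine translation users modify and defines different limits based on that: preventing publication, or encouraging users to review the content further.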

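In this way, the tool lets users take full advantage of machine translation while preventing the creation of low-quality results that have not been sufficiently reviewed. The sections below explain how these limits work, how they can be adjusted to the needs of each language, and how the quality of the content produced with the tool can be measured.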

Limits to encourage reviewing the translation

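The system checks the percentage of modifications that users make to the initial machine translation. From this it determines how many words of the initial translation were modified, removed, or added. The checks are made at two levels: for each paragraph and for the whole document. The thresholds that apply at each level are explained next.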
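The word-level comparison described above can be sketched as follows. This is a minimal illustration, not the tool's actual algorithm: the function name, the whitespace tokenization, and the choice to divide by the longer of the two texts (so that both removals and additions lower the score) are all assumptions made for the example.

```python
from difflib import SequenceMatcher


def unmodified_percentage(machine_translation: str, edited: str) -> float:
    """Estimate how much of the text is still unmodified machine translation.

    Assumption: words are split on whitespace, and the score is the number of
    words kept in order, divided by the length of the longer text.
    """
    mt_words = machine_translation.split()
    edited_words = edited.split()
    longest = max(len(mt_words), len(edited_words))
    if longest == 0:
        return 0.0
    matcher = SequenceMatcher(a=mt_words, b=edited_words, autojunk=False)
    # Matching blocks are runs of words that appear unchanged in both texts.
    kept = sum(block.size for block in matcher.get_matching_blocks())
    return 100.0 * kept / longest


mt = "the cat sat on the mat"
untouched = unmodified_percentage(mt, "the cat sat on the mat")
edited = unmodified_percentage(mt, "a cat sat on the rug")
```

Here `untouched` is 100.0, while replacing two of the six words brings `edited` down to roughly 66.7.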

Limits for the whole document

 
Error shown when trying to publish a machine translation with too much unmodified content. This threshold was adjusted for Indonesian based on feedback from their editors.

Publication is blocked if 95% or more of the whole document consists of unmodified machine-translated content. This limit prevents near-raw machine translations and blocks clear vandalism. It also prevents users from merely adding content without editing the machine translation portion. As detailed below, this limit can be adjusted on a per-language basis.
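As a rough sketch of the whole-document check, the snippet below aggregates per-paragraph word counts and compares the result with a configurable limit. The function names and the word-weighted aggregation are assumptions for illustration; the threshold is passed in explicitly because it can differ per language.

```python
from typing import Sequence, Tuple


def document_unmodified_pct(paragraphs: Sequence[Tuple[int, int]]) -> float:
    """paragraphs holds (unmodified_words, total_words) pairs, one per paragraph.

    Assumption: the document score is weighted by paragraph length.
    """
    kept = sum(unmodified for unmodified, _ in paragraphs)
    total = sum(words for _, words in paragraphs)
    return 100.0 * kept / total if total else 0.0


def publication_blocked(unmodified_pct: float, threshold_pct: float) -> bool:
    """Block when the document reaches or exceeds the configured limit."""
    return unmodified_pct >= threshold_pct


# Hypothetical document: one lightly edited and one heavily edited paragraph.
score = document_unmodified_pct([(90, 100), (50, 100)])
blocked_strict = publication_blocked(score, threshold_pct=70.0)
blocked_lenient = publication_blocked(score, threshold_pct=80.0)
```

With these made-up counts the document scores 70.0, so it is blocked under a limit of 70 but allowed under a limit of 80.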

Limits for each paragraph

 
Warning shown for a specific paragraph in which the unmodified machine translation exceeds the limit.

The percentage of user modifications is also measured for each paragraph. A paragraph is considered problematic when it contains more than 85% of the initial machine translation (or, when copying the contents from the source document, it contains more than 60% of unmodified content).
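The paragraph rule above could be expressed as in the following sketch. The `Paragraph` type and its `origin` labels are hypothetical names introduced for the example; only the 85% and 60% figures come from the text.

```python
from dataclasses import dataclass

MT_LIMIT_PCT = 85.0      # paragraphs that started from machine translation
SOURCE_LIMIT_PCT = 60.0  # paragraphs copied from the source document


@dataclass
class Paragraph:
    origin: str            # "mt" or "source" (hypothetical labels)
    unmodified_pct: float  # share of the initial content left untouched


def is_problematic(paragraph: Paragraph) -> bool:
    """A paragraph is problematic when it contains more than the limit."""
    limit = MT_LIMIT_PCT if paragraph.origin == "mt" else SOURCE_LIMIT_PCT
    return paragraph.unmodified_pct > limit


checks = [
    is_problematic(Paragraph("mt", 90.0)),      # above 85%
    is_problematic(Paragraph("mt", 85.0)),      # exactly at the limit
    is_problematic(Paragraph("source", 61.0)),  # above 60%
]
```

Because the text says "more than", a paragraph sitting exactly at the limit is treated here as acceptable.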

The translation editor will show a warning for each paragraph that is considered problematic, encouraging further edits by the user. In some cases users are still able to publish, but the resulting page may get added to a tracking category of potentially unreviewed translations for the community to review. In other cases, users may not be allowed to publish at all.

The following are some of the factors considered for determining whether to allow users to publish or not (some of which are still in development):

  • The number of problematic paragraphs. Users are prevented from publishing translations with 50 or more problematic paragraphs.

Publication of translations with fewer than 50 problematic paragraphs is permitted, but those with 10 to 49 problematic paragraphs are added to a tracking category of potentially unreviewed translations for the community to review.

  • Previous deleted translations. To prevent recurring problems, the tool identifies users whose published translations were deleted in the last 30 days, and imposes much stricter limits on their subsequent translations.

For users in this class, translations with 10 or more problematic paragraphs are prevented from publishing, while those with 9 or fewer problematic paragraphs are added to a tracking category of potentially unreviewed translations for the community to review.

  • User confirmation. A less strict threshold is considered for paragraphs that a user marks as resolved – taken as a signal that the user reviewed and confirmed the status of the translation.

For paragraphs where the unmodified content warning is shown but the user marks it as resolved, a less strict threshold is applied (accepting 95% of machine translation or 75% of source content). This accommodates cases where the automatic translation was exceptionally good, while still avoiding potential abuse of the feature (i.e., the user's confirmation is not followed blindly).
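The factors listed above can be combined into a small decision sketch. The names below are hypothetical, and one reading is an assumption: for users with recent deletions, "9 or fewer" is taken to mean 1 to 9, so a translation with no problematic paragraphs is published without tracking.

```python
from enum import Enum


class Outcome(Enum):
    PUBLISH = "publish"
    PUBLISH_AND_TRACK = "publish and add to tracking category"
    BLOCK = "block"


def exceeds_limit(unmodified_pct: float, from_source: bool, marked_resolved: bool) -> bool:
    """Apply the relaxed limits (95% / 75%) once the user marks a warning as resolved."""
    if marked_resolved:
        limit = 75.0 if from_source else 95.0
    else:
        limit = 60.0 if from_source else 85.0
    return unmodified_pct > limit


def decide(problematic_paragraphs: int, deleted_in_last_30_days: bool) -> Outcome:
    """Map the number of problematic paragraphs to a publishing outcome."""
    if deleted_in_last_30_days:
        block_at, track_at = 10, 1  # track_at = 1 is an assumed reading
    else:
        block_at, track_at = 50, 10
    if problematic_paragraphs >= block_at:
        return Outcome.BLOCK
    if problematic_paragraphs >= track_at:
        return Outcome.PUBLISH_AND_TRACK
    return Outcome.PUBLISH


regular_user = decide(12, deleted_in_last_30_days=False)
flagged_user = decide(12, deleted_in_last_30_days=True)
```

The same 12 problematic paragraphs lead to tracked publication for a regular user and to a block for a user whose translations were recently deleted.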

Contents not affected by the limits

Some content is not expected to be edited significantly, and thus is not considered when applying the limits described above. Very short section titles, citations, and the list of references are excluded from review. Otherwise, users could receive misleading warnings about content that should not be translated, such as book titles appearing in references or other proper nouns.
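A filter for the exclusions above might look like the sketch below. The content kinds and the three-word cutoff for a "very short" section title are hypothetical values chosen for illustration; the text does not state the actual cutoff.

```python
EXCLUDED_KINDS = {"citation", "reference_list"}  # hypothetical kind labels


def counts_toward_limits(kind: str, text: str, short_title_words: int = 3) -> bool:
    """Return False for content that is skipped when applying the limits.

    Assumption: a section title is "very short" at short_title_words or fewer.
    """
    if kind in EXCLUDED_KINDS:
        return False
    if kind == "section_title" and len(text.split()) <= short_title_words:
        return False
    return True


results = [
    counts_toward_limits("citation", "A Book Title, 1999"),
    counts_toward_limits("section_title", "History"),
    counts_toward_limits("section_title", "Early life and education in Paris"),
    counts_toward_limits("paragraph", "Some translated text."),
]
```

Only the longer section title and the regular paragraph are measured against the limits in this example.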

Adjusting the limits

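The limits described above provide a set of general mechanisms, but they may need to be adjusted to the particular needs of each wiki. Based on initial evaluations, the amount of modification that an initial machine translation needs can range from 10% to 70%, depending on the language pair. On some wikis, the default limits may be too strict, producing unnecessary noise or preventing the publication of perfectly valid translations. On other wikis, the limits may not be strict enough, allowing the publication of translations that have not been edited enough.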

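Adjusting the different thresholds allows each wiki to tailor the tool's limits to its particular needs. Feedback from native speakers is essential for adjusting the limits properly. If, based on your experience creating or reviewing translations, the current limits do not seem to work well, please share your feedback so that we can explore how to adjust them better.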

When providing feedback about adjusting the thresholds, we recommend that you first create several example translations (make sure to check the publishing options if your test is not intended to be published as regular content). When testing how the limits work for your language, it is useful to keep in mind the following:

  • Check for both cases. Make sure to check how the limits work both for translations where the content has not been edited enough and for translations where it has.

In this way, you can more easily find the right balance for the tool's limits feature. Checking only one type of problem can lead to moving the thresholds too far in the opposite direction.

  • Check different content. Content in our wikis is highly diverse, and machine translation may work much better for some cases compared to others.

For example, content that is full of numeric data or technical names may require less editing by users than content with more descriptive text. Make sure to test by translating a variety of article types, of varying lengths and with disparate content.

  • Prepare to iterate. Adjusting the thresholds is an iterative process.

It may require custom adjustments to the thresholds or improvements to the general approach. In any case, after each change, further testing may be needed to verify the improvements made.
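When testing as recommended above, it can help to tabulate both kinds of mistakes for each candidate threshold. The sketch below assumes a reviewer has labelled a set of test translations as acceptable or not; the sample values are invented for illustration and are not measurements from any wiki.

```python
from typing import List, Tuple


def evaluate_threshold(samples: List[Tuple[float, bool]], threshold_pct: float) -> Tuple[int, int]:
    """samples holds (unmodified_pct, acceptable) pairs labelled by a reviewer.

    Returns (wrongly_blocked, wrongly_allowed) for the candidate threshold.
    """
    wrongly_blocked = sum(1 for pct, ok in samples if ok and pct >= threshold_pct)
    wrongly_allowed = sum(1 for pct, ok in samples if not ok and pct < threshold_pct)
    return wrongly_blocked, wrongly_allowed


# Invented test set: (unmodified percentage, reviewer found it acceptable).
samples = [(98.0, False), (90.0, False), (80.0, True),
           (72.0, False), (60.0, True), (40.0, True)]

report = {t: evaluate_threshold(samples, t) for t in (95.0, 85.0, 70.0)}
```

In this invented set, a limit of 95 lets two poor translations through, while a limit of 70 blocks one good translation, which illustrates why looking at only one kind of error can push the threshold too far.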

Adjusting the limits in collaboration with editors has proven to be effective. For example, initial results show that the Indonesian community was able to significantly reduce the number of problematic translations they were receiving by restricting the publication of translations with more than 70% of unmodified machine translation content. Similar adjustments have been made for Telugu and Assamese language wikis. There is no automatic tool that is infallible, and these limits are not an exception.

The process of content review by the community is still essential, but these limits provide communities with a tool to reduce the number of translations they have to focus on, making the review process much more effective. Please share your feedback and we can explore how to better adjust them.

Tracking potentially unreviewed translations

A tracking category with the name "cx-unreviewed-translation-category" is provided for communities to easily find articles that have been published with some content exceeding the recommended limits.

You can find this category in the list of tracking categories on each wiki. Using it, you can track articles that passed the limits preventing publication, but that still had some paragraphs that were edited less than expected. For example, the Indonesian Wikipedia's category includes articles that have less than 70% of machine translation overall, but which have some paragraphs with more than 85% of unmodified machine translation.

Measuring translation quality

Automatically evaluating content quality is not a trivial task. Deletion ratios provide a useful estimate of whether the content created was good enough for the community of editors not to delete it. Based on the analysis of deletion ratios, articles that are created as translations are less likely to be deleted when compared with articles created from scratch. This suggests that it may not be practical to set the limits for participation through translating much higher than those set for other ways of article creation.

Find published translations

Content translation adds a contenttranslation edit tag to the published translations. This allows communities to use Recent changes, and similar tools, to focus on pages created using the translation tool. In addition, data on published translations and the statistics for machine translation use are available for anyone to analyze.
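Besides the Recent changes page, the tag can be used through the MediaWiki Action API. The sketch below only builds a query URL for `list=recentchanges` filtered with `rctag`; the helper name is hypothetical and the Indonesian Wikipedia endpoint is just an example, so substitute your own wiki's `api.php`.

```python
from urllib.parse import urlencode


def recent_translations_url(api_endpoint: str, limit: int = 50) -> str:
    """Build an Action API URL listing new pages tagged contenttranslation."""
    params = {
        "action": "query",
        "list": "recentchanges",
        "rctag": "contenttranslation",  # the edit tag named in the text
        "rctype": "new",                # page creations only
        "rcprop": "title|user|timestamp|tags",
        "rclimit": limit,
        "format": "json",
    }
    return f"{api_endpoint}?{urlencode(params)}"


# Example endpoint; no request is sent here, the URL is only constructed.
url = recent_translations_url("https://id.wikipedia.org/w/api.php", limit=10)
```

Fetching the resulting URL with any HTTP client returns the matching changes as JSON, which can then be reviewed or counted.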

Inspect a specific translation

Translation debugger example

The Translation debugger is a tool that allows the inspection of some metadata for a given translation, including the percentage of machine translation used for the whole document, and the translation service used for each paragraph.

Other limits based on user rights

 
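Error shown for a publishing restriction based on user rights. This example is based on the English Wikipedia community's decision to restrict direct publication of articles to the main namespace to extended confirmed users.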

Some wikis have implemented additional translation restrictions based on user rights to reduce the creation of low-quality translations. For example, English Wikipedia requires users to be extended confirmed, which means they need to make 500 edits on English Wikipedia before they are allowed to publish a translation as an article. Newer editors can still publish translated articles in the User: or Draft: namespaces, and then move the article to the mainspace.

This restriction was created before the system of limits described in this page was available, and it is not the recommended approach to encourage the creation of good quality translations.

Before adding restrictions that do not take into account the content created, consider going through the process of adjusting the limits of unmodified content as described above. The limits can be made as strict as needed to prevent low quality translations, while still allowing publication by editors making good translations.