Help:Content translation/Translating/Translation quality


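When creating a translation, you must review the content before publishing it. You need to make sure that the resulting content does not unintentionally change the original meaning, and check that it reads naturally in the target language. The initial machine translation helps speed up the start of the translation process, but the tool encourages users to review the initial content carefully and edit it substantially.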

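Several mechanisms are in place to ensure that translators edit the initial machine translation adequately. The translation editor tracks how much of the initial machine translation the user has modified, and uses this to define different limits: preventing publication, or encouraging the user to review the content further.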

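In this way, the tool lets users take full advantage of machine translation while preventing the creation of unreviewed, low-quality content. The sections below describe how these limits work, how they can be adjusted to the needs of each language, and how the quality of the content produced with the tool can be measured.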

Limits to encourage reviewing translations

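Content translation measures the percentage of modifications that users make to the initial machine translation. From this, the system knows how many words of the initial translation were removed, modified, or added. The system checks this at two levels: for each paragraph and for the whole document. The thresholds at each level are explained next.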
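The word-level measurement described above can be illustrated with a minimal sketch. It assumes whitespace-separated words and uses Python's standard `difflib` to align the two texts; the function name `unmodified_ratio` is hypothetical, and the tool's actual algorithm (including how it tokenizes languages written without spaces) is not described on this page.

```python
from difflib import SequenceMatcher


def unmodified_ratio(machine_text: str, edited_text: str) -> float:
    """Fraction of words in the edited text that are unchanged machine translation.

    Assumption: words are separated by whitespace. This is an illustration,
    not the measurement the tool itself performs.
    """
    mt_words = machine_text.split()
    edited_words = edited_text.split()
    if not edited_words:
        return 0.0
    matcher = SequenceMatcher(a=mt_words, b=edited_words, autojunk=False)
    # Matching blocks are runs of words present, in order, in both texts.
    unchanged = sum(block.size for block in matcher.get_matching_blocks())
    return unchanged / len(edited_words)
```

For instance, changing one word of a four-word machine translation leaves three of the four published words unmodified, so the sketch reports 0.75.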

Limits for the whole document

 
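An error message is shown when trying to publish a machine translation with too much unmodified content. This threshold has been adjusted for Indonesian based on feedback from editors.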

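Publishing is blocked if 95% or more of the whole document consists of unmodified machine translation. This limit prevents near-raw machine translation from being published and avoids obvious cases of vandalism. It also prevents users from only adding content without editing the machine-translated parts. As described below, this limit can be adjusted for each language.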
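As a rough illustration of the document-level rule, the following sketch blocks publishing once the unmodified fraction reaches the limit. The function name and the `overrides` mapping are hypothetical; they only model the idea that a wiki can replace the 95% default with its own value, and say nothing about how the tool is actually configured.

```python
from typing import Mapping, Optional

DEFAULT_DOCUMENT_LIMIT = 0.95  # from the text: 95% or more blocks publishing


def is_publishing_blocked(
    unmodified_ratio: float,
    language: str,
    overrides: Optional[Mapping[str, float]] = None,
) -> bool:
    """Return True when the whole document has too much unmodified machine translation.

    `overrides` is a hypothetical per-language table of stricter or looser limits.
    """
    limit = (overrides or {}).get(language, DEFAULT_DOCUMENT_LIMIT)
    return unmodified_ratio >= limit
```

With no override, a document that is 94% unmodified can still be published, while one at exactly 95% cannot.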

Limits for each paragraph

 
A warning is shown for a specific paragraph in which the unmodified machine translation exceeds the limit.

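The percentage of user modifications is also measured for each paragraph. A paragraph is considered problematic when it contains more than 85% of the initial machine translation (or, when the content was copied from the source document, more than 60% of unmodified content).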
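The per-paragraph rule can be sketched as follows. The `Paragraph` class and its field names are hypothetical; only the two limits (85% for machine translation, 60% for content copied from the source) come from the text above.

```python
from dataclasses import dataclass

MT_LIMIT = 0.85      # paragraph started from machine translation
SOURCE_LIMIT = 0.60  # paragraph started as a copy of the source text


@dataclass
class Paragraph:
    copied_from_source: bool  # False means it started from machine translation
    unmodified_ratio: float   # fraction of the initial content left unchanged


def is_problematic(paragraph: Paragraph) -> bool:
    """A paragraph is problematic when its unmodified content exceeds its limit."""
    limit = SOURCE_LIMIT if paragraph.copied_from_source else MT_LIMIT
    return paragraph.unmodified_ratio > limit
```

The lower limit for copied source content reflects that text left in the source language needs more rewriting than text that has already been machine translated.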

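The translation editor shows a warning for each paragraph that is considered problematic, encouraging the user to edit it further. In some cases the user can still publish, but the resulting page may be added to a tracking category of potentially unreviewed translations for the community to review. In other cases the user may not be allowed to publish at all.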

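The following are some of the factors considered when deciding whether a user is allowed to publish (some of them are still in development):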

  • The number of problematic paragraphs. Users are prevented from publishing translations with 50 or more problematic paragraphs.

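Translations with fewer than 50 problematic paragraphs can be published, but those with 10 to 49 problematic paragraphs are added to a tracking category of potentially unreviewed translations for the community to review.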

  • Previously deleted translations. To prevent recurring problems, the tool identifies users whose published translations were deleted in the last 30 days, and applies much stricter limits to their subsequent translations.

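For such users, translations with 10 or more problematic paragraphs are blocked from publishing, and translations with 9 or fewer problematic paragraphs are added to a tracking category of potentially unreviewed translations for the community to review.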

  • User confirmation. A less strict threshold is considered for paragraphs that a user marks as resolved—taken as a signal that the user reviewed and confirmed the status of the translation.

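For paragraphs that show the unmodified-content warning but that the user marks as resolved, a less strict threshold is applied (accepting 95% of machine translation or 75% of source content). This provides a way to accommodate cases where the automatic translation is very good, while still avoiding potential abuse of the feature (that is, the user's confirmation is not followed blindly).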
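The factors listed above can be combined into a small decision sketch. All names here (`Outcome`, `paragraph_limit`, `publishing_outcome`) are hypothetical, and the sketch takes the stated numbers literally: for users with a recent deletion, "9 or fewer" is read as including zero problematic paragraphs, which is an assumption rather than confirmed behavior.

```python
from enum import Enum


class Outcome(Enum):
    PUBLISH = "publish"
    PUBLISH_AND_TRACK = "publish and add to the tracking category"
    BLOCK = "block publishing"


def paragraph_limit(copied_from_source: bool, marked_resolved: bool) -> float:
    """Unmodified-content limit for one paragraph, as a fraction."""
    if marked_resolved:
        # Less strict limits once the user confirms the paragraph.
        return 0.75 if copied_from_source else 0.95
    return 0.60 if copied_from_source else 0.85


def publishing_outcome(problematic_paragraphs: int, recently_deleted: bool) -> Outcome:
    """Decide what happens on publish, given the count of problematic paragraphs.

    `recently_deleted` means one of the user's translations was deleted
    in the last 30 days.
    """
    if recently_deleted:
        if problematic_paragraphs >= 10:
            return Outcome.BLOCK
        # Assumption: "9 or fewer" is taken literally, so zero is tracked too.
        return Outcome.PUBLISH_AND_TRACK
    if problematic_paragraphs >= 50:
        return Outcome.BLOCK
    if problematic_paragraphs >= 10:
        return Outcome.PUBLISH_AND_TRACK
    return Outcome.PUBLISH
```

Under these assumptions, a translation with 12 problematic paragraphs is published and tracked for most users, but blocked for a user whose earlier translation was recently deleted.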

Content not affected by the limits

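Some content is not expected to be edited heavily, so it is not considered when applying the limits above. Very short section titles, citations, and lists of references are excluded from the checks. Otherwise, users might receive misleading warnings asking them to translate content that should not be translated, such as book titles appearing in references or other proper names.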

Limits on the mobile experience

For the mobile experience, the initial set of limits follows a simpler approach. At the moment, only the overall percentage of unmodified machine translation for the whole translation is considered. On mobile, the whole translation consists of just one section of the article.

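In particular, a warning is shown when the percentage of unmodified machine translation exceeds 85% of the whole section, and publishing is blocked when the percentage of unmodified machine translation exceeds 95%.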
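The two mobile thresholds can be sketched as a single check on the section's unmodified fraction. The function name and the returned labels are hypothetical; the sketch follows the wording above, where both limits apply only when they are exceeded.

```python
def mobile_section_status(unmodified_ratio: float) -> str:
    """Classify a mobile section translation by its unmodified machine translation.

    Returns "block", "warn", or "ok" (illustrative labels, not tool output).
    """
    if unmodified_ratio > 0.95:
        return "block"
    if unmodified_ratio > 0.85:
        return "warn"
    return "ok"
```

A section that is 90% unmodified would therefore trigger a warning but could still be published.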

Feedback on how the system of limits works in the mobile context is very useful for deciding how to evolve this initial approach.

Publication of fast unreviewed translations

Campaigns and contests can result in spikes of translations where some users unfamiliar with the community policies may focus on making many translations and not pay enough attention to reviewing their contents. In order to emphasize quality over quantity, a mechanism has been defined to limit the publication of fast unreviewed translations.

After a user translates a large article, the next translation can only be started after some time has passed. The waiting period is estimated at 1 minute per paragraph, up to a maximum of 10 minutes. That is:

  • For articles with 10 paragraphs or fewer, we want to make sure that users spent at least N minutes translating it, where N is the number of paragraphs (one minute per paragraph).
  • For articles with more than 10 paragraphs, we want to make sure that users spent at least 10 minutes translating it.
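The waiting-period estimate above amounts to capping the paragraph count at 10. A minimal sketch, with hypothetical function names:

```python
MAX_WAIT_MINUTES = 10  # from the text: 1 minute per paragraph, up to 10 minutes


def minimum_translation_minutes(paragraph_count: int) -> int:
    """Minimum time a translation of this size is expected to take."""
    return min(max(paragraph_count, 0), MAX_WAIT_MINUTES)


def can_start_next_translation(minutes_spent: float, paragraph_count: int) -> bool:
    """True once the user has spent at least the expected minimum time."""
    return minutes_spent >= minimum_translation_minutes(paragraph_count)
```

For example, a 4-paragraph article requires at least 4 minutes, while a 25-paragraph article requires 10.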


This has been applied on mobile initially since it is a space with less activity, and after measuring the impact we'll consider expanding it to desktop too.


Adjusting the limits

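The limits described above provide a general set of mechanisms, but they may need to be adjusted to the specific needs of each wiki. According to initial evaluations, the amount of modification that the initial machine translation requires can range from 10% to 70%, depending on the language pair. On some wikis, the default limits may be too strict, causing unnecessary noise or preventing the publication of perfectly valid translations. On other wikis, the limits may not be strict enough, allowing the publication of translations that have not been edited enough.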

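Adjusting the different thresholds allows each wiki to adapt the tool's limits to its particular needs. Feedback from native speakers is essential for adjusting the limits properly. If, based on your experience creating or reviewing translations, the current limits do not seem to work well, please share your feedback and we can explore how to better adjust them.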

When providing feedback about adjusting the thresholds, we recommend that you first create several example translations (make sure to check the publishing options if your test is not intended to be published as regular content). When testing how the limits work for your language, it is useful to keep the following in mind:

  • Check for both cases. Make sure to check how the limits work for both: translations where the content has not been edited enough, versus where it has been edited enough.

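In this way, you can more easily find the right balance for the tool's limits. Checking only one type of problem may push the thresholds too far in the opposite direction.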

  • Check different content. Content in our wikis is highly diverse, and machine translation may work much better for some cases compared to others.

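For example, content full of numeric data or technical names may require less editing from users than content with more descriptive text. Make sure to test by translating a variety of article types, with different lengths and different kinds of content.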

  • Prepare to iterate. Adjusting the thresholds is an iterative process.

It may require custom adjustments to the thresholds or improvements to your general approach. In any case, after each change, further testing may be needed to verify the improvements made.

Adjusting the limits in collaboration with editors has proven to be effective. For example, initial results show that the Indonesian community was able to significantly reduce the number of problematic translations they were receiving by restricting the publication of translations with more than 70% of unmodified machine translation content. Similar adjustments have been made for Telugu and Assamese language wikis. There is no automatic tool that is infallible, and these limits are not an exception.

The process of content review by the community is still essential, but these limits provide communities with a tool to reduce the number of translations they have to focus on, making the review process much more effective. Please share your feedback and we can explore how to better adjust them.

Tracking potentially unreviewed translations

A tracking category with the name "cx-unreviewed-translation-category" is provided for communities to easily find articles that have been published with some content exceeding the recommended limits.

You can find this category in the list of tracking categories on each wiki. Using it, you can track articles that passed the limits that block publication, but in which some paragraphs were still edited less than expected. For example, the Indonesian Wikipedia's category includes articles that have less than 40% of machine translation overall, but which have some paragraphs with more than 80% of unmodified machine translation.

Measuring translation quality

Automatically evaluating the quality of content is not trivial. Deletion ratios provide a useful estimate of whether the content created is good enough for the community of editors not to delete it. Based on the analysis of deletion ratios, articles that are created as translations are less likely to be deleted when compared with articles created from scratch. This suggests that it may not be practical to set the limits for participation through translating much higher than those set for other ways of article creation.

Find published translations

Content translation adds a contenttranslation edit tag to the published translations. This allows communities to use Recent changes, and similar tools, to focus on pages created using the translation tool. In addition, data on published translations and the statistics for machine translation use are available for anyone to analyze.
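As one possible way to list such pages programmatically, the sketch below builds a MediaWiki Action API request that filters recent changes by the contenttranslation tag. It only constructs the URL and does not send the request. The function name is hypothetical, and the `/w/api.php` path is an assumption that holds for Wikimedia wikis but may differ elsewhere.

```python
from urllib.parse import urlencode


def recent_translations_url(wiki_host: str, limit: int = 50) -> str:
    """Build an Action API URL listing recently created pages tagged contenttranslation.

    Assumption: the wiki serves its API at /w/api.php, as Wikimedia wikis do.
    """
    params = {
        "action": "query",
        "list": "recentchanges",
        "rctag": "contenttranslation",  # the edit tag added by the tool
        "rctype": "new",                # page creations only
        "rcprop": "title|timestamp|user|tags",
        "rclimit": limit,
        "format": "json",
    }
    return f"https://{wiki_host}/w/api.php?" + urlencode(params)
```

Calling `recent_translations_url("id.wikipedia.org")` yields a URL that can be opened in a browser or fetched with any HTTP client to review newly published translations.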

Inspect a specific translation

Translation debugger example

The Translation debugger is a tool that allows the inspection of some metadata for a given translation, including the percentage of machine translation used for the whole document, and the translation service used for each paragraph. For specific types of content such as templates, the Content Translation Server API can be queried to check how templates will be transferred across languages.

Other restrictions based on user rights

 
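An error is shown for a publishing restriction based on user rights. This example is based on the English Wikipedia community's decision to allow only extended confirmed users to publish articles directly to the main namespace.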

Some wikis have implemented other translation restrictions based on user rights in order to reduce the creation of low-quality translations. For example, English Wikipedia requires users to be extended confirmed, which means they need to make 500 edits on English Wikipedia before they are allowed to publish a translation as an article. Newer editors can still publish translated articles in the User: or Draft: namespaces, and then move the article to the mainspace.

This restriction was created before the system of limits described in this page was available, and it is not the recommended approach to encourage the creation of good quality translations.

Before adding restrictions that do not take into account the content created, consider going through the process of adjusting the limits of unmodified content as described above. The limits can be made as strict as needed to prevent low-quality translations, while still allowing publication by editors making good translations.