Pginer-WMF


Previous discussion was archived at User talk:Pginer-WMF/Archive 1 on 2015-07-10.

Mohmad Abdul sahib (talkcontribs)

Hi, I'm Mohamed from the Arabic content community. I'd like to report some bugs in Content Translation and ask you to take on the task of fixing them, please:


- First problem: punctuation marks are separated from the word by spaces instead of being attached to it, e.g. (مرحبًا كيف ، حالك ؟), while the correct form should be (مرحبًا، كيف حالك؟). These are the punctuation marks and letters that must be fixed: (( . ، ؛ : ؟ ! و () {} [] /\ ~ - ٪ % × + ÷ $)). I don't know if there are other signs.

- Second problem: when the Content Translation tool shows me a translated text, some words are links. Some of these links lead to articles that already exist in Arabic, while others lead to articles that have not been translated into Arabic yet and therefore appear in red. I therefore ask that the tool be developed to automatically wrap links to untranslated articles in Template:Interlanguage link. Mohmad Abdul sahib

Pginer-WMF (talkcontribs)

Thanks for the feedback, @Mohmad Abdul sahib.

Regarding the first issue, does the problem occur in the initial machine translation provided, or when you type content yourself? If it is about the machine translation, does it occur only with Google Translate or also with other services?

Regarding the second request, we need to explore how to better support the practices of different communities. One challenge with adding the interlanguage link template is that it is available in only 89 of the 300+ languages the tool supports. In addition, we need to consider the backlog of links that will need updating once the corresponding articles are created.

Reply to "Content Translation"
Snævar (talkcontribs)

Hi, since you are listed as the product manager for the Wikimedia Language team, I would like to make a complaint. UOzurumba has now tried several times to get community permission for deployments by posting on user pages. This is in clear violation of WMF's own policy at wikitech:Wikimedia site requests and meta:Requesting wiki configuration changes. The proper way is to use the Village pump of each project in question.

Pginer-WMF (talkcontribs)

Thanks for flagging this, @Snævar. While most of the communications were done through the village pumps, problems in a few cases resulted in messages being published in the wrong place. UOzurumba is reviewing those and will publish them in the corresponding Village pumps.

Navigating a cross-language environment can often be a complex task. Thanks for understanding, and for helping catch these glitches.

Reply to "communication issues"
Nguyentrongphu (talkcontribs)

It has been 2 months. I want to know what your solution to this is, and what progress has been made to address the problem so far. Thank you!

Pginer-WMF (talkcontribs)

Hi @Nguyentrongphu. After receiving input from the Vietnamese community we adjusted the limits on the amount of unmodified Machine Translation allowed. I have also created a ticket to explore how to detect content copied out of Content Translation.

Right now the team is focused on mobile support for Section Translation, but the whole area of Machine Translation limits is something we want to focus on in the near future.

Nguyentrongphu (talkcontribs)

I just came up with a new idea that would be easier to implement. The current limit for Vi Wikipedia is 90%. My proposal: if unmodified content is higher than 90%, copying is disabled. In other words, one can only copy after meeting the threshold required to publish. I don't think this solution would negatively affect good-faith users, or at least its benefit outweighs the small inconvenience (one can always translate manually first before trying to copy things).

Pginer-WMF (talkcontribs)

Thanks for the suggestion. That's an approach we can consider. There are a couple of aspects that we need to consider in more detail:

  • Initial and intermediate states. When adding the first paragraph, users will temporarily be at 100% MT until they start editing the paragraph. Even users making good use of the tool would have copying disabled initially, which can affect their workflow. To reduce this issue we can apply this limitation only when the translation already has 2 paragraphs added.
  • Feedback. Disabling copying in some cases can make the copy functionality appear to work only intermittently for the user. We may need to provide some feedback to clarify why copying is not working.

In general I think that including a tag into the copied text is a measure that is less intrusive while it can be effective, but this suggestion is an alternative to consider as part of the exploration.
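
The rule discussed above (copying disabled while unmodified MT is above 90%, applied only once the translation has 2 paragraphs) can be sketched as follows. This is a rough illustration in Python under stated assumptions, not the tool's actual implementation: the `Paragraph` class, both function names, and the word-level comparison used to estimate unmodified content are all hypothetical.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Paragraph:
    machine_text: str  # initial machine translation (hypothetical field)
    current_text: str  # text after the translator's edits (hypothetical field)


def unmodified_ratio(paragraphs):
    """Estimate the share of current words that still match the MT output."""
    matched = 0
    total = 0
    for p in paragraphs:
        mt_words = p.machine_text.split()
        cur_words = p.current_text.split()
        matcher = SequenceMatcher(None, mt_words, cur_words, autojunk=False)
        # Sum the lengths of all word runs shared by both versions.
        matched += sum(block.size for block in matcher.get_matching_blocks())
        total += len(cur_words)
    return matched / total if total else 0.0


def copy_allowed(paragraphs, limit=0.90, min_paragraphs=2):
    """Allow copying unless unmodified MT exceeds the limit.

    The limit is only enforced once `min_paragraphs` paragraphs exist,
    so the first paragraph (always 100% MT at first) does not block users.
    """
    if len(paragraphs) < min_paragraphs:
        return True
    return unmodified_ratio(paragraphs) <= limit


if __name__ == "__main__":
    untouched = Paragraph("a b c d e f g h i j", "a b c d e f g h i j")
    print(copy_allowed([untouched]))             # single paragraph: rule not applied
    print(copy_allowed([untouched, untouched]))  # two untouched paragraphs
```

Returning the ratio alongside the decision would also make it possible to show the feedback message mentioned in the second point.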

Nguyentrongphu (talkcontribs)

"Initial and intermediate states. When adding the first paragraph users will be at a 100% Mt temporarily until they start editing the paragraph. Even users making a good use o the tool would have copying disabled initially which can affect their workflow. To reduce this issue we can apply this limitation only when the translation has already 2 paragraphs added." -> Now that I think about it, it's not a good idea because one can bypass it easily by copying each paragraph one by one.

Like I said, a little inconvenience is a good trade-off to prevent abuse. How hard is it really to translate 10% of the content before starting to copy things? Not really hard, in my opinion.

Nguyentrongphu (talkcontribs)

There are pros and cons to both methods. The tag method is less intrusive, sure, but it also has multiple implications and serious consequences:

First: it leaves the door wide open for abusers to continue to abuse the CT system.

Second: it continues to place a heavy workload on patrollers in recent changes or new pages. It's not clear at what % they start to cheat (copy and publish right away); for example, at 100%, 94%, or 99%...? It's very time-consuming to determine this; one has to read each individual article to determine how good the translation is. It becomes impossible to do when many people are abusing this. Vi Wikipedia has caught many different users who have been abusing this loophole for years. The reason is that people didn't notice this loophole until recently. On average, each user has 5k badly translated articles. Let's say there are 10 users like that. It would mean around 50k badly translated articles!

Pginer-WMF (talkcontribs)

I agree there are pros and cons with each approach, and I think it is important to surface them as part of the exploration. Thanks for sharing your thoughts on this.

Regarding your first point, any approach is about adding barriers to make it hard for users to do the wrong thing, but there may always be a workaround. When the barrier is based on a generic mechanism that others may also use, generic workarounds are also more likely to be available, such as this browser extension. However, including a more specific tag is something that requires a specific effort to detect.


Regarding your second point, the less intrusive solution allows us to be more flexible about which percentage to catch. Maybe any copied content can result in the page being added to a category, while pages with higher percentages of unmodified MT are blocked directly.
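
The two-tier idea in the previous paragraph can be sketched as a small decision function. This is only an illustration in Python: the `Action` enum, the `decide` function, and the 90% default for the blocking limit are assumptions, and the sketch covers only pages that contain copied content.

```python
from enum import Enum


class Action(Enum):
    ALLOW = "allow"  # nothing copied: no extra handling in this sketch
    TAG = "tag"      # copied content: add the page to a tracking category
    BLOCK = "block"  # copied content with too much unmodified MT


def decide(contains_copied_content: bool,
           unmodified_ratio: float,
           block_limit: float = 0.90) -> Action:
    """Pick a hypothetical action for a page based on copied MT content."""
    if not contains_copied_content:
        return Action.ALLOW
    if unmodified_ratio > block_limit:
        return Action.BLOCK
    return Action.TAG


if __name__ == "__main__":
    print(decide(True, 0.50))
    print(decide(True, 0.95))
```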


This is a complicated space. I think we need to explore different options in detail.

Nguyentrongphu (talkcontribs)

I doubt many people know about that browser extension. Sure, there are always ways to cheat regardless of the barrier. However, a barrier is good enough when it can stop most of the abusers. We (Vi Wikipedia) can deal with the 1 or 2 remaining abusers who find a way around the barrier. Currently, the barrier we have is not sufficient, so we need a better one.


"Regarding your second point, the less intrusive solution allows to be more flexible about the percentage to catch. Maybe any content copied can result in the page being added to a category but those with higher percentages of unmodified MT are blocked directly." -> I like this idea. However, I'm not sure how technically feasible it is. Also, a tag is sufficient; there is no need to add the page to a category. And if this method is too hard or technically impossible, my method is a sound alternative.

Nguyentrongphu (talkcontribs)

Feedback: you guys can put a "warning" somewhere in CT, easy to see, saying: "You have to translate at least 10% of the content before being able to copy".

Reply to "Any news?"

About the enablement of Section translation in Wikipedia

Rodney Araujo (talkcontribs)

Hello Pginer, I have a question: when could Section Translation be enabled for Spanish Wikipedia? Thanks.

Pginer-WMF (talkcontribs)

Thanks @Rodney Araujo for your interest in the tool and your help on the project.

We just enabled Section Translation on Bengali Wikipedia as an early release, and we plan to improve the tool further before considering other wikis. There are several areas of the tool that need improvement and we want to hear from a smaller group of editors first to iterate and make the tool better. So it may still take several weeks until we can move to the next stage.

Hearing about the interest in the tool is very encouraging for us, and it is very useful to know where there are users interested in the tool who can help us make it better. So we'd definitely consider Spanish Wikipedia as we plan for enabling more wikis.

Thanks!

Rodney Araujo (talkcontribs)

@Pginer-WMF: Are you going to communicate with the Spanish Wikipedia community about Section Translation?

Pginer-WMF (talkcontribs)

Right now the tool is in a very early stage with several important aspects still missing. Once we complete a few cycles of development we'll be doing wider announcements.

Currently we have just announced the enablement on Bengali Wikipedia. Once we get feedback from this community to improve the tool we'll consider expanding (and communicating) to other wikis.

DRIS92 (talkcontribs)

Hello, I see that translation from es to fr is possible.

Pginer-WMF (talkcontribs)

Our tools integrate multiple translation services. You can get a list of the services and the supported languages in the documentation. Due to limitations of the test instance for Section Translation, only those languages supported by Apertium, Matxin and OpusMT are available. However, all services will be available when the tool is available on a real wiki.

Pginer-WMF (talkcontribs)

@Rodney Araujo Section Translation is now available on Test Wikipedia. This is a test environment better integrated with our infrastructure, where more machine translation services are available. Thus, you can try the tool using Google Translate or Yandex in more languages and without the need to create a separate account. The results are still published in the test instance (not the real wiki), but you can copy the resulting content anywhere else.

I hope this is useful until we enable the tool in more Wikipedias.

Reply to "About the enablement of Section translation in Wikipedia"

A medal for you!

RIT RAJARSHI (talkcontribs)
Rosetta Medal
For your work on the Content Translation project.
Reply to "A medal for you!"
RZuo (talkcontribs)
Reply to "A barnstar for you!"

Please join the discussion on my proposal

HaussmannSaintLazare (talkcontribs)

Hello Pginer-WMF !!!

Please write your impressions about my proposal that I introduced the other day.

Thank you.


Reply to "Please join the discussion on my proposal"

About the specifications of MediaWiki

HaussmannSaintLazare (talkcontribs)
Reply to "About the specifications of MediaWiki"
Charminku (talkcontribs)
Reply to "A cupcake for you!"
Charminku (talkcontribs)
Best wishes from my side... from India! Charminku (talk) 14:41, 4 April 2020 (UTC)
Reply to "A cupcake for you!"