Wikimedia Apps/Team/Android/Machine Assisted Article Descriptions
Project background
Since 2017, the Wikipedia Android app has offered pathways for adding short descriptions. In 2019, the team released a short description adding tool to logged-in app users. You can read more about the current state of the Suggested Edits tool for adding short descriptions here.
In March 2022, the Android team introduced an eligibility gate for editing short descriptions, in response to feedback from English Wikipedia users asking that editors, especially new editors, improve the quality of their article short descriptions.
In May 2022, researchers at EPFL (the Swiss Federal Institute of Technology in Lausanne) built a machine learning model called Descartes to suggest short descriptions for Wikipedia articles. After testing the Descartes model independently, EPFL reached out to the Wikimedia Foundation, offering the model to aid editors in creating article short descriptions. Given the requests to improve the quality of short descriptions, especially from new editors, the Android team determined that it could offer machine suggestions using the Descartes model to support users, but only if the tool yielded results with Wikipedians as promising as those from EPFL's initial testing.
Between January and April 2023, the team built the UI and client-side changes around the Descartes model, putting guardrails in place for known risks, then deployed a modified version of the model from mid-May 2023 through mid-June 2023. The version of the model users saw in the Android app included quality controls. Additionally, suggestions were concealed behind an affordance: users had to actively tap to view machine suggestions and then decide whether to use one or type out their short description manually.
By the end of June the feature was removed from the app. From mid-June through the end of July, the Android team conducted outreach to various language communities, inviting them to serve as graders and patrollers for the edits produced through the tool during the experiment. Volunteers who responded to the call for support on wiki or by email participated in the experiment as evaluators. Experienced editors evaluated what app editors published with machine suggestions during the experiment, alongside human-generated short descriptions. Volunteers were encouraged to revert errors. Volunteer grading ended in early August 2023.
After an initial review of the data, the team had enough insight to recommend migrating the Descartes model from a temporary holding space on Cloud VPS to LiftWing as a permanent host. After more in-depth data analysis from our team's data analyst, our research team, and the EPFL research team, we felt confident enough in the results to offer a modified version of the model permanently in the app as suggestions to human editors, but only after approval from language communities and sessions to understand whether further changes could be made so the model meets its intended purpose of supporting editors in writing higher-quality short descriptions.
Current Status
As of August 2024, we have posted the results of the experiment, described the app-side modifications made to the cloud-hosted version of the model for quality purposes, and offered recommendations on whether each language community should consider adopting the feature. We will be conducting outreach from September 2024 through November 2024 to inform the Android product team of what further improvements can be made and whether communities would like to adopt the feature. While we have data-based recommendations on whether the feature might be useful, adoption is a decision left solely to language community members. We recognize that short descriptions are handled differently across language communities and that approaches to ensuring articles have short descriptions have evolved over the years, which is why we are taking a community-by-community outreach approach.
Experiment background
The Android team is collaborating with Research and EPFL to improve article descriptions, also known as short descriptions.
Currently, Android app users can create and edit article descriptions through Suggested Edits. Article descriptions are sent to Wikidata, except for English Wikipedia article descriptions. The Android team has received feedback that new users produce low-quality article descriptions (T279702). In 2022, the team placed a temporary restriction on Suggested Edits for English Wikipedia users with fewer than 3 edits (T304621), with the intent of finding ways to improve the quality of article descriptions written by new users.
EPFL and Research approached the Android team with a model called Descartes, which can generate descriptions that perform on par with human editors. Descartes takes the information on a Wikipedia article page and provides a short description of the article while adhering to the guidance on what makes an article description helpful. In the initial evaluation of the model, its suggestions were preferred to human-generated article descriptions more than 50% of the time, and Descartes achieved 91.3% accuracy in testing. Despite these promising results, the team wanted to do its due diligence by running an ABC test to ensure that suggestions improve the quality of article descriptions when shown to new editors, without introducing or amplifying existing bias. We created an API hosted on Toolforge and will integrate the model into our existing interface in order to run the experiment. We will patrol edits made through the experiment in partnership with volunteers so as not to burden patrollers.
Product requirements
- Users can give feedback on specific suggestions if they spot problems.
- Accommodate two machine-generated suggestions to test which beam is more accurate
- Onboard users to machine-generated suggestions
- Reminder pop-ups about checking for bias when tapping a suggestion on a biography
- Only experienced users will see suggestions for biographies
- Ability for users to write their own response and edit a suggestion
- Incorporate an icon that identifies that the feature uses machine learning
- Multilingual compatibility with mBART25
Goal and indicators
As a first step in implementing this project, the Android team will build an MVP in order to:
- Determine whether suggestions made through the Descartes model increase the quality of article description additions and edits made using the Wikipedia Android app. To understand how the suggested article description changes user behavior, we will evaluate:
- If introduction of suggestions alters the stickiness of the task type across editing tenure
- Variability in task completion time relative to quality of edits
- How often users modify suggestions before hitting publish
- The optimal design and user workflow to encourage accuracy and task retention
- What, if any, additional measures need to be in place to discourage bad or biased suggestions
- Determine if the algorithm holds up when exposed to more users:
- Does the accuracy and preference rate change when exposed to more users
- Does the accuracy and preference rate of using the suggestion vary greatly across languages
- Is the algorithm introducing bias (e.g. misgendering) or not accurately representing critical nuance for Biographies of Living Persons
- How does the accuracy rate and performance change when showing more than one suggestion
If the 30-day experiment shows promising results based on the indicators above, the team will introduce the feature to all users and remove our 3-edit requirement for Suggested Edits. We will also take steps to expand the number of languages to mBART50 and migrate the API from Toolforge to a more permanent home.
Volunteer evaluators
The team will partner with volunteers to patrol edits made during the experiment and assign each edit a grade.
This will help determine whether edit quality increases when machine-generated article descriptions are used. Volunteer evaluators can sign up below or contact ARamadan-WMF.
The commitment to serve as a volunteer evaluator is up to one hour per week for four weeks.
Decisions to be made
This A/B test will help us make the following decisions:
- Expand the feature to all users
- Use suggestion as a means to train new users and remove 3 edit minimum gate
- Migrate model to more permanent API
- Show 1 or 2 beams
- Expand to mBART 50
ABC Logic Explanation
- The experiment will include only logged-in users, in order to stabilize the distribution.
- The only users who will see suggestions are those on mBART25 wikis
- Of those on mBART25 wikis, half will see suggestions (B: Treatment) and half will not see suggestions (Control)
- Of those on mBART25 wikis, only users with more than 50 edits can see suggestions for Biographies of Living Persons; users in the non-BLP group remain in it even if they cross 50 edits during the experiment.
Additionally, we care about how the answers to our experiment will differ by language wiki and user experience (<50 edits: New vs. 50+: Experienced).
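For illustration, the bucketing described above could be expressed roughly as follows. This is a hedged sketch, not the team's implementation: the function names, the hashing scheme, and the simplification to a single treatment bucket are assumptions made for clarity.

```python
import hashlib

# mBART25 wikis listed later on this page (language codes are illustrative).
MBART25_WIKIS = {"en", "ru", "vi", "ja", "de", "ro", "fr", "fi", "ko", "es",
                 "zh", "it", "nl", "ar", "tr", "hi", "cs", "lt", "lv", "kk",
                 "et", "ne", "si", "gu", "my"}

def assign_group(user_id: int, wiki: str, is_logged_in: bool) -> str:
    """Assign a user to control or treatment.

    Only logged-in users on mBART25 wikis are eligible to see suggestions;
    everyone else is effectively control. A stable hash keeps a user in the
    same bucket for the whole experiment.
    """
    if not is_logged_in or wiki not in MBART25_WIKIS:
        return "control"
    digest = hashlib.sha256(f"{wiki}:{user_id}".encode()).hexdigest()
    return "treatment" if int(digest, 16) % 2 == 0 else "control"

def can_see_blp_suggestions(group: str, edit_count_at_enrollment: int) -> bool:
    """BLP suggestions only go to treatment users who already had 50+ edits
    when they entered the experiment; crossing 50 edits later does not move
    a user into the BLP group."""
    return group == "treatment" and edit_count_at_enrollment >= 50
```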
Decision criteria
- If the accuracy rate for edits that came from the suggestion is lower than that of manually written descriptions, we will not keep the feature in the app. The accuracy rate will be determined based on manual patrolling.
- If the accuracy rate for edits that came from the suggestion is less than 80%, we will not keep the feature in the app. The accuracy rate will be determined based on manual patrolling.
- If the time spent completing the task with a suggestion is double the average of those who do not see suggestions, we will compare it against reports to see whether there are performance issues
- If the time spent completing the task with a suggestion is less than the average, without a negative impact on the accuracy rate, we will consider it a positive indicator for expanding the feature to more users
- If users who see the suggestion modify it more often than they submit it unmodified, we will compare their accuracy rate to that of users who did not see suggestions, determine whether the suggestion is a good starting point for users, and look at how this differs by user experience
- If users who see the suggestion modify it more often than they submit it unmodified, we will look for trends in the modifications and offer a recommendation to EPFL to update the model
- If beam one is chosen at least 25% more often than beam two while having an equal or higher accuracy rate, we will only show beam one in the future
- If users who see the treatment return to the task multiple times (1, 2, 7, 14 days) at a rate 15% or more higher than the control group, without a negative impact on accuracy, we will take steps to expand the feature
- If our risks are triggered, we will implement our contingency plan
- If users who see the treatment do not select a suggestion more than 50% of the time after viewing the suggestions, we will not expand the feature
In aggregate, there should be at least 1,500 people, with a stretch goal of 2,000 people and 4,000 edits, included in the A/B test across the following mBART25 wikis: English, Russian, Vietnamese, Japanese, German, Romanian, French, Finnish, Korean, Spanish, Chinese (simplified), Italian, Dutch, Arabic, Turkish, Hindi, Czech, Lithuanian, Latvian, Kazakh, Estonian, Nepali, Sinhala, Gujarati, and Burmese.
Risk management
Whenever machine learning is used, we introduce more risk than software development already carries. For that reason, we are tracking and managing the risks associated with this project alongside our Security and Legal teams.
Risk | Cause | Level | Response | Response action | Trigger | Contingency plan |
---|---|---|---|---|---|---|
The algorithm defames living persons | The algorithm pulls controversial aspects of a living person into the description. | Low | Mitigate | We will monitor what gets published and review what is reported in order to adjust the learning model. In testing we did not see any cases of this; rather the opposite, we saw cases of the algorithm whitewashing history. As an additional precaution, we will only allow experienced editors to edit biographies of living persons. | Defamation detected during patrolling | Remove suggestions on BLPs entirely |
Overwhelming patrollers | The new feature increases interest in the task type and the algorithm does not increase edit quality | Medium | Mitigate | We will have a dedicated team of people patrolling edits made with this feature so as not to overwhelm volunteer patrollers, and will give advance notice to Wikidata and the English community. | Staff cannot keep up with patrolling | Restrict the number of tasks with suggestions per day |
Suggests NSFW content | There is NSFW content in the article that gets suggested for the description | Low | Mitigate | The algorithm relies mainly on the first paragraph. We have a reporting mechanism and will monitor edits. | 2% or more of users report a problem | We will block the words based on the abuse filter |
Users abandon the task due to performance issues | The model is on a temporary host and showing more than one option can take a while to generate. | Medium | Mitigate | Load responses in the background before users tap the button to show suggestions. | 4 out of 10 users raise performance issues during usability testing | Show one option or make other UI changes |
Misgendering or ethnicity hallucinations | The algorithm assigns the wrong gender to a person or provides an incorrect ethnicity | Medium | Mitigate | During the experiment this is something we will deliberately look for in patrolling and monitoring reports. | Reported more than 2% of the time | We will pause the feature, hard-code reminders, and reduce the suggestions shown to one |
How to follow along
We have created T316375 as our Phabricator epic to track this work. We encourage you to collaborate there or on our talk page.
This page will also be updated regularly as we make progress. You can also try the model at https://ml-article-descriptions.toolforge.org/. Please keep in mind that a number of filters are being added client side to improve the quality of the model. Those safeguards are described in the Risk Management section of this page.
Updates
July 2024: API available through LiftWing
We appreciate everyone's patience as we've worked with the Machine Learning team to migrate the model to LiftWing. In August we will clean up the client-side code to remove test conditions and add the improvements mentioned in the January 2024 update. In the following months we will reach out to different language communities to make the feature available to them in the app.
If you are a developer and would like to build a gadget using the API, you can read the documentation here.
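As a rough sketch of what a script or gadget backend calling the model might look like, consider the snippet below. The endpoint URL, parameter names, and response shape are assumptions for illustration only; rely on the linked documentation for the actual LiftWing interface.

```python
import requests

# Hypothetical endpoint and payload shape -- check the LiftWing documentation
# for the real URL, required headers, and parameter names.
LIFTWING_URL = "https://api.wikimedia.org/service/lw/recommendation/v1/article-descriptions"  # assumption

def suggest_descriptions(lang: str, title: str, num_beams: int = 1) -> list[str]:
    """Request machine-suggested short descriptions for one article."""
    response = requests.post(
        LIFTWING_URL,
        json={"lang": lang, "title": title, "num_beams": num_beams},
        headers={"User-Agent": "example-gadget/0.1 (contact: your-username)"},
        timeout=30,
    )
    response.raise_for_status()
    # Assumed response shape: {"prediction": ["suggestion 1", "suggestion 2"]}
    return response.json().get("prediction", [])

if __name__ == "__main__":
    print(suggest_descriptions("en", "Douglas Adams"))
```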
January 2024: Results of Experiment
Languages included in grading:
edit- Arabic
- Czech
- German
- English
- Spanish
- French
- Gujarati
- Hindi
- Italian
- Japanese
- Russian
- Turkish
Additional languages monitored by staff that did not have community graders:
- Finnish
- Kazakh
- Korean
- Burmese
- Dutch
- Romanian
- Vietnamese
Was there a difference between the average and median grades of Machine Accepted and Human Generated edits?
Graded Edits | Avg Grade | Median Grade |
Machine Accepted Edits | 4.1 | 5 |
Human Generated Edits | 4.2 | 5 |
- Note: 5 was the highest possible score
How did the model hold up across languages?
Language | Machine Accepted Edits Avg. Grade | Human Generated Edits Avg. Grade | Machine Avg. Grade Higher? | Recommendation on whether feature should be enabled |
ar* | 2.8 | 2.1 | TRUE | No |
cs | 4.5 | Not Applicable | Not Applicable | Yes |
de | 3.9 | 4.1 | FALSE | 50+ Edits Required |
en | 4.0 | 4.5 | FALSE | 50+ Edits Required |
es | 4.5 | 4.1 | TRUE | Yes |
fr | 4.0 | 4.1 | FALSE | 50+ Edits Required |
gu* | 1.0 | Not Applicable | Not Applicable | No |
hi | 3.8 | Not Applicable | Not Applicable | 50+ Edits Required |
it | 4.2 | 4.4 | FALSE | 50+ Edits Required |
ja | 4.0 | 4.5 | FALSE | 50+ Edits Required |
ru | 4.7 | 4.3 | TRUE | Yes |
tr | 3.8 | 3.4 | TRUE | Yes |
Other language communities | Not Applicable | Not Applicable | Not Applicable | Can be enabled upon request |
- Note: We will not enable the feature without engaging communities first.
* Indicates language communities where there weren't many suggestions to grade, which we believe had an impact on the score
How often were Machine Suggestions Accepted, Modified or Rejected?
Edit type | % of Total Machine Edits |
Machine suggestion accepted | 23.49% |
Machine suggestion modified | 14.49% |
Machine suggestion rejected | 62.02% |
- Note: Rejection means the machine suggestion was available but not selected. Machine suggestions were behind an affordance that read "Machine Suggestions". Users who did not view the machine suggestions at all also count in the "rejected" bucket. Rejected is intended to communicate that the user preferred to type out their article short description instead.
What was the distribution of Machine Accepted Article Descriptions with a score of 3 or higher?
Score | Percent Distribution |
< 3 | 10.0% |
>= 3 | 90.0% |
How did the Machine Accepted Article Descriptions scoring change when taking editor experience into account?
Editor Experience | Average Edit Grade | Median Edit Grade |
Under 50 Edits | 3.6 | 4 |
Over 50 Edits | 4.4 | 5 |
Our experiment tested two beams to see which was more accurate and performant. To avoid bias, the two suggestions switched positions each time they were shown to the user. The results are:
Beam Selected | Average Edit Grade | % Distribution |
1 | 4.2 | 64.7% |
2 | 4.0 | 35.3% |
- Note: When rereleasing the feature we will only display beam 1.
How often are people making edits (modifications) to the machine suggestion before publishing?
Edit Type | Modification Distribution |
Machine Accepted Not Modified | 61.85% |
Machine Accepted Modified | 38.15% |
How does modifying the machine suggestion impact accuracy?
Machine Graded Edits | Avg. Score |
Not Modified | 4.2 |
Modified | 4.1 |
- Note: Because accuracy was not affected by whether a user modified the suggestion, we do not see a need to require users to change the recommendation, but we should still maintain a UI that encourages edits to the machine suggestion
How often did a grader say they would revert vs. rewrite an edit, based on whether it was Machine Suggested or Human Generated?
Graded Edits: | % edits would revert | % edits would rewrite |
Editor accepted suggestion | 2.3% | 25.0% |
Editor saw suggestion but wrote out their own description instead | 5.7% | 38.4% |
Human edit no exposure to suggestion | 15.0% | 25.8% |
- Note: We defined revert to mean the edit is so inaccurate that it is not worth a patroller making a minor modification to improve it. Rewrite was defined to mean a patroller would simply modify what the user published to improve it. Over the course of the experiment only 20 machine edits were reverted across all projects, which was not statistically significant, so we could not compare actual reverts; instead we relied on graders' recommendations. Only two language communities have their article descriptions live on Wikipedia, which means patrolling is less frequent for most language communities because descriptions are hosted on Wikidata.
What insights did we gain through the feature’s report function?
0.5% of unique users submitted a report through the feature. Below is a distribution of the type of feedback we received:
Feedback/Response | % Distribution of feedback |
Not enough info | 43% |
Inappropriate suggestion | 21% |
Incorrect dates | 14% |
Cannot see description | 7% |
"Unnecessary hook" | 7% |
Faulty spelling | 7% |
Does the feature have an impact on retention?
Retention Period | Group 0 (No treatment) | Groups 1 and 2 |
1-day average return rate: | 35.4% | 34.9% |
3-day average return rate: | 29.5% | 30.3% |
7-day average return rate: | 22.6% | 24.1% |
14-day average return rate: | 14.7% | 15.8% |
- Note: Users exposed to Machine Assisted Article Descriptions had a marginally higher return rate as compared to users not exposed to the feature
Next Steps:
The experiment was run on Cloud Services, which is not a sustainable solution. There are enough positive indicators to make the feature available to communities that want it. The apps team will work in partnership with our Machine Learning team to migrate the model to LiftWing. Once it has been migrated and sufficiently tested for performance, we will re-engage our language communities to determine where to enable the feature and what additional improvements can be made to the model. Modifications that are currently top of mind include:
- Restrict Biographies of Living Persons (BLP): During the experiment we allowed users with over 50 edits to add descriptions to Biographies of Living Persons with the help of machine assistance. We recognize there are concerns about permanently suggesting article descriptions on these articles. While we did not see evidence of issues related to Biographies of Living Persons, we are happy not to show suggestions on BLPs.
- Only use Beam 1: Beam 1 consistently outperformed Beam 2 when it came to suggestions. As a result, we will only show one recommendation, and it will be from Beam 1.
- Modify Onboarding & Guidance: During the experiment we had an onboarding screen about machine suggestions. We would add back in guidance around machine suggestions when rereleasing the feature. It would be helpful to hear feedback from the community about what guidance they would like us to provide to users about writing effective article descriptions so that we can improve onboarding.
If there are other obvious errors, please leave a message on our project talk page so that we can address them. An example of an obvious error is displaying incorrect dates. We noticed this error during testing in the app and added a filter that prevents recommending descriptions containing dates that are not mentioned in the article text. We also noticed that the original model recommended descriptions for disambiguation pages, so we filtered out disambiguation pages client side, a change we plan to maintain. Other things, such as capitalization of the first letter, would also be a general fix we could make, because there is a clear heuristic we could use to implement it.
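To illustrate the kind of client-side safeguards described above, here is a minimal sketch in our own simplified form, not the app's actual code, of the three heuristics mentioned: rejecting suggestions that contain dates absent from the article text, skipping disambiguation pages, and normalizing the capitalization of the first letter. Function names and the year-matching regex are assumptions.

```python
import re

YEAR_PATTERN = re.compile(r"\b(?:1[0-9]{3}|20[0-9]{2})\b")  # four-digit years

def contains_unsupported_date(suggestion: str, article_text: str) -> bool:
    """Reject suggestions mentioning a year that never appears in the article text."""
    return any(year not in article_text for year in YEAR_PATTERN.findall(suggestion))

def normalize_first_letter(suggestion: str, uppercase_first: bool) -> str:
    """Normalize the first letter; the desired convention differs per wiki,
    so the caller chooses the direction."""
    if not suggestion:
        return suggestion
    first = suggestion[0].upper() if uppercase_first else suggestion[0].lower()
    return first + suggestion[1:]

def filter_suggestion(suggestion: str, article_text: str,
                      is_disambiguation_page: bool,
                      uppercase_first: bool = True) -> str | None:
    """Return a cleaned suggestion, or None if it should not be shown at all.
    Disambiguation status would come from page metadata in the real app."""
    if is_disambiguation_page or contains_unsupported_date(suggestion, article_text):
        return None
    return normalize_first_letter(suggestion, uppercase_first)
```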
For languages where the model is not performing well enough to deploy, the most useful thing is adding more article descriptions in that language so that retraining of the model will have more data to go on. There isn't yet a set date or frequency for retraining the model, but we can work with the Research and Machine Learning teams to get this prioritized as communities request it.
July 2023: Early Insights from 32 Days of Data Analysis: Grading Scores and Editing Patterns
We cannot complete our data analysis until all entries have been graded, so that we have an accurate grading score. However, we do have early insights we can share. These insights are based on 32 days of data:
- 3,968 articles with machine edits were exposed to 375 editors.
- Note: Exposed does not mean selected.
- 2,125 machine edits were published by 256 editors
- Editors with 50+ edits completed three times as many edits per unique editor as editors with fewer than 50 edits
May 2023: Experiment Deactivated & Volunteers Evaluate Article Short Descriptions
The experiment has officially been deactivated and we are now in a period where edits are being graded.
Volunteers across several language Wikis have begun to evaluate both human generated and machine assisted article short descriptions.
We express our sincere gratitude and appreciation to all the volunteers, and have added a dedicated section to honor their efforts on the project page. Thank you for your support!
We are still welcoming support with grading from the following language Wikipedias: Arabic, English, French, German, Italian, Japanese, Russian, Spanish, and Turkish.
If you are interested in joining us for this incredible project, please reach out to Amal Ramadan. We look forward to collaborating with passionate individuals like you!
April 2023: FAQ Page and Model Card
We released our experiment in the 25 mBART languages this month and it will run until mid-May. Prior to release we added a model card to our FAQ page to provide transparency into how the model works.
Screenshots of the experiment flow:
- Suggested edits home
- Suggested edits feed
- Suggested edits onboarding
- Active text field
- Dialog box
- What happens after tapping suggestions
- Manual text addition
- The preview
- Tapping the report flag
- Confirmation
- Gender bias support text
This is the onboarding process:
- Article Descriptions Onboarding
- Keep it short
- Machine Suggestions
- Tooltip
January 2023: Updated designs
After determining that the suggestions could be integrated into the existing article descriptions experience, the Android team updated our designs.
- Tooltip for onboarding the feature
- Once the tooltip is dismissed, the keyboard activates
- A dialog with suggestions appears when users tap "show suggested descriptions"
- Tapping a suggestion fills in the text field and enables the publish button.
If a user reports a suggestion, they will see the same dialog we proposed in our August 2022 update as what is shown when someone taps "Not sure".
This new design means we will allow users to publish their edits just as they could without machine-generated suggestions. However, our team will monitor the edits made through this experiment to ensure we do not overwhelm volunteer patrollers. Additionally, new users will not receive suggestions for biographies of living persons.
November 2022: API development
The research team put the model on Toolforge and tested the API's performance. Early data showed that generating suggestions took between 5 and 10 seconds, varying with the number of suggestions shown. Performance improved as the number of generated suggestions decreased. To address this, some suggestions were preloaded, the number of suggestions shown when integrated into article descriptions was restricted, and user flows were adjusted to ensure suggestions are generated in the background.
August 2022: Initial Design Concepts and Guardrails for Bias
User story for discovery
When I use the Wikipedia Android app, am logged in, and discover a tooltip about a new editing feature, I want to be informed about the task so I can try it out. Open question: when should this tooltip be shown relative to other tooltips?
User story for education
When I want to try the article description feature, I want to be informed about the task so that my expectations are set correctly.
User story for adding descriptions
When I use the article description feature and view articles without a description, I want to be presented with two suitable descriptions and an option to add my own, so that I can select or add a description for several articles in a row.
- Concept for selecting a suggested article description
- Design concept for a user who decides the description should be an alternative to what is listed
- Design concept for a user editing a suggestion before tapping publish
- Design concept for what users see when tapping "other"
- Screen showing options for when a user says they are not sure what the correct article description should be.
Guardrails for bias and harm
The team generated possible guardrails for bias and harm; a rough sketch of the blocklist and stereotype checks follows the list:
- Harm: problematic text recommendations
- Guardrail: blocklist of words never to use
- Guardrail: check for stereotypes – e.g., gendered language + occupations
- Harm: poor quality of recommendations
- Guardrail: minimum amount of information in article
- Guardrail: verify performance by knowledge gap
- Harm: recommendations only for some types of articles
- Guardrail: monitor edit distribution by topic
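Below is a hedged sketch of how the first two guardrails might look in practice. The word lists and the pairing rule are placeholders invented for illustration, not the team's actual lists or logic.

```python
# Placeholder word lists -- real guardrails would use curated, per-language
# lists maintained with community input, not these stand-ins.
BLOCKLIST = {"offensiveword1", "offensiveword2"}
GENDERED_TERMS = {"he", "she", "his", "her"}
OCCUPATIONS = {"nurse", "engineer", "secretary", "scientist"}

def violates_blocklist(suggestion: str) -> bool:
    """Reject any suggestion containing a blocklisted word."""
    words = set(suggestion.lower().split())
    return bool(words & BLOCKLIST)

def flag_possible_stereotype(suggestion: str) -> bool:
    """Flag (for human review, not automatic rejection) suggestions that pair
    gendered language with an occupation, a pattern worth double-checking
    against the article before publishing."""
    words = set(suggestion.lower().split())
    return bool(words & GENDERED_TERMS) and bool(words & OCCUPATIONS)
```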
Gratitude and Appreciation for the Dedicated Volunteers of the Article Description Grading Project
We want to take a moment to express our deepest gratitude and heartfelt appreciation to each and every volunteer who selflessly offered their time and support to the article description grading project. As the Apps Team at the Wikimedia Foundation, we are truly humbled by your invaluable contribution.
Your commitment to diligently patrolling edits made during the experiment and assigning grades to them has played an irreplaceable role in helping us understand the impact of machine-generated article descriptions on the quality of edits. Your vigilance and dedication are the cornerstones of our collective efforts.
Your active participation in this project goes far beyond mere involvement; it represents a genuine and profound commitment to advancing the Wikimedia mission. Through your tireless efforts, our platforms continue to evolve and improve, creating an enhanced and enriching experience for millions around the globe.
The enthusiasm and passion you bring to the table truly inspire us. Together, we are forging a path toward a future where knowledge and accessibility know no bounds within the realm of our Wikipedia Android App.
Once again, thank you for your extraordinary support and dedication. Your contributions are the lifeblood of this project and the broader Wikipedia community.
VatBatCat | Umasoyee |
Bernilein111 | Moha Elkotsh |
Harouna674 | Anupamdutta73 |
Barke11 | Shayi ngolu |
Terio legal | Mndetatsin |
Beheme | CptViraj |
And countless users who preferred to remain anonymous.