Growth/Positive reinforcement


This page describes the "Positive reinforcement" work, part of the Growth team's suite of features. It contains the current status, designs, open questions, and decisions.

Most incremental updates on this work will be posted on the general Growth team updates page, with some of the major or more detailed updates posted here.

Current status

Summary

The Growth team has focused on building a "cohesive newcomer experience" that gives newcomers access to elements that help them integrate into the norms of the Wikipedia community. For example, with newcomer tasks we gave them opportunities to contribute, and with the mentorship module we gave them access to mentorship. Suggested edits have succeeded in drawing more newcomers into making their first edits. Building on this success, we want to take steps that encourage newcomers to keep making more edits. This draws our attention to an underdeveloped element that newcomers need access to: feedback on their performance. We have named this project "Positive reinforcement".

We want newcomers to understand that there is progression and value in continued contribution to Wikipedia, thereby increasing retention of the users who have taken the first step of making an edit.

Our big question here is: how might we encourage newcomers who have visited the newcomer homepage and tried our features to continue editing and progressing at their own pace?

Background

When the newcomer homepage was deployed in 2019, it contained a basic "impact module", which listed the page views for the pages the newcomer had edited. This is the only part of the Growth features that gives newcomers any sense of their impact, and we have not improved it since it was first deployed.

 
Screenshot of the impact module on English Wikipedia

With this as a starting point, we assembled some important pieces of information around positive reinforcement:

  • We have heard good feedback about the module from community members, with experienced editors saying it is interesting and valuable to them.
  • Recognition from other users has been shown to increase retention, as in the case of "thanks" (here and here) and in an experiment from German Wikipedia. We believe that reinforcement from real people will be more effective than reinforcement from an automated system.
  • Community members have made it clear that progressing to more valuable tasks after starting with easy ones is a top priority for newcomers, rather than staying stuck on easy tasks.
  • Other platforms, such as Google, Duolingo, and GitHub, use many positive-reinforcement mechanics such as badges and goals.
  • Communities are wary of incentivizing unhealthy editing. We have seen that editing contests offering cash prizes, or advanced user levels such as "autoconfirmed" that depend solely on edit counts, can incentivize people to make many problematic edits.

Newcomer journey

 
Diagram of the new editor journey, showing opportunities for positive reinforcement

There are many parts of the newcomer journey where we could try to increase retention. We could focus on newcomers who stop editing after one or a few edits, or we could focus further along the journey, on newcomers who stop editing after weeks of activity. For this project, we decided to focus on newcomers who have completed their first editing session and whom we want to return for a second session. The diagram shows this with a yellow star.

We want to focus on newcomers at this stage because it is the next stage of the funnel where we can help improve retention. It is also where we currently see a very large drop-off, so if we can help retain newcomers at this point, there should be a meaningful increase in editor growth over time.

Research and design

We conducted research into the various mechanisms that have been employed to encourage people to contribute content, both in wiki products and beyond. Here are some key findings of that research:

  • Wikipedia editors' motivations are multifaceted, and they change over time and with experience. New editors are often motivated more by curiosity and social connection than by ideology.
  • In-wiki projects focus on intrinsic incentives, appeal to altruistic motivations, and are not applied consistently.
  • Broadening appeals beyond ideological motivations may improve the diversity of the editors retained on Wikipedia.
  • Positive messages from experienced users and mentors have proven effective for short-term retention.

For a summary of the current design ideas for positive reinforcement, see this design brief. Our designs will evolve further through community feedback and several rounds of user testing.

Ideas

We have three main ideas for positive reinforcement. We may pursue multiple ideas as we work on this project.

Impact

  • Impact: overhaul the impact module by incorporating statistics, graphs, and other contribution information. The revised impact module will give new editors more context about their impact, as well as encourage them to keep contributing. Areas of exploration include:
    • Featuring suggested edits as an important milestone, to prompt users to try suggested edits.
    • Statistics on how much editing the user has done over time (similar to what exists in XTools).
    • Prominence of "thanks received", to highlight the ability to receive community recognition.
    • Recent editing activity, including consecutive days on which newcomers have edited (a "streak"), to encourage continued engagement or remind people to come back and contribute.
    • Displaying reading activity over time on the articles newcomers have edited (similar to the information at Wikipedia:Pageview_statistics).

Leveling up

  • Leveling up: it is important to communities that newcomers progress to more valuable tasks. For those who complete many easy tasks, we want to nudge them toward trying harder ones. This could happen after completing a certain number of easy tasks, or through encouragement on their homepage. Areas of exploration include:
    • Newcomers seeing post-edit success messages that motivate them to make more edits at the same or different difficulty levels.
    • In the suggested edits module, offering opportunities to make harder edits, so that newcomers become more skilled editors.
    • In the impact module, including a milestone counter or an awards area.
    • On the homepage, adding a new module with specific challenges that earn certain rewards (a badge/certificate).
    • Adding notifications that prompt newcomers to try a harder task.

Personalized praise

  • Personalized praise: research shows that praise and encouragement from other users increase newcomer retention. We want to think about how to encourage experienced users to thank and reward newcomers for good contributions. Mentors could perhaps be encouraged to do this via their mentor dashboards or via notifications. We can build on existing communication mechanisms that previous studies have shown to have some positive effect. Areas of exploration include:
    • A personal message from the newcomer's mentor appearing on the homepage.
    • An Echo notification from the mentor or from the Wikimedia Growth team.
    • "Thanks" for a specific edit.
    • A badge for reaching a new milestone, given by the mentor or the Wikimedia Growth team in connection with a specific edit.

Community discussions

We discussed the Positive reinforcement project with community members from Arabic Wikipedia, Bengali Wikipedia, Czech Wikipedia, and French Wikipedia, and here on mediawiki.org.

We received direct feedback on the three main ideas, along with many other ideas for improving new editor retention.

Below is a summary of the main themes from the feedback, along with how we plan to improve based on it.

Impact

We heard... | Plans to improve based on feedback
😊 Looks good! This seems to be the least controversial and best-supported idea. | We plan to start development on this theme first, allowing more time to refine the other ideas.
😐 The impact module would be more effective if it scaled up as editors gain experience. | We plan to focus on newcomers for now, but the new impact module will be built in an extensible way, to accommodate future improvements.

Leveling up

We heard... | Plans to improve based on feedback
😊 Leveling up ensures newcomers don't get "stuck" on easy tasks. | Once users have completed a certain number of unreverted edits of one type, we should suggest that they try harder tasks.
😊 Newcomers are often eager to receive awards. | If we offer awards, they should be meaningful to newcomers, and ideally shareable either on-wiki (on their user page) or off-wiki.
❌ Goal-based incentives can be problematic and may lead to low-quality edits. Incentives that include a time-based element (similar to service awards) might be an effective approach, since they factor in not only the number of edits but also the length of time registered. Certain "quality gates" could help slow down and guide newcomers if their edits are getting reverted. | We plan to de-emphasize the award aspect of "Leveling up" for now, and focus more on encouraging users to try harder task types as they succeed at easier ones.
❌ Daily goals could feel stressful and demotivating for some people. | We will review this idea further, and would likely allow the goal to be customized if we pursue it.

Personalized praise

We heard... | Plans to improve based on feedback
😊 Spreading praise and positivity may help increase newcomer retention. | We are still refining designs for how to encourage more thanks and personalized praise for newcomers, but we hope to have further design ideas to share soon.
😐 Personalized praise may be hard to scale, because it takes more of experienced editors' time. | Mentors are already busy, so we hope to find a way to surface "praise-worthy" mentees. We will also brainstorm other ideas that don't rely solely on mentors.
😐 We should use existing systems (thanks, WikiLove, etc.). | Plans aren't finalized, but we certainly intend to build on existing systems.

Other ideas:

Community members suggested several other ideas for improving newcomer engagement and retention. We think these are all valuable ideas (some of which we are already exploring, or would like to work on in the future), but the following ideas won't fit into the scope of the current project:

  • Sending welcome and onboarding emails to newcomers (the Growth team is currently exploring engagement emails in collaboration with the Marketing and Fundraising teams).
  • Showing newcomers WikiProjects related to their interests.
  • Including a customizable widget on the newcomer homepage that lets wikis promote certain tasks or events to newcomers.
  • Sending notifications to users who welcome newcomers once the newcomer reaches certain editing milestones (to help prompt the user to send thanks or WikiLove).

Second community consultation:

In February 2023, we completed a community consultation in which we reviewed the latest Leveling up designs with the Growth team's pilot wikis. This consultation was completed in English on MediaWiki, and at Arabic Wikipedia, Bengali Wikipedia, Czech Wikipedia, and Spanish Wikipedia. (T328356) In general, feedback was quite positive. These two tasks help address feedback mentioned by those who responded to our questions:

  • Leveling up: community configuration (T328386)
  • Leveling up: second design iteration for the "Try a new task" dialog (T330543)

In March 2023, we completed a community consultation in which we reviewed the most recent Personalized praise designs with the Growth Pilot wikis. This consultation was completed on English Wikipedia, Arabic Wikipedia, Bengali Wikipedia, Czech Wikipedia, French Wikipedia, Spanish Wikipedia, and at MediaWiki in English. (T328356) Most feedback was supportive of Personalized praise features, but several further improvements were requested. We've created Phabricator tasks to address these further improvements.

  • On Arabic Wikipedia and other wikis with pending changes, mentors don't want to see only how many edits a user has completed; they want more detail about the review status of those edits (T333035)
  • Mentors want to be able to view the number or percentage of reverts their mentee has received, and to customize how many reverts a mentee can have while still being considered praise-worthy (T333036)
  • Mentors would appreciate knowing which edit a mentee was thanked for (T51087)

User testing

Alongside community discussion, we wanted to validate, refine, and add to our initial designs and hypotheses by testing the designs with readers and editors from several countries. Our design research team therefore conducted Positive reinforcement user testing, aimed at better understanding how the project would affect newcomer contributions across several different languages.

We tested several static Positive reinforcement designs with Wikipedia readers and editors in Arabic, Spanish, and English. Along with testing the Positive Reinforcement designs, we introduced data visualizations from XTools, as a way to better understand how these visualizations are perceived by newcomers.

 
Summary of Positive reinforcement user testing

User testing results

  • Make impact data actionable: Impact data was a compelling feature for participants with more experience editing, which several related to their interest in data—an unsurprising quality for a Wikipedian. For those new to editing, impact data, beyond views and basic editing activity, may be more compelling if linked to goal-setting and optimizing impact.
  • Evaluate the ideal editing interval: Across features, daily intervals seemed likely to be overly ambitious for new and casual editors. Participants also reflected on ignoring similar mechanisms on other platforms when they were unrealistic. Consider consulting usage analytics to identify “natural” intervals for new and casual editors to make goals more attainable.
  • Ensure credibility of assessments: Novice editor participants were interested in the assurance of their skills and progress that the quality score, article assessment, and badges offer. Some hoped that badges could lend credibility to their work when reviewed by more experienced editors. With that potential, it could be valuable to verify that the assessments are meaningful measures of skill, and to further explore how best to leverage them to garner community trust in newcomers.
  • Reward quality and collaboration over quantity: Both editor and reader participants from esWiki were more interested in recognition of their knowledge or expertise (quality) than the number of edits they have made (quantity). Similarly, some Arabic and English editors are motivated by their professional interests and skill development to edit. Orienting goals and rewards to other indicators of skilled edits, such as adding references or topical contributions, and collaboration or community involvement may also help mitigate concerns about competition overtaking collaboration.
  • Prioritize human recognition: While scores and badges via Growth tasks are potentially valued, recognition from other editors appears to be more motivational. Features that promote giving, receiving, and revisiting thanks seemed most compelling, and editors may benefit from selecting the impact data that best demonstrates their engagement with readers or editors.
  • Experiment with playfulness of designs: While some positive reinforcement features can be seen as the product of “gamification”, some participants (primarily from EsWiki) felt that simple, fun designs were overly childish or playful for the seriousness of Wikipedia. Consider experimenting with visual designs that vary in levels of playfulness to evaluate broader reactions to “fun” on Wikipedia.

Design

 
Impact module designs

Below are the current designs for Positive reinforcement. We have refined the three main ideas described above, but the scope of the plans and the actual designs have evolved based on feedback from community discussions and user testing.

Impact

The revised impact module provides new editors with more context about their impact. The new design includes far more personalized info and data visualizations than the previous design. This new design is fairly similar to the design we shared previously when discussing this feature with communities. You can view the current engineering progress at beta wiki, and we hope to release this feature to Growth pilot wikis soon.

Leveling up

Leveling up features focus on encouraging newcomers to progress toward more valuable tasks. The ideas also include prompts for new editors to try suggested edits, since structured tasks have been shown to improve newcomer activation and retention.

  • "Leveling up" post-edit dialog: a new type of post-edit dialog has been added to encourage newcomers to try a new task type. We hope this will encourage some users to learn new editing skills as they progress through different, more difficult tasks.
  • Post-edit dialog for non-suggested edits: Introduce newcomers who complete "normal" edits to suggested edits. We plan to experiment by showing newcomers a prompt after their 3rd and 7th edits. Desktop users who click through to try a suggested edit will also see their Impact module, which we hope helps engage newcomers and provides a small degree of automated positive reinforcement. We will carefully measure this experiment and ensure there aren't any unintended negative effects.
  • New notifications: New echo notifications to encourage newcomers to start or continue suggested edits. This acts as a proxy to “win-back” emails for those who have an email address and settings on to receive email notifications.

Personalized praise

Personalized praise features are based on research results that show that encouragement and thanks from other users increases editor retention.

  • Encouragement from Mentors: We will add a new module to the Mentor dashboard, designed to encourage Mentors to send personalized messages to newcomers who meet certain criteria. We will allow Mentors to customize and control how and when "praise-worthy" mentees are surfaced.
  • Increasing Thanks across the wiki: We plan to fulfill the community wishlist item to Enable Thanks Button by default in Watchlists and Recent Changes (T51541, T90404). We hope this will increase Thanks and positivity across the wikis, and hopefully newcomers will benefit from this directly or indirectly.

Measurement and results

Hypotheses

The Positive Reinforcement features aim to provide or improve the tools available to newcomers and mentors in three specific areas that will be described in more detail below. Our hypothesis is that once a newcomer has made a contribution (say by making a structured task edit), these features will help create a positive feedback cycle that increases newcomer motivation.

Below are the specific hypotheses that we seek to validate across the newcomer population. We will also have hypotheses for each of the three sets of features that the team plans to develop. These hypotheses drive the specifics for what data we will collect and how we will analyse that data.

  1. The Positive Reinforcement features increase our core metrics of retention and productivity.
  2. Since the Positive Reinforcement features do not feature a call to action that asks newcomers to make edits, we will see no difference in our activation core metric.
  3. Newcomers who get the Positive Reinforcement features are able to determine that making un-reverted edits is desirable, and we will see a decrease in the proportion of reverted edits.
  4. The positive feedback cycle created by the Positive Reinforcement features will lead to a significantly higher proportion of "highly active" newcomers.
  5. The Positive Reinforcement features increase the number of Daily Active Users of Suggested edits.
  6. The average number of edit sessions during the newcomer period (first 15 days) increases.
  7. "Personalized praise" will increase mentor’s proactive communication with their mentees, which will lead to increase in retention and productivity.

Experiment plan

As with previous Growth team projects, we want to test our hypotheses through controlled experiments (also called "A/B tests"). This will allow us to establish a causal relationship (e.g. "The Leveling Up features cause an increase in retention of xx%"), and it will allow us to detect smaller effects than if we were to give the features to everyone and analyze the effects pre/post deployment.

In this controlled experiment, a randomly selected half of users will get access to Positive Reinforcement features (the "treatment" group), and the other randomly selected half will instead get the current (September 2022) Growth feature experience (the "control" group). In previous experiments, the control group has not gotten access to the Growth features. The team has decided to move away from that (T320876), which means that the current set of features is the new baseline for a control group.
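The random 50/50 assignment described above can be sketched as a deterministic, hash-based bucketing. This is only an illustration of the general technique, not the Growth team's actual implementation; the function name and hashing scheme are assumptions made for this example.

```python
import hashlib

def assign_bucket(user_id: int, experiment: str = "positive-reinforcement") -> str:
    """Deterministically place a user in the treatment or control group.

    Hashing the experiment name together with the user id yields a stable
    50/50 split: the same user always lands in the same group, and
    different experiment names produce independent splits.
    (Illustrative sketch only, not the production assignment code.)
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    # Even hash value -> treatment (new features), odd -> control (current experience).
    return "treatment" if int(digest, 16) % 2 == 0 else "control"
```

A deterministic scheme like this avoids having to store per-user assignments: the group can be recomputed at any time from the user id alone.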

The Personalized Praise feature is focused on mentors. There is a limited number of mentors on every wiki, whereas when it comes to newcomers the number increases steadily every day as new users register on the wikis. While we could run experiments with the mentors, we are likely to run into two key challenges. First, the limited number of mentors could mean that the experiments would need to run for a long time. Second, and more importantly, mentors are well integrated into the community and communicate with each other, meaning they are likely to figure out if some have access to features that others do not. We will therefore give the Personalized Praise features to all mentors and examine activity and effects on newcomers pre/post deployment in order to understand the feature’s effectiveness.

In summary, this means we are looking to run two consecutive experiments with the Impact and Leveling up features, followed by a deployment of the Personalized Praise features to all mentors. These experiments will first run on the pilot wikis. We can extend this to additional wikis if we find a need to do that, but it would only happen after we have analyzed the leading indicators and found no concerns.

Each experiment will run for approximately one month, and for each experiment we will have an accompanying set of leading indicators that we will analyze two weeks after deployment. The list below shows what the planned experiments will be:

  1. Impact: treatment group gets the updated Impact module.
  2. Leveling up: treatment group gets both the updated Impact module and the Leveling up features.
  3. Personalized praise: all mentors get the Personalized praise features.

Leading indicators and plan of action

While we believe that the features we develop are not detrimental to the wiki communities, we want to make sure we are careful when experimenting with them. It is good practice to define a set of leading indicators, together with plans for what action to take if a leading indicator suggests something isn't going the way it should. We have done this for all our past experiments, and do so again for the experiments we plan to run as part of this project.

Impact

Impact of the Impact module - results published on Jan 24, 2023.
Indicator | Expected result | Plan of action | Results
Impact module interactions | No difference or increase | If Impact module interactions decrease, this suggests we might have performance or compatibility issues with the new Impact module. If the proportion of newcomers who interact with the new Impact module is significantly lower than with the old module, we investigate the cause, reverting to the old module if necessary. | Significant decrease
Mentor module interactions | No difference | The new Impact module takes up more screen real estate than the old module, which might lead to newcomers not finding the Mentor module as easily as before. If the number of newcomers who interact with the Mentor module is significantly lower for those who get the new Impact module, we investigate the need for design changes. | No significant difference
Mentor module questions | No difference | Similar concerns as for interactions with the Mentor module: if the number of questions asked of mentors is significantly lower for newcomers who get the new Impact module, we investigate the need for design changes. | No significant difference
Edits and revert rate | No difference in both edits and reverts, or an increase in edits and a decrease in revert rate | An increase in the revert rate may suggest that newcomers are making unconstructive edits in order to inflate their edit or streak counts. If the revert rate of newcomers who get the new Impact module is significantly higher than with the old one, we investigate their edits and decide whether changes are needed. | No significant difference (once outliers are removed)

Impact module interactions: We find that the proportion of newcomers who interact with the old module (6.1%) is significantly higher than for the new module (5.0%). This difference showed up early in the experiment, and we have examined the data more closely to understand what is happening. One issue we identified early on was that not all interaction events were instrumented, which we subsequently resolved. Examining further, we find that many of those who get the old module click on links to the articles or the pageviews. In the new module, a graph of the pageviews is available, removing some of the need to visit the pageview tool. As a result, we decided that no changes were needed.

Mentor module interactions: We find no significant difference in the proportion of newcomers who interact with the Mentor module. The proportion for newcomers who get the old module is 2.4%; for those who get the new module it is 2.2%. A chi-square test finds this difference not significant.
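The significance checks above can be reproduced with a standard Pearson chi-square test on a 2x2 contingency table. The sketch below implements the test from scratch; the per-group sample sizes of 10,000 are assumptions made up for the example (the actual group sizes are not published here), so only the mechanics are meant to match.

```python
def chi_square_2x2(success_a, total_a, success_b, total_b):
    """Pearson chi-square statistic for comparing two proportions,
    e.g. the share of newcomers interacting with the old vs. new module.
    Compare the result against the df=1 critical value of 3.841 for
    significance at p < 0.05."""
    table = [
        [success_a, total_a - success_a],
        [success_b, total_b - success_b],
    ]
    total = total_a + total_b
    row_sums = [total_a, total_b]
    col_sums = [table[0][0] + table[1][0], table[0][1] + table[1][1]]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            stat += (table[i][j] - expected) ** 2 / expected
    return stat

# Hypothetical sample sizes of 10,000 per group, mirroring the reported
# 6.1% vs. 5.0% Impact module interaction rates:
print(chi_square_2x2(610, 10_000, 500, 10_000) > 3.841)   # significant
# ...and the 2.4% vs. 2.2% Mentor module interaction rates:
print(chi_square_2x2(240, 10_000, 220, 10_000) > 3.841)   # not significant
```

With these assumed sample sizes, the 6.1% vs. 5.0% gap clears the critical value while the 2.4% vs. 2.2% gap does not, matching the pattern of conclusions reported above.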

Mentor module questions: We do not see a substantial difference in the number of questions asked between the old module (269 edits) and the new module (281 edits). The proportion of newcomers who ask their mentor a question is also the same for both groups, at 1.5%.

Edits and revert rate: We do not see a substantial difference in the number of edits nor in the revert rate between the two groups measured on a per-user average basis. There are differences between the groups, but these are driven by some highly prolific editors, particularly on the mobile platform.

Levelling up

Leading indicators for the Levelling up experiment
Indicator | Expected result | Plan of action | Results
Levelling up post-edit dialog: interactions | No difference or increase | The percentage of users who click / tap on a Levelling up post-edit dialog should be similar to or higher than the percentage who click / tap on the standard post-edit dialog. If there is a decrease, we need to investigate what causes the difference. | Higher on mobile, no difference on desktop
Levelling up post-edit dialog: "Try a suggested edit" click through | >10% click through to suggested edits | If the "try a suggested edit" dialog isn't resulting in more newcomers exploring suggested edits, then this notice is just extra noise for newcomers, and we should investigate or consider removing the feature. | Significantly higher than 10%
Levelling up post-edit dialog: "Increase your skill level" click through | >10% click through to Try new task | If the "increase your skill level" dialog isn't resulting in more newcomers trying more difficult tasks, then this notice is just extra noise for newcomers, and we should investigate or consider removing the feature. | Significantly higher than 10%
Levelling up notifications: "get started" click through | >5% of users who view this notification click on it | We don't have a great baseline to compare this to, but if this number is too low we should investigate whether there are technical issues or an issue with the language used. | More than 5% on desktop, less than 5% on mobile
Levelling up notifications: "keep going" click through | >5% of users who view this notification click on it | We don't have a great baseline to compare this to, but if this number is too low we should investigate whether there are technical issues or an issue with the language used. | More than 5% on desktop, less than 5% on mobile
Activation | No difference or increase | If we see a significant decrease in the treatment group, similar to what we discovered in the New Impact Module experiment, we will examine monitoring and event data to try to identify a cause. | Decrease

Levelling up post-edit dialog interactions: We find a higher proportion of newcomers interacting with the post-edit dialog in the Levelling Up group (90.8%) compared to the standard post-edit dialog (86.5%). This is largely driven by mobile, where the Levelling Up interaction proportion (88.0%) is a lot higher than the other group (81.6%). The proportion is still higher for the Levelling Up group on desktop (93.6%) compared to the control (92.2%), but we regard it as "virtually identical" because the high proportion in the control group means there is little room for an increase.

Try a suggested edit click through rates: 21.9% of newcomers who see the "Try a suggested edit" post-edit dialog choose to click through, which is significantly higher than the threshold set. The proportion is higher on desktop (24.0%) than on mobile (19.7%), but in neither case is there reason for concern.

Increase your skill level click through rates: We find that 73.1% of newcomers who see the "increase your skill level" dialog click through to see the new task, far above our expected threshold of 10%. Proportions are high on both desktop (71.1%) and mobile (77.3%).

Get started click through rates: 3.8% of newcomers who get the "Get started" notification click through to the Homepage. Users who registered on desktop are more likely to click the notification (5.5%) than those on mobile (2.5%). Because the 5% threshold is not met overall, we are investigating further to understand this difference between desktop and mobile behaviour, particularly to understand whether our 5% threshold is reasonable.

Keep going click through rates: We find that 9.6% of users who get the "Keep going" notification click through to the Homepage. As with the "Get started" notifications, we find a much higher proportion on desktop (16.2%) compared to mobile (4.7%). Our investigations into differences in notification behaviour by platform will hopefully give us more insight into this difference.

Activation: We find a decrease in constructive article activation (making a non-reverted article edit within 24 hours of registration): 27.0% compared to 27.7%. As soon as we noticed this, we opened T334411 to investigate the issue, with a focus on patterns in geography (countries and wikis) and technology (devices and browsers). We did not find clear patterns explaining the issue. This decrease in activation will be investigated further: T337320.

Personalized praise

Leading indicators for the Personalized praise experiment
Indicator | Expected result | Plan of action | Results
Personalized praise notification click through | At least 10% of Mentors who view a Personalised praise notification click on it | If this number is much lower than the click-through on other notifications, we should investigate whether there are technical issues or an issue with the language used. | 73% of Mentors who received a notification clicked on it
Personalized praise mentor dashboard module click through | At least 10% of Mentors who view a Personalised praise suggestion on their Mentor dashboard end up clicking through to send praise | If this threshold is not met, we should investigate whether there are technical issues or an issue with how Mentors are interpreting this call to action. | 27.5% of Mentors who view a Personalized praise list click through

Data was gathered on 2023-06-13, from the four pilot wikis where the feature is deployed (Arabic Wikipedia, Bengali Wikipedia, Czech Wikipedia, and Spanish Wikipedia).

Personalized praise notification click through: Although this is still a relatively small sample, results seem healthy and show that Mentors are indeed receiving notifications and clicking through to view their praise-worthy mentees.

Personalized praise mentor dashboard module click through: Only 27.5% of Mentors are clicking through to a mentee's talk page, however it's to be expected that some of the mentees we are surfacing aren't deserving of praise. Based on this data and feedback from Mentors, the Growth team will pursue the following tasks to help improve this feature:

  • Add revert scorecard to Personalized praise module on Mentor dashboard (T337510)
  • Exclude blocked accounts from the Personalized praise suggestions (T338525)

Experiment Results

Many of the experiments that the Growth team runs will focus on the same set of key metrics (commonly referred to as KPIs), and this includes all of the Positive Reinforcement experiments. The key metrics are defined as follows:

  • Constructive activation is defined as a newcomer making their first edit within 24 hours of registration, and that edit not being reverted within 48 hours.
    • Activation is similarly defined as constructive activation, but without the non-revert requirement.
  • Constructive retention is defined as a newcomer coming back on a different day in the two weeks after constructive activation and making another edit, with said edit also not being reverted within 48 hours.
    • Retention is similarly defined as constructive retention, but without the non-revert requirements.
  • Constructive edit volume is the overall count of edits made in a user's first two weeks, with edits that were reverted within 48 hours removed.
  • Revert rate is the proportion of edits that were reverted within 48 hours out of all edits made. This is by definition 0% for users who made no edits, and we generally exclude these users from the analysis.
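These definitions can be made concrete with a short sketch. The function below computes the key metrics for a single newcomer from a list of (timestamp, reverted-within-48h) pairs; the function name and data layout are our own illustration, not the Growth team's actual instrumentation.

```python
from datetime import datetime, timedelta

def key_metrics(registration, edits):
    """Compute the Growth team's key metrics for one newcomer.

    `edits` is a list of (timestamp, was_reverted_within_48h) pairs,
    sorted by timestamp. Names and layout are illustrative only.
    """
    first_day = [e for e in edits if e[0] <= registration + timedelta(hours=24)]
    constructive_first_day = [e for e in first_day if not e[1]]

    activation = bool(first_day)
    constructive_activation = bool(constructive_first_day)

    # Retention: another edit on a *different day* within the two weeks
    # after constructive activation, itself not reverted within 48 hours.
    constructive_retention = False
    if constructive_activation:
        t0 = constructive_first_day[0][0]
        for ts, reverted in edits:
            if (not reverted and ts.date() != t0.date()
                    and t0 < ts <= t0 + timedelta(days=14)):
                constructive_retention = True

    # Edit volume in the first two weeks, reverted edits removed.
    fortnight = registration + timedelta(days=14)
    constructive_edit_volume = sum(
        1 for ts, reverted in edits if ts <= fortnight and not reverted)

    # Revert rate; 0% by definition for users with no edits (who are
    # generally excluded from the analysis anyway).
    revert_rate = (
        sum(1 for _, reverted in edits if reverted) / len(edits)
        if edits else 0.0)

    return {
        "activation": activation,
        "constructive_activation": constructive_activation,
        "constructive_retention": constructive_retention,
        "constructive_edit_volume": constructive_edit_volume,
        "revert_rate": revert_rate,
    }
```

For example, a newcomer who edits an hour after registering, returns two days later to edit again, and has a third edit reverted counts as constructively activated and retained, with a revert rate of one third.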
Impact module experiment results
 
The New Impact module reduced activation for mobile web newcomers

We initially found a significant decrease in constructive activation for newcomers who registered on mobile web and got the New Impact module. There was no difference in activation for newcomers who registered on desktop. This was quite surprising as the empty state for the old Impact module was nearly identical to the empty state of the new Impact module.

First-day activity correlates strongly with later activity, and as a result we also found a significant decrease in edit volume for mobile web users. Again, there was no difference for desktop users.

We found no difference in retention rates and revert rates. While there are features in the New Impact module that focus on staying active and making good contributions, such as the number of thanks received and the streak counter, we often do not see significant impacts on metrics unless there is a clear call to action or we are able to isolate a specific subgroup motivated by the feature.

 
Activation is identical between the experiment and control group

As soon as we learned about the decrease in activation we started investigations into probable causes of this in T330614. Unfortunately we could not identify a specific reason and we also found that the issue was not replicated in another dataset. We decided to add activation as a leading indicator to the Levelling Up experiment so that we could take action more quickly. When we noticed that the issue persisted, we started a new investigation in T334411 and created an "epic" task that connects all relevant subtasks: T342150. We restarted experiment data collection after making several small changes, and we now see that activation is identical between the experiment and control group, which is what we would expect.

Although we have received positive feedback from new editors regarding the new Impact module, we have found that the Impact module alone hasn't resulted in significant changes in newcomer retention, edit volume, or revert rates. Our next experiment will combine the new Impact module with the Levelling up features. We hope that this combination of Positive Reinforcement features will lead to substantial improvements in activation, retention, and edit volume. We will soon publish a detailed report that highlights the outcomes of this experiment.

Levelling up experiment results

For this experiment, we completed an analysis of the overall effects across the whole newcomer population, as well as individual analyses of each of the four components of the Levelling up features. These consist of two notifications sent to newcomers 48 hours after registration, and two post-edit dialogues. The notifications are based on the number of suggested edits a newcomer has made. If the newcomer has not made any suggested edits, they get the "Get started" notification, and if they have made one to four suggested edits, they get the "Keep going" notification. Newcomers who have made five or more suggested edits do not get any notification.

The post-edit dialogues are shown after completed edits to articles based on certain criteria. If a newcomer has made three or eight article edits and not yet made any suggested edits, they get the "Try suggested edits" dialogue asking them if they want to try that feature. If a newcomer has completed five suggested edits of a specific task type, they get the "Try new task" dialogue suggesting a different type of task.
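The trigger rules above can be summarised in a small sketch. The function names and parameters are ours, not the GrowthExperiments extension's API, and the thresholds are those described in this section (before the later threshold changes noted below).

```python
def levelling_up_notification(suggested_edits):
    """48-hour notification, keyed on suggested edits made so far."""
    if suggested_edits == 0:
        return "Get started"
    if suggested_edits <= 4:
        return "Keep going"
    return None  # five or more suggested edits: no notification

def post_edit_dialogue(article_edits, suggested_edits, edits_of_task_type):
    """Which post-edit dialogue, if any, to show after a completed edit."""
    # Shown at the third and eighth article edit if the newcomer has
    # never made a suggested edit.
    if suggested_edits == 0 and article_edits in (3, 8):
        return "Try suggested edits"
    # Shown after five completed suggested edits of one task type.
    if edits_of_task_type == 5:
        return "Try new task"
    return None
```

For instance, a newcomer completing their third article edit with no suggested edits would see the "Try suggested edits" dialogue, while one completing their fifth copyedit task would see "Try new task".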

 
The Get Started notification increased editing within one week after receiving it.

Our overall analysis did not find any significant effects on the team's key metrics (described above), and so we focus instead on the individual components. For the "Get started" notification, we find that this is sent to the vast majority of newcomers as making suggested edits is fairly uncommon. In our dataset, more than 97% of newcomers got this notification. We find that the notification leads to a significant increase in newcomer activity in the week following the notifications being sent. Newcomers are more likely to return and make an edit, which also increases the average number of edits made during that week. We also find that this effect is lower for those who registered on mobile web, and reduced or negative for highly active newcomers. Based on this, we decided to introduce a threshold so that those who make ten or more edits will not receive the notification (that work was tracked in T342819).

 
The Keep Going notification increased editing within one week for desktop users.

When it comes to the "Keep going" notification, we again find a significant increase in newcomer activity in the week following notifications being sent for those who registered on the desktop platform. For users who registered on mobile web, we find that it does not increase their probability of returning to edit but does increase the average number of edits made.

For the "Try suggested edits" dialogue, our analysis finds that while it has a reasonably high click-through rate, it does not lead to newcomers successfully completing suggested edits. In our leading indicators report above, the click-through rate was 21.9%, and in a dataset from late July 2023 we found the rate to be higher at 25.3%. Using event data, we find that few newcomers find a task they are interested in, and subsequently only a fraction of newcomers go through and complete an edit. We plan to make a few improvements to this "Try suggested edits" dialogue to see if we can increase the percentage of editors who click through and go on to complete an edit (this work is tracked in T348205).

For the "Try new task" dialogue, which is shown to users who complete five suggested edits of a given task type, we find both high click-through rates and a reasonably high rate of completed edits. We reported a click-through rate of 73.1% in our leading indicators, and in our more recent dataset from late July 2023 the rate is 81.9%. Our analysis of subsequent edits shows that 33.3% of desktop users and 20.0% of mobile web users go through and complete a suggested edit of the new task type. One thing to keep in mind is that this dialogue is not shown to a large number of newcomers, and we therefore cannot draw conclusions about whether there are meaningful differences between platforms. What we can conclude is that this dialogue is successful in introducing new task types. In order to show the dialogue to a larger number of newcomers, we decided to reduce the number of edits needed to see it from five to three (this work was tracked in T348814).

Personalized praise experiment results

For this experiment, we focused on the effect of praise on newcomer retention and productivity. Since praise is a response to editing activity, it means there will be some time period between registration and receiving a praise message. We therefore started with an analysis of the time between registration and a mentor clicking the "Send praise" button. In that analysis, we found that most newcomers get it within 30 days of registration. This led us to redefine the time period for retention and productivity to also use this 30-day period (instead of our default of 14 days).

The Personalized praise feature was deployed to the Arabic, Bangla, Czech, and Spanish Wikipedias in late May 2023. We analyzed the Spanish Wikipedia separately from the other three because on the Spanish Wikipedia 50% of newcomers are randomly assigned a mentor, which means the feature is part of a controlled experiment. All newcomers are assigned a mentor on the other three Wikipedias.

Using a Difference-in-Differences analysis approach, we compared a three-month period prior to deployment (January through March) with a similar period after deployment (June through August), and compared data from 2023 with data from 2022 and 2018. We use two comparison time periods as a robustness check since 2022 was affected by the COVID pandemic.
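The Difference-in-Differences estimate described here takes the change between the pre- and post-deployment windows in the deployment year and subtracts the same seasonal change in a comparison year. A minimal sketch, with made-up numbers (not actual wiki data):

```python
def did_estimate(pre_treated, post_treated, pre_comparison, post_comparison):
    """DiD point estimate: the Jan-Mar to Jun-Aug change in the deployment
    year, minus the same seasonal change in a comparison year."""
    return (post_treated - pre_treated) - (post_comparison - pre_comparison)

# Hypothetical mean edit counts per newcomer (illustrative only):
# 2023 (feature deployed in late May) vs 2022 (comparison year).
effect = did_estimate(pre_treated=4.0, post_treated=4.6,
                      pre_comparison=4.2, post_comparison=4.3)
```

A positive `effect` (here 0.5) would indicate that productivity rose more in the deployment year than the seasonal baseline would predict; repeating the calculation with 2018 as the comparison year serves as a robustness check.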

For the Arabic, Bangla, and Czech Wikipedias, we found no significant impact of Personalized praise on either retention or productivity. Digging further into this, we found that usage of the feature was limited (we are not releasing specific counts, in accordance with our data publication guidelines). In discussions with wiki ambassadors, we learned that sending praise is a time-consuming process, as mentors need to check a mentee's edits, which explains why the feature isn't more widely used.

 
Personalized praise increased number of non-reverted edits made within 30 days of registration on Spanish Wikipedia.

When it comes to the Spanish Wikipedia, we found the feature has been more widely used. While we again found no significant impact on retention, we found a significant positive impact on newcomer productivity. This finding is encouraging, since our preliminary analysis of mentorship found conflicting results, showing either no impact or a negative one.

As these results were not positive enough to justify the time investment from Mentors, we have decided to start conversations with our ambassadors and communities and consider further improvements before releasing the feature more widely. We will consider improvements related to reducing the amount of work needed by Mentors, potential design improvements, and improvements to how newcomers are selected to be displayed in the Personalized praise module.