Wiki-Highlights

Wiki-Highlights is a test project to validate or invalidate the hypothesis that if global youth are offered automated, human-reviewed, visual articles as an alternative reading experience on third-party platforms, their awareness of and engagement with Wikimedia projects, as readers and contributors, will increase.

Background

One of the areas of work identified by the Wikimedia Foundation's Product and Technology departments for 2023-24 is the Future Audiences objectives and key results. The Future Audiences bucket will explore ways for the movement to become the essential infrastructure of the ecosystem of free knowledge by making knowledge available to everyone, wherever they are on the internet.

This work aims to reach global youth who consume information on other platforms with our content, in order to increase their awareness of and engagement with our projects (Future Audience KR 2.1).

The Wiki-Highlights project aligns with Future Audience KR 2.1 and will validate or invalidate one of the hypotheses in the Inuka team's 2023-24 annual plan.

If we offered automated, human-reviewed, visual article summaries generated from Wikipedia articles as an alternative reading experience for younger readers and have them indexed by viable search platforms (Google, TikTok), we would be able to test whether their interactions with our content and projects would increase engagement. Human-reviewed summaries would be generated for multiple articles and surfaced to a group of readers on mobile, to evaluate their preference for different content formats and topics. We would evaluate whether global youth readers show significant interest in Wikipedia article summaries and measure engagement based on time spent, number of summaries consumed, summary completion rate, and topics with the highest readership.

Rationale for Wiki-Highlights

Trends show that information consumption patterns among global youth audiences are evolving rapidly and affecting overall traffic to our projects. These patterns fall into three groups:

  • Content formats: text, image, video, audio (static and interactive)
  • Content length: short form
  • Content destination: search, social media, and other third-party platforms

Insights from young audiences (18-24 year olds) indicate that:

  • Wikipedia ranks high for school assignments, fact-checking, and staying informed.
  • Usage of Wikipedia is very low for social and engagement needs.
  • Wikipedia average usage is under 50% in comparison to other brands.
  • Articles being too long is the 3rd highest reason why they don't use Wikipedia.
  • Improvements like including more images and making articles shorter would spur more usage.

Wikipedia articles are long and text-heavy, and the appeal of this type of content is declining among younger audiences. There is a need to innovate on this content while retaining the vital information, and to reach the global audience where they are on other platforms.

The goal

Our main goal with this project is to test whether automated, human-reviewed visual summaries generated from Wikipedia (Wiki-Highlights) are viable reading experiences for global youth audiences on third-party platforms.

Experiment approach

The team will use microsites to conduct A/B testing between different content formats (standard Wikipedia articles and Wiki-Highlights format) to evaluate the engagement and the experience of the targeted audience on mobile devices:

  • In this first iteration of the experiment, we will not test the portion of the hypothesis that reads "... have them indexed by viable search platforms (Google, TikTok) ...", as we would like to validate the viability of this approach before engaging with a partner.
  • The Wiki-Highlights content will be generated by extracting a concise overview of facts (2-4 sentences, or 300 characters or fewer) from the sections of lengthy Wikipedia articles, combined with relevant images sourced from Commons. This first experiment will focus on English articles selected from the following categories and topics.

| Categories | Topics |
| --- | --- |
| History | Art, monuments, sites and artifacts |
| Lifestyle | Food, fashion, language, travel, media |
| Places | Countries, cities, islands |
| Personalities | Biographies, personalities |
| Sports | Sport, games and recreation |
| Topical | Climate, sustainability, equity, health, social justice |
| Nature | Plants, animals, water and land bodies |

Quality of articles: articles must be rated "Featured" or "Good".
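
The length guideline for a highlight (2-4 sentences, or 300 characters or fewer) can be illustrated with a minimal sketch. The page does not describe how the extraction was actually implemented, so everything here is an assumption: the function names are hypothetical, the sentence splitter is deliberately naive, and the guideline is read as "keep leading sentences while staying within both the four-sentence and the 300-character limits".

```python
import re

MAX_SENTENCES = 4   # upper bound taken from the "2-4 sentences" guideline
MAX_CHARS = 300     # "300 characters or fewer"


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def make_highlight(section_text):
    """Keep leading sentences while staying within both limits.

    Returns an empty string if even the first sentence is too long;
    a real pipeline would need a fallback (and human review) for that case.
    """
    chosen = []
    for sentence in split_sentences(section_text):
        candidate = " ".join(chosen + [sentence])
        if len(chosen) >= MAX_SENTENCES or len(candidate) > MAX_CHARS:
            break
        chosen.append(sentence)
    return " ".join(chosen)


# Placeholder text, not real article content.
print(make_highlight("One. Two. Three. Four. Five."))
```

A sketch like this only enforces length. Choosing which facts to keep, pairing them with images from Commons, and the human review step are outside its scope.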

Measurement metrics

Engagement metric measurement

| Metric | What to track |
| --- | --- |
| Primary: users' willingness to consume the article summaries (Wiki-Highlights) | Total time spent per session |
| Secondary: users' willingness to complete a summary | Number of summaries consumed per session |
| Secondary: users' willingness to view subsequent summaries | Number of summaries per article viewed and number of sessions |
| Secondary: topics with the majority of reads | Number of read summaries per article, based on topics |
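
As a rough illustration of how metrics like these could be derived from instrumentation data, here is a minimal sketch. The event schema (`session`, `summary`, `topic`, `seconds`, `completed`) and the sample rows are hypothetical and are not taken from the microsites.

```python
from collections import Counter, defaultdict

# Hypothetical event records: one row per summary opened in a session.
# Field names and values are illustrative only.
events = [
    {"session": "s1", "summary": "a1", "topic": "Sports", "seconds": 40, "completed": True},
    {"session": "s1", "summary": "a2", "topic": "Nature", "seconds": 15, "completed": False},
    {"session": "s2", "summary": "a1", "topic": "Sports", "seconds": 30, "completed": True},
]


def engagement_metrics(events):
    """Aggregate time, consumption, completion and topic reads from raw events."""
    time_per_session = defaultdict(int)
    summaries_per_session = defaultdict(set)
    reads_per_topic = Counter()
    completed = 0
    for event in events:
        time_per_session[event["session"]] += event["seconds"]
        summaries_per_session[event["session"]].add(event["summary"])
        reads_per_topic[event["topic"]] += 1
        if event["completed"]:
            completed += 1
    return {
        "time_per_session": dict(time_per_session),
        "summaries_per_session": {s: len(ids) for s, ids in summaries_per_session.items()},
        "completion_rate": completed / len(events) if events else 0.0,
        "reads_per_topic": dict(reads_per_topic),
    }


metrics = engagement_metrics(events)
print(metrics)
```

Here the completion rate is computed over opened summaries rather than over sessions. That is one possible definition, chosen only for the sketch.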

Product design

The team considered different layouts, interaction styles, and experience design prototypes, and subjected them to rounds of user testing on Userlytics, Instagram, and TikTok to arrive at the most strongly preferred design for the microsites, as shown in this link.

Timeline

June 2023 (Q4 of 2022/23) through March 2024 (Q3), in the following phases:

  1. Ideating and scoping
  2. Design: iterative prototyping and usability testing
  3. Microsite development, testing and deployment
  4. Experimentation with focus group
  5. Evaluation and reporting

Experiment results

The WMF developed 2 prototype test sites that were shown to 2,400 participants aged 18 to 24 across 6 countries (US, Brazil, Germany, India, Indonesia, Nigeria). Half the participants were shown the article microsite, and the other half were shown the highlights microsite.

The testing happened in 2 ways:

1. Survey testing:

We ran a series of survey questions before and after participants interacted with the 2 sites. Half the participants were shown the article view, and the other half shown the wiki-highlights view. The survey result document has the outcome in detail, and below are summarized insights from the survey:

Learnings and suggestions/recommendations

Overall
  • Learning: Wiki-Highlights had more appeal and is seen as more unique, though only marginally.
  • Suggestion/recommendation: Although the differences in appeal and uniqueness are statistically significant, they are slight.
    • This suggests that more could be done to differentiate Wiki-Highlights from the current Wikipedia reading experience.

Audience-related feedback
  • Learning: Wiki-Highlights appeals more to 23-24 year-olds and to more frequent Wikipedia users.
  • Suggestion/recommendation: With technology changing so fast, 23-24 year-olds may have a different relationship with it than 18-19 year-olds.
    • The difference suggests further development should be targeted at the needs of the younger group to truly future-proof Wiki-Highlights.
  • Learning: Wiki-Highlights appealed more to users in Nigeria and Indonesia and much less to users in Germany and the US.
  • Suggestion/recommendation: We see Wikipedia overall as less appealing in more developed markets, perhaps due to more established digital environments and brands.
    • It may be that in these markets there would need to be more difference between Wiki-Highlights and the current Wikipedia experience to have an impact.

Feature-related feedback
  • Learning: Wiki-Highlights outperformed the control site on perceptions of 'fun' and 'easy'.
  • Suggestion/recommendation: 23-24 year-olds find Wiki-Highlights more fun, perhaps due to less use of apps like TikTok.
    • This suggests a bigger incremental change is needed for younger users.
  • Suggestion/recommendation: Heavier Wikipedia users are also more likely to find Wiki-Highlights easy.
    • There may be a need to compare it with other sites to explore how to improve Wiki-Highlights to bring in lighter users.
  • Learning: Users liked the topics and imagery of both sites; Wiki-Highlights stood out on simplicity of language and content length.
  • Suggestion/recommendation: The current lack of perceived difference in imagery between Wiki-Highlights and the control site suggests this could be an area of further development.
    • The shorter content length works well and should be retained.
  • Learning: Users would improve the navigation, and they desired more topics.
  • Suggestion/recommendation: Review the back button (this may have been impacted by the survey buttons, but we don't believe so given the verbatim).
    • We could explore additional topics relevant to this audience, e.g. movies, music, tech.

2. A/B testing:

During the survey testing, we instrumented the microsites to capture specific metrics for the 2 groups to gauge the depth of engagement further.

  • The Experiment Group participants were shown the Wiki-Highlights (summarised) content as seen here.
  • The Control Group participants were shown long-form, Wikipedia article-style content as seen here.

Overall metrics and observations are documented in the table below:

Overall metrics and observations

Time on site (session length)
  • Time on homepage + content page: overall, the experiment group seemed to stay longer than the control group.
  • Time on homepage: the experiment group spent the same amount of time as the control group on the homepages.
  • Time on content page: the experiment group stayed longer than the control group on the content pages.

Completion rate (willingness to complete the content)
  • Experiment group: 1,658 highlights opened, with a 72.2% completion rate.
    • The experiment group had more highlights read but a lower completion rate.
  • Control group: 1,112 articles opened, with a 78.1% completion rate.

Number of content items consumed per session (willingness to view subsequent content)
  • Experiment group: 95% of sessions consumed 0 to 4 highlights per session.
    • The users in the experiment group consumed more content than the users in the control group.
  • Control group: 95% of sessions consumed 0 to 3 articles per session.
  • Note: 0 means users only viewed the homepage in certain sessions.

Majority reads by content type
  • Experiment group top 3 visited pages: Lionel Messi, Climate Change, Elephant.
  • Control group top 3 visited pages: Lionel Messi, Friends, Japan.

Majority reads by category type
  • Experiment group top 3 visited categories: Topical, Personalities, Nature.
  • Control group top 3 visited categories: Lifestyle, Personalities, Nature.
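
The completion figures above combine a count of items opened with a percentage, so it can help to convert them to approximate absolute numbers. The sketch below assumes that each completion rate applies to the reported number of opened items. The resulting counts are derived estimates, not figures reported by the experiment, and the published percentages are themselves rounded.

```python
def completed_count(opened, completion_rate):
    """Approximate number of completed items, rounded to the nearest whole item."""
    return round(opened * completion_rate)


# Inputs are the figures reported in the overall metrics above.
highlights_done = completed_count(1658, 0.722)  # experiment group
articles_done = completed_count(1112, 0.781)    # control group

print(highlights_done, articles_done)
```

Under that assumption, roughly 1,197 highlights and 868 articles were completed. The experiment group therefore completed more items in absolute terms even though its completion rate was lower.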

We also evaluated the metrics across the 6 countries where we ran the experiment:

Country metrics and observations

Time on site (session length)
  • Homepages and content pages combined:
    • Experiment group: users spent more time in Brazil, India, Nigeria, and the United States.
    • Control group: users spent more time in Indonesia and Germany.
  • Homepages:
    • Experiment group: users spent more time in Brazil, India, and Nigeria.
    • Control group: users spent more time in Indonesia and the United States.
    • Note: in Germany, users spent similar time on homepages in both the experiment and control groups.
  • Content pages:
    • Experiment group: users spent more time in Brazil, India, Indonesia, Nigeria, and the United States.
    • Control group: in Germany, control-group users spent more time on content pages in 90% of the sessions.
    • Note: Nigeria had significantly longer time spent on content pages than the other countries.

Completion rate (willingness to complete the content)
  • Experiment group: the content completion rate was lower than in the control group in every country except India and Indonesia.

Number of content items consumed per session (willingness to view subsequent content)
  • Experiment group: users viewed more content in Brazil, India, Nigeria, and the United States.
  • Control group: in Germany, users in the experiment group viewed less content per session than users in the control group.
  • Note: in Indonesia, users viewed a similar amount of content per session in both groups.

Majority reads by content type
  • Content such as Lionel Messi and Climate Change appears in the top-viewed lists of most countries.
  • The rest of the top-viewed content differs from country to country.

Majority reads by category type

Experiment Learnings:

  1. Learnings from the Survey testing:
    • Wiki-Highlights had more appeal, and is seen as more unique, though only marginally. Although differences in appeal & uniqueness are statistically significant, they are slight differences.
      • This suggests that more could be done to differentiate Wiki-Highlights from the current Wikipedia reading experience.
  2. Learnings from the A/B Testing:
    • Users spent more time on and consumed more Wiki-Highlights content than articles. However, articles had a higher completion rate than Wiki-Highlights, which could be attributed to users not expanding each article section; this may also have affected the reading time of the control group.
      • Users in Brazil, India, and Nigeria favored Wiki-Highlights.
      • Users in Germany favored articles: they spent more time on them, consumed more of them, and completed more content.

Status updates

Q3 January to March 2024

March 2024

February 2024

January 2024

Q2 October to December 2023

December 2023

Microsite development tasks

November 2023

Microsite development tasks

October 2023

Microsite development tasks: