Reading/Strategy/Strategy Process/Testing

About

We have narrowed down our strategic possibilities for the Wikimedia Reading audience vertical to the following four main strategies.

  • Optimize the user experience of Wikipedia for contemporary technologies, making Wikipedia content equally accessible across platforms, with an enjoyable reading experience across devices and channels.
  • Allow readers to interact with content and each other, adding a further layer of engagement by offering different types of interaction with content in addition to passive reading.
  • Create a deep-dive (guided) educational experience, making it easy for knowledge seekers to find the information they are looking for in an understandable form. For example: being able to learn efficiently about semiconductors without being an engineer.
  • Create a reading experience tailored for users in the Global South.

How can you contribute?

Below are the tests that we plan to conduct to determine the viability of these strategies. Please take a moment to read them and see if they make sense. Would you like to suggest an alternative test or modify an existing planned test? Please discuss on the talk page.

Tests marked in bold are tentatively planned for Q2 FY 2015-2016 (October - December, 2015).

Tests

Optimize the user experience of Wikipedia for contemporary technologies

Condition to test Short term test Medium test Large effort test (definitive)
Innovative user experiences and well designed apps and sites will drive more traffic

Test: Research large internet sites and find out which have increased their traffic in the last 2-3 years solely by updating the user experience and user interface of their projects.

If/Then Hypothesis: If we find examples where good design demonstrably led to a radical increase in traffic, then improved design is likely to drive traffic.

Standard of proof: TBD

Test: Redirect 1% of desktop users from desktop to mobile web.

If/Then Hypothesis: If redirected users engage more than the usual group, then improvements are likely to have an impact.

Standard of proof: Data must support the hypothesis on a daily basis for 7 consecutive days.
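
The "7 consecutive days" standard above can be checked mechanically once daily engagement figures exist for both groups. The following is a minimal sketch, assuming one engagement number per day for the redirected group and for the usual (control) group; the function name, the choice of metric, and the strict "greater than" comparison are illustrative assumptions, not something this page specifies.

```python
from typing import Sequence


def holds_for_consecutive_days(
    redirected: Sequence[float],
    control: Sequence[float],
    required_days: int = 7,
) -> bool:
    """Return True if the redirected group out-engages the control group
    on at least `required_days` consecutive days.

    Each sequence holds one (hypothetical) engagement value per day,
    in chronological order.
    """
    if len(redirected) != len(control):
        raise ValueError("both series need exactly one value per day")

    streak = 0
    for redirected_value, control_value in zip(redirected, control):
        # A day supports the hypothesis only if redirected users engage more.
        streak = streak + 1 if redirected_value > control_value else 0
        if streak >= required_days:
            return True
    return False
```

A single day on which the redirected group does not engage more resets the streak, which matches the "on a daily basis" wording of the standard.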

Objective: Test user interface impact on beta.

Test: Test a cleaner user interface on beta with a larger group of users than our current beta users.

If/Then Hypothesis: If users prefer the beta user interface experience, as demonstrated by usage and surveys, then improvements are likely to have an impact.

Standard of proof: TBD

Wikipedia can develop modern experiences faster than intermediate providers

Objective: Deliver a new reading feature on both the desktop and mobile web

Test: Successfully deliver "Read More" in Q2 FY 2015-2016 (October-December, 2015)

If/Then Hypothesis: If a new reading feature can be delivered in one quarter, then delivery is rapid despite dependencies

Standard of proof: feature launched on web beta or non-EN wikis

Objective: Deliver a substantive reading feature

Test: Push something that changes the article header onto mobile web within a quarter

If/Then Hypothesis: If a more substantive reading feature can be delivered in one quarter, then environmental factors support introduction of such features on a sufficiently rapid timeline

Standard of proof: feature launched on web beta or non-EN wikis

A Community of Readers: Allow readers to interact with content and each other

Condition to test Short term test Medium test Large effort test (definitive)
If our content is interactive enough for readers, then they will visit our sites directly.

Objective: Determine if interactive content drives traffic in general

Test: Review research in this field

If/Then Hypothesis: If interactive content (comments, highlights) drives traffic in general, then it may also drive traffic for WP.

Standard of Proof: Research shows that sites that add interactive features see a significant boost in traffic.

Objective: See if readers think we have forums/comments and ways for users to interact with each other.

Test: Survey readers and ask them if we have discussion pages and how casually interactive our content is

If/Then Hypothesis: If readers already see Wikipedia as interactive, but do not want to interact, then there is no point in building more tools.

Standard of Proof: Most users do not see WP as a place for people to discuss content.

Objective: Gauge impact
Test: Look at how share-a-fact engagement increases overall engagement with site.
If/Then Hypothesis: If share-a-fact drives more time spent by the users who touch it, then interactivity drives traffic.
Standard of Proof: A user who has not seen share-a-fact has sessions and session length measured. After 2 weeks they are shown share-a-fact, and their engagement must go up (either number of sessions or length of session).
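
The before/after comparison in this standard of proof can be sketched as follows. This is an illustration under stated assumptions: the `Engagement` record, its field names, and the use of average session length are hypothetical, and the page does not say what share of users must improve, so the sketch only reports that share.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Engagement:
    """Hypothetical per-user measurements over one observation window."""
    sessions: int
    avg_session_seconds: float


def engagement_went_up(before: Engagement, after: Engagement) -> bool:
    """True if either the number of sessions or the session length increased."""
    return (
        after.sessions > before.sessions
        or after.avg_session_seconds > before.avg_session_seconds
    )


def share_of_users_improved(
    pairs: Iterable[Tuple[Engagement, Engagement]],
) -> float:
    """Fraction of users whose engagement rose after seeing share-a-fact.

    Each pair is (before, after) for one user.
    """
    results = [engagement_went_up(before, after) for before, after in pairs]
    if not results:
        raise ValueError("no users to evaluate")
    return sum(results) / len(results)
```

Because the same users are measured before and after, anything else that changes during the two weeks (seasonality, news events) can also move the numbers; a comparison group that is never shown share-a-fact would help separate those effects.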
Our reputation as the source for accessing the sum of all knowledge on the web is not threatened by newer platforms
Objective: Understand current perceptions of Wikipedia and find out whether more casual participation models might impact perception
Test: Review existing studies on lifelong learners. What impacts their perception and use of Wikipedia?
If/Then Hypothesis: If we determine, based on existing research, that learners strongly question content based on the existence of discussions, comments or crowd-sourced data, we should not move forward.
Standard of Proof: If there is evidence that comments, discussions or the ability to interact with content significantly and negatively change how people gauge the quality of the content they accompany, then we should not move forward.
Objective: Determine if users' perception of Wikipedia would change if they knew that the contribution model was easy and fun.

Test: Administer short survey to a small set of users in the lifelong learner demographic to gauge how various elements of perception would change if WP contribution model was different.
If/Then Hypothesis: If only a small number of respondents indicate that their perception would change negatively, then we can feel confident about rolling out new models of engagement and contribution to a broad set of users.
Standard of Proof: 80% of respondents must demonstrate that their perception of WP would not negatively change if they knew about the new model of engagement.

Objective: Determine real-life reaction to user generated content

Test: Ask users for perception of Wikipedia, show them page with comments or Q&A and then ask again.
If/Then Hypothesis: If users' perceptions of Wikipedia's accuracy are not diminished by the comments, then this is okay.
Standard of Proof: 80% of responses must demonstrate that their perception of Wikipedia would not negatively change if they knew about the new model of engagement.

Create deep-dive (guided) educational experience

This refers to a potential strategy where the reading team focuses on learning (comprehension and retention), rather than merely information presentation. Ideas in this theme include a suggested order to reading articles for maximum comprehension, simple or practical versions of articles, quizzes or even games.

Condition to test Short term tests Medium test Large effort test (definitive)
We can differentiate ourselves from other education-tech/learning sites/apps

Test: Catalog and analyze the competition's features vs. our own.

If/Then Hypothesis: If we can identify features that would give us an advantage, then we should be able to differentiate ourselves from the competition.
Standard of Proof: Comparison of ourselves against top 3 performers.

Objective: Determine if/how we’re failing users trying to educate themselves.
Test: Conduct a survey asking users about gaps in our existing education experience.
If/Then Hypothesis: If we better understand our ability to deliver education, then we can combine that with knowledge of our competition’s features to deliver a superior & differentiating experience.
Standard of Proof: Large survey sample size & language diversity.
Objective: Determine our ability to execute a unique & compelling education experience
Test: Build a prototype which combines user feedback & industry analysis to deliver a proof of concept on our education product.
If/Then Hypothesis: If the prototype can be built and feedback is positive, then we should be able to achieve long-term dominance in this space thanks to a sustainable advantage.
Standard of Proof: Prototype with minimal implementation of differentiating feature(s) that has data showing high engagement and positive user feedback.
The community or AI is able/willing to generate content at all levels of complexity
Objective: Gauge interest or resistance
Test: Email wikitech-l and wikimedia-l (it has been pointed out that an RFC may be more appropriate for this test) describing the simple-moderate-complex content tagging concept
If/Then Hypothesis: If feedback is positive, then the community will welcome it
Standard of Proof: The ratio of feedback is 10:1 (positive+neutral:negative)
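
The 10:1 ratio above can be made concrete with a small check. This is a sketch only: how replies are classified as positive, neutral or negative is not defined on this page, and treating "no negative replies" as a pass whenever there is at least one other reply is an assumption made here.

```python
def meets_feedback_ratio(
    positive: int,
    neutral: int,
    negative: int,
    required_ratio: int = 10,
) -> bool:
    """Check the (positive + neutral) : negative standard of proof.

    Counts are tallies of list replies; the classification itself is
    assumed to have been done by hand beforehand.
    """
    favourable = positive + neutral
    if negative == 0:
        # Assumption: with no negative replies, any feedback at all passes.
        return favourable > 0
    # Integer comparison avoids floating-point division at the boundary.
    return favourable >= required_ratio * negative
```

For example, 8 positive and 2 neutral replies against 1 negative reply meets the standard exactly, while 5 positive and 4 neutral replies against 1 negative reply falls just short.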
Objective: Determine real world behavior

Objective: Determine if summaries actually help users (by focus group or survey?)

Test: Ask users to write article summaries on a set of articles (potentially new users on mobile)

If/Then Hypothesis: If users will attempt to write an article summary for elementary students 10% of the time when prompted, then users will generally be comfortable writing article summaries

Standard of Proof: This should hold for the top 5 language Wikipedias


Readers are interested in simplified versions of articles (+ more exposure --> more creation)

Objective: Determine if readers and editors will engage with simplified articles if properly exposed

Test: Promote Simple English versions of articles and see if this leads to greater reader satisfaction or comprehension; also see if greater exposure leads to an increase in editing.

If/Then Hypothesis: If users are made aware that a simpler version of the article exists and choose to see/edit the simpler version, then it is worth investing more in supporting this direction.

Standard of proof: Users who see the Simple English option (when promoted) show greater satisfaction, better comprehension, or deeper sessions than users who do not have this option. When Simple English is promoted, edits increase proportionally.

Create a reading experience tailored for users in the Global South

Important note: Reading is committed to expanding access in the Global South. The strategic tests for the Global South strategic option largely aid in understanding the degree of the challenge.

Condition to test Small directional test Medium test Large effort test (definitive)
There is relevant content available in local languages to readers.

Objective: Find out if the assumption that readers prefer or require content in their local languages is correct.

If/Then Hypothesis: If the assumption does not hold, then relevant local-language content will not be a blocking factor.

Test: examine existing data/research

Standard of Proof: TBD

Objective: see if machine translations (articles or infoboxes) can generate content that readers want.
Test: show articles to readers with help of design research.
If/Then Hypothesis: If the articles are good enough, readers will find & enjoy content.
Standard of Proof: TBD
Users know what we are & consider us trustworthy
Objective: Identify awareness level
Test: Percentage who know Wikipedia
If/Then Hypothesis: If 30% of people who have data know of Wikipedia, then Wikipedia is well known
Standard of Proof: Sample size of people who have data must be at least 30
Objective: Identify trustworthiness
Test: Percentage who trust Wikipedia
If/Then Hypothesis: If greater than half of people who have data and know Wikipedia trust it, then Wikipedia is generally trusted
Standard of Proof: Sample size of people who have data and know Wikipedia must be at least 30
Objective: Verify whether advertising improves odds
Test: Media lift
If/Then Hypothesis: If after advertising test numbers increase to 40% and 70%, respectively, then advertising will be successful
Standard of Proof: Sample sizes must be at least 30 in each test for a randomized sample (should not duplicate call)
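
The awareness, trust and media-lift criteria above all have the same shape: a share of respondents compared against a threshold, valid only if the sample has at least 30 people. A minimal sketch of that check follows; the function name and parameters are illustrative assumptions, and returning `None` for an undersized sample is a choice made here, not part of the plan.

```python
from typing import Optional


def proportion_meets(
    successes: int,
    sample_size: int,
    threshold_percent: int,
    min_sample: int = 30,
    strict: bool = False,
) -> Optional[bool]:
    """Compare successes / sample_size against a percentage threshold.

    Returns None when the sample is smaller than `min_sample`, because the
    standard of proof is not met and no conclusion should be drawn.
    Set strict=True for "greater than" criteria such as "greater than half".
    """
    if sample_size < min_sample:
        return None
    # Cross-multiplied integers avoid floating-point issues at the boundary.
    left = successes * 100
    right = threshold_percent * sample_size
    return left > right if strict else left >= right


# Awareness: 12 of 40 respondents with data know Wikipedia (30%).
awareness_ok = proportion_meets(12, 40, 30)
# Trust: 21 of 40 respondents who know Wikipedia trust it (more than half).
trust_ok = proportion_meets(21, 40, 50, strict=True)
```

The same function covers the media-lift test by passing 40 and 70 as thresholds for the post-advertising samples. A sample of 30 gives only a rough estimate, so results near a threshold should be read with caution.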