Wikipedia Education Program/Brainstorming for RFC for rewrite

General organization

An initial issue is the basic organization of the software that will provide functionality similar to that of the current Education Program (EP) extension. Here's a possibility:

  • A component for visualizing streams of user edits.
  • A component for defining workflows, user groups and roles, via some sort of on-wiki schema.
  • A component that depends on the first two components and provides a workflow and UX tailored to the needs of the Education Program.

Possible synergies

Here are some WMF and MW endeavours that may have some synergies with this work:


Justification for a rewrite

  • The current codebase does not use, but should use, ContentHandler.
  • Experience with the current UX has led us to conclude that substantial changes are desirable.
  • The extension is geared specifically to the needs of the Education Program. However, other activities of Wikipedia and its sister projects have similar needs.
  • The class structure and architecture of the current codebase are not ideal.

To address any one of these issues would require rewriting a substantial portion of the codebase. To address all of them through modifications to the codebase would probably be more work than rewriting from scratch.

We should emphasize that we learned a lot from our experience with the current EP extension, so it's an important antecedent of this new work.

General strategy for a rewrite

Here's the approach we propose:

  • Create general components that form the basis of related features needed by the Education Program and similar endeavors.
  • Meet the specific needs of the Education Program through a thin layer of customization on top of those general components.
  • Set small goals for minimum viable products that meet the needs of the Education Program and contribute to a new system to be organized as described above.
  • As much as possible, replace parts of the current EP extension gradually with new products as they are completed, while continuing to use the parts of the extension that we don't have replacements for.

Since the current extension will remain in production for some time, we'll have to divide resources between its upkeep and the creation of new software. Work on the current extension should be limited to urgent bugfixes, minor improvements and urgent features that we can port to the new software. We should avoid tasks that involve major changes to the current codebase.

Research towards a general component for courses, outreach and projects

This is the component that would provide the basis for replacing most current functionality other than the feed of student edits.

It seems this component might model processes in general (including, but not limited to, relatively straightforward workflows), goals and tasks of various sorts, roles, and associations among all those things and among users, articles and other types of content.
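
As a rough illustration of what modeling a process with tasks and roles could mean, here is a minimal sketch of a rigid, linear workflow. It uses the student onboarding steps mentioned later in these notes (get an account, log in, do the training). The names `Step`, `Workflow`, `next_step` and `complete`, and the username, are hypothetical and chosen only for this example; nothing here reflects an existing extension or an agreed design.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Step:
    """One task in a workflow, performed by users holding a given role."""
    name: str
    role: str


@dataclass
class Workflow:
    """A rigid, linear workflow: steps must be completed in order."""
    title: str
    steps: list[Step]
    # Maps a username to the number of steps that user has completed.
    progress: dict[str, int] = field(default_factory=dict)

    def next_step(self, user: str) -> Optional[Step]:
        """Return the user's next pending step, or None when finished."""
        done = self.progress.get(user, 0)
        return self.steps[done] if done < len(self.steps) else None

    def complete(self, user: str, step_name: str) -> None:
        """Mark the user's next step as done; reject out-of-order steps."""
        pending = self.next_step(user)
        if pending is None or pending.name != step_name:
            raise ValueError(f"{step_name!r} is not the next step for {user}")
        self.progress[user] = self.progress.get(user, 0) + 1


# Hypothetical example data: the student steps named in these notes.
onboarding = Workflow(
    title="Student onboarding",
    steps=[
        Step("get an account", "student"),
        Step("log in", "student"),
        Step("do the training", "student"),
    ],
)
onboarding.complete("ExampleStudent", "get an account")
print(onboarding.next_step("ExampleStudent").name)  # log in
```

A strictly linear model like this would not cover processes that are less rigid, which is one reason to look at more general process modeling approaches.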

If this is the case, then one type of software we could look to for inspiration is business process software. There's a lot of work in this field, including modeling languages and development methodologies. Even though Wikipedia and sister projects are not businesses, some bits of organization theory developed for businesses may be relevant for social movements and volunteer organizations.

Here's a tentative initial reading (or skimming) list:

Here are notes on some of the above.

More notes on a general component for courses, outreach and projects

  • The Flow team's proposal for a workflow description system mainly targets straightforward workflows with relatively simple, rigid definitions, such as article deletion, help requests and sockpuppet investigations. (See the initial proposal, the RFC and these notes.)
  • At least some parts of the EP workflows are similarly rigid and definable, for example, for students: get an account, log in, and do the training. Some courses might like to set up very clear workflows, like: do the training, edit five articles, choose an article to edit, edit it, review someone else's article. Some instructors will prefer to do more stuff on their own, or have a wizard that they can configure themselves.
  • Some aspects of creating a course, project or event will be workflowish, as in: decide what kind of event it is, set the start and end dates...
  • Featured content discussions are another example of a semi-structured workflow.
  • What characteristics do courses, projects and events share?
    • In all cases, there's a common impetus for a group of people to contribute to Wikipedia or one of its sister projects.
    • There is a description of that common impetus.
    • There may be a set of articles to be worked on.
    • People may have articles that they're assigned to work on.
    • They may also have articles that they're assigned to review.
    • There may be a timeframe.
    • There is probably a need to be notified of and respond to events related to the endeavor.
    • People outside the project may want to find out about it, see who's in it, what articles were worked on, etc.
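
The shared characteristics listed above could be captured in a single data structure. The following is a minimal sketch under that assumption; the class name `Endeavor`, its field names, and the example usernames, article title and dates are all hypothetical placeholders. Notifications are left out to keep the sketch short.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class Endeavor:
    """A course, project or event (hypothetical name for the shared model)."""
    kind: str          # e.g. "course", "project" or "event"
    description: str   # describes the common impetus for the group
    participants: set[str] = field(default_factory=set)
    articles: set[str] = field(default_factory=set)
    # Username -> articles that user is assigned to work on or to review.
    work_assignments: dict[str, set[str]] = field(default_factory=dict)
    review_assignments: dict[str, set[str]] = field(default_factory=dict)
    start: Optional[date] = None   # the timeframe is optional
    end: Optional[date] = None

    def assign(self, user: str, article: str, review: bool = False) -> None:
        """Assign an article to a user, either to edit or to review."""
        self.participants.add(user)
        self.articles.add(article)
        target = self.review_assignments if review else self.work_assignments
        target.setdefault(user, set()).add(article)

    def is_active(self, today: date) -> bool:
        """True when today falls inside the timeframe, if one is set."""
        if self.start is not None and today < self.start:
            return False
        if self.end is not None and today > self.end:
            return False
        return True

    def summary(self) -> dict[str, object]:
        """Public overview for people outside the endeavor."""
        return {
            "kind": self.kind,
            "description": self.description,
            "participants": sorted(self.participants),
            "articles": sorted(self.articles),
        }


# Hypothetical example data.
course = Endeavor("course", "Example course on local history",
                  start=date(2014, 1, 13), end=date(2014, 5, 2))
course.assign("StudentA", "Example article")
course.assign("StudentB", "Example article", review=True)
print(course.summary()["participants"])  # ['StudentA', 'StudentB']
```

Keeping work and review assignments as separate mappings mirrors the distinction made in the list; a real design might instead treat both as tasks with different roles.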

Notes on a visualization and edits feed component

This is the component that would provide the basis for replacing the feed of student edits.

There are synergies with Wikimetrics, which also helps analyze the activities of cohorts of users. If we consider the possibility of analyses deeper than those available in the current EP extension, there are also synergies with Limn, which provides easy setup of data visualizations.

What might this component look like?

  • It might be, or eventually morph into, a general on-wiki visualization system with on-wiki configuration options for visualizing user edits, other user data, log entries, data about articles, data about files, or analyses of article content or of any other kind of content.
  • It might be able to overlay graphs of data about user activity (like bytes added, pages created, thanks given, posts to Flow or talk pages, mentions of users) with text and UI elements (including summaries of edits or edit sessions, links to give thanks, buttons for inline diffs).
  • It could provide several ways of defining cohorts, such as CSV files, username entry via a GUI, or the courses, projects or events that users have been associated with.
  • It should provide ways of viewing user activity after a course, project or event has finished, and of comparing the results of such endeavors.
  • It could be configurable by manually editing structured definitions and via a GUI.
  • It should provide a means of displaying visualizations and feeds without exposing users to a smörgåsbord of configuration options (perhaps via templates?).
  • It should interface with Wikimetrics and other standard WMF tools.
  • It should carefully load-balance and queue processing so as not to overload the cluster.
  • It could provide ways of re-using and sharing visualization and processing definitions, including partial definitions (i.e., a definition for just one aspect of a visualization or data transformation).
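
To make the idea of cohort definitions more concrete, here is a minimal sketch of two of the ways mentioned above (a CSV file and a list of usernames such as a GUI might collect), followed by one simple per-user measure, bytes added. The function names and the shape of an edit record (a dict with `user` and `size_diff` keys) are assumptions made for this example, not the format of any existing tool.

```python
import csv
import io
from collections import defaultdict


def cohort_from_csv(text: str) -> set[str]:
    """Read usernames from CSV text, taking the first column of each row."""
    rows = csv.reader(io.StringIO(text))
    return {row[0].strip() for row in rows if row and row[0].strip()}


def cohort_from_usernames(names: list[str]) -> set[str]:
    """Build a cohort from usernames entered by hand, ignoring blanks."""
    return {name.strip() for name in names if name.strip()}


def bytes_added_by_user(edits: list[dict], cohort: set[str]) -> dict[str, int]:
    """Sum positive size changes per cohort member; removals are ignored."""
    totals: dict[str, int] = defaultdict(int)
    for edit in edits:
        if edit["user"] in cohort and edit["size_diff"] > 0:
            totals[edit["user"]] += edit["size_diff"]
    return dict(totals)


# Hypothetical example data.
cohort = cohort_from_csv("StudentA\nStudentB\n")
edits = [
    {"user": "StudentA", "size_diff": 120},
    {"user": "StudentA", "size_diff": -30},
    {"user": "StudentC", "size_diff": 500},
]
print(bytes_added_by_user(edits, cohort))  # {'StudentA': 120}
```

Because every cohort source produces the same kind of value (a set of usernames), the measures do not need to know how a cohort was defined, which would make it easier to add the course, project or event associations as a further source.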

Stuff to check out

Possible outline of RFC

  • Introduction
  • Purpose of this RFC
    • What kind of feedback we're seeking and why
  • About the Wikipedia Education Program
    • History, status
  • About the EP extension
    • History, status, install base, number of users
  • Justification for a rewrite
  • Strategy for a rewrite
  • Proposal
    • Overview of proposed components
    • {{Sections with details on each proposed component}}
  • Comments