Analytics/Reportcard/2.0/Requirements
This page is obsolete. It is being retained for archival purposes. It may document extensions or features that are obsolete and/or no longer supported. Do not rely on the information here being up-to-date.
- "Report card 2.0" is running at http://reportcard.wmflabs.org/
Rationale
Both the foundation and the community need to see at a glance where various projects currently stand, and the foundation needs a common reporting infrastructure for the measures we find important. The current report card is a monolithic structure and process that ends with very specific data points being summarized for the board and the ED. Parts of the data in this report can be reused or combined with other reports to produce specific, relevant reports for other interested parties.
Goals
This project seeks to improve the general metrics reporting process in the following ways:
- Data: extraction, transformation and loading are fully automated; there will be zero manual steps.
- Granularity: we start with daily aggregated data.
- Targets: a measurement is benchmarked against a target; this applies to certain reader, mobile and diversity measures.
- API: there will be a simple API so that people can fetch the data and analyze and visualize it themselves.
- Embedded: there will be a MediaWiki extension that allows you to embed a particular chart in a wiki page.
- Modular: the frontend and backend will be very loosely coupled, so that the backend can be replaced in one shot.
- Interactivity: the charts will offer basic interactivity: zoom-in / zoom-out, indexed vs raw count, add other projects to compare, etc.
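Two of the goals above, benchmarking against a target and switching between indexed and raw counts, come down to simple arithmetic. The following is a minimal sketch, not the report card's actual implementation. The function names, the choice of the first data point as the index base, and the base value of 100 are all assumptions made for illustration.

```python
from typing import List


def index_series(values: List[float], base_position: int = 0,
                 scale: float = 100.0) -> List[float]:
    """Rescale a raw series so the value at base_position equals scale.

    Assumption: "indexed" means every point is expressed relative to one
    base observation (here the first one, set to 100).
    """
    base = values[base_position]
    if base == 0:
        raise ValueError("cannot index against a zero base value")
    return [scale * v / base for v in values]


def progress_to_target(measurement: float, target: float) -> float:
    """Return the measurement as a fraction of its target (1.0 = target met)."""
    if target == 0:
        raise ValueError("target must be non-zero")
    return measurement / target


# Made-up numbers, for illustration only.
raw_counts = [200, 250, 300]
print(index_series(raw_counts))
print(progress_to_target(80, 100))
```

Indexing makes projects of very different sizes comparable on one chart, which is why it pairs naturally with the "add other projects to compare" interaction.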
User Groups
1) The monthly metrics meeting, where Erik will use it to give a high-level overview of the state of the community. User Group 1: C-level.
2) The different departments (Community, Global Dev) and teams (Mobile), which want more fine-grained control over their charts and want to be able to 'write' their own queries. User Group 2: Different WMF teams.
3) The community at large (community members, admins, researchers, whoever), who will probably want to download the raw JSON data and do extra things with it. User Group 3: Community (broadly defined).
Devices
We will limit the report card to modern web browsers on desktops and laptops.
Metrics
In total, six themes will be visualized on the dashboard: Reader metrics, Editor metrics, Device metrics, Diversity metrics, Media metrics and API metrics.
Reader-centered measurements
- Pageviews
  - Breakdown by project_language
  - Breakdown by project
- Unique visitors
  - Breakdown by project_language
  - Breakdown by project
- Unique visitors of competing web properties (Google, Facebook)
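The two breakdowns that recur throughout these metrics can be sketched as a single pass over daily aggregated records. This is an illustrative sketch only. The record layout, the reading of project_language as a (project, language) pair, and the sample numbers are all assumptions and do not come from the proposed database design.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical record layout: (date, language, project, count).
Record = Tuple[str, str, str, int]


def breakdown(records: Iterable[Record]) -> Tuple[Dict[Tuple[str, str], int],
                                                  Dict[str, int]]:
    """Sum counts per (project, language) pair and per project."""
    by_project_language: Dict[Tuple[str, str], int] = defaultdict(int)
    by_project: Dict[str, int] = defaultdict(int)
    for _date, language, project, count in records:
        by_project_language[(project, language)] += count
        by_project[project] += count
    return dict(by_project_language), dict(by_project)


# Made-up sample data, for illustration only.
sample = [
    ("2012-01-01", "en", "wikipedia", 10),
    ("2012-01-01", "de", "wikipedia", 4),
    ("2012-01-01", "en", "wiktionary", 3),
    ("2012-01-02", "en", "wikipedia", 5),
]
per_project_language, per_project = breakdown(sample)
print(per_project_language)
print(per_project)
```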
Editor-centered measurements
- Count of edits
  - Breakdown by project_language
  - Breakdown by project
- Count of new editors
  - Breakdown by project_language
  - Breakdown by project
- Count of active editors (5+ edits per month)
  - Breakdown by project_language
  - Breakdown by project
- Count of very active editors (100+ edits per month)
  - Breakdown by project_language
  - Breakdown by project
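The active (5+ edits per month) and very active (100+ edits per month) thresholds above can be applied to a stream of edits roughly as follows. This is a sketch under stated assumptions: the input format (one `(editor, month)` pair per edit) is hypothetical, the thresholds are treated as inclusive, and very active editors are also counted as active. The requirements do not specify any of these details.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

ACTIVE_THRESHOLD = 5         # "5+ edits per month"
VERY_ACTIVE_THRESHOLD = 100  # "100+ edits per month"


def count_editor_activity(
    edits: Iterable[Tuple[str, str]]
) -> Dict[str, Dict[str, int]]:
    """Count active and very active editors per month.

    edits: hypothetical (editor, month) pairs, one pair per edit,
    with month formatted as 'YYYY-MM'.
    """
    edits_per_editor_month = Counter(edits)
    result: Dict[str, Dict[str, int]] = {}
    for (_editor, month), n_edits in edits_per_editor_month.items():
        bucket = result.setdefault(month, {"active": 0, "very_active": 0})
        if n_edits >= ACTIVE_THRESHOLD:
            bucket["active"] += 1
        # Assumption: a very active editor is counted as active as well.
        if n_edits >= VERY_ACTIVE_THRESHOLD:
            bucket["very_active"] += 1
    return result


# Made-up sample: editor A makes 5 edits, B makes 100, C makes 4.
sample = ([("A", "2012-01")] * 5
          + [("B", "2012-01")] * 100
          + [("C", "2012-01")] * 4)
print(count_editor_activity(sample))
```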
Already supported by WikiPride
WikiPride currently supports only Wikipedia, by language and by month. This could be easy to change.
- Count of edits
- Count of active editors
- Count of very active editors
Low priority
- Count of reverted edits
- Count of bot edits
- Logins and/or edit sessions per day/week/month
Article-centric measures
- Count of
  - total articles
  - deleted articles
  - new articles per day
- Breakdown by project_language
- Breakdown by project
Device-centered measurements
- Count of mobile devices
  - Breakdown by manufacturer
  - Breakdown by project_language
  - Breakdown by project
  - Breakdown by geography
  - Breakdown by partner (Wikimedia Zero)
  - Breakdown by official apps (iOS, Android, Symbian, etc.)
Diversity-centered measurement
- Percentage of editors from the Global South
- Editors / readers from India (data at state level and for the following languages: English, Hindi, Kannada, Malayalam, Bengali, Marathi, Gujarati, Tamil and Telugu)
- Editors / readers from Brazil
- Percentage of female editors
Media-centered measurements
(This applies mainly to Commons.)
- Count of binary files (jpg, png, svg, ogg, gif, tiff, pdf, djvu, ogv, mid)
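Counting binary files by type could be sketched as a tally over file names. This is an illustration, not the planned implementation. The helper name and the idea of classifying by file extension are assumptions, and the extension set is copied from the list above without folding variants such as jpeg or tif.

```python
import os
from collections import Counter
from typing import Dict, Iterable

# Extensions taken from the list above. Variants such as "jpeg" or "tif"
# are deliberately not folded in, since the requirements do not mention them.
TRACKED_EXTENSIONS = {"jpg", "png", "svg", "ogg", "gif",
                      "tiff", "pdf", "djvu", "ogv", "mid"}


def count_by_extension(filenames: Iterable[str]) -> Dict[str, int]:
    """Count files per tracked extension, ignoring case and untracked types."""
    counts: Counter = Counter()
    for name in filenames:
        extension = os.path.splitext(name)[1].lstrip(".").lower()
        if extension in TRACKED_EXTENSIONS:
            counts[extension] += 1
    return dict(counts)


# Made-up file names, for illustration only.
print(count_by_extension(["A.JPG", "b.jpg", "c.svg", "d.txt", "noext"]))
```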
API-centered measurements
- Count of different API actions
  - Breakdown by language
  - Breakdown by project
Proposed Database Design
You can find the proposed database design at Analytics/Reportcard/Database_design.
Shelved Features
This is a list of features that we want but that are outside the current scope:
- Raw data: Instead of storing aggregates, we want to be able to store the raw data.
- Granularity: Hourly data is something to consider.
External Links
- Wikimedia_Report_Card_2.0
- Analytics/Wikistats/Analytics Database Feed
- http://strategy.wikimedia.org/wiki/Thread:Talk:Strategic_Plan/Movement_Priorities/%22Community_Health%22_measures#x.22Community_Health.22_measures_5711
- http://strategy.wikimedia.org/wiki/Community_Health/Metrics
- http://etherpad.wikimedia.org/AnalyticsBacklog