Wikimedia Technical Documentation Team/Doc metrics/User guide
This page explains how to use the test doc metrics (v0) to identify tech docs pages and collections to improve, and how to improve them. To learn more about the process we used to define, design, and implement these metrics, see Doc_metrics/v0.
Access metrics data
Because this is an experiment, we don't have a fancy dashboard for you to use to explore the data. Your options are:
- Best option: prepared spreadsheets with collection-level scores, raw data per page and per collection, and a pre-made pivot table that you can use to browse pages and scores within a collection and across metric categories.
- Self-service option: raw CSV, available for download from GitLab. This doesn't have standardized metrics scores, so it may be harder to interpret the data. However, this simpler format may make it easier for you to explore the data using your preferred analysis techniques or software (see the sketch below).
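If you choose the self-service option, something like the following can get you started. This is a minimal sketch assuming you work in Python with pandas; the file name is a placeholder for whatever the GitLab export is actually called.

```python
# Minimal sketch for exploring the raw CSV export (self-service option).
# "doc_metrics_v0_raw.csv" is a placeholder file name; download the actual
# CSV from GitLab and adjust the path before running.
import pandas as pd

df = pd.read_csv("doc_metrics_v0_raw.csv")

print(df.shape)             # expect roughly 140 rows, one per page
print(df.columns.tolist())  # check the real column names before going further
```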
Understand key terms
- Collection
- A group of technical documentation pages. Different types of collections capture the varying ways in which pages can be related. For example: pages about the same product or software; pages of the same content type (like tutorials, reference, or landing pages); or pages that support a given developer task or user journey.
- Page
- A unit of technical documentation published on a wiki. Generally synonymous with "doc". Tech docs can also be published as HTML pages, but for the purpose of the v0 metrics test, "page" refers only to technical documentation pages on mediawiki.org.
- Doc attribute
- An aspect of a documentation page that can be measured or assigned a value. Page content, page metadata, or traffic and revisions data can all contain doc attributes. For the v0 metrics test, we assessed 30 doc attributes.
- Metric
- An aspect of technical documentation quality that we care about, but can't understand based on a single doc attribute or data point. Metrics represent the numerically encoded relationship between a doc attribute and its meaning. The v0 metrics proposal had 9 metrics, but we tested only 7.
- Metric category
- The high-level technical documentation quality objective to which a metric corresponds. We don't necessarily care about metrics for their own sake: we care about them as tools for tracking progress towards these (even harder to measure) goals of making our tech docs Relevant, Accurate, Usable, and Findable.
Understand constraints
This data was generated manually by humans who looked at pages and/or used existing data sources to gather the input data.
Our test dataset included 140 pages of technical documentation on mediawiki.org. We gathered data manually because:
- The majority of the doc attributes we wanted to assess are not currently available in existing data sources.
- Our current goal is to assess whether having such data for metrics calculation would be useful.
The doc collections, and the pages within them, are manually curated.
- We used the PagePile tool to define and store the list of pages within a collection. Some PagePiles include docs outside of mediawiki.org, but those docs weren't included in this metrics test.
We only assessed content on mediawiki.org.
Interpret metrics scores
Don't compare raw scores across metrics
The only scores that you can safely compare across metrics are the standardized scores, which are provided only at the collection level, not for individual pages.
You can't compare the raw scores because each of the metrics is calculated based on a different set of doc attributes, with varying weights. The weights attempt to capture how strong an indicator a given doc attribute may be for that specific metric. For example:
- If a page uses a list or a table, those elements can help make the page more succinct, since the page is less likely to be prose-heavy. However, whether a list or table is a valid way to format page content varies significantly, and a page could still have walls of text around the lists or tables. So, this is a weak indicator for the "Succinct" metric.
- Consequently, in the metrics calculations, if a list or table is present, the page gets a small score increase (10 points), but pages don't get penalized for not having a list or a table, and they get a larger score increase for other, stronger indicators, like page size.
For some attributes, the weights are based on value ranges ("bins"), which were created based on benchmarks from the test dataset. This was necessary to make sense of large ranges of real-valued inputs, where a given range of values may have variable meaning, both for how we interpret the doc attribute itself and for how that doc attribute's values influence the metrics that use it. The weighting for each metric and its constituent doc attributes is documented on the Reference page.
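To make the weighting concrete, here is an illustrative sketch of how a weak, reward-only attribute and a binned attribute might combine into a metric total. The bin boundaries and most point values below are invented for illustration (only the 10-point list/table reward comes from the example above); the real weights and bins are documented on the Reference page.

```python
# Illustrative sketch only: how a weak reward-only attribute and a binned
# attribute might feed a metric total. Bin boundaries and point values
# (other than the 10-point list/table reward) are invented for illustration.

def list_or_table_score(has_list_or_table: bool) -> int:
    # Weak indicator: small reward if present, no penalty if absent.
    return 10 if has_list_or_table else 0

def page_size_score(size_bytes: int) -> int:
    # Stronger indicator, scored via value ranges ("bins").
    if size_bytes < 1500:        # hypothetical bin: probably too short
        return 0
    elif size_bytes < 30000:     # hypothetical bin: a comfortable length
        return 40
    else:                        # hypothetical bin: getting long
        return 20

def succinct_total(has_list_or_table: bool, size_bytes: int) -> int:
    # A metric total is the sum of its weighted attribute scores.
    return list_or_table_score(has_list_or_table) + page_size_score(size_bytes)

print(succinct_total(True, 12_000))   # 10 + 40 = 50
print(succinct_total(False, 45_000))  # 0 + 20 = 20
```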
Scores favor reward over penalty
The best practices, formatting options, and ideal content for any given page of technical documentation are often ambiguous, subjective, and context-dependent. The metrics calculations account for this by rewarding pages with score increases if they have certain attributes, but not penalizing pages for lacking those attributes.
Example:
- If a page has any sections differentiated by headings, the ConsistentStructure metric score increases by 50 points. However, pages aren't penalized for not having page sections.
In general, the metrics are fairly conservative about negative scores. The only doc attributes that may reduce a metric's score are:
- See Also section contains more than 6 links
- Page size in bytes: landing pages are penalized for being too long or too short; other doc types are only penalized for being so short that the content likely doesn't need its own page.
- Number of page sections: only pages that are neither landing pages nor reference pages are penalized, and only if they have more than 20 sections.
- Incoming links and redirects from same wiki: pages are only penalized if there are 0 incoming links, which means they're an "orphan page".
- More than 50% of edits made by a single top editor, indicating the page's maintenance is at risk due to having a single point of failure (SPOF).
All of the above is documented in more detail on the Reference page.
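As a rough illustration of how limited the penalties are, the sketch below collects most of the penalty conditions listed above into a single function (page size is omitted for brevity). In the real calculations each condition feeds a different metric, and the penalty amounts (as opposed to the thresholds, which are stated above) are invented; see the Reference page for the actual values.

```python
# Rough illustration of the reward-over-penalty principle: only a handful of
# conditions ever subtract points. Penalty amounts are invented for
# illustration; in the real calculations each condition belongs to a
# different metric (see the Reference page).

def penalty_adjustments(incoming_links: int, see_also_links: int,
                        top_editor_share: float, num_sections: int,
                        is_landing_or_reference: bool) -> int:
    penalty = 0
    if incoming_links == 0:
        penalty -= 15            # hypothetical amount: orphan page
    if see_also_links > 6:
        penalty -= 10            # hypothetical amount: overloaded See Also section
    if top_editor_share > 0.5:
        penalty -= 10            # hypothetical amount: single-point-of-failure editor
    if not is_landing_or_reference and num_sections > 20:
        penalty -= 10            # hypothetical amount: too many sections
    return penalty

# Everything else in the calculations only ever adds points.
print(penalty_adjustments(0, 8, 0.7, 25, False))  # -45
```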
Zero isn't necessarily a bad score
Pages may score zero for a given doc attribute or metric total for several reasons:
- Some doc attributes were only assessed for certain doc types. So, for example, if a page was not coded as a landing page, we didn't assess whether it had maintainer info on the page.
- Due to the two factors mentioned above (weighting of doc attribute values and preference for reward over penalty), many doc attributes may score 0 if they fall in the middle of a range of weighted values, or if the absence of a given doc attribute can't be consistently interpreted as negative.
Option 1: Explore data by collection
- Choose a collection to explore. For this metrics test, we defined 5 collections of 3 different types. 6 of the 140 pages are members of more than one collection.
- View the standardized score for your collection on the "Standardized scores by collection" tab. These scores are normalized across metrics, so you can use them to identify how your collection compares to others, and where your collection scored lower than average.
- To explore which pages within your collection influenced its score for a given metric, you have two options (if you're working from the raw CSV instead, see the sketch after this list):
- Use the "Raw scores by page and collection" tab. This tab contains all the data for all pages, nested by collection.
- Use the tab labeled with the name of your collection. This tab contains all the data only for pages in a single collection.
- Follow the instructions below to identify improvements for pages.
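For those working from the raw CSV rather than the prepared spreadsheets, a minimal pandas sketch like the following can stand in for step 3. The file name, the "Collection" and "Page" column labels, and the collection name are placeholders; verify them against the actual data before running.

```python
# Minimal sketch, assuming the per-page data is available as CSV. The file
# name, the "Collection" and "Page" column labels, and the collection name
# are placeholders; verify them against the actual data before running.
import pandas as pd

df = pd.read_csv("doc_metrics_v0_raw.csv")

mine = df[df["Collection"] == "My collection"]   # hypothetical collection name
totals = [c for c in mine.columns if c.startswith("Total score for")]

# Lowest-scoring pages in the collection, with their per-metric totals.
print(mine.assign(overall=mine[totals].sum(axis=1))
          .sort_values("overall")[["Page", "overall"] + totals]
          .head(10))
```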
Option 2: Explore data by metric
Background info: to learn where these metrics came from, see Doc_metrics/v0.
- Choose a metric to explore.
- View the "Standardized scores by collection" tab to see which collections scored highest/lowest for the metric you care about. Higher scores are better. These scores are normalized across metrics, so you can use them to assess how the collections scored for your metric vs. the other metrics. Be aware that some of the metrics, like CodeConnection, have only a couple inputs, while others have 5-7.
- View the "All data by page" tab. To see which pages scored highest or lowest, sort the data by the "Total" score column for the metric you care about. Higher scores are better.
- Follow the instructions below to identify improvements for pages.
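For the raw CSV, the sorting step above can be done like this. The file name, the "Page" column label, and the exact metric column label are assumptions; match them to the actual headers.

```python
# Minimal sketch: rank all pages by one metric's Total column. The file name,
# the "Page" column label, and the metric column label are assumptions.
import pandas as pd

df = pd.read_csv("doc_metrics_v0_raw.csv")
metric_col = "Total score for Freshness"    # pick the metric you care about

# Lowest scores first: these are the pages most likely to need attention.
print(df.sort_values(metric_col)[["Page", metric_col]].head(10))
```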
Identify improvements for pages
Whether you're exploring the data by collection or by metric, the scores are meant to help you identify pages that could be improved, and which types of changes could most improve them. The sections above explained how to drill down to the page level from a collection or metric. Follow this procedure to identify potential doc improvements when you're looking at page-level metrics scores:
If you're not specifically interested in one type of metric:
- Sort or filter the page-level data by "Sum of metric Totals". This shows you the pages that scored highest (best) or lowest (worst) based on the sum of their scores across all metrics. (A raw-CSV version of this step is sketched after this list.)
- Review the Total scores the page received for each of the 7 metrics. In the output data spreadsheet, the metric scores are in columns labeled "Total score for [metric]":
- Total score for Succinct: column K
- Total score for CodeConnection: column N
- Total score for ConsistentFormat: column Q
- Total score for CollectionOrientation: column U
- Total score for consistentStructure: column AA
- Total score for developers: column AI
- Total score for Freshness: column AO
- Choose a metric for which the page had a low score, and investigate which doc attributes impacted that metric's score. Use the Reference page to help you with this.
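A raw-CSV version of the sorting step above might look like the following. The file name and the "Page" column label are assumptions; the metric-total columns are matched by the "Total score for" prefix described above.

```python
# Minimal sketch: rank pages by the sum of their metric totals, then show
# which metric is weakest for each low-scoring page. The file name and the
# "Page" column label are assumptions; verify them against the actual data.
import pandas as pd

df = pd.read_csv("doc_metrics_v0_raw.csv")
totals = [c for c in df.columns if c.startswith("Total score for")]

df["Sum of metric Totals"] = df[totals].sum(axis=1)   # recompute if not already present
worst = df.sort_values("Sum of metric Totals").head(10)

# For each low-scoring page, the metric with the smallest Total is a good
# place to start investigating (see the Reference page for its attributes).
print(worst[["Page", "Sum of metric Totals"]].assign(
    weakest_metric=worst[totals].idxmin(axis=1)))
```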
When you have a specific metric you want to investigate:
Use the columns in the page-level data to determine which doc attributes influenced the Total metric score. For each page, the data includes scores for each doc attribute, followed by the total score for the metric based on those attributes. Refer to the Reference page for details of how the doc attribute values impact the metrics calculations. A raw-CSV sketch of this drill-down follows the example below.
- For example: if a page has a low "Developers" total metric score (in column AI), your next step would be to look at the columns preceding that: "developers score from codesamples" (column AB), "developers score from codemulti" (column AC), "developers score from maintainer" (column AD), etc. You don't have to rely only on the field names in the spreadsheet: use the Reference page to identify the names of the columns for each metric and get more info about their role in the metric calculation.
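For the drill-down itself, a sketch like the following lists the attribute columns behind one metric for a single page. The file name, the page title, and the "Page" column label are placeholders, and the column matching assumes the "developers score from ..." labels quoted above; use the Reference page to confirm the real column names.

```python
# Minimal sketch: inspect the doc-attribute scores feeding the Developers
# metric for one page. The file name, page title, and "Page" column label are
# placeholders; confirm the attribute column names on the Reference page.
import pandas as pd

df = pd.read_csv("doc_metrics_v0_raw.csv")

attr_cols = [c for c in df.columns if c.startswith("developers score from")]
page = df[df["Page"] == "Example page title"]   # hypothetical page title

# Transposed so each attribute score appears on its own row.
print(page[attr_cols + ["Total score for developers"]].T)
```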
See also
- Doc metrics reference - explains the metrics computations, provides in-depth details of how attributes are weighted in scores, and lists the fields in both the input and output datasets
- How we defined mappings between doc attributes and metrics