Wikimedia Technical Documentation Team/Doc metrics/v0

This page summarizes the first set of metrics (v0) the Tech Docs team tested as part of the Doc metrics project.

Key terms

Collection
A group of technical documentation pages. Different types of collections capture the varying ways in which pages can be related. For example: pages about the same product or software; pages of the same content type (like tutorials, reference, or landing pages); or pages that support a given developer task or user journey.
Page
A unit of technical documentation published on a wiki. Generally synonymous with "doc". Tech docs can also be published as HTML pages, but for the purpose of the v0 metrics test, "page" refers only to technical documentation pages on mediawiki.org.
Doc attribute
An aspect of a documentation page that can be measured or assigned a value. Page content, page metadata, or traffic and revisions data can all contain doc attributes. For the v0 metrics test, we assessed 30 doc attributes.
Metric
An aspect of technical documentation quality that we care about, but can't understand based on a single doc attribute or data point. Metrics represent the numerically encoded relationship between doc attributes and what they indicate about quality. The v0 metrics proposal had 9 metrics, but we tested only 7.
Metric category
The high-level technical documentation quality objective to which a metric corresponds. We don't necessarily care about metrics for their own sake: we care about them as tools for tracking progress towards these (even harder to measure) goals of making our tech docs Relevant, Accurate, Usable, and Findable.


Metrics: proposed vs. tested


Based on the research and analysis completed during Q1 of FY24-25, TBurmeister proposed 9 metrics for a first round (v0) of technical documentation metrics. As we implemented the test and gathered data, we discovered that 2 of the proposed metrics wouldn't be feasible or necessary. So, our test dataset generated outputs for 7 metrics:

Doc metric | Included in test output? | Metric categories (Relevant, Accurate, Usable, Findable)
Docs are succinct: they avoid walls of text; use plain language; support skimming; limit the cognitive burden on the reader (#Succinct) | Included | ✅✅
Use consistent organization and structure (#ConsistentStructure) | Included | ✅✅
Are readable on mobile devices, with all information visible (#MobileDevice) | Excluded | ✅✅
Use consistent format and style (#ConsistentFormat) | Included | ✅✅
Orient the reader within a collection; are organized alongside related pages (#CollectionOrientation) | Included | ✅✅
Freshness (#Freshness) | Included | ✅✅
Are translatable / translated (#Translation) | Excluded |
Align with real developer tasks and problems; include information relevant for technical audiences (#Developers) | Included | ✅✅
Are connected to / findable from / automated by code (#CodeConnection) | Included |

The checkmarks in the table above reflect that a given doc metric can be relevant to more than one metric category, and that some metrics are stronger indicators for one category than for another. More checkmarks mean a stronger correlation between that metric and its categories.

Doc attributes: proposed vs. tested


Our design and research generated a long list of doc attributes that can serve as indicators for the metrics we care about. To generate and test metrics within a constrained timeline, we had to pick a subset of those doc attributes to measure. Our choices were influenced by factors such as the feasibility of capturing a given doc attribute, the uniqueness of an attribute, and the availability of data sources. For details about the implementation challenges for each doc attribute, click the links in the table below.

Doc attribute | Test status | Metric usage
Links to code repos from wiki pages | Done | CodeConnection
Links from code repos to wiki pages | Not done | CollectionOrientation
Heading depth | Done | Succinct
Heading length | Not done | Succinct, Translation, ConsistentFormat
Heading consistency | Done | ConsistentFormat
Heading quantity | Done | ConsistentStructure
Title length | Done | Succinct
Title namespace prefix | Not done | ConsistentStructure, ConsistentFormat
Page sections: Quantity | Done | Succinct
Page sections: See also | Done | CollectionOrientation
Page sections: See also length | Done | Succinct, ConsistentStructure
Page sections: Next steps | Done | CollectionOrientation, ConsistentStructure
Page sections: Intro | Done | ConsistentStructure, Succinct
Navigation: Layout grid | Done | ConsistentFormat, CollectionOrientation, ConsistentStructure (see: design flaws)
Navigation: menu | Done | CollectionOrientation
Navigation: menu length | Not done | Succinct
Navigation: coverage | Done | CollectionOrientation, ConsistentStructure
Table of contents | Not done | ConsistentStructure
Revisions: past month | Done | Developers, Freshness
Revisions: top editor | Done | Developers, Freshness
Status templates | Done | Freshness
Tables or lists | Done | Succinct
Page length | Done | Succinct
Translations | Not done | Usability
Code samples (any) | Done | Developers
Code sample languages | Done | Developers
Code sample automation | Done | CodeConnection, Freshness
Incoming links | Done | Developers
Pageviews by watchers | Done | Developers, Freshness
Maintainer contact info | Done | Developers

Even the list above doesn't capture the full number of doc attributes we considered using, or could use in a future implementation. You can peruse the even larger list in a Google spreadsheet.
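
To make the idea of a doc attribute more concrete, here is a minimal sketch of how a few of the attributes above (heading quantity, heading depth, page length) could be derived from a page's raw wikitext. It is illustrative only and is not the script used for the v0 test; the function and attribute names are hypothetical.

```python
import re

def extract_attributes(wikitext: str) -> dict:
    """Derive a few illustrative doc attributes from raw wikitext.

    Simplified sketch for illustration; not the v0 test implementation.
    """
    # MediaWiki headings look like "== Heading ==" (2-6 equals signs).
    headings = re.findall(r"^(={2,6})\s*(.+?)\s*\1\s*$", wikitext, flags=re.MULTILINE)
    heading_levels = [len(eq) for eq, _text in headings]
    return {
        "heading_quantity": len(headings),                # how many headings the page has
        "heading_depth": max(heading_levels, default=0),  # deepest heading level used
        "page_length": len(wikitext),                     # raw length in characters
    }
```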

Outcomes of metrics testing


Collection-level metrics


Our test dataset included 140 pages of technical documentation on mediawiki.org. We identified these pages by first defining documentation collections, then gathering together pages for each collection as PagePiles. Since we defined collections of varying types, some pages appear in more than one collection.
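
As an illustration, a collection's page titles can be pulled from its PagePile roughly like this (a sketch assuming the PagePile tool's get_data API; the function name and response handling are hypothetical and not part of the v0 test tooling):

```python
import requests

def get_pagepile_titles(pile_id: int) -> list[str]:
    """Fetch the page titles in a PagePile by its ID (illustrative sketch).

    Assumes the PagePile get_data action returns JSON with a "pages" list;
    check https://pagepile.toolforge.org for current API details.
    """
    resp = requests.get(
        "https://pagepile.toolforge.org/api.php",
        params={"id": pile_id, "action": "get_data", "format": "json", "doit": 1},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json().get("pages", [])

# e.g. the ResourceLoader collection pile listed in the table below
titles = get_pagepile_titles(62140)
```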

The test dataset page counts differ from the full collection counts because the collection PagePiles include translation pages, redirects, or auto-generated content that we didn't have the capacity to include in the test. In the case of the ResourceLoader collection, this is a small design flaw in the test, because it means we excluded content that would have influenced the overall metric scores for the collection.

Collection (PagePile ID) | # pages in collection | # pages included in test | Collection type
ResourceLoader (62140) | 47 | 16 | Specific product/technology
REST API docs (61824) | 8 | 5 | Specific product/technology
Developing extensions (61822) | 47 | 47 | Developer task/workflow
MediaWiki developer tutorials (61823) | 55 | 44 | Doc type
Local development setup (61854) | 55 | 33 | Developer task/workflow

The metric scores for each collection are initially calculated as averages of the output scores for each page in the collection. Each metric uses different doc attributes, with varying weights, to calculate a page's output score.
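
A rough sketch of that two-step calculation, with hypothetical attribute names and weights (the actual attributes and weights are documented in the Doc metrics reference):

```python
# Hypothetical weights for one metric; see the Doc metrics reference
# for the real attribute names and weights used in the v0 test.
SUCCINCT_WEIGHTS = {
    "page_length": -0.5,
    "heading_quantity": 0.3,
    "tables_or_lists": 0.2,
}

def page_metric_score(attributes: dict, weights: dict) -> float:
    """Weighted combination of a page's doc attribute values."""
    return sum(weight * attributes.get(name, 0) for name, weight in weights.items())

def collection_metric_score(pages: list[dict], weights: dict) -> float:
    """Raw collection score: average of the per-page scores."""
    return sum(page_metric_score(page, weights) for page in pages) / len(pages)
```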

To enable analysis and testing of the metrics output, we then standardized the raw output scores to generate the final collection score for each metric:

Collection | CodeConnection | CollectionOrientation | Succinct | ConsistentFormat | Developers | ConsistentStructure | Freshness
Developing extensions (61822) | 1 | 0 | -1 | 0 | 0 | 0 | 1
Local dev setup (61854) | 0 | -2 | 0 | 0 | -2 | -1 | -1
MW tutorials (61823) | 0 | 0 | -1 | 1 | -1 | 0 | -1
ResourceLoader (62140) | 0 | 1 | -1 | -1 | 1 | -1 | 0
REST_API (61824) | -1 | 1 | 2 | 0 | 1 | 2 | 1
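
The standardization step is the conventional standard-score calculation: each collection's raw score is expressed relative to the mean and spread of that metric's raw scores across all collections. A minimal sketch, assuming the usual (x - mean) / standard deviation formula (the values in the table appear rounded; see the Doc metrics reference for the exact computation used in the test):

```python
from statistics import mean, stdev

def standardize(raw_scores: dict[str, float]) -> dict[str, float]:
    """Convert raw collection scores for one metric into standard scores.

    Sketch only; the v0 test's exact rounding is documented in the
    Doc metrics reference.
    """
    mu = mean(raw_scores.values())
    sigma = stdev(raw_scores.values())
    return {name: (score - mu) / sigma for name, score in raw_scores.items()}
```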

Page-level metrics


To drill down within each collection, or to see raw metrics scores at the collection and page level, follow the Doc metrics user guide.

User documentation

  • Doc metrics user guide - how to explore and understand the data from this test.
  • Doc metrics reference - explains how metrics are computed and how doc attributes are weighted in the scores; covers the fields in both the input and output datasets.

Outcomes: Metrics prototype and assessment
