Extension:PageTriage/ban

MediaWiki extensions manual
PageTriage
Release status: stable
Implementation Special page, User interface
Description Facilitates reviewing and approving new pages
Author(s) Ryan Kaldari, Benny Situ
Latest version 0.3.0
Compatibility policy Snapshots releases along with MediaWiki. Master is not backward compatible.
MediaWiki >= 1.38.0
Database changes Yes
Tables pagetriage_log
pagetriage_page
pagetriage_page_tags
pagetriage_tags
License MIT License
Example Special:NewPagesFeed on the English Wikipedia
Parameters
  • $wgPageTriageMaxAge
  • $wgTalkPageNoteTemplate
  • $wgPageTriageStickyStatsNav
  • $wgPageTriageNamespaces
  • $wgPageTriageMarkPatrolledLinkExpiry
  • $wgPageTriageDeletionTagsOptionsMessages
  • $wgPageTriageTagsOptionsMessages
  • $wgPageTriageEnabledEchoEvents
  • $wgPageTriageEnableOresFilters
  • $wgPageTriageInfiniteScrolling
  • $wgPageTriageEnableCopyvio
  • $wgPageTriageProjectLink
  • $wgPageTriageCurationModules
  • $wgPageTriageLearnMoreUrl
  • $wgPageTriageFeedbackUrl
  • $wgPageTriageDraftNamespaceId
  • $wgPageTriageMaxNoIndexAge
  • $wgPageTriageDeletionTagsOptionsContentLanguageMessages
  • $wgPageTriageRedirectAutoreviewAge
  • $wgPtTemplatePath
  • $wgPageTriageEnableCurationToolbar
  • $wgPageTriageEnableEnglishWikipediaFeatures
  • $wgPageTriagePagesPerRequest
  • $wgPageTriageNoIndexUnreviewedNewArticles
  • $wgPageTriageStickyControlNav

PageTriage is an extension that aims to provide a feature-rich interface for triaging newly created articles. It is intended to replace the core new page patrol function while adding functionality for reviewing, tagging, and improving new articles. It adds a Special:NewPagesFeed page, and a page curation toolbar on new pages for users with the 'patrol' permission. It was developed by the Wikimedia Foundation's Features Engineering team. For additional details see Page Curation.

Note that some of the configuration and code is specific to the English-language Wikipedia's workflows, and as currently built the extension is very difficult to internationalize. (See Phabricator:T50552.)

The extension can be retrieved directly from Git:

  • Browse code
  • Some extensions have tags for stable releases.
  • Each branch is associated with a past MediaWiki release. There is also a "master" branch containing the latest alpha version (might require an alpha version of MediaWiki).

Extract the snapshot and place it in the extensions/PageTriage/ directory of your MediaWiki installation.

If you are familiar with Git and have shell access to your server, you can also obtain the extension as follows:

cd extensions/
git clone https://gerrit.wikimedia.org/r/mediawiki/extensions/PageTriage.git

Installation

  • Download and place the file(s) in a directory called PageTriage in your extensions/ folder.
  • Add the following code at the bottom of your LocalSettings.php:
    wfLoadExtension( 'PageTriage' );
    // These two settings are optional, and will enable the Articles-for-Creation mode.
    $wgExtraNamespaces[118] = 'Draft';
    $wgPageTriageDraftNamespaceId = 118;
    
  • Run the update script which will automatically create the necessary database tables that this extension needs.
  • Done – Navigate to Special:Version on your wiki to verify that the extension is successfully installed.

To users running MediaWiki 1.24 or earlier:

The instructions above describe the new way of installing this extension using wfLoadExtension(). If you need to install this extension on these earlier versions, use the following instead of wfLoadExtension( 'PageTriage' );:

require_once "$IP/extensions/PageTriage/PageTriage.php";

Cron jobs

To make sure old articles are eventually taken out of the new pages feed, you should set up a cron job to run the following file every 48 hours: cron/updatePageTriageQueue.php

Checking for successful install

To actually see the extension working:

  • Add a new stub page as an anonymous user.

The new page should appear in Special:NewPagesFeed, flagged as "No categories", "Orphan", etc. To see the page curation toolbar:

  • Log in as a user in the 'sysop' group, or create a group with the 'patrol' permission, add a user to that group, and log in as that user.
  • Now you should see a "Review" button next to the new page.
  • Click this and you should see the page curation toolbar on the new page.

Extension configuration

The extension is based on the 'patrol' right. For more information about configuring patrolling, see Manual:Patrolling .

The following configuration variables can be set from your LocalSettings.php file:

Variable | Default | Description
$wgPageTriageEnableCurationToolbar | true | Set to false to disable the curation toolbar.
$wgPageTriageInfiniteScrolling | true | Whether or not to use infinite scrolling in the new pages feed.
$wgPageTriageMaxAge | 90 | The age (in days) at which PageTriage allows unreviewed articles to become indexed by search engines (if $wgPageTriageNoIndexUnreviewedNewArticles is true).
$wgPageTriageNamespaces | NS_MAIN, NS_USER | The namespaces that PageTriage is active in.
$wgPageTriageNoIndexUnreviewedNewArticles | false | Set this to true if new, unreviewed articles should be set to noindex. In other words, if they should not be indexed by search engines until they are reviewed.

See extension.json for the full list of config variables.

On-wiki configuration

It is possible to configure much of PageTriage on-wiki via the pages MediaWiki:PageTriageExternalDeletionTagsOptions.js and MediaWiki:PageTriageExternalTagsOptions.js, although the structure of the configuration may change in the future (to better accommodate wikis besides English Wikipedia).

You can get a general idea of how the configuration works by looking at the following:

Toolbar section | Default file | English Wikipedia customization
Add tags | modules/ext.pageTriage.defaultTagsOptions/ext.pageTriage.defaultTagsOptions.js | en:MediaWiki:PageTriageExternalTagsOptions.js
Nominate for deletion | modules/ext.pageTriage.defaultDeletionTagsOptions/ext.pageTriage.defaultDeletionTagsOptions.js | en:MediaWiki:PageTriageExternalDeletionTagsOptions.js

Both of these files operate in much the same way.

There are two top-level jQuery variables that define the curation templates that are listed in the curation toolbar under the "add tags" and "nominate for deletion" buttons. These are:

$.pageTriageTagsOptions = {};
$.pageTriageDeletionTagsOptions = { Main: {}, User: {} };

The 'Main' and 'User' refer to the namespace of the page being curated. Each sub-item in the three sets above defines the tabs shown at the left side of the toolbar, and has the following form:

{
    label: 'Short title',
    desc: 'A longer description.', // Text only, no HTML or Wikitext markup
    multiple: false, // Whether more than one of the tags can be selected at once.
    tags: { tag1: {}, tag2: {} }
}

Then the actual templates that are listed are defined under the above tags variable. Each deletion template has the following form:

{
    tag: 'Actual_template_name', // Without the 'Template:' prefix.
    label: 'Friendly template title',
    desc: 'A longer description.', // Text only, no HTML or Wikitext markup
    code: '',
    params: {},
    anchor: '',
    talkpagenotiftopictitle: 'message-name', // The message name (e.g. pagetriage-del-tags-speedy-deletion-nomination-notify-topic-title) used as the section/topic title when posting to the editing user's talk page.  Usually, you can reuse one of the existing messages (currently pagetriage-del-tags-speedy-deletion-nomination-notify-topic-title, pagetriage-del-tags-prod-notify-topic-title, pagetriage-del-tags-xfd-notify-topic-title).  If you need a new one, file a task so $wgPageTriageDeletionTagsOptionsContentLanguageMessages or the PageTriage repository can be updated.
    talkpagenotiftpl: 'Template_name' // The template that will be added to the editing user's talk page, not including the talk page heading (handled by talkpagenotiftopictitle).
}

At the moment, some tags must be present:

  1. $.pageTriageDeletionTagsOptions.Main.xfd.tags.articlefordeletion
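
Since a missing required tag may only surface once the toolbar loads, it can help to check a configuration object before saving it. The following is a minimal sketch and not part of PageTriage: findMissingRequiredTags and REQUIRED_TAG_PATHS are hypothetical names, and the list contains only the one path named above.

```javascript
// Hypothetical helper, not part of PageTriage: reports required tag
// definitions that are missing from a deletion tags configuration object.

// Paths are relative to $.pageTriageDeletionTagsOptions. Only the path
// named in this documentation is listed here.
const REQUIRED_TAG_PATHS = [
	[ 'Main', 'xfd', 'tags', 'articlefordeletion' ]
];

function findMissingRequiredTags( deletionTagsOptions ) {
	return REQUIRED_TAG_PATHS.filter( function ( path ) {
		let node = deletionTagsOptions;
		for ( const key of path ) {
			if ( node === null || typeof node !== 'object' || !( key in node ) ) {
				return true; // The path stops early, so the tag is missing.
			}
			node = node[ key ];
		}
		return false;
	} ).map( function ( path ) {
		return path.join( '.' );
	} );
}

// A plain object stands in for $.pageTriageDeletionTagsOptions here.
const exampleOptions = {
	Main: { xfd: { label: 'Deletion', tags: { articlefordeletion: { tag: 'delete' } } } },
	User: {}
};
console.log( findMissingRequiredTags( exampleOptions ) );
console.log( findMissingRequiredTags( { Main: {}, User: {} } ) );
```

The first call finds nothing missing; the second reports the Main.xfd.tags.articlefordeletion path.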

Example

If you don't want to use any of the built-in deletion templates (which can be imported from NewPagesFeed_Templates.xml), you can replace them all with a single one by adding the following at the bottom of your MediaWiki:PageTriageExternalDeletionTagsOptions.js page:

var deletionSection = {
    label: 'Deletion',
    desc: 'Nominate for deletion.',
    multiple: false,
    tags: {
        articlefordeletion: {
            tag: 'delete',
            label: 'Delete',
            desc: 'Nominate this page for deletion.',
            code: '',
            params: {},
            anchor: '',
            talkpagenotiftopictitle: 'pagetriage-del-tags-xfd-notify-topic-title',
            talkpagenotiftpl: 'Deletion notification'
        }
    }
};
$.pageTriageDeletionTagsOptions = { Main: { xfd: deletionSection }, User: { xfd: deletionSection } };

Client-side hooks

PageTriage provides a specialized action queue system to allow other scripts and gadgets to integrate with it. This is similar to mw.hook except that it uses promises. This is done using the mw.pageTriage.actionQueue module. See the comments in the source code for documentation on how the system works.

The actionQueue module is available after the mw.hook ext.pageTriage.toolbar.ready fires. PageTriage will give the action queue handler an Object with the following data, in addition to other data as noted below:

  • pageid: ID of the page being reviewed.
  • title: Title of the page, including namespace.
  • reviewer: Username of who is using PageTriage.
  • creator: Username of the creator of the page.
  • reviewed: Whether or not the page is currently or will be marked as reviewed.

Available actions

  • delete: Fired when the reviewer tags a page for deletion. The data given to the handler also includes:
    • tags: An object of all the templates added to the page. The keys are the template title, and the values are an object of metadata, including things like the speedy deletion code.
  • mark: Fired when the review status of a page is changed. Also includes:
    • note: The personal message the reviewer added for the creator of the page. This may be blank.
  • tags: Fired when maintenance tags are added to the page. Also includes:
    • tags: An array of the titles of all templates that were added to the page.
    • note: The personal message the reviewer added for the creator of the page. This may be blank.
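
Note that the tags field has a different shape depending on the action: an object keyed by template title for delete, and an array of titles for tags. The sketch below shows one way a handler could normalize this. getTemplateTitles and summarizeAction are hypothetical helpers, and the metadata stored under each deletion tag (here a code field) is an assumed shape.

```javascript
// Hypothetical helper: returns the template titles carried by an action's
// data object, following the shapes described in the list above.
function getTemplateTitles( action, data ) {
	if ( action === 'delete' ) {
		// 'delete' passes an object keyed by template title.
		return Object.keys( data.tags || {} );
	}
	if ( action === 'tags' ) {
		// 'tags' passes an array of template titles.
		return ( data.tags || [] ).slice();
	}
	// 'mark' carries no templates.
	return [];
}

// Builds a one-line summary that a handler could log.
function summarizeAction( action, data ) {
	const titles = getTemplateTitles( action, data );
	const suffix = titles.length ? ' (' + titles.join( ', ' ) + ')' : '';
	return data.reviewer + ' ran "' + action + '" on ' + data.title + suffix;
}

// Sample data; the metadata inside each tag is an assumed shape.
console.log( summarizeAction( 'delete', {
	pageid: 123,
	title: 'Example page',
	reviewer: 'ExampleReviewer',
	creator: 'ExampleCreator',
	reviewed: false,
	tags: { 'Db-g11': { code: 'G11' } }
} ) );
```

Inside a handler registered for the delete action, the same call would be summarizeAction( 'delete', data ).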

Example

To use the action queue, register a function to be run when one of the actions above is fired. PageTriage will wait for any asynchronous code to complete before doing anything else, such as refreshing the page. For example, to edit Sandbox after a page has been marked as reviewed, you could use:

$( function () {
	// You must first listen for the ext.pageTriage.toolbar.ready event using mw.hook, to ensure your handler is registered at the right time.
	mw.hook( 'ext.pageTriage.toolbar.ready' ).add( function ( queue ) {
		// Listen for the 'mark' action.
		queue.add( 'mark', function ( data ) {
			return new mw.Api().edit( 'Sandbox', function ( revision ) {
				// Replace 'foo' with the note the reviewer left.
				return revision.content.replace( 'foo', data.note );
			} );
		} );
	} );
} );

API

PageTriage adds the following API endpoints which can be used:

API Description Type Triggering action
pagetriageaction Marks a page as reviewed or unreviewed, and logs the action in Special:Log. Write
  • Using the Page Curation toolbar to mark a page as reviewed
  • Using the Page Curation toolbar to mark a page as unreviewed
pagetriagelist Retrieves the list of pages in the queue and each page's metadata, including its reviewed status. To retrieve one page, you must provide the page_id. To retrieve multiple pages, you must select one of showreviewed/showunreviewed, and one of showredirs/showdeleted/showothers, or no pages will be returned. Read
  • Loading the Page Curation toolbar (automatically loaded if you have the patrol userright and view a page that is unpatrolled or recently patrolled)
  • Viewing Special:NewPagesFeed (provides the list of articles)
pagetriagestats Retrieves stats about the number of pages in the queue and the top reviewers. Read
  • Viewing Special:NewPagesFeed (provides the total articles in the header, and provides the stats in the footer)
pagetriagetagcopyvio Marks an article as a potential copyright violation, and logs the action in Special:Log. Write
  • Marking as a copyright violation by a bot with the copyviobot userright
pagetriagetagging Adds clean-up tags or deletion templates to a page, and logs the action in Special:Log. Write
  • Using the Page Curation toolbar to place a maintenance tag on an article
  • Using the Page Curation toolbar to place a deletion tag on an article
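
The pagetriagelist rule about filter flags is easy to get wrong, because a request without the right flags simply returns no pages. The following sketch builds the query parameters and rejects such requests up front. buildPageTriageListParams is a hypothetical helper, "one of" is read here as "at least one", and sending each flag with the value 1 is an assumption about how the flags are switched on.

```javascript
// Hypothetical helper, not part of PageTriage: builds the query parameters
// for action=pagetriagelist and rejects combinations that return no pages.
const REVIEW_FLAGS = [ 'showreviewed', 'showunreviewed' ];
const TYPE_FLAGS = [ 'showredirs', 'showdeleted', 'showothers' ];

function buildPageTriageListParams( options ) {
	const params = { action: 'pagetriagelist', format: 'json' };

	// A single page only needs its page ID.
	if ( options.pageId !== undefined ) {
		params.page_id = options.pageId;
		return params;
	}

	const flags = options.flags || [];
	const known = REVIEW_FLAGS.concat( TYPE_FLAGS );
	flags.forEach( function ( flag ) {
		if ( !known.includes( flag ) ) {
			throw new Error( 'Unknown flag: ' + flag );
		}
	} );

	const hasReviewFlag = flags.some( function ( flag ) {
		return REVIEW_FLAGS.includes( flag );
	} );
	const hasTypeFlag = flags.some( function ( flag ) {
		return TYPE_FLAGS.includes( flag );
	} );
	if ( !hasReviewFlag || !hasTypeFlag ) {
		throw new Error( 'Select a review-status flag and a page-type flag, or no pages are returned.' );
	}

	// ASSUMPTION: each flag is sent with the value 1 to switch it on.
	flags.forEach( function ( flag ) {
		params[ flag ] = 1;
	} );
	return params;
}

console.log( buildPageTriageListParams( { flags: [ 'showunreviewed', 'showothers' ] } ) );
```

On a wiki, the returned object could be passed to new mw.Api().get(), in the same way as the parameters in the sample code under "Via the API".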

Special:Log

The following logs are created by the extension:

Special:Log | log_type | log_action | Description | Notes
Page curation log | pagetriage-curation | delete, enqueue, reviewed, tag, unreviewed | Logs deletion tagging, maintenance tagging, marking page as reviewed, marking page as unreviewed |
Potential copyright violation log | pagetriage-copyvio | insert | Allows a bot to log potential copyright violations | Doesn't display unless you set $wgPageTriageEnableCopyvio to true

SQL tables

Name | Prefix | Description | Old entry deletion strategy
pagetriage_log | ptrl_ | Log of all "mark as reviewed", "mark as unreviewed", "patrolled", and "autopatrolled" actions. One entry each time the status changes. Used by the pagetriagestats API to figure out who the top patrollers are. | All entries deleted after 1 year.
pagetriage_page | ptrp_ | The main table. Log of all pages created after PageTriage was installed. One entry per page. Stores the "mark as reviewed" statuses mentioned above. Also stores the last time a tag was placed on the page by PageTriage. Query ptrp_reviewed > 0 in this table to figure out if a page is marked as reviewed. No entry also means the page is reviewed. | All articles deleted once ptrp_reviewed > 0 (marked as reviewed) and older than 30 days. All redirects deleted after 180 days regardless of patrol status.
pagetriage_page_tags | ptrpt_ | Stores metadata about pages, to make the filters in the Page Curation toolbar work. For example, if you pick the filter "Were previously deleted", then PageTriage will query this table looking for the recreated tag ID. The tag ID is discovered by checking the pagetriage_tags table. See #pagetriage_page_tags for list of tags. | All article metadata deleted once ptrp_reviewed > 0 (marked as reviewed) and older than 30 days. All redirect metadata deleted after 180 days regardless of patrol status.
pagetriage_tags | ptrt_ | A dictionary of page_tags, and their corresponding ID number. See #pagetriage_page_tags for list of tags. |

pagetriage_page_tags

pagetriage_page_tags data is updated by calling ArticleCompileProcessor::newFromPageId( [ $pageId ] )->compileMetadata(). This is called in the following hooks:

  • onPageMoveComplete() - runs when moving a page
  • onLinksUpdateComplete() - runs when saving an edit
  • onMarkPatrolledComplete() - runs when clicking the "Mark this page as patrolled" link in bottom right corner of certain pages

It is called asynchronously. The user will see that their edit succeeded and can continue browsing the website, and the page tags update will occur in the background, invisibly to the user.

List of tags

The pagetriage_page_tags are as follows:

  • Author information
    • user_id
    • user_name - there's a filter where you can type in their username
    • user_editcount
    • user_creation_date
    • user_autoconfirmed
    • user_experience - Experience level: newcomer (non-autoconfirmed), learner (newly autoconfirmed), experienced, or anonymous. These experience levels are baked into core and can be accessed with MediaWikiServices::getInstance()->getUserFactory()->newFromUserIdentity( $performer )->getExperienceLevel()
    • user_bot
    • user_block_status
  • Deletion tags - will display a black trash can icon if marked for deletion
    • afd_status
    • blp_prod_status
    • csd_status
    • prod_status
  • Special:NewPagesFeed red warning text
    • category_count - No categories
    • linkcount - Orphan
    • reference - No citations
    • recreated - Previously deleted
    • user_block_status - Blocked
  • Page information
    • page_len - size of article, in bytes
    • rev_count - number of edits to the article
    • snippet - text from beginning of article, used in Special:NewPagesFeed to preview the article
  • afc_state - 1 unsubmitted, 2 pending, 3 under review, 4 declined
  • copyvio - latest revision ID that has been tagged as a likely copyright violation, if any

Determining if a page is reviewed

Status codes

There are status codes used to track whether a page is reviewed or not. These are the values given when you query patrol_status, ptrp_reviewed, and ptrl_reviewed:

  • Unreviewed
    • 0 - unreviewed
  • Reviewed
    • 1 - reviewed (someone clicked the green check mark in the Page Curation toolbar)
    • 2 - patrolled (someone clicked the "Mark as patrolled" link at the bottom right corner of a page)
    • 3 - autopatrolled (someone with the autopatrol user right created the page, or moved the page from a non-tracked namespace to a tracked namespace)
    • no result - will occur if the page is not in a tracked namespace (mainspace, userspace, and draftspace), if the article was created before PageTriage was installed, or if the article has been reviewed for more than 30 days (these records are deleted by a cron job)
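
The mapping above can be written as a small lookup. This is a sketch only: describePatrolStatus is a hypothetical helper, and passing null or undefined to represent "no result" is a convention chosen for this example.

```javascript
// Labels for the status codes listed above.
const PATROL_STATUS_LABELS = {
	0: 'unreviewed',
	1: 'reviewed',
	2: 'patrolled',
	3: 'autopatrolled'
};

// Hypothetical helper: turns a status value into a label and a
// reviewed/unreviewed flag. Accepts numbers or numeric strings.
function describePatrolStatus( status ) {
	// No row for the page: treated as reviewed, as described above.
	if ( status === null || status === undefined ) {
		return { label: 'no result', reviewed: true };
	}
	const code = parseInt( status, 10 );
	if ( !( code in PATROL_STATUS_LABELS ) ) {
		throw new RangeError( 'Unknown status code: ' + status );
	}
	return { label: PATROL_STATUS_LABELS[ code ], reviewed: code > 0 };
}

console.log( describePatrolStatus( '2' ) );
console.log( describePatrolStatus( 0 ) );
console.log( describePatrolStatus( null ) );
```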

Via the API

To check the review status of pages using an API query, you can use api.php?action=pagetriagelist&page_id=$PAGEID, and check the patrol_status field. Follow the directions above to interpret the values of this field.

Sample JavaScript code:

async function isReviewed( pageID ) {
	const api = new mw.Api();
	const response = await api.get( {
		action: 'pagetriagelist',
		format: 'json',
		page_id: pageID
	} );
	const list = response.pagetriagelist;

	// No result: the page is not tracked, which also counts as reviewed.
	if ( list.result !== 'success' || list.pages.length === 0 ) {
		return true;
	}

	// 1, 2, or 3 means reviewed; 0 means unreviewed.
	return parseInt( list.pages[ 0 ].patrol_status, 10 ) > 0;
}

Via SQL

To check the review status of pages using an SQL query, you need to query the pagetriage_page table and the ptrp_reviewed field. Follow the directions above to interpret the values of this field.

/* By page_id */
SELECT ptrp_reviewed
FROM pagetriage_page
WHERE ptrp_page_id = 71318376;

/* By page_title and page_namespace */
SELECT ptrp_reviewed
FROM pagetriage_page
JOIN page ON page_id = ptrp_page_id
/* For page_title, don't forget to use underscores instead of spaces. */
WHERE page_title = 'Živko_Kostadinović'
	AND page_namespace = 0;

NOINDEX

NOINDEX refers to the HTML code <meta name="robots" content="noindex">, which can be inserted into a page to stop the page from appearing in search engine results. In default installations of MediaWiki, all pages are indexed unless they contain the wikicode __NOINDEX__. When $wgPageTriageNoIndexUnreviewedNewArticles is set to true, PageTriage will take over deciding what pages are indexed.

First check

  • First check: Noindex the page if ALL of the following are true:
    • $wgPageTriageNoIndexUnreviewedNewArticles is turned on
    • Page age is less than $wgPageTriageMaxAge (set to 90 days on enwiki)
    • Page is in pagetriage_page table[1]
    • Page is marked as unpatrolled (ptrp_reviewed = 0)

Second check

  • Second check: If the wikitext has the __NOINDEX__ magic word, noindex the page if ALL of the following are true:
    • Page age is less than $wgPageTriageMaxNoIndexAge (set to 90 days on enwiki)
    • If $wgPageTriageMaxNoIndexAge is not null, page is in pagetriage_page table[2]

The main use case for the __NOINDEX__ magic word is in deletion templates and maintenance tag templates that are transcluded into mainspace or draftspace. See this search.
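
The two checks can be summarized as a single decision function. This is a sketch of the logic as described in this section, not the extension's PHP code: shouldNoIndex and all field names are illustrative, and treating a null $wgPageTriageMaxNoIndexAge as "no age limit and no table requirement" is an assumption about what the early exit does.

```javascript
// Illustrative model of the two NOINDEX checks. "page" and "config" are
// plain objects invented for this example.
function shouldNoIndex( page, config ) {
	// First check: new, unreviewed, tracked pages.
	const firstCheck =
		config.noIndexUnreviewedNewArticles &&
		page.ageDays < config.maxAge &&
		page.inPageTriageTable &&
		page.reviewedStatus === 0;
	if ( firstCheck ) {
		return true;
	}

	// Second check: only applies when the wikitext contains __NOINDEX__.
	if ( !page.hasNoIndexMagicWord ) {
		return false;
	}
	// ASSUMPTION: a null max age means the magic word is always honored.
	if ( config.maxNoIndexAge === null ) {
		return true;
	}
	return page.ageDays < config.maxNoIndexAge && page.inPageTriageTable;
}

const exampleConfig = { noIndexUnreviewedNewArticles: true, maxAge: 90, maxNoIndexAge: 90 };
console.log( shouldNoIndex(
	{ ageDays: 10, inPageTriageTable: true, reviewedStatus: 0, hasNoIndexMagicWord: false },
	exampleConfig
) );
```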

Is the page in the pagetriage_page table?

Regarding the requirement "Page is in pagetriage_page table", a page is in this table only if all of the following are true:

  • It has not been deleted from the table by a PageTriage cron job
    • One cron job deletes redirects older than $wgPageTriageRedirectAutoreviewAge days (default 180 days as of Sep 2022), regardless of patrol status. In other words, this cron job autopatrols them.
    • Another cron job deletes reviewed pages once they have been reviewed for 30 days
  • It is in a namespace that PageTriage is configured to patrol
  • It is not so old that it predates the installation of PageTriage

Toolbar

The toolbar has three states: maximized, minimized, and hidden. The maximized toolbar is the full-size toolbar with all buttons. The minimized toolbar still displays and floats, but simply says "Curation" and has an X you can click to close it. The hidden toolbar doesn't display at all, and can be re-opened by clicking the "Open Page Curation" link in the left menu.

Entry points

The extension's features can be triggered by various actions:

Entry point type | File location | Notes
6 APIs | includes/Api/* |
1 special page | includes/SpecialNewPagesFeed.php |
18 hooks | includes/Hooks.php |
1 hook handler | includes/HookHandlers/* |
1 cron job | cron/updatePageTriageQueue.php | runs every 48 hours
6 maintenance scripts | maintenance/* | need to be run manually

Here is a list of some actions and the corresponding entry points they trigger:

Action Entry points used
View Main page
  • Hooks.php -> onBeforeCreateEchoEvent()
  • Hooks.php -> onArticleViewFooter()
  • Hooks.php -> onResourceLoaderRegisterModules()
Type in search box, triggering search suggestions
  • Hooks.php -> onBeforeCreateEchoEvent()
  • Hooks.php -> onApiMain__moduleManager()
View Special:NewPagesFeed
  • Hooks.php -> onBeforeCreateEchoEvent()
  • SpecialNewPagesFeed
  • Hooks.php -> onResourceLoaderRegisterModules()
  • ApiPageTriageStats
  • ApiPageTriageList
View an unreviewed article while logged in and having the patrol permission

  • Hooks.php -> onBeforeCreateEchoEvent()
  • Hooks.php -> onArticleViewFooter()
  • Hooks.php -> onResourceLoaderRegisterModules()
  • Hooks.php -> onResourceLoaderGetConfigVars()
  • Hooks.php -> onDefinedTags()
  • Hooks.php -> onApiMain__moduleManager()
  • ApiPageTriageList (by page_id)

External libraries

Backbone and Underscore are unusual libraries to use in MediaWiki extensions, and jQuery UI is deprecated. Long term, we are interested in replacing these front end libraries, to make the extension easier to maintain.

Testing with ORES

Enwiki has the ORES extension installed, which provides machine learning predictions of an article's quality and of some common issues. ORES works fine in production, but requires some setup if you want to test in a localhost environment. It can be desirable to test with ORES turned on, for example, if you are changing the layout of Special:NewPagesFeed. Here is a localhost testing procedure:

  • Clone Extension:ORES and add wfLoadExtension( 'ORES' ); in LocalSettings.php
  • Add this to LocalSettings.php
$wgPageTriageEnableOresFilters = true;
$wgOresWikiId = 'enwiki';
$wgOresModels = [
	'articlequality' => [ 'enabled' => true, 'namespaces' => [ 0 ], 'cleanParent' => true ],
	'draftquality' => [ 'enabled' => true, 'namespaces' => [ 0 ], 'types' => [ 1 ] ]
];
  • Run php extensions/ORES/maintenance/BackfillPageTriageQueue.php

Notes

  1. Checked by isPageUnreviewed()
  2. Checked by isNewEnoughToNoIndex(), if it doesn't exit early due to $wgPageTriageMaxNoIndexAge being null.