Extension:GoogleNewsSitemap

MediaWiki extensions manual
GoogleNewsSitemap
Release status: stable
Implementation Special page
Description Outputs a list of pages based on what categories they are in as an RSS feed or Google news sitemap.
Author(s)
Latest version 2.2.0 (continuous updates)
Compatibility policy Snapshots releases along with MediaWiki. Master is not backward compatible.
Database changes No
License GNU General Public License 2.0 or later
Parameters
  • $wgGNSMmaxCategories
  • $wgGNSMmaxResultCount
  • $wgGNSMfallbackCategory
  • $wgGNSMsmaxage
  • $wgGNSMcommentNamespace
Quarterly downloads 8 (Ranked 128th)

The GoogleNewsSitemap extension works like DynamicPageList (Wikimedia), but instead of outputting a category intersection in a wiki page, it provides a special page that outputs the intersection as an RSS feed, an Atom feed, or a Google News sitemap.

For example, it can create an RSS feed of the last five articles added to Category:Published, ordered by the date they were added to the category.

This extension was originally made for Wikinews.

Usage


The special page accepts URLs of the following form (parts in [ ] are optional; ( ) lists the possible choices): http://example.com/w/index.php?title=Special:GoogleNewsSitemap[&feed=(rss|atom|sitemap)][&categories=Catname][&notcategories=OtherCatName][&namespace=0][&count=10][&hourcount=48][&ordermethod=categoryadd][&order=ascending][&redirects=only][&stablepages=only][&qualitypages=only][&usenamespace=true]

Multiple values for the categories and notcategories options are separated with a pipe (|, encoded in URLs as %7C).

For example, to get an RSS feed of the last 7 articles that are in both Category:Foo and Category:Bar but not in Category:Baz, ordered by the date they were added to Category:Foo with the most recent first, use http://my.wiki.example.com/w/index.php?title=Special:GoogleNewsSitemap&feed=rss&categories=foo%7Cbar&notcategories=baz&count=7&stablepages=include&qualitypages=include
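Such a URL can also be assembled programmatically, which avoids hand-encoding the pipe separator. The following is a minimal Python sketch and not part of the extension: the helper name `build_feed_url` is an assumption made for illustration, and it only uses parameters documented above. It passes count=7 explicitly to limit the feed to seven items.

```python
from urllib.parse import urlencode


def build_feed_url(base, feed="rss", categories=(), notcategories=(), **options):
    """Build a Special:GoogleNewsSitemap URL (hypothetical helper, not an extension API)."""
    params = {"title": "Special:GoogleNewsSitemap", "feed": feed}
    if categories:
        # Multiple categories are joined with a pipe; urlencode turns it into %7C.
        params["categories"] = "|".join(categories)
    if notcategories:
        params["notcategories"] = "|".join(notcategories)
    # Remaining options (count, order, stablepages, ...) are passed through unchanged.
    params.update({key: str(value) for key, value in options.items()})
    # safe=":" keeps the colon in the page title readable.
    return base + "?" + urlencode(params, safe=":")


url = build_feed_url(
    "http://my.wiki.example.com/w/index.php",
    feed="rss",
    categories=["foo", "bar"],
    notcategories=["baz"],
    count=7,
    stablepages="include",
    qualitypages="include",
)
print(url)
```

Keeping the categories as a list until the last moment makes it harder to forget the %7C encoding when a category is added or removed.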

Requesting Special:GoogleNewsSitemap (or equivalently Special:NewsFeed) with no parameters uses the category Published, the feed type atom, no notcategories, a count of 50, order descending, ordermethod categoryadd, redirects include, and stablepages and qualitypages set to only. These defaults are rarely what you want, so expect to override several of them.

See DynamicPageList (Wikimedia) for what the options do. The options are not exactly the same, but they are very similar. The main exceptions are that the namespace parameter in this extension can also take :all: as a value, and that it defaults to the main namespace if omitted (instead of all namespaces). The hourcount parameter specifies how many hours ago, at most, an article may have been added to the first category to be considered for inclusion. It can be disabled by setting it to -1.
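The hourcount rule can be read as a simple time-window check. The Python sketch below only illustrates the documented semantics; it is not how the extension implements the filter. The function name `within_hourcount` is hypothetical, and treating the boundary as inclusive is an assumption.

```python
from datetime import datetime, timedelta, timezone


def within_hourcount(added_at, now, hourcount):
    """Return True if a page added to the first category at `added_at` passes the window."""
    if hourcount == -1:
        # -1 disables the time restriction entirely.
        return True
    # Assumption: a page added exactly `hourcount` hours ago still qualifies.
    return added_at >= now - timedelta(hours=hourcount)


now = datetime(2024, 1, 3, 12, 0, tzinfo=timezone.utc)
recent = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)  # 24 hours before `now`
old = datetime(2023, 12, 30, 12, 0, tzinfo=timezone.utc)   # 4 days before `now`

print(within_hourcount(recent, now, 48))
print(within_hourcount(old, now, 48))
print(within_hourcount(old, now, -1))
```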

Installation

  • Download and move the extracted GoogleNewsSitemap folder to your extensions/ directory.
    Developers and code contributors should install the extension from Git instead, using:
    cd extensions/
    git clone https://gerrit.wikimedia.org/r/mediawiki/extensions/GoogleNewsSitemap
  • Add the following code at the bottom of your LocalSettings.php file:
    wfLoadExtension( 'GoogleNewsSitemap' );
    
  • Configure as described in the Configuration section below.
  •   Done – Navigate to Special:Version on your wiki to verify that the extension is successfully installed.

Configuration


The categories used by the sitemap format are controlled by the googlenewssitemap_categorymap system message, which maps categories to sitemap keywords. It uses the following format:

*categoryname|keywordname
*categorynameyouwanthidden|__MASK__
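To make the format concrete, here is a small Python sketch that reads such a message into a keyword mapping and a set of hidden categories. It is illustrative only and does not reproduce the extension's own parser. The function name `parse_category_map` and the category names Politics and Drafts are hypothetical.

```python
def parse_category_map(message_text):
    """Split '*category|keyword' lines into (keywords, masked). Illustrative only."""
    keywords = {}
    masked = set()
    for line in message_text.splitlines():
        line = line.strip()
        if not line.startswith("*") or "|" not in line:
            continue  # skip anything that is not a '*name|value' entry
        category, value = line[1:].split("|", 1)
        category, value = category.strip(), value.strip()
        if value == "__MASK__":
            # __MASK__ marks a category that should be hidden.
            masked.add(category)
        else:
            keywords[category] = value
    return keywords, masked


example = "*Politics|politics\n*Drafts|__MASK__"
keywords, masked = parse_category_map(example)
print(keywords)
print(masked)
```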

The following configuration variables are also taken into account (shown with their defaults):

$wgGNSMmaxCategories = 6;   // Maximum number of categories to look for
$wgGNSMmaxResultCount = 50; // Maximum number of results to allow
$wgGNSMfallbackCategory = 'Published'; // Fallback category if no categories are specified.
$wgGNSMsmaxage = 1800; // Squid cache time in seconds (separate from the memcached time).

// $wgGNSMcommentNamespace can be false, meaning do not include a <comments> element in the feeds;
// true, meaning use the talk page of the relevant page as the comments page;
// or a specific namespace number (or NS_* constant) to use that namespace.
// For example, on many Wikinews sites, the comment namespace is Comments (102), not Talk.
$wgGNSMcommentNamespace = true;
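The three accepted forms of $wgGNSMcommentNamespace can be summarised in a short Python sketch. This is a hedged illustration of the behaviour described in the comments, not the extension's code. The function name `comments_namespace` is hypothetical, and the sketch assumes the page sits in a subject namespace, whose talk namespace in MediaWiki is the subject namespace number plus one.

```python
def comments_namespace(setting, page_namespace):
    """Return the namespace number used for comments, or None for no <comments> element."""
    if setting is False:
        # false: feeds carry no <comments> element at all.
        return None
    if setting is True:
        # true: use the talk namespace paired with the page's own namespace.
        return page_namespace + 1
    # Otherwise the setting is an explicit namespace number, e.g. 102 on many Wikinews sites.
    return int(setting)


print(comments_namespace(False, 0))
print(comments_namespace(True, 0))
print(comments_namespace(102, 0))
```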

Most of these parameters are self-explanatory. $wgGNSMsmaxage sets how long, in seconds, the feed is cached in Squid, and therefore how out of date the feed can be; it defaults to 30 minutes. The extension also caches the feed in memcached (or another caching backend) for 12 hours, but that cache is checked for new articles before it is used. The Squid cache is not checked in this way, so keep its timeout low to avoid serving outdated entries.

The feed title is set by the googlenewssitemap_feedtitle system message. It defaults to "[Language Name] [Site Name] [Feed type] feed.", for example "British English MyWiki RSS feed.".