Extension talk:PureWikiDeletion

Latest comment: 13 years ago by Cronje in topic Database error

To do list

edit

TODO:

  • Consider, should we be reading from the master or the slave database? In many cases, probably the former, since the latter may lag, and result in saving incorrect data.
  • Populate the blanked_page table more efficiently using a JOIN query, per bugzilla:23843
  • Put a "Blank" button at the top of the page for faster blanking
  • Cause unblanked pages to show up in Special:NewPages, if the site owner sets the pertinent configuration setting to true.
  • Any API stuff?
  • Cause links to blanked pages to turn a different shade of red after being clicked upon (i.e. maroon).
  • Include a checkbox in the search section of Special:Preferences as to whether the user wants to search titles of blanked pages or not.
  • {{NUMBEROFPAGES}} should exclude blanked pages from the calculation.
  • Special:AllPagesExcludeBlank should actually exclude blank pages.
  • Have the PWDEditHook function check the blanking revision to see if the user and comment details are still available FOR_THIS_USER, and if not, display "(Username or IP removed)" and/or "(comment removed)". Or find some other way to deal with privacy/etc. issues related to RevisionDeleted blanking edits.
  • Cause blanked pages to be redlinked on RecentChanges, Watchlists, Logs, and other special pages.
  • Cause the "this page has been blanked" notice to show up not just in edit headers but in regular page headers.
  • Get rid of edit conflict message for non-logged-in users who try to blank a page; only tell them that it is unauthorized.
  • Fix that glitch that causes it to sometimes display an edit page when you try to look at the history
  •     Semi-fixed: Have options in Special:Preferences (presumably using UserToggles) for automatic watchlisting of pages that the user blanks or unblanks. This won't work until a change is made in the core to use &$watchthis.
  •   Fixed: Add a special page from which a user with appropriate user rights can cause all presently blank pages to be added to the table. This would be useful for wikis in which there are already blanked pages when the PWD extension is installed.
  •   Fixed: Replace the edit screen for non-logged-in users who try to edit a blanked page with a message saying it is unauthorized.
  •   Fixed
  •   Fixed: Put all the code into SVN


Empty vs. #DELETED

edit

Have you considered empty rather than a special string like #DELETED? Ashley Y 08:45, 10 March 2010 (UTC)Reply

It's a possibility. Some editors have pointed out that #DELETED makes it clear that the intent is to delete rather than vandalize, but a vandal could just as easily replace all content with #DELETED. It saves keystrokes to use empty too. On the other hand, if we use #DELETED, then blanking is unnecessary; we can just add a #DELETED tag to the beginning of a page and leave the rest of the content, so that people don't have to dig through the history to find what was there. Then again, that possibility also exists with #REDIRECT, but in practice almost everyone replaces all the existing page content with #REDIRECT. Tisane 23:19, 10 March 2010 (UTC)Reply

Blanking should be logged?

edit

The current spec says:

The blanking of a page already appears on watchlists and recent changes, so I think this is redundant and unnecessary. However, it would probably be useful to be able to get a list of all blanked pages and filter and sort it by user and time of blanking, as we do with the deletion logs. So, I will create a special page with that functionality, and I figure out a way to filter the blanked pages out of AllPages. (Right now it shows up in AllPages as redlinks.) Tisane 09:24, 11 March 2010 (UTC)Reply

Are all these features necessary?

edit

I can see why people wouldn't want it to show up in RandomPage or AllPages (although I haven't figured out yet how to implement that). But given that these pages are blanked, won't the external search engines stop directing people there anyway? Also, it still shows up in the internal search as "page title matches," but as a redlink. Maybe it's not so bad for it to continue showing up there. If it's a search term people are looking for a lot, maybe the article should remain there, or at least be a redirect? Maybe it should show up in search accompanied by the edit summary explaining why it was blanked. Tisane 12:42, 13 March 2010 (UTC)Reply

htmlentities

edit

OK, is that good enough to clear up the XSS issues? Tisane 17:56, 13 March 2010 (UTC)Reply

Code review

edit

Since there is some interest in getting this used on Wikipedia, I'll provide some initial code review, though it will still need review from a staff developer before it can be enabled.

setup code

edit
'version' => '0.0.0',

If you aren't going to give a useful number, you might as well just leave it blank.

$wgGroupPermissions['*']['PWD']    = false;

It isn't necessary to explicitly set it to false, as that's the default. Also, this right doesn't seem to do anything.

function PWDSaveHook

edit
global $wgOut;

This isn't used anywhere

require( "extensions/PWD/PWDConfig.php" );

All configuration should be done in LocalSettings

$con = mysql_connect($yourHost,$yourUsername,$yourPassword); 

You should use the wiki database rather than creating your own, see Manual:Hooks/LoadExtensionSchemaUpdates for how to set up the database using update.php. MediaWiki also has its own classes for dealing with the database. See Manual:Database access

$sql=sprintf("INSERT INTO blanked_pages (blanked_page, blanking_user, blanking_timestamp,

After you switch to using MediaWiki's database classes, you should use the wrapper functions for queries, as these automatically handle quoting and escaping

mysql_real_escape_string($article->getTitle()));

Article::getTitle() returns a Title object, not a string. As written, it should use $article->getTitle()->getPrefixedDBkey(), but it should really be storing the title and namespace index separately or just the pageid.

edit
global $wgRRAnonOnly, $wgUser,$wgOut,$wgDBserver,$wgDBuser,$wgDBpassword,$wgTitle,
		$wgServer, $wgScriptPath;

AFICT, $wgRRAnonOnly doesn't actually exist and none of these are used.

$PWDActivated=true;
if ($PWDActivated==false){

$PWDActivated will never be false because you set it to true in the line directly above. Also, you should use either !$PWDActivated or $PWDActivated===false, depending on whether you need type checking as well

		if( $target->isKnown() ) {
			return true;
		} else {
			return true;
		}

Just "return true" is shorter.

if ( in_array( 'known', $options ) ) {

The existence check may not have been done at this point. You should probably use the LinkEnd hook.

function PWDEditHook

edit
$date = date('M d, Y H:i:s', mktime($hour, $minute, $second, $month, $day, $year));

You should pass the timestamp directly to wfTimestamp()

$user_url=$wgServer.$wgScriptPath.$URLStructForUserPage.$userPage;

Use $wgUser->getSkin()->link() to make a link. Same for all the other links

$output="<span>A former version of this page was blanked by ".$user_HTML

This should set a message in the i18n file and call it with something like wfMsg()

parser function functions

edit

These should call and check $parser->incrementExpensiveFunctionCount() since they need to do a database query for each usage.

return htmlentities($param2);

You shouldn't escape the output from the parser functions, as it still needs to go through the parser

PWD_body.php

edit
p_ID int NOT NULL AUTO_INCREMENT,

This should use the same prefix "blanked_" as the other fields

$sql="CREATE INDEX userind on blanked_pages (blanking_user)";

As far as I can see, "blanked_page" is the only thing used in a WHERE condition, so none of the other fields need an index.

PWDConfig.php

edit

As noted, all configuration should be done in LocalSettings, and it shouldn't use its own database.

date_default_timezone_set ( 'America/Denver' );

This shouldn't be set in an extension unless there's a really good reason

Code review discussion

edit

You may want to look over pages like Manual:Coding conventions. Mr.Z-man 05:02, 14 March 2010 (UTC)Reply

Thanks for the code review; I will get to work on these items immediately. Tisane 07:19, 15 March 2010 (UTC)Reply

Known bug

edit
  Fixed

There is a known bug involving the improper use of $wgUser that causes the user and talk pages on the edit screen to be incorrectly linked, but it will be fixed in version 1.0.0. Tisane 02:53, 19 March 2010 (UTC)Reply

Draft code posted

edit

I posted some draft code. Have at it, Kim. If you start doing a major overhaul, you may wish to use the Template:Inuse. And feel free to add yourself to the introductory comments as a co-author. Tisane 03:50, 19 March 2010 (UTC)Reply

Hmm, you don't have svn access? --Kim Bruning 21:47, 20 March 2010 (UTC)Reply
I applied for it about a week into my programming endeavors, but didn't get a response so I figured I'd wait a little while before making further inquiries. Tisane 07:25, 21 March 2010 (UTC)Reply

More Code Review

edit

Just a point about your SQL schema. Fields within a table should use a consistent prefix; currently some of the fields have "blanked" and some use "blanking". The prefix should also be short; I'd suggest something like "blank". Also, timestamps should be varbinary(14), and ids should be UNSIGNED INT:

CREATE TABLE blanked_pages(
blank_id                     int NOT NULL AUTO_INCREMENT,
blank_page_id                UNSIGNED int NOT NULL,
blank_user_name              varchar(256),
blank_timestamp              varbinary(14),
blank_summary                varchar(256),
blank_parentId               UNSIGNED int,
PRIMARY KEY (blank_id)
);

Happymelon 13:14, 19 March 2010 (UTC)Reply

I had to moved the UNSIGNED around to get it to work with phpMyAdmin:

CREATE TABLE blanked_pages(
blank_id                     int NOT NULL AUTO_INCREMENT,
blank_page_id                int UNSIGNED NOT NULL,
blank_user_name              varchar(256),
blank_timestamp              varbinary(14),
blank_summary                varchar(256),
blank_parentId               int UNSIGNED,
PRIMARY KEY (blank_id)
);

Tisane 12:56, 20 March 2010 (UTC)Reply

Groovy. Also, I just noticed the CamelCase in parentID; we prefer either blank_parent_id or just blank_parent (the fact that it's an id is obvious from the type). Happymelon 18:45, 20 March 2010 (UTC)Reply
Do you notice any security issues, or can MaxSem's XSS notice be taken down? By the way, I'm having a bit of trouble with the wfTimestamp and $wguser->getSkin()->link() changes suggested above. Do you have some code or documentation you can point me in the direction of? Thanks, Tisane 19:25, 20 March 2010 (UTC)Reply
Yes. A user Foo" href="http://evil.site blanking pages pirates the user page links AFAICT, as would blanking a page of the same name. Far and away the easiest way to cure that is to solve the other major problem, which is hardcoded messages with no possibility for localisation. As you can see, the parser has no problem safely displaying the hostile link, and will safely display any internal page link. So if you put your notice text in a sytem message in /PWD.i18n.php as:
A former version of this page was blanked by [[User:$1]] ([[User talk:$1|talk]]) 
([[Special:Contributions/$1|contribs]]) on $2 <br /> The reason given for blanking was:
 ''$3''.<br /> You may [{{fullurl:{{FULLPAGENAMEE}}|action=history}} view the article's 
history], [{{fullurl:{{FULLPAGENAMEE}}|oldid=$4}} edit the last version], or type new 
page into the white space below.
And bung that through the parser, it'll be sure to come out safely (or if it doesn't, there are much bigger problems with MW). And the message can be translated into all three hundred languages MW supports. And to get that HTML back again you need to call
$html = wfMsgExt( 'pwd-blanked', 'parse', array( $username, $date, $reason ) );
Note that the reason will be escaped by the parser; no need for htmlentities. So you don't really need the link*() functions.
Other notes:
  • You can parse MW timestamps to Unix with wfTimestamp( TS_UNIX, $timestamp ), no need to use string splitting
  • Use $wgTitle->exists() instead of the db call in lines 119-122, 251, 275, and all the doesPageExist() functions.
  • What is the function of $whatElse on line 96?
  • Implement styling with CSS, not hardcoded styles. So instead of adding $blankLinkStyle etc, add a class and style it in a /pwd.css (lowercase filenames for outward-facing files). Note that 'blue' is not the default colour for wikilinks; but you shouldn't need to style ordinary known/broken links, only blank ones.
Keep up the good work! Happymelon 20:57, 20 March 2010 (UTC)Reply

Next round

edit

A few more comments:

  • In PWDLink, you can special-case (i.e. ignore) things that aren't in editable namespaces so that you don't spend time looking up special pages, Media: links and system messages (these are treated as bluelinks even if they don't exist locally)
  • In PWDEditHook, you should use the user's preferred date format. $wgLang->timeanddate( wfTimestamp( TS_MW, $result->blank_timestamp ), true );. This will also adjust it if the user has a different timezeone set.
  • In efPWDParserFunction* the correct way to increment the expensive function count is to call $parser->incrementExpensiveFunctionCount() - this will return true or false to say whether to proceed. The parser functions should also probably return something, even if the expensive function count is exceeded. I think #ifexist returns false.
    • This can also special-case special pages and the like as noted above.
    • You should also check that $title isn't null, which would indicate an invalid title
    • You could also eliminate the (direct) query on the page table by getting the pageid with $title->getArticleID. It will return 0 if the article doesn't exist. And if the pageid has already been loaded for some reason before, it may not have to do a DB query.
  • In the DB schema, you don't need to add _ind after all the index names. Unless its an index on multiple columns, you can just use the column name.
    • It also isn't necessary to have an index on columns that will never be used in a WHERE or JOIN clause - these take up unnecessary space and can slow down writes.
  • You may want to split the hook functions out to a _body.php or .hooks.php file and make them static functions of a class. Then you can load the class with $wgAutoloadClasses and call the static functions in the hooks. This is sort of the de facto preferred style for extensions (at least for longer functions).
  • You should also hook on ArticleDeleteComplete to remove deleted pages from the blanked_pages table.

-- Mr.Z-man 22:11, 20 March 2010 (UTC)Reply

Combined response to Happy-melon and Mr. Z-man

edit
  •   Fixed: Timestamp issue
  •   Fixed: Replace database queries to "page" with exists(), getArticleID(), etc.
  •   Fixed: Ignore Special, Media, etc. pages
  •   Fixed: Expensive ParserFunction incrementation
  •   Fixed: Check for null Title
  •   Fixed: Change index names and eliminate superfluous index. I want to keep the other indexes so that later a special page can be added to view blanked pages and filter by title, user, summary, timestamp, etc.
  •   Fixed: Add ArticleDeleteComplete and ArticleUndelete hooks
  • When I put:
$html = wfMsgExt( 'pwd-blanked', 'parse', array( $blank_user_name, $date, $blank_summary ) );
$editPage->editFormPageTop .= $html;

What I get is this string: "<pwd-blanked>" rather than the actual message. Any insight into what I'm doing wrong?

Tisane 13:31, 21 March 2010 (UTC)Reply

Filtering by summary for anything except possibly exact matches and a prefix search would likely be too inefficient for use on Wikimedia. This is why there's no system for searching edit summaries and log summaries in core MediaWiki. For the messages, you'll need to call wfLoadExtensionMessages('PWD') in the function that calls the message, though I believe this is no longer required in 1.16. For the CSS, you wouldn't change the style attribute, you would change the class attribute and load a CSS file. For an examples of splitting the hook functions, AbuseFilter is set up like that. Mr.Z-man 17:20, 21 March 2010 (UTC)Reply

Response

edit
  •   Fixed: Moved messages into PWD.i18n.php
  •   Fixed: Moved hooks into PWD.hooks.php
  •   Fixed: Got rid of summary index
  •   Fixed: Got rid of whatElse()
  • Still working on CSS.

By the way, do you think it is necessary to have a blank_id primary key or can we just use blank_page_id? No pages should be showing up twice in the table, but I don't know if there are other uses of blank_id that we might anticipate. Thanks, Tisane 23:40, 21 March 2010 (UTC)Reply

Blank_id would be useful if it were a history table: one that preserves rows and is only added to, like revision or logging. So if you were using it as a log of who blanked pages when, it would be useful to have an id to refer to. You're only currently using it as an attribute table; rows are deleted when the page is unblanked and so things like the summary are lost on unblanking (which might be a concern in itself). For transparency, the log of blankings should probably be permanent; probably the easiest way to do that is to log the action, user, timestamp, and summary to the logging table and store the log_id in the blanked_pages table.
Essentially, if the table is a permanent record, it should have an id field; if it's transient (which it currently is) it doesn't need one. Happymelon 00:04, 22 March 2010 (UTC)Reply
I've thought of a couple ways permanency could be implemented. There could be a boolean field to indicate whether the page is currently blanked, and the rest of the table could be kept the same. Then each page would still have a maximum of one row in the table, but we would also still lose the historical data of past blankings/unblankings. I'm not sure whether that change accomplishes anything, beyond making sure that a particular blank_id permanently refers to the same page. In which case, blank_id can still be dispensed with and blank_page_id used instead as primary key.
Or (preferably) it could be a log, as you suggest, and the PWDLink() function would either check for something like that above-mentioned boolean field to see if the page is still blank, or it could look up the most recent revision text to see if it's empty, or it would see which log entry (blanking or unblanking) is the most recent. I agree transparency is good because some blanked/unblanked pages may end up getting deleted eventually, and the only record non-sysops will be able to access is the log. I'll probably take a closer look at Extension:GlobalBlocking to see how new types of log entries were added to Special:Log.
By the way, should we be concerned about page blankings triggering Extension:AbuseFilter with lots of "references removed," etc. tags? What about oversighting - do we need to provide for the purging of blanking logs? Tisane 02:43, 22 March 2010 (UTC)Reply
AbuseFilter filters can and will be rewritten with PWD in mind, and there's no reason why they shouldn't trigger on blankings if they want to. Oversight will definitely be needed, especially if the log is permanent. If you use the logging table, oversight is already built in; otherwise you'll need to implement it yourself, which would be a pain. If you did it that way, you can just use your page_blanked table as an attribute list; the logs for the page would contain a permanent record of who blanked what, when, and why. Happymelon 12:12, 22 March 2010 (UTC)Reply
I may have been a little confusing when I talked about returning something if the expensive function count is exceeded. I meant that ifexist returns the "false" condition (i.e. acts as if the page doesn't exist) if the count is exceeded, not that it actually returns false. Mr.Z-man 04:04, 22 March 2010 (UTC)Reply
OIC. Either way, the user of the page is kinda hosed if the count is exceeded. Tisane 05:11, 22 March 2010 (UTC)Reply

Logging

edit

OK, I added logging but there are a couple issues. It shows up in Special:RecentChanges but as two separate entries - the edit and the blank log entry. Should it only show up once? Also, I haven't figured out yet how to get the log entries to actually display in Special:Logs, although the blank log does show up in the drop-down menu. I did verify that the log entries are being added to the logging table. Tisane 03:17, 23 March 2010 (UTC)Reply

You need to add a handler to $wgLogActionsHandlers; in much the same way as hooks. You could find a hook which would let you set the RC_SUPPRESS flag on blank edits so they didn't show up in RC duplicated (or on the log entries, which might be easier). Happymelon 11:38, 23 March 2010 (UTC)Reply
  Fixed: Logging issues.

Now just the blank logs show up, and the blank log works properly. Do we really need a blanked_pages table, by the way, or would it be just as well to check page blankness with:

$myRevision=Revision::loadFromTitle($dbr,$title);
if ($myRevision->getRawText ()==""){ ... }

Is it less expensive to use a blanked_pages table? Also, I seem to have solved half of the caching problem by using Article::onArticleDelete( $mTitle ); whenever an article is blanked. (Hopefully that won't have any unintended consequences.) But I haven't figured out how to solve the other half, which is making links to articles turn blue again after the article is unblanked. If article 'foo' links to blank article 'bar', and 'bar' is then unblanked, 'foo' will still have a red link to 'bar' until 'foo' is edited. Tisane 12:36, 24 March 2010 (UTC)Reply

You'll need a blanked_pages table if you want to add a list of blanked pages. You should have a look at what the undelete interface does to avoid caching issues. Happymelon 20:01, 24 March 2010 (UTC)Reply
  Fixed: Caching issue.

This code did the trick:

$mTitle->touchLinks();
$mTitle->invalidateCache();
$mTitle->purgeSquid();

Tisane 20:27, 24 March 2010 (UTC)Reply

I notice that it only logs logged-in users' blanks/unblanks; blanks/unblanks done by IP addresses go unlogged. I guess we could limit blanking/unblanking to logged-in users. I don't like that for policy reasons, but it's not unheard-of to limit the rights of non-logged-in users; they can't upload or move pages, for instance. In fact, I don't think they can do any of the actions that are logged events. Tisane 01:46, 25 March 2010 (UTC)Reply

    Semi-fixed: Preventing non-logged-in users from blanking/unblanking

I used ArticleSave to tell non-logged-in users "You must be a registered user and logged in to blank or unblank a page" and prevent the edit from being consummated. But it also gives them an edit conflict notice, which is inappropriate. Also, I was hoping that a way could be found to not have to test the same data twice; if there were a way to pass data from the ArticleSave function to the ArticleSaveComplete function, it would only be necessary to test that data once. However, it appears that flags are done through a bitfield that doesn't allow other flags (e.g. BLANKING and UNBLANKING) to be made up. Also, the messages break when the ArticleSave function is moved from PWD.php to PWD.hooks.php. I was hoping to put all those hooks in PWD.hooks.php. Tisane 03:27, 25 March 2010 (UTC)Reply

What's the $messages global in PWDSaveHook? Happymelon 11:17, 26 March 2010 (UTC)Reply
Another forgotten remnant of fruitless debugging attempts that has no business remaining in the code. Good catch. Tisane 14:29, 26 March 2010 (UTC)Reply

NewPages

edit

It has been suggested at w:Project:Village_pump_(proposals)/Archive_60#Pure_wiki_deletion.2C_redux that pages, as they are unblanked, should show up in Special:NewPages. Is this necessary/desirable, given that such pages already show up in the blank log? Undeleted pages don't show up in NewPages, although I can see why people would view that as not being a big deal since only sysops can undelete pages. Page moves from userspace to mainspace don't show up in NewPages either, though; isn't that also a loophole by which people can evade NewPages? I guess people want the ability to see which unblankings have been patrolled and which haven't. Tisane 04:22, 25 March 2010 (UTC)Reply

Since undeletions don't show up in Special:NewPages, I don't think there's a need for it. There's no guarantee an unblanked page will be the same as the previous though, so it may be a good idea to provide a configuration setting in case an individual wiki does want unblanked pages to show in Special:NewPages. Reach Out to the Truth 15:17, 27 March 2010 (UTC)Reply

AllPages

edit

It looks like it's going to be a pain in the neck to implement Special:AllPagesExcludeBlank by modifying code from the SpecialAllpages class, which implements Special:AllPages. I expect that the easiest way would be to add some $join_conds to the database queries to exclude those pages that are in the blank_table. But SpecialAllpages uses a bunch of functions like estimateRowCount and selectField that don't have $join_conds as a parameter. selectField shouldn't be a big problem; I can probably work around that using selectRow (which has join_conds). But estimateRowCount?

This would be a lot easier if the page table had a boolean field indicating blankness; then I could just add that to $where. I'm not sure it would be worth the expense to make an extra write to the database every time a page is blanked/unblanked, though. Hmm, I guess blankings/unblankings will probably be sufficiently rare that it won't matter that much, eh? Are there any other disadvantages of adding a field to the page table? Tisane 08:23, 26 March 2010 (UTC)Reply

The crippling cost of trying to alter a table that already has thirteen million rows in it will mean the extension will probably never be installed on enwiki or other large WMF wikis. Is that a disadvantage? :D Happymelon 11:15, 26 March 2010 (UTC)Reply
Didn't they add a whole field, page_random, just for the sake of Special:Random? Maybe it'll be an easier sell after PWD is merged into the core... Tisane 14:31, 26 March 2010 (UTC)Reply
Special:Random is a core feature, and was added in 2003 when enwiki was about 1% of its current size. Extensions should not generally make changes to core tables. Happymelon 23:24, 26 March 2010 (UTC)Reply

Status

edit

This extension was rejected at w:Wikipedia:VPPR#Pure_wiki_deletion.2C_redux by a small majority of editors, but the effort to implement PWD on enwiki is no worse off that it was before. Indeed, we're making progress, since we got feedback on needed/wanted features, and now Libertapedia uses it. I would say, let's keep ironing out the problems and adding features and complementary extensions (e.g. there needs to be an Extension:BlankBatch, just as there is an Extension:DeleteBatch). It's a bit problematic sometimes even on Libertapedia getting people to switch from the ways of thinking they're used to, but I'm confident that in the end people will be won over when they see how well PWD works in practice. Tisane 12:20, 30 March 2010 (UTC)Reply

CSS

edit

Is it even possible to use CSS or skin->makeBrokenLink(...) within a LinkBegin hook function? I was looking at Extension:BreadCrumbs (Kimon) which uses makeLink and CSS, but it doesn't seem like a similar situation, since that's an ArticleViewHeader hook function that uses $wgOut rather than modifying &$customAttribs. Tisane 12:46, 1 April 2010 (UTC)Reply

Use the LinkEnd hook to add a class to the $attrs['class'] value, based on the $options value you set in LinkBegin. Happymelon 14:41, 1 April 2010 (UTC)Reply

RevisionDelete

edit

I just realized, if someone blanks a page with an edit comment such as "Jimbo Wales' phone number is 555-555-5555," and the revision is subsequently deleted, it will stay in the blanked_pages table, and if someone subsequently edits that page, it will show them the former edit comment. I could change the code so that PWDEditHook would check the last revision for blankness (rather than just checking the table); if someone RevisionDeletes the blanking revision, RevisionDelete's causing the page to become unblank would make the message not appear.

But, what about a situation in which a page is blanked legitimately, then an unblanking edit is made with an edit comment "Happy Z's phone number is 999-FOO-BARR", then it is blanked with that same edit comment, and both revisions are deleted? Then, upon that page being edited, PWDEditHook would detect page blankness (from the blanking revision before the two deleted revisions) and display the deleted comment. Further, these issues could arise if any list of blanked pages is generated from the table. PWD was not really designed with RevisionDelete in mind. This is another factor that would tend to favor scrapping the blanked_pages table and instead checking the present revision each time, I guess, unless hooks are going to be added for, e.g., ArticleRevisionVisiblitySet and ArticleRevisionUndeleted. But I'm not sure, given the scant parameters of the former, that it's even possible to get it to do what we would need. The workaround would be to, when deleting such revisions, to make another edit to the page to clear the item out of the blanked_pages table. Tisane 16:39, 2 April 2010 (UTC)Reply

Watchlist issue

edit
  Fixed

I added a checkbox for "Add pages I blank to my watchlist" but it won't work because even if the ArticleSaveComplete hook adds something to the watchlist, core functions in Article.php will remove it. I would do a workaround using UnwatchArticle, except that the parameters are insufficient to tell the hook where it's being called from — e.g., by the user specifically asking that a page be unwatched, or from one of those Article.php functions telling it to after doedit. Tisane 14:02, 3 April 2010 (UTC)Reply

What, exactly, do you need changed in core? Happymelon 09:36, 4 April 2010 (UTC)Reply

doEdit is the function that calls the ArticleSaveComplete hook, as follows:

wfRunHooks( 'ArticleSaveComplete', array( &$this, &$user, $text, $summary,
	$flags & EDIT_MINOR, null, null, &$flags, $revision, &$status, $baseRevId ) );

Note that it is passing null to $watchthis. ArticleSaveComplete should have &$watchthis as a parameter so that the hook function can change whether or not the page should be watched. The thing is, functions like insertNewArticle and updateArticle run doEdit and then alter whether the page is watched depending on whether the checkbox was marked on the edit screen. So doEdit would also probably need to have &$watchthis as a parameter. I tried to do just that but ran into errors, which perhaps were related to the fact that I changed doEdit to have &$watchthis as its last parameter, and then called it, e.g., with

$this->doEdit( $text, $summary, $flags,,,$watchthis);

I'm probably making some sort of newbie mistake here. Anyway, I wonder whether if this can be changed without messing up a bunch of other stuff. It's evident that someone was planning on implementing $watchthis in ArticleSaveComplete at some point or another, if not &$watchthis. I could probably come up with a workable patch if I keep playing with the code. I just need to be more tenacious in figuring this stuff out. Tisane 10:12, 4 April 2010 (UTC)Reply

See bug 23641. Tisane 06:04, 24 May 2010 (UTC)Reply

OK, a patch is forthcoming to fix that bug. However, the functionality is still not ideal. The way it now works is that when you blank or unblank a page, it watches it depending on those checkboxes in your preferences. But what it really should do is, when you click to edit a blanked page, check the "Watch this page" checkbox by default, and allow the user to uncheck it if he wants. The problem is, EditPage::showEditForm:initial has $editPage, not &$editPage, as a parameter, so I can't really change the value of that checkbox. Would it be desirable to make that change in the hook? Then I could presumably use:

if ($blanking_user->getOption( 'watchunblank' )){
		   $editPage->watchthis=true;
}

Tisane 08:32, 25 May 2010 (UTC)Reply

Never mind, I just realized that functionality already exists. Tisane 18:48, 25 May 2010 (UTC)Reply

PWD -> PureWikiDeletion, blanked_pages -> blanked_page

edit

I guess to conform to the usual norms, PWD should be renamed PureWikiDeletion and the blanked_pages table should be renamed blanked_page? Tisane 19:08, 10 April 2010 (UTC)Reply

Yes, probably. Happymelon 20:05, 10 April 2010 (UTC)Reply

Nice work!

edit

I just wanted to stop by and say thanks for the work you folks are doing! I've been a supporter of this idea for a while. --Explodicle 15:06, 21 May 2010 (UTC)Reply

Robot policy

edit

I notice that blanked pages continue to be indexed by Google. Even after the Google cache reflects only a blanked page, it still shows up high in the search results. I wonder, is there any disadvantage to adding blanked pages to the $wgArticleRobotPolicies array, e,g.

$wgArticleRobotPolicies = array( 'Junk page' => 'noindex' );

Then when the page is unblanked, I can remove it from the array, e.g.

unset($wgArticleRobotPolicies['Junk page']);

Or should I use $wgOut->setRobotPolicy('noindex,nofollow');? Or perhaps that latter method is used for some different purpose. Thanks, Tisane 02:03, 23 May 2010 (UTC)Reply

The first method won't work; $wgArticleRobotPolicies is stateless; any additions you make to it will be lost on the next pageview. Use $wgOut->setRobotPolicy('noindex,nofollow');, as you say, when someone asks for a blanked page. Happymelon 10:18, 23 May 2010 (UTC)Reply

Merge to core?

edit

Should this be merged to the core? In some ways, the hooks don't really seem designed with something like PWD in mind. E.g., when you blank a page, it should say "Action complete" or something like that, rather than just showing you a blanked article or an edit screen. And of course, there's the $watchthis issue. It seems like PWD is probably a good system for a lot of wikis to use. It is a less "authoritarian" system and more in line with the wiki philosophy than deletion as we know it today. Tisane 10:44, 24 May 2010 (UTC)Reply

I'm not convinced that this is a good addition to core. While it is certainly a very defensible system, and one which many wikis might find desirable, it is a significant change to the current behaviour which should not be forced on all wikis whether or not they want it. Its behaviour is ideally suited to an extension. Hooks can be redesigned and expanded, if necessary. Happymelon 15:12, 24 May 2010 (UTC)Reply
If it were merged, it would have to be a feature that could be turned on or off, and perhaps it could default to off. I guess I'll submit more hook revision requests to Bugzilla... 20:51, 24 May 2010 (UTC)
Exactly, and for such a big feature, that's an indication that it may not belong in core anyway. There are few other core configuration options which have such wide-ranging effect; patrolling is probably the most significant, and that's pretty minor in comparison. Happymelon 21:51, 24 May 2010 (UTC)Reply

Use sysops, or sysop-lites instead

edit

Maybe a better approach is to continue deleting pages (rather than blanking them) but just create a "Sysop Lite" user group that can view deleted revisions and delete and undelete. E.g.:

$wgGroupPermissions['sysoplite']['delete'] = true;
$wgGroupPermissions['sysoplite']['bigdelete'] = true; // can be separately configured for pages with > $wgDeleteRevisionsLimit revs 
$wgGroupPermissions['sysoplite']['deletedhistory'] = true; // can view deleted history entries, but not see or restore the text
$wgGroupPermissions['sysoplite']['deletedtext'] = true; // can view deleted revision text
$wgGroupPermissions['sysoplite']['undelete'] = true;

All users who are not obvious vandals will be made "sysop lites" after a short time, much like what RationalWiki does: http://rationalwiki.org/wiki/RationalWiki:Sysops Or maybe all users can be made sysop lites from the get-go. Tisane 16:03, 7 July 2010 (UTC)Reply

Database error

edit

Upon installation, I get the following error:

A database query syntax error has occurred. This may indicate a bug in the software. The last attempted database query was:
   (SQL query hidden)
from within function "Database::selectRow". Database returned error "1146: Table 'wiki.blanked_page' doesn't exist (my server's name)".

How can I fix this? Cronje 03:45, 15 November 2010 (UTC)Reply

Nevermind, it looks like I was using an unsupported version of MediaWiki. Dolt. Cronje 03:56, 15 November 2010 (UTC)Reply
Return to "PureWikiDeletion" page.