Extension:ConfirmEdit/Patch

Patch for even more spam protection

This patch allows experienced users to add external links without solving a captcha, regardless of whether they have the skipcaptcha permission. A user is considered trusted if she has more than a configurable number of edits (100 by default).

This patch also prohibits new users from adding _any_ external links. This should go a long way towards stopping spam, because the whole point of spam is to add such links (spammers call it "link building"), and spam is almost always added by newly created accounts.
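
As a rough illustration of what counts as a "newly added" external link, the following minimal PHP sketch compares the links found in the old and the new page text. The helpers extractExternalLinks() and newlyAddedLinks() and the regular expression are assumptions made for this example only. The patch itself takes its links from MediaWiki's parser and link tables and also runs them through ConfirmEdit's filterLink() step, which is left out here.

```php
<?php
// Hypothetical helper for illustration; the patch relies on MediaWiki's
// parser output rather than on a regular expression.
function extractExternalLinks( $text ) {
	preg_match_all( '~https?://[^\s\]<>"]+~i', $text, $matches );
	return array_unique( $matches[0] );
}

// Hypothetical helper: links already present in the old text are not new.
function newlyAddedLinks( $oldText, $newText ) {
	return array_values( array_diff(
		extractExternalLinks( $newText ),
		extractExternalLinks( $oldText )
	) );
}

$old = "See [http://reprap.org RepRap] for details.";
$new = "See [http://reprap.org RepRap] and http://example.com/spam for details.";

// Only http://example.com/spam is reported, because the reprap.org link
// was on the page before the edit.
print_r( newlyAddedLinks( $old, $new ) );
```

An edit that merely keeps or rearranges existing links therefore yields an empty list and is not affected by the newbie rule.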

Configuration:

Apply the patch to extensions/ after unpacking ConfirmEdit 1.2. If you want to deviate from the defaults, add this to LocalSettings.php:

# Don't ask for a captcha for users with more than this number of edits.
$wgCaptchaTrustedThreshold = 150; // default 100
# Always reject edits that add new external links if the user has fewer than this number of edits.
$wgCaptchaNewbieThreshold = 10;   // default 5
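
How the two thresholds divide users can be sketched as follows. This is a simplified model, not code from the patch: classifyEditor() is a hypothetical helper, and its parameters merely stand in for the edit count, $wgCaptchaNewbieThreshold and $wgCaptchaTrustedThreshold. The comparisons (strictly fewer than the newbie threshold, strictly more than the trusted threshold) follow the patch below.

```php
<?php
// Hypothetical helper for illustration; not part of ConfirmEdit or the patch.
function classifyEditor( $editCount, $newbieThreshold = 5, $trustedThreshold = 100 ) {
	if ( $editCount < $newbieThreshold ) {
		// Edits that add new external links are rejected outright.
		return 'newbie';
	}
	if ( $editCount > $trustedThreshold ) {
		// No captcha is requested at all.
		return 'trusted';
	}
	// The usual ConfirmEdit rules apply.
	return 'regular';
}

// With the example settings above (10 and 150):
echo classifyEditor( 3, 10, 150 ), "\n";   // newbie
echo classifyEditor( 10, 10, 150 ), "\n";  // regular
echo classifyEditor( 150, 10, 150 ), "\n"; // regular
echo classifyEditor( 151, 10, 150 ), "\n"; // trusted
```

Note that a user sitting exactly on a threshold counts as a regular user in both cases.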

The patch isn't as well honed as it could be; for example, user messages aren't localized. Also, the ban on newbies adding external links applies no matter which permissions a user has. Other than that, it appears to work just fine. For a wiki using this patch, see http://reprap.org.

Markus "Traumflug" Hitter, September 2013, <mah@jump-ing.de>

--- ConfirmEdit/Captcha.php.2013-09-06	2013-05-28 12:10:57.000000000 -0700
+++ ConfirmEdit/Captcha.php	2013-09-08 10:30:12.000000000 -0700
@@ -1,6 +1,9 @@
 <?php
 
 class SimpleCaptcha {
+
+	private $newLinks = null;
+
 	function getCaptcha() {
 		$a = mt_rand( 0, 100 );
 		$b = mt_rand( 0, 10 );
@@ -62,6 +65,21 @@
 	}
 
 	/**
+	 * Insert the newbie link-rejection notice into an edit form.
+	 * @param OutputPage $out
+	 */
+	function newbieCallback( &$out ) {
+		global $wgCaptchaNewbieThreshold, $wgUser;
+		$count =  $wgUser->getEditCount();
+		$out->addWikiText( "As a spam protection measure, users with fewer than
+		                    $wgCaptchaNewbieThreshold edits are '''not''' allowed
+				    to add ''external'' links. Please use internal links
+				    instead (which is always to be preferred) or raise
+				    your edit count by doing a few edits without links.
+				    \n\nYour edit count is $count.\n\n" );
+	}
+
+	/**
 	 * Show a message asking the user to enter a captcha on edit
 	 * The result will be treated as wiki text
 	 *
@@ -237,6 +255,34 @@
 	/**
 	 * @param $editPage EditPage
 	 * @param $newtext string
+	 * @param $merged bool
+	 * @return array the newly added external links
+	 */
+	function findNewLinks( &$editPage, $newtext, $merged, $section = '' ) {
+		if ( $this->newLinks !== null )
+			return $this->newLinks;
+
+		if ( $merged ) {
+			// Get links from the database
+			$oldLinks = $this->getLinksFromTracker( $editPage->mArticle->getTitle() );
+			// Share a parse operation with Article::doEdit()
+			$editInfo = $editPage->mArticle->prepareTextForEdit( $newtext );
+			$newLinks = array_keys( $editInfo->output->getExternalLinks() );
+		} else {
+			// Get link changes in the slowest way known to man
+			$oldtext = $this->loadText( $editPage, $section );
+			$oldLinks = $this->findLinks( $editPage, $oldtext );
+			$newLinks = $this->findLinks( $editPage, $newtext );
+		}
+
+		$unknownLinks = array_filter( $newLinks, array( &$this, 'filterLink' ) );
+		$this->newLinks = array_diff( $unknownLinks, $oldLinks );
+		return $this->newLinks;
+	}
+
+	/**
+	 * @param $editPage EditPage
+	 * @param $newtext string
 	 * @param $section string
 	 * @param $merged bool
 	 * @return bool true if the captcha should run
@@ -246,6 +292,18 @@
 		$title = $editPage->mArticle->getTitle();
 
 		global $wgUser;
+
+		// Users with more than $wgCaptchaTrustedThreshold edits
+		// are considered to be trusted, so they don't need a captcha.
+		global $wgCaptchaTrustedThreshold;
+		if ( ! isset( $wgCaptchaTrustedThreshold ) )
+			$wgCaptchaTrustedThreshold = 100;
+		if ( $wgUser->getEditCount() > $wgCaptchaTrustedThreshold ) {
+			wfDebug( "ConfirmEdit: trusted user, skipping captcha\n" );
+			return false;
+		}
+
 		if ( $wgUser->isAllowed( 'skipcaptcha' ) ) {
 			wfDebug( "ConfirmEdit: user group allows skipping captcha\n" );
 			return false;
@@ -284,22 +341,7 @@
 		}
 
 		if ( $this->captchaTriggers( $editPage, 'addurl' ) ) {
-			// Only check edits that add URLs
-			if ( $merged ) {
-				// Get links from the database
-				$oldLinks = $this->getLinksFromTracker( $title );
-				// Share a parse operation with Article::doEdit()
-				$editInfo = $editPage->mArticle->prepareTextForEdit( $newtext );
-				$newLinks = array_keys( $editInfo->output->getExternalLinks() );
-			} else {
-				// Get link changes in the slowest way known to man
-				$oldtext = $this->loadText( $editPage, $section );
-				$oldLinks = $this->findLinks( $editPage, $oldtext );
-				$newLinks = $this->findLinks( $editPage, $newtext );
-			}
-
-			$unknownLinks = array_filter( $newLinks, array( &$this, 'filterLink' ) );
-			$addedLinks = array_diff( $unknownLinks, $oldLinks );
+			$addedLinks = $this->findNewLinks( $editPage, $newtext, $merged, $section );
 			$numLinks = count( $addedLinks );
 
 			if ( $numLinks > 0 ) {
@@ -469,6 +511,21 @@
 			# The CAPTCHA was already checked and approved
 			return true;
 		}
+
+		// Always reject edits of newbies adding external links.
+		global $wgCaptchaNewbieThreshold, $wgUser;
+		if ( ! isset( $wgCaptchaNewbieThreshold ) )
+			$wgCaptchaNewbieThreshold = 5;
+		if ( $wgUser->getEditCount() < $wgCaptchaNewbieThreshold ) {
+			$addedLinks = $this->findNewLinks( $editPage, $newtext, $merged, $section );
+			if ( count( $addedLinks ) > 0 ) {
+				wfDebug( "ConfirmEdit: rejecting newbie edit due to new links\n" );
+				$editPage->showEditForm( array( &$this, 'newbieCallback' ) );
+				return false;
+			}
+		}
+
 		if ( !$this->doConfirmEdit( $editPage, $newtext, $section, $merged ) ) {
 			$editPage->showEditForm( array( &$this, 'editCallback' ) );
 			return false;

Experience with this patch

After two months with this patch, we are still waiting for the first spam edit. Apart from spambots still creating accounts, misuse of our wiki has completely disappeared.

Legitimate users apparently understand the error message. There have been no complaints, but occasionally a user makes useless edits just to raise the edit count. Typically, these users revert their useless edits without maintainer intervention. Exactly as planned.

-- Traumflug@reprap.org