Anti-spam features

MediaWiki provides the following features to reduce the problem of Wiki Spam.

Note that many of these features are not activated by default. If you are running a MediaWiki installation on your own server or host, you are the only one who can make the necessary configuration changes. By all means ask your users to help watch out for wiki spam (and do so yourself), but these days spam can easily overwhelm small wiki communities, so it helps to raise the bar a little. Note, however, that none of these solutions is completely spam-proof. Always revisit 'recent changes' periodically!

Content Banning Blacklists

$wgSpamRegex

To prevent a spammer from saving wiki edits with problematic content, use the variable '$wgSpamRegex' (in older versions of MediaWiki it was called '$wgSpamBlacklist'). Set the variable in LocalSettings.php, which overrides the value in DefaultSettings.php. Set it to a regular expression matching any URLs (or parts of URLs) that you do not want users to link to. You can also match any other bad content you wish to ban. When an edit matches, users are shown an explanatory message indicating which part of their edit text is not allowed.

Simple example $wgSpamRegex setting

Place a line like this somewhere in your LocalSettings.php file.

# If this Regular expression matches the text of an edit, then the edit is disallowed.
$wgSpamRegex = "/online-casino|buy-viagra|adipex|phentermine|adult-website\.com|display:none|overflow:\s*auto;\s*height:\s*[0-4]px;/i";

This prevents any mention of 'online-casino' or 'buy-viagra' or 'adipex' or 'phentermine'. The '/i' at the end makes the search case insensitive.

The example also prevents any reference to 'adult-website.com'. Clearly this kind of setting provides an easy way to get rid of a particular spammer if they keep coming back to your wiki.

Finally, the example also blocks certain CSS style attributes which have recently been used to hide spam in many attacks. Unfortunately there are many workarounds spammers can use, but for the time being this will get them off your back.

This is only a simple example. See $wgSpamRegex documentation for more detail.
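If the list of banned terms grows a little, one option is to keep the terms in an array and build the regular expression from it. The following is a minimal sketch for LocalSettings.php, not an official recipe. The variable names $spamTerms and $quoted are hypothetical helpers introduced here for illustration, and the terms are treated as literal strings rather than regex fragments.

```php
# Hypothetical helper list: each entry is a literal string, not a regex fragment.
$spamTerms = array(
    'online-casino',
    'buy-viagra',
    'adipex',
    'phentermine',
    'adult-website.com',
);

# preg_quote() escapes regex metacharacters (and the "/" delimiter) in each term,
# so the "." in "adult-website.com" matches only a literal dot.
$quoted = array();
foreach ( $spamTerms as $term ) {
    $quoted[] = preg_quote( $term, '/' );
}

# Join the escaped terms with "|" and make the match case insensitive.
$wgSpamRegex = '/' . implode( '|', $quoted ) . '/i';
```

Patterns that need real regex syntax, such as the CSS rule in the example above, would still have to be appended by hand.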

Longer spam blacklists

The above approach becomes too cumbersome if you attempt to block more than a handful of spammy URLs. A better approach is to keep a long blacklist identifying many known spamming URLs in a more readable format (not a single regular expression). To achieve this, you will need to use the SpamBlacklist extension. With this, you can allow some of your users to edit the blacklist on a wiki page, and you can fetch updates from external sources.
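As a rough sketch of what enabling the extension might look like in LocalSettings.php on a recent MediaWiki release: the loading call and the setting name below are taken from the extension's documentation as best recalled, and older releases were configured differently (for example with a require_once line), so check the documentation for the version you run. The blacklist sources listed are illustrative choices, not requirements.

```php
# Load the SpamBlacklist extension (assumes a MediaWiki version that
# supports wfLoadExtension; older versions used require_once instead).
wfLoadExtension( 'SpamBlacklist' );

# Where to read blacklist entries from. Both sources are assumptions
# chosen for illustration; replace them with the lists you trust.
$wgBlacklistSettings = array(
    'spam' => array(
        'files' => array(
            # Shared blacklist maintained on Wikimedia's Meta-Wiki
            'https://meta.wikimedia.org/w/index.php?title=Spam_blacklist&action=raw&sb_ver=1',
        ),
    ),
);
```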

Spam cleanup script

Blacklisting spam words or spammer domain names prevents future spam, but doesn't get rid of existing spam. In fact, if you allow existing spam to remain, the blacklist may interfere with people attempting to make legitimate edits to the affected pages. It's important to clean up as well as add to the blacklist. You can do this by hand, or if you have a widespread spam situation, you may find the spam cleanup extension useful. This script automatically goes back and removes matching spam from your wiki after you update the spam blacklist. It does this by scanning the entire wiki and, where spam is found, reverting to the latest spam-free revision. Some more information is available on wikia.com.

CAPTCHA images (extension now available)

The ConfirmEdit extension confirms that an edit is being made by a human, and not a spam bot. It does this by forcing users to type the text from a CAPTCHA image. By default this is only triggered if they have added a URL as part of their edit. The displayed message reads...

"Your edit includes new URL links; as a protection against automated spam, you'll need to type in the words that appear in this image".

CAPTCHAs have some disadvantages in terms of accessibility and inconvenience to your real human users. They also will not completely spam-proof your wiki. For starters, they will not stop human spammers.
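A minimal configuration sketch for LocalSettings.php, assuming the ConfirmEdit extension is installed in the extensions directory and that your MediaWiki version supports wfLoadExtension (older versions loaded extensions with a require_once line instead). The trigger values shown are one possible choice, not a recommendation.

```php
# Load ConfirmEdit (assumes it is installed under extensions/ConfirmEdit).
wfLoadExtension( 'ConfirmEdit' );

# Ask for a CAPTCHA only when an edit adds new external links.
$wgCaptchaTriggers['addurl'] = true;

# Also challenge account creation, to slow down automated registrations.
$wgCaptchaTriggers['createaccount'] = true;

# Do not challenge every ordinary edit.
$wgCaptchaTriggers['edit'] = false;
```

Keeping the 'edit' trigger off limits the inconvenience to the cases most often associated with spam.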

Proxy Blocker

As of version 1.4.1, MediaWiki has proxy blocking. The idea is to prevent the use of open proxies. Most spammers use open proxies to obscure their identity and to avoid IP address bans. Open proxies enable them to access a wiki, and make edits, from many different IP addresses.
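In MediaWiki versions of that era, the proxy scanner was switched on with a single setting in LocalSettings.php. This is a sketch under the assumption that your version still ships the built-in scanner; the setting was dropped from later releases, so check the configuration documentation for your version before relying on it.

```php
# Scan each editor's IP address for open HTTP proxies
# (assumption: only meaningful on older MediaWiki releases that include the scanner).
$wgBlockOpenProxies = true;
```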

rel=nofollow link attribute

MediaWiki uses the rel=nofollow link attribute. This tells search engines such as Google not to follow external links added by users, which removes the search-ranking benefit of spamming. Note that this does not prevent spam. Spammers generally don't notice the difference and will abuse your wiki anyway, but it does mean that they don't actually benefit from it.

By default, it is put on all external links, plus log and history pages. See NoIndexHistory. Note that putting it on all external links is a rather heavy-handed anti-spam tactic, which you may decide not to use (by switching off the rel=nofollow option). See Nofollow for a debate about this. It's good to have this as the installation default though, because it means that lazy administrators who are not thinking about spam problems will tend to have this option enabled.
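The relevant switches live in LocalSettings.php. The sketch below keeps the default behaviour and exempts one trusted site; the domain example.org is a hypothetical placeholder, and the exception setting is only available in MediaWiki versions that define it.

```php
# Default behaviour: add rel="nofollow" to external links in wiki text.
# Set this to false to switch the option off entirely.
$wgNoFollowLinks = true;

# Optionally exempt links to domains you trust
# (example.org is a placeholder, not a recommendation).
$wgNoFollowDomainExceptions = array( 'example.org' );
```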

Lock Down (Poor Solution)

You can disallow editing by anonymous users, forcing users to create an account with a username and sign in every time before editing. More extreme (and better spam protection) is to create a "gated community" in which new users (and spammers) cannot create a new account. They have to request one from you.

People often naively suggest lock-down as the best solution to wiki spam. It does reduce spam, but it is a poor solution (and a Lazy Solution), because you are introducing something which massively inconveniences real users. Having to choose a username and password is a big turn-off for many people. The wiki way is to be freely and openly editable. This "soft security" approach is one of the key strengths of the wiki concept. Are you going to let the spammers spoil that?

...if so, you can easily lock down your MediaWiki installation as follows:

Add the following to your LocalSettings.php

# Force people to register before they are allowed to edit
$wgGroupPermissions['*']['edit'] = false; 
$wgShowIPinHeader = false;

Note that this only reduces spam. In fact, these days MediaWiki installations are routinely targeted by more advanced spam bots which can perform automated registrations, so this setting will leave you with a lot of bogus user accounts (where the name is just a set of random letters) in the database. You should combine this with the CAPTCHA extension (above), which can keep bots out.

To take the lock down idea to extremes, MediaWiki allows you to create a "gated community" where new users can't even register without asking you to set up an account for them. To do this, add the following to your LocalSettings.php:

# Disallow creating accounts
$wgGroupPermissions['*']['createaccount'] = false;

See Help:User rights for more information.

lockdown patch

There is a "mediawiki spamfight" patch from tektank.it for MediaWiki 1.5.x which locks down your wiki, and makes some other tweaks:

  1. Only registered users can edit
  2. Accounts can be created only by providing a well-formed e-mail address.
  3. Account creation is restricted to a maximum of 1 (configurable) per client IP address
  4. The IP address of the client requesting a user account is recorded in the account, so you can use this data later for banning offending domains
  5. Hidden sections (CSS Hidden Spam) cannot be added anymore (Need to clarify what exactly is blocked)

This only works on old MediaWiki 1.5.x installations. See the readme.txt file inside the zip, for more information.

Other Ideas

This page lists features which are currently included or available as patches. On the discussion page you will find many other ideas for anti-spam features which could be added to MediaWiki, or which are under development.

There is now also a 'Spam Filter' project, dedicated to building more effective spam filtering for MediaWiki.

External links