
Rel="nofollow" Follow Up

The rel="nofollow" attribute is certainly coming into effect already, with quite a few prominent weblogs implementing it themselves or installing a patch, update or plugin.

In my previous comment about it, I mentioned that I felt it isn’t the search engines’ job to filter out spam, and that it should rest on the owner of the site to make sure their particular backyard on the internet is mowed.

With that in mind, we clearly need to come up with some alternative methods to combat spam. There are a few options which would slow down most spammers, but not all. Let’s investigate a few of them.

The first is mandatory registration on your site to leave a comment. The problem with forced registration is that it doesn’t lend itself to someone following a link to your site and leaving a comment. Signing up on every site is just a pain in the arse, you know it and so do I, so for the moment, that isn’t an option.

Secondly, I think forcing comment moderation is an option. However, if you have an active blog, the inherent workload for the owner is quite high. There is also the downside that people leaving comments on your site can’t view them, or interact with other users, until you approve their comments. Not ideal, so we’ll leave it for the moment.

Third, and this one isn’t all that likely: allow anyone to post comments to your site, but examine each comment for spam content before it goes live. This is fine, except where a spammer leaves a comment that doesn’t look like spam but carries a link. At that point, it gets posted and they get their reward. We could take it further and parse their input, pull down the page they are linking to and scan its HTML for banned keywords (in the same line of thinking as Squid might if it were proxying content).
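To make the third idea a little more concrete, here is a rough sketch in Python of checking both the comment and the pages it links to. The keyword list, the function names and the `fetch_page` callable are all my own assumptions for illustration; you would pass in whatever HTTP fetcher your site already uses, and a real keyword list would need far more care than this one.

```python
import re
from typing import Callable, Iterable

# Hypothetical blocklist; a real one would be longer and tuned over time.
BLOCKED_KEYWORDS = {"viagra", "casino", "poker"}

URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")
TAG_PATTERN = re.compile(r"<[^>]+>")


def contains_blocked_keyword(text: str, keywords: Iterable[str] = BLOCKED_KEYWORDS) -> bool:
    """Return True if any blocked keyword appears as a whole word in text."""
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    return any(keyword in words for keyword in keywords)


def looks_like_spam(comment: str, fetch_page: Callable[[str], str]) -> bool:
    """Check the comment itself, then the text of every page it links to."""
    if contains_blocked_keyword(comment):
        return True
    for url in URL_PATTERN.findall(comment):
        try:
            page_html = fetch_page(url)
        except Exception:
            # Assumption: a link we cannot fetch is treated as suspicious.
            return True
        # Crude tag stripping so only the visible-ish text is scanned.
        if contains_blocked_keyword(TAG_PATTERN.sub(" ", page_html)):
            return True
    return False
```

A plain keyword match like this will produce false positives and is easy to dodge with misspellings, so I would treat it as one signal among several, not a verdict.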

Fourth, and this is really a category of tactics: user interrogation when they post. For instance, before they can post, they have to enter a string that is blurred within an image (done before). What about a random but easily answered question? This line of thinking, I think, would make it much harder for spammers to automate their attacks, especially if the challenge was random.
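The random question idea could look something like the sketch below. The question pool, the `session` dictionary and both function names are made up for the example; I am assuming your blogging software gives you some per-visitor session storage to keep the expected answer on the server side.

```python
import random

# Hypothetical question pool; answers are compared case-insensitively.
QUESTIONS = [
    ("What colour is the sky on a clear day?", "blue"),
    ("How many legs does a cat have? (digit)", "4"),
    ("What is the opposite of hot?", "cold"),
]


def new_challenge(session, rng=None):
    """Pick a random question, remember it in the session, return its text."""
    if rng is None:
        rng = random.SystemRandom()
    index = rng.randrange(len(QUESTIONS))
    session["challenge_index"] = index
    return QUESTIONS[index][0]


def check_answer(session, answer):
    """Validate the answer; each challenge can only be attempted once."""
    index = session.pop("challenge_index", None)
    if index is None:
        return False
    return answer.strip().lower() == QUESTIONS[index][1]
```

With only three questions a bot could simply learn the answers, so the pool would have to be large or generated on the fly for this to be worth anything.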

Fifth, change the way we accept comments. For instance, most spammers will pick a particular type of blogging software and attack it because it is simple. Look at MT: when you submit a comment with that software, the feedback is always posted to comments.cgi or the like. If I were a spammer, that would make my life very simple. Make it more complex: let’s make the submission URL synthetic, so they can’t hardcode it. Let’s link the synthetic URL to their session id and make it available for only x minutes at a time. Check that the referrer for the submission is in fact your own site and that the HTTP header information is all there and intact.
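Here is one way the synthetic URL could be sketched, assuming an HMAC signature over the session id and an issue time. The secret, the ten minute lifetime, the host name, the `/comment/` path layout and the function names are all placeholders of mine and not taken from any existing blogging package.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import urlparse

SECRET_KEY = b"change-me"   # hypothetical server-side secret
TOKEN_LIFETIME = 10 * 60    # the "x minutes" above, here 10, in seconds
SITE_HOST = "example.com"   # hypothetical host name of your own site


def _sign(session_id: str, issued_at: int) -> str:
    message = f"{session_id}:{issued_at}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def make_submit_path(session_id: str, now: Optional[float] = None) -> str:
    """Build a submission path that is only valid for this session, for a while."""
    issued_at = int(time.time() if now is None else now)
    return f"/comment/{issued_at}-{_sign(session_id, issued_at)}"


def is_valid_submission(path: str, session_id: str, referer: Optional[str],
                        now: Optional[float] = None) -> bool:
    """Check the token's age, its signature, and that the referrer is our own site."""
    current = time.time() if now is None else now
    token = path.rsplit("/", 1)[-1]
    issued_text, _, signature = token.partition("-")
    if not issued_text.isdigit():
        return False
    issued_at = int(issued_text)
    if not 0 <= current - issued_at <= TOKEN_LIFETIME:
        return False
    expected = _sign(session_id, issued_at)
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return False
    return urlparse(referer or "").hostname == SITE_HOST
```

Keep in mind that the referrer header is supplied by the client and is trivial to forge, so that last check only filters out the laziest scripts. A spammer who loads the form first and scrapes the path from it will still get through, just with more work per comment.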

At this point, I haven’t thought the fifth item right through; however, I feel that there might actually be some merit in it. What about a combination of all of them, varying from submission to submission, just to keep them guessing a little?

What ideas have crossed your mind about it?

Rel="nofollow"

Recently a bunch of people (precompiled list, thanks Google) decided to make an attempt at reducing the significance of comment spam on websites.

The concept, in short, is that any links in feedback provided by a user will have the rel="nofollow" attribute in place. When the search engines index the page, they won’t count any link with rel="nofollow" as an incoming link to the site it points to, thus removing the reward for a spammer.
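For the curious, this is roughly what the publishing side amounts to. It is a minimal sketch that assumes comments arrive as plain text and that the software turns bare URLs into links itself; `render_comment` and the URL pattern are my own illustration, not code from any particular blogging package.

```python
import html
import re

URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def render_comment(text: str) -> str:
    """Escape a plain-text comment and link its URLs with rel="nofollow"."""
    parts = []
    last = 0
    for match in URL_PATTERN.finditer(text):
        # Escape the text between links so visitors cannot inject markup.
        parts.append(html.escape(text[last:match.start()]))
        url = html.escape(match.group(0), quote=True)
        parts.append(f'<a href="{url}" rel="nofollow">{url}</a>')
        last = match.end()
    parts.append(html.escape(text[last:]))
    return "".join(parts)
```

Every link produced this way carries the attribute, whether the commenter is a spammer or a regular.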

The reason I’m undecided about this method is that it will also remove the reward for a genuine user who gains popularity through participation. For the sake of example, let’s consider someone who participates online all the time and is an active part of the community, but does not feature in other sites’ blogrolls. He gains popularity for his site through participation on other people’s sites. Now, with rel="nofollow" in place, he loses that popularity, thus reducing his position in search results.

In my opinion, it is the site owner’s problem to make sure their site isn’t spammed (think of it like mowing your lawn) and the search engines’ job to rank the content.

I think there are alternatives and I’ll write about them shortly.