Well, Ted’s announcement that we will be clamping down on people trying to game the system didn’t go down too well, with a great many of you concerned that you’d suddenly be banned for making an honest mistake. There’s also been a great deal of confusion over just what it is we’re going after. So, let’s start over, and I’ll explain exactly how it’s going to work.
First, a bit of background. I don’t condone hacks, spoofs, exploits or anything like that. As a programmer, though, I can certainly understand what would drive someone to work this stuff out, and honestly some of the ‘solutions’ out there are impressive, to say the least. That said, we’re not talking about anything here that you could ever do by accident. It’s next to impossible to set your blog up and, quite by accident, dump a ranking spoof in there.
What we’re looking for are ways that people ‘boost’ their rank through nefarious means. I’ll explain, one by one.
The Google Page Rank Exploit
This one is quite old, and a few sites out there detailing how it’s done now claim that it doesn’t work any more. It does. We came across a couple of sites recently doing just this exploit. They either stopped doing it, or went elsewhere after we spoke with them.
The principle is shown in figure 1 above. Either using JavaScript, or else by modifying the web server’s request handling behavior, a different result is served based on whether the request came from a web browser or from one of Google’s many automated spiders.
The idea is a simple one. If a web browser asks for a page, then we’ll show it. If on the other hand it’s Google crawling the site, redirecting the crawler to a much higher traffic site effectively makes Google think that we are a part of the bigger site. When Page Rank lookups are done, the page rank that’s returned is actually the page rank of the much bigger site (amazon.com, microsoft.com, apple.com are all common targets) rather than that of my blog. Simple, devious, and not nice. Don’t do this.
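To make that concrete, here is a minimal sketch of how a check for the server-side version of this trick might look. It is not the script we actually run, and the Response class and looks_cloaked function are hypothetical names made up for illustration. It assumes you have already requested the same page twice, once with a normal browser user agent and once with a crawler user agent, and it does not cover the JavaScript variant.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urljoin, urlsplit

REDIRECT_STATUSES = (301, 302, 303, 307, 308)


@dataclass
class Response:
    """What the server sent back for one request (hypothetical, simplified)."""
    status: int
    location: Optional[str] = None  # value of the Location header, if any


def host_of(url: str) -> str:
    """Return the lower-cased host of a URL, without a leading 'www.'."""
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def looks_cloaked(page_url: str, as_browser: Response, as_crawler: Response) -> bool:
    """True if the browser gets the page but the crawler is sent to another host."""
    if as_browser.status != 200:
        return False  # the browser did not get a normal page, so nothing to compare
    if as_crawler.status not in REDIRECT_STATUSES or as_crawler.location is None:
        return False  # the crawler was not redirected at all
    # Resolve relative redirects against the page before comparing hosts.
    target = urljoin(page_url, as_crawler.location)
    return host_of(target) != host_of(page_url)
```

A redirect within the same site, or a redirect that browsers receive as well, is not flagged. Only the case where the crawler alone is sent off to a different host counts.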
The Alexa Image Redirect exploit
This is a relatively new exploit, but it’s been around in different forms for quite some time. It’s a way for a group of ‘friends’ to share Alexa rank, and is achieved by embedding invisible images on a page that are loaded through an Alexa redirect out to the other sites.
For example, if I post an image on my site and you go ahead and link it on yours, that’s fine. You obviously found my image useful in some way, and the readers of your blog will be driving traffic to mine because they are interested in the image too. It’s part of the content of the blog.
However, if you go ahead and embed an invisible image on your blog and force that image to be loaded through an Alexa redirect, then that raises the question of “why?”. It’s not an advertising tracking dot from us or anyone else, since they would never ever go through Alexa’s redirect.alexa.com url. It’s not content on your blog either, since it’s invisible. The only reason anyone would ever embed an image on a blog that loads through an Alexa redirect is to ‘spoof’ traffic. The idea is that we all share the same type of dots, and thus any visitor to my site, your site or any other site in the group causes a hit across all our sites, but through Alexa. The net result is a sizeable jump in Alexa ranks.
The most common tag you’ll see doing this in a page looks like this:
<img src="http://redirect.alexa.com/redirect?http://www.mypalssite.com/images/someimage.jpg" style="width: 3px; height: 3px;" rel="nofollow">
…there’s usually some formatting around this as well to make sure the image doesn’t show up, or to make it appear as something different (a small green square, a red dot, and so on). If I really wanted to link to the other site and give them traffic I could just do this:
<img src="http://mypalssite.com/images/someimage.jpg">
It would have the same effect, almost. The big difference is that the traffic is not being forced through Alexa and thus causing Alexa to gather inaccurate data on a site.
Please note though, this is very different to a blog roll. If you have a blog roll on your site, then you have clickable links or images that take the browser elsewhere. For example:
<a href="http://www.payperpost.com">PayPerPost.com</a>
and
<a href="http://codehappy.wordpress.com">My own blog</a>
Those links could be set up to redirect through Alexa when clicked, and that would be fine. The reader of the site genuinely wanted to go to those sites, since they clicked the links. The problem is only when the links are images that are automatically loaded through Alexa when the page loads.
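The difference between an auto-loaded image and a clicked link is easy to express in code. Here is a rough sketch of such a check, using only Python’s standard library. Again, this is an illustration rather than the script we run, and the names AlexaImageFinder and find_alexa_redirect_images are hypothetical. It flags img tags whose src goes through redirect.alexa.com and deliberately ignores ordinary links.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

ALEXA_REDIRECT_HOST = "redirect.alexa.com"


class AlexaImageFinder(HTMLParser):
    """Collect <img> sources that are loaded through Alexa's redirect host."""

    def __init__(self):
        super().__init__()
        self.suspect_images = []

    def handle_starttag(self, tag, attrs):
        # Only images count: they load automatically with the page.
        # <a href=...> links need a click, so they are deliberately ignored.
        if tag != "img":
            return
        src = dict(attrs).get("src") or ""
        if (urlsplit(src).hostname or "").lower() == ALEXA_REDIRECT_HOST:
            self.suspect_images.append(src)


def find_alexa_redirect_images(html: str) -> list:
    """Return the src of every image in the page that loads via Alexa's redirect."""
    finder = AlexaImageFinder()
    finder.feed(html)
    finder.close()
    return finder.suspect_images
```

Run against a page containing the spoof tag, the plain image tag and a blog roll link, only the first one is reported. A directly linked image and a clickable link, even one that goes through Alexa, both come back clean.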
Hope that clears everything up for everyone.
Finally, these are just the two most prominent ranking exploits out there. I started messing around with the idea of scripting the identification of these two a week or two back, and that’s basically the script that we’re going to run to identify wrongdoers. There are others though, other ways of doing dishonest things like this, and I will be extending the script as I come across them.