Hotlinking is simply not cool. Referrer spam is also not cool. I get both from four Blogspot sites, and I have struggled to contain their mess. The problem is that these sites are hosted by Google on its Blogger platform (GoogleUserContent.com). Though Blogger is a free service, the sites on it are very difficult to kill. Here’s what I did to combat the problem.
Using Blogger as a Referrer Spam Platform
Apache, the web server and not the Indian tribe, is a fickle mistress. She is more than a little unpredictable, or at least it feels that way on Site5. I realize that Apache is a web server, software that should be perfectly logical, yet I often notice very odd behaviour. Maybe it is the server setup, caching, or even traffic volume; I do not know. I do know that if you have a certain kind of error in your htaccess file, Apache will start reporting visitors as a mix of IP addresses and host names. Once you fix the error, which no one can point out and for which there is no error message to go by, you will be back to only IP addresses.
My htaccess file is getting large as I continually ban more of the world's bad bots. As it grows, there are bound to be more mistakes. One of them can occur in “deny from” lines, which account for the vast majority of lines in the file. If any alphabetic characters slip into the IP address on a “deny from” line, Apache treats that entry as a host name and starts doing host lookups on visitors, reporting host names instead of IP addresses wherever a lookup succeeds. This means that some spammers’ IP addresses will be hidden behind bogus host names. For accuracy it is best for Apache to return IP addresses. With the IPs you can do host and search lookups, find the spammers, and ban them.
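Checking hundreds of “deny from” lines by eye is tedious, so a small script can do it. Here is a minimal sketch in Python, not something from my actual setup: the function name `find_suspect_deny_lines` and the sample lines are made up for illustration. It assumes that anything on a “deny from” line other than `all`, an `env=` test, or an IP-style value (full address, partial IPv4 prefix, CIDR or netmask range, IPv6) is worth a second look.

```python
import ipaddress
import re

# Matches "deny from ..." in any letter case and captures the arguments.
DENY_RE = re.compile(r"^\s*deny\s+from\s+(.*)$", re.IGNORECASE)


def is_ip_like(token):
    """True for IP addresses, CIDR/netmask ranges, and partial IPv4 prefixes."""
    # Partial prefixes such as "10.1" or "192.168." (one to three octets).
    if re.fullmatch(r"\d{1,3}(\.\d{1,3}){0,2}\.?", token):
        return True
    try:
        # Accepts "1.2.3.4", "1.2.3.0/24", "10.1.0.0/255.255.0.0", and IPv6.
        ipaddress.ip_network(token, strict=False)
        return True
    except ValueError:
        return False


def find_suspect_deny_lines(text):
    """Return (line_number, bad_tokens) for each questionable deny line."""
    suspects = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = DENY_RE.match(line)
        if not match:
            continue
        bad = [
            token
            for token in match.group(1).split()
            if token.lower() != "all"
            and not token.lower().startswith("env=")
            and not is_ip_like(token)
        ]
        if bad:
            suspects.append((lineno, bad))
    return suspects


if __name__ == "__main__":
    sample = "\n".join([
        "order allow,deny",
        "deny from 192.168.1.1",
        "deny from 10.20.3O.4",      # letter O typed instead of zero
        "Deny from 1.2.3.4 spammer", # stray note after the address
        "deny from all",
        "allow from all",
    ])
    for lineno, bad in find_suspect_deny_lines(sample):
        print(f"line {lineno}: check {bad}")
```

To run it against a real file, read the file into a string and pass it to the function. Note that the script also flags deliberate host-name bans such as a domain name, since those trigger host lookups just the same.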
Check htaccess Deny From lines for Alpha Characters
The web is said to be about free access, and I certainly agree. When China’s Great Firewall entered a more rigorous phase and Google decided to leave China, some said that free access to information on the internet was a basic human right. I disagreed. Still, here in Toronto, Canada, I do appreciate open internet access. There are limits, however, when certain people take advantage of your hospitality. People scrape your site for their own purposes, they try to break in and use your site to launch their own malicious doings, and they spam you so that your site’s comments boost their link and trackback stats. There are all kinds of schemes that cost the site owner bandwidth, and eventually money. The site owner is forced to upgrade his level of service from his ISP (or get kicked off of his shared service), or move to another ISP. This is not a zero-sum issue: the site owner loses financially.