Combating Blogspot Referrer Spam, Hosted by Google

Hotlinking is simply not cool. Referrer spam is also not cool. I get both of these from four Blogspot sites, and have struggled to contain their mess. The problem is that they are hosted by Google, through its Blogger platform and GoogleUserContent.com. Blogger sites are free to set up, yet very difficult to kill. Here’s what I did to combat the problem.

Using Blogger as a Referrer Spam Platform

Here’s the scheme that was used on my site. Sign up for a Blogger site. Hotlink to images on other people’s sites. When your Blogger site becomes popular, all the other sites will waste lots of bandwidth and receive a lot of referrer spam, which they will be unable to kill. When the other sites complain to Google, as the host of the Blogger site, Google will claim that hotlinking is not illegal. The other sites are then stuck with hosting images for the Blogger site as well as receiving its referrer spam. You can still report Blogspot copyright issues to Google, and file a DMCA notice.

Hotlinking is when you link directly to an image on someone else’s web site to display it on your own website. You steal their image and, more importantly, you steal their bandwidth. Blogspot has been an oasis for hotlinking images from my site, and I struggled to contain them, as they burned through my bandwidth. Their sites send me a lot of referrer spam, which I do not appreciate getting. I’m sure this happens to a lot of other web sites as well.

Usually referrer spam is easy to remove through an htaccess referrer rule, but somehow Blogger has done something to evade these bans. I kept seeing these Blogspot referrals daily in my logs.

1. Ban these Blogspot sites using htaccess HTTP_REFERER rules.

RewriteCond %{HTTP_REFERER} ^.*ilpuntoantico [OR]
RewriteCond %{HTTP_REFERER} ^http://.*blogspot(\.al|\.ca|\.ch|\.co|\.com|\.de|\.fr|\.gr|\.in|\.it|\.md|\.mx|\.my|\.nl|\.si) [OR]

Result: Ineffective, but I don’t know why. It does ban other referrers but not Blogspot. Blogspot laughed in my face.
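One way to narrow down a failure like this is to test the pattern itself against Referer values, outside Apache. The sketch below is not part of the original setup. It copies the second RewriteCond pattern into Python's re module, on the assumption that Python and Apache treat a pattern this simple the same way, and probes it with the referrer from the access log in this post plus two hypothetical variants.

```python
import re

# Pattern copied from the second RewriteCond in attempt 1.
RULE_1 = re.compile(
    r"^http://.*blogspot(\.al|\.ca|\.ch|\.co|\.com|\.de|\.fr|\.gr|\.in"
    r"|\.it|\.md|\.mx|\.my|\.nl|\.si)"
)

def rule_1_matches(referer: str) -> bool:
    """Return True if the attempt-1 pattern matches this Referer value."""
    return RULE_1.search(referer) is not None

# Referrer taken from the raw access log in this post.
logged = "http://ilpuntoantico.blogspot.it/2013_03_01_archive.html"
# Hypothetical variants, used only to probe the pattern's blind spots.
https_variant = "https://ilpuntoantico.blogspot.it/2013_03_01_archive.html"
unlisted_tld = "http://example.blogspot.jp/"

for referer in (logged, https_variant, unlisted_tld):
    print(rule_1_matches(referer), referer)
```

The pattern does match the logged referrer, which hints that the failure lay somewhere other than the regex. It would miss an https referrer or a country domain that is not in the list. Note also that RewriteCond lines only take effect through the RewriteRule that follows them, and the excerpt above does not show one.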

2. Ask my ISP, Site5, for help in banning this referrer spam epidemic.

Result: They tell me they do not offer this service. They are my hosting provider and this spam is running on their equipment, so you would think that it is in their best interest to rid themselves, and in turn me, of referrer spam and the bandwidth drain, but no.

3. Remove the hotlinked images from my site
Here are the image requests from one of the Blogspot sites, from my raw access log:

79.30.175.79 [24/Oct/2016:05:53:51 GET /wp/wp-content/uploads/2010/03/Singer306k.jpg HTTP/1.1 200 435 http://ilpuntoantico.blogspot.it/2013_03_01_archive.html Mozilla/5.0 (Windows NT 6.3; Trident/7.0; Touch; rv:11.0) like Gecko
79.30.175.79 [24/Oct/2016:05:53:51 GET /wp/wp-content/uploads/2010/03/Singer306kback.jpg HTTP/1.1 200 435 http://ilpuntoantico.blogspot.it/2013_03_01_archive.html Mozilla/5.0 (Windows NT 6.3; Trident/7.0; Touch; rv:11.0) like Gecko
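Lines like these can be tallied per referring host to see who is eating the bandwidth. The following is a minimal sketch, assuming the trimmed layout shown above (IP, timestamp, request, status, bytes, referrer, user agent) rather than the full Apache combined log format. The names parse_line and bytes_by_referrer_host are made up for this illustration.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Matches the trimmed layout used in this post:
# ip [timestamp METHOD path protocol status bytes referer user-agent
LOG_LINE = re.compile(
    r"^(?P<ip>\S+) \[(?P<time>\S+) (?P<method>\S+) (?P<path>\S+) "
    r"(?P<protocol>\S+) (?P<status>\d{3}) (?P<bytes>\d+) "
    r"(?P<referer>\S+) (?P<agent>.*)$"
)

def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None

def bytes_by_referrer_host(lines):
    """Sum the bytes sent, grouped by the host name of the referrer."""
    totals = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue  # skip lines that are not in the expected layout
        host = urlsplit(entry["referer"]).hostname or "(none)"
        totals[host] += int(entry["bytes"])
    return totals

# The two log lines quoted above.
sample = [
    "79.30.175.79 [24/Oct/2016:05:53:51 GET /wp/wp-content/uploads/2010/03/Singer306k.jpg HTTP/1.1 200 435 http://ilpuntoantico.blogspot.it/2013_03_01_archive.html Mozilla/5.0 (Windows NT 6.3; Trident/7.0; Touch; rv:11.0) like Gecko",
    "79.30.175.79 [24/Oct/2016:05:53:51 GET /wp/wp-content/uploads/2010/03/Singer306kback.jpg HTTP/1.1 200 435 http://ilpuntoantico.blogspot.it/2013_03_01_archive.html Mozilla/5.0 (Windows NT 6.3; Trident/7.0; Touch; rv:11.0) like Gecko",
]

for host, total in bytes_by_referrer_host(sample).most_common():
    print(host, total)
```

Run over a whole day's log, the same tally would rank every hotlinking host by bytes served, which makes it easier to decide which referrers are worth a rule.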

Smart me: how about just renaming the images, so that Singer306k.jpg and Singer306kback.jpg are no longer available? I can easily do this with FileZilla and a small change in my WordPress post. Unfortunately for me, Apache is efficient. These images are so popular, downloaded every day by the same blog, that there seems to be some special cache set up for them. Even though I no longer host images with the above names, my ISP still offers these images to Blogspot and any other site, and uses my bandwidth to do it. Lovely.

Result: Ineffective. Blogspot sites can still access my images, even though I have renamed them and they are no longer on my site.

There was the odd non-Blogspot site that hotlinked to these images. WordPress sends a very polite “File not Found” screen, which happens to be 3 times the size of my hotlinked image. This used more bandwidth than if they simply downloaded the image.

4. Ban by IP anyone who goes to the Blogspot sites and uses my hotlinked images

People go to the blogspot site, which loads my hotlinked images. Their IP addresses are sent to me when this occurs. I could log these IP addresses and try to find a pattern, and block their IP addresses.

Result: Time consuming, wastes my server resources, and ineffective. Two of my Blogspot troublemakers, kosmetik-freaks.blogspot.co.id and tanyadokterkeluarga.blogspot.co.id, were largely based in Indonesia, the 4th most populous country in the world, with 258,316,051 people. These Blogspot sites were accessed by cell and satellite phones, so the IP ranges were ridiculously out of control.

Though I was able to ban maybe one in 10, the effort required was certainly not worthwhile.
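Looking for a pattern in the visitor IPs amounts to grouping them by network and counting. Here is a rough sketch using Python's standard ipaddress module. The /24 grouping is an arbitrary choice for illustration, and one of the sample addresses is invented, as marked in the comments.

```python
import ipaddress
from collections import Counter

def count_by_network(ips, prefix=24):
    """Count how many of the given IPv4 addresses fall into each /prefix network."""
    counts = Counter()
    for ip in ips:
        # strict=False lets us pass a host address and get its enclosing network.
        network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
        counts[str(network)] += 1
    return counts

visitor_ips = [
    "79.30.175.79",   # from the access log in this post
    "79.30.175.12",   # hypothetical neighbour, for illustration only
    "51.36.124.238",  # from the access log in this post
]

for network, hits in count_by_network(visitor_ips).most_common():
    print(network, hits)
```

If most hits land in a handful of networks, a ban by range might pay off. When visitors come in over mobile carriers, as described above, the counts tend to spread thinly over a great many networks and the approach stops being practical.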

5. Contact GoogleUserContent.com and their abuse email

Do a host lookup of ilpuntoantico.blogspot.fr and you’ll find Google’s IP address of 172.217.0.161. Use whois.com and you can find network-abuse@google.com. So I emailed them and attached a portion of my log with some of the Blogspot referrer spam. After 3 weeks they still had not replied.

Try number 2: Use whois.com and search for GoogleUserContent.com, which yields abusecomplaints@markmonitor.com. Off my email goes with the same raw access log snippet. Interestingly, within a couple of hours I received 2 emails outlining the Blogger complaint process. They state that if there is no illegal activity, such as a copyright or legal infraction or human rights abuse, there is nothing they can do. They also state that MarkMonitor.com is Google’s domain name registrar and therefore cannot change any content anyway.

Result: Ineffective. Google will not help you with hotlinked images.

6. Ban these Blogspot sites from using my images using htaccess HTTP_REFERER rules.

RewriteCond %{HTTP_REFERER} ^http://(.+\.)?blogspot [NC]
RewriteRule .*\.(jpe?g|gif|bmp|png)$ http://i.imgur.com/qX4w7.gif [L]

This is turning out to be effective, and I am surprised. It uses the same HTTP_REFERER rule as in #1, but does not allow them to download images. I have checked the html on their site and my site’s image link info is available to them, but not the image.
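To see which requests this pair of lines would catch, the two patterns can be tried out in Python. This is only a sketch, assuming Python's re module handles these simple patterns the way Apache does. The function name would_rewrite is made up, and the https and upper-case cases are hypothetical probes.

```python
import re

# RewriteCond pattern; the [NC] flag makes it case-insensitive.
REFERER_COND = re.compile(r"^http://(.+\.)?blogspot", re.IGNORECASE)
# RewriteRule pattern; no [NC] flag, so it stays case-sensitive.
IMAGE_RULE = re.compile(r".*\.(jpe?g|gif|bmp|png)$")

def would_rewrite(referer: str, path: str) -> bool:
    """Return True if both the condition and the rule pattern match."""
    return bool(REFERER_COND.search(referer)) and bool(IMAGE_RULE.search(path))

blogspot = "http://ilpuntoantico.blogspot.it/2013_03_01_archive.html"
image = "wp/wp-content/uploads/2010/03/Singer306k.jpg"

print(would_rewrite(blogspot, image))                        # logged request
print(would_rewrite("https://x.blogspot.com/", image))       # hypothetical https referrer
print(would_rewrite(blogspot, image.replace(".jpg", ".JPG")))  # hypothetical upper-case name
print(would_rewrite("http://example.com/", image))           # unrelated referrer
```

Only the first call returns True. The condition is anchored to http://, so an https referrer would slip through, and the rule pattern would not catch an upper-case file extension. For a hard 403 instead of a substitute image, mod_rewrite documents an [F] flag that is normally paired with a `-` substitution.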

Interestingly, in my raw access log I receive a code 200, successful, but I only transmit 435 bytes, which is far smaller than the original images of 91k and 53k. The 435 bytes is even smaller than my error 404 “file not found” page. No error 403 or 500 is triggered, but a successful code 200. Go figure.

Result: Effective, but it is not a ban and does not stop Blogspot from hotlinking to my site in the future. While I would have preferred an error 403 or 500, a definitive ban, I can live with this. That they receive a successful code 200 is puzzling. They will also log as a legit referrer in Google Analytics.

Addendum, 2016-Oct-30: Other blogspot.com referrers are hitting my site. This time they are getting 404s because they are using the file names that I have already renamed. It is obvious that these file names are shared amongst spammers. Why did they not get stopped by the htaccess UA or referrer rule?

51.36.124.238 [29/Oct/2016:17:59:42 GET /wp/wp-content/uploads/2009/03/cochineal_beetle.jpg HTTP/1.1 404 19425 http://minamed-cuisine.blogspot.com/p/blog-page_03.html?m=1 Mozilla/5.0 (iPhone; CPU iPhone OS 10_0_2 like Mac OS X) AppleWebKit/600.1.4 (KHTML, like Gecko) GSA/20.3.136880903 Mobile/14A456 Safari/600.1.4

Apache htaccess. You are like an out of control and high maintenance mistress that partially cooperates but sometimes leaves you with deep gashes that will slowly, eventually, heal.

Image copyrighted by Don Tai @ 2017

2 DMCA notices filed with Google, 2017-Feb-13