Category: bot

WordPress Comment Spam Methods

Hate, we do, all comment spam. They post, we delete, but I actively ban. Still, they come back for more. It must be economically worthwhile for these people to keep doing this, because there is no end in sight. Comment spam is here to stay. Innovations are bound to happen, so I’ve logged what I have learned.

You will need to utilize your raw access log to see these techniques in action.

Your typical comment spam

Bots and Online Concert Ticket Sales: Humans Will Lose

Lose, humans will, in a competition with a bot. Smarter people in the stock market know this and acknowledge that bots play a major role in online trading. The same acknowledgment has yet to happen in online ticket sales. The CBC documented a former bot operator explaining how bots rig the system. I agree with him, and something must be done about it. Tragically Hip concert tickets selling out to bots before humans means that fans will pay a huge premium for tickets, and none of this premium will go to the artists. This is simply not right.

Keeping Pinterest in an Ocean of AWS Bots

Big Weed told me to not ban Pinterest. While I am not a huge Pinterest fan, she is/was so I listen to her. The problem is that Pinterest is hosted on Amazon Web Services (AWS), a cloud host provider infamous for hosting bad bots. Here are the IP ranges to ban AWS but keep Pinterest coming back.

# AWS 52.192.0.0 – 52.223.255.255 52.192.0.0/11
deny from 52.192.0.0/13 52.200.0.0/16 52.201.0.0/17 52.201.128.0/18 52.201.192.0/19 52.201.224.0/20 52.201.240.0/21
# Pinterest 52.201.248.0/24 52.201.249.0/24
deny from 52.201.250.0/23 52.201.252.0/22 52.202.0.0/15 52.204.0.0/14 52.208.0.0/12
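A carve-out like this is easy to get wrong by one bit, so here is a minimal sketch for checking it with Python's standard ipaddress module. It assumes the two Pinterest /24 blocks from the comment above can be treated as a single 52.201.248.0/23, and the function name carve_out is only illustrative.

```python
import ipaddress

# The whole AWS block and the Pinterest hole, taken from the comments above.
aws_block = ipaddress.ip_network("52.192.0.0/11")
pinterest = ipaddress.ip_network("52.201.248.0/23")  # 52.201.248.0/24 + 52.201.249.0/24

# The ranges listed on the two "deny from" lines.
denied = [ipaddress.ip_network(cidr) for cidr in """
52.192.0.0/13 52.200.0.0/16 52.201.0.0/17 52.201.128.0/18 52.201.192.0/19
52.201.224.0/20 52.201.240.0/21
52.201.250.0/23 52.201.252.0/22 52.202.0.0/15 52.204.0.0/14 52.208.0.0/12
""".split()]


def carve_out(block, keep):
    """Return the CIDR ranges that cover `block` except for `keep`."""
    return sorted(block.address_exclude(keep))


expected = carve_out(aws_block, pinterest)
print("deny from " + " ".join(str(net) for net in expected))
print("matches the deny list:", expected == sorted(denied))
print("Pinterest left open:", not any(net.overlaps(pinterest) for net in denied))
```

For the ranges above, the computed list and the deny list agree, and none of the denied ranges overlaps the Pinterest block. The same function can regenerate the deny lines if the ranges to keep ever change.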

Why I Banned Amazon Web Services AWS

My friend was surprised when I told him that I banned all IP ranges of Amazon Web Services (AWS) from my site. It is particularly ironic considering that we both had recently attended an AWS Cloud Computing IoT presentation, which was well done and interesting to both of us.

AWS accounts for a huge chunk of the world’s cloud computing, and my decision to ban all of its IP ranges did not come lightly. I simply could not keep up with all the comment spammers and scrapers coming out of AWS. It seems I am not alone: others have had the same experience.

ilpuntoantico.blogspot Referrer Spam: Research, Ban

ilpuntoantico refers to punto antico (literally "antique stitch"), Italian openwork needlepoint. The spam is hosted by GoogleUserContent.com. I have tried to contact Google (network-abuse@google.com, 2016-Sep-29) for help in stopping this referrer spammer, but they have not replied. I have also contacted GoogleUserContent.com (abusecomplaints@markmonitor.com, 2016-Oct-19) for help in banning this hotlinking of images. This group of sites burns up a lot of bandwidth every day.

Conclusion: Filed a DMCA request with Google. Awaiting their response.

2.33.130.129
2.33.130.129
2.33.130.134

net-2-33-160-7.cust.dsl.teletu.it 2.33.160.7
net-2-33-160-7.cust.dsl.teletu.it
net-2-33-160-7.cust.dsl.teletu.it
net-2-33-160-7.cust.dsl.teletu.it

net-2-35-34-88.cust.vodafonedsl.it
net-2-35-34-88.cust.vodafonedsl.it
net-2-35-34-88.cust.vodafonedsl.it
net-2-35-34-88.cust.vodafonedsl.it

net-2-36-2-24.cust.vodafonedsl.it
net-2-36-2-24.cust.vodafonedsl.it
net-2-36-2-24.cust.vodafonedsl.it
net-2-36-2-24.cust.vodafonedsl.it
net-2-36-2-24.cust.vodafonedsl.it
net-2-36-2-24.cust.vodafonedsl.it

2.41.88.145
2.41.88.145
2.41.88.145

5.90.4.169
5.90.4.169
5.90.4.239

tanyadokterkeluarga.blogspot Referrer Spam: Research, Ban

tanyadokterkeluarga.blogspot is a persistent referrer spammer. They use a huge number of IP addresses that do not repeat the third octet. Their strategy is similar to that of kosmetik-freaks.blogspot, and the two even share identical IP ranges. They are sister referrer spammers. Neither is stopped by an HTTP_REFERER ban in htaccess. If you kill one you kill the other, a nice double prize. As with the sister, this spammer runs out of Indonesia.

These are the referrers:
tanyadokterkeluarga.blogspot.ca
tanyadokterkeluarga.blogspot.co.id
tanyadokterkeluarga.blogspot.com
tanyadokterkeluarga.blogspot.in
tanyadokterkeluarga.blogspot.my
tanyadokterkeluarga.blogspot.sg
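Since the referrer shows up under many country domains, it helps to pull every client that sent it out of the raw access log in one pass. Below is a minimal sketch, assuming the log is in Apache "combined" format. The function name spam_hosts and the sample log lines are made up for illustration (the client addresses are documentation addresses, not real spammers).

```python
import re
from collections import defaultdict
from urllib.parse import urlsplit

# Assumed Apache "combined" format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LINE_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]*\] "[^"]*" \d{3} \S+ '
    r'"(?P<referer>[^"]*)" "(?P<ua>[^"]*)"'
)

# Match on the name up to the country suffix, so .ca, .co.id, .com, ... all hit.
SPAM_NAMES = ("tanyadokterkeluarga.blogspot.", "kosmetik-freaks.blogspot.")


def spam_hosts(lines):
    """Map each spam referrer host name to the set of clients that sent it."""
    hits = defaultdict(set)
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        ref_host = urlsplit(match.group("referer")).hostname or ""
        if any(name in ref_host for name in SPAM_NAMES):
            hits[ref_host].add(match.group("host"))
    return hits


# Hypothetical sample lines, for illustration only.
sample = [
    '203.0.113.7 - - [01/Oct/2016:10:00:00 +0000] "GET /post/ HTTP/1.1" 200 5120 '
    '"http://tanyadokterkeluarga.blogspot.co.id/" "Mozilla/5.0 (Linux; Android 4.4)"',
    '203.0.113.99 - - [01/Oct/2016:10:00:05 +0000] "GET /post/ HTTP/1.1" 200 5120 '
    '"http://tanyadokterkeluarga.blogspot.sg/2016/a.html" "Mozilla/5.0 (Linux; Android 5.0)"',
    '198.51.100.4 - - [01/Oct/2016:10:01:00 +0000] "GET / HTTP/1.1" 200 900 '
    '"-" "Mozilla/5.0 (Windows NT 10.0)"',
]

for referrer, clients in sorted(spam_hosts(sample).items()):
    print(referrer, sorted(clients))
```

To run it against a real log, pass an open file instead of the sample list, for example spam_hosts(open("access.log", errors="replace")), where the file name is whatever your host calls the raw access log.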

hvvc.us Content Scraper: Research, Ban

There are some scrapers and there are others that are ridiculous. I just got scraped hard by 209.133.216.182, 209-133-216-182.static.hvvc.us, with 105 server entries and 7 unique user agent names. Excessive, to say the least.

Here are the UAs used:

Mozilla/5.0 (BlackBerry; U; BlackBerry 9900; en) AppleWebKit/534.11+ (KHTML, like Gecko) Version/7.1.0.346 Mobile Safari/534.11+
Mozilla/5.0 (compatible; heritrix/3.3.0-SNAPSHOT-20160721-2308 +http://www.exif-search.com)
Mozilla/5.0 (Windows NT 10.0; WOW64; rv:40.0) Gecko/20100101 Firefox/40.0
Mozilla/5.0 (Windows; U; Windows NT 6.1; rv:2.2) Gecko/20110201
Opera/12.02 (Android 4.1; Linux; Opera Mobi/ADR-1111101157; U; en-US) Presto/2.9.201 Version/12.02
UniversalFeedParser/3.3 +http://feedparser.org/
Windows-Media-Player/11.0.5721.5145

209.133.192.0 – 209.133.223.255 209.133.192.0/19
NOC4Hosts, HIvelocity Network

I have sent an email to their ISP, abuse@hivelocity.net.
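One client cycling through many user agents is a strong scraper signal, and it can be spotted by counting hits and distinct UAs per host. This is a minimal sketch, assuming Apache "combined" log format. The function name ua_report, the min_agents threshold, and the sample lines are hypothetical.

```python
import re
from collections import Counter, defaultdict

# Assumed Apache "combined" format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LINE_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]*\] "[^"]*" \d{3} \S+ '
    r'"(?P<referer>[^"]*)" "(?P<ua>[^"]*)"'
)


def ua_report(lines, min_agents=3):
    """Return (host, hits, distinct_user_agents) for hosts using many UAs.

    Hosts are ordered by hit count, busiest first.
    """
    hits = Counter()
    agents = defaultdict(set)
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            hits[match["host"]] += 1
            agents[match["host"]].add(match["ua"])
    return [
        (host, count, len(agents[host]))
        for host, count in hits.most_common()
        if len(agents[host]) >= min_agents
    ]


# Hypothetical sample: one client rotating three UAs, one ordinary visitor.
sample = [
    '198.51.100.20 - - [01/Oct/2016:09:00:00 +0000] "GET /a HTTP/1.1" 200 100 "-" "UniversalFeedParser/3.3 +http://feedparser.org/"',
    '198.51.100.20 - - [01/Oct/2016:09:00:01 +0000] "GET /b HTTP/1.1" 200 100 "-" "Windows-Media-Player/11.0.5721.5145"',
    '198.51.100.20 - - [01/Oct/2016:09:00:02 +0000] "GET /c HTTP/1.1" 200 100 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:40.0) Gecko/20100101 Firefox/40.0"',
    '203.0.113.5 - - [01/Oct/2016:09:05:00 +0000] "GET / HTTP/1.1" 200 100 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:40.0) Gecko/20100101 Firefox/40.0"',
]

for host, count, distinct in ua_report(sample):
    print(f"{host}: {count} entries, {distinct} user agents")
```

The threshold is a judgment call: a shared office or mobile gateway address can legitimately show several browsers, so a low min_agents value will produce false positives.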

kosmetik-freaks.blogspot Referrer Spam: Research, Ban

This kosmetik-freaks.blogspot is a referrer spammer that has been harassing me for quite a long time. I have tried to ban them with an HTTP_REFERER ban but this does not work. My ISP, Site5, will not help me. They are predominantly out of Indonesia. They are pretty sophisticated to have evaded my detection for so long.

103.47.135.43
103.47.135.50
103.47.135.7
103.47.135.72

The sister referrer spammer is tanyadokterkeluarga.blogspot, which uses the identical method and largely shares the same IP ranges. When you kill one you kill the other. Almost all these UAs are mobile devices, leading me to believe these are mobile customers that have downloaded the same spam app.

kwpublisher.com Referrer Spam: Research, Ban

kwpublisher.com is a long-time referrer spammer that I would like to remove. I have tried to ban them with an HTTP_REFERER ban but this does not work. My ISP, Site5, will not help me. This guy seems to have a similar method to kosmetik-freaks.blogspot. They seem to be out of Pakistan mostly, but have gone to Indonesia and China. I am now tracking them closely.

Conclusion: Tracked down the code hotlinking to my site. Complained to their domain name provider. Then they disappeared. Goodbye.

39.42.52.98 x 4 39.32.0.0 – 39.63.255.255 Pakistan Tel

45.32.48.27
45.32.48.27
45.32.48.27

Host Name 0 Zero or localhost in your Raw Access Log

Does your raw access log display a host name of “0”, or zero? Very odd, is it not? I have been struggling with this for a couple of months, and my ISP Site5 had no answers. It turns out that one of my spammers, NFORCE_ENTERTAINMENT, puts an unprintable character into their host table, so that when my ISP looks them up, they display the unprintable character in my log as “0”.
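To find the affected entries, the log can be read as raw bytes and every host field that is "0", empty, or not plain printable ASCII can be listed with its line number. This is a minimal sketch. Whether the unprintable byte itself ever reaches the log (rather than the "0" placeholder) depends on the server and is an assumption here, and the function name and sample lines are hypothetical.

```python
def suspicious_hosts(raw_lines):
    """Yield (line_number, host_bytes) for odd-looking host fields.

    A host is flagged when it is exactly b"0", empty, or contains any byte
    outside the printable ASCII range 0x21-0x7E.
    """
    for number, raw in enumerate(raw_lines, start=1):
        host = raw.split(b" ", 1)[0]  # first field of the log line
        printable = all(0x21 <= byte <= 0x7E for byte in host)
        if host in (b"0", b"") or not printable:
            yield number, host


# Hypothetical sample lines, for illustration only.
sample = [
    b'0 - - [01/Oct/2016:08:00:00 +0000] "GET / HTTP/1.1" 200 100 "-" "-"',
    b'host\x01.example - - [01/Oct/2016:08:00:01 +0000] "GET / HTTP/1.1" 200 100 "-" "-"',
    b'203.0.113.9 - - [01/Oct/2016:08:00:02 +0000] "GET / HTTP/1.1" 200 100 "-" "-"',
]

for number, host in suspicious_hosts(sample):
    # repr() shows hidden bytes as escapes such as \x01
    print(number, repr(host))
```

On a real log, open the file in binary mode, for example suspicious_hosts(open("access.log", "rb")), so that odd bytes are not silently replaced while decoding. The flagged line numbers point to the requests, user agents, and timestamps that can help identify who is behind the entry.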

Trying to control your site’s spam can be challenging. If you try to ban an IP that is simply 0, or a host name of “0” you will fail, because there is no zero in their host name, but an unprintable character. Ban these guys instead.