Category: bot

This Image was Banned on QQ, Chinese Social Media

This image was banned on 2 QQ (Chinese) social media forums. This is a label from a carton of 30 Burnbrae large eggs, made in the USA. Photo 1 by Don Tai

I do not intend to be subversive on Chinese social media, nor very political. I also stay away from sensitive issues on QQ, a Chinese social media messaging and forum site. This image ban, however, really took me by surprise. I teach some English on the Chinese forums, so I thought I’d show them a typical Canadian product label in English and French. I tried three times across 2 separate QQ forums, and the QQ bot banned the post all three times. Very odd.

Being Human in an Internet Bot World

Humans are slow and somewhat unpredictable, at least compared to a bot scraping this web site. I actually like that. After all, my posts are meant to be read by humans and not bots. I welcome bots only if they provide a route for humans to my site, such as search engines. For all other bots, known and unknown intent, you will receive a 403 if I can help it.

WordPress Trackback Spam Technique for Content Spamming

Recently I have been observing a different WordPress spam technique that uses WP trackbacks. This technique has some interesting characteristics that are unlike other types of spam, so my usual clues as to origin and my usual banning methods did not work. Fortunately this technique also has some unique characteristics that can be used to ban them.

WordPress Trackbacks
When one WP site links to another WP site, the WP sites communicate with each other using a method called trackbacks. The first site sends a trackback request to the second site. The second site posts the trackback as a special comment, which invites the user to click through to the first site. These trackbacks are automated, making it convenient for both sites.

Request Header-Based Logging for Apache

When a requester, whether a person or a bot, requests a resource from your server, Apache logs the request in the raw access log. The requester also leaves some information about itself, called HTTP request headers. While logging these is not standard on Apache, a little bit of PHP added to the HTML lets you log and examine this extra information to help determine whether the requester is a bot or a human.

As an additional file will be created daily, I opted to put these files into a subdirectory. The headers, one per line, are being logged into a headers-yyyymmdd.log file, which seems free form. Different requesters leave different sets of headers.
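The logging described above was done with PHP, which is not shown here. As a rough illustration of the same idea only, here is a minimal Python sketch that appends one request's headers, one per line, to a daily headers-yyyymmdd.log file in a subdirectory. The function names, the blank line between requests, and the way headers are passed in as a dictionary are all assumptions made for this sketch.

```python
from datetime import date
from pathlib import Path


def format_headers(headers):
    """Return the request headers as text, one 'Name: value' pair per line."""
    return "".join(f"{name}: {value}\n" for name, value in headers.items())


def log_headers(headers, log_dir, today=None):
    """Append one request's headers to log_dir/headers-yyyymmdd.log.

    `headers` is assumed to be a dict of header name to value.
    `today` can be passed in for testing; it defaults to the current date.
    """
    today = today or date.today()
    directory = Path(log_dir)
    # Keep the daily files together in their own subdirectory.
    directory.mkdir(parents=True, exist_ok=True)
    log_path = directory / f"headers-{today:%Y%m%d}.log"
    with log_path.open("a", encoding="utf-8") as log_file:
        log_file.write(format_headers(headers))
        # A blank line separates one request from the next (an assumption).
        log_file.write("\n")
    return log_path
```

Because different requesters send different sets of headers, the sketch writes whatever it is given rather than a fixed list of fields.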

Content Spam: Human or Not? jijikserver Pinsupport

I received this message on my site which, on the surface, looked like it came from a human. Though it had grammar errors, there was enough there to pass. After further analysis I believe this to be a bot.

hey hai this is ashok , i have lg optimusp768 with rooted, unlocked bootloader and also cwm , but i cant find custom roms any wheere please prepare one custom rom , or atleast one stock rom with more features

Human Characteristics:
The comment was on topic. The English, which had grammar and spelling mistakes, was passable.

Bot Characteristics:

Hacked By An0n 3xPloiTeR, 8B0K3N H34R7, Team Pak Cyber Ghosts: Cyber Hack Forensic Examination

Hacked By An0n 3xPloiTeR And 8B0K3N H34R7 Team Pak Cyber Ghosts [P.C.G], main message screen with running footer 1

This hack got the hosting account and the web site suspended as a malware-infected account. The hack set up a malware attack for anyone who visited the site, specifically targeting Windows. I am still trying to figure out how they got in. This is a Pakistani-based attack, or so their message says. I’ll try to document as much as I can to help others in the same situation.

Feedly: Somewhat Schizophrenic so Please Settle Down

A dear friend uses Feedly to monitor my site. He complained that he was getting 403s Banned, and asked why. Well, I have found that Feedly usually only takes my RSS feed, but sometimes, not often, it scrapes me mercilessly. Once I see a bot start scraping, I ban it. I moved him over to the better-behaved Feedburner by Google.

Here are the Feedly user agents:

Feedly/1.0 (+; like FeedFetcher-Google)
FeedlyBot/1.0 (

The latter, FeedlyBot, runs off WZComm, and had previously scraped me, so I banned it. WZComm also runs the surdotly bot, which is also banned. The former, Feedly/1.0, runs off Level 3, and seems well behaved.
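A ban like this is usually done in the server configuration, which is not shown here. Purely as a sketch of the decision itself, the Python below returns a 403 for the banned bots and a 200 for everything else. The function name and the token list are hypothetical, and it is an assumption that the surdotly bot identifies itself with "surdotly" in its user agent.

```python
# Lowercase fragments of user agents to ban. "feedlybot" comes from the
# FeedlyBot user agent above; "surdotly" is an assumed identifier.
BANNED_AGENT_TOKENS = ("feedlybot", "surdotly")


def status_for_user_agent(user_agent):
    """Return 403 for a banned bot, otherwise 200."""
    agent = (user_agent or "").lower()
    if any(token in agent for token in BANNED_AGENT_TOKENS):
        return 403
    return 200
```

Matching on "feedlybot" rather than "feedly" is deliberate: the well-behaved Feedly/1.0 fetcher does not contain that token, so it still gets through.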

China’s Tactics of Influence in Foreign Countries

China is a sovereign country, the same as any other independent country, and the world must respect this. What is unique about China is their willingness to use any means to exert their influence far beyond Chinese jurisdiction. I see that here in Canada, but there are reports of the same tactics being used in Australia and New Zealand.

Tactics include:
Funding education programs that have a pro-Chinese viewpoint
There is great concern here in Canada about their funding tactics. While it is great to encourage the study of the Mandarin language, China is using this platform to teach a pro-Chinese viewpoint to very young kids. More than worrisome, this is meddling in the internal affairs of Canada. The Toronto District School Board had signed an agreement with this group, but the decision was reversed.

Russian Referrer Bot on 2017 Aug 31: 43 Unique IPs

I can only label this a Russian referrer bot because it uses predominantly Russian referrers, used for referrer spam. In fact I have no evidence of its origin. The 43 unique requesting IPs are from around the world, seemingly at random. While it is easy to ban these 43, there is no way to find the originator of this bot.

Referrer spam is unique in that the originating IP does not care about the returned data. All the request wishes to do is insert its referrer info into the request. That referrer info then ends up in, and therefore pollutes, your Google Analytics. The requesting IPs, not wanting any information in return, could be from anywhere and could well be faked.
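Since the requesting IPs tell you little, one alternative to banning by IP is to check the referrer itself. The Python below is only a sketch of that idea, not the method used on this site. The function name is hypothetical and the blocklist entry is a placeholder domain, not one taken from the actual logs.

```python
from urllib.parse import urlparse

# Placeholder blocklist; real entries would come from your own access logs.
SPAM_REFERRER_DOMAINS = {"spam-referrer.example"}


def is_referrer_spam(referrer):
    """Return True if the referrer's host is a blocked domain or a subdomain of one."""
    host = (urlparse(referrer or "").hostname or "").lower()
    return any(
        host == domain or host.endswith("." + domain)
        for domain in SPAM_REFERRER_DOMAINS
    )
```

A request that matches could then be given a 403 no matter which IP it came from.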