Tag: bot

ChatGPT on Chinese Social Media QQ

I have been on QQ, the Chinese social media platform, for a number of years, and am now an administrator for a couple of QQ groups. Recently a couple of these groups have connected to the ChatGPT API, giving Chinese group members a first-hand try at ChatGPT. Here are my observations.

QQ is one of China’s largest social media platforms, where you can find friends, video chat, send pics and videos, participate in group chats, exchange money, listen to music, and more. The platform is less invasive than WeChat. Both are run by Tencent. I have become an administrator on a couple of English-only QQ groups.

This is a preview of ChatGPT on Chinese Social Media QQ. Read the full post (1522 words, 0 images, estimated 6:05 mins reading time)

This Image was Banned on QQ, Chinese Social Media

This image was banned on 2 QQ (Chinese) social media forums. This is a label from a carton of 30 Burnbrae large eggs, made in the USA. Photo 1 by Don Tai

I do not intend to be subversive on Chinese social media, nor very political. I also stay away from sensitive issues on QQ, a Chinese social media messaging and forum site. This image ban, however, really took me by surprise. I teach some English on the Chinese forums, so I thought I’d show them a typical Canadian product label in English and French. I tried three times across 2 separate QQ forums, and the QQ bot banned the post all three times. Very odd.

This is a preview of This Image was Banned on QQ, Chinese Social Media. Read the full post (475 words, 3 images, estimated 1:54 mins reading time)

Being Human in an Internet Bot World

Humans are slow and somewhat unpredictable, at least compared to a bot scraping this web site. I actually like that. After all, my posts are meant to be read by humans and not bots. I welcome bots only if they provide a route for humans to my site, as search engines do. All other bots, of known or unknown intent, will receive a 403 if I can help it.

This is a preview of Being Human in an Internet Bot World. Read the full post (411 words, 0 images, estimated 1:39 mins reading time)

Attempted Login Attack from Webmaster Agency Ltd REALTY.RU RU 434 times

194.190.169.83, from 24/Aug/2018:21:04:47 to 24/Aug/2018:21:21:05: you made 434 login attempts. I see you. I know when you visited and that you are trying to break into my site. You have been logged and sent packing with 403s. I have 2,425 of your header logs. Do not do this again.

194.190.169.0 – 194.190.169.255
org-name: Webmaster Agency Ltd
person: Dmitry V. Volkov
address: REALTY.RU LTD
address: 1, Kurchatov Sq.
address: 107005, Moscow
address: Russia
org-type: OTHER
phone: +74957724216

Request Header:

2018-08-24:21:04:47
URL: /wp-login.php
IP: 194.190.169.83
Content-Length: 22
Content-Type: application/x-www-form-urlencoded
Host: dontai.com
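A tally like the 434 attempts above could be pulled from the raw access log with something like the following. This is a minimal Python sketch, not the method used on this site. It assumes Apache's common/combined log line layout with a timezone offset in the timestamp; the function name `summarize_login_attempts` and the sample lines are made up for illustration.

```python
import re
from datetime import datetime

# Assumed layout: Apache common/combined log format.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<when>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3})'
)


def summarize_login_attempts(lines, path="/wp-login.php"):
    """Return {ip: (count, first_seen, last_seen)} for requests to `path`."""
    summary = {}
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that do not look like access-log entries
        if match["path"].split("?")[0] != path:
            continue  # only count hits on the login page
        when = datetime.strptime(match["when"], "%d/%b/%Y:%H:%M:%S %z")
        count, first, last = summary.get(match["ip"], (0, when, when))
        summary[match["ip"]] = (count + 1, min(first, when), max(last, when))
    return summary


if __name__ == "__main__":
    # Hypothetical sample lines using documentation-only IP addresses.
    sample = [
        '192.0.2.10 - - [24/Aug/2018:21:04:47 -0400] "POST /wp-login.php HTTP/1.1" 403 199',
        '192.0.2.10 - - [24/Aug/2018:21:21:05 -0400] "POST /wp-login.php HTTP/1.1" 403 199',
        '192.0.2.11 - - [24/Aug/2018:21:05:00 -0400] "GET /index.html HTTP/1.1" 200 512',
    ]
    for ip, (count, first, last) in summarize_login_attempts(sample).items():
        print(ip, count, first, last)
```

Keeping the first and last timestamps alongside the count gives the same "how many, and over what window" picture as the summary at the top of this post.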

Permanent link to this post (89 words, 0 images, estimated 21 secs reading time)

Request Header-Based Logging for Apache

When a requester, whether a person or a bot, requests a resource from your server, Apache logs the request in the raw access log. The requester also leaves some information about itself in the HTTP request headers. Apache does not log these by default, but with a little bit of PHP added to the HTML, this extra information can be logged and examined to help determine whether the requester is a bot or a human.

As an additional file will be created daily, I opted to put these files into a subdirectory. The headers, one per line, are logged into a headers-yyyymmdd.log file. The format is free form, since different requesters leave different sets of headers.
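The idea can be sketched as follows. This is a minimal illustration in Python rather than the PHP used on this site, and it is not the actual code from the post. The function name `log_request_headers`, its parameters, and the exact line layout inside the file are assumptions; only the daily headers-yyyymmdd.log naming, the subdirectory, and the one-header-per-line layout come from the text above.

```python
from datetime import datetime
from pathlib import Path


def log_request_headers(headers, url, ip, log_dir, now=None):
    """Append one request's headers to headers-yyyymmdd.log; return the file path."""
    now = now or datetime.now()
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)  # daily files live in a subdirectory
    log_path = log_dir / f"headers-{now:%Y%m%d}.log"

    # Assumed entry layout: timestamp, URL, IP, then the headers.
    lines = [now.strftime("%Y-%m-%d:%H:%M:%S"), f"URL: {url}", f"IP: {ip}"]
    # Free form: write whatever headers this requester sent, one per line.
    lines += [f"{name}: {value}" for name, value in headers.items()]

    with log_path.open("a", encoding="utf-8") as fh:
        fh.write("\n".join(lines) + "\n\n")  # blank line separates requests
    return log_path


if __name__ == "__main__":
    # Hypothetical request, using a documentation-only IP address.
    path = log_request_headers(
        {"Host": "example.com", "Content-Type": "application/x-www-form-urlencoded"},
        url="/wp-login.php",
        ip="192.0.2.1",
        log_dir="header-logs",
    )
    print(path.read_text(encoding="utf-8"))
```

Because nothing is assumed about which headers are present, a requester that sends only two or three headers simply produces a shorter entry, which is itself a useful signal.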

This is a preview of Request Header-Based Logging for Apache. Read the full post (400 words, 0 images, estimated 1:36 mins reading time)

Content Spam: Human or Not? jijikserver Pinsupport

I received this message on my site, which on the surface looked like it came from a human. Though it had grammar errors, there was enough there to pass. After further analysis I believe it came from a bot.

hey hai this is ashok , i have lg optimusp768 with rooted, unlocked bootloader and also cwm , but i cant find custom roms any wheere please prepare one custom rom , or atleast one stock rom with more features

Human Characteristics:
The comment was on topic. The English, which had grammar and spelling mistakes, was passable.

Bot Characteristics:

This is a preview of Content Spam: Human or Not? jijikserver Pinsupport. Read the full post (428 words, 0 images, estimated 1:43 mins reading time)

Feedly: Somewhat Schizophrenic so Please Settle Down

A dear friend uses Feedly to monitor my site. He complained that he was getting 403 Banned responses, and asked why. Well, I have found that Feedly usually only takes my RSS feed, but sometimes, not often, it scrapes me mercilessly. Once I see a bot start scraping, I ban it. I moved him over to the better-behaved Feedburner by Google.

Here are the Feedly user agents:

Feedly/1.0 (+http://www.feedly.com/fetcher.html; like FeedFetcher-Google)
FeedlyBot/1.0 (http://feedly.com)

The latter, FeedlyBot, runs off WZComm, and had previously scraped me, so I banned it. WZComm also runs the surdotly bot, which is also banned. The former, Feedly/1.0, runs off Level 3, and seems well behaved.
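The allow/ban split described above can be sketched as a simple user-agent check. This is an illustrative Python sketch under stated assumptions, not the rule set actually running on this site: the function name `status_for_user_agent` is made up, and the token `surdotly` is a guess at how that bot identifies itself.

```python
# Substrings of user agents to refuse, matched case-insensitively.
# "feedlybot" comes from the user agent listed above; "surdotly" is an
# assumed token, since the bot's exact user agent string is not shown.
BANNED_AGENT_TOKENS = ("feedlybot", "surdotly")


def status_for_user_agent(user_agent):
    """Return 403 for banned bots, 200 for everything else."""
    agent = (user_agent or "").lower()
    if any(token in agent for token in BANNED_AGENT_TOKENS):
        return 403
    return 200


if __name__ == "__main__":
    for agent in (
        "Feedly/1.0 (+http://www.feedly.com/fetcher.html; like FeedFetcher-Google)",
        "FeedlyBot/1.0 (http://feedly.com)",
    ):
        print(status_for_user_agent(agent), agent)
```

Matching on `feedlybot` rather than `feedly` is what lets the well-behaved Feedly/1.0 fetcher through while FeedlyBot is refused.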

This is a preview of Feedly: Somewhat Schizophrenic so Please Settle Down. Read the full post (136 words, 0 images, estimated 33 secs reading time)

Nikto Web Server Scan: View from the Access Log

Playing, I am, with the Nikto web server scanning package. I scanned my own site, just for fun. While it does take some time, it did finish. I wondered how it would look from my site’s raw access log viewpoint. In summary, Nikto is not stealthy at all. It is also easily detected and banned mid-scan, as it takes a long time to complete.

Essentially you start a Terminal, and type “nikto -h ”. There are lots of options, such as output to a log. The Nikto output highlights web site vulnerabilities and cross-references these with a database of known hacks. Using this tool you can find your site’s weaknesses and then strengthen it against hackers.

This is a preview of Nikto Web Server Scan: View from the Access Log. Read the full post (532 words, 0 images, estimated 2:08 mins reading time)

strider.delmarvagroup.com 173.49.213.106 really wants to contact me

173.49.213.106 strider.delmarvagroup.com, from the MCI Communications block, you really need to put some smarts into your bot. What are you thinking?

173.48.0.0 – 173.63.255.255 MCI Communications

I’m not sure why you are doing this, but please stop. I don’t have a contact form at that location.
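Checking whether an address falls inside a block like the one above, or turning the block into CIDR notation for a ban rule, can be sketched with Python's standard `ipaddress` module. The helper names `in_range` and `range_to_networks` are made up for illustration; this is a sketch, not the tooling used on this site.

```python
import ipaddress


def in_range(ip, first, last):
    """True if `ip` lies between `first` and `last`, inclusive."""
    return (
        ipaddress.ip_address(first)
        <= ipaddress.ip_address(ip)
        <= ipaddress.ip_address(last)
    )


def range_to_networks(first, last):
    """Convert a first/last address pair into the CIDR networks that cover it."""
    return list(
        ipaddress.summarize_address_range(
            ipaddress.ip_address(first), ipaddress.ip_address(last)
        )
    )


if __name__ == "__main__":
    print(in_range("173.49.213.106", "173.48.0.0", "173.63.255.255"))  # True
    print(range_to_networks("173.48.0.0", "173.63.255.255"))  # a single /12 network
```

The whole 173.48.0.0 – 173.63.255.255 block collapses to the single network 173.48.0.0/12, which is the compact form most ban lists expect.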

This is a preview of strider.delmarvagroup.com 173.49.213.106 really wants to contact me. Read the full post (371 words, 0 images, estimated 1:29 mins reading time)

City of Toronto Internet Scraper Bot

City of Toronto internet scraper bot scrapes my site a couple of times per month. Why? Toronto, Canada

I live in the City of Toronto, and write about Toronto-related subjects. What is surprising is that the City of Toronto has an internet bot that randomly scrapes content from my site a couple of times each month. The bot started scraping me near the end of January 2017.

What is interesting is that I, a concerned citizen, actually emailed them because I thought they had a zombie PC taken over by a bot, or some other security issue. I sent the City a log of the relevant entries related to their IP address. Was I naive. Here is their reply (isg@toronto.ca):

This is a preview of City of Toronto Internet Scraper Bot. Read the full post (409 words, 1 image, estimated 1:38 mins reading time)