Tag: bots

Rampant Online Ad Fraud and Bot Activity

Ad fraud software in action. Target website, randomized referrers, browser agents and proxy IP addresses. That is enough to spoof anti-bot software.

It is no secret that I battle and ban bad bots on my site. If a bot is not a well-known search engine and does not provide me with some type of service, then I usually ban it. Sure, it can visit my site, but it will receive a blank page. But why do they visit? Who is paying them? Welcome to the world of Online Ad Fraud.

US Anti-Scalper Bot Bill may soon come to Canada?

Want you do, to go to a concert, but just after the supposed start time for ticket sales, all the tickets are gone. You, again, are out of luck. Minutes later these tickets are all available on reseller sites for double the price. It really does sound like a scam. While the US just enacted a federal law, here in Ontario we are just starting the investigation phase. I hope that we can adopt something as strong as the US law in order to keep pace with bot technology and keep online shopping safe.

Bots and Online Concert Ticket Sales: Humans Will Lose

Lose, humans will, in a competition with a bot. Smarter people in the stock market know this and acknowledge that bots play a major role in their online trading system. This has yet to occur in the online ticket sales area. The CBC documented a former bot operator's account of how bots rig the system. I do agree, and something must be done about it. Tragically Hip concert tickets selling out to bots before humans means that fans will pay a huge premium for tickets, and none of this premium will go to the artists. This is simply not right.

Strange Host Names that I Cracked

These host names try hard to evade detection of their IP addresses, in order to scrape content from web sites and sometimes break into them. They have specifically scraped mine, and so I hunted them down and banished them. Oftentimes the Unix host command returns nothing, so research is required. This usually works.
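For checking many addresses from a log at once, the same reverse lookup that the host command performs can be scripted. This is only a minimal sketch: the function name reverse_lookup is made up for illustration, and 192.0.2.10 is a reserved documentation address, not one of the hosts from the post. A result of None corresponds to the case where host returns nothing and further research is needed.

```python
import socket


def reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return the reverse-DNS host name for ip, or None if nothing resolves.

    resolver is injectable so the function can be exercised without a network;
    by default it is the standard library's socket.gethostbyaddr.
    """
    try:
        hostname, _aliases, _addresses = resolver(ip)
        return hostname
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return None


if __name__ == "__main__":
    # Hypothetical address; replace with one taken from your own logs.
    print(reverse_lookup("192.0.2.10"))
```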

User Agents I Could not Ban with htaccess

These user agents, or bots, somehow fool and subvert my .htaccess user agent rules and continue to scrape my site. I’ve looked at my htaccess user agent rule many times and don’t know why. The next step is to ban their IP.

AhrefsBot is a large content scraper that hits my site hard. It reads robots.txt but ignores it, and it fools my htaccess. Its user agent string is “Mozilla/5.0 (compatible; AhrefsBot/5.0; +http://ahrefs.com/robot/)”
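One way to sanity-check a user agent pattern outside of Apache is to test it against the exact string the bot sends. The sketch below assumes a simple case-insensitive pattern, "AhrefsBot", which is not the rule from my .htaccess; the names BLOCKED_AGENTS and is_blocked are made up for illustration. It only shows whether the pattern itself matches, and says nothing about how Apache evaluates the rule.

```python
import re

# Assumed pattern for illustration; substitute the one from your own rule.
BLOCKED_AGENTS = re.compile(r"AhrefsBot", re.IGNORECASE)


def is_blocked(user_agent):
    """Return True if the user agent string matches the blocked pattern."""
    return bool(BLOCKED_AGENTS.search(user_agent))


if __name__ == "__main__":
    agent = "Mozilla/5.0 (compatible; AhrefsBot/5.0; +http://ahrefs.com/robot/)"
    print(is_blocked(agent))
```

If the pattern matches here but the bot still gets through, the problem is more likely in how or where the rule is applied than in the pattern.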

Reducing your Bandwidth for WordPress and Drupal

Busy I have been recently, with not much time for my blog, but it was all for a good cause. My internet service provider (ISP) informed me that I was taking up too much CPU time on their shared service and banned me. I am a good guy and generally follow the rules, so getting banned is out of character. After a frantic email they restored my account so that I could figure out what happened. I truly am a “less is more” type of guy, and that includes IT resources, and my online sites are pretty consistent, so a flood of new content was not the issue. Eventually I took some steps to rein in the numerous bots that were scraping and doing whatever to my site, burning CPU time on my tab, and eventually getting me banned. If your site is suffering the same fate, you may glean some hints and tips for reducing your CPU usage.
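A reasonable first step is finding out which bots are responsible for most of the requests. The sketch below tallies hits per user agent from access log lines. It assumes the common "combined" log format, where the user agent is the last double-quoted field on each line; the function name count_user_agents and the file name access.log are made up for illustration, so adjust both to your host's setup. User agents that contain escaped quotes are not handled.

```python
import re
from collections import Counter

# Assumption: combined log format, user agent is the last quoted field.
LAST_QUOTED_FIELD = re.compile(r'"([^"]*)"\s*$')


def count_user_agents(lines):
    """Count requests per user agent; lines without a quoted field are skipped."""
    counts = Counter()
    for line in lines:
        match = LAST_QUOTED_FIELD.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical path; point this at your own access log.
    with open("access.log", encoding="utf-8", errors="replace") as log:
        for agent, hits in count_user_agents(log).most_common(10):
            print(hits, agent)
```

The agents at the top of that list are the first candidates for a ban.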