Kik content scraper bots sent me this IP from bredbandsbolaget.se. Kik uses single IPs from ISPs all over North America, and they’re now expanding globally. Kik scrapes content from my site daily, so it is in my best interest to stop them.
Just for fun I translated the name from Swedish to English: “bredbandsbolaget” means “broadband company”! LOL! bredbandsbolaget.se provides TV, internet and telephone service in Sweden. They have a web site. After the IP address, the next set of numbers before the “cust” might be the Swedish telephone number, starting with the area code. Then again maybe not, as some have hex
bb.sky.com is a regular content scraper on my site, so I have decided to track them down. I finally figured out their hex IP address, so I can target ranges better.
Sky is a very large TV and internet provider in the UK. They have a huge range of IPs.
fregat.ua is a bot from Russia. It was logged for ransomware, so you really don’t want them to try to break into your site. Quite bold, they are, trying to get my login and admin pages, which makes them a definite security threat. Fregat.ua is an ISP with a web page.
This is part of the keywords-monitoring-your-success.com, free-video-tool.com Semalt Botnet that spread to other South American hosts, but they have changed the referrer name slightly to keywords-monitoring-success.com. This host is tricky because they only provide the last 2 octets of the IP address, leaving me to guess the first two.
Here is my clue: customer-qro-199-67.megared.net.mx
Other reports point to the same pattern for megared.net.mx: a variety of different first 2 octets combined with the last 2 taken from the host name. While I only have this one IP as a content scraper, their reputation is one of an email spammer. I guess they moved into a newer but related business model.
My content scraper host name was 98-68.furanet.com. It looks like their pattern or strategy is a host name with the octets in reverse order and the first 2 octets missing. Looking at their IP range I would guess 93.93.64.0/21, which covers the 68 of 98-68.furanet.com. From my Google search I’ve added 91.192.108.0/22, which they also commonly use.
Ban these most commonly used IPs:
91.192.108.0/22
93.93.64.0/21
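As a quick sanity check on the two ranges above, here is a minimal Python sketch using the standard ipaddress module. The name is_banned is made up for illustration, and the candidate address 93.93.68.98 is only my guess, built from the 93.93 prefix plus the reversed 98-68 of the host name.

```python
import ipaddress

# The two ranges listed above.
BANNED_RANGES = [
    ipaddress.ip_network("91.192.108.0/22"),
    ipaddress.ip_network("93.93.64.0/21"),
]


def is_banned(ip: str) -> bool:
    """Return True if the address falls inside any banned range."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in BANNED_RANGES)


# Guessed address: prefix 93.93 plus "98-68" read in reverse order.
candidate = "93.93.68.98"
print(candidate, "banned:", is_banned(candidate))

# Show the first and last address of each range to see what it covers.
for net in BANNED_RANGES:
    print(net, "covers", net[0], "to", net[-1])
```

The /21 runs from 93.93.64.0 to 93.93.71.255, so a third octet of 68 sits inside it.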
My site has been getting content and image scraped by bb-81-107.018.net.il and bb-153-46.018.net.il, but these two host names do not resolve. Furthermore, there is very little on the internet about them. My next step is to ban their complete IP range.
Pattern:
If there are 4 octets in the host name, then reverse the octets. If there are only 2 octets then these are the last 2 of the IP. You will need to use the host command and try the first 2 octets of their common ranges.
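The pattern above can be sketched in Python. This is only an illustration under a few assumptions: the octets sit in the first label of the host name, a 2-octet pair is reversed the same way the furanet.com example suggests, and the function names (octets_from_host, candidate_ips) and the 4-octet host name are hypothetical. The prefixes still have to come from your own research into the host's common ranges.

```python
import re
from typing import List


def octets_from_host(host: str) -> List[str]:
    """Pull the numbers (0-255) out of the first label of a host name."""
    first_label = host.split(".")[0]
    return [n for n in re.findall(r"\d+", first_label) if int(n) <= 255]


def candidate_ips(host: str, prefixes: List[str]) -> List[str]:
    """Build candidate IPs from the octets embedded in a host name."""
    octets = octets_from_host(host)
    if len(octets) == 4:
        # All 4 octets present: reverse them to get the IP.
        return [".".join(reversed(octets))]
    if len(octets) == 2:
        # Only the last 2 octets: try each guessed 2-octet prefix.
        # Assumption: the pair is reversed, as with 98-68.furanet.com.
        tail = ".".join(reversed(octets))
        return [f"{prefix}.{tail}" for prefix in prefixes]
    return []


print(candidate_ips("98-68.furanet.com", ["93.93"]))
```

Each candidate still needs to be confirmed with the host command, since a guessed prefix can easily be wrong.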
Fool, it would, an automated anti-bot system, because humans are more intelligent than bots. The people behind these bots are innovative, in their evil genius way. Computer security is all about the arms race. The better the methods, the better the countermeasures, and then it repeats. No security measure is foolproof for very long.
IPVNow.com has a slew of host names that resolve successfully when you look them up, and all point to the same IP address, 103.224.182.241. This misdirection is what would fool the anti-bot software, because this IP is real and it points to a valid company, Trellian, which owns IPVNow.com. But banning this single IP does not stop the content scraping. Each host name has its own IP address, hosted by the ISPs Ubiquity and Nobis. These are the IPs you need to ban.
This host name is constantly scraping my site, but when I look it up it does not resolve. Searches on Google reveal that they seem to change their IP address very often. Many other sites are getting spammed and content scraped by this host. I have no alternative but to ban the whole IP range of customer.worldstream.nl.
I read my raw access log, and the first column provides me with an IP address or host name. This first column is usually enough to target the specific IP that is errant, and I ban all 256 addresses that share its first three octets, which is the full range of the last octet.
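Here is a minimal Python sketch of that step, assuming a whitespace-separated log where the first column is the client. The function name ban_range_for_log_line and the sample log lines are made up for illustration, and 203.0.113.57 is a documentation address, not a real scraper.

```python
import ipaddress
from typing import Optional


def ban_range_for_log_line(line: str) -> Optional[ipaddress.IPv4Network]:
    """Return the 256-address (/24) range for the IP in the first column."""
    fields = line.split()
    if not fields:
        return None
    try:
        addr = ipaddress.IPv4Address(fields[0])
    except ipaddress.AddressValueError:
        # First column is a host name, so it needs the manual detective work.
        return None
    # strict=False zeroes the last octet instead of raising an error.
    return ipaddress.IPv4Network(f"{addr}/24", strict=False)


# Hypothetical log lines for illustration only.
sample_lines = [
    '203.0.113.57 - - [01/Jan/2017:00:00:00 +0000] "GET / HTTP/1.1" 200 512',
    '98-68.furanet.com - - [01/Jan/2017:00:00:01 +0000] "GET / HTTP/1.1" 200 512',
]
for sample in sample_lines:
    print(sample.split()[0], "->", ban_range_for_log_line(sample))
```

A /24 is a blunt tool: it also blocks any innocent neighbors in the same 256 addresses, so it fits best when the whole block belongs to the same offender.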