Hey all. I've been going through my logs and finding tonnes of click bots. When I look up the IPs, many of them belong to cloud hosting / colo companies, which to me means there are probably just 3rd-party apps installed on those servers that are clicking everything in sight. Some are definitely 'security' companies scanning for viruses and such. At least one is Lashback. I'm wondering, how does everyone deal with these? I'm torn, because on one hand, if they're opening and clicking links, that kind of helps engagement metrics. On the other, it skews revenue metrics among other things. I can easily block them, or I could redirect them to a different site entirely (disney.com?). How is everyone else out there handling them?
We strongly recommend you block all click-bots, since a big portion of them are going to be spamtraps (and the ones that aren't don't report you any revenue anyway). In any case, if you want to fine-tune and block only the spamtrap ones rather than all click-bots, just PM me and let's see if we can help you out.
Yeah, I'm playing with it a bit. I've been blocking them at the source, but today I decided I'm going to redirect them to a harmless yet trusted site like google.com or yahoo.com. The reality is, if they're clicking the links anyway, it's already skewing my metrics. At least this way I'm not letting them get all the way to the landing page, and I'm also not leaving them stuck on a tracking domain.
Uhh, no. When mailing GI, a majority of the "bots" are spam filtering software visiting all the URLs in a suspect message to determine if it's malware, spam, etc. If you just block those ranges, your delivery to those domains will suffer.
Yeah, that's what it looks like to me when I check out most of the IPs. That's why I'm thinking of redirecting them to a generic, harmless domain like google.com or something.
A 301/302 to an innocuous domain may help. To make it seem less like 'cloaking' based on UA/IP, you might want to set up a reverse proxy of the innocuous site for the 'bot' traffic instead.
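For instance, a bare-bones version of that redirect in PHP might look like this. The is_bot() helper and the destination URL here are just placeholders for the example, not any particular library; plug in whatever UA/IP checks you actually use.

<?php
// Minimal sketch: send suspected bot hits a 302 to a harmless domain,
// let everyone else continue to the real landing page.
function is_bot() {
    // placeholder check; replace with your own UA/IP logic
    $ua = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
    return stripos($ua, 'bot') !== false;
}

if (is_bot()) {
    header('Location: https://www.google.com/', true, 302); // use 301 if you want it cached
    exit;
}
// ...otherwise fall through to the landing page / tracking logic...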
I'd like to know, how do you block the bot completely? Every time I block the bot's IP, it comes back from another IP.
A lot of MTA software (like ROBO) will blackhole these IPs. While I'm sure others like DA, Throughtput, and SG also do this, I can't speak to that. You can typically get a list of these "bots" and netblocks from the ad networks. You then plug them into your MTA, which will either scrub the associated emails/domains out completely, or simply not send to them or to any domain/email on those IPs. It's especially helpful if you're running CPC offers, since you want to get rid of them: they'll inflate your initial sales numbers, and later you lose that cash when they clean up dupes and such. It's better to get pure CPC when possible to avoid the hassle and headache later. Either way, start with your AM at the ad networks; they should have a full list of these for you.
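If your MTA doesn't handle the netblocks for you, a rough PHP sketch of checking an IP against a list of bot netblocks (IPv4 CIDR ranges) could look like the following. The ranges shown are made-up placeholders, not real bot netblocks; use the actual list from your ad network.

<?php
// Sketch: does an IP fall inside any known bot netblock? (IPv4 CIDR)
$bot_netblocks = array('203.0.113.0/24', '198.51.100.0/22'); // placeholder ranges

function ip_in_cidr($ip, $cidr) {
    list($subnet, $bits) = explode('/', $cidr);
    $mask = -1 << (32 - (int)$bits);
    return (ip2long($ip) & $mask) === (ip2long($subnet) & $mask);
}

function is_bot_ip($ip, $netblocks) {
    foreach ($netblocks as $cidr) {
        if (ip_in_cidr($ip, $cidr)) {
            return true;
        }
    }
    return false;
}

if (is_bot_ip($_SERVER['REMOTE_ADDR'], $bot_netblocks)) {
    // scrub, suppress, or redirect, depending on your setup
}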
Our mailing platform does that; it will auto-block bots so they can't affect you or your sponsor. Hit me up and I'll show you a demo. Skype: [removed by admin]
The OP here was asking for information about how to deal with bots. Nowhere did he ask about needing a new platform. I would appreciate it if, in future, you could stay on topic with the posts and not just use other people's questions as an outlet to push your platform and services.
Here's a list of user agent patterns of bots I've encountered. I have also noticed most spam filtering software is now using "real", albeit severely outdated, user agent strings. There are other ways to identify the bots that mimic "real" browser agents, but you'll need to figure that out on your own. PHP:

<?php
$bad_ua = array(
    'EasouSpider', 'MJ12bot', 'Baiduspider', 'SynHttpClient', 'Jakarta Commons', 'GoogleBot',
    'LinkWalker', 'bingbot', 'spyder/Nutch', 'aiHitBot', 'thunderstone', 'oBot', 'Genieo',
    'RU_Bot', 'meanpathbot', 'YandexBot', 'Wayback Machine', 'ips-agent', 'nutch', 'DotBot',
    'A6-Indexer', 'JetBrains', 'TurnitinBot', 'FeedBurner', 'curl', 'wget', 'SurveyBot',
    'DomainTools', 'AppEngine-Google', 'BacklinkCrawler', 'Apache', 'SISTRIX', 'Exabot',
    'ia_archiver', 'Feedly', 'mapping experiment', 'Synapse', 'CATExplorador', 'Google favicon',
    'SEOstats', 'bot-pge.chlooe', 'LSSRocketCrawler', 'Who.is Bot', 'icarus6', 'PhantomJS',
    'AdnormCrawler', 'RelateIQ Crawler', 'Airmail', 'PM\/3', 'Embedly',
    'Microsoft Internet Explorer', 'SpamBayes', 'Python', 'urllib', 'Java\/1.7', 'msnbot',
    '4\.1\.249\.1025', 'MSIE 5\.01', 'DirBuster', 'Nmap Scripting Engine',
);

// regex_array() isn't a built-in; here's one possible implementation of the check
// (case-insensitive match of the UA string against each pattern)
function regex_array($string, $patterns) {
    foreach ($patterns as $pattern) {
        if (preg_match('/' . $pattern . '/i', $string)) {
            return true;
        }
    }
    return false;
}

$ua = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
if (regex_array($ua, $bad_ua)) {
    do_something_else(); // placeholder: serve the bots whatever you want here
    exit;
}
For another example, a bot at 209.66.70.253 used the following user agents over 2000 requests:

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)
Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; Trident/4.0)
Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; Trident/5.0)
Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0)
Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0)
Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:2.0.1) Gecko/20100101 Firefox/4.0.1
Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2.14) Gecko/20110218 AlexaToolbar/alxf-2.0 Firefox/3.6.14
Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_7; en-us) AppleWebKit/534.20.8 (KHTML, like Gecko) Version/5.1 Safari/534.20.8
Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:2.0.1) Gecko/20100101 Firefox/4.0.1
Mozilla/5.0 (Windows; U; Windows NT 5.1; tr; rv:1.9.2.8) Gecko/20100722 Firefox/3.6.8 ( .NET CLR 3.5.30729; .NET4.0E)
Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US) AppleWebKit/533.17.8 (KHTML, like Gecko) Version/5.0.1 Safari/533.17.8
Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/533.19.4 (KHTML, like Gecko) Version/5.0.2 Safari/533.18.5
Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/534.14 (KHTML, like Gecko) Chrome/10.0.601.0 Safari/534.14
Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/534.20 (KHTML, like Gecko) Chrome/11.0.672.2 Safari/534.20
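One rough way to spot that pattern in your own logs is to count distinct user agents per IP; a single IP cycling through a dozen different browser strings is almost certainly not a person. A hypothetical PHP sketch, assuming a simple log format of one "ip<TAB>user_agent" entry per line (the file name and threshold are made up for the example):

<?php
// Sketch: flag IPs that rotate through many different user agents.
$ua_per_ip = array();

foreach (file('clicks.log', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) as $line) {
    if (strpos($line, "\t") === false) {
        continue; // skip malformed lines
    }
    list($ip, $ua) = explode("\t", $line, 2);
    $ua_per_ip[$ip][$ua] = true;
}

foreach ($ua_per_ip as $ip => $uas) {
    if (count($uas) >= 5) { // threshold is arbitrary; tune to your traffic
        echo $ip . " used " . count($uas) . " different user agents\n";
    }
}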
I would just block known user agents used by bots. Blocking datacenter IPs is not a bad idea either; you get rid of sysadmins trying to identify spam, probably honeypots, and definitely bots. Marking those subscribers as bots so you don't send them mail again is a good idea. You can throw in a captcha if you're unsure whether to block them or not.
nickphx gave great answers: blocking by UA is the first line of defense, and redirecting with a 301/302 is better than blocking IPs outright by dropping connections. Filtering by organisation is also a good idea, using something like MaxMind integrated into your stack. There's also another trick to detect and get rid of the majority of them. In the body or headers of the campaign you put a URL that is not clickable and not viewable by a user unless they check the HTML source. On the page behind that URL you record information like IP, email, etc. Since a real user can never reach it, anything that visits that link is harmful to your mailing. In my mailings there are emails that click the links many times from different IPs; I just completely remove everything: ban the IP and put the email in the screamer list. That way I'm sure I don't have click bots or other aggressive antis, plus some of the traps (this doesn't remove all traps).
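The page behind the hidden trap URL can be very simple; here's a minimal PHP sketch of the idea: log the IP and the recipient identifier carried in the link, then have the sending side read that log to ban the IP and suppress the email. The 'e' query parameter and the trap.log file are assumptions for the example.

<?php
// Hidden-link trap endpoint: nothing human-visible links here, so any hit
// is treated as a bot/filter and recorded for blocking.
$ip    = $_SERVER['REMOTE_ADDR'];
$ua    = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
$email = isset($_GET['e']) ? $_GET['e'] : ''; // recipient identifier embedded in the link

file_put_contents(
    __DIR__ . '/trap.log',
    date('c') . "\t" . $ip . "\t" . $email . "\t" . $ua . "\n",
    FILE_APPEND | LOCK_EX
);

// Send the visitor somewhere harmless; your sending side later reads trap.log,
// bans the IP, and moves the email into the do-not-mail ("screamer") list.
header('Location: https://www.google.com/', true, 302);
exit;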
I understand what you mean, but the way I did it never caused worse delivery whether it was activated or deactivated; the only thing it does is catch bots. Also, I coded it so that the links and body are randomised, which makes it harder to detect. Anyway, it's just one more line of defense to track and detect bots. And if you use big static lists, you can run a couple of drops with the option activated, then remove the link, then do it again later, and so on. Test it and you'll see.