Example of a browser bot blocker

Hi malware fighters,

There are all sorts of bots, crawlers and grazers that ignore the normal blocking procedure such as robots.txt: some are after illegal download info, others try to feed you ads and spam. Even your own installed browser can crawl the Internet with intentions that you may classify as undesirable. You can block these on your website with the following Perl script, which carries a special referral list, and you can call it from the common gateway interface (CGI) of your web page. Here it is for you:


#!/usr/local/bin/perl

#######################
# Browser Agents banned
@browser = ("Wget/1.6","Zeus","EmailSiphon"); # List of banned Agents - Add as many as you like
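# NOTE: the RewriteCond/RewriteRule lines are Apache mod_rewrite directives, not Perl;
# they belong in .htaccess or the server configuration rather than in the script itself.
# User-Agents with no privileges (mostly spambots/spybots/offline downloaders that ignore robots.txt)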
RewriteCond %{REMOTE_ADDR} "^63\.148\.99\.2(2[4-9]|[3-4][0-9]|5[0-5])$" [OR] # Cyveillance spybot
RewriteCond %{REMOTE_ADDR} ^12\.148\.196\.(12[8-9]|1[3-9][0-9]|2[0-4][0-9]|25[0-5])$ [OR] # NameProtect spybot
RewriteCond %{REMOTE_ADDR} ^12\.148\.209\.(19[2-9]|2[0-4][0-9]|25[0-5])$ [OR] # NameProtect spybot
RewriteCond %{REMOTE_ADDR} ^64\.140\.49\.6([6-9])$ [OR] # Turnitin spybot
RewriteCond %{HTTP_REFERER} iaea\.org [OR] # spambot
RewriteCond %{HTTP_USER_AGENT} ^[A-Z]+$ [OR] # spambot
RewriteCond %{HTTP_USER_AGENT} anarchie [NC,OR] # OD
RewriteCond %{HTTP_USER_AGENT} Atomz [OR] # rude bot
RewriteCond %{HTTP_USER_AGENT} cherry.?picker [NC,OR] # spambot
RewriteCond %{HTTP_USER_AGENT} "compatible ; MSIE 6.0" [OR] # spambot (note extra space before semicolon)
RewriteCond %{HTTP_USER_AGENT} crescent [NC,OR] # OD
RewriteCond %{HTTP_USER_AGENT} "^DA \d\.\d+" [OR] # OD
RewriteCond %{HTTP_USER_AGENT} "DTS Agent" [OR] # OD
RewriteCond %{HTTP_USER_AGENT} "^Download" [OR] # OD
RewriteCond %{HTTP_USER_AGENT} EasyDL/\d\.\d+ [OR] # OD
RewriteCond %{HTTP_USER_AGENT} e?mail.?(collector|magnet|reaper|siphon|sweeper|harvest|collect|wolf) [NC,OR] # spambot
RewriteCond %{HTTP_USER_AGENT} express [NC,OR] # OD
RewriteCond %{HTTP_USER_AGENT} extractor [NC,OR] # OD
RewriteCond %{HTTP_USER_AGENT} "Fetch API Request" [OR] # OD
RewriteCond %{HTTP_USER_AGENT} flashget [NC,OR] # OD
RewriteCond %{HTTP_USER_AGENT} FlickBot [OR] # rude bot
RewriteCond %{HTTP_USER_AGENT} FrontPage [OR] # stupid user trying to edit my site
RewriteCond %{HTTP_USER_AGENT} getright [NC,OR] # OD
RewriteCond %{HTTP_USER_AGENT} go.?zilla [NC,OR] # OD
RewriteCond %{HTTP_USER_AGENT} "efp@gmx\.net" [OR] # rude bot
RewriteCond %{HTTP_USER_AGENT} grabber [NC,OR] # OD
RewriteCond %{HTTP_USER_AGENT} imagefetch [OR] # rude bot
RewriteCond %{HTTP_USER_AGENT} httrack [NC,OR] # OD
RewriteCond %{HTTP_USER_AGENT} "Indy Library" [OR] # spambot
RewriteCond %{HTTP_USER_AGENT} "^Internet Explore" [OR] # spambot
RewriteCond %{HTTP_USER_AGENT} ^IE\ \d\.\d\ Compatible.*Browser$ [OR] # spambot
RewriteCond %{HTTP_USER_AGENT} "LINKS ARoMATIZED" [OR] # rude bot
RewriteCond %{HTTP_USER_AGENT} "Microsoft URL Control" [OR] # spambot
RewriteCond %{HTTP_USER_AGENT} "mister pix" [NC,OR] # rude bot
RewriteCond %{HTTP_USER_AGENT} "^Mozilla/4.0$" [OR] # dumb bot
RewriteCond %{HTTP_USER_AGENT} "^Mozilla/\?\?$" [OR] # formmail attacker
RewriteCond %{HTTP_USER_AGENT} MSIECrawler [OR] # IE’s “make available offline” mode
RewriteCond %{HTTP_USER_AGENT} ^NG [OR] # unknown bot
RewriteCond %{HTTP_USER_AGENT} offline [NC,OR] # OD
RewriteCond %{HTTP_USER_AGENT} net.?(ants|mechanic|spider|vampire|zip) [NC,OR] # OD
RewriteCond %{HTTP_USER_AGENT} nicerspro [NC,OR] # spambot
RewriteCond %{HTTP_USER_AGENT} ninja [NC,OR] # Download Ninja OD
RewriteCond %{HTTP_USER_AGENT} NPBot [OR] # NameProtect spybot
RewriteCond %{HTTP_USER_AGENT} PersonaPilot [OR] # rude bot
RewriteCond %{HTTP_USER_AGENT} snagger [NC,OR] # OD
RewriteCond %{HTTP_USER_AGENT} Sqworm [OR] # rude bot
RewriteCond %{HTTP_USER_AGENT} SurveyBot [OR] # rude bot
RewriteCond %{HTTP_USER_AGENT} tele(port|soft) [NC,OR] # OD
RewriteCond %{HTTP_USER_AGENT} TurnitinBot [OR] # Turnitin spybot
RewriteCond %{HTTP_USER_AGENT} web.?(auto|bandit|collector|copier|devil|downloader|fetch|hook|mole|miner|mirror|reaper|sauger|sucker|site|snake|stripper|weasel|zip) [NC,OR] # ODs
RewriteCond %{HTTP_USER_AGENT} vayala [OR] # dumb bot, doesn’t know how to follow links, generates lots of 404s
RewriteCond %{HTTP_USER_AGENT} zeus [NC]
RewriteRule .* - [F,L]
RewriteCond %{HTTP_USER_AGENT} EmailSiphon
RewriteRule .* - [F,L]
RewriteCond %{REMOTE_ADDR} "^63\.148\.99\.2(2[4-9]|[3-4][0-9]|5[0-5])$"
RewriteRule .* - [F,L]
RewriteCond %{HTTP_REFERER} iaea\.org
RewriteRule .* - [F,L]
RewriteCond %{HTTP_USER_AGENT} "Microsoft URL Control"
RewriteRule .* - [F,L]
RewriteCond %{REMOTE_ADDR} ^12\.148\.196\.(12[8-9]|1[3-9][0-9]|2[0-4][0-9]|25[0-5])$ [OR]
RewriteCond %{REMOTE_ADDR} ^12\.148\.209\.(19[2-9]|2[0-4][0-9]|25[0-5])$ [OR]
RewriteCond %{HTTP_USER_AGENT} NPBot
RewriteRule .* - [F,L]
RewriteCond %{REMOTE_ADDR} ^64\.140\.49\.6([6-9])$ [OR]
RewriteCond %{HTTP_USER_AGENT} TurnitinBot
RewriteRule .* - [F,L]

########################
# Get Browser Agent Info
$get_agent = $ENV{'HTTP_USER_AGENT'}; # Get Browser Agent - Requires SSI

########################
# Check Against ban List
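foreach $ban (@browser) {
    # \Q...\E matches the entry literally, so the "." in "Wget/1.6" is not a regex wildcard
    if ($get_agent =~ /\Q$ban\E/) {
        $punish = 1;
    }
}
if ($punish) { # banned agent is placed into an infinite loop
    # note: this busy loop also keeps one of your own server processes occupied
    while (1) {
        $x++;
    }
}
else {
    print "Content-type: text/html\015\012\015\012"; # The innocent are set free
}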

############
# End Script 

Enjoy,

polonus
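
The ban-list check from the script can also be tried outside a web server. What follows is only a minimal sketch: `is_banned` is a hypothetical helper name that does not appear in the original script, and the sample User-Agent strings are made up for illustration. It matches every list entry literally, so a dot in an entry only matches a dot.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same example list as in the script above.
my @browser = ("Wget/1.6", "Zeus", "EmailSiphon");

# is_banned is a hypothetical helper name, not part of the original script.
# Returns 1 if the User-Agent string contains any banned entry, else 0.
sub is_banned {
    my ($agent) = @_;
    return 0 unless defined $agent;
    foreach my $ban (@browser) {
        # \Q...\E quotes regex metacharacters, so "Wget/1.6" needs a literal dot.
        return 1 if $agent =~ /\Q$ban\E/;
    }
    return 0;
}

# Try it with a few made-up User-Agent strings.
foreach my $ua ("Wget/1.6", "Wget/1x6", "Mozilla/5.0 (X11; Linux x86_64)") {
    printf "%-35s %s\n", $ua, is_banned($ua) ? "banned" : "allowed";
}
```

In a real CGI the string would come from `$ENV{'HTTP_USER_AGENT'}` instead of the sample list.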

So would it be easy to add wareout to this?

69.50.184.52=== OrgName: Atrivo

195.225.176.31===netcathost.com

and also the 85. ones
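
Adding single addresses like these should be fairly easy, because a CGI script can read the client address from the `REMOTE_ADDR` environment variable. A minimal sketch, assuming a hypothetical `@banned_ip` list and `ip_banned` helper (neither exists in the original script); the two addresses are the ones quoted in this post. The "85." addresses are left out because no exact ranges are given here.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical list; the two addresses are the ones quoted in this post.
my @banned_ip = ("69.50.184.52", "195.225.176.31");

# ip_banned is a hypothetical helper: 1 if the address is on the list, else 0.
sub ip_banned {
    my ($addr) = @_;
    return 0 unless defined $addr;
    foreach my $ip (@banned_ip) {
        return 1 if $addr eq $ip;    # exact string comparison, no regex involved
    }
    return 0;
}

# The web server sets REMOTE_ADDR for CGI scripts.
my $remote = $ENV{'REMOTE_ADDR'};
if (ip_banned($remote)) {
    # Refuse the request with a CGI Status header instead of looping.
    print "Status: 403 Forbidden\015\012Content-type: text/plain\015\012\015\012Forbidden\n";
}
else {
    print "Content-type: text/html\015\012\015\012";
}
```

Sending a 403 here instead of an infinite loop is a design choice of this sketch, mirroring the `[F]` flag in the RewriteRule lines.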

Hi essexboy,

It is just an example, the working out of an idea; I will come up with a more elaborate one. I just have to get a feel for how this one works:

RewriteCond %{HTTP_HOST} !^yarinareth.net$ [NC]
RewriteCond %{HTTP_REFERER} ^(.*)$ [NC]
RewriteRule ^(.*)$ %1 [R=301,L]

RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://localhost/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://mysite\.com/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://www\.mysite\.com/.*$ [NC]
RewriteRule .*\.(gif|GIF|jpg|JPG)$ http://mysite/images/bad.gif [L,

Hi webforum users,

I made the code more flexible and expanded the bot list:
http://www.4shared.com/file/17743795/620dcb0/bot-blocker.html

polonus

@Damian: Your link doesn’t work …

Hi DarthMickey,

Now it works, I guess.

Damian

It works, Damian :wink:
Where’s your stamp? ;D

Hi drhayden1,

Stamp added. Use this bot blocker script either in a chrome filter or attached to a CGI for your web page. Very useful for web developers and bot blockers alike. The history of the script is that the skeleton is a general bot-blocking Perl script; the insertion of the RewriteCond HTTP referer list was my personal addition.
It is helpful against bad bots that do not play by the rules, or that are undesirable because they snoop. If people can come up with further additions or comments, feel free to do so…

polonus
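
Since RewriteCond lines are Apache mod_rewrite directives rather than Perl, one way to reuse the same patterns inside the Perl skeleton is a list of compiled regexes. This is only a sketch under assumptions: `@bad_agent` and `agent_blocked` are hypothetical names, the handful of patterns is copied from the list in the first post, and mod_rewrite's `[NC]` flag is written as Perl's `/i`.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# A few patterns copied from the RewriteCond list; [NC] becomes /i.
my @bad_agent = (
    qr/^[A-Z]+$/,                # spambot: all-uppercase agent string
    qr/cherry.?picker/i,         # spambot
    qr/e?mail.?(collector|magnet|reaper|siphon|sweeper|harvest|collect|wolf)/i,
    qr/httrack/i,                # offline downloader
    qr/Indy Library/,            # spambot
    qr/tele(port|soft)/i,        # offline downloader
);

# agent_blocked is a hypothetical helper: 1 if any pattern matches, else 0.
sub agent_blocked {
    my ($agent) = @_;
    return 0 unless defined $agent;
    foreach my $re (@bad_agent) {
        return 1 if $agent =~ $re;
    }
    return 0;
}

my $agent = $ENV{'HTTP_USER_AGENT'};
print agent_blocked($agent) ? "blocked\n" : "allowed\n";
```

Keeping the patterns in one array means a new bot is added by appending a single `qr//` entry.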

Yes the link works fine now. Haven’t tried the script yet.

like the approval stamp damian-who’d you get it from ;D

Hi Dan,

Thanks. But there is an even better one to download here: http://www.4shared.com/file/17779560/da4b7a2a/botblocker2.html
File size: 9368 bytes
MD5: 2b1f36d7f867b6d1d41ee765b3ae7d96
SHA1: 32bd1dcbf60600b73d4886a97fadbfb8458c83c5
packers: Unicode
Scan before downloading, the file is packed and zipped.
Site File size: 20392 bytes
DrWeb’s scan results
botblocker2.html - archive HTML

botblocker2.html/Script.0 - OK
botblocker2.html/Script.1 - OK
botblocker2.html/Script.2 - OK
botblocker2.html/Script.3 - OK
botblocker2.html/Script.4 - OK
botblocker2.html/Script.5 - OK
botblocker2.html/Script.6 - OK
botblocker2.html/Script.7 - OK
botblocker2.html - OK

polonus