bot-trap allows your Web site to automatically ban bad Web robots (a.k.a. Web spiders) that ignore the robots.txt file.
This does not include Googlebot and other well-behaved robots.
The main advantage over other implementations of this concept is that bot-trap has a manual "unban" feature: humans can unban themselves, but robots can't.
How It Works:
- You place a small "web-bug" strategically in your web pages. This bug is just a tiny image link pointing to /bot-trap/index.php. Human visitors don't see the link, but web bots follow it.
- You create a /robots.txt file that tells web bots not to go to the /bot-trap directory.
When a bad robot visits /bot-trap/index.php anyway, the script adds the robot's IP address to a block list in /.htaccess, and that address is blocked from the site from then on. You can also be emailed when this happens.
It is possible that someone who shouldn't be banned gets banned. Perhaps a previous user of an IP address in a DHCP pool ran a bad bot, and now the address's new user is banned. Not to worry: the custom "403 Forbidden" page lets any user unban themselves by typing a requested word into a form box. Real people can easily do this, but bots can't!
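Under the hood the ban is plain Apache access control. The exact lines bot-trap appends to /.htaccess may vary by version, but they amount to something like this (the IP address is a made-up example):

```apache
# Block list maintained by bot-trap in /.htaccess (sketch; address is illustrative)
Order Allow,Deny
Allow from all
Deny from 192.0.2.57
```

Each newly trapped bot adds another "Deny from" line; removing a line (or using the unban form) restores that address's access.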
1. Unpack the tarball in your web page root directory:
# tar -xzf bot-trap-x.x.tar.gz
2. Either add a line to your root .htaccess file like:
ErrorDocument 403 /bot-trap/forbid.php
or copy the premade one (bot-trap/htaccess-root-example). Note that once an IP is banned it can't access anything under /, so the 403 page must live in /bot-trap, and /bot-trap/.htaccess should say only "Allow from all". Look at the forbid.php file in the distribution to see how to do this, or just use it as-is.
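A minimal /bot-trap/.htaccess that keeps the 403 page reachable could look like this (Apache 2.2-style directives, matching the "Allow from all" mentioned above):

```apache
# /bot-trap/.htaccess -- override the site-wide ban so that banned visitors
# can still load forbid.php and unban themselves
Order Allow,Deny
Allow from all
```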
3. Make sure .htaccess controls are allowed in your Apache configuration (especially the "AllowOverride" directive). This allows bot-trap to ban IP addresses using the htaccess mechanism.
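For example, your Apache configuration needs at least something like the following for the web root (the directory path is a placeholder for your actual DocumentRoot):

```apache
# "Limit" permits Order/Allow/Deny in .htaccess; "FileInfo" permits ErrorDocument.
<Directory /var/www/html>
    AllowOverride Limit FileInfo
</Directory>
```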
4. Create the empty file blacklist.dat in your web root directory, and make blacklist.dat, .htaccess, and the bot-trap directory in your web root owned by the www user with write permission. If your web server runs under a group (such as "www-data" on Debian GNU/Linux), make these files and directories group-writable.
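A sketch of this step on a Debian-style system; WEBROOT and WEBGROUP are placeholders you must set for your site (the fallbacks only let the snippet run harmlessly as a demo):

```shell
# Placeholders: point WEBROOT at your real web root and WEBGROUP at your
# server's group (e.g. www-data on Debian), then run as root.
WEBROOT="${WEBROOT:-$(mktemp -d)}"   # demo fallback: a scratch directory
WEBGROUP="${WEBGROUP:-$(id -gn)}"    # demo fallback: your own primary group

mkdir -p "$WEBROOT/bot-trap"
touch "$WEBROOT/blacklist.dat" "$WEBROOT/.htaccess"

# Give the web server's group write access, per step 4.
chgrp "$WEBGROUP" "$WEBROOT/blacklist.dat" "$WEBROOT/.htaccess" "$WEBROOT/bot-trap"
chmod g+w "$WEBROOT/blacklist.dat" "$WEBROOT/.htaccess" "$WEBROOT/bot-trap"
```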
5. Edit bot-trap/settings.php to hold the correct email addresses to send alerts to.
6. Add "web-bugs" to your main web page to catch the bad bots. This is the XHTML code:
<!-- Bad robot trap: Don't go here or your IP will be banned! -->
<a href="/bot-trap/"><img src="bot-trap/pixel.gif" border="0"
alt=" " width="1" height="1"/></a>
7. Add the bot-trap directory to your robots.txt file, or copy the example robots.txt file (bot-trap/robots.txt.example) to the root directory.
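An entry like the following (compare the shipped bot-trap/robots.txt.example) keeps compliant crawlers out of the trap:

```
User-agent: *
Disallow: /bot-trap/
```

Well-behaved robots such as Googlebot honor this and never see the trap; only robots that ignore robots.txt ever reach /bot-trap/index.php.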
8. Make sure /.htaccess and all other files have the correct permissions and ownership for your site.
Download bot-trap 0.92