Bot-trap for TNG and Apache 2.4
Posted: Mon Aug 20, 2018 6:56 am
Use Bot-trap or not? The answer depends on your opinions, the data on your website and what you want to achieve. Most bots are helpful; without them you would not be reading this or be able to find virtually anything on the internet. We expect bots to follow the robots.txt rules but, just like motorists and speed limits, many bots pay no attention to robots.txt. Unfortunately, a bot-trap bans the IP address, not the bot. If the person running the bot changes their IP address, the bot is no longer banned, and another user who is dynamically assigned the old address could be banned instead. If you install Bot-trap, consider how an innocent user who inherits a banned IP address will contact you. One way to remedy the problem is to create an error handler (described below the download link). An error handler lets you display an informational message, including an email address if you feel it is necessary.
If TNG is installed inside a CMS, note that Bot-trap writes banned IPs to the TNG .htaccess file only; it does not change the CMS .htaccess file.
An additional option is to install the Rip Prevention mod written by Brian McFadyen and Brent Hemphil. Many bots scrape data and often avoid bot traps. The Rip Prevention mod checks whether a visitor's accesses are rapid and repeated. If they are, a warning is issued, and if the rapid, repeated accesses continue, the visitor is temporarily banned and an explanation page is displayed.
For example, within one hour I had over 500 accesses from a bot. I know this because Rip Prevention writes banned accesses to the log file with a line similar to the following:
Mon 01 Aug 2022 01:05:43 PM Rip Prevention Challenge accessed by 194.110.203.3/ from https://familyhistories.us:443/familytree/getperson.php
Yet this bot never accessed the bot-trap.
Warnings and bans are disabled for administrators and logged-in users. The mod creates a check_access.php file that can be edited manually to add or remove bots. You can optionally install the Rip Challenge Mod, which works with the Rip Prevention Mod by adding a CAPTCHA challenge after a configurable number of accesses (default 30) for non-registered users. Using these two mods along with Bot-trap has dramatically reduced bot access on my website.
IMPORTANT! If you install the Bot-trap mod, regularly monitor additions to the .htaccess file. Although applebot, bingbot, googlebot, msnbot and many others claim to follow the robots.txt rules, they DO NOT. These good bots were blocked on my site even though the correct entries were present in the robots.txt file, which I verified using Google's robots.txt tester. If a "good bot" gets blocked, you can allow continued access by removing its IP address from the .htaccess file while leaving its line in the blacklist file (blacklist.dat). This prevents the bot from being added again even if it trips the bot-trap.
Bryan Larson developed the TNG Bot-trap Mod from bot-trap code written by Daniel Webb. I used it until Apache 2.4 was installed. The TNG WIKI version can be used with Apache 2.4 provided the Apache mod_access_compat module is installed. If the module is not installed you will get an error. Most hosts load the module, but it is not available on my Synology NAS.
This version of Bryan's Bot-trap was modified to work with Apache 2.4 authorization containers. It is NOT compatible with Apache 2.2 and uses "Require not ip" instead of "Deny from".
If you have an existing Apache .htaccess file, extract the Bot-trap files to the TNG mod folder and run Mod Manager. Select "Run Checks" from the Bot-trap install menu. Run Checks will look for the Apache containers and add them if they are not present. If your previous .htaccess file has denied IP addresses or hosts, edit the .htaccess file, change each "Deny from" to "Require not ip" (or "Require not host" for a host name) and move those entries between the <RequireAll> and </RequireAll> tags. Failure to place those lines between the tags may cause a server 500 error. It may be possible to use both the old and new directives if the mod_access_compat module is installed, but you could get unpredictable results in some cases.
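To make the conversion concrete, here is a minimal sketch of what a hand-converted .htaccess might look like. The addresses and host name are placeholders from the documentation ranges, not real bans, and the exact layout Bot-trap itself writes is an assumption on my part; check your own file after running "Run Checks". The old Apache 2.2 lines are shown commented out for comparison only.

```apache
# Old Apache 2.2 style (only works on 2.4 if mod_access_compat is loaded):
#   Order Allow,Deny
#   Allow from all
#   Deny from 192.0.2.15
#   Deny from example.net

# Apache 2.4 style: negated Require lines must sit inside <RequireAll>,
# together with at least one directive that can succeed.
<RequireAll>
    Require all granted
    # Placeholder entries; replace with the addresses from your old file.
    Require not ip 192.0.2.15
    Require not ip 198.51.100.0/24
    Require not host example.net
</RequireAll>
```

The "Require all granted" line matters: negated directives can only fail or stay neutral, so a container holding nothing but "Require not" lines never grants access to anyone.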
How does Bot-trap work? Bot-trap adds a small graphic with a link that humans do not see but bots can follow. When the link is opened, Bot-trap bans the IP address immediately but gives the visitor an option to unban themselves: select the "I'm human" button and type the correct response. If a user or bot leaves the page without unbanning themselves, the IP address remains banned.
If you have TNG installed inside a CMS, and are using the CMS footer, you will need to add a line, similar to the one below, in the theme's footer.php file to set the Bot-trap.
Code:
echo '<a href="../TNGFOLDER/bot-trap/"><img src="../TNGFOLDER/bot-trap/pixel.gif" border="0" alt=" " width="1" height="1"></a>' . "\n";
Use caution as you could lock yourself out of your own site if you don't unban yourself or otherwise remove your own IP address from the .htaccess file.
TNG WIKI Download Link for v14.0.4 and v14.0.5
Revisions:
- This version does not affect SEO. Prior versions caused a lower rating due to accessibility issues.
- TNG header and footer are not shown except on the unban page.
- Added Dutch, English, French, German and Spanish translations so users can see how to unban themselves in their own language.
- Added more detailed instructions about Apache .htaccess and converting to 2.4 directives.
- Changed operation so IPs in blacklist.dat that ARE NOT in .htaccess are ignored.
- Changed the name of the Bot-trap folder in the TNG root for smarter bots that avoid keywords in the URL.
Creating a 403 error page for your website
If your TNG website requires a login, you DO NOT need Bot-trap. If you install Bot-trap anyway, users will not be able to unban themselves.
However, there may be a situation where a user inherits a banned IP from their provider. Since their IP is banned, they may never get a chance to unban themselves. To give users a way to contact the admin of your website, add an error handler to your .htaccess file. When a user is denied access, the error page will load with information including a contact email address, provided they are not using IE. Apparently Microsoft does not like their users receiving informational messages if access is denied.
If you want to display a message instead of a blank 403 error page, add the code below to your TNG .htaccess file. Then copy 403.php from the optional_files folder, located in the mods bot-trap folder, to the TNG bot-trap folder. This creates an error handler that loads the 403.php file included with Bot-trap. The 403 error page displays a message along with a contact email address, provided you enter one in the Bot-trap options menu. Once you've added the error handler, anyone who is banned, or who trips the bot-trap and does not complete the unban procedure correctly, will see the 403 error page so they can contact you. If you DO NOT want to display the error page, do not add the ErrorDocument code to .htaccess.
Code:
ErrorDocument 403 "<meta http-equiv='refresh' content='0; url=bot-trap/403.php'/>"
If you manually remove an IP address from the .htaccess file, do not forget to also remove it from the blacklist.dat file, unless you want to grant that IP address permanent access.
Thanks to Daniel Webb for originally creating Bot-trap and Bryan Larson for adapting Bot-trap to TNG.