Bot-trap for TNG and Apache 2.4


Post by steven »

Use Bot-trap or not? The answer depends on your opinions, the data on your website and what you want to achieve. Most bots are helpful: without them you would not be reading this or be able to find virtually anything on the internet. We expect bots to follow the robots.txt rules but, just like motorists and speed limits, many bots do not pay any attention to robots.txt. Unfortunately a bot-trap bans the IP address, not the bot. If the person running the bot changes their IP address, the bot is no longer banned, and another user who gets that dynamically assigned IP address could be banned instead. If you install Bot-trap, consider how an innocent user inheriting a banned IP address will contact you. One way to remedy the problem is to create an error handler (described below the download link). Error handlers allow you to display informational messages, including an email address if you feel it is necessary.

Another option is to install the Rip Prevention mod written by Brian McFadyen and Brent Hemphil. Many bots scrape data and often avoid bot traps. The Rip Prevention mod checks whether a visitor's accesses are rapid and repeated. If they are, a warning is issued, and if the accesses continue rapidly and repeatedly, the visitor is temporarily banned and an explanation page is displayed.
For example, within one hour I had over 500 accesses from a bot. I know this because Rip Prevention writes banned accesses to the log file with a line similar to the following: Mon 01 Aug 2022 01:05:43 PM Rip Prevention Challenge accessed by 194.110.203.3/ from https://familyhistories.us:443/familytree/getperson.php. Yet this bot never accessed the bot-trap.

Warnings and bans are disabled for administrators and logged-in users. The mod creates a check_access.php file that can be edited manually to add or remove bots. You can optionally install the Rip Challenge Mod. This mod works with the Rip Prevention Mod by adding a CAPTCHA challenge after a configurable number of accesses (default 30) for unregistered users. Using these two mods along with Bot-trap, bot access has been dramatically reduced on my website.

IMPORTANT! If you install the Bot-trap mod, regularly monitor additions to the .htaccess file. Although applebot, bingbot, googlebot, msnbot and many others claim to follow the robots.txt rules, they DO NOT. These good bots were blocked on my site even though the correct entries were present in the robots.txt file, which I verified using Google's robots.txt tester. If a "good bot" gets blocked, you can allow continued access by removing its IP address from the .htaccess file while leaving its IP address line in the blacklist file. This prevents the bot from being added again even if it trips the bot-trap.

Bryan Larson developed the TNG Bot-trap Mod from bot-trap code written by Daniel Webb. I used it until Apache 2.4 was installed. The TNG WIKI version can be used with Apache 2.4 provided the Apache mod_access_compat module is installed; without the module you will get an error. Most hosts use the module, but it is not used on my Synology NAS.

This version of Bryan's Bot-trap was modified to work with Apache 2.4 authorization containers. It is NOT compatible with Apache 2.2 and uses "Require not ip" instead of "Deny from".

If you have an existing Apache .htaccess file, extract the Bot-trap files to the TNG mod folder and run Mod Manager. Select "Run Checks" from the Bot-trap install menu. Run Checks looks for the Apache containers and adds them if they are not present. If your previous .htaccess file has denied IP addresses or hosts, edit the .htaccess file, change "Deny from" to "Require not ip" (or "Require not host" for host names) and move those entries between the <RequireAll> and </RequireAll> tags. Failure to place those lines between the tags may cause a server 500 error. It may be possible to use both styles of directive if the mod_access_compat module is installed, but you could get unpredictable results in some cases.
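
As a rough illustration of the conversion described above, the fragment below shows how a couple of Apache 2.2 deny lines might look once moved into an Apache 2.4 authorization container. The IP address is a reserved documentation address and the host name is a placeholder, not entries from a real ban list, and the exact lines Run Checks writes may differ. The "Require all granted" line is my assumption about what the container needs: as far as I understand it, a <RequireAll> block holding only negated lines does not grant access on its own.

```apache
# Apache 2.2 style (only works on Apache 2.4 with mod_access_compat):
#   Order Allow,Deny
#   Allow from all
#   Deny from 192.0.2.15
#   Deny from bad-bot.example.com

# Apache 2.4 style, using an authorization container:
<RequireAll>
    # Assumed baseline: allow everyone by default...
    Require all granted
    # ...except the banned entries (placeholder values)
    Require not ip 192.0.2.15
    Require not host bad-bot.example.com
</RequireAll>
```

Keeping every ban line inside the one container makes it easy to see at a glance which addresses are blocked when you review the .htaccess file.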

How does Bot-trap work? Bot-trap adds a small graphic with a link that humans do not see but bots can. When the link is opened, Bot-trap bans the IP address immediately but gives the user an option to unban themselves: select the "I'm human" button and type the correct response. If a user or bot abandons the page without completing the unban, the IP address remains banned.

If you have TNG installed inside a CMS, and are using the CMS footer, you will need to add a line, similar to the one below, in the theme's footer.php file to set the Bot-trap.

Code:

// Hidden 1x1 pixel link that sets the Bot-trap; TNGFOLDER stands for your TNG folder.
echo "<a href=\"../TNGFOLDER/bot-trap/\"><img src=\"../TNGFOLDER/bot-trap/pixel.gif\" border=\"0\" alt=\" \" width=\"1\" height=\"1\"></a>\n";
Bot-trap writes banned IPs to the TNG .htaccess file but does not change the CMS .htaccess file.

Use caution as you could lock yourself out of your own site if you don't unban yourself or otherwise remove your own IP address from the .htaccess file.

This version was updated on Oct 1, 2020 and tested on v12.0, v12.2, v12.3 and v13.0.

bot-trap_v12.0.0.7.zip
(23.4 KiB) Downloaded 506 times


Revisions:
  1. Checks the Apache version when checks are run.
  2. The TNG header and footer are displayed selectively due to new templates.
  3. Run Checks has a few improvements to scan code and add the correct values when missing.
  4. Files contain commented code for using TNG integrated with WordPress which was tested with TNG in the WP folder. This version was not tested with both WP and TNG in the root. However, it should function unless you have some unusual file rewrite rules.

This file was tested on a Synology server using TNG v12.0.1, v12.1, v12.2, v12.3, Apache 2.4, PHP5.6, PHP7.0 and PHP7.2. It should be compatible with earlier versions of TNG that use the stdsitecredit file but was not tested.

Bot-trap_v12.0.0.6-180824.zip
(21.84 KiB) Downloaded 772 times


What's new in this version?
  1. Compatible with Apache 2.4 using the <RequireAll></RequireAll> Apache Authorization Container tags.
  2. When a bot hits the trap, two file writes occur instead of three. One for .htaccess and one for blacklist.dat.
  3. Run Checks preserves existing lines in robots.txt and .htaccess and adds the correct values if they are missing.
  4. Added the option to make a backup copy of the existing .htaccess file before adding an IP address. If a failure occurs you can manually rename and use the copy.
  5. Created an optional file to protect TNG folders from unauthorized direct access.
  6. Created an error message page that loads provided you add the error handler line to .htaccess. (optional)


Creating a 403 error page for your website
 
If your TNG website requires a login you DO NOT need Bot-trap. If you install Bot-trap anyway, users will not be able to unban themselves. However, there may be a situation where a user inherits a banned IP from their provider. Since their IP is banned, they may never get a chance to unban themselves. To give users a way to contact the administrator of your website, add an error handler to your .htaccess file. When a user is denied access, the error page will load with information including a contact email address, provided they are not using IE. Apparently Microsoft does not like their users receiving informational messages if access is denied. If you want to display a message instead of a blank 403 error page, add the code below to your TNG .htaccess file. Then copy 403.php from the optional_files folder, located in the mods bot-trap folder, to the TNG bot-trap folder. This creates an error handler that loads the 403.php file included with Bot-trap.

Code:

ErrorDocument 403 "<meta http-equiv='refresh' content='0; url=bot-trap/403.php'/>"
The 403 error page displays a message along with a contact email address, provided you enter one in the Bot-trap options menu. Once you've added the error handler, anyone who is banned, or who trips the bot-trap and does not complete the unban procedure correctly, will see the message below so they can contact you. If you DO NOT want to display the error page, do not add the ErrorDocument code to .htaccess.
 
[Image: the 403 error page message]
 
If you manually remove an IP address from the .htaccess file, do not forget to remove the address from the blacklist.dat file as well, unless you want to grant that IP address permanent access.

Thanks to Daniel Webb for originally creating Bot-trap and Bryan Larson for developing TNG Bot-trap.
Last edited by steven on Wed Sep 30, 2020 3:42 pm, edited 7 times in total.