Another option is to install the Rip Prevention mod written by Brian McFadyen and Brent Hemphil. Many bots scrape data and often avoid bot traps. The Rip Prevention mod checks whether a visitor's accesses are rapid and repeated. If they are, a warning is issued, and if the rapid, repeated accesses continue, the visitor is temporarily banned and an explanation page is displayed.
For example, within one hour I had over 500 accesses from a bot. I know this because Rip Prevention writes banned accesses to the log file with a line similar to the following:
Mon 01 Aug 2022 01:05:43 PM Rip Prevention Challenge accessed by 18.104.22.168/ from https://familyhistories.us:443/familytree/getperson.php
Yet this bot never accessed the bot-trap.
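If you want to see which addresses are hammering the site, the log can be tallied per IP. The following is only a rough sketch: the regular expression is guessed from the single sample line above, so other log entries may look different, and `count_rip_hits` is a made-up helper name, not part of the mod.

```python
import re
from collections import Counter

# Assumed line shape, inferred only from the one sample line quoted above:
#   "<timestamp> Rip Prevention Challenge accessed by <ip>/ from <url>"
LINE_RE = re.compile(
    r"Rip Prevention .*?accessed by (?P<ip>[0-9A-Fa-f.:]+)/\S* from\s+(?P<url>\S+)"
)


def count_rip_hits(lines):
    """Return a Counter mapping IP address -> number of Rip Prevention entries."""
    hits = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:  # lines that do not look like Rip Prevention entries are skipped
            hits[match.group("ip")] += 1
    return hits
```

Feeding it the lines of the log file (for example `count_rip_hits(open(path))`, where `path` is wherever your log lives) gives a per-address count you can sort to find the worst offenders.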
Warnings and bans are disabled for administrators and logged-in users. The mod creates a check_access.php file that can be edited manually to add or remove bots.
You can optionally install the Rip Challenge Mod. This mod works with the Rip Prevention Mod by adding a CAPTCHA challenge after a configurable number of accesses (default 30) for non-registered users. Using these two mods along with Bot-trap, bot access has been dramatically reduced on my website.
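To make the warn-then-ban idea concrete, here is a toy model of it. It is not the mod's code (the mod is PHP and its real thresholds are not given here); the class name `RipGuard` and the values for `max_hits`, `window` and `ban_seconds` are all assumptions chosen for illustration.

```python
from collections import defaultdict, deque


class RipGuard:
    """Toy warn-then-ban limiter; every threshold here is an illustrative assumption."""

    def __init__(self, max_hits=10, window=5.0, ban_seconds=600.0):
        self.max_hits = max_hits          # accesses tolerated inside one window
        self.window = window              # window length in seconds
        self.ban_seconds = ban_seconds    # length of the temporary ban
        self.hits = defaultdict(deque)    # ip -> timestamps of recent accesses
        self.warned = set()               # ips that already received a warning
        self.banned_until = {}            # ip -> time at which the ban ends

    def check(self, ip, now, trusted=False):
        """Return 'ok', 'warn' or 'banned' for one access at time `now` (seconds)."""
        if trusted:                       # administrators and logged-in users are exempt
            return "ok"
        if self.banned_until.get(ip, 0.0) > now:
            return "banned"
        recent = self.hits[ip]
        recent.append(now)
        while recent and now - recent[0] > self.window:
            recent.popleft()              # forget accesses older than the window
        if len(recent) <= self.max_hits:
            return "ok"
        recent.clear()                    # start counting afresh after tripping the limit
        if ip not in self.warned:
            self.warned.add(ip)
            return "warn"                 # first offence: warning only
        self.warned.discard(ip)
        self.banned_until[ip] = now + self.ban_seconds
        return "banned"                   # repeated offence: temporary ban
```

A visitor who trips the limit once gets a warning; tripping it again leads to a ban that expires on its own, which matches the behavior described above in spirit only.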
IMPORTANT! If you install the Bot-trap mod, regularly monitor additions to the .htaccess file. Although applebot, bingbot, googlebot, msnbot and many others claim to follow the robots.txt rules, they DO NOT. These good bots were blocked on my site even though the correct entries were present in the robots.txt file. This was verified using Google's robots.txt tester. If a "good bot" gets blocked, you can allow continued access by removing its IP address from the .htaccess file while leaving its IP address line in the blacklist file. This prevents the bot from being added again even if it trips the bot-trap.
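Removing a good bot's address by hand is easy to get wrong when similar addresses sit next to each other. A small sketch of the edit follows. It assumes each banned address sits on its own "Require not ip <address>" line, and `allow_ip` is a hypothetical helper name. It only touches the .htaccess text and deliberately leaves the blacklist file alone.

```python
import re


def allow_ip(htaccess_text, ip):
    """Drop every 'Require not ip <ip>' line for `ip`; leave all other lines alone."""
    pattern = re.compile(
        r"^\s*Require\s+not\s+ip\s+" + re.escape(ip) + r"\s*$", re.IGNORECASE
    )
    kept = [line for line in htaccess_text.splitlines() if not pattern.match(line)]
    return "\n".join(kept) + "\n"
```

Because the whole line must match, unbanning 192.0.2.10 does not accidentally remove a line for 192.0.2.100. Make a copy of .htaccess before writing the result back.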
Bryan Larson developed the TNG Bot-trap Mod from bot-trap code written by Daniel Webb. I used it until Apache 2.4 was installed. The TNG Wiki version can be used with Apache 2.4 provided the Apache mod_access_compat module is installed. If the module is not installed you will get an error. Most hosts use the module, but it is not used on my Synology NAS.
This version of Bryan's Bot-trap was modified to work with Apache 2.4 authorization containers. It is NOT compatible with Apache 2.2 and uses "Require not ip" instead of "Deny from ip".
If you have an existing Apache .htaccess file, extract the Bot-trap files to the TNG mod folder and run Mod Manager. Select "Run Checks" from the Bot-trap install menu. Run Checks will look for the Apache containers and add them if they are not present. If your previous .htaccess file has denied IP addresses or hosts, edit the .htaccess file, change "Deny from" to "Require not ip" (or "Require not host" for host names), and move those entries between the <RequireAll> and </RequireAll> tags. Failure to place those lines between the tags may cause a server 500 error. It may be possible to use both the old and the new directives if the mod_access_compat module is installed, but you could get unpredictable results in some cases.
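If the old .htaccess file has many "Deny from" lines, the conversion can be scripted. The sketch below is an assumption-laden helper (`convert_deny_line` is my own name, not something the mod provides): it treats tokens made only of digits, dots and an optional mask, or containing a colon, as IP addresses and everything else as host names, and it refuses to guess at "Deny from all" or "env=" lines.

```python
import re

DENY_RE = re.compile(r"^\s*Deny\s+from\s+(.+?)\s*$", re.IGNORECASE)


def is_ip_like(token):
    """True for IPv4 (full, partial or with a mask) and anything containing ':' (IPv6)."""
    return ":" in token or re.fullmatch(r"[0-9.]+(/[0-9.]+)?", token) is not None


def convert_deny_line(line):
    """Return the Apache 2.4 lines for one 'Deny from ...' line, or [] if not handled."""
    match = DENY_RE.match(line)
    if not match:
        return []
    targets = match.group(1).split()
    if any(t.lower() == "all" or t.lower().startswith("env=") for t in targets):
        return []  # these need a manual decision, so they are left for you to handle
    converted = []
    for target in targets:
        kind = "ip" if is_ip_like(target) else "host"
        converted.append(f"Require not {kind} {target}")  # one directive per target
    return converted
```

The returned lines still have to be placed between the <RequireAll> and </RequireAll> tags by hand, and the original "Deny from" lines removed.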
How does Bot-trap work? Bot-trap adds a small graphic with a link that humans do not see but bots can follow. When the link is opened, Bot-trap bans the IP address immediately but provides an option for a user to unban themselves. Users can select the "I'm human" button and type the correct response to unban themselves. If a user or bot abandons the page without unbanning themselves, the IP address remains banned.
If you have TNG installed inside a CMS, and are using the CMS footer, you will need to add a line, similar to the one below, in the theme's footer.php file to set the Bot-trap.
Bot-trap writes banned IPs to the TNG .htaccess file but does not change the CMS .htaccess file.
Code: Select all
echo '<a href="../TNGFOLDER/bot-trap/"><img src="../TNGFOLDER/bot-trap/pixel.gif" border="0" alt=" " width="1" height="1"></a>' . "\n";
Use caution as you could lock yourself out of your own site if you don't unban yourself or otherwise remove your own IP address from the .htaccess file.
This version was updated on Oct 1, 2020 and tested on v12.0, v12.2, v12.3 and v13.0.
- Checks the Apache version when checks are run.
- The TNG header and footer are displayed selectively due to new templates.
- Run Checks has a few improvements to scan code and add the correct values when missing.
- Files contain commented code for using TNG integrated with WordPress which was tested with TNG in the WP folder. This version was not tested with both WP and TNG in the root. However, it should function unless you have some unusual file rewrite rules.
This file was tested on a Synology server using TNG v12.0.1, v12.1, v12.2, v12.3, Apache 2.4, PHP 5.6, PHP 7.0 and PHP 7.2. It should be compatible with earlier versions of TNG that use the stdsitecredit file, but this was not tested.
What's new in this version?
- Compatible with Apache 2.4 using the <RequireAll></RequireAll> Apache Authorization Container tags.
- When a bot hits the trap, two file writes occur instead of three. One for .htaccess and one for blacklist.dat.
- Run Checks preserves existing lines in robots.txt and .htaccess and adds the correct values if they are missing.
- Added the option to make a backup copy of the existing .htaccess file before adding an IP address. If a failure occurs you can manually rename and use the copy.
- Created an optional file to protect TNG folders from unauthorized direct access. (optional)
- Created an error message page that loads provided you add the error handler line to .htaccess. (optional)
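The backup-then-two-writes sequence in the list above can be pictured roughly as follows. This is a sketch, not the mod's PHP code: `ban_ip` is a made-up name, the ".bak" suffix is a hypothetical backup file name, and the one-address-per-line layout of blacklist.dat is an assumption.

```python
import shutil
from pathlib import Path


def ban_ip(htaccess, blacklist, ip, backup=True):
    """Add `ip` inside the <RequireAll> block of .htaccess and append it to the blacklist."""
    htaccess, blacklist = Path(htaccess), Path(blacklist)
    lines = htaccess.read_text().splitlines()
    rule = f"Require not ip {ip}"
    if rule in (line.strip() for line in lines):
        return False                                   # already banned; nothing to write
    close = next(
        (i for i, line in enumerate(lines) if line.strip().lower() == "</requireall>"),
        None,
    )
    if close is None:
        raise ValueError("no </RequireAll> tag found; run the mod's checks first")
    if backup:
        # Hypothetical backup name; keeps a copy to rename by hand if a write fails.
        shutil.copy2(htaccess, htaccess.with_name(htaccess.name + ".bak"))
    lines.insert(close, rule)                          # place the rule before the closing tag
    htaccess.write_text("\n".join(lines) + "\n")       # write 1: .htaccess
    with blacklist.open("a") as handle:                # write 2: blacklist file
        handle.write(ip + "\n")
    return True
```

Inserting the rule just before the closing tag keeps every "Require not ip" line inside the container, which is what avoids the server 500 error.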
Creating a 403 error page for your website
If your TNG website requires a login you DO NOT need Bot-trap. If you install Bot-trap anyway, users will not be able to unban themselves. There may also be a situation where a user inherits a banned IP from their provider. Since their IP is banned, they may never get a chance to unban themselves.
To give users a way to contact the admin of your website, add an error handler to your .htaccess file. When a user is denied access, the error page will load with information including a contact email address, provided they are not using IE. Apparently Microsoft does not like their users receiving informational messages if access is denied.
If you want to display a message instead of a blank 403 error page, add the code below to your TNG .htaccess file. Then copy 403.php from the optional_files folder, located in the mods bot-trap folder, to the TNG bot-trap folder. This creates an error handler that loads the 403.php file included with Bot-trap.
The 403 error page displays a message along with a contact email address, provided you enter one in the Bot-trap options menu. Once you've added the error handler, if someone is banned, or trips the bot-trap and does not complete the unban procedure correctly, the message will display so they can contact you. If you DO NOT want to display the error page, do not add the ErrorDocument code to .htaccess.
Code: Select all
ErrorDocument 403 "<meta http-equiv='refresh' content='0; url=bot-trap/403.php'/>"
If you manually remove an IP address from .htaccess file, do not forget to remove the address from the blacklist.dat file, unless you want to grant that IP address permanent access.
Thanks to Daniel Webb for originally creating Bot-trap and Bryan Larson for developing TNG Bot-trap.