Blocking Bad UserAgents with ModSecurity and Fail2ban

Many bots crawl websites without generating any useful traffic for the site; they just consume the server's resources and bandwidth. You can cut this down considerably by using ModSecurity to detect the bad user agents and then Fail2ban to block them in iptables for a period of time. This guide assumes you already have ModSecurity installed. If you do not, follow our guide to install it first, then continue here.

Configure ModSecurity to Block User Agents

In our Apache configuration, we already have an include directory for ModSecurity rules:

Include /etc/httpd/conf/modsecurity.d/rules/*.conf

So we are going to create a new .conf file in that directory to start detecting agents:

nano /etc/httpd/conf/modsecurity.d/rules/block_user_agents.conf

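Add the following rule and save the file. The explicit deny,status:406 makes ModSecurity reject the request and log "Access denied with code 406", which is the text the Fail2ban filter later in this guide matches on. Without a disruptive action, the rule may only log a warning, depending on your SecDefaultAction.

SecRule REQUEST_HEADERS:User-Agent "@pmFromFile badbots.txt" "id:350001,rev:1,severity:2,deny,status:406,log,msg:'BAD BOT - Detected and Blocked. '"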

We are then going to create the list of User Agents to be detected and blocked:

nano /etc/httpd/conf/modsecurity.d/rules/badbots.txt

Insert the following user agents, one per line. If you want to let some of these in, edit the list as you see fit.

AhrefsBot
Anonymizer
Attributor
Baidu
Bandit
BatchFTP
Bigfoot
Black.Hole
Bork-edition
DataCha0s
Deepnet Explorer
desktopsmiley
DigExt
feedfinder
gamingharbor
heritrix
ia_archiver
Indy Library
Jakarta
Java
juicyaccess
larbin
linkdex
Missigua
MRSPUTNIK
Nutch
panscient
plaNETWORK
Snapbot
Sogou
TinEye
TwengaBot
Twitturly
User-Agent
Viewzi
WebCapture
XX
Yandex
YebolBot
MJ12bot
masscan
baidu
RSSingBot
Scanbot
betaBot
DotBot
SemrushBot
mj12bot
FeedFetcher
seoscanners.net
Moreover
ltx71
inboundlinks.win
sitebot
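
Because @pmFromFile does a case-insensitive phrase match, short entries such as Java or XX will match any User-Agent that contains them. The sketch below approximates that behavior with grep so you can try a candidate User-Agent against a list before deploying it. It is not ModSecurity's own matcher; the ua_is_listed helper, the three-entry scratch list, and the sample User-Agent strings are assumptions made for illustration.

```shell
#!/bin/sh
# Scratch copy of a few entries from the list above; the real file is
# /etc/httpd/conf/modsecurity.d/rules/badbots.txt.
list=$(mktemp)
trap 'rm -f "$list"' EXIT
printf '%s\n' 'AhrefsBot' 'MJ12bot' 'SemrushBot' > "$list"

# ua_is_listed (hypothetical helper): succeeds when the given User-Agent
# contains any phrase from the list, ignoring case. This approximates
# @pmFromFile with grep; it is not ModSecurity's own matcher.
ua_is_listed() {
    printf '%s\n' "$1" | grep -qiF -f "$list"
}

# Illustrative User-Agent strings (assumed, not taken from real traffic).
for ua in \
    'Mozilla/5.0 (compatible; AhrefsBot/7.0; +http://ahrefs.com/robot/)' \
    'Mozilla/5.0 (compatible; mj12bot/v1.4.8)' \
    'Mozilla/5.0 (X11; Linux x86_64; rv:115.0) Gecko/20100101 Firefox/115.0'
do
    if ua_is_listed "$ua"; then
        echo "would match: $ua"
    else
        echo "no match:    $ua"
    fi
done
```

The second string matches even though the list spells it MJ12bot, which is why case variants of the same name do not need separate entries.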

Configure Fail2Ban

First you will need to install Fail2ban

yum -y install fail2ban

After that has finished installing, you will want to create a new jail file

nano /etc/fail2ban/jail.local

Keeping your jail in jail.local means package updates can replace the main Fail2ban configuration without overwriting your changes. Add the following:

[apache-modsecblock-badbots]
enabled = true
filter = apache-useragent
logpath = /var/log/httpd/error_log
action = iptables-multiport[name=apache-badbots, port="http,https", protocol=tcp]
 postback[name=BADBOT, port="http,https", protocol=tcp]
maxretry = 2
bantime = 172800
ignoreip = 127.0.0.0/8 10.0.0.0/8 192.168.1.0/24

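Update ignoreip with any local IPs or any others you want to allow in regardless of the user agent. With maxretry = 2, each IP can make two requests with a user agent from the list; after that it will be banned. Note that postback does not appear to be one of Fail2ban's stock actions, so remove that line unless you have defined it yourself in /etc/fail2ban/action.d.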
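
The bantime value is given in seconds, so 172800 is a 48-hour ban. If you want a different ban length, a quick calculation like the one below gives the number to put in the jail. The days variable is a hypothetical input used only for this example.

```shell
#!/bin/sh
# days is a hypothetical input; 2 reproduces the bantime used in the jail above.
days=2
bantime=$((days * 24 * 60 * 60))
echo "bantime = $bantime"
```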

You will then want to create the filter that holds the failregex pattern:

nano /etc/fail2ban/filter.d/apache-useragent.conf

And add the following

# Fail2Ban configuration file
#

[Definition]


# Option: failregex
# Notes.: Regexp to catch known spambots and software alike. Please verify
# that it is your intent to block IPs which were driven by
# abovementioned bots.
# Values: TEXT
#

failregex = [[]client <HOST>[]] ModSecurity: Access denied with code 406 .* [[]msg "BAD BOT - Detected and Blocked. "[]] .*$

# Option: ignoreregex
# Notes.: regex to ignore. If this regex matches, the line is ignored.
# Values: TEXT
#

ignoreregex =
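
Before relying on the filter, it is worth confirming that the failregex matches what ModSecurity actually writes. The sketch below approximates the check with grep: <HOST> is replaced by a simple IPv4 pattern, and the two log lines are hypothetical, abbreviated examples modelled on the Apache 2.4 error log format, not output copied from a real server. One is a denied request, the other a warning-only hit.

```shell
#!/bin/sh
# The failregex with <HOST> swapped for a simple IPv4 pattern (an approximation;
# Fail2ban's real <HOST> also covers hostnames and IPv6).
pattern='[[]client [0-9]{1,3}(\.[0-9]{1,3}){3}[]] ModSecurity: Access denied with code 406 .* [[]msg "BAD BOT - Detected and Blocked. "[]] .*$'

# Two hypothetical, abbreviated log lines: one denied request, one warning-only.
log=$(mktemp)
trap 'rm -f "$log"' EXIT
cat > "$log" <<'EOF'
[Tue Nov 28 16:34:54.520349 2017] [:error] [pid 30083] [client 51.255.65.33:49402] [client 51.255.65.33] ModSecurity: Access denied with code 406 (phase 2). Matched phrase "AhrefsBot" at REQUEST_HEADERS:User-Agent. [id "350001"] [msg "BAD BOT - Detected and Blocked. "] [severity "CRITICAL"]
[Tue Nov 28 16:34:54.520349 2017] [:error] [pid 30083] [client 51.255.65.33:49402] [client 51.255.65.33] ModSecurity: Warning. Matched phrase "AhrefsBot" at REQUEST_HEADERS:User-Agent. [id "350001"] [msg "BAD BOT - Detected and Blocked. "] [severity "CRITICAL"]
EOF

matches=$(grep -cE "$pattern" "$log")
echo "lines the filter would count as failures: $matches"

# On the server itself, Fail2ban's own tester gives the authoritative answer:
# fail2ban-regex /var/log/httpd/error_log /etc/fail2ban/filter.d/apache-useragent.conf
```

Only the "Access denied with code 406" line counts. If your error log shows "ModSecurity: Warning." instead, the rule is detecting the agent but not denying it, and Fail2ban will never ban the IP.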

Go ahead and restart Apache and Fail2ban:

service httpd restart

service fail2ban restart

You should now be able to watch the Apache error log, /var/log/httpd/error_log, to see whether bad user agents are being detected, and /var/log/fail2ban.log to see whether the bans are picked up.

May 31, 2017, LinuxAdmin.io

2 Comments

Chris

7 years ago

Hi, very interesting approach, but have not managed to get rid of ahrefs bot so far with this method. Followed your instructions, albeit slightly modified since on Debian. Badbots are detected and written into the error.log alright, but fail2ban fails to pick on them … maybe something wrong with the regex? It looks good to me, but since I am zero on regex I might be overlooking something … Here a sample of what is written into the error.log: [Tue Nov 28 16:34:54.520349 2017] [:error] [pid 30083] [client 51.255.65.33:49402] [client 51.255.65.33] ModSecurity: Warning. Matched phrase "AhrefsBot" at REQUEST_HEADERS:User-Agent. [file…

Author

LinuxAdmin.io

Reply to Chris

Hello Chris,
It might be the format. Have you tried running fail2ban-regex against the jail's filter and the log file to confirm it is matching the hit from ModSecurity?
