Blocking Bots on Your WordPress Site With Htaccess & More

No matter what type of site you have (WordPress, static HTML, etc.), bots can be a big nuisance. They can eat up your bandwidth, compromise your security, and cause a variety of other problems. One of the biggest problems is that hackers frequently use bots to try to breach the security of WordPress hosts, sites, databases, login portals, and all sorts of other websites. If you take the time to block a variety of bots from your site, you might be able to limit some of the generic attacks that are performed with a large number of bots.


Sometimes bots aren’t used for malicious purposes, but they can still cause trouble on your site. For example, some bots like the Wayback Machine bot at Archive.org help to preserve the web, which is a good thing, but some people still don’t want them crawling all over their site and eating up their bandwidth, especially if they’re on a shared hosting plan with limited resources or a small monthly bandwidth allowance.

Fortunately, there are a variety of ways that you can block bad or annoying bots. No matter what you do, some will still slip through; that’s just life in cyberspace. Hackers who create and use bots for malicious purposes are always finding ways to get through blocks. However, there are many things that you can do to help prevent a wide range of bots from accessing your site and creating problems.

Using Htaccess to Block Bots

If you’re using an Apache server, you can use your .htaccess file to block a variety of bots in a few different ways.

The first thing that you can do is put a few lines of code in your .htaccess file that detect the user agent of the bot and then block access to the website. The code for that is as follows:

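# Block bad user agents (requires mod_rewrite)
RewriteEngine On
# Each condition matches a User-Agent that starts with the given string.
# [OR] chains the conditions so that any single match triggers the rule.
RewriteCond %{HTTP_USER_AGENT} ^BadBotUserAgent1 [OR]
RewriteCond %{HTTP_USER_AGENT} ^BadBotUserAgent2 [OR]
RewriteCond %{HTTP_USER_AGENT} ^BadBotUserAgent3
# Return 403 Forbidden for every matching request
RewriteRule .* - [F,L]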

In this example, you’d replace “BadBotUserAgent1” and so on with the user agents of the bots that you’re trying to block. Note the space before each [OR] flag: without it, Apache treats the brackets as part of the pattern instead of as a flag. The final line in this snippet tells the server to return a 403 (Forbidden) error to any user agent that matches what you’ve put in the file. You can find a fairly good list of bad user agents to start with here.
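To make the matching behavior concrete, here is a minimal Python sketch that mimics what the three RewriteCond lines do. It is only an illustration: the function name is_blocked and the list BLOCKED_PATTERNS are hypothetical, and Python’s re module stands in for Apache’s regex engine, which treats these simple patterns the same way.

```python
import re

# Hypothetical stand-ins for the patterns in the RewriteCond lines.
BLOCKED_PATTERNS = [
    r"^BadBotUserAgent1",
    r"^BadBotUserAgent2",
    r"^BadBotUserAgent3",
]


def is_blocked(user_agent: str) -> bool:
    """Return True if any pattern matches, like conditions chained with [OR]."""
    return any(re.search(pattern, user_agent) for pattern in BLOCKED_PATTERNS)


if __name__ == "__main__":
    # Blocked: the user agent starts with one of the listed strings.
    print(is_blocked("BadBotUserAgent1/2.0"))
    # Not blocked: "^" anchors the pattern to the start of the string.
    print(is_blocked("Mozilla/5.0 (compatible; BadBotUserAgent1/2.0)"))
    # Not blocked: the comparison is case-sensitive.
    print(is_blocked("badbotuseragent1/2.0"))
```

If the bot’s name appears in the middle of its user agent string, drop the leading ^ from the pattern in your .htaccess file. Adding the NC flag to a RewriteCond (for example [NC,OR]) makes the match case-insensitive.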

You can also check your server logs for any odd user agents that appear to be eating up a ton of bandwidth on your server or crawling a large number of pages. Just be sure to double-check things and make sure that you aren’t blocking good bots like popular search engines. This can be hard sometimes because bad bots may imitate good search engine bots by spoofing their user agent names.
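As a rough sketch of that log check, the Python snippet below counts requests and bytes served per user agent. It assumes your server writes the common “combined” access log format, which may not match your host’s configuration; the function name summarize_user_agents and the sample lines are made up for illustration, and the simple regex does not handle escaped quotes inside a request line.

```python
import re
from collections import Counter

# Assumed "combined" log format:
# IP - - [time] "request" status bytes "referer" "user agent"
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} (?P<size>\d+|-) "[^"]*" "(?P<agent>[^"]*)"'
)


def summarize_user_agents(lines):
    """Return (requests per agent, bytes per agent) as two Counters."""
    requests = Counter()
    bandwidth = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        agent = match.group("agent")
        size = match.group("size")
        requests[agent] += 1
        bandwidth[agent] += 0 if size == "-" else int(size)
    return requests, bandwidth


# Made-up sample lines standing in for a real access log.
SAMPLE_LINES = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET / HTTP/1.1" 200 2326 "-" "ExampleBot/1.0"',
    '203.0.113.5 - - [10/Oct/2023:13:55:37 +0000] "GET /about HTTP/1.1" 200 1000 "-" "ExampleBot/1.0"',
    '198.51.100.7 - - [10/Oct/2023:13:55:38 +0000] "GET / HTTP/1.1" 304 - "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    requests, bandwidth = summarize_user_agents(SAMPLE_LINES)
    for agent, count in requests.most_common(10):
        print(f"{count:>6} requests  {bandwidth[agent]:>10} bytes  {agent}")
```

Because the function only iterates over lines, you can pass it an open log file instead of the sample list. The user agents at the top of the output are the ones worth a closer look before you decide to block anything.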

Blocking Bad Bots by IP

If you know the IP address of the bot that you want to block, you can put a list of IP addresses at the end of your .htaccess file. That code would look like this:

#Block Bad Bots by IP Address
Deny from 123.123.123.123
Deny from 124.124.124.124

In that example, you would replace the example IPs with the IPs of the bots that you want to block. You can also block entire ranges of IPs by using code like this:

#Block Bad Bots by IP Address
Deny from 123.123.123.0/24
Deny from 124.124.124.0/24

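A line such as Deny from 123.123.123.0/24 blocks 123.123.123.0 through 123.123.123.255, which is that entire IP range. This method is not perfect either, as it’s easy for bots to change IPs and still hit your site. Also note that Deny from is the older Apache 2.2 syntax; on Apache 2.4 it works only if the mod_access_compat module is loaded, and the Require directive is the current replacement. You can read a lot of information about IPs that may be malicious at Project Honeypot.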
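If the /24 notation is unfamiliar, this short Python sketch uses the standard ipaddress module to show which addresses the example range covers. The names BLOCKED_RANGE and is_denied are hypothetical and only mirror the idea of the Deny lines; this is not how Apache itself performs the check.

```python
import ipaddress

# The example range from the text, written in CIDR notation.
BLOCKED_RANGE = ipaddress.ip_network("123.123.123.0/24")


def is_denied(address: str, networks=(BLOCKED_RANGE,)) -> bool:
    """Return True if the address falls inside any blocked network."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in networks)


if __name__ == "__main__":
    # First and last address covered by the range.
    print(BLOCKED_RANGE[0], "-", BLOCKED_RANGE[-1])
    # Total number of addresses in the range.
    print(BLOCKED_RANGE.num_addresses)
    # An address inside the range, then one just outside it.
    print(is_denied("123.123.123.200"))
    print(is_denied("123.123.124.1"))
```

A /24 covers 256 addresses, so a bot that moves to a neighboring range is no longer matched. That is the limitation mentioned above.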

Bot Blocking Plugins and Programs


Another option is to use bot blocking programs or plugins (if you’re using WordPress). One that I have used and can recommend is Spyder Spanker. You can check out Spyder Spanker here (disclaimer: that’s an affiliate link and I may receive a commission if you purchase the product, but I do not recommend junk!).

I have used Spyder Spanker on a number of sites and I can honestly say it’s a product I can recommend. I would advise at least purchasing the pro level license or higher if you buy it because it allows you more granular control over the software.

The really nice thing about Spyder Spanker is that you can block bots by country. I have found that there are a few troublesome countries that seem to have a lot more bot traffic than others, so this is definitely a nice feature to have. Spyder Spanker is easy to use and I often put it on WordPress sites to help increase security because a lot of bot attacks seem to originate from the same countries.

What About Robots.txt?

Most malicious bots and even mischievous bots will simply ignore robots.txt. Oftentimes the only bots that respect it are the ones you want to crawl your site, like well-known search engines. It doesn’t hurt to use robots.txt in conjunction with a good .htaccess file and a bot blocking plugin, but in my opinion, robots.txt doesn’t often do much by itself to help keep out unwanted bots.
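To see why robots.txt only works on polite bots, here is a small Python sketch of the check that a well-behaved crawler runs before it requests a page, using the standard urllib.robotparser module. The robots.txt content, the bot names, and the example.com URLs are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one bot is asked to stay out entirely,
# and everyone else is asked to avoid the admin area.
ROBOTS_TXT = """\
User-agent: BadBotUserAgent1
Disallow: /

User-agent: *
Disallow: /wp-admin/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def polite_bot_may_fetch(user_agent: str, url: str) -> bool:
    """The question a well-behaved crawler asks before requesting a URL."""
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The bot that was asked to stay out of the whole site.
    print(polite_bot_may_fetch("BadBotUserAgent1", "https://example.com/"))
    # Any other bot requesting a public page.
    print(polite_bot_may_fetch("SomeSearchBot", "https://example.com/blog/"))
    # Any other bot requesting the admin area.
    print(polite_bot_may_fetch("SomeSearchBot", "https://example.com/wp-admin/options.php"))
```

Nothing forces a bot to run a check like this. The file is a request, not an enforcement mechanism, so a bad bot simply skips the check and fetches whatever it wants.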