Blocking Bots on Your WordPress Site With Htaccess & More


No matter what type of site you have (WordPress, static HTML, etc.), bots can be a big nuisance. They can eat up your bandwidth, compromise your security, and cause a variety of other problems. One of the biggest issues is that hackers frequently use bots to try to breach WordPress hosts, sites, databases, login portals, and all sorts of other targets. If you take the time to block a variety of bots from your site, you can limit some of the generic, large-scale attacks that are carried out with them.


Sometimes bots aren’t used for malicious purposes, but they can still cause trouble on your site. For example, bots like the Wayback Machine bot from Archive.org help preserve the web, which is a good thing, but some people still don’t want them crawling all over their site and eating up their bandwidth, especially if they’re on a shared hosting plan with limited resources or a small monthly bandwidth allowance.

Fortunately, there are a variety of ways to block bad or annoying bots. No matter what you do, some will still slip through; that’s just life in cyberspace. Hackers who create and use bots for malicious purposes are always finding ways around blocks. However, there are many things you can do to keep a wide range of bots from accessing your site and causing problems.

Using Htaccess to Block Bots

If you’re using an Apache server, you can use your .htaccess file to block a variety of bots in a few different ways.

The first thing that you can do is put a few lines of code in your .htaccess file that detect the user agent of the bot and then block access to the website. The code for that is as follows:

#Block Bad User Agents
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^BadBotUserAgent1 [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^BadBotUserAgent2 [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^BadBotUserAgent3 [NC]
RewriteRule .* - [F,L]

In this example, you’d replace the “BadBotUserAgent1” etc. with the user agent of the bot that you’re trying to block. The final line in this snippet of code tells the server to give a 403 error (forbidden) to any user agent that matches what you’ve put in the file. You can find a fairly good list of bad user agents to start with here.
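If you’d rather not use mod_rewrite, here’s an alternative sketch that uses mod_setenvif to flag matching user agents instead. The bot names are placeholders, and this assumes mod_setenvif is enabled on your server:

```apacheconf
# Flag requests whose user agent contains a blacklisted string
# (case-insensitive substring match, unlike an anchored regex).
BrowserMatchNoCase "BadBotUserAgent1" bad_bot
BrowserMatchNoCase "BadBotUserAgent2" bad_bot

# Deny any request that carries the bad_bot flag (Apache 2.2-style syntax).
Order Allow,Deny
Allow from all
Deny from env=bad_bot
```

This approach can be easier to maintain when you have a long list of bot names, since each one is a single line with no regex escaping to worry about.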

You can also check your server logs for any odd user agents that appear to be eating up a ton of bandwidth or crawling a large number of pages. Just be sure to double-check that you aren’t blocking good bots like popular search engines. This can be tricky, because bad bots sometimes imitate good search engine bots by spoofing their user agent strings.
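To get a quick picture of which user agents are hitting your site hardest, you can tally them straight from the access log with standard command-line tools. This is a sketch that assumes the Apache combined log format; the sample file below stands in for your real log (often at /var/log/apache2/access.log):

```shell
# Build a small stand-in for a real Apache combined-format access log.
cat > /tmp/sample_access.log <<'EOF'
1.2.3.4 - - [01/Jan/2024:00:00:00 +0000] "GET / HTTP/1.1" 200 123 "-" "SomeBot/1.0"
1.2.3.4 - - [01/Jan/2024:00:00:01 +0000] "GET /page HTTP/1.1" 200 456 "-" "SomeBot/1.0"
9.9.9.9 - - [01/Jan/2024:00:00:02 +0000] "GET / HTTP/1.1" 200 123 "-" "Mozilla/5.0 (Windows NT 10.0)"
EOF

# In the combined format, splitting each line on double quotes puts the
# user-agent string in field 6; count the hits per agent and rank them.
awk -F'"' '{print $6}' /tmp/sample_access.log | sort | uniq -c | sort -rn
```

Against a real log, just point the awk command at your actual access log path. User agents with suspiciously high request counts are good candidates for the .htaccess rules above.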

Blocking Bad Bots by IP

If you know the IP address of the bot that you want to block, you can put a list of IP addresses at the end of your .htaccess file. That code would look like this:

#Block Bad Bots by IP Address
Deny from 123.123.123.123
Deny from 124.124.124.124

In that example, you would replace the example IPs with the IPs of the bots that you want to block. You can also block entire ranges of IPs by using code like this:

#Block Bad Bots by IP Address
Deny from 123.123.123.0/24
Deny from 124.124.124.0/24

The first line would block 123.123.123.0 – 123.123.123.255, which covers that entire /24 range. This method isn’t perfect either, since it’s easy for bots to switch IPs and still hit your site. You can find a lot of information about potentially malicious IPs at Project Honeypot.
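Note that `Deny from` is the older Apache 2.2-style syntax (it still works on 2.4 via mod_access_compat). If your host runs Apache 2.4+, a sketch of the equivalent rules using the newer `Require` directives from mod_authz_core looks like this:

```apacheconf
# Apache 2.4+ (mod_authz_core): allow everyone except the listed addresses.
<RequireAll>
    Require all granted
    Require not ip 123.123.123.123
    Require not ip 123.123.123.0/24
</RequireAll>
```

If you’re not sure which syntax your server expects, check which Apache version your host runs before mixing the two styles.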

Bot Blocking Plugins and Programs


Update for 2020: We haven’t used Spyder Spanker in a while, and we’ve seen the plugin cause problems with a variety of web hosts and fail to update properly, so we can no longer recommend it. They have changed their product line; the “classic” version appears to be no longer available, and the subscription pricing is different (it may be better or worse than before).

We don’t recommend products we don’t use, so we’ve unfortunately pulled the recommendation here. Spyder Spanker used to be a great product that let you easily block bots by country (a lot of bot attacks seem to originate from the same countries), but we’re unsure of the current status, utility, and value of the plugin. We may update this in the future if we get a chance to evaluate the current version of their software, but for now, it’s not recommended.

What About Robots.txt?

Most malicious and even mischievous bots will simply ignore robots.txt. Oftentimes the only bots that respect it are the ones you want crawling your site, like well-known search engines. It doesn’t hurt to use robots.txt in conjunction with a good .htaccess file and a bot-blocking plugin, but in our opinion, robots.txt by itself does little to keep out unwanted bots.
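For completeness, here’s what a robots.txt entry looks like for a well-behaved bot you’d rather keep out, such as the Archive.org crawler mentioned earlier. This sketch assumes the crawler identifies itself as ia_archiver, the user agent the Internet Archive has historically honored; again, only polite bots will obey it:

```
# robots.txt -- placed in your site's document root.
# Ask the Internet Archive's crawler to stay out of the whole site.
User-agent: ia_archiver
Disallow: /

# Everyone else may crawl normally.
User-agent: *
Disallow:
```

Think of robots.txt as a polite request and your .htaccess rules as the actual enforcement.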