To crawl delay, or not to crawl delay?

Keeping bots from consuming an excessive amount of hosting resources has been an ongoing task for us. After the successful launch of the new Anti-bot AI system, which has already blocked more than 1 billion hits from malicious bots alone, we’d like to shed some more light on another measure in that area – the crawl-delay setting. Read on to find out what it is, why you might want to consider it, and why we no longer apply a default crawl-delay setting on our servers.

What is crawl-rate and crawl-delay?

The crawl-rate is the time frame between the separate requests a bot makes to your website – essentially, how fast a bot crawls your site. A crawl-delay setting tells bots that choose to follow it (like Yahoo!, Bing, Yandex, etc.) to wait a certain amount of time between single requests.

Why use a crawl-delay setting?

If you have a lot of pages and many of them are linked from your index, a bot that starts crawling your site may generate too many requests in a very short period of time. Such a traffic peak can deplete the hosting resources that are monitored on an hourly basis. So, if you encounter this kind of problem, it’s a good idea to set the crawl-delay to 1-2 seconds so the bots crawl your website at a more moderate pace without causing load peaks.
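
For reference, the crawl-delay directive lives in your site's robots.txt file. Here is a minimal sketch using the 1-2 second value suggested above (the value is illustrative, and each bot interprets the delay slightly differently):

```
# robots.txt – ask all compliant bots to wait about 2 seconds between requests
User-agent: *
Crawl-delay: 2
```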

It’s important to note that the Google bot does not take the crawl-delay setting into consideration. That is why you shouldn’t worry that such a directive could influence your Google standings – you can safely use it if there are other aggressive bots you want to slow down. It is highly unlikely that you will experience issues due to Google bot crawling, but if you want to lower its crawl-rate, you can do that only from the Google Search Console (formerly Google Webmaster Tools).

No default crawl-delay on SiteGround servers

Until recently, there was a default crawl-delay setting applied universally on the SiteGround shared servers. It could be overridden by any user who set a different custom value in their robots.txt file. We used this directive to prevent our customers from losing server resources to bots. However, modern search engine bots are sophisticated enough to crawl without causing issues, and bad bots are blocked by our AI system, so there was simply no point in keeping that setting. So we removed it.

Hristo Pandjarov

WordPress Initiatives Manager

Enthusiastic about all Open Source applications you can think of, but mostly about WordPress. Add a pinch of love for web design, new technologies, search engine optimisation and you are pretty much there!

Comments (18)

Sharron

Jun 14, 2017

If you have any worries about bots getting jiggy with crawling your site, the Wordfence plugin can be set to cause a crawl delay – it will temporarily block a bot that makes too many crawl requests in a set period. (This applies to both free and paid-for Wordfence.) I'm not affiliated with Wordfence, but I thought I'd mention it.

Hristo Pandjarov SiteGround Team

Jun 15, 2017

That can potentially be dangerous for Google crawling your site and may result in error reports if set too high. Just a word of caution: when you limit bots, make sure you don't overdo it.

Jim

Feb 27, 2020

Verified Google crawlers can be set to have unlimited access to your site with the Wordfence plugin, even if you have strict rate-limiting rules for the others. However, I still find a good, quality bad-bot-blocking plugin a necessity in addition. You really need both.

Lisa

Jun 15, 2017

I think it's safe to say that the AI Bot system you recently launched doesn't kill all malicious crawlers, judging by the attack that hit my site yesterday. I had to go through and block the malicious IP addresses. How do we set a crawl delay if we want to do so, to catch the malicious bots that get through your system?

Hristo Pandjarov SiteGround Team

Jun 16, 2017

Unfortunately, no system can catch 100% of the malicious traffic. However, we've already blocked billions of hits towards our servers with it, and we constantly improve its performance. Setting a crawl-delay will not help you against bad bots because they will simply ignore it. I would recommend opening a ticket with info about the traffic you think is malicious so we can check it out and make sure our system starts detecting it better, or giving a firewall plugin a try. The CloudFlare firewall in the Plus version is a great option to check out too.

Craig Daniels

Jun 19, 2017

Always glad you're improving things against bad actors. But yesterday my site got bashed and then shut down for too much CPU usage. All the traffic came from 2 IP addresses that mostly hit 3 pages, and yet they were not stopped, and I had to watch as my sites were shut down for hours. I blocked the IP addresses, but it was too late. Can't you create a setting where we limit the number of hits from all IP addresses to, say, 200 a day as a way of stopping thousands of hits from the same IP?

Hristo Pandjarov SiteGround Team

Jun 20, 2017

Unfortunately, that wouldn't be a good idea on a larger scale because it would affect a lot of customers in a negative way. However, I will look into that case and see if a possible update of our Anti-Bot system rules could be applied.

Joby Cefalu

Sep 09, 2019

Why would my SEO manager put a crawl-delay of 20 on my index?

Angelina Micheva

Sep 11, 2019

Hi Joby, the only reason for that would be if bots crawl your index page too aggressively and cause spikes in your server load. However, it would be best to check directly with your SEO manager what the reasoning is for setting the crawl delay like this.

stu rohrer

Jun 25, 2017

Thanks for this, but could you explain where to go to implement the crawl delay on my domain/server?

Hristo Pandjarov SiteGround Team

Jun 26, 2017

You can add a crawl delay for specific bots by following the instructions in this article: https://www.siteground.com/kb/how_to_decrease_the_crawl_rate_of_bings_search_engine_bot/
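
The general pattern in robots.txt is roughly this (using Bing's crawler as an example; the value is illustrative):

```
# Slow down Bing's crawler only; bots not listed here are unaffected
User-agent: bingbot
Crawl-delay: 10
```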

Brian Prows

Jul 05, 2017

To what extent do Pingdom and WordPress' Jetpack slow down server response times? I have both set to monitor for downtime. (I'm using the free CloudFlare service.) Pingdom reports an average server response time of around 680 ms. Google keeps reminding me that server response time shouldn't exceed 200 ms.

Hristo Pandjarov SiteGround Team

Jul 06, 2017

JetPack has tons of functionality, and there isn't a straightforward answer to that question. In most cases it does not actually slow your site down :) I would recommend that you post a ticket in your Help Desk so we can take a look and tell you more.

mohamed

Dec 03, 2018

My site is in two languages, and I can't find anyone explaining how to make the robots.txt file for a multilingual site, so can you help me?

Hristo Pandjarov SiteGround Team

Dec 04, 2018

I would look into this article: https://moz.com/community/q/adding-multi-language-sitemaps-to-robots-txt

John Eric

Mar 05, 2020

Hi, my website is properly set up in Google Search Console, but I can't see my backlinks in Search Console.

Hristo Pandjarov SiteGround Team

Mar 06, 2020

I'd recommend checking the Google forums for assistance; I am sure they will be able to tell you more on that subject :)

craig

May 13, 2020

Is there a way to set up a crawl delay for a specific bot on all sites on my server at once? What are the crawl-delay parameters?
