Googlebot controversy

Started by junkyard, October 31, 2013, 05:10:42 PM


junkyard

Googlebot was detected creating an excessively high load on the site.
It appears to crawl several thousand image files and hundreds of the cart's PHP pages many times a day:

http://www.google.com/bot.html
"For most sites, Googlebot shouldn't access your site more than once every few seconds on average. However, due to
network delays, it's possible that the rate will appear to be slightly higher over short periods. In general, Googlebot
should download only one copy of each page at a time. If you see that Googlebot is downloading a page multiple times,
it's probably because the crawler was stopped and restarted."

We think we have to block it from crawling certain files/pages using the robots.txt approach:
https://support.google.com/webmasters/answer/93708
https://support.google.com/webmasters/answer/156449

Is there any recommendation as to which of the cart's directories could be blocked from crawling without
harming the products' visibility in Google search?   Thank you

abantecart

You do not need to expose much of the cart for search engines to crawl.

Only index.php in the main web directory, plus the image and resources directories, should be open to search engines.
In some cases you might also want to open the extensions directory, if an extension serves its own web resources.
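
A minimal robots.txt along those lines might look like the sketch below. The directory names are assumptions based on a typical AbanteCart installation; check them against your actual web root before using this, and remember robots.txt disallows crawling, not indexing.

```
# Sketch only - directory names assume a typical AbanteCart layout; verify against your install
User-agent: *
# Keep the storefront, images and resources crawlable
Allow: /index.php
Allow: /image/
Allow: /resources/
# Block everything else by default
Disallow: /
```

Note that `Allow` directives are honored by Googlebot but are not part of the original robots.txt convention, so other crawlers may interpret this file differently. You can verify the result with Google's robots.txt testing tool in Search Console before deploying.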