Issue status update for http://drupal.org/node/14177

 Project:      Drupal
 Version:      cvs
 Component:    other
 Category:     feature requests
 Priority:     minor
-Assigned to:  Anonymous
+Assigned to:  bertboerland@www.drop.org
 Reported by:  bertboerland@www.drop.org
 Updated by:   bertboerland@www.drop.org
-Status:       active
+Status:       patch

It seems there is no robots.txt in cvs anymore? The old one was
something like this (crawl delay added):

# small robots.txt
# more information about this file can be found at
# http://www.robotstxt.org/wc/robots.html
# in case your drupal site is in a directory
# lower than your docroot (e.g. /drupal)
# please add this before the /-es below
# to stop a polite robot indexing an exampledir
# add a line like
# user-agent: polite-bot
# Disallow: /exampledir/
# a list of known bots can be found at
# http://www.robotstxt.org/wc/active/html/index.html
# see http://www.sxw.org.uk/computing/robots/check.html
# for syntax checking
User-agent: *
Crawl-Delay: 10
Disallow: /?q=admin
Disallow: /admin/
Disallow: /cron.php
Disallow: /xmlrpc.php
Disallow: /database/
Disallow: /includes/
Disallow: /modules/
Disallow: /scripts/
Disallow: /themes/
Disallow: */add/

bertboerland@www.drop.org

Previous comments:
------------------------------------------------------------------------

December 10, 2004 - 11:29 : bertboerland@www.drop.org

Though it is not "a standard" within the "non-standard" robots.txt,
many bots obey the "Crawl-delay:" parameter. Since Drupal sites seem to
be popular with search engines, and lots of people have more aggressive
bots than visitors at their site, it might be wise to slow down the
robots by adding robots.txt lines like:

User-Agent: *
Crawl-Delay: 10

(time in seconds between page requests)

Slurp (yahoo/AV) and MSFT bots obey this parameter; Googlebot does not
yet, but most likely will in 2.1+.

Does it make sense to ship Drupal with a default robots.txt with this
parameter?
If so, then there should be something in the documentation about moving
it to the docroot in case Drupal is installed in a subdirectory.
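
To see how a polite bot might read such a file, here is a minimal sketch
using Python's standard urllib.robotparser module. It is only an
illustration, not part of the patch: the shortened rule set, the helper
name parse_robots, and the "polite-bot" user agent are assumptions made
for the example. The "*/add/" and "/?q=admin" lines are left out on
purpose, since this parser does plain prefix matching and may not treat
them the way a real crawler would.

```python
from urllib.robotparser import RobotFileParser

# Shortened, hypothetical version of the proposed robots.txt.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /admin/
Disallow: /cron.php
Disallow: /xmlrpc.php
"""


def parse_robots(text):
    """Parse robots.txt content from a string instead of fetching a URL."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = parse_robots(ROBOTS_TXT)

# Seconds a bot matching "User-agent: *" is asked to wait between requests.
delay = parser.crawl_delay("polite-bot")

# Paths under a Disallow prefix are refused; other paths are allowed.
allowed_admin = parser.can_fetch("polite-bot", "/admin/settings")
allowed_node = parser.can_fetch("polite-bot", "/node/14177")

print(delay, allowed_admin, allowed_node)
```

Whether a given crawler honours the delay is up to that crawler; the
parser only reports what the file asks for.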