The robots.txt file is a simple plain-text file (not HTML) placed in your website’s root directory to tell search engine crawlers which pages to crawl and which to skip.
Many webmasters use this file to help search engines index the content of their websites.
If webmasters tell the search engine spiders to skip pages they do not consider important enough to be crawled (e.g., printable versions of pages, .pdf files, etc.), their most valuable pages have a better chance of being featured in the search engine results pages. The robots.txt file is a simple way of steering the spiders toward the most relevant content.
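As an illustration, a robots.txt along these lines would ask crawlers to skip a printable-pages directory and PDF files (the /print/ path here is hypothetical, and the wildcard pattern is an extension honored by major engines such as Google and Bing rather than part of the original robots.txt standard):

```
# Applies to all crawlers
User-agent: *
# Skip the hypothetical printable-versions directory
Disallow: /print/
# Skip PDF files (wildcard syntax; supported by major engines, not universal)
Disallow: /*.pdf$
```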
That being said, I have seen many occasions where the robots.txt has not been used well. Webmasters are prone to mistakes when setting up the file, and the repercussions can be severe. For instance, there is a simple instruction that blocks all search engine spiders from crawling the entire site:
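That instruction is just two lines of plain text, and it shuts the door on every compliant crawler:

```
# Block all compliant crawlers from the entire site
User-agent: *
Disallow: /
```

The value after Disallow is a path prefix, so the lone forward slash matches every URL on the site.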
Without the forward slash in that instruction, search engines are granted access to the entire site; with it, this single character can prevent a website from showing up in the search engines at all. There are legitimate reasons to block a site intentionally (the website may still be relatively new, with pages yet to be tweaked for keywords, etc.), but more often than not it is a mistake, usually discovered only after the site has been missing from the search engine indexes for months.
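For comparison, leaving the Disallow value empty (no forward slash) blocks nothing and permits crawling of the whole site:

```
# An empty Disallow value allows all crawlers full access
User-agent: *
Disallow:
```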
Errors aside, another benefit of having a robots.txt file is that you can specify the location of your XML sitemap (used by Google, Yahoo, and others) with this simple instruction:
Sitemap: http://www.client.com/sitemap.xml (this assumes the XML sitemap is located at the root of the domain; note that the Sitemap directive requires the full URL, not a relative path).
This also makes it easier for the search engines to discover and crawl your pages. Of course, this is only a small aspect of the search engine optimization process, but used correctly, a robots.txt file can be a significant benefit.
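Putting the pieces together, a complete robots.txt combining these ideas might look like the following sketch (the /print/ path is hypothetical, and the sitemap URL is the one from the example above):

```
# Applies to all crawlers
User-agent: *
# Skip low-value duplicate content (hypothetical path)
Disallow: /print/

# Point crawlers at the XML sitemap (must be a full URL)
Sitemap: http://www.client.com/sitemap.xml
```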