As we (or most of us) know, we do not want the search engine spiders crawling around and indexing certain parts of our website. For instance, it is probably not intelligent to have spiders crawling the secure port of the server, i.e. secure parts of our websites (https://).
We also don’t want to have the Googlebot or the Bingbot crawling and indexing pages that would be a poor choice as a target for search, like PDF files, Excel documents of images. This is where the robots.txt file comes in.
The robots.txt file, uploaded to the root directory of the site (www.example.com/robots.txt), tells the spiders which directories and pages to skip. Why do we want this? If someone were to find a PDF file or a Flash file in a search result and were to click on it, those types of documents generally don’t contain links leading back to the rest of the site and can be a “dead-end” for both the the search engines and for the user. A simple “Disallow” instruction in the robots.txt file will prevent non-SEO friendly pages from possibly showing up in search results for your desired keyphrases: