To PDF or not to PDF? That is the question

MoreVisibility - February 8, 2007

When developing a website, it is critical to have unique, relevant copy that informs potential customers about your product or service. Good copy also helps the search engines better understand your subject matter. One of the more common ways I have seen business owners display their content is through Adobe’s PDF format. A PDF lets the user download the information in a clear, structured layout. While this can be a great way for your visitors to find and read the copy you develop, you could be inadvertently causing yourself harm.

For example, let’s say you have five PDF files which house the majority of your content. The search engines will most likely index the PDF files much like normal pages on your website. This means that when a searcher looks up a key phrase you’re targeting, any of the five PDF files could surface within the natural results. You’re probably asking yourself: how could it possibly be a bad thing to have my PDF files indexed and displayed within the search result pages? While the search engines crawl the copy of the PDF file and index its content, critical functionality such as the primary navigation is absent. Thus the PDF acts as a dead end for search engine spiders. The same can be said for searchers who land on the PDF instead of the actual website. If a searcher clicks a natural listing which happens to be one of your PDF files, they lose the ability to navigate to other areas of the website. This could ultimately result in a lost sale as well as a diminished branding experience.

So, to answer the question: should I use PDF files to publish content on my website? The answer is no, not as the primary home for that content. Instead, offer the PDF as a downloadable or printer-friendly version, and publish the copy itself as an actual page of your website. This allows both the search engines and your users to view the copy on the page and then navigate to other sections of your website.

A potential issue at this point is that the HTML version and the PDF file contain the same information and both are indexable. This could pose a problem, as one of the two is likely to be filtered out as duplicate content. The way to ensure the newly developed page, rather than the PDF, is the version the search engines pick up is to use the robots.txt file to limit access to the PDF file. The file must be named robots.txt, in lowercase, and sit in the root of your website. A Disallow rule tells compliant search engine spiders not to crawl any file whose path begins with the folder you identify, so you will have to ensure all the PDF files are within their own folder. Let’s say you set up a folder on the root called PDF. This is the folder we are going to block the search engines from crawling. Keep in mind that the path in the rule is case-sensitive, so it must match the folder name exactly. Your robots.txt file would then look like the following:

User-agent: *
Disallow: /PDF/
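
If you want to sanity-check a rule like this before uploading it, a short script can help. The following is only a minimal sketch using the robots.txt parser in Python's standard library, which approximates how a compliant spider reads the file and is not a guarantee of how any particular search engine behaves. The file names (brochure.pdf, services.html) and the is_crawlable helper are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The same two rules shown above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /PDF/
"""


def is_crawlable(path, robots_txt=ROBOTS_TXT, user_agent="ExampleBot"):
    """Return True if the rules allow user_agent to fetch path.

    "ExampleBot" is a hypothetical spider name; it matches the
    "User-agent: *" group because no more specific group exists.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # Hypothetical paths: a PDF inside the blocked folder, the HTML
    # page that replaces it, and a folder with different casing.
    for path in ("/PDF/brochure.pdf", "/services.html", "/pdf/brochure.pdf"):
        print(path, "->", "allowed" if is_crawlable(path) else "blocked")
```

In this sketch only the first path is reported as blocked. The lowercase /pdf/ folder is still allowed, which is why the folder name in the rule has to match your real folder name exactly.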

In the end, the choice of how to display the content on your website is yours. Just remember that the decision you make could have a positive or negative impact on your search visibility and your visitors' experience. Use the bulleted lists below to better understand the pros and cons of using Adobe’s PDF format as the main source for your content.

To PDF
– Preformatted content
– Easy to read
– Printer friendly
– Downloadable format

Not to PDF
– Dead end for search engines
– Possible dead end for searchers
– Brand recognition loss
– Crucial functionality (such as navigation) not present
– Potential traffic and sales loss
