In our previous two blog posts on canonicalization we covered the definition of the term, the most common webmaster concern of www versus non-www versioning of a domain, and the canonical tag.
This blog post digs a little deeper into canonicalization for website owners who link to content files, such as PDFs, from their website. Telling search engines which version of a content file is canonical is just as important as telling them which version of a page on your website is canonical.
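The canonical signal is not limited to HTML documents: it can also indicate the preferred, or canonical, URL for other content file types. Because a file such as a PDF has no HTML head section to hold a tag, the signal is sent in an HTTP header instead.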
For example, if your website offers a number of PDF documents for download, such as a series of how-to guides, and also has matching pages containing the same how-to information, each guide ends up with two URLs:
http://www.companyname.com/PDFs/how-to-make-pie.pdf
http://www.companyname.com/how-to-make-a-pie.html
You have the option of sending rel="canonical" in the HTTP response header that is returned when the PDF file is requested, which indicates to Google and other search engines that the HTML version is preferred.
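As a rough illustration of what that header looks like, here is a minimal Python sketch that builds the Link header for the pie example. The function name canonical_link_header is our own invention for this post, not part of any library; only the header format itself (the URL in angle brackets followed by rel="canonical") is the standard part.

```python
from typing import Tuple


def canonical_link_header(canonical_url: str) -> Tuple[str, str]:
    """Build the (name, value) pair for a rel="canonical" HTTP Link header.

    Hypothetical helper for illustration only.
    """
    # The preferred URL goes inside angle brackets, followed by the relation.
    return ("Link", f'<{canonical_url}>; rel="canonical"')


# Header to send with the response for /PDFs/how-to-make-pie.pdf
name, value = canonical_link_header(
    "http://www.companyname.com/how-to-make-a-pie.html"
)
print(f"{name}: {value}")
# prints: Link: <http://www.companyname.com/how-to-make-a-pie.html>; rel="canonical"
```

The header travels with the PDF response, while the URL inside it points at the HTML page you want search engines to prefer.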
When you mark the HTML page as the canonical version of the PDF, visitors who find the information in search engine results are taken to your website, tracked by analytics (assuming you are using an analytics platform), and shown all the navigation and on-site linking you have set up. If visitors clicked through to the PDF from search engine results instead, it could be a dead end, because the PDF lacks links, navigation, and tracking.
Note that this technique requires more than access to your website's source code: you also need the ability to configure the server hosting your website so that it sends the canonical link in the HTTP headers for files such as PDFs. The same method can also identify the canonical URL for a PDF when one PDF file is located in multiple directories.
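On most sites this is handled in the web server's own configuration, and the details vary by server. Purely as a sketch of the idea, the Python example below wraps a WSGI application so that responses for listed PDF paths carry the canonical Link header. The names add_canonical_headers and CANONICAL_FOR_PDF, and the price-list paths, are hypothetical and used only for illustration; your server setup will likely look different.

```python
# Hypothetical mapping: requested PDF path -> canonical URL to advertise.
CANONICAL_FOR_PDF = {
    # A PDF whose matching HTML page is preferred.
    "/PDFs/how-to-make-pie.pdf":
        "http://www.companyname.com/how-to-make-a-pie.html",
    # A duplicate copy of a PDF in a second directory (made-up paths),
    # pointing at the one PDF URL that should be treated as canonical.
    "/archive/price-list.pdf":
        "http://www.companyname.com/PDFs/price-list.pdf",
}


def add_canonical_headers(app, canonical_map):
    """Wrap a WSGI app so mapped paths get a rel="canonical" Link header."""

    def wrapped(environ, start_response):
        canonical = canonical_map.get(environ.get("PATH_INFO", ""))

        def patched_start_response(status, headers, exc_info=None):
            if canonical is not None:
                # Append the Link header without mutating the app's list.
                headers = list(headers) + [
                    ("Link", f'<{canonical}>; rel="canonical"')
                ]
            return start_response(status, headers, exc_info)

        return app(environ, patched_start_response)

    return wrapped
```

Requests for paths that are not in the mapping pass through unchanged, so only the files you list explicitly receive the header.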
Please don’t hesitate to contact MoreVisibility if you need help with optimizing your website or assigning the canonical version of your website pages or documents.