In our previous two blog posts on canonicalization we covered the definition of the term, the most common webmaster concern of www versus non-www versioning of a domain, and the canonical tag.
This blog post will dig a little deeper into canonicalization for website owners who link to content files, such as PDFs, from their website. Indicating the preferred (canonical) version of a content file to search engines is just as important as indicating the canonical version of a page on your website.
The canonical tag can be used to signal the preferred or canonical URL for various content file types as well as HTML documents.
For example, if your website offers a number of PDF documents for download, such as a series of how-to guides, and also has matching pages containing the same how-to information, there end up being two URLs for each how-to guide: one for the PDF file and one for the matching HTML page.
You have the option of using rel="canonical" in the HTTP header that your server returns when the PDF file is requested, to indicate to Google and other search engines that the HTML version is preferred.
When the HTML page is marked as the canonical version of the PDF, visitors who find the information in search engine results are taken to your website, tracked by analytics (assuming you are using an analytics platform), and see all the navigation and onsite linking you have set up. If visitors clicked through to the PDF from search engine results instead, it could be a dead end because of the lack of links, navigation, and tracking.
It should be noted that this technique requires not only access to the source code, but also the ability to configure the server hosting your website to send the canonical link in HTTP headers for files such as PDFs. This method can also be used to identify the canonical URL for a PDF when the same PDF file is located in multiple directories.
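The header described above is an HTTP Link header whose value wraps the preferred URL in angle brackets and adds rel="canonical". The following is a minimal Python sketch of how that header value could be built. The function name canonical_link_header and the example.com URL are hypothetical, and attaching the header to PDF responses depends on your own server or framework.

```python
def canonical_link_header(canonical_url: str) -> tuple:
    """Return the (name, value) pair for an HTTP Link header that
    marks canonical_url as the canonical version of a resource."""
    # Reject characters that would break the header syntax.
    if any(ch in canonical_url for ch in '<>"\r\n '):
        raise ValueError("URL must not contain spaces, quotes, angle brackets, or newlines")
    return ("Link", f'<{canonical_url}>; rel="canonical"')


# Hypothetical HTML page that matches a how-to PDF.
name, value = canonical_link_header("https://www.example.com/how-to/guide.html")
print(f"{name}: {value}")
```

The server would send this header only with the response for the PDF file, and the URL inside the angle brackets would point to the matching HTML page.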
Please don’t hesitate to contact MoreVisibility if you need help with optimizing your website or assigning the canonical version of your website pages or documents.
In our previous blog post on canonicalization we gave a brief definition and covered the most common webmaster concern of www versus non-www versioning of a domain.
This blog post will cover the canonical tag, which was released in February 2009 to help website owners and their webmasters influence how search engines sort their content, in the hope of keeping duplicate content out of each search engine’s index.
First, what is the canonical tag and what does it look like? The canonical tag is marked by rel="canonical" and is placed in the <head> section of a webpage’s source code.
Here is an example:
<link rel="canonical" href="https://www.morevisibility.com/blogs.php" />
This tag notes that the URL listed in the href attribute is the canonical version, the one preferred by you, the webmaster.
It should be noted that rel="canonical" is not the same as a 301 redirect. A 301 redirect automatically redirects both search engine bots and human visitors to your preferred version of the page. The canonical tag is read by search engine bots only; visitors stay on the page they requested. A 301 redirect is also considered a stronger signal that there is a canonical source. Google has stated that it treats the rel="canonical" link element as a hint or suggestion and not as an absolute directive, so the tag may be overruled if the search engines’ analysis or algorithms find that a different page is a better fit.
The canonical tag also has cross-domain functionality; it can be used in cases where it may not be easy to set up redirects, such as when you are migrating from one domain to another and are unable to create server-side redirects.
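To check which canonical URL a page actually declares, the link element can be read from the page source. The following is a minimal sketch using Python's standard html.parser module. The names CanonicalFinder and find_canonical are hypothetical, and the sketch only reports the first rel="canonical" link it finds.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> element."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Only <link> elements matter, and only the first match is kept.
        if tag != "link" or self.canonical is not None:
            return
        attr_map = dict(attrs)
        # rel may hold several space-separated values.
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr_map.get("href"):
            self.canonical = attr_map["href"].strip()


def find_canonical(html: str):
    """Return the canonical URL declared in html, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


page = '<html><head><link rel="canonical" href="https://www.morevisibility.com/blogs.php" /></head></html>'
print(find_canonical(page))
```

Note that the attribute values need plain straight quotes. Typographic quotes pasted from a word processor are not attribute delimiters in HTML, so a tag written with them would not be recognized as a canonical link.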
These two techniques can help webmasters convey to the search engines which version of their webpages to include in (or exclude from, if needed) their indices and how they prefer link metrics (such as link juice) to be assigned. Contact MoreVisibility if you need help with the canonicalization of your website.
Canonicalization is often misunderstood by website owners who are frustrated with different versions, or unintended duplicated pages, appearing in search engine results pages (SERPs). The concept behind canonicalization is that, when duplicate pages are created, there is one preferred version indicated, called the canonical version. If no canonical version is indicated, the search engines may choose one or more versions to show in their SERPs. Canonicalization problems are common and every website owner should make sure to check on the status of their website. We featured canonicalization as one of our Top 5 Most Common SEO Mistakes in our August 2011 Newsletter.
The most common concern with canonicalization is the www versus non-www version of a domain in a URL. In this situation, http://www.companyname.com and http://companyname.com are seen as separate pages. If each page serves a 200 OK, search engine spiders can come across each version and see them as standalone pages rather than the same page, and possibly consider one of them duplicate content. To keep this from happening, we suggest that you select which version of your domain you would like to be the canonical version and set the other version to redirect to it. Let’s assume that we want http://www.companyname.com to be the canonical version. Redirects from every non-www page to the matching www page would be needed to indicate to the search engines that the www pages are the canonical version. We also advise against any duplication of your website, whether on the same domain or on other domains that you may own.
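The redirect rule above can be sketched in Python as a small function that works out where a non-www request should be sent. The function name www_redirect_target is hypothetical, and the sketch assumes the site is served only on the bare domain and its www counterpart (no other subdomains). Issuing the redirect itself is left to your server.

```python
from urllib.parse import urlsplit, urlunsplit


def www_redirect_target(url: str):
    """Return the www URL that url should redirect to,
    or None if url is already on the www host."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if not host or host.startswith("www."):
        return None
    # Keep any explicit port, and carry the path and query over unchanged
    # so every non-www page maps to its matching www page.
    netloc = "www." + host
    if parts.port is not None:
        netloc += f":{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


print(www_redirect_target("http://companyname.com/about?ref=home"))
print(www_redirect_target("http://www.companyname.com/about"))
```

A server would answer any request for which this function returns a URL with a redirect to that URL, and serve the page normally when it returns None.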
In the next blog in this series, we will cover the canonical tag.