Organizations often inadvertently publish duplicate versions of the same web page across different sections of their website, or across several websites. Search engines have become increasingly adept at filtering duplicates out of the SERPs in order to give users varied, relevant results. The recommended fix is to remove any duplication from your site. If the duplicates are on another domain, remove the content there and have each page issue a 301 redirect to its corresponding page on the domain you intend to rank. These best practices are good to know, but for a myriad of reasons many organizations cannot implement them. The obstacle is usually a cumbersome CMS, and the problem is especially common on ecommerce sites. The next best solution is the canonical link element.
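For sites that can issue redirects, the 301 approach described above might look roughly like the following. This is only a minimal sketch written as a Python WSGI application; the `REDIRECTS` table and the `/old/product.html` path are hypothetical placeholders, and a real site would more likely configure redirects in its web server or CMS.

```python
# Hypothetical map: duplicate path on the secondary domain -> preferred URL.
# The path "/old/product.html" is an invented placeholder for illustration.
REDIRECTS = {
    "/old/product.html": "http://domain.com/category/product.html",
}


def app(environ, start_response):
    """Minimal WSGI app that answers known duplicate paths with a 301."""
    path = environ.get("PATH_INFO", "/")
    target = REDIRECTS.get(path)
    if target is None:
        # Unknown path: nothing to redirect.
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"Not found"]
    # 301 tells browsers and crawlers that the move is permanent.
    start_response("301 Moved Permanently", [("Location", target)])
    return [b""]
```

The function can be served with any WSGI server, for example `wsgiref.simple_server` from the standard library.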
The canonical link element is placed in the HEAD section of the HTML document and references the URL that your organization wants indexed. Take the following example URLs, all of which serve duplicate pages.
http://domain.com/category/product.html *I want this page to rank*
http://domain.com/category1/product.html
http://specialdomain.com/ /product.html
After some discussion with your team, you decide that the first URL is the page that needs to be indexed. You would put the following tag on the pages you do NOT want indexed.
<link rel="canonical" href="http://domain.com/category/product.html"/>
This will give the “credit” to the correct page, and eliminate any ambiguity as to which page the search engine should index.
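One way to confirm that a duplicate page really carries the element is to parse its HTML and read the `href` back. The sketch below uses Python's standard `html.parser` module; the `CanonicalFinder` class and `find_canonical` function are names made up for this illustration, and the code assumes the page source is already available as a string.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> encountered."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Only <link> tags matter, and only the first canonical one is kept.
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values; compare case-insensitively.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def find_canonical(html_text):
    """Return the canonical URL declared in html_text, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.canonical
```

Note that this sketch does not check that the element sits inside HEAD; it simply reports the first matching tag anywhere in the document.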
SEO best practice discourages duplicate content, meaning the same content found on multiple pages of one domain. Most of the time, duplicate content is unintentional, resulting either from a lack of knowledge of its destructive consequences or from simple disregard.
That being said, there are times when you do need duplicate content on your website. Consider an ecommerce site, particularly a product catalog, for example.
In a product catalog, it is not uncommon for the same product to be associated with multiple categories. As a result, you could have the same content on multiple URLs…
http://domain.com/category/product.html * I want this to be indexed.
http://domain.com/category1/product.html
http://domain.com/category2/product.html
… which all show the product details for one specific product. Ideally, we want search engines to index only one of those pages and ignore the other two. Fortunately, Google, Bing, and a few other search engines let us work around this problem rather easily with the canonical link element. The canonical link element, which goes in the HEAD section of an HTML document, tells search engines the preferred URL for a particular page. So, for the search engines to ignore two of the pages from our example above and pay attention only to the page we want indexed, the following line can be added to both pages we do not want indexed:
<link rel="canonical" href="http://domain.com/category/product.html"/>
Now when the search engines visit the pages with that element, they will consider only the canonical location in their indexes. Note, however, that you should still avoid duplicate content wherever possible. For the times when your application truly needs it, the canonical link element is a good fit.
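In a product catalog, the element would normally be emitted by the page template rather than typed by hand. As a rough sketch only, the helper below builds the tag for each duplicate URL from the example; `canonical_tag` and `tags_for_duplicates` are hypothetical names, not part of any CMS or framework.

```python
from html import escape

# URLs taken from the example above.
PREFERRED = "http://domain.com/category/product.html"
DUPLICATES = [
    "http://domain.com/category1/product.html",
    "http://domain.com/category2/product.html",
]


def canonical_tag(preferred_url):
    """Build the canonical link element, escaping the URL for use in an attribute."""
    return '<link rel="canonical" href="{}"/>'.format(escape(preferred_url, quote=True))


def tags_for_duplicates(preferred_url, duplicate_urls):
    """Map each duplicate URL to the tag that belongs in its HEAD section."""
    return {url: canonical_tag(preferred_url) for url in duplicate_urls}


if __name__ == "__main__":
    for url, tag in tags_for_duplicates(PREFERRED, DUPLICATES).items():
        print(url, "->", tag)
```

Escaping the URL matters once product URLs carry query strings, since a bare `&` inside an attribute value should be written as `&amp;`.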