Articles in The 'Duplicate-content-filters' Tag

March 31 2010

Duplicate Content: Are There Penalties?

by Darren Franks

A common misconception within the online community is that there are “penalties” for having duplicate content on your website. Many webmasters get very antsy if they think the folks at Google et al. are going to put them in “search engine jail” for having duplicates of this and duplicates of that on their website. In actuality, omission from the index or de-ranking is reserved for only the most blatant offenders.

If your intention is to deliberately steal content from another website or to stuff your pages with keywords with the goal of ranking higher, then you should probably fear Google’s wrath. However, if some pages simply look very similar or are duplicated because of a stubborn CMS, the worst that will happen is that one of these pages will be filtered out of the search results. The best way around this is either to employ the proper redirects (a topic I discussed in my last blog post) or to make all pages on the site as distinct as possible.

Even setting aside the iron fist of the search engines, it is still good practice to avoid duplicate content for the sake of your users. The more unique content a website has, the wider its reach in the search engines and the better the experience it provides for your users.

June 30 2008

Duplicate Content Tips

by Emily Creech

There can be numerous reasons why pages may not appear in the search results or why rankings can drop. One reason is duplicate content. There are many ways in which content can be duplicated and it usually happens unintentionally. It can occur for very valid reasons, often through actions that have been taken to boost rankings. It is not the worst thing to have happen, but if it can be fixed, it is probably in your best interest to do so. I have come across this issue a few times lately and thought I would offer a few helpful tips.

In general, a search engine’s mission is to provide unique and relevant content to the searcher. When an engine comes across duplicate content, the question arises: “Which pages are the most appropriate pages to index?” To display the most useful pages in the search engine results pages (SERPs), a duplicate content filter evaluates, sorts through, and removes the duplicate content pages (and spam). The search engines may do a fairly good job of determining what to index, but by taking proactive steps, it is possible to help guide them to the pages you want indexed (or at least keep them from weeding out certain pages from your site). Keep in mind that if you provide no guidance, they will make the choice themselves, which may lead to disappointment.
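As a rough illustration of what "evaluating" duplicates can mean, the sketch below compares two pages by the overlap of their three-word phrases (word shingles) using Jaccard similarity. This is a generic textbook technique, not a description of how any particular search engine's filter works; the function names and the 0.8 threshold are assumptions chosen for the example.

```python
import re


def shingles(text: str, k: int = 3) -> set:
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Very short text: treat the whole text as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def is_near_duplicate(text_a: str, text_b: str, threshold: float = 0.8) -> bool:
    """Flag two texts as near-duplicates (threshold is an arbitrary assumption)."""
    return jaccard(shingles(text_a), shingles(text_b)) >= threshold


product_page = "Our blue widget ships free and comes with a two year warranty"
printer_page = "OUR BLUE WIDGET SHIPS FREE AND COMES WITH A TWO YEAR WARRANTY!"
contact_page = "Contact our support team by phone or email any weekday"

print(is_near_duplicate(product_page, printer_page))  # same words, different case
print(is_near_duplicate(product_page, contact_page))  # unrelated content
```

Running a check like this across your own pages is one way to spot likely duplicates before a search engine has to decide for you.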

Below are just a few ways to avoid duplicate content:

  • Resolve canonicalization issues by redirecting to the preferred domain (e.g., redirecting the non-www version to the www version).
  • Submit a sitemap with the canonical version of each URL.
  • Implement a robots.txt file to tell the search engine spiders not to crawl certain pages (such as printer-friendly pages).
  • Have all additional domains properly redirected using a 301 redirect. This also transfers any authority those domains have built up.
  • Keep dynamic parameters in URLs to a minimum.
  • Have an internal linking strategy to build relevancy.
  • Make each page on your site unique, including unique titles, meta descriptions, headings, and navigation.
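To make the first and fifth tips concrete, here is a minimal sketch of a function that maps URL variants to one canonical form: it forces the preferred www host and drops parameters that do not change the page content. The host name, the list of ignored parameters, and the function name are all hypothetical choices for illustration. A site would typically answer any request whose URL differs from its canonical form with a 301 redirect to that form.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical settings for illustration only.
BARE_HOST = "example.com"
PREFERRED_HOST = "www.example.com"
IGNORED_PARAMS = {"sessionid", "utm_source", "utm_medium", "utm_campaign"}


def canonical_url(url: str) -> str:
    """Return the single preferred form of a URL."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host == BARE_HOST:
        host = PREFERRED_HOST  # non-www version maps to the www version
    # Keep only parameters that affect content, in a stable order.
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    )
    path = parts.path or "/"
    # The fragment is dropped because it is never sent to the server.
    return urlunsplit((parts.scheme.lower(), host, path, urlencode(kept), ""))


print(canonical_url("http://Example.com/shoes?utm_source=news&color=red&sessionid=abc"))
# http://www.example.com/shoes?color=red
```

The same function can also be used to generate the URLs listed in a sitemap, so that the sitemap and the redirects agree on which version is canonical.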

These ideas barely scratch the surface of ways to reduce duplicate content, but hopefully they will get you headed in the right direction.
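For the robots.txt tip, it can help to test your rules before publishing them. The sketch below uses Python's standard `urllib.robotparser` module to check which URLs a sample robots.txt would block. The `/print/` directory and the example URLs are assumptions made up for the illustration, and real crawlers may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that keeps crawlers out of printer-friendly pages.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_crawl(url: str, agent: str = "*") -> bool:
    """Return True if the sample rules allow `agent` to fetch `url`."""
    return parser.can_fetch(agent, url)


print(may_crawl("http://www.example.com/shoes"))        # regular page
print(may_crawl("http://www.example.com/print/shoes"))  # printer-friendly copy
```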

March 7 2008

Duplicate Content in Search Engine Indexes: Too much of a good thing?

by Marjory Meechan

Having duplicate content on your site may not seem like it could cause a problem in search engine indexes. After all, the more keyword-relevant pages a site has in the indexes, the more likely it is that a page from the site will appear in the search engine results pages (SERPs) for that keyword, right? It’s true that duplicate content in search engine indexes is not the worst problem that a site can have; it’s infinitely better than no content, for example. However, serving up duplicate content to search engines can cause problems. Although the major search engines are dedicated to crawling the entire web and indexing every single page, they are also constantly striving to present as many unique and relevant results to their users as possible. To do this, they have to filter out duplicate content, particularly when it occurs on the same site.

How do duplicate content filters work? Every search engine is different, and this is an aspect of search engines that is changing all the time. In fact, Google recently made major changes to the way it handles duplicate content. Prior to the fall of 2007, it maintained two indexes: a main index, from which most search results were drawn, and a supplemental index. Pages in the supplemental index were much less likely to appear in the SERPs. Google has now eliminated the distinction between the indexes and started using other methods to ensure that pages from a single site do not overwhelm the search engine results. Some of these include:

  • Grouping duplicate URLs into a “cluster” and consolidating their properties, including inbound links, into one URL, which is then displayed in the SERPs.
  • Only displaying a maximum of two results from any one domain (including sub-domains) in the results pages and providing a link to display more results if the searcher wishes.
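The two methods above can be pictured with a toy model. The sketch below groups pages with identical body text into clusters, credits each cluster's representative URL with the combined inbound links, and then shows at most two results per domain. Everything here is an assumption made for illustration: the data structure, the exact-match hashing, the rule for picking a representative, and the naive way of finding the domain. It is not Google's actual algorithm.

```python
import hashlib
from collections import defaultdict
from urllib.parse import urlsplit


def cluster_duplicates(pages):
    """Group pages with identical bodies and consolidate their inbound links.

    pages: list of dicts with 'url', 'body', and 'inbound_links' keys.
    """
    clusters = defaultdict(list)
    for page in pages:
        digest = hashlib.sha256(page["body"].encode("utf-8")).hexdigest()
        clusters[digest].append(page)

    results = []
    for group in clusters.values():
        # Assumption: the most-linked URL represents the cluster.
        representative = max(group, key=lambda p: p["inbound_links"])
        total_links = sum(p["inbound_links"] for p in group)
        results.append({"url": representative["url"], "inbound_links": total_links})
    return results


def naive_domain(url):
    """Last two host labels, so sub-domains count toward the same domain.

    Deliberately naive: it is wrong for suffixes such as .co.uk.
    """
    return ".".join(urlsplit(url).netloc.lower().split(".")[-2:])


def limit_per_domain(results, max_per_domain=2):
    """Keep the best-linked results, at most max_per_domain per domain."""
    shown, counts = [], defaultdict(int)
    for result in sorted(results, key=lambda r: r["inbound_links"], reverse=True):
        domain = naive_domain(result["url"])
        if counts[domain] < max_per_domain:
            counts[domain] += 1
            shown.append(result)
    return shown


pages = [
    {"url": "http://www.example.com/shoes", "body": "Red shoes", "inbound_links": 5},
    {"url": "http://www.example.com/shoes?sessionid=1", "body": "Red shoes", "inbound_links": 2},
    {"url": "http://www.example.com/hats", "body": "Blue hats", "inbound_links": 3},
    {"url": "http://blog.example.com/socks", "body": "Green socks", "inbound_links": 1},
]
for result in limit_per_domain(cluster_duplicates(pages)):
    print(result["url"], result["inbound_links"])
```

In this toy data the two "Red shoes" URLs collapse into one result with seven inbound links, and the socks page is held back because the domain already has two results showing.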

So, if Google is taking care of this issue, why should we care? There are two main reasons:

  1. Search engines do not crawl all the pages of a site on every visit. How often a site is subject to a “deep crawl” depends on how important the search engines consider the site, but even very important sites are not fully crawled every time. How many pages are crawled can depend on how much time the search engines have allocated to crawling your site. If they are wasting time collecting the same content over and over rather than crawling and indexing the unique pages, some of your content may not be included in search engine results as quickly as you would like.
  2. When Google chooses which URL to display, they may not be considering issues like which page has the best title or meta tags or URL filename. If you have gone to the trouble of optimizing a specific page for search engines, your work is all for naught if they choose to display a non-optimized page in the SERPs instead.
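The first reason is easy to see with a little arithmetic. The sketch below assumes a crawler that fetches a fixed number of URLs per visit (a simplification, since real crawl allocation is not this tidy) and counts how many distinct pieces of content it reaches. The URLs, content labels, and budget are made up for the example.

```python
def unique_pages_reached(crawl_queue, budget):
    """Count distinct pieces of content fetched within a fixed crawl budget.

    crawl_queue: list of (url, content_id) pairs in crawl order, where
    content_id is a hypothetical label meaning "same content, different URL".
    """
    fetched = crawl_queue[:budget]
    return len({content_id for _url, content_id in fetched})


# The same three pages, with and without session-ID duplicates.
duplicated_site = [
    ("/shoes", "shoes"),
    ("/shoes?sessionid=1", "shoes"),
    ("/shoes?sessionid=2", "shoes"),
    ("/hats", "hats"),
    ("/socks", "socks"),
]
clean_site = [("/shoes", "shoes"), ("/hats", "hats"), ("/socks", "socks")]

print(unique_pages_reached(duplicated_site, budget=3))  # 1
print(unique_pages_reached(clean_site, budget=3))       # 3
```

With the same budget of three fetches, the clean site gets all of its content crawled, while the duplicated site spends the whole visit on one page.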

The bottom line is that a well-designed site that takes care to serve only one version of a page to both search engines and visitors will be crawled more efficiently and will be less confusing for visitors to navigate. Furthermore, you as the site owner, and not some anonymous algorithm, will choose which pages are displayed. Google has provided some tips on how to streamline your site and avoid duplicate content issues. How important this is to your site can depend on many factors, but taking any advantage you can when competing for those all-important first-page positions is just good sense.

© 2022 MoreVisibility. All rights reserved.