Articles in the SEO & Content Category

Duplicate Content in Search Engine Indexes: Too much of a good thing?

March 7th, 2008 by Marjory Meechan

Having duplicate content on your site may not seem like it could cause a problem in search engine indexes. After all, the more keyword-relevant pages a site has in the indexes, the more likely it is that a page from the site will appear in the search engine results pages (SERPs) for that keyword, right? It’s true that duplicate content in search engine indexes is not the worst problem that a site can have; it’s far better than no content, for example. However, serving up duplicate content to search engines can cause problems. Although the major search engines are dedicated to crawling the entire web and indexing every single page, they are also constantly striving to present as many unique and relevant results to their users as possible. To do this, they have to filter out duplicate content, particularly when it occurs on the same site.

How do duplicate content filters work? Every search engine is different, and this is an aspect of search engines that is changing all the time. In fact, Google recently made major changes to the way they handle duplicate content. Prior to the fall of 2007, they maintained two indexes: a main index, from which most search results were drawn, and a supplemental index. Pages in the supplemental index were much less likely to appear in the SERPs. Google has now eliminated the distinction between indexes and started using other methods to ensure that pages from a single site do not overwhelm the search engine results. Some of these include:

  • Grouping duplicate URLs into a “cluster” and consolidating their properties, including inbound links, in one URL, which is then displayed in the SERPs.
  • Only displaying a maximum of two results from any one domain (including sub-domains) in the results pages and providing a link to display more results if the searcher wishes.
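
The two methods listed above can be illustrated with a rough sketch. This is not Google’s actual algorithm, which is not public; it assumes duplicates are exact matches after whitespace and case normalization, and it approximates “domain” as the last two labels of the host name (which is wrong for suffixes such as .co.uk). The function names cluster_duplicates and limit_per_domain are hypothetical.

```python
import hashlib
from collections import defaultdict
from urllib.parse import urlparse


def cluster_duplicates(pages):
    """Group URLs whose text is identical after normalizing case and whitespace.

    `pages` maps URL -> page text. Returns a list of URL lists, one per cluster.
    Assumption: only exact duplicates are clustered, not near-duplicates.
    """
    clusters = defaultdict(list)
    for url, text in pages.items():
        normalized = " ".join(text.lower().split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        clusters[digest].append(url)
    return list(clusters.values())


def limit_per_domain(ranked_urls, max_per_domain=2):
    """Keep at most `max_per_domain` results per domain, preserving rank order.

    Assumption: the domain is naively taken as the last two host labels,
    so sub-domains are counted together with their parent domain.
    """
    seen = defaultdict(int)
    kept = []
    for url in ranked_urls:
        host = urlparse(url).netloc.lower()
        domain = ".".join(host.split(".")[-2:])
        if seen[domain] < max_per_domain:
            seen[domain] += 1
            kept.append(url)
    return kept
```

In this sketch, a third result from www.example.com would be dropped if a.example.com and example.com already appear higher in the ranking.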

So, if Google is taking care of this issue, why should we care? There are two main reasons:

  1. Search engines do not crawl all the pages of a site on every visit. How often a site is subject to a “deep crawl” depends on how important the search engines consider the site, but even very important sites are not fully crawled every time. How many pages are crawled can depend on how much time the search engines have allocated to crawling your site. If they are wasting time collecting the same content over and over rather than crawling and indexing the unique pages, some of your content may not be included in search engine results as quickly as you would like.
  2. When Google chooses which URL to display, they may not be considering issues like which page has the best title or meta tags or URL filename. If you have gone to the trouble of optimizing a specific page for search engines, your work is all for naught if they choose to display a non-optimized page in the SERPs instead.

The bottom line is that a well-designed site that takes care to serve only one version of a page to both search engines and visitors will be crawled more efficiently and will be less confusing for visitors to navigate. Furthermore, you, as the site owner, will choose which pages are displayed, rather than leaving the choice to some anonymous algorithm. Google has provided some tips on how to streamline your site and avoid duplicate content issues. How important this is to your site can depend on many factors, but taking any advantage you can when competing for those all-important first page positions is just good sense.
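
One way to work toward serving a single version of each page is to map every URL variant to one canonical form before linking or redirecting to it. The following is a minimal sketch of that idea, not a recipe from Google’s tips. The ignored parameter names and index file names are hypothetical examples, and the function name canonical_url is an assumption.

```python
from urllib.parse import urlparse, urlunparse, parse_qsl, urlencode

# Hypothetical parameters that do not change the page content.
IGNORED_PARAMS = {"sessionid", "ref", "utm_source"}
# Hypothetical default documents that duplicate the directory URL.
INDEX_FILES = ("index.html", "index.php")


def canonical_url(url):
    """Return one normalized form for URL variants that show the same page."""
    parts = urlparse(url)
    host = parts.netloc.lower()
    path = parts.path or "/"
    # Treat /tea/index.html and /tea/ as the same page.
    for name in INDEX_FILES:
        if path.endswith("/" + name):
            path = path[: -len(name)]
            break
    # Drop ignored parameters and sort the rest so order does not matter.
    params = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query)
        if key.lower() not in IGNORED_PARAMS
    )
    return urlunparse((parts.scheme.lower(), host, path, "", urlencode(params), ""))
```

Which variants really are the same page depends on the site, so the two constants would need to be adapted before any real use.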

Posted in SEO News, SEO & Content, SEO & Marketing

How External Duplicate Content Affects Your Site

March 6th, 2008 by Grant Wolz

There was recently a frantic post on Google Groups by a gentleman who was sure that his entire website had been de-indexed by Google because another domain had a cached version of it indexed. After he saw what had happened, he researched the matter himself and assumed that he had been hijacked by this proxy cache and that he needed to take action to block any further problems. His response was to block all robots from his site with nofollow and noindex meta tags, which only made matters worse. His actions caused his entire site of 4000+ pages to be removed from all search engine indexes and destroyed his business.
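
The meta tags in this story are robots meta tags, such as a meta element named "robots" with the content "noindex, nofollow". Before publishing a change like that, it can help to check which directives a page actually carries. Here is a minimal sketch using Python’s standard html.parser module; the class and function names are hypothetical, and it only looks at the generic "robots" meta tag, not at crawler-specific tags or HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        for directive in content.split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html):
    """Return the set of robots directives declared in the given HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives
```

If "noindex" shows up in the result for pages you want in the search results, that page is asking search engines to leave it out.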

Of course this example is a bit extreme, but would your response have been any better? It’s time we educated ourselves about the mystery behind the dreaded duplicate content matter and learned how to really deal with it.

By basic definition, duplicate content refers to an exact copy of a webpage, or of content on a page, that is listed under a different URL. In other words, the pages look exactly the same, but the URL in the address bar is different. The duplication can be internal (within your site) or external (on another website). For today, we are going to stick with external duplicate content, since this is what is described in the example.

But first, we should look at why we are concerned about what other people do online with our content. What caused this whole duplicate content beast to appear anyway? The true cause of the fear of duplicate content was Google’s supplemental index (which is now gone). The problem was that Google wanted to find a way to limit the number of results from a single site for a single keyword. For example, if you had a page about green tea on your site and you also had ten copies of the page under different categories on the same site, Google had to pick one of them so that your single site did not take up multiple spots in the rankings. These duplicate pages were placed in the supplemental index to show the owners that Google knew the page was there, but didn’t want to put it in the search results because either the page itself or something very similar was already there.

Many site owners had problems with this because they did not have enough unique pages. Simply replacing green tea with white tea did not make a page unique enough to be listed as a different page. Pages needed to be clearly different, with different text, to be unique, but no one knew. And so the dreaded issue of pages going missing because of duplicate content began. The beast had been born.
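
To see why swapping one word is not enough, it helps to measure how much two pages overlap. The sketch below uses word shingles and Jaccard similarity, a common textbook technique for near-duplicate detection. It is an illustration only; the source does not say which method Google uses, and the function names and the shingle size of three words are assumptions.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word sequences of length `size`."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Short texts become a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard_similarity(text_a, text_b, size=3):
    """Share of shingles the two texts have in common, from 0.0 to 1.0."""
    set_a = shingles(text_a, size)
    set_b = shingles(text_b, size)
    if not set_a and not set_b:
        return 1.0  # two empty texts are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


green = "Our green tea is picked by hand and shipped fresh every week"
white = "Our white tea is picked by hand and shipped fresh every week"
print(jaccard_similarity(green, white))
```

Even on this single short sentence, most shingles are shared between the two versions. On a full page where only the tea name changes, the overlap would be higher still.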

So how does external duplicate content actually affect your site? The truth of the matter is that it doesn’t affect it at all. The stories we hear of cached versions of pages replacing the real site all involve underlying, unrelated problems that we never hear about. If, for example, you were caught and de-indexed for taking part in a link farm, it’s only natural that a copy of your site takes its place. It’s still your site and still your content; it’s just listed somewhere else on the internet that’s not in trouble with the search engine.

If we really take the time to think about this whole issue of external duplicate content before we panic and make matters worse, we can see just how unfounded the fear really is. Could it really be so simple to destroy your competitors that all you needed to do was make a copy of their site? Heck, even multiple copies of a website could be made for just a few dollars. The internet would be in total anarchy as site after site competed to see who could copy the others the most. Major sites like WhiteHouse.gov could be removed from Google because of the actions of the average middle school child with internet access and fifty bucks. Do we really want to think this is how the internet works?

In the end, we should actually consider these duplicate external websites and caches to be a good thing. If by some off chance a user finds a cached version of your site in the farthest reaches of the internet, it will still have your content and your contact information on it. This copy could reach a user who, for some reason or another, would never in a million years have found your real site. Right now your articles and products could be viewed by people you never even thought of targeting. This is a good thing for your business and your website. Some of these random cached pages might even be counted as backlinks. This is a far-fetched notion, but it is possible.

I hope this has cleared the air around the notion of external duplicate content and that you may feel more at ease when you see a copy of your page somewhere online. It won’t hurt you or your SEO practices; all it can do is help spread your content. Remember, copying is considered the most sincere form of flattery.

Posted in SEO News, SEO & Content, SEO & Marketing

Keyword Targeting Influenced by Website Size

March 5th, 2008 by Karen Luther

Keyword targeting your website for natural search results can be a daunting task, and if done incorrectly it can actually hurt the performance of your site in the search engines. One way keyword targeting can go horribly wrong is when you try to target a laundry list of keywords on one page (many times it’s the homepage) in hopes of getting good positions in the search results. Jamming multiple keyword phrases onto one page is called keyword dilution and can cause your site to drop in rank in the search engines.

What many people don’t realize is that a major factor in determining the number of keywords that can be targeted on a website is the size of the website. What it boils down to is that you should only be targeting one or two unique keyword phrases per page. So if your site only has ten pages, then the maximum number of unique keyword phrases it can support is 20. And it may be even fewer than that, depending on the competitiveness of the keyword phrase. For example, if a keyword phrase that you want your site to rank highly for in the search engines is very competitive, then you will want it to be the only keyword phrase targeted on a single page, at about a 4% keyword density.
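
The arithmetic above can be sketched in a few lines. Keyword density has more than one definition in use; this example assumes it means the words belonging to occurrences of the phrase divided by the total words on the page. The function names are hypothetical.

```python
import re


def max_keyword_phrases(page_count, phrases_per_page=2):
    """Upper bound on unique keyword phrases a site of this size can target."""
    return page_count * phrases_per_page


def keyword_density(text, phrase):
    """Fraction of the words in `text` that belong to occurrences of `phrase`.

    Assumption: density = (occurrences * words in phrase) / total words.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    target = re.findall(r"[a-z0-9']+", phrase.lower())
    if not words or not target:
        return 0.0
    length = len(target)
    occurrences = sum(
        1
        for i in range(len(words) - length + 1)
        if words[i:i + length] == target
    )
    return occurrences * length / len(words)


print(max_keyword_phrases(10))  # a ten-page site supports at most 20 phrases
```

Under this definition, a two-word phrase used once on a 50-word page has a density of 4%.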

But what if there are different variations of a keyword phrase that’s really important to the website? Which variation do you choose? The answer lies in finding the balance between the popularity and the competitiveness of the keywords. That’s where keyword research comes into play. Find out which version of the keyword phrase users are searching for. From there, decide which keyword phrase your site has the best chance of ranking highly for, based on the competitiveness of the keyword. Then, according to the number of pages you have on your site, choose the number of keywords you will target. If you want to increase that list of keywords, then you will have to create new pages with unique content.
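
The selection process described above can be sketched as a simple ranking. The post does not give a formula for balancing popularity against competition, so the ratio used here is purely an assumption for illustration, as are the function name and the input format. Real search volumes and competition scores would have to come from your own keyword research.

```python
def choose_keywords(candidates, page_count, phrases_per_page=2):
    """Pick the phrases with the best popularity-to-competition ratio.

    `candidates` is a list of (phrase, monthly_searches, competition) tuples,
    where competition is a positive number (higher means harder to rank).
    The number of phrases returned is capped by the size of the site.
    """
    budget = page_count * phrases_per_page
    ranked = sorted(
        candidates,
        key=lambda item: item[1] / item[2],  # assumed scoring rule
        reverse=True,
    )
    return [phrase for phrase, _, _ in ranked[:budget]]
```

A site that wants to target more phrases than this budget allows would need to add pages with unique content first, as noted above.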

Posted in SEO News, SEO & Design, SEO & Content, SEO & Marketing
