Content Scraping is Poor Strategy
Recently I read an article entitled "Site content and use of web catalogues" at Google Webmaster Central. It is disturbing that people are still trying to rank in the search results by scraping content from other sources. Let’s face it, this is just poor strategy. I cannot stress this enough: when it comes to content, the only strategy that works in the long term is creating pages with original, creative content that is relevant to the subject of the page.
Look at it like this: the job of the search engines is to provide their users with information that is closely related to the products and services they are searching for. When that content is copied (scraped) from somewhere else, it detracts from the user experience. Scraping might work in the short term, but as the engines filter out duplicate content, you are sure to lose whatever rankings you achieved with this method.
If you want your site to have a longstanding presence, write original copy for each page. Make sure that what you are writing about is related to the information, service or product that your website provides. If you have trouble coming up with content, do some research or hire a professional copywriter.
There are several steps you can take to ensure that your site is optimized from the backend, but all else being equal, content is king!
Posted in SEO & Content


Not only is content scraping poor SEO strategy; the verbatim copying of content from other websites and blogs is also illegal copyright infringement.
It’s open to debate whether the duplication also dilutes the search rankings of the original content.
Carol Shepherd, Attorney
Hi Carol, and welcome,
I totally agree that publishing scraped content is illegal. But do you think usage should be taken into consideration? If not, then Google’s cached versions of pages would be illegal, would they not?
As far as the debate goes, here’s my stance. Duplicate content cannot dilute specific search engine rankings. In the search engines, duplicate content is normally handled via a binary filter: the content is either shown or not. I am sure there are cases that don’t fit the norm, as with almost everything related to the algorithms. The main issue for the original content is whether it is selected as the primary version and displayed in the natural results. If not, the results are not diluted; they are relegated to the supplemental index. Unlike most things in SEO, this is fairly black and white.
Here’s some additional reading on duplicate content if you are so inclined… Google Related Duplicate Content Data Surfacing
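To make the "binary filter" idea concrete, here is a minimal Python sketch. It is an illustration only and not how any search engine actually works: the Page class, the first_seen field, the three-word shingles, the 0.8 similarity threshold, and the rule that the earliest-seen copy becomes the primary version are all assumptions made for the example. The point is simply that each page ends up either shown or set aside; nothing is partially ranked.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    text: str
    first_seen: int  # hypothetical crawl order; lower means discovered earlier


def shingles(text: str, k: int = 3) -> set:
    """Return the set of k-word shingles (word windows) in the lowercased text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of the overlap divided by size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def filter_duplicates(pages, threshold=0.8):
    """Split pages into (shown, supplemental).

    Assumption for this sketch: the earliest-seen page in a group of
    near-duplicates is treated as the primary version. Every later page
    that is at least `threshold` similar to a shown page is set aside.
    """
    shown, supplemental = [], []
    for page in sorted(pages, key=lambda p: p.first_seen):
        sig = shingles(page.text)
        if any(jaccard(sig, shingles(kept.text)) >= threshold for kept in shown):
            supplemental.append(page)  # filtered out: not shown at all
        else:
            shown.append(page)  # kept as the primary version
    return shown, supplemental
```

Given an original page, a later verbatim copy, and an unrelated page, the copy is the only one filtered out. In practice the hard part is the step this sketch assumes away: deciding which copy counts as the primary version.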