Articles in The 'Google Indexing' Tag

May 18 2012

Google’s Search for Your Website’s Pages

by Melanie Wahl

Google can be a valuable source of traffic for your website. Searchers who enter a specific keyword or keyphrase benefit from Google’s curated results. These results, separated into Search Engine Results Pages, aim to deliver the best quality content relevant to the query entered. Behind the scenes, Google goes through a number of steps before displaying (or serving) content to the user. These steps are crawling, indexing, and serving.

Crawling refers to the work of Googlebot, Google’s web crawling bot (or spider), which “crawls” or discovers new and updated pages by following links from site to site. This link-following behavior is what the “nofollow” value (rel="nofollow") addresses: it asks Googlebot not to follow a given link.
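To make the link-following idea concrete, here is a minimal sketch in Python of how a crawler might collect the links on a page while setting aside those marked rel="nofollow". It is an illustration only, not a description of how Googlebot is implemented; the class name, function name, and example.com URLs are all hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects <a href> targets, separating out rel="nofollow" links."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.followed = []  # links a crawler would queue for a later visit
        self.skipped = []   # links the page author marked as nofollow

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if not href:
            return
        # Resolve relative links against the URL of the page being parsed.
        url = urljoin(self.base_url, href)
        # rel can hold several space-separated values, e.g. "nofollow noopener".
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "nofollow" in rel_values:
            self.skipped.append(url)
        else:
            self.followed.append(url)


def discover_links(html, base_url):
    """Return (followed, skipped) link lists for one page of HTML."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.followed, parser.skipped


# Hypothetical page content and URLs, for illustration only.
page = (
    '<a href="/guides/">Guides</a> '
    '<a rel="nofollow" href="https://example.org/ad">Ad</a>'
)
followed, skipped = discover_links(page, "https://example.com/")
```

In this sketch the first link would be queued for crawling and the second would be set aside, which mirrors the request a nofollow link makes of a search engine bot.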

Indexing refers to the sorting process Google conducts to organize the different types of content that Googlebot has crawled. Information used to help sort a page includes its tags and attributes. Some rich media files and pages with dynamic features cannot be processed, so if you find that a page is not showing up in Google’s index, it is best to try simplifying the code on that page.

Serving is the end result: when a Google searcher enters a query, results are “served” as snippets on the Search Engine Results Page (SERP). Google strives to serve the most relevant pages for a search query, using a very complex algorithm that weights results and orders them accordingly.

If you are not already familiar with them, we urge you to read Google’s Webmaster Guidelines to learn Google’s best practice suggestions for helping it find, crawl, and index your website.

April 3 2012

Canonicalization: Canonical URL for Content Files

by Melanie Wahl

In our previous two blog posts on canonicalization we covered the definition of the term, the most common webmaster concern of www versus non-www versioning of a domain, and the canonical tag.

This blog post will dig a little deeper into canonicalization for website owners who link to content files, such as PDFs, from their website. Indicating the canonical version of a content file to search engines should be considered just as important as indicating the canonical version of a page of your website.

The canonical tag can be used to signal the preferred or canonical URL for various content file types as well as HTML documents.

For example, if your website offers a number of PDF documents for download, such as a series of how-to guides, and also has matching pages containing the same how-to information, you end up with two URLs for each guide: one for the PDF file and one for the HTML page.

You have the option of sending rel="canonical" in the HTTP header of the response when the PDF file is requested, to indicate to Google and other search engines that the HTML version is preferred.

When the HTML page is marked as the canonical version of the PDF, visitors who find the information in search engine results are taken to your website, tracked by analytics (assuming you are using an analytics platform), and see all the navigation and onsite linking you have set up. If visitors instead clicked through to the PDF from search engine results, it could be a dead end due to its lack of links, navigation, and tracking.

It should be noted that this technique requires not only access to the source code, but also the ability to configure the server hosting your website to send rel="canonical" in HTTP headers for files such as PDFs. This method could also be used to identify the canonical URL for a PDF when the same PDF file is located in multiple directories.
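As a rough sketch of what the HTTP header approach involves, the Python below builds the Link header that a server could send along with a PDF. The header format, a URL in angle brackets followed by rel="canonical", is the standard HTTP Link header syntax; the function names and the example.com URL are hypothetical, and how you attach the header depends on your own server configuration.

```python
def canonical_link_header(canonical_url):
    """Return the (name, value) pair for a rel="canonical" HTTP header."""
    return ("Link", f'<{canonical_url}>; rel="canonical"')


def pdf_response_headers(canonical_url):
    """Headers a server might send with a PDF to point at its HTML page."""
    return [
        ("Content-Type", "application/pdf"),
        canonical_link_header(canonical_url),
    ]


# Hypothetical URL of the HTML page that matches the PDF being served.
headers = pdf_response_headers("https://example.com/guides/how-to.html")
for name, value in headers:
    print(f"{name}: {value}")
```

The same header pointing at a single preferred PDF URL could be sent for every copy of a PDF that lives in more than one directory.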

Please don’t hesitate to contact MoreVisibility if you need help with optimizing your website or assigning the canonical version of your website pages or documents.

April 2 2012

Canonicalization: The Canonical Tag

by Melanie Wahl

In our previous blog post on canonicalization we gave a brief definition and covered the most common webmaster concern of www versus non-www versioning of a domain.

This blog post will cover the canonical tag, which was released in February 2009 to help website owners and their webmasters influence how search engines sort their content, in the hope of keeping duplicate content out of each search engine’s index.

First, what is the canonical tag and what does it look like? The canonical tag is marked by rel="canonical" and is placed in the head section of a webpage’s source code.

Here is an example:

<link rel="canonical" href="" />

This tag notes that the URL listed as the href is the version that is canonical, or preferred by you, the webmaster. It should be noted that rel="canonical" is not the same as a 301 redirect. The 301 redirect automatically redirects both search engine bots and human visitors to your preferred version of the page. The canonical tag is only ever seen by search engine bots, not users. A 301 redirect is also considered a stronger signal that there is a canonical source, whereas the canonical tag is taken as more of a suggestion: Google has stated that it considers the rel="canonical" link element a hint and not an absolute directive, so it may be overruled by the search engines if their analysis or algorithms see that a different page seems to be a better fit. The canonical tag also has cross-domain functionality; it can be used in cases where it may not be easy to set up redirects, such as when you are migrating from one domain to another and are unable to create server-side redirects.
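The difference between the two techniques can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a recommended implementation: the function names and the example.com URL are hypothetical, and a real site would normally configure redirects in the web server or CMS.

```python
from html import escape


def redirect_301(preferred_url):
    """Status line and headers for a permanent redirect.

    Bots and human visitors alike are sent on to preferred_url.
    """
    return "301 Moved Permanently", [("Location", preferred_url)]


def canonical_link_tag(preferred_url):
    """Markup for the head section of a duplicate page.

    The page still loads normally for visitors; only search engines
    read the hint.
    """
    return f'<link rel="canonical" href="{escape(preferred_url, quote=True)}" />'


# Hypothetical preferred URL, for illustration only.
preferred = "https://example.com/page"
status, redirect_headers = redirect_301(preferred)
tag = canonical_link_tag(preferred)
```

The redirect changes what every visitor receives, while the tag only adds a line of markup to a page that otherwise stays where it is.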

These two techniques can help webmasters convey to the search engines which version of their webpages to include in (or exclude from, if needed) the indices and how they prefer link metrics (such as link juice) to be assigned. Contact MoreVisibility if you need help with the canonicalization of your website.

© 2024 MoreVisibility. All rights reserved.