Articles written in July, 2008

Fixing Un-Canonical URLs. Oh joy! Part 1

http://www.morevisibility.com/seoblog/fixing-un-canonical-urls.html July 25th, 2008 by

In my previous post, Filenames, host names and canonicalization, oh my!, I talked about how duplicate content issues affect your search engine rankings, and specifically how un-canonical URLs create these issues. I mentioned the different types of URL canonicalization issues you are likely to deal with in your SEO work. Since that post, I have created a new, more complete list of canonical URL issues, and in this and future posts I will go into more detail about how these issues arise and how to fix them.

Let’s start with a detailed guide on how to pronounce canonical: ca • NON • uh • cul. Also, canonicalization is pronounced: ca • non • uh • cul • i • ZAY • shun. That was easy!

Without further ado, here is the new list of areas of canonical URL issues:

1. Protocols (http and https)
2. Domain and subdomain names (sometimes referred to as host names)
3. URL paths
4. File names
5. Case sensitivity (when myPage.html is handled differently than MYPage.HTML)
6. Query strings
7. Combinations of any or all of the above

Let’s break these down using an example URL:

https://example.org/blog/Colors/default.asp?action=go&sessid=6468439#section3

1. https is the protocol
As far as search engines go, there are only two protocols: http and https.

2. org is the top level domain

3. example.org is the domain name
No subdomain is used. (Technically, www is a subdomain just like any other subdomain.)

4. /blog/Colors/default.asp is the URL path (everything after the top level domain, including the file name, and before the query string, if any)

5. default.asp is the file name
Many times, the URL path may not contain the file name at all and will just end with the forward slash (e.g. /blog/Colors/).

6. ?action=go&sessid=6468439 is the query string (everything after the question mark and before the # sign, if any)
The query string is made up of parameters and their values. “action” is a parameter and “go” is its value.

7. #section3 is the URL fragment (also known as a named anchor or bookmark)
A URL fragment tells the web browser where inside a specific web page to go or scroll to. It does not tell the web browser (or the search engine) which page to go to, and therefore does not contribute to any URL canonicalization issues.
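
As a rough illustration of the breakdown above, here is a minimal Python sketch that splits the example URL into the same pieces using the standard library's urllib.parse module. The function name describe_url and the dictionary keys are my own labels, not standard terms.

```python
from urllib.parse import parse_qs, urldefrag, urlsplit


def describe_url(url):
    """Split a URL into the pieces named in the list above."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        "path": parts.path,
        # The file name is whatever follows the last slash (may be empty).
        "file_name": parts.path.rsplit("/", 1)[-1],
        # Each parameter maps to a list of its values.
        "query": parse_qs(parts.query),
        "fragment": parts.fragment,
    }


example = "https://example.org/blog/Colors/default.asp?action=go&sessid=6468439#section3"
info = describe_url(example)
print(info)

# Dropping the fragment leaves the URL that identifies the page itself.
page_url, fragment = urldefrag(example)
print(page_url)
```

Because the fragment only matters to the browser, comparing URLs after removing it (as urldefrag does here) is a reasonable first step when checking whether two URLs point to the same page.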

To illustrate each of these, here is perhaps the worst-case scenario as it relates to canonical URLs (I will continue to add to this scenario from post to post in this series):
You have a site with forms that visitors fill out. You have an SSL certificate, and people can go to https://www.example.com to see your site. Now, you want to give your visitors peace of mind by letting them know that your site is secure (by making sure that browsers show the lock icon that indicates a secure site). When the https version of your site is set up, the web server may pull files from the non-secure section, but send them back to the browser over a secure connection. (This ‘usability feature’ provides the convenience of only having to maintain one set of files instead of two.) Now, when visitors go to https://www.example.com/blog, they’ll see the same content as http://www.example.com/blog. This is your first duplicate content issue.

There are different ways to handle protocols when dealing with canonical URLs. You can block all access to the secure version of your site using a robots.txt file. However, if your web server or web hosting account is set up to serve your secure site from the same files as the non-secure site, search engines will see the regular robots.txt file when going to https://www.example.com/robots.txt. The most flexible way around this is to create a version of the robots.txt file that is used only for https and name it accordingly (robots-ssl.txt, for example). Then use rewrite rules to internally redirect all requests for https://www.example.com/robots.txt to robots-ssl.txt. The search engines will still think they are looking at https://www.example.com/robots.txt, which is the industry-standard location for the file. If your site is running on an Apache web server, see apache.org’s URL Rewriting Guides and yourhtmlsource.com’s URL Rewriting Guide.
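
To make the idea concrete, here is a minimal Python sketch of the decision such a rewrite rule makes. It is not an Apache configuration. The function names and the contents of robots-ssl.txt shown here are assumptions for illustration; only the file name robots-ssl.txt comes from the text above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical contents of robots-ssl.txt: keep all crawlers out of the https site.
ROBOTS_SSL = "User-agent: *\nDisallow: /\n"


def internal_path(scheme, path):
    """Mimic the internal rewrite: same public URL, different file for https."""
    if scheme.lower() == "https" and path == "/robots.txt":
        return "/robots-ssl.txt"
    return path


def crawl_allowed(robots_body, url):
    """Check a URL against robots.txt rules the way a compliant crawler would."""
    parser = RobotFileParser()
    parser.parse(robots_body.splitlines())
    return parser.can_fetch("*", url)


print(internal_path("https", "/robots.txt"))
print(internal_path("http", "/robots.txt"))
print(crawl_allowed(ROBOTS_SSL, "https://www.example.com/blog"))
```

The rewrite is internal, so the crawler never sees the name robots-ssl.txt. It requests /robots.txt over https and receives the https-only rules.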

Another way to handle this will only work if all the files on your web server can run server-side scripts. Usually, this is the case with .asp, .aspx, .cfm or .php files. If you have .htm or .html files, you may be able to ask your web hosting company to allow server-side scripts to be run in these files. Once all files on your web server can run server-side scripts, write a script that checks whether the file was accessed via https and, if so, adds a robots meta tag that disallows the page from being indexed. This script needs to go in every one of your files, either by ‘including’ it or by pasting it in at or near the beginning of the page. The file-include method can drastically reduce the time it takes to administer this script, since you only have to make changes in one place; every file that includes the changed file is essentially updated automatically.
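
The check itself is small. Here is a minimal sketch in Python, assuming a WSGI-style request environment; in .asp, .cfm or .php the same logic would be written in that language. The function names are hypothetical, and the "include" is modeled as one shared function that every page calls when building its head section.

```python
from html import escape

NOINDEX_TAG = '<meta name="robots" content="noindex">'


def robots_meta_tag(environ):
    """Return a noindex tag for https requests and an empty string otherwise."""
    # WSGI servers set "wsgi.url_scheme"; CGI-style servers commonly set HTTPS=on.
    via_https = (
        environ.get("wsgi.url_scheme") == "https"
        or str(environ.get("HTTPS", "")).lower() == "on"
    )
    return NOINDEX_TAG if via_https else ""


def render_head(title, environ):
    """Every page builds its <head> through this one shared function."""
    return "<head><title>{}</title>{}</head>".format(
        escape(title), robots_meta_tag(environ)
    )


print(render_head("Colors", {"wsgi.url_scheme": "https"}))
print(render_head("Colors", {"wsgi.url_scheme": "http"}))
```

Keeping the check in one shared function mirrors the file-include method: changing the tag or the https test in one place changes it for every page.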

My next post will discuss canonical domain names, including subdomains, and URL paths. In following posts in this series, I will provide methods that maintain the usability features of the web (for both the webmaster and the visitor) yet prevent the duplicate content/canonical URL issues. Stay tuned!

Posted in SEO & Technology

Geo-targeted Link Building for Local Search Optimization

http://www.morevisibility.com/seoblog/geo-targeted-link-building-for-local-search-optimization.html July 24th, 2008 by

Making sure that your site is optimized for your geographical location and the locations where your business operates is just one aspect of optimizing for local search results. Another good way to build local relevance for your site is by encouraging links from other trusted and relevant websites operated by businesses and organizations in your area. This establishes keyword relevance for your site for the name of your region and, if these sites are also hosted in your area, establishes that your website is physically well linked to that region.

One good way to find sites that could give your site some local relevance is to search for “your location” and “business” in Google, Yahoo and MSN as well as any other popular search engines and see what comes up. A search for “Boca Raton business” found these listings:

[Image: search results for “Boca Raton business”]

Any one of these sites would be a great link to have for a Boca Raton business. Once you have located some potential sources of valuable links, go to each site and see if there is any place on it that might offer an opportunity to get a link.

Another good search term to use for geo-targeted link building would be “Your Area Directory” – in the case of Boca Raton – “Boca Raton Directory”. If there are any local business directories, it would be a good idea for my Boca Raton business to be listed. Just make sure that any directory pages that come up are truly local as some larger directories may have a section devoted to your area. By choosing carefully, good sources for local links can be found.

Of course, not all links are necessarily going to be valuable for increasing search engine rankings. Some top ranking local business sites will provide links to local businesses, but the links may not be direct or they may have a rel="nofollow" attribute on them advising search engines not to count them, as in the case of the Boca Raton Chamber of Commerce site. When evaluating whether a site is a good source for a link, consider the potential for traffic before considering search engines. If the site has good potential for providing your site with direct traffic, then getting your site listed is a good idea – even if search engines do not count the link. If they do count it, all the better.
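
If you want to check how a page links to you, a short script can list the links and flag the ones marked nofollow. The following is a minimal Python sketch using the standard library's HTML parser. The class name LinkAudit, the sample markup, and the domain example.com are all hypothetical. It only finds direct links to the target domain, so links routed through a redirect on the linking site would not be detected.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkAudit(HTMLParser):
    """Collect links to one domain and note which carry rel="nofollow"."""

    def __init__(self, target_domain):
        super().__init__()
        self.target_domain = target_domain.lower()
        self.links = []  # list of (href, is_nofollow) pairs

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href") or ""
        host = (urlsplit(href).hostname or "").lower()
        # Match the domain itself or any of its subdomains (such as www).
        if host != self.target_domain and not host.endswith("." + self.target_domain):
            return
        rel_values = (attr_map.get("rel") or "").lower().split()
        self.links.append((href, "nofollow" in rel_values))


# Hypothetical markup standing in for a local directory page.
sample = """
<a href="http://www.example.com/">Example Co.</a>
<a rel="nofollow" href="http://www.example.com/contact">Contact</a>
<a href="http://other.example.net/">Someone else</a>
"""

audit = LinkAudit("example.com")
audit.feed(sample)
for href, is_nofollow in audit.links:
    print(href, "nofollow" if is_nofollow else "followed")
```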

Posted in Link Building

Google Webmaster Central

http://www.morevisibility.com/seoblog/google-webmaster-central.html July 22nd, 2008 by

While most of us are primarily familiar with Google as a search engine, there are several products and services offered by the company that can help webmasters gain visibility into their websites. While Google Analytics is one of the better known of this suite of tools (we have an in-house Web Intelligence team as well as a web analytics blog which features articles on it), there is another very useful tool webmasters can use. Conveniently enough, it’s called Google Webmaster Central.

Just like Google Analytics, Google Webmaster Central is offered as freeware. All you need to gain access to these services is a Google account. But, once you are in, what can you do? Well, more than you might realize. Once you verify that you are the website owner (by the addition of a slim-line authentication code in your site metadata or via upload of a validation HTML file), Google Webmaster Central offers a fairly robust selection of services, among which are:

  • Diagnostic information, such as the ability to identify “crawl errors”. Google Webmaster Central will show you the number of each type of crawl error the search engine has found on your website (with links to the individual URLs with errors).
  • Top search queries. You are able to view and research the top searches that bring visitors to your site from the Google search engine results pages (SERPs).
  • Visibility into what the GoogleBot sees. This is basically a detail of the words used most often in your website. As the search engine’s spiders are essentially “blind” (i.e. they can’t see the images used on your site), the way in which relevance to a particular search term is determined is from reading the words on the site’s pages. Knowing what the GoogleBot sees can help you with your search engine optimization (SEO) efforts, as well as your web accessibility compliance efforts.
  • A listing of external links. This will show you a list of pages on your site with external links to them, along with the number of links to each page. With this listing you can also click through to see the list of external URLs.
  • A listing of internal links. This list is presented in an alphabetized format, showing links from your website to other pages within the site (this is also referred to as inter-linking).
  • Statistical information for RSS/ATOM feeds. With Google Webmaster Central you can obtain information on the number of subscribers to each feed on your website via the Google Reader and iGoogle. It’s important to note that, if your site offers feeds using a service like FeedBurner, the data in Google Webmaster Central may not match the data from FeedBurner. The reason is that there’s currently no way for site owners to upload a FeedBurner file to the domain or to put an authentication/verification meta tag on the home page. Without this authentication, feeds served up via FeedBurner can’t get added to Google Webmaster Central.
  • A listing of site links. This is the list of links and titles that Google has generated for the site and that appear in the SERPs.
  • Identification of site issues. Your site may have content problems. If there are any issues with missing, duplicate, or short titles or meta descriptions, you can find this information as part of the webmaster toolset.

As Google has the lion’s share of Internet search traffic (with an active reach of 59.41%*), understanding how Google views your site, and diagnosing potential problems, is crucial to increasing your site’s visibility. Learning how Google’s robots crawl and index your website, learning what drives traffic to your site so you can refine your SEO efforts, and actually telling Google about your site by using Google Webmaster Central can help to improve your crawl-ability.

*Figure 1: Nielsen Online – Top Online Web Brands in the U.S.

By using the Google Webmaster Central service and its various tools, you can obtain information on how Google and, by extension, Yahoo, MSN, and the other search engines see your website. Google Webmaster Central is an excellent way to obtain direct, expert support, diagnose any site errors, and improve your site’s search visibility.

Posted in Google


MoreVisibility
925 South Federal Highway, Suite 750
Boca Raton, FL 33432 www.morevisibility.com

800.787.0497

ph: 561.620.9682

fx:  561.620.9684


© 1999 - 2012 MoreVisibility ® All Rights Reserved.
