This week, MSN Live Search announced changes to their robot including a new name: msnbot/1.1. Currently, the name change just applies to their main robot, but look forward to more new robots from MSN in the coming months.
The new robot comes equipped with features designed to put less of a load on servers while still collecting the maximum number of new pages. The new features include support for HTTP compression and conditional GET.
These features are great because they mean that the robot will waste less time looking at pages it already has and more time looking for new pages. This should result in more pages of a site being indexed, particularly on large sites, and in less time spent crawling around websites using up bandwidth and putting extra pressure on servers.
This is good news for webmasters who have complained that MSN’s robot was taking up too much bandwidth while crawling their sites, but the technology is not new to web hosts. The other search engines have been employing these methods for some time to reduce the load they place on web servers. Yahoo has supported both of these technologies since 2005. Ask supports HTTP compression, and Google already supports conditional GET and other methods of reducing bandwidth. This just shows that MSN is still playing catch-up with the other major search engines. Still, it’s welcome news, and MSN has been kind enough to provide a tool for checking whether your server settings support HTTP compression and conditional GET. For more information, check out the Live Search blog, and put out the welcome mat for msnbot/1.1.
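To make the two techniques concrete, here is a minimal Python sketch of how a server might answer a crawler that supports them. It is an illustration only, not MSN's implementation or the checking tool mentioned above; the function name build_response and its parameters are hypothetical, and real web servers normally handle both features through configuration rather than custom code.

```python
import gzip
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def build_response(request_headers, body, last_modified):
    """Return (status, headers, payload) for a GET of a single page.

    request_headers: dict of header name -> value sent by the crawler
    body:            the page content as bytes
    last_modified:   UTC datetime (whole seconds) of the page's last change
    All three names are hypothetical and exist only for this sketch.
    """
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}

    # Conditional GET: if the crawler's copy is still current, answer
    # 304 Not Modified and send no body at all.
    since = request_headers.get("If-Modified-Since")
    if since:
        try:
            if last_modified <= parsedate_to_datetime(since):
                return 304, headers, b""
        except (TypeError, ValueError):
            pass  # unusable date: fall through to a full response

    # HTTP compression: gzip the body when the crawler says it accepts gzip.
    if "gzip" in request_headers.get("Accept-Encoding", "").lower():
        headers["Content-Encoding"] = "gzip"
        body = gzip.compress(body)

    return 200, headers, body
```

A crawler that sends If-Modified-Since with the date of its stored copy gets an empty 304 reply when nothing has changed, and one that sends Accept-Encoding: gzip gets a smaller payload for pages that have. Both outcomes cut the bandwidth a crawl consumes.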
Resolving the canonicalization issue can be a big headache because not all webmasters have the level of server administration access needed to control which version of their domain name (typically with or without the www prefix) is displayed to the world. However, while the optimal way to ensure that all search engines see only one version of your domain is to redirect the non-canonical version to the canonical one, it is not the only way.
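For readers who do have that level of control, the redirect approach can be sketched in a few lines of Python. This is an illustration under stated assumptions: it uses a permanent (301) redirect, which the text above does not specify, the function name canonical_redirect and the host www.example.com are hypothetical, and in practice the rule usually lives in the web server's configuration rather than in application code.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_redirect(url, canonical_host):
    """Return (301, target_url) if url is on a non-canonical host, else None.

    canonical_host is the preferred version of the domain, for example
    "www.example.com" (a placeholder). Any port in the original URL is
    dropped, which is a simplification made for this sketch.
    """
    parts = urlsplit(url)
    if parts.hostname is None or parts.hostname == canonical_host.lower():
        return None  # already canonical, or not an absolute URL

    # Keep the path and query so each page redirects to its own twin,
    # not just to the home page.
    target = urlunsplit(
        (parts.scheme, canonical_host, parts.path or "/", parts.query, parts.fragment)
    )
    return 301, target
```

Redirecting page to page, rather than sending everything to the home page, is the design choice that lets links pointing at the wrong version still count toward the matching page on the preferred one.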
First, the canonicalization issue is only an issue if search engines find your site under the “wrong” version. If search engine indexes are only showing your site under your preferred version, then it is less of a problem.
If you want to find out if your site has a problem, type site:yourdomain.com into the query box of each search engine.
If all the listings are for the same version of your domain, canonicalization is not yet a pressing issue for you.
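If you have copied the listing URLs out of a site: query by hand, the check described above can be sketched in Python as follows. The function names and the sample URLs are hypothetical, and the sketch assumes each URL includes its scheme (http://); it does not query any search engine itself.

```python
from collections import Counter
from urllib.parse import urlsplit


def hosts_in_listings(urls):
    """Count how many listed URLs fall under each version of the domain."""
    return Counter(urlsplit(u).hostname for u in urls)


def has_canonicalization_problem(urls):
    """True when the listings span more than one version of the domain."""
    return len(hosts_in_listings(urls)) > 1


# Hypothetical listings copied from a site: query
listings = [
    "http://www.example.com/",
    "http://www.example.com/about.html",
    "http://example.com/contact.html",
]
print(hosts_in_listings(listings))
print(has_canonicalization_problem(listings))
```

Here two listings sit under www.example.com and one under example.com, so the index already holds pages from both versions of the domain.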
However, this does not mean you are in the clear. Depending on how your site is built, all it would take is one other site to link to you using the non-canonical version and your site could end up with pages in search engine indexes from both versions of your domain. This could result in a canonicalization issue whereby some pages don’t receive the link value that they deserve and that can affect rankings.
To prevent this with Google, just register with Webmaster Tools and set a preferred domain. Other search engines do not have this option, so if you’re worried about how your results may be displayed in Yahoo or Live.com, a more universal fix may be in order.
What if you do have a problem? If your site is already showing pages from both versions of the domain in the index, using Google’s preferred domain tool is not the way to go. It can actually cause pages of your site to fall out of the index and this is definitely not a good solution. Poorly linked pages are better than no pages at all!
To make this a more universal fix, ensure that all the link references on your site are absolute rather than relative. What this means is that instead of linking to internal pages using only a path, you format all anchor tag link references to include the complete domain name.
This way, even if a spider does find one of your pages under the wrong version, it will not proceed to crawl the rest of the site under the wrong domain.
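As a rough illustration of the change, the Python sketch below rewrites the href of every anchor tag in a page so that it includes the complete domain name. The function name absolutize_links and the example.com URLs are hypothetical, and the regular expression is a simplification that only handles quoted href values; a production tool would more likely use a real HTML parser.

```python
import re
from urllib.parse import urljoin

# Matches the href attribute of an anchor tag: group 1 is everything up to
# the opening quote, group 2 the quote character, group 3 the link itself.
HREF_PATTERN = re.compile(
    r'(<a\b[^>]*?\bhref\s*=\s*)(["\'])(.*?)\2', re.IGNORECASE | re.DOTALL
)


def absolutize_links(html, page_url):
    """Rewrite relative anchor hrefs as absolute URLs.

    page_url is the address of the page under the preferred domain, for
    example "http://www.example.com/products/index.html" (a placeholder).
    Links that are already absolute are left untouched.
    """
    def replace(match):
        prefix, quote, href = match.groups()
        return prefix + quote + urljoin(page_url, href) + quote

    return HREF_PATTERN.sub(replace, html)


page = '<a href="/about.html">About us</a> <a href="widgets.html">Widgets</a>'
print(absolutize_links(page, "http://www.example.com/products/index.html"))
```

Both links in the sample come out pointing at www.example.com, so a spider that arrives through the other version of the domain is steered back to the preferred one by the very next link it follows.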
This last solution will work to resolve the canonicalization issue even if the search engines have already found your site under both versions, although not as quickly. By changing all link references on the site to refer to the absolute version of the preferred domain, eventually search engines will also prefer the pages from that preferred domain because the pages of the non-preferred version will show fewer in-bound links. Once these non-preferred pages have been replaced in the indexes, you can clinch the deal by setting your preferred domain with Google’s Webmaster Tools.
Search engine optimization to many people means optimizing for Google, but there are good reasons why this is a shortsighted approach. For one thing, search engine algorithms are always changing, and because each search engine uses different criteria, a site that does well on all three major search engines (Google, Yahoo and Live Search) is naturally more resilient to changes made by any one of them. For another thing, not all searchers are as savvy as you might think. There are lots of internet users out there who only know how to search with the buttons that came with their browser, and if that browser is Internet Explorer, they’re probably searching on MSN.
The engines can also differ sharply on the same query. A search for [presidential election] gives roughly the same results on Google and Live Search and radically different results on Yahoo. In addition, both Google and Live show big changes depending on whether you search for election or elections, while Yahoo’s results are largely unchanged. Results like this may cause a poor website owner to wonder if it is possible to rank well for any keywords in all three search engines, but we can attest that it can be done. We have seen some of our clients do it by providing good quality sites with lots of relevant content. Relying only on Google for traffic can leave you high and dry when the algorithm changes, as we saw it do just a few short weeks ago.
Now that Live.com’s new Live Search has been up and running for a while, webmasters might want to bone up on Live Search’s site optimization recommendations. Here are a few we found while looking at Live’s webmaster help.