Page Name Canonicalization and the htaccess File

Lee Zoumas - January 21, 2010

Sometimes the same page on a website gets indexed under more than one URL, which can create a duplicate content issue and, potentially, a penalty. The best example of this is a website’s default or home page:

http://www.domain.com/
http://www.domain.com/index.htm

Although both of these URLs resolve to the same page, search engines could index both of them, or possibly one and not the other. Situations like this are not limited to the home page, however. Most websites also have default filenames in the URLs of subdirectories, like this:

http://www.domain.com/about/
http://www.domain.com/about/index.htm

Both forms cause the same issues as on the home page. Additionally, the site's internal links may point to the page both with and without the index.htm filename. To prevent the default page from being accessed by its filename, we can add the following rule to our .htaccess file (assuming mod_rewrite is available and RewriteEngine is already turned on):

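# Redirect /path/index.htm (and similar default filenames) to /path/; assumes this .htaccess sits in the document root
RewriteRule ^(.*/)?(index|home|default)\.(html|asp|aspx|htm|php)$ /$1 [NC,R=301,L]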

With this rule in place, any request for a page by its default filename (for example, index.htm) receives a 301 redirect to the same URL without the filename. This keeps the default filename out of the URL and ensures that search engines index only one version of the page.
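
Before deploying a rule like this, it can help to check which request paths it would actually redirect. The following is a minimal Python sketch of the matching logic only, not of Apache itself. The function name redirect_target is hypothetical. The pattern is a slightly stricter variant of the one in the rule: the dot is escaped and the filename must start a path segment, so a file such as myindex.htm is left alone.

```python
import re

# Pattern modeled on the rewrite rule; re.IGNORECASE plays the role of the NC flag.
# Assumption: the default filename must be a whole path segment (e.g. /about/index.htm).
DEFAULT_PAGE = re.compile(
    r"^(.*/)?(index|home|default)\.(html|asp|aspx|htm|php)$",
    re.IGNORECASE,
)


def redirect_target(path):
    """Return the path a 301 redirect would point to, or None if the rule does not apply."""
    match = DEFAULT_PAGE.match(path)
    if match is None:
        return None
    # group(1) is the directory part, including its trailing slash; fall back to the root.
    return match.group(1) or "/"


if __name__ == "__main__":
    for path in ["/index.htm", "/about/index.htm", "/about/team.htm", "/myindex.htm"]:
        print(path, "->", redirect_target(path))
```

Paths that return None would be served as usual; the others would be redirected to the returned directory URL.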

© 2023 MoreVisibility. All rights reserved.