ASP.NET is notorious for generating ugly HTML markup. The idea behind search engine readability is to get the relevant content as close to the top of the HTML source as possible.
One of the biggest culprits in terms of unnecessary code at the top of the page is ASP.NET’s VIEWSTATE. VIEWSTATE is ASP.NET’s way of remembering the state of a page and its controls between postbacks, which saves the programmer from having to track that information by hand. It does this via a hidden form field. The best way to minimize the amount of VIEWSTATE text that appears in the HTML source code is to disable it (it’s enabled by default) and only enable it where you need it. Another technique for dealing with VIEWSTATE is to move it to the bottom of the HTML source code. A while back I wrote a blog post on how to achieve this.
A technology that ASP.NET employs that “makes things easier for the developer” and offers a rich end-user experience is the AJAX Control Toolkit. The AJAX Control Toolkit lets you quickly implement things such as modal dialogs (e.g., Lightbox), form field enhancements and partial page updates. The problem with using it is that it also adds an enormous amount of code to the HTML output. The best way to minimize this is to use an alternative to the Toolkit such as jQuery or MooTools. None of the code gets auto-generated in these libraries and the end result is usually nice clean HTML source code.
If you follow the above suggestions, you can greatly reduce the amount of unnecessary code in your HTML output. One of the key points to remember is that anything ASP.NET auto-generates, though great for you as a developer, probably isn’t that good for SEO.
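As a minimal sketch of the first suggestion above, view state can be switched off by default for the whole site in web.config. This assumes a standard Web Forms site whose pages do not otherwise override the setting:

```xml
<configuration>
  <system.web>
    <!-- Turn view state off by default for every page in the site. -->
    <pages enableViewState="false" />
  </system.web>
</configuration>
```

A page that genuinely needs view state can then opt back in by setting EnableViewState="true" in its own @ Page directive.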
Best web development and SEO practices dictate that a request for any web page that does not exist should return an HTTP response code of 404, or Not Found. Basically, this response code means that the URL that you’re requesting does not exist. It could have existed in the past, may in fact exist in the future, but definitely does not exist right now. More often than not, websites will issue the generic 404 page (which I’m sure you’ve all seen many times) when the requested resource cannot be found. While this tells you the page that you’re looking for does not exist, it also takes you outside of the site’s design and navigation structure, which can be quite annoying. To combat this annoyance, web developers can create a custom 404 error page, which issues the correct 404 response code, shows custom error handling text, and keeps the browser within the site’s design and navigation.
Fortunately, ASP.NET provides several ways to issue a custom 404 error page. For this to work properly all of these procedures should be in place simultaneously or unanticipated results may occur.
The first step in creating a custom 404 error page is to actually create the HTML for the page. How to do this is outside the scope of this post, but basically the page should mimic the site’s design and navigation and present the user with a description of the error.
The next step is to create an entry in the website’s web.config file that points to the page you created when a 404 error occurs:
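A minimal sketch of such an entry might look like the following. It assumes the custom page is saved at /errorpages/404.html, the path used in the function further down; adjust it to wherever your own error page lives:

```xml
<configuration>
  <system.web>
    <!-- Send requests for missing ASP.NET pages to the custom 404 page. -->
    <customErrors mode="On">
      <error statusCode="404" redirect="/errorpages/404.html" />
    </customErrors>
  </system.web>
</configuration>
```

Setting mode to "RemoteOnly" instead shows the custom page to remote visitors only, which can be handy while debugging on the server itself. Note that by default this setting redirects the browser to the error page, so it is worth checking which status code the visitor actually receives.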
Now when an ASP.NET page is requested that does not exist, the user will be presented with the custom error page you created. An important and often overlooked factor is that this only covers ASP.NET pages because (unless you do a lot of finagling) the web.config file only applies to requests that ASP.NET handles. In my experience, this is where most web developers leave off with their custom error handling, which, quite frankly, just isn’t good enough.
OK, so what about the pages in your website that aren’t ASP.NET pages, such as images, PDFs and HTML pages? To get those to work, you need to edit the Custom Errors tab in the Website properties in IIS. If you are using a shared hosting environment, there is usually a Custom Error section in the control panel. Basically what you want to do here is create an entry for a 404 error. In Windows, you need to make sure it is of type ‘URL’ and provide the URL to the page. This will cover all other non-ASP.NET pages on your site.
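If the site runs on IIS 7 or later (an assumption; the Custom Errors tab described above belongs to older versions of IIS Manager), a roughly equivalent entry can be sketched in web.config instead. The path is again the illustrative /errorpages/404.html:

```xml
<configuration>
  <system.webServer>
    <httpErrors errorMode="Custom">
      <!-- Replace the built-in 404 entry with the custom page. -->
      <remove statusCode="404" subStatusCode="-1" />
      <error statusCode="404" path="/errorpages/404.html" responseMode="ExecuteURL" />
    </httpErrors>
  </system.webServer>
</configuration>
```

Some shared hosts lock this section at the server level, in which case the control panel route described above is the way to go.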
There’s still one more thing you need to do to really make sure you’re handling all your 404 errors properly. Sometimes an ASP.NET page will in fact exist, but critical, dynamic information is not supplied for it to display properly. The information could be a missing query string variable or perhaps the server is requesting a row from the database that has been deleted. Rather than show the page with the wrong information (a topic that requires another blog post altogether) here is what you need to do:
Create the following function; this one happens to be in VB.NET:
Public Sub Throw404Error()
    ' Send the 404 status code along with the custom error page.
    Response.StatusCode = 404
    Server.Transfer("/errorpages/404.html")
    Response.End()
End Sub
Now anytime you are relying on dynamic information to properly display a page, you want to call the above function if that data is not present:
If PageHasNecessaryData() Then
    'Display Page
Else
    Throw404Error()
End If
If you implement the procedures mentioned above into your ASP.NET website, then you can be confident that you are covering all of the necessary aspects of proper 404 error handling. In a future post, I will explain why failing to implement the last procedure can cause all kinds of negative SEO implications.
A common SEO technique is to make sure the content of your web page is placed as close to the top of your HTML as possible. This will help ensure that your relevant content is given higher priority by search engine spiders. ASP.NET provides a great framework for developing feature-rich web applications, especially with the addition of view state. View state gives web forms the ability to persist changes across postbacks. Other web scripting languages are not able to accomplish this easily. However, this benefit may have some negative SEO implications.
The view state of a page is placed by default in a hidden form field named __VIEWSTATE at the top of the HTML source code. The __VIEWSTATE form field contains serialized information about various controls on the web page, and it can get very large (tens of kilobytes). When a web page does not have a lot of controls using view state, the hidden form field will look something like this…
… which is probably fine at the top of the page. Often, however, a web page has numerous controls and, no matter how much it is optimized to minimize view state, produces a view state value that looks something like this…
… actually it could go on and on. This particular view state sample (this is just a small portion) was 10 pages long! Needless to say, you don’t want that to appear before your precious web page content.
There is an easy way to move the __VIEWSTATE form field to the bottom of the html source code. By pasting the following VB.NET code, “as is”, into your web form, the view state will be moved to the bottom of the html source code right above the closing </form> tag…
Protected Overrides Sub Render(ByVal writer As System.Web.UI.HtmlTextWriter)
    ' Render the page into a string so the markup can be rearranged.
    Dim stringWriter As System.IO.StringWriter = New System.IO.StringWriter
    Dim htmlWriter As HtmlTextWriter = New HtmlTextWriter(stringWriter)
    MyBase.Render(htmlWriter)
    Dim html As String = stringWriter.ToString()
    ' Locate the hidden __VIEWSTATE field and cut it out of the markup.
    Dim StartPoint As Integer = html.IndexOf("<input type=""hidden"" name=""__VIEWSTATE""")
    If StartPoint >= 0 Then
        Dim EndPoint As Integer = html.IndexOf("/>", StartPoint) + 2
        Dim viewstateInput As String = html.Substring(StartPoint, EndPoint - StartPoint)
        html = html.Remove(StartPoint, EndPoint - StartPoint)
        ' Re-insert the field just before the closing form tag.
        Dim FormEndStart As Integer = html.IndexOf("</form>") - 1
        If FormEndStart >= 0 Then
            html = html.Insert(FormEndStart, viewstateInput)
        End If
    End If
    writer.Write(html)
End Sub
Now when you browse your web page, the __VIEWSTATE hidden form field and its ridiculously long value will be at the bottom of the page, and your precious content will be closer to the top, just how the search engines like it.