OdeToCode IC Logo

The Passion and the Fury of URL Rewriting

Thursday, September 23, 2004

Any problem in computer science can be solved with another layer of indirection. – Butler Lampson

Since the dawn of the HttpModule class in .NET 1.0, we’ve seen an explosion of URL rewriting in ASP.NET applications. Before HttpModule, URL rewriting required an ISAPI filter written in C++, but HttpModule brought the technique to the masses.

URL rewriting allows an application to target a different internal resource then the one specified in the incoming URI. While the browser may say /vroot/foo/bar.aspx in the address bar, your application may process the request with the /vroot/bat.aspx page. The foo directory and the bar.aspx page may not even exist. Most of the directories and pages you see in the address bar when viewing a .Text blog do not physically exist.

From an implementation perspective, URL rewriting seems simple:

  1. Intercept the request early in the processing cycle (by hooking the BeginRequest event, for example).
  2. Use Context.RewritePath to point the runtime to a different resource.

If you want a comprehensive article about implementing URL rewriting and why, see Scott Mitchell’s article: “URL Rewriting in ASP.NET”, although I believe there is a simpler solution to the post-back problem (see the end of this post). I’m still looking for an ASP.NET topic on which Scott Mitchell has not written an article. If anyone finds one, let me know. Scott’s a great writer.

Conceptually, URL rewriting is a layer of indirection at a critical stage in the processing pipeline. It allows us to separate the view of our web application from it’s physical implementation. We can make one ASP.NET web site appear as 10 distinct sites. We can make one ASPX web form appear as 10,000 forms. We can make perception appear completely different from reality.

URL rewriting is a virtualization technique.

URL rewriting does have some drawbacks. Some applications take a little longer to figure out when they use URL rewriting. When the browser says an error occurred on /vroot/foo/bar.aspx, and there is no file with that name in solution explorer, it’s a bit disconcerting. Like most programs driven by meta-data, more time might be spent in an configuration file or database table learning about what the application is doing instead of in the code. This is not always intuitive, and for maintenance programmers can lead to the behavior seen in Figure 1.

Figure 1: Maintenance programmer who has been debugging and chained to a desk for months, finally snaps. It's the last time that machine will do any URL rewriting.

Actually, I’m not sure what the behavior in figure 1 is. It might be a Rob Zombie tribute band in Kiss makeup, or some neo-Luddite ritual, but either way it is too scary to look at and analyze.

One important detail about URL rewriting that is often left out. In the destination page, during the Page_Load event, it’s important to use RewritePath again to point to the original URL if your form will POST back to the server. This will not redirect the request again, but it will let ASP.NET properly set the action tag of the page’s form tag to the original URL, which is where you want to post back to.

Context.RewritePath(Path.GetFileName(Request.RawUrl));

Without this line, the illusion shatters, and people rant. I learned this trick from the Community Starter Kit, which is a great example of the virtualization you can achieve with URL rewriting.