The Passion and the Fury of URL Rewriting

Thursday, September 23, 2004

Any problem in computer science can be solved with another layer of indirection. – Butler Lampson

Since the dawn of the IHttpModule interface in .NET 1.0, we’ve seen an explosion of URL rewriting in ASP.NET applications. Before HTTP modules, URL rewriting required an ISAPI filter written in C++, but HTTP modules brought the technique to the masses.

URL rewriting allows an application to target a different internal resource than the one specified in the incoming URI. While the browser may say /vroot/foo/bar.aspx in the address bar, your application may process the request with the /vroot/bat.aspx page. The foo directory and the bar.aspx page may not even exist. Most of the directories and pages you see in the address bar when viewing a .Text blog do not physically exist.

From an implementation perspective, URL rewriting seems simple:

  1. Intercept the request early in the processing cycle (by hooking the BeginRequest event, for example).
  2. Use Context.RewritePath to point the runtime to a different resource.
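
The two steps above can be sketched as a small HTTP module. This is only an illustration, not code from any particular application: the class name SimpleRewriteModule is hypothetical, and the hard-coded paths are the example URLs from this post, where a real application would more likely look the mapping up in configuration or a database.

```csharp
using System;
using System.Web;

// Hypothetical module: maps the example URL /vroot/foo/bar.aspx
// onto the page that really exists, /vroot/bat.aspx.
public class SimpleRewriteModule : IHttpModule
{
    public void Init(HttpApplication application)
    {
        // Step 1: intercept the request early in the processing cycle.
        application.BeginRequest += new EventHandler(OnBeginRequest);
    }

    private void OnBeginRequest(object sender, EventArgs e)
    {
        HttpApplication application = (HttpApplication)sender;
        HttpContext context = application.Context;

        // Case-insensitive comparison against the virtual URL (assumed mapping).
        if (string.Compare(context.Request.Path, "/vroot/foo/bar.aspx", true) == 0)
        {
            // Step 2: point the runtime at a different resource.
            context.RewritePath("/vroot/bat.aspx");
        }
    }

    public void Dispose()
    {
        // Nothing to clean up in this sketch.
    }
}
```

A module like this only runs once it is registered in the httpModules section of web.config, and the request has to reach ASP.NET in the first place, which is why the example uses an .aspx extension.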

If you want a comprehensive article about how and why to implement URL rewriting, see Scott Mitchell’s article: “URL Rewriting in ASP.NET”, although I believe there is a simpler solution to the post-back problem (see the end of this post). I’m still looking for an ASP.NET topic on which Scott Mitchell has not written an article. If anyone finds one, let me know. Scott’s a great writer.

Conceptually, URL rewriting is a layer of indirection at a critical stage in the processing pipeline. It allows us to separate the view of our web application from its physical implementation. We can make one ASP.NET web site appear as 10 distinct sites. We can make one ASPX web form appear as 10,000 forms. We can make perception appear completely different from reality.

URL rewriting is a virtualization technique.

URL rewriting does have some drawbacks. Some applications take a little longer to figure out when they use URL rewriting. When the browser says an error occurred on /vroot/foo/bar.aspx, and there is no file with that name in Solution Explorer, it’s a bit disconcerting. As with most programs driven by meta-data, you may spend more time in a configuration file or database table than in the code when learning what the application is doing. This is not always intuitive, and for maintenance programmers can lead to the behavior seen in Figure 1.

Figure 1: Maintenance programmer who has been debugging and chained to a desk for months, finally snaps. It's the last time that machine will do any URL rewriting.

Actually, I’m not sure what the behavior in Figure 1 is. It might be a Rob Zombie tribute band in Kiss makeup, or some neo-Luddite ritual, but either way it is too scary to look at and analyze.

One important detail about URL rewriting is often left out. In the destination page, during the Page_Load event, it’s important to use RewritePath again to point to the original URL if your form will POST back to the server. This will not redirect the request again, but it will let ASP.NET properly set the action attribute of the page’s form tag to the original URL, which is where you want to post back to.

Context.RewritePath(Path.GetFileName(Request.RawUrl));

Without this line, the illusion shatters, and people rant. I learned this trick from the Community Starter Kit, which is a great example of the virtualization you can achieve with URL rewriting.
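
For context, here is a minimal sketch of where the RewritePath call above might live in the destination page's code-behind. The class name Bat is hypothetical (it stands in for the bat.aspx page from the earlier example), and overriding OnLoad is an assumption made to keep the sketch independent of how Page_Load is wired up in a given project. The call itself is the one shown above.

```csharp
using System;
using System.Web.UI;

// Hypothetical code-behind for the page that actually serves the request.
public class Bat : Page
{
    protected override void OnLoad(EventArgs e)
    {
        // Raises the Load event, so any Page_Load handler still runs.
        base.OnLoad(e);

        // Point the path back at the URL the browser originally asked for,
        // so the rendered form posts back there instead of to bat.aspx.
        Context.RewritePath(System.IO.Path.GetFileName(Request.RawUrl));
    }
}
```

Because the page has already been selected by this point, the second rewrite does not change which page handles the request. It only changes the path ASP.NET sees when it renders the form.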


Comments
Jon Galloway Friday, September 24, 2004
Great tip! We[1] were using javascript and fixing this on the client side, but this is much cleaner.

Thank you!

We can now shave our scruffy hair and beard and remove the manacles and stop bashing things with sledge hammers. And we owe it all to you.

[1] A corporate "we", not the royal "we".
Jeff Atwood Wednesday, December 22, 2004
Good tip. I'm glad you cross linked the original rant, otherwise I'd be stuck with that solution.

I still like using IIS to remap 404 handlers to start this process. As long as you're rewriting only non-asp handled filetypes such as http://mywebsite/browsestuff/ --> http://mywebsite/browse.aspx?stuffcategory=3 , a Server.Transfer is then effectively a rewrite.

It just gets annoying when .aspx files are thrown into the remapping mix.
Jeff Atwood Wednesday, December 22, 2004
Erm.. I'd be very careful with this "fix". If it works for you, great, but I am getting HIGHLY erratic results from Context.RewritePath -- either overloaded version. I cannot get it to behave sensibly at all, even when I hard-code strings in there. :P

I think it's because I am doing Server.Transfer from "404.aspx?404;http://testsite.net:80/webform1.asp".

Bah. Why can't we just have a blank action attribute for postback target (eg, let the browser default to the current page all by itself?)
Malaga Sunday, March 27, 2005
Greetings from Malaga (Spain). Antonio :-)
Scott Sunday, March 27, 2005
And a cheery hello to you, Antonio!
Eduardo Friday, July 22, 2005
Thank you for the excellent advice, this has helped immensely!

The Misfits rule!
Ben Friday, October 14, 2005
This article got me the closest to solving my problem however I still have one...

If I load a URL http://yadda/form.aspx?QS=1

The resulting form.aspx's form's action will be equal to "form.aspx?QS=1".

I need to remove "?QS=1" because if this form then posts, the querystring gets sent too.

Using Context.RewritePath() I cannot seem to change the value (I place this in the PageLoad of form.aspx) ... Incidentally if I call Context.RewritePath("form.aspx") the resulting action is equal to "?QS=1" which I thought was really odd!

Hope someone can shed some light!

cheers
ben
David Tuesday, November 15, 2005
Thank you for the tip from Barcelona.
Frances Thursday, January 19, 2006
I used this for one site and it worked perfectly. The url rewriting happened in global.asax.

Now I'm doing another site and I have the same or similar problem as Ben above. The url is rewritten from http://www.template.aspx?id=x to http://www.filename_x.aspx. But the querystring remains in the action attribute of the form - http://www.filename_x.aspx?id=x

No idea why so I'm going to try Scott Mitchell's actionless form unless anyone has any bright ideas (doesn't look like it).
Will Asrari Thursday, February 23, 2006
I've successfully created a URL rewriting application via Global.asax. It's very straight-forward and includes Regular Expression support.

It works on shared hosting as well so you don't have to deal with IIS configuration.

www.willasrari.com/.../00043.aspx
Will Asrari Saturday, April 1, 2006
I created a new rewriting solution (again using Global.asax) that doesn't require a file extension. Instead it uses post slugs as unique identifiers.

www.willasrari.com/.../000109.aspx

Full source and sample database included!
Richard Lavey Wednesday, May 24, 2006
I've found a way of stripping the querystring params from the form action for postbacks. In OnLoadComplete() do :

Context.RewritePath(Request.FilePath + "?");

The ? causes the current querystring to be dropped. Rewriting in OnLoadComplete ensures that the Request.QueryString collection is populated properly.
Will Thursday, June 8, 2006
Thanks Richard, overriding OnLoadComplete() worked perfectly... and that, combined with the original suggestion, was much simpler than any of the other workarounds for this problem that I've seen.
AndrewM Tuesday, July 11, 2006
I am trying the last stuff posted here by Richard. I am being a bit stupid but is there a way to get this working for v1.x?
Comments are now closed.
by K. Scott Allen