Argument Validation

Sunday, February 6, 2005 by scott

I’m still poking around the common validation areas of Enterprise Library.

The ArgumentValidation class has 6 public static methods:

public static void CheckForEmptyString(string variable, string variableName) ...
public static void CheckForNullReference(object variable, string variableName) ...
public static void CheckForInvalidNullNameReference(string name, string) ...
public static void CheckForZeroBytes(byte[] bytes, string variableName) ...
public static void CheckExpectedType(object variable, Type type) ...
public static void CheckEnumeration(Type enumType, object variable, string) ...

CheckForEmpty string has the following implementation:

   21 public static void CheckForEmptyString(string variable, string variableName)
   22 {
   23    CheckForNullReference(variable, variableName);
   24    CheckForNullReference(variableName, "variableName");
   25    if (variable.Length == 0)
   26    {
   27         throw new ArgumentException(SR.ExceptionEmptyString(variableName));
   28    }
   29 }

I like this approach to validation. It feels much cleaner to start a method off with a handful of static method calls to validate arguments compared to starting with a handful of conditional blocks sprinkled with throw statements. The checks are explicit and readable here.

I know some people go nuts defining custom exception hierarchies – I’ve never been a big fan of that approach. The EntLib uses exception classes from the BCL, and keeps the error messages in a resource file and ready for localization. Clean. Flexible.

Most checks are in the form of if (variable == null), although there is at least one place where the code looks like: if (null == variable). I understand the reasons why we did this in C++: it was easy to mix up the assignment operator with the equality operator, but I never cared for expressions in this form and never subscribed to the practice. The C# compilers are better at warning about this subtle error, so I don't plan on ever switching either.

Next up: unit tests in EntLib. I might be looking these over during the trek to VSLive! tomorrow.

I've Been Waiting

Saturday, February 5, 2005 by scott

I’m not right all of the time – just ask anyone I work with! I’ve been waiting for someone to take me to task over something I’ve written badly, and this evening I was introduced to Dinis Cruiz in his post Some Comments to Misleading and False Information…”.

Dinis makes a great point – an application domain is not a secure boundary if code is running with full trust. Unfortunately, it seems most ASP.NET shared hosting providers are running ASP.NET code with full trust. I’ve updated my article on application domains to make this point stand out.

Error Handling

Friday, February 4, 2005 by scott

When looking at the code for a product or library, I find I’m always drawn to look at the error handling and validation pieces. Since every piece of software has these pieces in one form or another it’s always interesting to compare and contrast the different practices and methodologies used throughout the industry. The newly released Enterprise Library, being from the patterns and practices group, obviously had some thought put into this area, so I was keenly interested to peek at the code.

Here is one of the constructors from the Enterprise Library’s ConfigurationBuilder class:

   66 public ConfigurationBuilder(string configurationFile) 
   67 {
   68    ArgumentValidation.CheckForNullReference(configurationFile, "configurationFile");
   69    ArgumentValidation.CheckForEmptyString(configurationFile, "configurationFile");
   70 
   71    LoadMetaConfiguration(configurationFile);
   72 }

The code snippet is short, but tells a tale. It tells me I can expect the Enterprise Library to validate explicitly all the parameters I pass.

The code snippet also tells me the team made a decision to let constructors throw exceptions. Deciding to let exceptions escape from a constructor is not an easy decision, particularly in this case because the ConfigurationBuilder class inherits from IDisposable. If the constructor acquires a resource (say, opens a configuration file) and then throws - there is no object reference returned by the constructor for the client to invoke Dispose upon.

On the other hand, covering this edge condition always increases the complexity of the internal implementation, and can also make the public interface harder to use (for instance if the client has to invoke both a constructor and an Initialize method).

It's a tough tradeoff - 'simple elegance' or 'paranoid error handling'. Which do you prefer?

Global.asax or an HttpModule?

Thursday, February 3, 2005 by scott

We know HttpApplication events like BeginRequest, AuthorizeRequest, and Error can be extremely useful for implementing functionality like URL re-writing and custom authentication schemes. We can choose to catch these events in one of two places: inside of the methods provided by Global.asax, or inside a custom HttpModule.

I favor the global.asax approach only if the logic inside the events is heavily application specific. In general, an HttpModule is the best approach for a number of reasons.

Firstly, I don’t want global.asax to become a dumping ground for code snippets that need to execute during the request lifetime. Using one or more HttpModules allows me to cleanly factor out responsibilities into distinct classes. A clean design also allows me to place the HttpModule in a class library for reuse across multiple projects.

Secondly, HttpModules are a bit more flexible since I can add them and remove them at runtime with a modification to web.config (or machine.config). A testament to the flexibility and power of an HttpModule is the ValidatePathModule released by Microsoft last year to avoid canonicalization bugs in ASP.NET (note to self: make sure the spell checker doesn’t replace canonicalization with cannibalization). Looking far down the road to the Longhorn timeframe, HttpModules will have even more flexibility in the componentized .NET-aware IIS 7.0. More details on new features in IIS 7.0 next week…

A Brief History Of Electronic Reporting

Tuesday, February 1, 2005 by scott

It’s difficult to say there was a beginning, really. Let’s just say it started from a singularity.

There was no space.

No time.

No reports.

In 1961, IBM put forth the first official release of the Report Program Generator (RPG) language. The universe of reporting expanded rapidly. Reams of paper began to spill onto the desks of corporate middle managers across the globe. Neatly formatted tables of numbers exploded in every direction – driven by the deafening cadence of teletype printers.

March of 1962 is the first known use of the following phrase: “I swear, if the finance people request another report format change, I’m going to roll the paper into a ball and set it on fire in their office.” This oath continues to echo in the halls of business today.

Will it ever end?

Reporting Services From The Command Line

Monday, January 31, 2005 by scott

This past week I managed to:

Shovel snow
Catch a cold
Sleep for 16 hours on a Saturday
Model the process of cross matching blood specimens to blood units for transfusion with a database schema.
Meet various other work related hard deadlines…

All of this fun activity didn’t leave much time for:

Blogging

This evening I finally got to jump back into a Reporting Services frame of mind in preparation for San Fran next week. Reporting Services ships with a utility to run “scripts” from the command line: the rs (rs.exe) utility. Most Microsoft server applications expose a COM object model that admins can script against using VBScript, but Reporting Services exposes only a web service API. The “scripts” that rs.exe executes are actually little hunks of VB.NET code that invoke methods on the web service API.

Reporting Services only ships two sample scripts. One script cancels running jobs on the server (CancelRunningJobs.rss), the second script (PublishSampleReports.rss) demonstrates how to deploy reports to the report server (yes, the default extension is unfortunately “rss”). A really short script might look like the following:

Sub Main() 
  Dim permissions As String() 
  permissions = rs.GetSystemPermissions() 

  For Each permission As String in permissions 
    Console.WriteLine(permission) 
  Next 
End Sub

I can save the above snippet in a text file and execute the instructions from the command line like so:

rs -i scriptname.rss -s http://localhost/reportserver

The rs utility compiles the above snippet (without option strict, so there is no need to declare permissions with a specific type) and executes the code. Obviously, there is some additional infrastructure in place to let the variable rs invoke web service methods. A little poking around with reflector reveals the utility sets up our code with imports for the System, System.Web, System.WebServices, System.WebServices.Protocols, System.IO, and Microsoft.SqlServer.ReportingServices namespaces. The utility wraps the code chunk shown above inside of a class declaration (____ScriptClass) and adds a member variable (rs) defined as type ReportingService.

What’s also interesting is an embedded resource in rs.exe by the name of StartupClass.vb. The StartupClass, stripped down to essentials (I removed the code for exception handling and for batching commands in the script.), looks like the following:

Public Module MainModule 

  Public Sub Main(ByVal args as string()) 

    Dim rs as new WebServiceWrapper(args(0), args(1), args(2), args(3)) 
    Dim clientScript as new ____ScriptClass() 
    clientScript.rs = rs 
    clientScript.Main() 

  End Sub 

End Module

What intrigues me about rs.exe is how easy it is to pull off this “scripting” technique in .NET. Think about what it would take for an application to pick up a hunk of C++ source code from the disk and just execute it in-process. It is possible, of course, but fraught with peril.

The tricks you can pull off safely in .NET just never cease to amaze me.

Funny Numbers In My Stack Trace

Tuesday, January 25, 2005 by scott

Someone in the newsgroups recently asked what the numbers mean in a release mode stack trace. For instance, what can you do with the following information?

[NullReferenceException: Object reference not set to an instance of an object.]
   aspnet.debugging.BadForm.Page_Load(Object sender, EventArgs e) +34
   System.Web.UI.Control.OnLoad(EventArgs e) +67
   System.Web.UI.Control.LoadRecursive() +35
   System.Web.UI.Page.ProcessRequestMain() +720

The stack trace gives us a little bit of information to go on, at least we can narrow the problem down to a specific method inside of a specific type, but +34 doesn’t seem to be particularly helpful – what does it mean? There are two answers to this question.

The first answer is: don’t worry about it. To narrow the source of the exception down to a single line of code, just build with debugging information (Project -> Properties -> Configuration Properties -> Build -> Generate Debugging Information). You should now have a PDB file output alongside your release mode assembly. With a PDB in place the runtime can give you a line number close to where the exception occurred.

 
[NullReferenceException: Object reference not set to an instance of an object.]
   aspnet.debugging.BadForm.Page_Load(Object sender, EventArgs e) in 
e:\dev\xprmnt\aspnet\debugging\badform.aspx.cs:16
   System.Web.UI.Control.OnLoad(EventArgs e) +67
   System.Web.UI.Control.LoadRecursive() +35
   System.Web.UI.Page.ProcessRequestMain() +720

The above answer is the practical answer, but it doesn’t satisfy the curious person who insists that all numbers have a meaning and a purpose. +34 isn’t some random number the runtime spits out after sampling the microphone input - so where does it come from?

For reference, the Page_Load method looks like the following:

using System;
using System.Web.UI;
using System.Web.UI.WebControls;
 
namespace aspnet.debugging
{
   public class BadForm : Page
   {
      protected Label Label1;
   
      private void Page_Load(object sender, EventArgs e)
      {         
         if(!IsPostBack)
         {
            string s = null;
            Label1.Text = s.Insert(0, "foo");
         }  
      }
 
      #region Web Form Designer generated code …
   }
}

We already know +34 doesn’t refer to a line of code. So perhaps the number refers to an offset into the IL instructions? Here is what ILDASM shows as the instructions for Page_Load.

.method private hidebysig instance 
void  Page_Load(object sender,
                class [mscorlib]System.EventArgs e) cil managed
{
  // Code size       34 (0x22)
  .maxstack  4
  .locals init ([0] string s)
  IL_0000:  ldarg.0
  IL_0001:  call       instance bool [System.Web]System.Web.UI.Page::get_IsPostBack()
  IL_0006:  brtrue.s   IL_0021
  IL_0008:  ldnull
  IL_0009:  stloc.0
  IL_000a:  ldarg.0
  IL_000b:  ldfld      class [System.Web]System.Web.UI.WebControls.Label aspnet.debugging.BadForm::Label1
  IL_0010:  ldloc.0
  IL_0011:  ldc.i4.0
  IL_0012:  ldstr      "foo"
  IL_0017:  callvirt   instance string [mscorlib]System.String::Insert(int32,
                                                                       string)
  IL_001c:  callvirt   instance void [System.Web]System.Web.UI.WebControls.Label::set_Text(string)
  IL_0021:  ret
} // end of method BadForm::Page_Load

No, 34 bytes puts us beyond the end of the IL instructions.

Let’s go one level lower into native code. Using WindDBG and SOS, it’s easy to view a disassembly of Page_Load after the JIT compiler has partied on the method. I use WinDBG to attach to the asp.net worker process after the page has executed and do a .load sos for managed debugging.

To view a disassembled method, you first need the address of the MethodDesc structure for the method, which name2ee can yield (aspnet.dll is the unfortunate name I gave my assembly):

!name2ee aspnet.dll aspnet.debugging.BadForm.Page_Load
--------------------------------------
MethodDesc: 1b873a0
Name: [DEFAULT] [hasThis] 
Void aspnet.debugging.BadForm.Page_Load(Object,Class System.EventArgs)

Now pass the address to the (u)nnassemble command.

!u 1b873a0
Normal JIT generated code
[DEFAULT] [hasThis] 
Void aspnet.debugging.BadForm.Page_Load(Object,Class System.EventArgs)
Begin 07f805b8, size 52
07f805b8 55               push    ebp
07f805b9 8bec             mov     ebp,esp
07f805bb 83ec10           sub     esp,0x10
07f805be 57               push    edi
07f805bf 56               push    esi
07f805c0 53               push    ebx
07f805c1 8955f8           mov     [ebp-0x8],edx
07f805c4 8bf1             mov     esi,ecx
07f805c6 33db             xor     ebx,ebx
07f805c8 8bce             mov     ecx,esi
07f805ca e819a8fef9       call    01f6ade8 (System.Web.UI.Page.get_IsPostBack)
07f805cf 0fb6f8           movzx   edi,al
07f805d2 85ff             test    edi,edi
07f805d4 752a             jnz     07f80600
07f805d6 33db             xor     ebx,ebx
07f805d8 8bbec4000000     mov     edi,[esi+0xc4]
07f805de ff357c40f906     push    dword ptr [06f9407c] ("foo")
07f805e4 8bcb             mov     ecx,ebx
07f805e6 33d2             xor     edx,edx
07f805e8 3909             cmp     [ecx],ecx
07f805ea ff1564ddb779  call dword ptr [mscorlib_79980000+0x1fdd64 (79b7dd64)]
                         (System.String.Insert)
07f805f0 8945f0           mov     [ebp-0x10],eax
07f805f3 8b55f0           mov     edx,[ebp-0x10]
07f805f6 8bcf             mov     ecx,edi
07f805f8 8b01             mov     eax,[ecx]
07f805fa ff9094010000     call    dword ptr [eax+0x194]
07f80600 90               nop
07f80601 5b               pop     ebx
07f80602 5e               pop     esi
07f80603 5f               pop     edi
07f80604 8be5             mov     esp,ebp
07f80606 5d               pop     ebp
07f80607 c20400           ret     0x4

The method starts at address 07f805b8. Moving 34 bytes into the method lands us on the ‘cmp [ecx], ecx’ instruction. CMP is an x86 opcode for compare, which seems like an odd thing to do until you realize the funny quirks of assembly language. For instance, earlier in the method you’ll see xor edx, edx. This instruction isn’t really trying to do an exclusive OR operation (XOR). Instead, this is an old trick used to set the EDX register to 0 in the fastest manner possible: by XORing the register against itself. The more obvious “mov edx, 0” (move 0 into edx) might cost an extra half nanosecond on a 3 GHz machine.

What the cmp instruction is doing here is carrying out part of the IL’s callvirt contract (referring to the IL you’ll see we do a callvirt to the String.Insert method). We can see in the dissasembly the next instruction calls the Insert method. callvirt is documented as throwing a NullReferenceExcecption if the object reference for the instance method you want to call is null. To make a long story a tad shorter: the ecx register contains the this pointer for the string instance that we want to call Insert on, and cmp [ecx], ecx generates a compact, 2 byte instruction to check ecx for null by dereferencing the ecx register. When the instruction dereferences the null pointer in ecx – program go boom.

In conclusion, +34 does have a meaning – it’s an offset into the native instructions for the method. Debugging at the assembly level doesn’t make any sense because you should always have PDBs available, but at least we can all sleep at night knowing the number 34 holds some meaning … somewhere.

All My Pluralsight Courses

OdeToCode by K. Scott Allen