November 2004 - Posts

ISNULL versus COALESCE

SQLJunkie Adam Machanic has a good analysis of the SQL functions ISNULL and COALESCE, both of which we can use to replace NULL values in T-SQL.

I wasn't aware of the perf difference, but I quit using ISNULL when it bit me in some code I was using to extract records from a client’s database. They had a field allowing NULL values where we did not. I was using something like the following batch, which has a bug.

CREATE TABLE #Foos
(
  FooID int PRIMARY KEY IDENTITY,
  FooDescription varchar(5) NULL
)
 
CREATE TABLE #Bars
(
  BarID int PRIMARY KEY IDENTITY,
  BarDescription varchar(15) NOT NULL
)
 
INSERT INTO #Foos (FooDescription) VALUES(NULL)
INSERT INTO #Foos (FooDescription) VALUES(NULL)
 
INSERT INTO #Bars (BarDescription)
  SELECT ISNULL(FooDescription, 'No Description')
  FROM #Foos
 
INSERT INTO #Bars (BarDescription)
  SELECT COALESCE(FooDescription, 'No Description')
  FROM #Foos
            
 
SELECT BarID, BarDescription FROM #Bars
 
DROP TABLE #Bars
DROP TABLE #Foos

The resultset from the previous batch will make my bug stand out:

1          No De
2          No De
3          No Description
4          No Description

Oops. ISNULL apparently truncates the replacement value (2nd parameter) to the size of the check expression (1st parameter), while COALESCE does not. Long live COALESCE.

posted by scott with 4 Comments

Arrogance

Richard Morin wrote in the Washington Post:

“…perhaps people will start ignoring the bloggers, who proved once more that their spectacular lack of judgment is matched only by their abundant arrogance.”

This line from the article “Surveying The Damage” spent time bashing bloggers who released early exit poll data on the U.S. presidential race.

I read this line thinking Richard Morin had blogging envy and was taking a few shots at the competition. Then I listened to two IT Conversation podcasts and started to agree. The podcasts were the Election 2004 session from BloggerCon III, and the Gillmor Gang’s interview with eBay tech evangelist Jeff McManus. There was more than one voice in these shows demonstrating a self-important arrogance.

If bloggers can convince politicians to blog, perhaps some good could result. If bloggers want to ram blogging and podcasting down everyone’s throats as the solution to all political, marketing, and business problems, then I don’t think they will get very far.

Poking fun at people who “just don’t get it” doesn’t work either.

Persistence and favorable results do work.

I think blogs are working extremely well in the tech community. I’ve made a number of friends I’ve never met (but hope to), and there is a great community atmosphere. It's passion, not arrogance, which makes it fun.

posted by scott with 6 Comments

BI and SQL Server 2005

Just to finish up the story I started a few days ago (and get the albatross off my neck, so to speak), here is how the exploration of SQL 2005 DTS and OLAP features turned out. The end result was a cube of all domestic commercial flights for January 2003 (about 560,000 records) with delay, cancellation, flight time, taxi time information and more.

There were a few disappointments, like the handful of unhandled exceptions that crashed the Business Intelligence Development Studio. I have a feeling some of the problems were related to the source control features of the beta, as most of the weirdness would occur after a checkout operation (I’m using SourceGear Vault at home – Vault is free for a single user). I’ve never had a problem with Vault, so I’m guessing all the problems are in the BIDS beta.

The most frustrating problem was getting the OLAP designer to pick up schema changes in the underlying fact table. I had to delete the cube, dimensions, and data source views and start over.

The good news is, 2005 not only fixes many of the frustrations in 2000, but adds a good many features.

It requires considerably less work to get a basic cube up and running thanks to the ability of the IDE to analyze relationships and figure out a base set of dimensions, levels, and measures. Perhaps 160 mouse clicks down to 6.

The brain dead cube browser in 2000 has been replaced with what appears to be a version of the pivot table in Office Web Components.

The GUI remains responsive during cube processing (although it does not appear to show the number of records processed).

Everything appears to save as XML, the project, the structures, the packages. The source control integration and deployment options appear that they will be CM and deployment friendly without hand rolling tools.

The performance of the import and of Analysis Services processing was very good considering this is all running on a Virtual PC with 512 MB of RAM.

It appears everything is looking pretty good, hopefully the documentation and features will shore up by beta 3. There are so many options and buttons it is hard to stay oriented at times. In one week there will be a flurry of SQL Server 2005 webcasts, unfortunately only 1 appears to be related to Analysis Services. Hopefully there will be more coming down the road.

posted by scott with 2 Comments

Give Me More Text

To continue with my previous post (Give Me Text), I wanted to show a little UI from the SQL Server 2005 Business Intelligence Studio. There are obvious improvements in the feature set formerly known as Data Transformation Services. The toolbox is brimming with tasks and transformations you can drag and drop into the Data Flow designer. Stuff like “Pivot”, “Aggregate”, and “Fuzzy Grouper”.

There are parts, however, that remind me why I still prefer programming with keyboards and text files to GUIs. GUIs do not scale. In order to perform multiple operations I need to perform multiple actions. Like the following dialog: 

The problem here is it took at least 34 mouse clicks to change the behavior of each column. from “Fail Component” to “Redirect Row” (to send erroneous records from a flat file to an error file instead of halting). UI designers do this to me all the time, and it’s the reason I get frustrated with them.

Text files scale because you can select one operation and have it applied to multiple locations. UIs do not do this too well – they make you click, click, clickety drag click. It’s why I try to avoid the DTS designer in 2000. The OLAP designer in 2000 is even worse.

The good news in 2005 is that there really is a text file underneath, and perhaps with some more documentation on the XML format these tasks might become easier.

It might also be interesting to see what UI accessibility improvements come along with Avalon and Longhorn. Perhaps they can provide some sort of meta layer where I can group similar controls easily and modify them with one action.

Even better would be if I could speak to the computer and ask it to “change all error drop down selections to redirect rows”. The magical accessibility layer can find drop down controls under the Error column and set everything without me moving a mouse.

That would be cool, but I’m still not giving up my text files.

posted by scott with 3 Comments

Give Me Text

Every now and then someone will talk about the sad fact that we still use text files and text editors to build software. I used to think about this when I was in the DTS or OLAP cube designers of SQL Server 2000. I’d think about how writing software with a GUI wasn’t all it was cracked up to be. UI designers can be tedious, and in the Microsoft world they usually save data in opaque blobs.

Because of the blob factor, both DTS and OLAP resist attempts at serious configuration management. The deployment strategy for an OLAP cube is to drag the cube with a mouse from one server to the next in the MMC. I kept looking for the “generate script” option in the MMC – there is none*. One of my first tasks with OLAP was to write a utility to walk the object model exposed by OLAP’s COM interface and serialize the structure into an XML file.

XML is easier on the eyes than a structured storage file, in the same way it looks prettier on the wire when compared to a DCOM packet. You can check XML into source control, diff it, parse it, query it, and transform it. XML is friendly and comfortable. Using an XML format paid off for us recently when we needed to localize cubes for a client in another country. Add a little XPath, a little config file, and mes cubes parlent français.

In playing with the Business Intelligence Development Studio of SQL Server 2005, I am happy to see both Integration Services (DTS) packages and OLAP cubes save into files with an XML format. I’m even happier to see the “Add Solution To Source Control” option. It’s good to see progress. It’s good to see text!

* On the SQL 2000 Resource CD Microsoft released the MetaDataScripter. You can point this tool at an OLAP database and it will generate VBScript that can create the cube. For non-trivial databases though, the resulting vbs script can easily exceed 1MB. It is text, but it is not too friendly.

posted by scott with 1 Comments

Array Size

During my occasional forays into the land of Dim, Sub, and IsNothing, I’ve needed to remind myself that the number in an array definition represents the upper bound.

Dim months(11) As String 'an array with upper bound 11 has 12 elements

It feels weird, but as long as I kept telling myself the number represented the upper bound, and not the length of the array, everything seemed OK.

Then one day I needed an array of 0 elements.

Dim foo(-1) As String 'an array of 0 elements is declared with 0 – 1 = -1. 

I can’t get this statement out of my head. The illusion is broken. I now think of the number not as the upper bound, but as the length of the array minus 1.

Suddenly, writing array code in VB requires more thought and attention to detail with all of these subtractions going on.

 

posted by scott with 3 Comments

In Search Of Raw Data

Perhaps if Samuel Taylor Coleridge were alive today, he’d write verse like the following:

Data, data, every where,
And all the disks did shrink;
Data, data, everwhere,
Nor any bit to think.

Then again, if Coleridge were alive today maybe he’d be spending all of his time watching reenactments of Mongol military campaigns on cable TV’s History Channel and never write a single poem. Who knows? At least he had the foresight to end lines with semi-colons.

I’ve wanted to experiment with the new DTS and Analysis Services offerings in SQL Server 2005 for some time. I’ve spent a fair amount of time with the current versions and have come to love and loath certain features. I decided the first order of business would be to find an interesting set of data to work with from start to finish. Some google searches for raw data in CSV format turned up several possibilities.

The Federal Justice Statistics Resource Center collects information on crime, but inappropriate use of the data violates three federal regulations. Next.

The Baseball Archive is highly touted as having the most complete set of baseball statistics anywhere. These stats are already sliced and diced to death. Next.

Carnegie Mellon has StatLib – a collection of various datasets and statistical software. There were some interesting possibilities here.

The Center For Disease Control and Prevention publishes tons of data, but not all in easy formats, and I’m already hip deep in mortality data during the day. Next.

A Jounalist’s Database of Databases has a collection of links to national and international sources of data – this is where I found the following entry.

The Bureau Of Transportation Statistics has plenty of information about planes, trains, and automobiles. The web interface allows you to pick which fields to export. One data set that caught my eye was the on-time performance of domestic flights. It’s a simple CSV import and might make for an interesting OLAP cube. One month of data looks like it has about 1 million records.

Perhaps if Coleridge were alive today, and had a bad experience at the airport, he’d write something like this:

Day after day, day after day,
We stuck, nor breath nor motion ;
As idle as a painted jet,
On this tarmac of commotion.

posted by scott with 3 Comments

Remoting Over TDS

This week saw the release of a long awaited white paper: Using CLR Integration in SQL Server 2005. This white paper is the first step in what I am sure will be an ongoing stream of best practice information for using the common language runtime inside the database engine.

One area in need of some additional guidance will be the design and usage of user defined types (UDTs). In SQL 2005, if you need a columnar data type outside of the normal varchar, int, and datetime etc. types, you can write your own. As the paper outlines, UDTs are not a way to achieve object-relational extensibility, so we won’t want to create a Customer UDT. Instead, UDTs will augment the basic scalar data types in SQL Server. The engine can index a column of UDTs and enforce RI constraints against them.

All of the examples I’ve seen to date deal with UDTs from TSQL and CLR code executing inside database. Wally McClure found the following error message when using a SELECT statement from the query tool:

Could not load file or assembly "Assembly Name, Version=xxxx.xxxx.xxxx.xxxx, Cultere=neutral, PublicKeyToken=null' or one of its dependencies. HRESULT: 0x80070002 The system cannot find the file specified

Niels Berglund explains why this happens. If you use a ToString on the column in the SELECT statement you’ll get back strings, but with no methods invoked you fetch actual instances of the UDT type. The query tool’s appdomain has to load the assembly for this type. Unless the assembly is properly deployed the load will fail.

The interesting thought here is that there is a new .NET remoting channel in town: the tabular data stream protocol used natively by SQL Server. Along with this capability comes all of the intricacies of sharing type across a boundary, like versioning and GAC deployments. Whoa.

posted by scott with 2 Comments

Indigo Day

VSLive! just added an entire day of Indigo content to the conference schedule for San Francisco. February 8th is looking like a busy day.

I wonder if it will snow in the Northeast while I am away. It's always fun to call up snowbound  colleagues back in the office and ask them how the weather is. Last February was so cold here I had soda cans exploding in my car. Brrr.

posted by scott with 0 Comments

Virtual Machine Freestyle Battle

I’m driving through the streets today when I pull up beside a car with a personalized license plate: “JAVA VM”. I’m obviously beside a developer who is fanatic about the Java virtual machine.

I guess I’ve been reading too many of the accidental freestyle battles on Bill’s blog lately. I tossed in a CD with some bass thumping action, wound down my window, and started screaming out loud:

Yo, VM boy, you got a queued up packet,
You better call Receive and prepare for the racket.
You didn’t hit return before the left curly bracket,
    gonna send you upstate, in a straight white jacket.

Ya’ll used to say we were the bourgeoisie,
Now .NET is spreading like hepatitis B.
I know you is scared, cuz I’m a wack MC,
    and I got more sources than ODBC.

Better throw up your hands, and your pocket protector,
I’m gonna whip out my king sized garbage collector.
It’s coming after you like a bee after nectar,
    you’ll never add another String to an object typed Vector.

I was ready to lay down some more verse, but the light turned green. The guy put the gas pedal to the floor and drove off in a hurry. I began to revel in my freestyle geek-rap victory.

Then the phone rang, and I woke up.

I swear, if I doze off at the computer, I can have the most pathetic dreams…

posted by scott with 25 Comments

New User Group In Town

The BaltoMSDN .NET User Group is just getting started. This month’s meeting is Wednesday, November 17th in Hunt Valley, Maryland. Lonnie Wall from RDA Consultants will be giving an overview of Service Oriented Architecture. If you are in the Baltimore / D.C. area, keep an eye on their calendar and drop in often.

posted by scott with 0 Comments

64 bits

I think 2005 will be a big year for AMD in the 64 bit server market. Dell has only shipped Intel chips for the longest time, but they are hinting about building 64 bit servers and workstations with AMD CPUs in the future. The move to AMD would make sense. From everything I’ve read, the 64 bit AMD chips are giving Intel a pretty thorough trouncing in performance and Intel is not catching up soon enough.

Most of our customers have a data warehouse in the 40GB – 300GB range, and some are growing at 100 MB a day. Not the biggest databases in the world by any means, but most of these hospitals are using conservative hardware. We do most of our performance testing against a 500 Mhz Pentium database server with 1 GB of RAM and a 100GB warehouse. If we can get a 10 second response times on this machine, it’s optimized for production use.

Most of our customers are not willing to fork over the money to keep the OLAP database and the warehouse on separate machines. They want to keep everything on a single machine to lower licensing and administration costs. The problem with the single machine setup is apparent when watching Analysis Services crunch through 100 million records during cube processing. It’s like watching a steel cage death match between OLAP and SQL Server as to who can grab the most RAM.

The single machine might work if only more memory were available. The 32 bit version of Analysis Services does not support Address Windowing Extensions (AWE), so the most memory the process can have is 3 GB with the /3GB switch enabled, a barrier the 64 bit version overcomes. SQL 2000 64 bit only supports Itanium, whose sales cannot seem to get off the ground. SQL 2005 will support AMD64. I’m thinking this will fly off the shelf for anyone doing OLAP work with a warehouse (though we’ve yet to see pricing for 2005).

I did a quick search for dual 64 bit Opteron servers on eBay today and found this. The server uses 40 400GB SATA drives to offer 16TB of storage, and has 8 GB of RAM. Yours for only $30,000 US. Just think, in five years, you could probably replace it with a notebook or cell phone.

posted by scott with 0 Comments

SSRS Report Builder

Reporting Services in SQL Server 2005 will introduce a new ad-hoc reporting control – the Report Builder. You can view a demo of the new features in a Microsoft On-Demand Webcast: End-User, Ad Hoc Reporting in SQL Server 2005 Reporting Services with Brian Welcker and Carolyn Chau as the presenters.

The Report Builder is not available in the current beta release (beta 2) of 2005, but should be available for beta 3.

Some interesting features include:

- The Report Builder will give end users a way to edit and create reports in Reporting Services. You can restrict who can use the Report Builder, in fact, the feature is disabled by default.

- The Report Builder Client does not have a crappy web interface, it uses ClickOnce deployment to put a rich .NET WinForms application on the client!

- The Report Builder uses semantic metadata to present a “report model“ of the data source to the end user. The end user does not need to know SQL or MDX to build reports.

- SSRS can generate the report model, you’ll also be able to use the Model Designer to edit and augment the model (this will be a new project type in VS.NET). It looks as if the model designer and auto-generation is only available for MS SQL Server and MS Analysis Services.

- The report model also allows tools to automatically generate dill-through reports. If there is a link to describe another layer of data, the user can drill.

- There is still a Report Designer in Visual Studio.NET, which is more powerful than the Report Builder. The Report Builder is for end users, the Report Designer is for report developers. You can import a Report Builder report into the Report Designer to add lower level features, but not vice versa.

- Licensing: there is no additional license to purchase for Report Builder, it is part of the SQL Server 2005 license. For per-CPU licensing, this means unlimited users.

posted by scott with 4 Comments

VSLive! 2005

In February I’ll be giving a presentation at VSLive! in San Francisco. The talk will focus on how to coax SQL Server Reporting Services to do your application’s bidding with URL Access and Web Services. No report designing in this presentation, just code, code, and then a bit more code.

I’m looking forward to the conference. In addition to the usual VBITS, C#, ASP.NET, and SQL tracks there are two co-located events: the Software Architecture Summit and the Microsoft Windows Anywhere track for Tablet PC and mobile PC development. Who’s gonna be there?

posted by scott with 3 Comments

Disposal Anxiety

I came across this again in the newsgroups today:

 
sqlConnection.Close();
sqlConnection.Dispose();
sqlConnection = null;
 

There has been plenty of debate about the confusion resulting from aliasing the Dispose method. There has also been heaps of information to explain finalization, garbage collection, connection pooling, and IDisposable. Still, none of the debate or information addresses the fundamental problem in this scenario.

For some developers, the instantiation of a disposable object results in high levels of anxiety. The anxiety produces the obsessive-compulsive behavior demonstrated in the above code snippet.

I have a solution to propose. The using keyword only works well for the passive-aggressive types (I’ll dispose the object, but only as a side-effect), while those calling Dispose or Close explicitly do so with a clean and clinical approach – there is no emotion involved.

What the IDisposable interface needs is a method that promotes self-efficacy in a developer. A method name that can stir up primal urges as the developer types. What we need is a method like die_you_gravy_sucking_pigdog() from BSD’s shutdown.c module.

Now, I know this function was written back in the days when steam engines still ruled the world, but we could modernize the function by applying some .NET naming standards.

sqlConnection.DieYouGravySuckingPigDog();

Can you feel the passion behind this statement? This statement carries the emotion that is hard to find in today's code. I hope you’ll support this proposal. Good people will be able to sleep at night once again.

posted by scott with 4 Comments

Crunch Time

There is nothing like an upcoming release to knock you out of a blogging rhythm.

Last minute tweaks. Build engines humming. Hours feeling like minutes.

The growing anticipation to give the baby a spank on the butt and send it out into the cold, cruel world of users, customers, clients.

Then of course, the best part. The sigh of relief. The satisfaction of a job complete. The …

Oh, crap.

Be right back.

posted by scott with 0 Comments