I had an interesting experience this week with a client using Analysis Services. They have an OLAP cube which holds all the transactions at a hospital for the past 3 years. There are 35 million records to keep track of every $3200 pacemaker and $3 aspirin tablet used.
The cube reprocesses on schedule every weekend – this typically takes less than 8 hours. This week the cube had not refreshed and I had an email asking to investigate. I logged in over a VPN and saw the SELECT statement from Analysis Services was still going strong after 52 hours. Not good!
First we checked current activity to see if any SPIDs were blocked - no. Any other extraordinary activity on the servers? No. As the plot thickened, I decided to look at the estimated query plan for the OLAP SELECT statement in Query Analyzer.
The cube has 8 date dimensions, meaning 8 joins to a date dimension table that yields the fiscal quarters and calendar quarters and other date attributes analysts love to use when producing trendy charts for board meetings. The estimated query plan showed SQL Server was thinking each join would double the size of the resultset. By the time all the joins are complete SQL estimates there will be almost 9 billion rows instead of the 35 million there should be.
I’m making a wild guess that when SQL Server thinks it needs to process 9 billion rows it figures 2GB of RAM won’t cut it and goes straight to the hard drive. Even though the query wasn’t producing 9 billion records – it was certainly taking a long time and not using much CPU.
I immediately thought this had to do with indexes or index statistics. I looked at the indexes on the date dimension table. The clustered index was on the column used in the 8 joins. I realized the column was not the primary key column but did hold unique date values – one row for each day over a 20 year span. Unfortunately, nobody told SQL Server the values were unique. Just adding a unique constraint put everything back to normal- the query plan looked better and processing finished in 8 hours.
The moral of the story is: proper design can also yield better performance. The more you can tell SQL Server about the data the less it has to guess and prepare for worst case scenarios.
Driving home the other day I was thinking about various snippets of recent conversation over Yukon. One underlying theme is the question: Will developers go crazy with the CLR integration and hang themselves? String and array manipulation will be much easier, but the ability to throw blocking I/O statements into stored procedures is a sobering thought.
I started to do some googling to see what has turned up in the other database circles where JVM hosting has been around for a few years. It appears there are not too many complaints, and hardly any “best practice” documents to address performance related problems when mixing SQL and Java in the database.
I don’t think this sets a precedence, however. The J2EE camp often avoids stored procedures because they don't port to other platforms easily. In contrast, Microsoft has long advocated the use of stored procedures as good practice for performance and security, and as a layer of abstraction. Also, (and this section is based on my experience and may bear no relation to reality), the J2EE teams are generally bigger and have a dedicated DBA who keeps procedural thinkers from mucking up the prize jewel. At some companies the DBAs even carry small taser pistols so they can incapacitate developers who start tinkering in SQL code.
In any case, it should be interesting to see what turns up in the forums and message boards when Yukon goes live. You always have the people who say “Doc, it hurts when I do this”. Answer: “Then stop doing that!”.
Although Bill Gates says it won’t happen, here are 10 things I’d like to see if Microsoft would acquire Disney:
10. Free passes to Epcot Center for all MSDN Universal subscribers.
8. Windows ships with “It’s A Small World After All” as startup wav file.
7. Office Assistants Clippy the paperclip and Kairu the dolphin replaced by Grumpy and Sneezy dwarves. [“For crying out loud, you still need help with the mail-merge?”]
6. “Tomorrowland” attraction replaced with “Longhornland”.
5. “Honey I Shrunk The Kids Movie Set Adventure” replaced by “Windows Build Lab and War Room adventure”.
4. Wrong memo sent to Magic Kingdom security guards - incoming patrons inspected for buffer overflows.
3. The namespace System.Drawing.Animations.ToyStory
2. Free theme park advocates build the amusement park “Lisney”. Patrons disappointed when they need to edit /etc/lilo.conf with vi before gaining entrance to the “Rural Bear Jamboree”.
1. .NET Rocks! on the Disney Channel.
I finally put some time into Longhorn this weekend. At times I was a little bit frustrated (detail below), but overall the experience was exciting. Working with just what I had installed on the Longhorn Virtual PC, I decided to do a little Visual SourceSafe automation to display the file changes between two labels. Here is a screenshot:
If anyone is interested in the code I’ll be happy to post it up for download, but I would like to improve the app some (details below). At this point I wanted to play with some of the basic Avalon controls, including the GridPanel, and of course I had to try out a gradient fill.
(Update: Here is the code, just remember, no warranties express or implied, etc. etc.. Feedback and constructive criticism more than welcomed...)
Along the way I kept some notes on what I liked, what I didn’t like, and what I need to find out more about. I’m sure some of the problems I experienced are just alpha software glitches that will be fixed soon. These notes are not particularly focused on any specific area, but more general to the overall experience.
GUI design with XAML is going to be fun. Using XAML will bring some of the best parts of web design to WinForms while leaving behind all the restrictions. The idea of having a Style element to make sweeping changes to the appearance of a form is particularly powerful.
The refactoring features in Whidbey are worth the price of admission alone. The Extract Method and Rename commands are going to save untold hours of error prone work.
I find the task based documentation (How Do I?) in the SDK extremely useful, particularly when first delving into a new area.
I like the way the Internet Explorer menu and toolbar can occupy the same horizontal area. I’ve always set my IE toolbar icons to “small icons” to conserve space, but this is even better.
When doing a search in the Longhorn SDK (local copy), the “Location” for the search results always returns “Longhorn SDK”. When doing a search in the Longhorn SDK online, I can at least take a guess where I am going to end up by peeking at the URL. For instance, when looking for control documentation, the URL will tell me if I am going to end up in the MSAvalon.Windows.Controls area or the System.Web.UI.WebControls area.
The title bars are so tall they seem to waste space. At first look the contrast to previous versions is pleasing, but when I get down to work I’d like to have more space for real content.
After 20 hours or so of uptime, Internet Explorer (and dexplorer.exe) become so unresponsive as to require a reboot. Is this just me?
I’m starting to wonder what impact Avalon will have on the makeup of software development teams. These days, you can bang out a UI that competes with Microsoft Office applications (in terms of look and feel) as a solo developer. Taking a look at the Longhorn concept videos reminds me of the gaming industry. If you look at the postmortems for today’s best selling games – the skills required go far beyond bit twiddling. The teams usually have half as many graphic designers and artists as there are developers. Plus sound FX people, camera people, voice-over people, and an occasional composer.
Post build steps do not appear to function as yet in Whidbey.
It appears the automagic copying of app.config and COM interop assemblies does not function as yet in Whidbey. I also had some difficulties successfully referencing an interop assembly until I TLBIMP’ed it myself and added a reference by hand.
The app currently does all the work on the UI thread. I didn’t see an Invoke method on the Control class to marshal back to the UI thread, so I abandoned my plan to use a background thread for the VSS automation calls. I did notice the UIContext parameter on the control constructor, which is new, so I’ll need to dig around some more.
Databinding with the GridPanel control? The app currently builds the GridPanel from scratch in the C# code. The next step is to read more about databinding in Longhorn – I’m looking for something similar to today’s DataGrid in Avalon with sorting built in.
Is there an Avalon status bar control?
I’ve heard there is another version of Whidbey coming along soon to play with, but I’m also wondering when there will be another rev of Longhorn to experiment with. Overall - a fun experience with the new technology.
I went out this evening to vote in the Maryland primary. The voting machine was the somewhat controversial Diebold machine. After all the bad press I expected the machine to do something sinister while I was there – but alas – nothing extraordinary to report. No exit poll afterwards either. Bummer.
The only thing to note is the final summary screen which shows all of the selections. In some circumstances there must be a scroll bar present which comes dangerously close to the final “Cast Vote” button. There is a caution at the top of the screen in a small light blue font saying something like “Don’t accidently hit submit when using the scroll bar.” Seems like something they should design around - particularly when there is no confirmation after hitting the big green button.
Dana Epp points to an Internet News article stating XP SP2 will include a virus scanner. I can see the controversy coming, as many people will be unhappy about MS bundling more software into the operating system.
Let’s face it – anti virus as an add-on product just doesn’t work. When someone walks into BestBuy with $50 to spend, I’m sure the products like Halo move much faster than any anti-virus software. It’s just not human nature to spend money on something preventive when you can buy something fun. Halo has much better graphics then any anti-virus software I've ever seen - the choice is obvious.
Here is a new article demonstrating how to use the web service API of Reporting Services to build a tree view of reports available to a user. The ASP.NET project also hosts the ReportViewer sample component in a page to display reports.