Compound Keys: The Ally Of Evil

Thursday, June 10, 2004

I’m having a little fun with Brett’s post (Surrogate Keys … The Devil’s Spawn). The fact is, I like surrogate keys, but I’m picking a different battle. There is a vendor in the healthcare industry using compound keys in their data warehouse product. (The fact that they use surrogate identifiers to compose the compound key and other pleasantries puts the entire design squarely in the 5th circle of hell, but I’ll stick to the topic at hand).

While compound keys aren’t quite as bad as Satan’s spawn, there are issues to be aware of:

1) A bigger key means more work for the server.

2) I’m looking at a design where the first column of the compound key has no selectivity, there is only one distinct value in 33 million rows. All the selectivity is in the second column. Unfortunately, the histogram SQL Server builds for the optimizer only uses values from the first column in a compound key/index. Queries using <, >, BETWEEN, (non-equality predicates) suffer.

3) You can reduce the number of joins required by Analysis Services when processing an OLAP cube by running the “Optimize Schema” tool, but it only works on dimensions where a single column joins the fact table to the dimension table. This optimization is potentially in the “2 hour” to “2 minute” category of optimizations.

With the gauntlet cast, I hereby challenge readers to devise the following posts:

Primary Keys: The Fallen Angel

Foreign Keys: Agents Of Deceit

Healthcare IT Vendors: Taking Customers To Hell In A Handbasket


Comments
Brett Thursday, June 10, 2004
I am, in fact, impressed that ANYONE reads my drivel.
<br>
<br>My Goal is to create a composite list of pros and cons for various sql concepts.
<br>
<br>Just some comments if I may.
<br>
<br>0. &gt;&gt; The fact that they use surrogate identifiers to compose the compound key...
<br>
<br>is pure lunacy...3rd party vendors often try to be all things to all people..
<br>
<br>1. &gt;&gt; A bigger key means more work for the server
<br>
<br>But what do you mean by bigger? In most cases it's not big enough to justify the surrogate
<br>
<br>2&gt;&gt; I’m looking at a design where the first column of the compound key has no selectivity, there is only one distinct value in 33 million rows.
<br>
<br>Hey, Bad design, is bad design. Plenty of scrubs out there. It's not a reason to make a case for surrogayes though.
<br>
<br>3&gt;&gt;You can reduce the number of joins required by Analysis Services
<br>
<br>Have to take your word for it there. Gapping hole in the resume. But I've been building OLAP database before all the terms were coined, and still the need to display the actual values cause an extra join, not less. In addition if you use a model where you cascade the great great grandparents key down, you can skip over all the relationship and go from great great grandparent to child in 1 join. No need to navigate.
<br>
<br>Well that's my 1/2 cents worth.
<br>
<br>Thanks again for the read.
<br>
<br>And is it 5:00 yet? Hell it's gotta be 5:00 somewhere!
<br>
Comments are now closed.
by K. Scott Allen K.Scott Allen
My Pluralsight Courses
The Podcast!