I’m having a little fun with Brett’s post (Surrogate Keys … The Devil’s Spawn). The fact is, I like surrogate keys, but I’m picking a different battle. There is a vendor in the healthcare industry using compound keys in their data warehouse product. (The fact that they use surrogate identifiers to compose the compound key and other pleasantries puts the entire design squarely in the 5th circle of hell, but I’ll stick to the topic at hand).
While compound keys aren’t quite as bad as Satan’s spawn, there are issues to be aware of:
1) A bigger key means more work for the server.
2) I’m looking at a design where the first column of the compound key has no selectivity, there is only one distinct value in 33 million rows. All the selectivity is in the second column. Unfortunately, the histogram SQL Server builds for the optimizer only uses values from the first column in a compound key/index. Queries using <, >, BETWEEN, (non-equality predicates) suffer.
3) You can reduce the number of joins required by Analysis Services when processing an OLAP cube by running the “Optimize Schema” tool, but it only works on dimensions where a single column joins the fact table to the dimension table. This optimization is potentially in the “2 hour” to “2 minute” category of optimizations.
With the gauntlet cast, I hereby challenge readers to devise the following posts:
Primary Keys: The Fallen Angel
Foreign Keys: Agents Of Deceit
Healthcare IT Vendors: Taking Customers To Hell In A Handbasket