Versioning Databases – Views, Stored Procedures, and the Like

Saturday, February 2, 2008

What started as a short brain dump is tuning in to a longer series of posts thanks to all the feedback and questions. In this post, I want to explain some of my thoughts on controlling objects like database views, stored procedures, functions, and triggers.

But first...

I haven't actually used a trigger in years. This isn't to say that triggers aren't valuable, but I've tended to shy away. Jon Galloway has posted a good example of what you can do with triggers.

Secondly, stored procedures have fallen out of favor in my eyes. I came from the WinDNA school of indoctrination that said stored procedures should be used all the time. Today, I see stored procedures as an API layer for the database. This is good if you need an API layer at the database level, but I see lots of applications incurring the overhead of creating and maintaining an extra API layer they don't need. In those applications stored procedures are more of a burden than a benefit.

One File per Object

My strategy is to script every view, stored procedure, and function into a separate file, then commit the files to source control. If someone needs to add a new view, they script the view into a file and commit the file to source control. If someone needs to modify a view, they modify the view's script file and commit again. If you need to delete a view from the database, delete the file from source control. It's a relatively simple workflow.

The magic happens when a developer, tester, or installer updates from source control and runs a tool that updates their local database. The tool uses a three step process:

The tool applies new schema changes by comparing the available schema change files to the SchemaChangeLog records in the database.
The tool will DROP all stored procedures, views, and functions in the database.
The tool will run all of the scripts needed to add the views, stored procedures, and functions back into the database.

Say What?

People often wonder why I want to drop and destroy all these objects and then re-add them. You could lose work if you update your database and haven't scripted out a view you were in the middle of creating.

The simple reason is to find problems as early as possible. If someone commits a schema change and the change removes a column used by a view, you'll find out there is an error early – hopefully before the build escapes for testing. Likewise, if someone checks in a view but forgets to publish a schema change the view needs, someone else is going to show up at their desk a few minutes later asking why they are breaking the software.

A second reason is to avoid some rare errors I've seen. Some databases have a tendency to screw up execution plans when the schema changes underneath a view. Dropping everything and starting over avoids this problem, which is really hard to track down.

Of course, this does mean a database will need a little down time to apply updates. I realize not everybody has that luxury.

Summary

Through the power of source control and automation, every database from development to production can keep a schema in synch. If you need to go back in time to see what the database looked like on July 20, 2007, or in build 1.58 of the application, you can do that with this strategy, too.

Next question: what happens if there is a branch in the source control tree? That's a topic for the next post.

All My Pluralsight Courses

OdeToCode by K. Scott Allen