Three Rules for Database Work

Some developers love working with relational databases, and other developers can't stand to touch them. Either way - if your application uses a database, you have to treat the database with some respect. The database is as much a part of an application as the code and the models inside the software.

Here are three rules I've learned to live by over the years of working with relational databases.

1. Never use a shared database server for development work.

Fossil!The convenience of a shared database is tempting. All developers point their workstations to a single database server where they can test and make schema changes. The shared server functions as an authoritative source for the database schema, and schema changes appear immediately to all team members. The shared database also serves as a central repository for test data.

Like many conveniences in software development, a shared database is a tar pit waiting to fossilize a project. Developers overwrite each other's changes. The changes I make on the server break the code on your development machine. Remote development is slow and difficult.

Avoid using a shared database at all costs, as they ultimately waste time and help produce bugs.

2. Always Have a Single, Authoritative Source For Your Schema

Ideally, this single source will be your source control repository (see rule #3). Consider the following conversation:

Developer 1: It's time to push the app into testing. Do we copy the database from Jack's machine, or Jill's machine?

Developer 2: Ummmmmmmm, I don't remember which one is up to date.

Developer 1: We're screwed.

Everyone should know where the official schema resides, and have a frictionless experience in getting a fresh database setup. I should be able to walk up to a computer, get the latest from source control, build, and run a simple tool to setup the database (in many scenarios, the build process can even setup a database if none exists, so the process is one step shorter).

How you put your database into source control depends on your situation and preferences. Any decent O/R mapping tool should be able to create a database given the mappings you've defined in a project. You can also script out the database as a set of one or more files full of SQL DDL commands. I generally prefer to keep database views and programmatic features (including functions, triggers, and stored procedures) as separate files - but more on this in a later post.

There are plenty of tools to help. Leon Bambrick has a long list (albeit one year old list) of tools and articles that can help, while Jeff Atwood gushes over the virtues of Visual Studio for Database Professionals.

3. Always Version Your Database

There are many ways to version databases, but the common goal is to propagate changes from development, to test, and ultimately to production in a controlled and consistent manner. A second goal is to have the ability to recreate a database at any point in time. This second goal is particularly important if you are shipping software to clients. If someone finds a bug in build 20070612.1 of your application, you must be able to recreate the application as it appeared in that build - database and all.

In a future post, I'll describe an approach I've used for database versioning that has worked well for many years of commercial development.

In the meantime, if you are looking for more database rule, then Adam Cogan and SSW maintain an excellent list.

Print | posted @ Thursday, January 31, 2008 2:15 AM

Comments on this entry:

Gravatar # re: Three Rules for Database Work
by rams at 1/31/2008 1:25 PM

Thank you for reinforcing my beliefs.
  
Gravatar # re: Three Rules for Database Work
by Steve Campbell at 1/31/2008 2:52 PM

Excellent observation! On my current project, we have a 10 year old application database that has evolved at an average rate of a couple of changes *per day* over the last year. I apply the same basic rules, and they have served me well.
  
Gravatar # re: Three Rules for Database Work
by Ian Hughes at 1/31/2008 6:45 PM

I really look forward to your future post on database versioning. It seems like it should be no-brainer since all of your code is in an SCM system of some type or another, but I have not yet worked on a project with the db in source control.
  
Gravatar # re: Three Rules for Database Work
by Sergei at 2/2/2008 10:34 AM

Heh. In my last job (I left and am between jobs now) we implemented 2 & 3 but admittedly used production database for developer workstation.

The app we write is the one we sell and also the one our company uses, so if anything broke (it was very rare, I can hardly remember last occurrence) we saw it immediately in error logs.
I admit it's sortof irresponsible but it made you be Really Thoughtful and Careful when you were making schema changes... worked every time :)
  
Gravatar # re: Three Rules for Database Work
by Sava Chankov at 2/4/2008 7:16 PM

These issues are pretty well addressed in Ruby on Rails.
  
Gravatar # re: Three Rules for Database Work
by Scott at 2/6/2008 6:35 PM

Please, please, please... hurry and elaborate on the tools you've used for #3. The gaps here are enormous - we have beautiful tools that help us do this in most coding environments but in database environments we are left wanting and are often stuck with an arcane, time-intesive, human process that leaves much to be desired and offers very little by way of consistancy or reproduceability. Its not just a question of code migration but structure and data migration as well.

I'm weary of half-solutions and manual processes.
  
Gravatar # re: Three Rules for Database Work
by Alex at 2/12/2008 11:15 AM

We have been using shared sql server for a long time and didn't meet with problems you described. Just if there are errors a developer needs to get the latest code and that's all.
Also the rule 1 leads to other problems which perfectly described in a dialog in Rule 2.
  
Gravatar # re: Three Rules for Database Work
by Darryl L. Pierce at 2/16/2008 9:35 PM

Excellent points. It's nice to see someone putting to words what ought to be a common practice. At my work, before we start a sprint, I take a snapshot (via mysqldump) of the schema as it stands before any changes are made. That gets checked into the development branch and is named "create_rhxuser_schema.sql" after our primary schema. Each branch has its own version of that file so we can always revert. Then, as changs are made, we capture those in another script, "update_rhxuser_schema.sql". So the convention has been to run the create script and then the update script at any point to verify updates before we go to production.
  
Gravatar # re: Three Rules for Database Work
by Ron Piterman at 2/18/2008 4:13 PM

for java developers there is liquibase, am wondering why its not already mentioned everyhwere, seems very promissing.
  
Gravatar # re: Three Rules for Database Work
by Lieven Cardoen at 11/5/2009 4:12 PM

I created a C#.NET Console Application implementing your explanation of database versioning. You can check it at blog.johlero.eu/... .
  
Gravatar # re: Three Rules for Database Work
by Brian S. at 12/19/2009 12:23 PM

Ditto! http://sqlinstaller.codeplex.com
  
Gravatar # re: Three Rules for Database Work
by Petr Kozelek at 1/12/2010 4:30 AM

I think that SQL database should be the only source for maintaining database schema but concept like Migrator.NET holds database schema in its own structure. The problem with such a concept is that DBAs must keep the rules of using Migrator.NET for every single schema change. Well, it might be acceptable by them after an argue but the problem comes if somebody would do schema changes directly in database without tracking them also in Migrator.NET. I posted a couple of articles about issue of database continuous integration at petrkozelek.e-blog.cz/....
  
Gravatar # re: Three Rules for Database Work
by granadaCoder at 1/13/2010 9:03 AM

//Quote
1. Never use a shared database server for development work.
//End Quote

AMEN!

//Quote
2. Always Have a Single, Authoritative Source For Your Schema
//End Quote

DOUBLE AMEN!


I actually was criticized (from a manager) at a previous job because I promoted a non-shared database approach for a team of developers.

The peculiarity of the situation is that our team had the *most complex* database to deploy (filegroups/files, permissions set for specific dbusers, indexes created on specific filegroups, etc), yet we had the *least* amount of deployment issues compared to other teams.

My personal opinion is that the Continuous Integration environment (CC.NET for example) is the place where all the developer's changes get "merged" and then via UnitTests, the integrity of the scripts are validated.
Of course your #2 is the reason a Continuous Integration environment is possible.


Martin Fowler Quote:
A common mistake is not to include everything in the automated build. The build should include getting the database schema out of the repository and firing it up in the execution environment. I'll elaborate my earlier rule of thumb: anyone should be able to bring in a virgin machine, check the sources out of the repository, issue a single command, and have a running system on their machine.
/////End Quote




Thanks for the reinforcement of previous thoughts. The tar-pit metaphor is a good one.
  

Your comment:

Title:
Name:
Email:
Website:
 
Italic Underline Blockquote Hyperlink
 
 
Please add 7 and 6 and type the answer here: