Three Rules for Database Work

Thursday, January 31, 2008

Some developers love working with relational databases, and other developers can't stand to touch them. Either way - if your application uses a database, you have to treat the database with some respect. The database is as much a part of an application as the code and the models inside the software.

Here are three rules I've learned to live by over the years of working with relational databases.

1. Never use a shared database server for development work.

Fossil!The convenience of a shared database is tempting. All developers point their workstations to a single database server where they can test and make schema changes. The shared server functions as an authoritative source for the database schema, and schema changes appear immediately to all team members. The shared database also serves as a central repository for test data.

Like many conveniences in software development, a shared database is a tar pit waiting to fossilize a project. Developers overwrite each other's changes. The changes I make on the server break the code on your development machine. Remote development is slow and difficult.

Avoid using a shared database at all costs, as they ultimately waste time and help produce bugs.

2. Always Have a Single, Authoritative Source For Your Schema

Ideally, this single source will be your source control repository (see rule #3). Consider the following conversation:

Developer 1: It's time to push the app into testing. Do we copy the database from Jack's machine, or Jill's machine?

Developer 2: Ummmmmmmm, I don't remember which one is up to date.

Developer 1: We're screwed.

Everyone should know where the official schema resides, and have a frictionless experience in getting a fresh database setup. I should be able to walk up to a computer, get the latest from source control, build, and run a simple tool to setup the database (in many scenarios, the build process can even setup a database if none exists, so the process is one step shorter).

How you put your database into source control depends on your situation and preferences. Any decent O/R mapping tool should be able to create a database given the mappings you've defined in a project. You can also script out the database as a set of one or more files full of SQL DDL commands. I generally prefer to keep database views and programmatic features (including functions, triggers, and stored procedures) as separate files - but more on this in a later post.

There are plenty of tools to help. Leon Bambrick has a long list (albeit one year old list) of tools and articles that can help, while Jeff Atwood gushes over the virtues of Visual Studio for Database Professionals.

3. Always Version Your Database

There are many ways to version databases, but the common goal is to propagate changes from development, to test, and ultimately to production in a controlled and consistent manner. A second goal is to have the ability to recreate a database at any point in time. This second goal is particularly important if you are shipping software to clients. If someone finds a bug in build 20070612.1 of your application, you must be able to recreate the application as it appeared in that build - database and all.

In a future post, I'll describe an approach I've used for database versioning that has worked well for many years of commercial development.

In the meantime, if you are looking for more database rule, then Adam Cogan and SSW maintain an excellent list.


Comments
rams Thursday, January 31, 2008
Thank you for reinforcing my beliefs.
Steve Campbell Thursday, January 31, 2008
Excellent observation! On my current project, we have a 10 year old application database that has evolved at an average rate of a couple of changes *per day* over the last year. I apply the same basic rules, and they have served me well.
Ian Hughes Thursday, January 31, 2008
I really look forward to your future post on database versioning. It seems like it should be no-brainer since all of your code is in an SCM system of some type or another, but I have not yet worked on a project with the db in source control.
Sergei Saturday, February 2, 2008
Heh. In my last job (I left and am between jobs now) we implemented 2 & 3 but admittedly used production database for developer workstation.

The app we write is the one we sell and also the one our company uses, so if anything broke (it was very rare, I can hardly remember last occurrence) we saw it immediately in error logs.
I admit it's sortof irresponsible but it made you be Really Thoughtful and Careful when you were making schema changes... worked every time :)
Sava Chankov Monday, February 4, 2008
These issues are pretty well addressed in Ruby on Rails.
Scott Wednesday, February 6, 2008
Please, please, please... hurry and elaborate on the tools you've used for #3. The gaps here are enormous - we have beautiful tools that help us do this in most coding environments but in database environments we are left wanting and are often stuck with an arcane, time-intesive, human process that leaves much to be desired and offers very little by way of consistancy or reproduceability. Its not just a question of code migration but structure and data migration as well.

I'm weary of half-solutions and manual processes.
Alex Tuesday, February 12, 2008
We have been using shared sql server for a long time and didn't meet with problems you described. Just if there are errors a developer needs to get the latest code and that's all.
Also the rule 1 leads to other problems which perfectly described in a dialog in Rule 2.
Darryl L. Pierce Saturday, February 16, 2008
Excellent points. It's nice to see someone putting to words what ought to be a common practice. At my work, before we start a sprint, I take a snapshot (via mysqldump) of the schema as it stands before any changes are made. That gets checked into the development branch and is named "create_rhxuser_schema.sql" after our primary schema. Each branch has its own version of that file so we can always revert. Then, as changs are made, we capture those in another script, "update_rhxuser_schema.sql". So the convention has been to run the create script and then the update script at any point to verify updates before we go to production.
Ron Piterman Monday, February 18, 2008
for java developers there is liquibase, am wondering why its not already mentioned everyhwere, seems very promissing.
gravatar Lieven Cardoen Thursday, November 5, 2009
I created a C#.NET Console Application implementing your explanation of database versioning. You can check it at blog.johlero.eu/... .
gravatar Brian S. Saturday, December 19, 2009
gravatar Petr Kozelek Tuesday, January 12, 2010
I think that SQL database should be the only source for maintaining database schema but concept like Migrator.NET holds database schema in its own structure. The problem with such a concept is that DBAs must keep the rules of using Migrator.NET for every single schema change. Well, it might be acceptable by them after an argue but the problem comes if somebody would do schema changes directly in database without tracking them also in Migrator.NET. I posted a couple of articles about issue of database continuous integration at petrkozelek.e-blog.cz/....
gravatar granadaCoder Wednesday, January 13, 2010
//Quote
1. Never use a shared database server for development work.
//End Quote

AMEN!

//Quote
2. Always Have a Single, Authoritative Source For Your Schema
//End Quote

DOUBLE AMEN!


I actually was criticized (from a manager) at a previous job because I promoted a non-shared database approach for a team of developers.

The peculiarity of the situation is that our team had the *most complex* database to deploy (filegroups/files, permissions set for specific dbusers, indexes created on specific filegroups, etc), yet we had the *least* amount of deployment issues compared to other teams.

My personal opinion is that the Continuous Integration environment (CC.NET for example) is the place where all the developer's changes get "merged" and then via UnitTests, the integrity of the scripts are validated.
Of course your #2 is the reason a Continuous Integration environment is possible.


Martin Fowler Quote:
A common mistake is not to include everything in the automated build. The build should include getting the database schema out of the repository and firing it up in the execution environment. I'll elaborate my earlier rule of thumb: anyone should be able to bring in a virgin machine, check the sources out of the repository, issue a single command, and have a running system on their machine.
/////End Quote




Thanks for the reinforcement of previous thoughts. The tar-pit metaphor is a good one.
gravatar elmimmo Thursday, April 29, 2010
"Never use a shared database server for development work."

Yeah, I know the cons of doing so. But just how does one work with a team, where each of its members updates their own database and later on merge. Just how do you merge each individual's delta?

My scenario is developing a Joomla project, where pretty much all users are affecting the structure of the database and their working copy's structure.
gravatar arowla Wednesday, May 5, 2010
Often, changes that go into production need to have a multi-phase deployment:

1) additive, non-breaking schema changes
2) code which makes use of them
3) schema changes which break compatibility with old code

Developing on a shared development/testing database can aid in thinking this way and separating out the steps, because if you ensure that you're not breaking code for your fellow developers, then you're ensuring that you won't break code in production. For that reason, I don't fully agree with #1. I've worked in environments where the DB was "shared unless breaking changes absolutely needed to be made," in which case we could set up our own. Now, having separate development, testing and production databases, *that's* a must. :)
gravatar Michael Francis Monday, September 13, 2010
Hi Scott

Great post. And a very timely one for us at Red Gate Software. We have a new tool, SQL Source Control, that provides source control for SQL Developers through SSMS using SVN or TFS. Details at www.red-gate.com/.../index.htm

Would you be able/willing to write a review of the tool?

Regards
Michael
gravatar cheap nfl jerseys Tuesday, September 14, 2010
I dont know what to say. This web site is superb. Thats not actually a seriously huge statement, but its all I could come up with immediately after reading this. You realize much about this issue. A whole lot to ensure you manufactured me would like to understand additional about this. Your blog is my stepping stone, my friend. Thanks for that heads up on this subject matter
gravatar cheap nfl jerseys Tuesday, September 14, 2010
I dont know what to say. This web site is superb. Thats not actually a seriously huge statement, but its all I could come up with immediately after reading this. You realize much about this issue. A whole lot to ensure you manufactured me would like to understand additional about this. Your blog is my stepping stone, my friend. Thanks for that heads up on this subject matter
gravatar asics shoes Wednesday, September 15, 2010
Wonderful post about "La voiture, l'emploi et l'argent", thanks for putting this together! "This is obviously one great post.
gravatar asics shoes Wednesday, September 15, 2010
Wonderful post about "La voiture, l'emploi et l'argent", thanks for putting this together! "This is obviously one great post.
gravatar Harish Nair Thursday, September 16, 2010
An eye opener!! thanks
gravatar Christophe Fondacci Saturday, October 23, 2010
Agreeing with @arowla, database developments should never break backward compatibility...so there *should* not be any problem with using a shared database in development as it can enforce this development constraint. Breaking compatibility would generally mean that application deployment and database deployment must be synched, which *generally* means you will have shutdown a website for upgrading it, which is not acceptable.

To my point of view, database developments should not be done on development databases at all. It should be done in an offline version-controlled repository which keeps track of every change, handles locks and merges and which will be the source when the delivery needs to be made.

During development, as soon as a non-breaking change is done, the change could be synchronized with the development database to publish it to app developers. Since you have developed on an *offline* repository, your "work-in-progress" development is never available to other developers and you only publish it when finished.

The is the exact philosophy of the neXtep designer product which is a database development environment allowing you to version control every database in an "offline" repository. You work in the repository, you synchronize your changes to development database(s) (it no longer matters whether it is shared or not), you generate your deliveries automatically from version differences.

Learn more on this product in the wiki :
http://www.nextep-softwares.com
http://www.nextep-softwares.com/wiki

PS : this is a free GPL v3 product which currently supports Oracle, MySql and PostgreSql database vendore (DB2 support coming soon)
gravatar cheap nfl jerseys Friday, October 29, 2010
So informative things are provided here,I really happy to read this post,I was just imagine about it and you provided me the correct information I really bookmark it,for further reading,So thanks for sharing the information.
gravatar fitness wear Friday, October 29, 2010
This is such a great resource that you are providing and you give it away for free. I love seeing websites that understand the value of providing a quality resource for free. It’s the old what goes around comes around routine.
gravatar luis Tuesday, January 18, 2011
Hi! This looks quite good, would you allow me to make a translation to spanish and publish it in my site and share it, I'll give you all the credit :)
Thanks!
gravatar scott Tuesday, January 18, 2011
@luis - Yes - sounds good to me.
Comments are now closed.
by K. Scott Allen K.Scott Allen
My Pluralsight Courses
The Podcast!