Some developers love working with relational databases, and other developers can't stand to touch them. Either way - if your application uses a database, you have to treat the database with some respect. The database is as much a part of an application as the code and the models inside the software.
Here are three rules I've learned to live by over the years of working with relational databases.
1. Never use a shared database server for development work.
The convenience of a shared database is tempting. All developers point their workstations to a single database server where they can test and make schema changes. The shared server functions as an authoritative source for the database schema, and schema changes appear immediately to all team members. The shared database also serves as a central repository for test data.
Like many conveniences in software development, a shared database is a tar pit waiting to fossilize a project. Developers overwrite each other's changes. The changes I make on the server break the code on your development machine. Remote development is slow and difficult.
Avoid using a shared database at all costs, as they ultimately waste time and help produce bugs.
2. Always Have a Single, Authoritative Source For Your Schema
Ideally, this single source will be your source control repository (see rule #3). Consider the following conversation:
Developer 1: It's time to push the app into testing. Do we copy the database from Jack's machine, or Jill's machine?
Developer 2: Ummmmmmmm, I don't remember which one is up to date.
Developer 1: We're screwed.
Everyone should know where the official schema resides, and have a frictionless experience in getting a fresh database setup. I should be able to walk up to a computer, get the latest from source control, build, and run a simple tool to setup the database (in many scenarios, the build process can even setup a database if none exists, so the process is one step shorter).
How you put your database into source control depends on your situation and preferences. Any decent O/R mapping tool should be able to create a database given the mappings you've defined in a project. You can also script out the database as a set of one or more files full of SQL DDL commands. I generally prefer to keep database views and programmatic features (including functions, triggers, and stored procedures) as separate files - but more on this in a later post.
There are plenty of tools to help. Leon Bambrick has a long list (albeit one year old list) of tools and articles that can help, while Jeff Atwood gushes over the virtues of Visual Studio for Database Professionals.
3. Always Version Your Database
There are many ways to version databases, but the common goal is to propagate changes from development, to test, and ultimately to production in a controlled and consistent manner. A second goal is to have the ability to recreate a database at any point in time. This second goal is particularly important if you are shipping software to clients. If someone finds a bug in build 20070612.1 of your application, you must be able to recreate the application as it appeared in that build - database and all.
In a future post, I'll describe an approach I've used for database versioning that has worked well for many years of commercial development.
In the meantime, if you are looking for more database rule, then Adam Cogan and SSW maintain an excellent list.
Comments
The app we write is the one we sell and also the one our company uses, so if anything broke (it was very rare, I can hardly remember last occurrence) we saw it immediately in error logs.
I admit it's sortof irresponsible but it made you be Really Thoughtful and Careful when you were making schema changes... worked every time :)
I'm weary of half-solutions and manual processes.
Also the rule 1 leads to other problems which perfectly described in a dialog in Rule 2.
1. Never use a shared database server for development work.
//End Quote
AMEN!
//Quote
2. Always Have a Single, Authoritative Source For Your Schema
//End Quote
DOUBLE AMEN!
I actually was criticized (from a manager) at a previous job because I promoted a non-shared database approach for a team of developers.
The peculiarity of the situation is that our team had the *most complex* database to deploy (filegroups/files, permissions set for specific dbusers, indexes created on specific filegroups, etc), yet we had the *least* amount of deployment issues compared to other teams.
My personal opinion is that the Continuous Integration environment (CC.NET for example) is the place where all the developer's changes get "merged" and then via UnitTests, the integrity of the scripts are validated.
Of course your #2 is the reason a Continuous Integration environment is possible.
Martin Fowler Quote:
A common mistake is not to include everything in the automated build. The build should include getting the database schema out of the repository and firing it up in the execution environment. I'll elaborate my earlier rule of thumb: anyone should be able to bring in a virgin machine, check the sources out of the repository, issue a single command, and have a running system on their machine.
/////End Quote
Thanks for the reinforcement of previous thoughts. The tar-pit metaphor is a good one.
Yeah, I know the cons of doing so. But just how does one work with a team, where each of its members updates their own database and later on merge. Just how do you merge each individual's delta?
My scenario is developing a Joomla project, where pretty much all users are affecting the structure of the database and their working copy's structure.
1) additive, non-breaking schema changes
2) code which makes use of them
3) schema changes which break compatibility with old code
Developing on a shared development/testing database can aid in thinking this way and separating out the steps, because if you ensure that you're not breaking code for your fellow developers, then you're ensuring that you won't break code in production. For that reason, I don't fully agree with #1. I've worked in environments where the DB was "shared unless breaking changes absolutely needed to be made," in which case we could set up our own. Now, having separate development, testing and production databases, *that's* a must. :)
Great post. And a very timely one for us at Red Gate Software. We have a new tool, SQL Source Control, that provides source control for SQL Developers through SSMS using SVN or TFS. Details at www.red-gate.com/.../index.htm
Would you be able/willing to write a review of the tool?
Regards
Michael
To my point of view, database developments should not be done on development databases at all. It should be done in an offline version-controlled repository which keeps track of every change, handles locks and merges and which will be the source when the delivery needs to be made.
During development, as soon as a non-breaking change is done, the change could be synchronized with the development database to publish it to app developers. Since you have developed on an *offline* repository, your "work-in-progress" development is never available to other developers and you only publish it when finished.
The is the exact philosophy of the neXtep designer product which is a database development environment allowing you to version control every database in an "offline" repository. You work in the repository, you synchronize your changes to development database(s) (it no longer matters whether it is shared or not), you generate your deliveries automatically from version differences.
Learn more on this product in the wiki :
http://www.nextep-softwares.com
http://www.nextep-softwares.com/wiki
PS : this is a free GPL v3 product which currently supports Oracle, MySql and PostgreSql database vendore (DB2 support coming soon)
Thanks!