I am part of a development team currently working with a database that does not have any kind of source control. We work with SQL Server 2008 R2 and have always managed the DB directly with SSMS. It now has ~340 tables and ~1600 stored procedures, plus a few triggers and views, so it is not a small DB.
My goal is to have the DB under version control, so I have been reading articles, like Scott Allen’s series and many old SO related questions. But I am still unable to decide on how to proceed.
What I’m thinking of is to script the database schema in one file, then procedures, triggers and views in one file each. Then keep everything versioned under Mercurial. But of course, every member of the team can access SSMS and directly change the schema and procedures, with the possibility that any of us can forget to replicate those changes in the versioned files.
What better options are there? And, did I forget any element worth having source control of? My biggest concern is that most of the literature I found explains how to do version control when creating a new database, but not when it is already old and relatively big.
The General Process
We create a baseline for a particular version (say,
v1.0). A baseline includes one complete schema creation script, as well an upgrade script from allowed previous versions, if any (more on that in a moment). So forv1.0, we’d have just one script:From that baseline, we create incremental change scripts as we work from the previous baseline. These scripts are created in a way that they are reentrant, so that they can be run safely multiple times (where the first time only does any actual work; see the next paragraph on a suggestion how). We just create a file for each change script with the baseline name and a timestamp (which we call the version). So for example, say we create two change scripts after a baseline. We’d have the following files:
We create a
schema_versiontable that has two columns:baselineandversion.baselineis some label (such asv1.0mentioned above), andversionis just a timestamp of when the change script was created (we chose to do this because creating arbitrary version numbers created annoying administrative overhead, where a timestamp was easy to use). So before running the change script, we check to see if the change script has been applied yet, by querying for it bybaselineandversion. If it’s already present, just return out of the script or whatever. Otherwise, apply the change and insert into theschema_versiontable to mark the change script completed.Example change script:
Now, when we actually install the software, we run the relevant
baselinescript. As we upgrade that version, we just apply the change scripts in order.When we hit a significant milestone in our product development phase, we create a new baseline. So, we create a new baseline script (again, this is a snapshot of the DB as a baseline), plus an upgrade script from the previous baseline. So let’s say we have a new baseline,
v2.0, we’d have the following files:Then the process continues.
How We Apply Changes
The scripts are all kept in source control. We do have a tool that packages these files and automatically upgrades databases, which our support and installation teams use. The tool figures out the current baseline of the target database, and asks the user if they wish to upgrade to the baseline in the package. If they do, and there is a valid upgrade path from the current version, it applies the upgrade script, and updates the
schema_version.baseline, and deletes all entries for change scripts from the previous baseline. If the database is new, it applies the regular baseline script. Either way, after the baseline is achieved, it applies all change scripts from the baseline that are present in the package, one at a time, in order, in a transaction. If a particular change script fails, it rolls back the last set of changes and errors out. We look at the log, fix any issues, then rerun the package again. At that point, it should just pick up at the last change script that succeeded, saving time.Automation and Diff Tools
We do not allow diff tools to upgrade production databases directly. It’s just too risky. We do use diff tools, of course, to help create our upgrade and change scripts, but once we have them, we comb through them, massage them, test them, etc., then create the upgrade or change script according to the specs above. We do use tools/shell scripts to create the change script files and put the boiler plate
schema_versionchecking.Caveats
It’s actually pretty straight-forward and it works well. The only time it really gets tricky is with branches. For the most part, branches are handled well. If we need a change script for a particular branch’s work, it will fold into the mainline very well once we merge the branch back in. No problem. Where it gets tricky is when two branches try to do similar things, or where one branch relies on another. That’s mostly a process and planning issue, though. If we get stuck in such a situation, we just create a new baseline (say
v2.1), then update the branches accordingly.Another thing to keep in mind is if an installation wants to be upgraded from one baseline to another, it has to apply all outstanding changes for the current baseline, before we upgrade to the new one. In other words, we don’t let installations jump right from where ever they are to the next baseline (unless, of course, they’re already at the most recent version for the current baseline).