So I’ve picked up a project and got the db team sold on source control for the db (weird, right?). Anyway, the db already exists, it is massive, and the application is very dependent on the data. The developers need up to three different flavors of the data to work against when writing SPROCs and so on.
Obviously I could script out data inserts.
But my question is what tools or strategies do you use to build a db from source control and populate it with multiple large sets of data?
Good to see you put your database under source control.
We have our database objects in source control but not data (except for some lookup values). To maintain the data on dev, we refresh it by restoring the latest prod backup, then rerunning the scripts for any database changes. If what we are doing requires special data (say, new lookup values that aren’t on prod, or test logins), we have a script for that as well, which is part of source control and is run at the same time. You wouldn’t want to script out all the data, though, as it would be very time-consuming to recreate 10 million records through a script. (And if you have 10 million records, you certainly don’t want developers developing against a database with ten test records!) Restoring prod data is much faster.
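To make the order of that refresh concrete, here is a rough sketch in Python. It is not our actual tooling. The folder names `changes` and `dev_data`, the numeric file prefixes, and the `run_script` callback are all made up for illustration, and the restore of the prod backup is assumed to have already happened by whatever means your DBMS provides.

```python
from pathlib import Path


def plan_refresh(repo_root):
    """Return the SQL scripts to run, in order, after restoring a prod backup.

    Assumed (hypothetical) layout:
        repo_root/changes/*.sql    object changes that are not on prod yet
        repo_root/dev_data/*.sql   dev-only data such as test logins
    """
    root = Path(repo_root)
    ordered = []
    # Change scripts run first so the dev-only data lands in the new schema.
    for folder in ("changes", "dev_data"):
        # Sorting by file name lets a numeric prefix (001_, 002_) set the order.
        ordered.extend(sorted((root / folder).glob("*.sql")))
    return ordered


def refresh(repo_root, run_script):
    """Apply every planned script using a caller-supplied run_script(path)."""
    for script in plan_refresh(repo_root):
        run_script(script)
```

Here `run_script` would be whatever you already use to execute a .sql file against the dev server. Keeping it as a parameter means the ordering logic can be checked without a database.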
Since all our deployments are done only through source-controlled scripts, we don’t have issues getting people to script what they need. Hopefully you won’t either. When we first started (back when dev could do their own deployments to prod), we actually had to go through a few times and delete any objects that weren’t in source control. We learned very quickly to put all db objects in source control.
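If you end up doing that kind of cleanup, a small check like the one below can list the candidates before anyone deletes anything. This is only a sketch: it assumes one .sql file per object, named after the object (for example `dbo.GetOrders.sql`), and it assumes you have already pulled the list of object names from the database catalog yourself. Both the naming convention and the function name are hypothetical.

```python
from pathlib import Path


def untracked_objects(db_object_names, repo_root):
    """Return database objects that have no matching script in the repo.

    Assumes (hypothetically) one file per object, named <object name>.sql,
    anywhere under repo_root. Names are compared case-insensitively.
    """
    tracked = {p.stem.lower() for p in Path(repo_root).rglob("*.sql")}
    return sorted(n for n in db_object_names if n.lower() not in tracked)
```

Anything it returns is something to review with its owner, not something to drop automatically.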