My development team of four people has been facing this issue for some time now:
Sometimes we need to work off the same set of data, so while we develop on our local computers, we connect to a shared dev database remotely.
However, sometimes we need to run operations on the db that will step on other developers’ data, i.e., we break associations. For this, a local db would be nice.
Is there a best practice for getting around this dilemma? Is there something like an “SCM for data” tool?
In a weird way, keeping a text file of SQL insert/delete/update queries in the git repo would be useful, but I think this could get very slow very quickly.
How do you guys deal with this?
You may find my question “How Do You Build Your Database From Source Control” useful.
Fundamentally, effective management of shared resources (like a database) is hard. It’s hard because it requires balancing the needs of multiple people, including other developers, testers, project managers, etc.
Often, it’s more effective to give individual developers their own sandboxed environment in which they can perform development and unit testing without affecting other developers or testers. This isn’t a panacea though, because you now have to provide a mechanism to keep these separate environments in sync with one another over time. You need to make sure that developers have a reasonable way of picking up each other’s changes (data, schema, and code). This isn’t necessarily easier. A good SCM practice can help, but it still requires a considerable level of cooperation and coordination to pull it off. Not only that, but providing each developer with their own copy of an entire environment can introduce costs for storage, as well as additional DBA resources to assist in the management and oversight of those environments.
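To make the "keep the sandboxes in sync" part concrete, here is a minimal sketch of one possible mechanism: ordered SQL scripts checked into the repo, applied to each developer's local database, with a bookkeeping table recording which scripts have already run. Everything named here is an assumption for illustration: SQLite as the local database, a `migrations/` directory of `.sql` files, the `schema_migrations` table, and the `apply_migrations` helper are all hypothetical, not something from the question.

```python
import sqlite3
import tempfile
from pathlib import Path


def apply_migrations(conn: sqlite3.Connection, migrations_dir: Path) -> list[str]:
    """Run every not-yet-applied .sql script in filename order.

    Returns the names of the scripts applied during this call.
    """
    # Hypothetical bookkeeping table: one row per script already applied.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_migrations (name TEXT PRIMARY KEY)"
    )
    applied = {row[0] for row in conn.execute("SELECT name FROM schema_migrations")}

    newly_applied = []
    # Sorting by filename gives a stable order, e.g. 001_..., 002_...
    for script in sorted(migrations_dir.glob("*.sql")):
        if script.name in applied:
            continue
        conn.executescript(script.read_text())
        conn.execute(
            "INSERT INTO schema_migrations (name) VALUES (?)", (script.name,)
        )
        conn.commit()
        newly_applied.append(script.name)
    return newly_applied


if __name__ == "__main__":
    # Demo with a throwaway directory standing in for migrations/ in the repo.
    with tempfile.TemporaryDirectory() as tmp:
        migrations = Path(tmp)
        (migrations / "001_create_users.sql").write_text(
            "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);"
        )
        (migrations / "002_seed_users.sql").write_text(
            "INSERT INTO users (name) VALUES ('alice'), ('bob');"
        )
        local_db = sqlite3.connect(":memory:")
        print(apply_migrations(local_db, migrations))  # first run applies both
        print(apply_migrations(local_db, migrations))  # second run applies none
```

With something along these lines, a developer who pulls new scripts from git only pays for the scripts they have not run yet, which addresses the worry that replaying one ever-growing SQL file would become slow. It does not resolve conflicts between scripts written in parallel; that still takes the coordination described above.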
Here are some ideas for you to consider: