I am trying to devise rules for a small group of people collaborating on software that is used for data analysis. It is important that we can reproduce a past run of the code, i.e. check out the exact state of the code at that time (something that version control should allow). So far this has been possible for us with svn: we tag our data analysis results with the svn revision number used for that run.
There are stories about how branching, merging and rebasing in git can leave history lost, inaccessible, or a nightmare to get back to. At the same time, the easy handling of branches for experimental feature development is what makes us consider a switch from svn to git.
So: what rules should we follow to make sure we will always be able to easily retrieve the state of the code that was run for a given analysis? Only use the main branch for analysis runs? If so, what operations should be disallowed on the main branch?
EDIT: Two good suggestions are explained below. Tagging the commits that matter makes the analysis transparent and reproducible (antlersoft). This requires no new rules other than to leave the tags in peace, and in particular no rules about rebasing and merging. Tom Anderson’s suggestion complements this: a central repo that, by convention, holds all code that has tags attached would give the other members access to those tagged versions.
The solution to this doesn’t have to involve restricting what you can do on any branch. Just use git tags, and don’t remove or move them. Tag the commit you use to run each analysis, and record the tag name with the analysis results (this is very similar to what you do in svn, except that instead of a revision number generated by the VCS it is a tag name you supply). A tag is a ref, so the commit it points to and all of that commit's ancestors stay reachable. The version used for the analysis and its history will therefore always be available, regardless of what else you do (rebase, etc.) on the branch.
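As a rough illustration, the tag-and-record step could be scripted along these lines. This is only a sketch under stated assumptions: the helper names `make_tag_name` and `tag_current_commit`, the `analysis/<id>/<timestamp>` naming scheme, and the injectable `run` parameter are all made up for this example. The only git commands it relies on are `git tag -a` and `git rev-parse`.

```python
import subprocess
from datetime import datetime, timezone


def make_tag_name(analysis_id, when=None):
    """Build a tag name such as analysis/run42/20240102T030405Z.

    The naming scheme is an assumption for this sketch; any name
    you never reuse works just as well.
    """
    if when is None:
        when = datetime.now(timezone.utc)
    return f"analysis/{analysis_id}/{when:%Y%m%dT%H%M%SZ}"


def tag_current_commit(tag, message, repo=".", run=subprocess.run):
    """Create an annotated tag on HEAD and return the tagged commit hash.

    `run` defaults to subprocess.run and is a parameter only so the
    function can be exercised without a real repository.
    """
    # Annotated tag (-a) on the currently checked-out commit.
    run(["git", "-C", repo, "tag", "-a", tag, "-m", message], check=True)
    # Resolve the tag to the commit it points at, to store with the results.
    result = run(
        ["git", "-C", repo, "rev-parse", f"{tag}^{{commit}}"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()
```

You would call `tag_current_commit(make_tag_name("run42"), "analysis run42")` just before the run and store the tag name (and, if you like, the returned hash) next to the results. Later, `git checkout <tag>` brings back exactly that state. Tags are not pushed by default, so publish each one with `git push origin <tag>` if the shared repo is supposed to hold them.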