I’m working on a C# application that needs to store all the successive revisions of a given report file to a single project file: each time the (plain text) report file changes, the contents of the new version shall be appended to the project file, along with some metadata. Other requirements:
- each version of the report file is 100 kB to 1 MB. Theoretically, the maximum number of revisions is unlimited but it should be less than 1000 in practice.
- to keep things simple, I’d like to avoid computing differences between the revisions of the report – just store the whole report to the project file every time it has changed.
- the project file should be compressed – it doesn’t need to be a text file
- it should be easy to retrieve a given version of the report from the application
How can I implement this in an efficient way? Should I create a custom binary file, consider using a database, other ideas?
Many thanks, Guy.
What’s wrong with the simple workflow?
Gzip is a standard format, so it’s easily accessible. Subsequent reports probably won’t change that much, so you’ll have a great compression ratio. To find a given report, just open the file and scan the headers. (If scanning doesn’t work well enough, also mirror the metadata in an SQLite database, and make sure to include each report’s offset into the project file so you can seek to the right place quickly.)
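As a rough illustration of the append-plus-index idea, here is a minimal sketch in Python (chosen only for brevity; in C#, `System.IO.Compression.GZipStream` and a SQLite binding would play the same roles). It assumes each report is written as its own gzip member and that the metadata lives in the SQLite index rather than in the gzip headers. The function names, the `revisions` table layout, and the JSON metadata format are all hypothetical choices for this example, not something the question prescribes.

```python
import gzip
import json
import os
import sqlite3


def open_index(index_path):
    """Open (or create) the SQLite index that mirrors the metadata.

    The table layout is an assumption made for this sketch.
    """
    db = sqlite3.connect(index_path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS revisions ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " offset INTEGER NOT NULL,"
        " length INTEGER NOT NULL,"
        " metadata TEXT NOT NULL)"
    )
    return db


def append_revision(project_path, db, report_text, metadata):
    """Append one report as its own gzip member and record where it starts."""
    member = gzip.compress(report_text.encode("utf-8"))
    with open(project_path, "ab") as f:
        f.seek(0, os.SEEK_END)      # make the append position explicit
        offset = f.tell()           # byte offset of this member
        f.write(member)
    cur = db.execute(
        "INSERT INTO revisions (offset, length, metadata) VALUES (?, ?, ?)",
        (offset, len(member), json.dumps(metadata)),
    )
    db.commit()
    return cur.lastrowid            # revision number assigned by SQLite


def read_revision(project_path, db, revision_id):
    """Seek straight to one revision and decompress only that member."""
    row = db.execute(
        "SELECT offset, length, metadata FROM revisions WHERE id = ?",
        (revision_id,),
    ).fetchone()
    if row is None:
        raise KeyError(revision_id)
    offset, length, metadata = row
    with open(project_path, "rb") as f:
        f.seek(offset)
        data = f.read(length)       # exactly one gzip member
    return gzip.decompress(data).decode("utf-8"), json.loads(metadata)
```

Storing the compressed length next to the offset means a reader never has to decompress earlier revisions, and never reads past the end of the member it wants. Because every member is a complete gzip stream, the project file as a whole remains a valid multi-member gzip file.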
If your requirements are flexible (e.g. that “shall append” part) and you just want something to keep track of past versions of the file, a revision control system will do all of what you need quite easily.