There are formats that are actually zip files in disguise, e.g. docx or odt. If I store them directly in version control, they are handled as binary files. My ideal solution would be
- have a hook that creates a foo.docx/ directory for each foo.docx file before commit, unzipping all files into it
- optionally, have a hook that reindents the xml files
- have a hook that recreates foo.docx from the stored files after update
I don’t want the docx files themselves to be version-controlled. (I am aware of a related question where a different approach with a custom diff was suggested.)
Is this doable? Is this doable with Mercurial?
UPDATE:
I know about hooks. I am interested in the specifics. Here is a session to demonstrate the expected behavior.
> hg add foo.docx
> hg status
A foo.docx
> hg commit
> # Change foo.docx with external editor
> hg status
M foo.docx
> hg diff
+++ foo.docx/word/document.xml
- <w:t>An idea</w:t>
+ <w:t>A much better idea</w:t>
If you can get past the hurdle of successfully unzipping and zipping the OpenOffice documents, then you should be able to use the filter system we have in Mercurial. That lets you transform files on every read/write from/to the repository.
You will unfortunately have to do more than just unzip the foo.docx file. The problem is that a filter needs to generate a single file as output, so perhaps you can unzip foo.docx and then tar up the generated files. You’ll then be versioning the tarball, which should work since a tarball is just an uncompressed concatenation of all the individual files with some meta information. Come to think of it, a simpler solution would be to zip the unpacked foo.docx file again but specify no compression. That should give similar results to using tar.
Solving this problem is something I’ve wanted to do myself, so please report back by sending a mail to the Mercurial mailing list.
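The "zip again with no compression" idea could look roughly like the Python sketch below. It is only an illustration: the function name rezip_stored is made up for this example, and I have not checked the result against Word or OpenOffice.

```python
import io
import zipfile


def rezip_stored(data: bytes) -> bytes:
    """Return the zip archive in `data` rewritten with no compression.

    Hypothetical helper for illustration; not part of Mercurial.
    """
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(data)) as src, \
            zipfile.ZipFile(out, "w", compression=zipfile.ZIP_STORED) as dst:
        # Walk the members in their original order so the output is stable.
        for info in src.infolist():
            # Reuse the name and timestamp; only the compression changes.
            stored = zipfile.ZipInfo(info.filename, date_time=info.date_time)
            stored.compress_type = zipfile.ZIP_STORED
            stored.external_attr = info.external_attr
            dst.writestr(stored, src.read(info))
    return out.getvalue()
```

To use this as a filter you would presumably wrap it in a small script that reads the archive from standard input and writes the result to standard output, and then register that script for docx files in the [encode] section of your hgrc. The script name and file pattern are yours to choose. Because the XML is then stored verbatim inside the archive, Mercurial's deltas should stay small, but hg diff would still see one file per document rather than the individual XML files.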