Possible Duplicate:
Why does Mercurial think my SQL files are binary?
I generated a complete set of scripts for the stored procedures in a database. When I created a Mercurial repository and added these files, they were all added as binary. Obviously, I still get the benefits of versioning, but I lose a lot of the efficiency, diffing, and other advantages of text files. I verified that these files are indeed all just text.
Why is it doing this?
What can I do to avoid it?
Is there a way to get Hg to change its mind about these files?
Here is a snippet of changeset log:
496.1 Binary file SQL/SfiData/Stored Procedures/dbo.pFindCustomerByMatchCode.StoredProcedure.sql has changed
497.1 Binary file SQL/SfiData/Stored Procedures/dbo.pFindUnreconcilableChecks.StoredProcedure.sql has changed
498.1 Binary file SQL/SfiData/Stored Procedures/dbo.pFixBadLabelSelected.StoredProcedure.sql has changed
499.1 Binary file SQL/SfiData/Stored Procedures/dbo.pFixCCOPL.StoredProcedure.sql has changed
500.1 Binary file SQL/SfiData/Stored Procedures/dbo.pFixCCOrderMoneyError.StoredProcedure.sql has changed
Thanks in advance for your help
Jim
In keeping with Mercurial’s views on binary files, it does not actually track file types, so there is no way for a user to mark a file as binary or not binary.
As tonfa and Rudi mentioned, Mercurial decides whether a file is binary by checking for a NUL byte anywhere in the file. In UTF-16 or UTF-32 files, a NUL byte is pretty much guaranteed, because every ASCII-range character is encoded with at least one zero byte.
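To see why the encoding matters, here is a minimal Python sketch of the NUL-byte heuristic described above. The function name `looks_binary` is made up for illustration and this is not Mercurial's actual code; it only mimics the check as described in this answer.

```python
def looks_binary(data: bytes) -> bool:
    """Hypothetical helper: treat data as binary if it contains a NUL byte."""
    return b"\0" in data


sample = "SELECT 1"

# UTF-8 encodes ASCII characters as single non-zero bytes.
print(looks_binary(sample.encode("utf-8")))

# UTF-16 encodes each ASCII character as two bytes, one of which is zero.
print(looks_binary(sample.encode("utf-16")))
```

The first call should report no NUL byte and the second should find one, which matches the behavior you are seeing with your exported scripts.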
To “fix” this, you would have to ensure that the files are encoded with UTF-8 instead of UTF-16. Ideally, your database would have a setting for Unicode encoding when doing the export. If that’s not the case, another option would be to write a precommit hook to do it (see How to convert a file to UTF-8 in Python for a start), but you would have to be very careful about which files you were converting.
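As a rough starting point for the conversion itself, something like the following could work. It is only a sketch: the function name `convert_utf16_to_utf8` is hypothetical, it assumes the exported files start with a UTF-16 byte order mark, and it does not show how to wire it into a hook. Files without that mark are left alone, which is one way of being careful about what gets converted.

```python
import codecs
from pathlib import Path


def convert_utf16_to_utf8(path) -> bool:
    """Rewrite a BOM-prefixed UTF-16 file as UTF-8; return True if converted."""
    p = Path(path)
    raw = p.read_bytes()

    # The UTF-32 LE mark begins with the same bytes as the UTF-16 LE mark,
    # so rule it out first to avoid decoding a UTF-32 file incorrectly.
    if raw.startswith(codecs.BOM_UTF32_LE):
        return False

    # Only touch files that clearly announce themselves as UTF-16.
    if not raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return False

    # The "utf-16" codec reads the byte order mark and strips it.
    text = raw.decode("utf-16")

    # Write bytes directly so line endings are preserved exactly.
    p.write_bytes(text.encode("utf-8"))
    return True
```

You would call it once per `.sql` file, for example from a small script that walks the export directory, and then commit the converted files.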