I was looking inside the Infinity Engine (the one used in games like Baldur’s Gate) file formats, and looking in the file structures I get to the point to ask, why save the engine files in hexadecimal format (I think this is also referred like “binary format”) instead of just plain text?
I have some hypothesis (different to the stackoverflow rule of “avoid your answer is provided along with the question, and you expect more answers”):
- To try to preserve intelectual property of the file (useless).
- To save space, because some of the files can get really big, and at the time (end 1990’s) space wasn’t so big like now.
- To have a structured data (this can also be achieved by using tabs of fix length in flat files).
Hexadecimal is not binary.
In hexadecimal, each character (byte/octet with ASCII coding) can represent 16 values: 0-9 and A-F. It takes two hexadecimal characters to represent 256 values.
In a “binary file”, each byte (octet) can represent 256 values.
Now, the point of using binary is that it is the “most raw” format and does not need to lend itself to human consumption in any way. This greatly increases the data/octet ratio of the file.
Binary formats can aid in “IP”, but this is not necessarily a primary aim. There are plenty of editor tools written by fans (as evidenced), as well as official tools from time to time.
A binary file structure can be determined in offsets and can be seek’ed to directly instead of needing to “read to line X and column Y” as with a text file. This makes random-access of resources much more efficient. In the case of many of the Baulder’s Gate files, many are in a “fixed format” and can be trivially (and quickly) loaded into identically laid-out in-memory structures. (Sometimes heaps and other data-structures will be directly encoded in a binary file.)
A binary file is an opaque structure for data: it is just the means to end. The information inside is only designed to be accessed in specific scenarios. (Compare this with JSON or an SQL Database; the former is not opaque insofar as it is “human consumable” and the latter only exposes information.)
Take a look at the different kinds of binary files and the service that each one provides: audio? images? routes? entity locations and stats? dialog? Each one is crafted to the very specific needs of the engine and, while some “could be done differently”, if there is no need and the existing tooling is in place: why change?