I’m creating an XML schema and need to include some fields of raw binary data. What built-in datatype offers the most space efficient representation? I see base64Binary and hexBinary as two possibilities but they both appear to be string representations of hexadecimal codes and thus not space efficient and also incur a time penalty to encode them. What built-in datatype would offer me the best space and time efficient representation of my binary data?
I’m creating an XML schema and need to include some fields of raw binary
Share
There are no other types out of the box that deal with binary content.
The most efficient is base64, with around a 30% overhead; hex is at least twice the size. It is also assumed that you’re using a predominantly single byte-character set such as utf-8. Encoding XML using utf-16 will see the above numbers double.
The advantage of using these built in types pays off with typical xml to code bindings libraries; for e.g. JAXB will give you a byte[], so encoding/decoding is tranparent to you.
It also depends on how you move/store your XML; if you use a SOAP based serializer with support for binary attachments, then particularly for large sets, it pays off to go down this route.