I have a class that I need to binary serialize. The class contains one field as below:
private T[,] m_data;
These multi-dimensional arrays can be fairly large (hundreds of thousands of elements) and of any primitive type. When I tried standard .NET serialization on an object, the file written to disk was large. I suspect .NET is storing a lot of repeated data about element types, and possibly less efficiently than it could.
I have looked around for custom serializers but have not seen any that deal with multi-dimensional generic arrays. I have also experimented with the built-in .NET compression on the byte array taken from the memory stream after serializing. That worked to a degree, but it was not as quick, or the output as small, as I had hoped.
My question is: should I try to write a custom serializer that optimally serializes this array for the appropriate type (this seems a little daunting), or should I use standard .NET serialization and add compression?
Any advice on the best approach would be most appreciated, as would links to resources showing how to tackle serialization of a multi-dimensional generic array. As mentioned, the existing examples I have found do not support such structures.
Here’s what I came up with. The code below makes an int[1000][10000] and writes it out using the BinaryFormatter to 2 files – one zipped and one not.
The zipped file is 1.19 MB (1,255,339 bytes). Unzipped, it is 38.2 MB (40,150,034 bytes).
I can’t think of a better or easier way to do this. The zipped version is pretty damn tight.
I’d go with the BinaryFormatter + GZipStream. Making something custom would not be fun at all.
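For comparison with the custom route, here is a minimal sketch of what a hand-written format for a primitive T[,] could look like. It is not the code from this answer. The class name Array2DSerializer and its Write/Read methods are hypothetical, and it assumes T is a primitive type (int, double, byte and so on), because Buffer.BlockCopy and Buffer.ByteLength only accept arrays of primitives. It also uses BinaryWriter rather than BinaryFormatter, which is marked obsolete in recent .NET versions. The file layout is simply two Int32 dimensions followed by the raw element bytes, all passed through a GZipStream.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper: stores a primitive T[,] as [rows][cols][raw bytes], gzip-compressed.
public static class Array2DSerializer
{
    public static void Write<T>(Stream output, T[,] data) where T : struct
    {
        // Buffer.ByteLength throws ArgumentException if T is not a primitive type.
        int byteCount = Buffer.ByteLength(data);
        byte[] raw = new byte[byteCount];
        Buffer.BlockCopy(data, 0, raw, 0, byteCount);

        // leaveOpen: true keeps the caller's stream usable after the gzip stream is closed.
        using (var gzip = new GZipStream(output, CompressionMode.Compress, leaveOpen: true))
        using (var writer = new BinaryWriter(gzip))
        {
            writer.Write(data.GetLength(0)); // rows
            writer.Write(data.GetLength(1)); // columns
            writer.Write(raw);               // element bytes, no per-element type info
        }
    }

    public static T[,] Read<T>(Stream input) where T : struct
    {
        using (var gzip = new GZipStream(input, CompressionMode.Decompress, leaveOpen: true))
        using (var reader = new BinaryReader(gzip))
        {
            int rows = reader.ReadInt32();
            int cols = reader.ReadInt32();
            var data = new T[rows, cols];

            int byteCount = Buffer.ByteLength(data);
            byte[] raw = reader.ReadBytes(byteCount);
            if (raw.Length != byteCount)
                throw new EndOfStreamException("Array data is truncated.");

            Buffer.BlockCopy(raw, 0, data, 0, byteCount);
            return data;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var original = new double[3, 4];
        for (int r = 0; r < 3; r++)
            for (int c = 0; c < 4; c++)
                original[r, c] = r * 10 + c;

        using (var ms = new MemoryStream())
        {
            Array2DSerializer.Write(ms, original);
            ms.Position = 0;
            double[,] copy = Array2DSerializer.Read<double>(ms);
            Console.WriteLine(copy[2, 3] == original[2, 3]);
        }
    }
}
```

The caller has to know T when reading the data back. If the element type must travel with the data, one option would be to write a type code ahead of the dimensions.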
[edit by MG] I hope you won’t be offended by an edit, but the uniform repeated Range(0,width) is skewing things vastly; change to:
And try it; you’ll see temp_notZipped.txt at 40MB, temp_zipped.txt at 62MB. Not so appealing…
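To see how much the test data matters, the following sketch compares the gzip-compressed size of an array whose rows all repeat 0..width-1 with one filled from a seeded Random. It is an illustration only, not the snippet from the edit above. The names CompressionDemo, Fill and GzipSize are hypothetical, the seed and dimensions are arbitrary, and it compresses the raw element bytes rather than BinaryFormatter output, so the absolute sizes will differ from the figures quoted.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class CompressionDemo
{
    // Builds a rows x cols array, taking each element from the supplied function.
    public static int[,] Fill(int rows, int cols, Func<int, int, int> value)
    {
        var data = new int[rows, cols];
        for (int r = 0; r < rows; r++)
            for (int c = 0; c < cols; c++)
                data[r, c] = value(r, c);
        return data;
    }

    // Returns the number of bytes produced by gzip-compressing the raw element bytes.
    public static long GzipSize(int[,] data)
    {
        byte[] raw = new byte[Buffer.ByteLength(data)];
        Buffer.BlockCopy(data, 0, raw, 0, raw.Length);

        using (var ms = new MemoryStream())
        {
            using (var gzip = new GZipStream(ms, CompressionMode.Compress, leaveOpen: true))
            {
                gzip.Write(raw, 0, raw.Length);
            } // disposing the gzip stream flushes the final compressed block
            return ms.Length;
        }
    }

    public static void Main()
    {
        const int rows = 100, cols = 1000;      // arbitrary sizes for the demo
        var rand = new Random(123456);          // arbitrary fixed seed

        int[,] repeated = Fill(rows, cols, (r, c) => c);          // every row is 0..cols-1
        int[,] random = Fill(rows, cols, (r, c) => rand.Next());  // no repeated pattern

        Console.WriteLine("raw bytes:       " + Buffer.ByteLength(repeated));
        Console.WriteLine("repeated, gzip:  " + GzipSize(repeated));
        Console.WriteLine("random, gzip:    " + GzipSize(random));
    }
}
```

The repeated rows compress to a small fraction of the raw size, while the random data barely compresses at all. That is the point of the edit: the 1.19 MB figure reflects the test data rather than the method.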