I have huge amont of geographic data represented in simple object structure consisting only structs. All of my fields are of value type.
public struct Child
{
readonly float X;
readonly float Y;
readonly int myField;
}
public struct Parent
{
readonly int id;
readonly int field1;
readonly int field2;
readonly Child[] children;
}
The data is chunked up nicely to small portions of Parent[]-s. Each array contains a few thousands Parent instances. I have way too much data to keep all in memory, so I need to swap these chunks to disk back and forth. (One file would result approx. 2-300KB).
What would be the most efficient way of serializing/deserializing the Parent[] to a byte[] for dumpint to disk and reading back? Concerning speed, I am particularly interested in fast deserialization, write speed is not that critical.
Would simple BinarySerializer good enough?
Or should I hack around with StructLayout (see accepted answer)? I am not sure if that would work with array field of Parent.children.
UPDATE: Response to comments – Yes, the objects are immutable (code updated) and indeed the children field is not value type. 300KB sounds not much but I have zillions of files like that, so speed does matter.
BinarySerializer is a very general serializer. It will not perform as well as a custom implementation.
Fortunately for your, your data consists of structs only. This means that you will be able to fix a structlayout for Child and just bit-copy the children array using unsafe code from a byte[] you have read from disk.
For the parents it is not that easy because you need to treat the children separately. I recommend you use unsafe code to copy the bit-copyable fields from the byte[] you read and deserialize the children separately.
Did you consider mapping all the children into memory using memory mapped files? You could then re-use the operating systems cache facility and not deal with reading and writing at all.
Zero-copy-deserializing a Child[] looks like this: