I have a Java program which builds a fairly complex and large data structure in memory (several GB) and serializes it to disk, and another program which reads the serialized data structure back into memory. I was surprised to notice that the deserialization step is pretty slow, and that it is CPU-bound: top shows 100% CPU usage, but iotop shows only 3 to 5 MB/s of reads, which is very low for what should be sequential reads on a hard drive. The CPU is fairly recent (Core i7-3820), the structure fits in memory, and no swap space is configured.
Why is this so? Is there an alternative way to serialize objects in Java which does not have the CPU as bottleneck?
Here is the deserialization code, in case it matters:
FileInputStream f = new FileInputStream(path);
ObjectInputStream of = new ObjectInputStream(f);
Object obj = of.readObject();
Deserialization is pretty expensive. The generic mechanism (ObjectInputStream) relies heavily on reflection and creates a lot of objects, which is why it ends up CPU-bound rather than disk-bound.
There are lots of faster alternatives, and most of them use generated code instead of reflection:
http://code.google.com/p/thrift-protobuf-compare/wiki/Benchmarking
You will note that one of the fastest is Externalizable, which may be an option for you. It means adding custom methods to your classes for the serialization and deserialization of objects.
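As a rough sketch of what the Externalizable route looks like: the class writes and reads its own fields explicitly, so the stream does not have to discover them by reflection. The class name Point3D, its fields, and the roundTrip helper are made up for illustration; your own classes will differ.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

// Hypothetical example class; not taken from the question.
class Point3D implements Externalizable {
    private double x, y, z;
    private String label;

    // Externalizable requires a public no-arg constructor.
    public Point3D() {}

    public Point3D(double x, double y, double z, String label) {
        this.x = x;
        this.y = y;
        this.z = z;
        this.label = label;
    }

    public double getX() { return x; }
    public double getY() { return y; }
    public double getZ() { return z; }
    public String getLabel() { return label; }

    // Write the fields by hand, in a fixed order.
    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeDouble(x);
        out.writeDouble(y);
        out.writeDouble(z);
        out.writeUTF(label);
    }

    // Read the fields back in exactly the same order.
    @Override
    public void readExternal(ObjectInput in) throws IOException {
        x = in.readDouble();
        y = in.readDouble();
        z = in.readDouble();
        label = in.readUTF();
    }

    // Illustrative helper: serialize to a byte array and read it back.
    static Point3D roundTrip(Point3D p) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(p);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Point3D) in.readObject();
        }
    }
}
```

The calling code stays the same (writeObject / readObject); only the per-class work changes. How much this helps depends on the shape of your data, so it is worth measuring on a sample before converting everything.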
I have written much faster approaches, but these avoid creating any objects, either by recycling them or by using the data in the file in place (i.e. without needing to deserialize it).
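To give an idea of what "using the data in place" can mean, here is a minimal sketch based on a memory-mapped file of fixed-width values. It is not the implementation referred to above, only one way to apply the idea; the class MappedDoubles and its methods are hypothetical names. Values are read straight out of the mapped buffer, so no per-element objects are created.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical example: a read-only view over a file of 8-byte doubles.
class MappedDoubles {
    private final MappedByteBuffer buf;

    MappedDoubles(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping stays valid after the channel is closed.
            buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
        }
    }

    // Number of doubles stored in the file.
    int size() {
        return buf.capacity() / Double.BYTES;
    }

    // Absolute read at a byte offset; no object is allocated.
    double get(int i) {
        return buf.getDouble(i * Double.BYTES);
    }

    // Illustrative writer producing the layout that get() expects.
    static void write(Path file, double[] values) throws IOException {
        ByteBuffer b = ByteBuffer.allocate(values.length * Double.BYTES);
        for (double v : values) {
            b.putDouble(v);
        }
        Files.write(file, b.array());
    }
}
```

A single mapping is limited to 2 GB, so a structure of several GB would need to be split across multiple mappings, and anything more complex than flat records needs a layout you design yourself. That is the trade-off: much less CPU work on load, in exchange for giving up ordinary object graphs.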