First of all, I have a huge XML file that represents data collected by a piece of equipment. I convert it into an object. This object holds a list of objects, and each of those objects has three strings in it. The strings look like this:
0,12987;0,45678;…
Each string is a list of doubles, arranged this way for performance reasons. There are 1k doubles in each string, so 3k per object, and there are something like 3k objects, just to give you an idea of a typical case.
When I read the data, I must get all the doubles from the objects and add them to the same list. I made an “object that contains three doubles” (one for each string). In a foreach, I take every object and split its strings into arrays. After that, I loop to turn my arrays into a list of “objects that contain three doubles” and I add it all to one list so I can use it for further operations.
It causes an out of memory exception before the end. Ideas? Something with LINQ would be best.
What I have looks like this:

Let’s do some math. 1000 values per string * 8 characters per value (6 digits plus a comma and a semicolon) * 2 bytes per character * 3 strings per object = 48,000 bytes per object. That by itself isn’t a lot, and even with 3000 objects we’re still only talking about around 150MB of RAM. That’s still nothing to a modern system. Converting to double arrays takes even less, as there are only 8 bytes per value rather than 16. Strings are also reference types, so there would have been overhead for that in the string version as well. The important thing is that no matter how you slice it, you’re still well short of the 85,000 byte threshold for these to get stuck on the Large Object Heap, the normal source of OutOfMemoryException.
Without code it’s hard to follow exactly what you’re doing, but I have a couple different guesses:
Either way, what you want to do here is stop thinking in terms of lists and start thinking in terms of sequences. Rather than List<string> or List<double>, try for an IEnumerable<string> and IEnumerable<double>. Write a parser for your string that uses the yield keyword to create an iterator block that will extract doubles from your string one at a time by iterating over the characters, without ever changing the string. This will perform way better, and likely fix your memory issue as well.
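Since the original code isn't shown, here is only a rough sketch of what such an iterator block could look like, not a drop-in replacement. The names DoubleSequence, Parse and Triples are made up for illustration. It assumes the format from the sample string: ',' as the decimal separator, ';' as the delimiter, an optional leading '-', and no exponents.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper; the names and the exact input rules are assumptions.
public static class DoubleSequence
{
    // Walks the string one character at a time and yields each value as soon
    // as its ';' is reached. No Split, no substrings, no intermediate arrays.
    public static IEnumerable<double> Parse(string text)
    {
        double digits = 0;   // all digits seen so far, read as one integer
        double scale = 1;    // 10^(number of digits after the ',')
        bool inFraction = false, negative = false, hasDigits = false;

        foreach (char c in text)
        {
            if (c >= '0' && c <= '9')
            {
                digits = digits * 10 + (c - '0');
                if (inFraction) scale *= 10;
                hasDigits = true;
            }
            else if (c == ',')
            {
                inFraction = true;
            }
            else if (c == '-')
            {
                negative = true;
            }
            else if (c == ';')
            {
                if (hasDigits) yield return (negative ? -digits : digits) / scale;
                digits = 0; scale = 1;
                inFraction = negative = hasDigits = false;
            }
            else if (!char.IsWhiteSpace(c))
            {
                throw new FormatException("Unexpected character '" + c + "'.");
            }
        }

        // Last value when the string does not end with ';'.
        if (hasDigits) yield return (negative ? -digits : digits) / scale;
    }

    // Lazily pairs up the three strings of one object into (X, Y, Z) values.
    // Zip stops at the shortest sequence, so mismatched lengths are truncated.
    public static IEnumerable<(double X, double Y, double Z)> Triples(
        string xs, string ys, string zs)
    {
        return Parse(xs)
            .Zip(Parse(ys), (x, y) => (X: x, Y: y))
            .Zip(Parse(zs), (xy, z) => (xy.X, xy.Y, z));
    }
}
```

With something like this, each object's three strings can be consumed in a foreach, or chained with SelectMany across all the objects, and nothing is materialized until you ask for it. Only call ToList() at the very end, and only if the later operations really need random access.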