I know this question has been asked before, but I can’t get it working with the answers I’ve read. I have a CSV file of about 1.2 GB. If I run the process as 32-bit, I get an OutOfMemoryException. It works if I run it as a 64-bit process, but it still takes 3.4 GB of memory. I know that I’m storing a lot of data in my CustomData class, but 3.4 GB of RAM? Am I doing something wrong when reading the file?
dict is a dictionary that maps each column index to the name of the property the value should be stored in. Am I reading the file the right way?
    StreamReader reader = new StreamReader(File.OpenRead(path));
    while (!reader.EndOfStream) {
        String line = reader.ReadLine();
        String[] values = line.Split(';');
        CustomData data = new CustomData();
        string value;
        for (int i = 0; i < values.Length; i++) {
            dict.TryGetValue(i, out value);
            Type targetType = data.GetType();
            PropertyInfo prop = targetType.GetProperty(value);
            if (values[i] == null)
            {
                prop.SetValue(data, "NULL", null);
            }
            else
            {
                prop.SetValue(data, values[i], null);
            }
        }
        dataList.Add(data);
    }
There doesn’t seem to be anything wrong with how you use the stream reader: you read a line into memory, then let go of it.
However, in C# a string is encoded in memory as UTF-16, so on average each character takes 2 bytes.
If your CSV also contains a lot of empty fields that you convert to "NULL", you add up to 7 bytes for each empty field. So on the whole, since you basically store all the data from your file in memory, it’s not really surprising that you need almost 3 times the size of the file in memory.
The actual solution is to parse your data in chunks of N lines, process them, and free them from memory.
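A minimal sketch of that chunked approach, assuming the per-chunk work can be done without keeping earlier chunks around. The names ChunkedReader and ReadInBatches are made up for illustration and are not part of any library:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class ChunkedReader
{
    // Hypothetical helper: yields the lines of a reader in batches of
    // at most batchSize lines. Only one batch is built at a time, so a
    // batch can be garbage-collected once the caller is done with it.
    public static IEnumerable<List<string>> ReadInBatches(TextReader reader, int batchSize)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));

        var batch = new List<string>(batchSize);
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            batch.Add(line);
            if (batch.Count == batchSize)
            {
                yield return batch;
                // Start a fresh list instead of reusing the one just handed out.
                batch = new List<string>(batchSize);
            }
        }

        // Hand out the last, possibly shorter, batch.
        if (batch.Count > 0) yield return batch;
    }
}
```

You would wrap a `new StreamReader(path)` in a `using` block, loop over `ChunkedReader.ReadInBatches(reader, 10000)`, and process each batch inside the loop. This only helps if you do not add every parsed object to a list that lives for the whole run, as dataList does in the question.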
Note: Consider using a CSV parser. There is more to CSV than just commas or semicolons: what if one of your fields contains a semicolon, a newline, a quote…?
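One parser that ships with .NET is TextFieldParser in the Microsoft.VisualBasic.FileIO namespace. The sketch below is only one way to use it, and it assumes your project can reference the Microsoft.VisualBasic assembly. CsvSketch and ReadRecords are illustrative names, not an existing API:

```csharp
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

static class CsvSketch
{
    // Hypothetical helper: reads semicolon-delimited records one at a time,
    // letting the parser deal with quoted fields.
    public static IEnumerable<string[]> ReadRecords(TextReader reader)
    {
        using (var parser = new TextFieldParser(reader))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(";");
            // A field such as "b;c" in quotes is returned as one value.
            parser.HasFieldsEnclosedInQuotes = true;

            while (!parser.EndOfData)
            {
                yield return parser.ReadFields();
            }
        }
    }
}
```

Because the records are yielded one by one, this can replace the line.Split(';') call in the question without loading the whole file first.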
Edit
Actually, each string takes up to 20+(N/2)*4 bytes in memory, where N is the number of characters; see C# in Depth.
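To get a feel for what that formula means for short CSV fields, here is a small sketch that just evaluates it. It assumes N/2 is rounded down and takes the formula at face value; the real figure depends on the runtime and on whether the process is 32-bit or 64-bit. StringSizeEstimate and EstimateBytes are illustrative names:

```csharp
static class StringSizeEstimate
{
    // Evaluates 20 + (N/2)*4 for a string of charCount characters.
    // Integer division rounds N/2 down (an assumption about the formula).
    public static long EstimateBytes(int charCount)
    {
        return 20 + (charCount / 2) * 4L;
    }
}
```

For a 10-character field this gives 20 + 5*4 = 40 bytes, compared with about 10 bytes in the file if the file uses a single-byte encoding. The fixed overhead per string is what hurts most when a file consists of many short fields.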