I have a service running which periodically checks a folder for a file and then processes it. (Reads it, extracts the data, stores it in sql)
So I ran it on a test box and it took a little longer thaan expected.
The file had 1.6 million rows, and it was still running after 6 hours (then I went home).
The problem is the box it is running on is now absolutely crippled – remote desktop was timing out so I cant even get on it to stop the process, or attach a debugger to see how far through etc.
It’s solidly using 90%+ CPU, and all other running services or apps are suffering.
The code is (from memory, may not compile):
List<ItemDTO> items = new List<ItemDTO>();
using (StreamReader sr = fileInfo.OpenText())
{
while (!sr.EndOfFile)
{
string line = sr.ReadLine()
try {
string s = line.Substring(0,8);
double y = Double.Parse(line.Substring(8,7));
//If the item isnt already in the collection, add it.
if (items.Find(delegate(ItemDTO i) { return (i.Item == s); }) == null)
items.Add(new ItemDTO(s,y));
}
catch { /*Crash*/ }
}
return items;
}
–
So I am working on improving the code (any tips appreciated).
But it still could be a slow affair, which is fine, I’ve no problems with it taking a long time as long as its not killing my server.
So what I want from you fine people is:
1) Is my code hideously un-optimized?
2) Can I limit the amount of CPU my code block may use?
Cheers all
Doing the find on the list is an O(n) operation, that means that as the list gets longer it takes longer to search for the items. You could consider putting the items into a HashSet in .NET 4.0/3.5 or use a Dictionary for earlier versions of .NET which can act like an index, if you need the items in the list to maintain the original order you can continue to put them in the list, but use the HashSet/Dictionary to do the checks.
You could also run this code in a BackgroundWorker thread this will help keep the UI responsive while the process is running.