I have built an application that generates a large XML file using Linq-to-Entities and XElement. It takes a whole core of our 2GHz server for about half an hour and uses ~1GB of memory.
I’m doing the following type of work:
    var xml = from x in dbContext.Table1
              select new XElement("Table1",
                  new XElement("Field1", x.Field1),
                  new XElement("Field2", x.Field2),
                  new XElement("Field3", x.Field3),
                  new XElement("MoreFields",
                      new XElement("FieldA", x.MoreFields.FieldA),
                      new XElement("FieldA", x.MoreFields.FieldA),
                      new XElement("FieldA", x.MoreFields.FieldA.DoSomeWorkWithThisField())
                  )
              );
I have another level or two of depth, and several of the fields have extra work done on them, such as parsing an int out of a string using Regex.Match().
Does anyone have any optimization or refactoring recommendations? I tried using XStreamingElement but it didn’t seem to make any difference.
It looks like you’re fetching the whole of Table1 multiple times. Can you fetch it to a List<T> and then use that repeatedly?
I suspect there’s rather more involved than this, but basically: fetch what you need into memory up-front, and then work from that. If you only need to use an item of data once, it’s fine to fetch it only when you need it, but avoid pulling the same data time and time again.
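As a rough sketch of the "fetch once, then reuse" idea: the snippet below has no database, so `QueryTable1` is a hypothetical stand-in for `dbContext.Table1` that counts how many times it is enumerated. The `Root` element name and the sample rows are also made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical stand-in for dbContext.Table1: every enumeration counts as one "fetch".
int fetches = 0;
IEnumerable<(int Field1, string Field2)> QueryTable1()
{
    fetches++;
    yield return (1, "a");
    yield return (2, "b");
}

// Fetch once, up front, into a List<T>.
var rows = QueryTable1().ToList();

// Everything below reads from the in-memory list and does not trigger another fetch.
var root = new XElement("Root",
    from x in rows
    select new XElement("Table1",
        new XElement("Field1", x.Field1),
        new XElement("Field2", x.Field2)));
int rowCount = rows.Count;

Console.WriteLine(root);
Console.WriteLine($"rows: {rowCount}, fetches: {fetches}");
```

If `rows` were left as the deferred query instead of a list, each additional pass over it would run the query again, which is the repeated fetching to avoid.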
How big is the document you’re generating, and how much memory does your machine have? You might want to try looking at the performance counters – it’s possible that you’re spending most of the time garbage collecting.
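If the Windows performance counters are awkward to get at, a quick in-process check can give a first impression of how much collecting is going on. The sketch below uses `GC.CollectionCount` and `GC.GetTotalMemory` rather than the performance counters themselves, and `BuildXml` is a hypothetical workload standing in for the real XML generation.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Xml.Linq;

// Hypothetical workload standing in for the real XML generation.
XElement BuildXml(int rows) =>
    new XElement("Root",
        from i in Enumerable.Range(0, rows)
        select new XElement("Table1",
            new XElement("Field1", i),
            new XElement("Field2", "value " + i)));

// Snapshot the collection count of every GC generation before the work starts.
int[] before = Enumerable.Range(0, GC.MaxGeneration + 1)
                         .Select(g => GC.CollectionCount(g))
                         .ToArray();

var timer = Stopwatch.StartNew();
var doc = BuildXml(100_000);
timer.Stop();

// Report how many collections of each generation happened during the work.
for (int gen = 0; gen <= GC.MaxGeneration; gen++)
    Console.WriteLine($"gen {gen} collections: {GC.CollectionCount(gen) - before[gen]}");

Console.WriteLine($"elapsed: {timer.ElapsedMilliseconds} ms");
Console.WriteLine($"managed heap: {GC.GetTotalMemory(false) / (1024 * 1024)} MB");
```

A large number of gen 2 collections relative to the elapsed time would suggest the garbage collector is worth a closer look; the counts alone do not show how much time was actually spent collecting.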