I am trying to do a merge sort on sorted chunks of XML files on disk. There is no chance that they all fit in memory. My XML files consist of records.
Say I have n XML files. If I had enough memory, I would read the entire contents of each file into a corresponding Queue (one queue per file), compare the timestamps of the items at the head of each queue, and output the one with the smallest timestamp to another file (the merge file). This way, I merge all the little files into one big file with all the entries time-sorted.
The problem is that I don’t have enough memory to read each XML file in full with .ReadToEnd and then pass the text to XDocument.Parse.
Is there a clean way to read just enough records to keep each of the Queues filled for the next pass, which compares the XElement attribute “TimeStamp” of their head records, while keeping track of how far each file has already been read from disk?
Thank you.
If you like the LINQ to XML API, this CodePlex project may suit your needs.
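If you would rather not pull in a library, the streaming merge described in the question can be roughly sketched with only XmlReader and XNode.ReadFrom. This is a minimal sketch, not the CodePlex project's API. It assumes each file has a single root element whose record elements share one name, that no nested element reuses that name, and that every record carries a “TimeStamp” attribute in a format the XAttribute-to-DateTime conversion accepts (ISO 8601 style). The names XmlMergeSketch, StreamRecords, Merge and Stamp are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;
using System.Xml.Linq;

static class XmlMergeSketch
{
    // Lazily yields one record at a time, so only the current
    // record of each file is held in memory.
    public static IEnumerable<XElement> StreamRecords(string path, string recordName)
    {
        using (XmlReader reader = XmlReader.Create(path))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == recordName)
                {
                    // ReadFrom consumes the whole element and leaves the reader
                    // on the node after it, so no extra Read() is needed here.
                    yield return (XElement)XNode.ReadFrom(reader);
                }
                else
                {
                    reader.Read();
                }
            }
        }
    }

    // Assumption: the attribute is named "TimeStamp" and parses as a DateTime.
    static DateTime Stamp(XElement record)
    {
        return (DateTime)record.Attribute("TimeStamp");
    }

    // k-way merge: keep one open enumerator per file and repeatedly
    // write out the record with the smallest timestamp.
    public static void Merge(IEnumerable<string> inputPaths, string outputPath,
                             string rootName, string recordName)
    {
        var heads = new List<IEnumerator<XElement>>();
        try
        {
            foreach (string path in inputPaths)
            {
                IEnumerator<XElement> e = StreamRecords(path, recordName).GetEnumerator();
                if (e.MoveNext()) heads.Add(e); else e.Dispose();
            }

            using (XmlWriter writer = XmlWriter.Create(outputPath))
            {
                writer.WriteStartElement(rootName);
                while (heads.Count > 0)
                {
                    // Linear scan for the earliest head record.
                    int min = 0;
                    for (int i = 1; i < heads.Count; i++)
                    {
                        if (Stamp(heads[i].Current) < Stamp(heads[min].Current)) min = i;
                    }

                    heads[min].Current.WriteTo(writer);

                    // Advance that file; drop it once it is exhausted.
                    if (!heads[min].MoveNext())
                    {
                        heads[min].Dispose();
                        heads.RemoveAt(min);
                    }
                }
                writer.WriteEndElement();
            }
        }
        finally
        {
            // Closes any files still open if something threw part-way through.
            foreach (IEnumerator<XElement> e in heads) e.Dispose();
        }
    }
}
```

A call might look like `XmlMergeSketch.Merge(new[] { "chunk1.xml", "chunk2.xml" }, "merged.xml", "Records", "Record");`, where the file and element names are placeholders. The enumerators play the role of the queues from the question: each one remembers its own position in its file, so nothing needs to be tracked by hand. The linear scan over the head records is fine for a handful of files; with a very large number of chunks a heap keyed on the timestamp would probably be the better choice.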