I’ve got some XML files which are larger than available memory, and a large (!) codebase that assumes it can operate on such a file through a DOM structure. Some users have reported OutOfMemoryExceptions on large inputs, and in some cases the XML is larger than the address space available on 32-bit processors.
Is there a DOM implementation out there which can deal with this case, and only “hydrates” child objects as necessary in order to achieve reasonable memory use with enormous XML files?
There’s a great solution outlined in a two-part post by the MS XmlTeam: it keeps the benefits of LINQ to XML while streaming the file instead of loading the whole thing into memory. After many blind alleys and dead ends, this was the solution I settled on when reading XML files larger than 10 GB from database dumps.
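The post itself isn't reproduced here, so the following is only a minimal sketch of the general streaming idea, not necessarily the XmlTeam's exact code. It walks the file with an `XmlReader` and materializes one `XElement` at a time via `XNode.ReadFrom`, so only the current record is held in memory. The names `XmlStreaming`, `StreamElements` and the `row` element are hypothetical, chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper: not part of the .NET framework or of the original post.
static class XmlStreaming
{
    // Lazily yields each element whose local name is elementName as a
    // standalone XElement. Only one such element is materialized at a time.
    public static IEnumerable<XElement> StreamElements(TextReader input, string elementName)
    {
        using (XmlReader reader = XmlReader.Create(input))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.LocalName == elementName)
                {
                    // ReadFrom consumes the whole element and leaves the reader
                    // on the node that follows it, so do not call Read() here.
                    yield return (XElement)XNode.ReadFrom(reader);
                }
                else
                {
                    reader.Read();
                }
            }
        }
    }
}
```

Each yielded element can then be queried with ordinary LINQ to XML. For a real dump you would pass a `StreamReader` over the file instead of a `StringReader`:

```csharp
string xml = "<rows><row id=\"1\"/><row id=\"2\"/></rows>";
int total = XmlStreaming.StreamElements(new StringReader(xml), "row")
                        .Sum(e => (int)e.Attribute("id"));
```

Note that this only helps code that can process records one by one. Code that needs random access to the entire document as a DOM would still have to be adapted.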