I’d like to do some DOM-style processing on a very large XML file to convert some nodes into others.
This is an example of what I have:
...
<node>
<stuff>text-and-numbers</stuff>
</node>
...
And this is what I need to output:
...
<node>
<info>some text</info>
<more>some text</more>
<id>some text</id>
</node>
...
All the information inside the output <node>...</node> is extracted programmatically by processing the <stuff>text-and-numbers</stuff> of the input. That is, I have a function getInfo(someText) that returns the content of the output’s <node>.
I already have code that does this the DOM way, but the XML is so big that it needs too much memory, so I’d like to do it another way.
I hope someone can help me.
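I think you should look into SAX or StAX. Both are streaming APIs, so neither needs to keep the whole DOM tree in memory: SAX pushes parse events to your handler callbacks, while StAX is a pull parser where your code asks for the next event.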
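Here is a rough sketch of the StAX route, using the event API in javax.xml.stream. It copies every event straight from input to output and only steps in when it sees a <stuff> element, so memory use should stay roughly constant whatever the file size. The class name NodeRewriter and the method rewrite are made up for illustration. The getInfo below is only a stand-in for your real function: I am assuming it can return the output fields as an ordered map of element name to text, which may not match what you actually have.

```java
import java.io.Reader;
import java.io.Writer;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper class; the names here are illustrative only.
final class NodeRewriter {

    // Placeholder for the asker's getInfo(someText). The return type
    // (element name -> text, in output order) is an assumption.
    static Map<String, String> getInfo(String someText) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("info", someText);
        fields.put("more", someText);
        fields.put("id", someText);
        return fields;
    }

    // Streams the document from 'in' to 'out', replacing each
    // <stuff>text</stuff> with the elements produced by getInfo(text).
    static void rewrite(Reader in, Writer out) throws XMLStreamException {
        XMLEventReader reader = XMLInputFactory.newInstance().createXMLEventReader(in);
        XMLEventWriter writer = XMLOutputFactory.newInstance().createXMLEventWriter(out);
        XMLEventFactory events = XMLEventFactory.newInstance();
        try {
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.isStartElement()
                        && "stuff".equals(event.asStartElement().getName().getLocalPart())) {
                    // Reads the text content and consumes the matching </stuff>.
                    String text = reader.getElementText();
                    for (Map.Entry<String, String> field : getInfo(text).entrySet()) {
                        writer.add(events.createStartElement("", "", field.getKey()));
                        writer.add(events.createCharacters(field.getValue()));
                        writer.add(events.createEndElement("", "", field.getKey()));
                    }
                } else {
                    // Everything else is passed through unchanged.
                    writer.add(event);
                }
            }
        } finally {
            writer.close();
            reader.close();
        }
    }
}
```

For a real file you would pass a buffered file reader and writer (or use the InputStream/OutputStream overloads of the factories so the parser handles the encoding declared in the document). Only the current event and the text of one <stuff> element are held in memory at a time.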