As the title says it, I have a huge xml file (GBs) <root> <keep>


Asked: May 14, 2026 (2026-05-14T16:34:50+00:00)


As the title says, I have a huge XML file (several GB):

<root>  
<keep>  
   <stuff>  ...  </stuff>  
   <morestuff> ... </morestuff>  
</keep>  
<discard>  
   <stuff>  ...  </stuff>  
   <morestuff> ... </morestuff>
</discard>  
</root>

and I'd like to transform it into a much smaller one which retains only a few of the elements.
My parser should do the following:
1. Parse through the file until a relevant element starts.
2. Copy the whole relevant element (with children) to the output file. Go to 1.

Step 1 is easy with SAX but not feasible with a DOM parser, which would have to hold the whole file in memory.
Step 2 is annoying with SAX, but easy with a DOM parser or XSLT.

So, is there a neat way to combine SAX and a DOM parser to do the task?


1 Answer

Editorial Team · Answer 1 · 2026-05-14T16:34:50+00:00

Yes, just write a SAX content handler, and when it encounters one of the elements you want, build a DOM tree for that element. I've done this with very large files, and it works very well.

It's actually very easy: as soon as you encounter the start of the element you want, you set a flag in your content handler, and from there on you forward every event to the DOM builder. When you encounter the end of the element, you set the flag to false and write out the result.

(For more complex cases with nested elements of the same element name, you’ll need to create a stack or a counter, but that’s still quite easy to do.)
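
The question doesn't name a language, so here is a minimal sketch of the idea in Python, using only the standard library (xml.sax plus xml.dom.minidom). The names SubtreeFilter and filter_xml and the root parameter are made up for illustration. Instead of a boolean flag it keeps a stack of open DOM elements, which also covers the nested case mentioned above. Treat it as one possible way to wire this up, not a tuned implementation.

```python
import io
import xml.sax
from xml.dom import minidom


class SubtreeFilter(xml.sax.ContentHandler):
    """Copies every `target` element (with children) from a SAX stream to `out`."""

    def __init__(self, target, out):
        super().__init__()
        self.target = target            # element name to keep (hypothetical parameter)
        self.out = out                  # writable text stream
        self.doc = minidom.Document()   # used only as a node factory
        self.stack = []                 # open DOM elements; empty means "skipping"

    def startElement(self, name, attrs):
        if not self.stack and name != self.target:
            return                      # outside a relevant element: ignore
        node = self.doc.createElement(name)
        for key, value in attrs.items():
            node.setAttribute(key, value)
        if self.stack:
            self.stack[-1].appendChild(node)
        self.stack.append(node)

    def characters(self, content):
        if self.stack:
            self.stack[-1].appendChild(self.doc.createTextNode(content))

    def endElement(self, name):
        if not self.stack:
            return
        node = self.stack.pop()
        if not self.stack:              # the outermost kept element just closed
            self.out.write(node.toxml())
            self.out.write("\n")
            node.unlink()               # drop the subtree before moving on


def filter_xml(source, target, out, root="root"):
    """Stream `source` (file name or file object) and keep only `target` subtrees."""
    out.write("<%s>\n" % root)
    xml.sax.parse(source, SubtreeFilter(target, out))
    out.write("</%s>\n" % root)


if __name__ == "__main__":
    sample = b"<root><keep><stuff>a</stuff></keep><discard><stuff>b</stuff></discard></root>"
    buf = io.StringIO()
    filter_xml(io.BytesIO(sample), "keep", buf)
    print(buf.getvalue())
```

Only one kept subtree is in memory at a time, so memory use depends on the largest relevant element rather than on the file size. For a real file you would pass a file name and an open output file instead of the in-memory buffers.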
