I’m currently using SAX (Java) to parse a handful of different XML documents, each representing different data and having a slightly different structure. For this reason, each XML document is handled by a different SAX class (subclassing DefaultHandler).
However, there are some XML structures that can appear in all these different documents. Ideally, I’d like to tell the parser “Hey, when you reach a complex_node element, just use ComplexNodeHandler to read it, and give me back the result. If you reach a some_other_node, use OtherNodeHandler to read it and give me back that result”.
I can’t see an obvious way to do this, though.
Should I just make a monolithic handler class that can read all the different documents I have (and so eliminate the duplicated code), or is there a smarter way to handle this?
Below is an answer I made to a similar question (Skipping nodes with sax). It demonstrates how to swap content handlers on an XMLReader.
In this example, the swapped-in ContentHandler simply ignores all events until it gives up control, but you could easily adapt the concept.
You could do something like the following:
MyContentHandler
This class is responsible for processing your XML document. When you hit a node you want to ignore, you can swap in the IgnoringContentHandler, which will swallow all events for that node.
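A minimal sketch of how this swap might look, using only the SAX classes that ship with the JDK. The element name `skip`, the wrapper class `SwapDemo`, its `parse` helper, and the `seen` list are assumptions made for illustration. A bare-bones IgnoringContentHandler is included so the example compiles on its own.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class SwapDemo {

    /** Main handler: records element names, but hands "skip" subtrees (hypothetical name) to another handler. */
    static class MyContentHandler extends DefaultHandler {
        final List<String> seen = new ArrayList<>();
        private final XMLReader reader;

        MyContentHandler(XMLReader reader) {
            this.reader = reader;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if ("skip".equals(qName)) {
                // From here on, the reader sends its events to the ignoring handler.
                reader.setContentHandler(new IgnoringContentHandler(reader, this));
            } else {
                seen.add(qName);
            }
        }
    }

    /** Swallows every event of one subtree, then restores the previous handler. */
    static class IgnoringContentHandler extends DefaultHandler {
        private final XMLReader reader;
        private final ContentHandler parent;
        private int depth = 1; // the element that triggered the swap is already open

        IgnoringContentHandler(XMLReader reader, ContentHandler parent) {
            this.reader = reader;
            this.parent = parent;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            depth++;
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (--depth == 0) {
                reader.setContentHandler(parent); // subtree finished, give control back
            }
        }
    }

    /** Parses the given XML and returns the element names the main handler saw. */
    static List<String> parse(String xml) {
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            MyContentHandler handler = new MyContentHandler(reader);
            reader.setContentHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
            return handler.seen;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Note that the handler doing the swap needs a reference to the XMLReader, which is why it is passed in through the constructor here.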
IgnoringContentHandler
When the IgnoringContentHandler is done swallowing events it passes control back to your main ContentHandler.
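To adapt the same idea to the question, the swapped-in handler could collect data instead of discarding it, and hand a result back just before it returns control. The following is only a sketch under assumptions: `DocumentHandler`, the `Consumer` callback, the `DelegateDemo` wrapper, and the choice of "the text content" as the result are all hypothetical. Only `complex_node` and the name ComplexNodeHandler come from the question.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class DelegateDemo {

    /** Reusable handler: gathers all text inside one complex_node, reports it, then returns control. */
    static class ComplexNodeHandler extends DefaultHandler {
        private final XMLReader reader;
        private final ContentHandler parent;
        private final Consumer<String> onResult; // hypothetical callback for "giving back the result"
        private final StringBuilder text = new StringBuilder();
        private int depth = 1; // complex_node itself is already open

        ComplexNodeHandler(XMLReader reader, ContentHandler parent, Consumer<String> onResult) {
            this.reader = reader;
            this.parent = parent;
            this.onResult = onResult;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            depth++;
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (--depth == 0) {
                onResult.accept(text.toString().trim());
                reader.setContentHandler(parent);
            }
        }
    }

    /** One of several document-specific handlers; each could reuse ComplexNodeHandler the same way. */
    static class DocumentHandler extends DefaultHandler {
        final List<String> results = new ArrayList<>();
        private final XMLReader reader;

        DocumentHandler(XMLReader reader) {
            this.reader = reader;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if ("complex_node".equals(qName)) {
                reader.setContentHandler(new ComplexNodeHandler(reader, this, results::add));
            }
        }
    }

    /** Parses the given XML and returns one collected result per complex_node. */
    static List<String> parse(String xml) {
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            DocumentHandler handler = new DocumentHandler(reader);
            reader.setContentHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
            return handler.results;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this shape, the shared parsing logic lives in one class, and each document-specific handler only decides when to delegate and what to do with the result.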