I am trying to accomplish the following:
- load a document (done)
- go through the document depth first and use a DefaultHandler from the JDK to do some work
The reason I want to do this is that I already have my handler, and I currently use it with a SAX parser. I now want to use the same handler on the in-memory document.
Note why this is useful: I have to use the handler multiple times. For large documents I want to use SAX; for small ones I want to use the in-memory representation.
Thanks!
The quickest way (quick in coding) to accomplish this is to write the portion of the internal document that you wish to parse with SAX into an in-memory string, and then, using a StringReader based on that string, pass it to a SAX parser along with your handler.
What you really need is to generate SAX events based on your data and feed those events to the handler. You can do that by getting the data into the form of an InputSource or Reader and then using that in your parse, which is the tactic described above, or you can simply simulate the SAX events by directly calling the methods of the ContentHandler you've already written. But calling them in the right order and feeding them the right data to accomplish what you need may be painful if your document is at all complex.
If Dom4J provides a way to create an InputSource based on a node in your document structure, that will be the easiest to use, and likely much more efficient than writing it to a string first.
You might do better to extract the portions of your ContentHandler that do the actual work into a separate class that you can use both from the ContentHandler and from a new class that walks the internal tree.
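
For what it's worth, here is a rough sketch of the options above using only JDK classes. It assumes the in-memory document is an org.w3c.dom.Document rather than a Dom4J one, so the tree-walking calls would need adapting to your document type. RecordingHandler is a made-up stand-in for your existing handler, and the class and method names (DomToSax, viaString, viaSaxResult, walk) are mine, not from any library. The hand-written walker ignores namespaces, comments and processing instructions.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

public class DomToSax {

    /** Hypothetical stand-in for the existing handler: records events in order. */
    public static class RecordingHandler extends DefaultHandler {
        public final StringBuilder seen = new StringBuilder();
        @Override public void startElement(String uri, String local, String qName, Attributes a) {
            seen.append('<').append(qName).append('>');
        }
        @Override public void characters(char[] ch, int start, int length) {
            seen.append(ch, start, length);
        }
        @Override public void endElement(String uri, String local, String qName) {
            seen.append("</").append(qName).append('>');
        }
    }

    /** Option 1: serialize the node to a string, then re-parse it through a StringReader. */
    public static void viaString(Node node, DefaultHandler handler) throws Exception {
        StringWriter out = new StringWriter();
        TransformerFactory.newInstance().newTransformer()
                .transform(new DOMSource(node), new StreamResult(out));
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(out.toString())), handler);
    }

    /** Option 2: let the JDK identity transformer walk the tree and fire the SAX events. */
    public static void viaSaxResult(Node node, ContentHandler handler) throws Exception {
        TransformerFactory.newInstance().newTransformer()
                .transform(new DOMSource(node), new SAXResult(handler));
    }

    /** Option 3: depth-first walk that calls the handler methods by hand. */
    public static void walk(Node node, ContentHandler h) throws SAXException {
        switch (node.getNodeType()) {
            case Node.DOCUMENT_NODE: {
                h.startDocument();
                for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) walk(c, h);
                h.endDocument();
                break;
            }
            case Node.ELEMENT_NODE: {
                String name = node.getNodeName();
                AttributesImpl atts = new AttributesImpl();
                NamedNodeMap map = node.getAttributes();
                for (int i = 0; i < map.getLength(); i++) {
                    Node a = map.item(i);
                    atts.addAttribute("", a.getNodeName(), a.getNodeName(), "CDATA", a.getNodeValue());
                }
                h.startElement("", name, name, atts);
                for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) walk(c, h);
                h.endElement("", name, name);
                break;
            }
            case Node.TEXT_NODE:
            case Node.CDATA_SECTION_NODE: {
                char[] ch = node.getNodeValue().toCharArray();
                h.characters(ch, 0, ch.length);
                break;
            }
            default:
                break; // comments, processing instructions etc. are skipped in this sketch
        }
    }

    /** Helper for the demo: builds a namespace-aware W3C DOM from a string. */
    public static Document parse(String xml) throws Exception {
        DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
        f.setNamespaceAware(true);
        return f.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<a><b>hi</b><c/></a>");
        RecordingHandler handler = new RecordingHandler();
        walk(doc, handler);
        System.out.println(handler.seen);
    }
}
```

Option 2 is probably the least code if you are on a W3C DOM, since the transformer does the ordering for you. If I remember right, Dom4J ships its own org.dom4j.io.SAXWriter for the same job, but check the Dom4J docs before relying on that.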