In an open source project I maintain, we have at least three different ways of reading, processing and writing XML files and I would like to standardise on a single method for ease of maintenance and stability.
Currently all of the project files use XML from the configuration to the stored data, we’re hoping to migrate to a simple database at some point in the future but will still need to read/write some form of XML files.
The data is stored in an XML format that we then use a XSLT engine (Saxon) to transform into the final HTML files.
We currently utilise these methods:
– XMLEventReader/XMLOutputFactory (javax.xml.stream)
– DocumentBuilderFactory (javax.xml.parsers)
– JAXBContext (javax.xml.bind)
Are there any obvious pros and cons to each of these?
Personally, I like the simplicity of DOM (Document Builder), but I’m willing to convert to one of the others if it makes sense in terms of performance or other factors.
Edited to add:
There can be a significant number of files read/written when the project runs, between 100 & 10,000 individual files of around 5Kb each
It depends on what you are doing with the data.
If you are simply performing XSLT transforms on XML files to produce HTML files then you may not need to touch a parser directly:
If you need to make changes to the input document before you transform it, DOM is a convenient mechanism for doing this:
If you prefer a typed model to make changes to the data then JAXB is a perfect fit: