Here is the code I am using to try to merge multiple XML files.
public static void mergeXml(String directory) throws Exception {
    File dir = new File(directory);
    File[] rootFiles = dir.listFiles();
    XMLEventWriter eventWriter;
    XMLEventFactory eventFactory;
    XMLOutputFactory outputFactory = XMLOutputFactory.newInstance();
    XMLInputFactory inputFactory = XMLInputFactory.newInstance();
    eventWriter = outputFactory.createXMLEventWriter(new FileOutputStream("temp/testMerge1.xml"));
    eventFactory = XMLEventFactory.newInstance();
    // Create and write Start Tag
    StartDocument startDocument = eventFactory.createStartDocument("ISO-8859-1");
    eventWriter.add(startDocument);
    for (File rootFile : rootFiles) {
        XMLEventReader test = inputFactory.createXMLEventReader(new StreamSource(rootFile));
        while (test.hasNext()) {
            XMLEvent event = test.nextEvent();
            // avoiding start (<?xml version="1.0"?>) and end of the documents
            if (event.getEventType() != XMLEvent.START_DOCUMENT && event.getEventType() != XMLEvent.END_DOCUMENT)
                eventWriter.add(event);
            test.close();
        }
        eventWriter.add(eventFactory.createEndDocument());
        eventWriter.close();
    }
}
I am getting two problems:
- the output file does not declare any encoding
- when I try to parse the file created by this code, I get the following exception:
[Fatal Error] :1:2493: The markup in the document following the root element must be well-formed.
org.xml.sax.SAXParseException: The markup in the document following the root element must be well-formed.
at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown Source)
at SplitMain.validateInputFile(SplitMain.java:139)
at SplitMain.main(SplitMain.java:76)
The createStartDocument call does not create a root element for the output XML document; it simply writes the <?xml declaration. A well-formed XML document needs exactly one root element, which is why the parser complains about markup following the root element. After the StartDocument you also need to add a suitable StartElement to act as that root.

The next problem is that you’re closing the eventWriter inside the for loop: the createEndDocument() and eventWriter.close() calls sit inside the loop body. You need to move them outside the for loop, and also end the root element started above before you end the document.

Additionally, if any of your XML files has a <!DOCTYPE you may run into problems. You may just be able to ignore DTD events in the same way you’re currently ignoring start and end document events, but whether or not this works depends on exactly what is declared in that DTD. You’ll have to try it and see.
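
Putting those changes together, a minimal sketch might look like the following. The class name XmlMerger, the root element name "merged", and the .xml filename filter are assumptions made for illustration; none of them come from your code. Passing the same encoding to both createXMLEventWriter and createStartDocument is meant to address the missing encoding declaration, though the exact behaviour may depend on the StAX implementation in use. DTD events are simply skipped here, which may or may not be right for your documents.

```java
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;

final class XmlMerger {
    // Hypothetical root element name; pick whatever suits your documents.
    static final String ROOT = "merged";
    static final String ENCODING = "ISO-8859-1";

    static void mergeXml(File directory, File output) throws IOException, XMLStreamException {
        // Assumption: only *.xml files in the directory are inputs.
        File[] inputs = directory.listFiles((d, name) -> name.endsWith(".xml"));
        if (inputs == null) {
            throw new IOException("Not a readable directory: " + directory);
        }
        Arrays.sort(inputs); // listFiles() does not guarantee any order

        XMLInputFactory inputFactory = XMLInputFactory.newInstance();
        XMLEventFactory eventFactory = XMLEventFactory.newInstance();

        try (OutputStream out = new FileOutputStream(output)) {
            // Same encoding for the underlying stream and the XML declaration.
            XMLEventWriter writer =
                    XMLOutputFactory.newInstance().createXMLEventWriter(out, ENCODING);
            writer.add(eventFactory.createStartDocument(ENCODING, "1.0"));
            // Single root element that wraps the content of every input file.
            writer.add(eventFactory.createStartElement("", "", ROOT));

            for (File input : inputs) {
                try (InputStream in = new FileInputStream(input)) {
                    XMLEventReader reader = inputFactory.createXMLEventReader(in);
                    while (reader.hasNext()) {
                        XMLEvent event = reader.nextEvent();
                        int type = event.getEventType();
                        // Skip per-file document start/end and any DOCTYPE.
                        if (type != XMLEvent.START_DOCUMENT
                                && type != XMLEvent.END_DOCUMENT
                                && type != XMLEvent.DTD) {
                            writer.add(event);
                        }
                    }
                    reader.close(); // close once, after the file is fully read
                }
            }

            // End the root and the document once, after all files are copied.
            writer.add(eventFactory.createEndElement("", "", ROOT));
            writer.add(eventFactory.createEndDocument());
            writer.close();
        }
    }
}
```

Write the output somewhere outside the input directory (as your temp/testMerge1.xml path suggests), otherwise a later run would pick up the merged file as one of its inputs.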