I’m trying to get an attribute (fileID) from my XML document to use as the filename for my XML split. The split works; I just need to extract the fileID to use as the name.
[EDITED] I can read the attribute now, but the last XML file is not created. In my example, the first 2 files are created with the correct names, but the file for the last fileID, “000154OP.XML”, isn’t. Can anyone help?
This is my XML document:
<root>
    <envelope fileID="000152OP.XML">
        <record id="850">
        </record>
    </envelope>
    <envelope fileID="000153OP.XML">
        <record id="850">
        </record>
    </envelope>
    <envelope fileID="000154OP.XML">
        <record id="850">
        </record>
    </envelope>
</root>
And here’s my Java code:
public static void splitXMLFile (String file) throws Exception {
    String[] temp;
    String[] temp2;
    String[] temp3;
    String[] temp4;
    String[] temp5;
    String[] temp6;
    File input = new File(file);
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    Document doc = dbf.newDocumentBuilder().parse(input);
    XPath xpath = XPathFactory.newInstance().newXPath();
    NodeList nodes = (NodeList) xpath.evaluate("//root/envelope", doc, XPathConstants.NODESET);
    int itemsPerFile = 1;
    Node staff = doc.getElementsByTagName("envelope").item(0);
    NamedNodeMap attr = staff.getAttributes();
    Node nodeAttr = attr.getNamedItem("fileID");
    String node = nodeAttr.toString();
    temp = node.split("=");
    temp2 = temp[1].split("^\"");
    temp3 = temp2[1].split("\\.");
    Document currentDoc = dbf.newDocumentBuilder().newDocument();
    Node rootNode = currentDoc.createElement("root");
    File currentFile = new File("C:\\XMLFiles\\" + temp3[0]+ ".xml");
    for (int i=1; i <= nodes.getLength(); i++) {
        Node imported = currentDoc.importNode(nodes.item(i-1), true);
        rootNode.appendChild(imported);
        Node staff2 = doc.getElementsByTagName("envelope").item(i);
        NamedNodeMap attr2 = staff2.getAttributes();
        Node nodeAttr2 = attr2.getNamedItem("fileID");
        String node2 = nodeAttr2.toString();
        temp4 = node2.split("=");
        temp5 = temp4[1].split("^\"");
        temp6 = temp5[1].split("\\.");
        if (i % itemsPerFile == 0) {
            writeToFile(rootNode, currentFile);
            rootNode = currentDoc.createElement("root");
            currentFile = new File("C:\\XMLFiles\\" + temp6[0]+".xml");
        }
    }
    writeToFile(rootNode, currentFile);
}

private static void writeToFile(Node node, File file) throws Exception {
    Transformer transformer = TransformerFactory.newInstance().newTransformer();
    transformer.transform(new DOMSource(node), new StreamResult(new FileWriter(file)));
}
There is a lot of duplication in your code, but I have a solution that removes much of it. I know there are less complex solutions (for example, I don’t think the `if (i % itemsPerFile == 0)` logic is required), but I do not know all of your requirements, so I have left it in.

The main problems were that you were overwriting the last file with the wrong data and that your looping logic was duplicated. A good rule of thumb I go by: whenever I think I might have to duplicate code, there is something wrong. Your logic treated the first `<envelope>` separately from the remaining `<envelope>` elements, whereas they should be handled as a group of 3. Then your logic need only apply the same searching, splitting, matching, importing, etc. to each element in turn.

What complicated matters is that your input XML file had the same `<record id="850">` for each `<envelope>`. I changed mine to `850`, `851` and `852`. Running your original code produced 3 files, `000152OP.xml`, `000153OP.xml` and `000154OP.xml`, but the first one contained the `851` record, so I immediately knew the looping logic was incorrect.

A simpler solution is detailed below which, given your input XML file as the argument, produces 3 output files in the same directory (I removed the `C:\` hard-coding for simplicity), each with the correct `<record>` element.

You should read up on `Node` and `String::split`, as there was unnecessary extra code where a native method already exists (for example `Node::getNodeValue()`).

Edit: The source for creating 1000 `<envelope>` elements that I used to test the above code:

I ran `java CreateXML` to create the input file `split.xml` and then `java SplitXML split.xml` to create the 1000 files.
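
To make the idea of treating every `<envelope>` the same way concrete, here is a minimal sketch of such a splitter. It is only an illustration under a few assumptions: the class name `SplitXML` follows the command mentioned above, but the `split` method, its parameters and its return value are my own naming; it assumes every `<envelope>` has a `fileID` attribute; and it writes exactly one envelope per file, so the `itemsPerFile` check is dropped. The attribute is read with `Node::getNodeValue()` rather than by splitting the result of `toString()`.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class SplitXML {

    // Hypothetical helper: writes one file per <envelope> into outputDir,
    // named after the envelope's fileID attribute, and returns the files written.
    static List<File> split(File input, File outputDir) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document source = builder.parse(input);
        Transformer transformer = TransformerFactory.newInstance().newTransformer();

        NodeList envelopes = source.getElementsByTagName("envelope");
        List<File> written = new ArrayList<>();

        // Every envelope goes through the same steps; the first one is not special.
        for (int i = 0; i < envelopes.getLength(); i++) {
            Node envelope = envelopes.item(i);

            // getNodeValue() returns the bare attribute value, e.g. "000152OP.XML".
            // Assumption: the fileID attribute is always present.
            String fileId = envelope.getAttributes().getNamedItem("fileID").getNodeValue();
            int dot = fileId.lastIndexOf('.');
            String baseName = dot > 0 ? fileId.substring(0, dot) : fileId;

            // Build a fresh <root> document holding a deep copy of this envelope only.
            Document target = builder.newDocument();
            Element root = target.createElement("root");
            target.appendChild(root);
            root.appendChild(target.importNode(envelope, true));

            File out = new File(outputDir, baseName + ".xml");
            try (OutputStream os = new FileOutputStream(out)) {
                transformer.transform(new DOMSource(target), new StreamResult(os));
            }
            written.add(out);
        }
        return written;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("Usage: java SplitXML <input.xml>");
            return;
        }
        File input = new File(args[0]).getAbsoluteFile();
        // Output files are written next to the input file.
        for (File f : split(input, input.getParentFile())) {
            System.out.println("Wrote " + f.getName());
        }
    }
}
```

Because the loop index runs from `0` to `getLength() - 1` and the file name is taken from the same node that is being imported, the name and the content cannot drift apart, and `item(i)` is never asked for a node past the end of the list (where it would return `null`).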