I am learning the Java XML API. I am using DOM.
I have a problem with even basic navigation inside the document. Here is the XML file I am working with:
<?xml version="1.0"?>
<company>
<staff>
<firstname>test</firstname>
<lastname>test2</lastname>
<nickname>test3</nickname>
<salary>test4</salary>
</staff>
<staff>
<firstname>test5</firstname>
<lastname>test6</lastname>
<nickname>test7</nickname>
<salary>test8</salary>
</staff>
</company>
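And here is the code I have so far, which is supposed to get the name of the parent node and its child nodes:
import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();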
DocumentBuilder builder = factory.newDocumentBuilder();
Document document = builder.parse(new File(pathtothefile));
Element topLevelElement = document.getDocumentElement();
NodeList secondLevelElements = topLevelElement.getChildNodes();
System.out.println("Top level element: " + topLevelElement.getNodeName());
System.out.println("Number of second level nodes: " + secondLevelElements.getLength());
System.out.println("Node at index 0: " + secondLevelElements.item(0).getNodeValue());
I get the number of second-level nodes (for some reason it is 5, not 2), but when I try to get the name of the node at index 0 I get "#text", and if I try to get its value, nothing is displayed.
I would appreciate any help, as I am a total beginner at all of this and feel kind of lost 🙂
UPDATE 1:
Here is the new code:
Element companyElement = document.getDocumentElement();
NodeList staffElements = companyElement.getElementsByTagName("staff");
NodeList firstNameElements = companyElement.getElementsByTagName("firstname");
NodeList lastNameElements = companyElement.getElementsByTagName("lastname");
NodeList nicknameElements = companyElement.getElementsByTagName("nickname");
NodeList salaryElements = companyElement.getElementsByTagName("salary");
System.out.println("Top level element: " + companyElement.getNodeName());
System.out.println("----");
System.out.println("Next nodes' level name: " + staffElements.item(0).getNodeName());
System.out.println("Next nodes' level number: " + staffElements.getLength());
System.out.println("----");
System.out.println("Person No. 1");
System.out.println("First name: " + firstNameElements.item(0).getNodeValue());
System.out.println("Last name: " + lastNameElements.item(0).getNodeValue());
System.out.println("Nickname: " + nicknameElements.item(0).getNodeValue());
System.out.println("Salary: " + salaryElements.item(0).getNodeValue());
System.out.println("----");
System.out.println("Person No. 2");
System.out.println("First name: " + firstNameElements.item(1).getNodeValue());
System.out.println("Last name: " + lastNameElements.item(1).getNodeValue());
System.out.println("Nickname: " + nicknameElements.item(1).getNodeValue());
System.out.println("Salary: " + salaryElements.item(1).getNodeValue());
You get 5 nodes rather than 2 because the DOM preserves whitespace. So what you have at that level is:
[whitespace][staff element][whitespace][staff element][whitespace]
i.e. 5 nodes.
If you read the javadoc for Node.getNodeName(), you'd know why you get "#text". The node at index 0 is a whitespace node, and getNodeName() on a text node returns the hard-wired string #text. The value shows nothing visible for the same reason: it's a whitespace-only text node.
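To see this for yourself, here is a minimal sketch that lists the name of every direct child of the root element. It assumes a shortened copy of the XML from the question, parsed from a string rather than a file; the class name ChildNodeDump and the helper childNames are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

class ChildNodeDump {
    // Shortened copy of the question's XML (assumption: one line break between elements).
    static final String XML =
        "<?xml version=\"1.0\"?>\n"
        + "<company>\n"
        + "<staff><firstname>test</firstname></staff>\n"
        + "<staff><firstname>test5</firstname></staff>\n"
        + "</company>";

    // Hypothetical helper: returns the node name of every direct child of the root element.
    static List<String> childNames(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document document = builder.parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList children = document.getDocumentElement().getChildNodes();
        List<String> names = new ArrayList<>();
        for (int i = 0; i < children.getLength(); i++) {
            // Text nodes report "#text"; element nodes report their tag name.
            names.add(children.item(i).getNodeName());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        // With a default, non-validating parser this prints:
        // [#text, staff, #text, staff, #text]
        System.out.println(childNames(XML));
    }
}
```

The three #text entries are the line breaks before, between, and after the two staff elements.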
You need to fetch the nodes at index 1 and 3 if you need to access the <staff> elements.
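Hard-coding indexes 1 and 3 breaks as soon as the formatting of the file changes, so one common alternative is to check getNodeType() and skip everything that is not an element. The sketch below does that, again on a shortened in-memory copy of the XML; StaffReader and firstNames are hypothetical names. It reads the text with getTextContent() because, per the Node javadoc, getNodeValue() on an element node is null, which is probably what the code in UPDATE 1 runs into.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class StaffReader {
    // Shortened copy of the question's XML (assumption: only two fields per staff member).
    static final String XML =
        "<?xml version=\"1.0\"?>\n"
        + "<company>\n"
        + "<staff>\n<firstname>test</firstname>\n<lastname>test2</lastname>\n</staff>\n"
        + "<staff>\n<firstname>test5</firstname>\n<lastname>test6</lastname>\n</staff>\n"
        + "</company>";

    // Hypothetical helper: collects the first name of every <staff> child of the root.
    static List<String> firstNames(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document document = builder.parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList children = document.getDocumentElement().getChildNodes();
        List<String> result = new ArrayList<>();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            // Skip the whitespace-only text nodes between the elements.
            if (child.getNodeType() != Node.ELEMENT_NODE) {
                continue;
            }
            Element staff = (Element) child;
            Node firstName = staff.getElementsByTagName("firstname").item(0);
            // getTextContent() returns the text inside the element;
            // getNodeValue() would return null for an element node.
            result.add(firstName.getTextContent());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(firstNames(XML)); // prints [test, test5]
    }
}
```

The same loop works unchanged whether the file is indented, written on one line, or contains comments between the elements, since it only looks at element nodes.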