I’m making an Android application that does DOM parsing on an XML file. The XML file looks like this:
<?xml version="1.0" encoding="utf-8"?>
<family>
<grandparent>
<parent1>
<child1>Foo</child1>
<child2>Bar</child2>
</parent1>
<parent2>
<child1>Raz</child1>
<child2>Mataz</child2>
</parent2>
</grandparent>
</family>
If I run a DOM parser on it, like this:
try {
    DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
    Document doc = builder.parse(input);
    doc.getDocumentElement().normalize(); //added in since the edit
    NodeList nodd = doc.getElementsByTagName("grandparent");
    for (int x = 0; x < nodd.getLength(); x++) {
        Node node = nodd.item(x);
        // getChildNodes() returns every child node, not only elements
        NodeList nodes = node.getChildNodes();
        for (int y = 0; y < nodes.getLength(); y++) {
            Node n = nodes.item(y);
            System.out.println(n.getNodeName());
        }
    }
} catch (Exception e) {
    // newDocumentBuilder() and parse() throw checked exceptions
    e.printStackTrace();
}
My application prints out the following:
07-20 18:24:28.395: INFO/System.out(491): #text
07-20 18:24:28.395: INFO/System.out(491): parent1
07-20 18:24:28.395: INFO/System.out(491): #text
07-20 18:24:28.395: INFO/System.out(491): parent2
07-20 18:24:28.395: INFO/System.out(491): #text
My question is, what are those #text nodes and, more importantly, how do I get rid of them?
Edit: So now that I know what they are, I tried to normalize the document. I have updated the code to reflect the change, but I get the same result.
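It’s whitespace (newlines, spaces, tabs) 🙂 The line breaks and indentation between your elements are kept by the parser as #text nodes. normalize() only merges adjacent text nodes and removes empty ones, so whitespace-only text nodes survive it. To skip them, check n.getNodeType() == Node.ELEMENT_NODE before using a child node.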
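Here is a minimal sketch of filtering the children by node type, assuming plain JAXP DOM as in the question. The class name ElementChildren, the helper elementChildNames, and the SAMPLE_XML constant are hypothetical names made up for this example; the XML is the file from the question, inlined as a string so the snippet runs on its own.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class ElementChildren {
    // The XML from the question, inlined (hypothetical constant for this sketch).
    static final String SAMPLE_XML =
        "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n"
        + "<family>\n"
        + "<grandparent>\n"
        + "<parent1>\n<child1>Foo</child1>\n<child2>Bar</child2>\n</parent1>\n"
        + "<parent2>\n<child1>Raz</child1>\n<child2>Mataz</child2>\n</parent2>\n"
        + "</grandparent>\n"
        + "</family>\n";

    // Returns the names of the element children of every <tagName> element,
    // skipping text nodes (including whitespace-only ones) and comments.
    static List<String> elementChildNames(String xml, String tagName) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        List<String> names = new ArrayList<>();
        NodeList matches = doc.getElementsByTagName(tagName);
        for (int x = 0; x < matches.getLength(); x++) {
            NodeList children = matches.item(x).getChildNodes();
            for (int y = 0; y < children.getLength(); y++) {
                Node n = children.item(y);
                // Keep only element nodes; "#text" entries are Node.TEXT_NODE.
                if (n.getNodeType() == Node.ELEMENT_NODE) {
                    names.add(n.getNodeName());
                }
            }
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        // prints [parent1, parent2]
        System.out.println(elementChildNames(SAMPLE_XML, "grandparent"));
    }
}
```

Checking the node type works regardless of how the file is indented, which is why it is used here rather than trying to strip the whitespace out of the document first.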