I’m having an issue with a small program that I wrote. It does what I intended it to do (add/remove/modify attributes) very well – I’m super excited about that part. But when I output the file, my headers change and some elements have attributes added to them automatically.
Here’s what I start with:
<!DOCTYPE TEI SYSTEM "teilite-ur.dtd">
<TEI xmlns="http://www.tei-c.org/ns/1.0">
<teiHeader>
<fileDesc>
...
<availability>
...
After transforming each element node to contain an additional attribute (name=test, value=working), here’s what I end up with:
<TEI xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" test="working">
<teiHeader test="working" type="text">
<fileDesc test="working">
...
<availability default="false" status="unknown" test="working">
...
So, short overview:
- !DOCTYPE line was removed
- xmlns:xsi… was added
- type="text", default="false", status="unknown", and anchored="true" attributes were added automatically (there may be others, but those are the ones that stood out to me).
I read here [http://stackoverflow.com/questions/2133395/remove-xml-declaration-from-the-generated-xml-document-using-java] how to prevent the XML declaration from being added at the top, but I’m not sure how to disable the rest of the additions.
Thanks!
Here’s some self-contained code that does basically what I want it to (the real program has a little more customization, but that shouldn’t be relevant), along with the relevant IBM tutorial that I used to help build it:
package xml_attrib_test;

import java.io.*;
import javax.xml.parsers.*;
import javax.xml.transform.*;
import javax.xml.transform.dom.*;
import javax.xml.transform.stream.*;
import javax.xml.xpath.*;
import org.w3c.dom.*;

public class Main {
    public static void main(String[] args) {
        // Input
        File whichFile = new File("C:\\Users\\mw2xx\\Desktop\\proceedings.vol1.xml");
        DocumentBuilderFactory domFactory;
        DocumentBuilder builder;
        Document doc;
        XPathFactory factory;
        XPath xpath;
        XPathExpression expr;
        NodeList nodes;
        try {
            domFactory = DocumentBuilderFactory.newInstance();
            domFactory.setSchema(null);
            domFactory.setValidating(false);
            domFactory.setNamespaceAware(true);
            domFactory.setExpandEntityReferences(false);
            builder = domFactory.newDocumentBuilder();
            doc = builder.parse(whichFile);
            factory = XPathFactory.newInstance();
            xpath = factory.newXPath();
            expr = xpath.compile("//*");
            Object result = expr.evaluate(doc, XPathConstants.NODESET);
            nodes = (NodeList) result;
        } catch (Exception ex) {
            System.out.println("Error in parser.");
            return;
        }

        // Do Stuff With the XML Doc
        String attributeTag = "test";
        String attrValue = "working";
        for (int j = 0; j < nodes.getLength(); j++) {
            Node n = nodes.item(j);
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                Element e = (Element) n;
                e.setAttribute(attributeTag, attrValue);
            } else if (n.getNodeType() == Node.ATTRIBUTE_NODE) {
                Attr a = (Attr) n;
                if (a.getName().equals(attributeTag)) {
                    a.setValue(attrValue);
                }
            }
        }

        // Output
        TransformerFactory tFactory;
        Transformer transformer;
        DOMSource source;
        File resultFile;
        StreamResult result;
        try {
            tFactory = TransformerFactory.newInstance();
            transformer = tFactory.newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            source = new DOMSource(doc);
            resultFile = new File("$$$$$.tmp");
            result = new StreamResult(resultFile);
            transformer.transform(source, result);
        } catch (Exception ex) {
            System.out.println("Error in transformer.");
            return;
        }
        whichFile.delete();
        resultFile.renameTo(whichFile);
        System.out.println("Success!");
    }
}
After a few more days of googling and searching Stack Overflow, I found a similar question that provided the setting I needed:
Java change and move non-standard XML file
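The setting itself isn’t quoted above, so what follows is only a sketch of what it might look like. It assumes the extra attributes (and xmlns:xsi) are defaults declared in teilite-ur.dtd, which the parser reads even when validation is off, and that the setting in question is the Xerces feature http://apache.org/xml/features/nonvalidating/load-external-dtd, which the JDK’s bundled parser recognizes. The class name DtdDefaultsSketch and the method roundTrip are made up for illustration, and the sample parses from a string instead of a file. Writing the DOCTYPE back out via OutputKeys.DOCTYPE_SYSTEM is my own addition for the removed !DOCTYPE line, not something taken from the linked question.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original program.
final class DtdDefaultsSketch {

    // Assumption: this Xerces feature is the setting the linked question refers to.
    static final String LOAD_EXTERNAL_DTD =
            "http://apache.org/xml/features/nonvalidating/load-external-dtd";

    // Parses the XML without reading the external DTD, then serializes it
    // again, re-emitting the DOCTYPE line if the input had a system ID.
    static String roundTrip(String xml) throws Exception {
        DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
        domFactory.setValidating(false);
        domFactory.setNamespaceAware(true);
        // Without the DTD, the parser has no attribute defaults to inject.
        domFactory.setFeature(LOAD_EXTERNAL_DTD, false);
        DocumentBuilder builder = domFactory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        // The serializer drops the DOCTYPE unless it is requested explicitly.
        DocumentType doctype = doc.getDoctype();
        if (doctype != null && doctype.getSystemId() != null) {
            transformer.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM, doctype.getSystemId());
        }

        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<!DOCTYPE TEI SYSTEM \"teilite-ur.dtd\">"
                + "<TEI xmlns=\"http://www.tei-c.org/ns/1.0\"><teiHeader/></TEI>";
        System.out.println(roundTrip(xml));
    }
}
```

In the program above, the equivalent change would be one extra domFactory.setFeature(...) call before newDocumentBuilder() and one extra transformer.setOutputProperty(...) call before transform(...). A side effect worth knowing: with the DTD skipped, any entities declared only in the DTD can no longer be resolved by the parser.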