I have used JAXB to create a class for the following schema (used in a webservice):
<xs:complexType name="ExceptionType">
<xs:attribute name="errorCode" type="xs:positiveInteger" use="required"/>
<xs:attribute name="outcomeType" use="required">
<xs:simpleType>
<xs:restriction base="xs:token">
<xs:enumeration value="rejectFile"/>
<xs:enumeration value="rejectSubmission"/>
<xs:enumeration value="continue"/>
</xs:restriction>
</xs:simpleType>
</xs:attribute>
</xs:complexType>
Though the actual XML they will send is
<Exception errorCode="1503"outcomeType="continue">
(with no space between “1503” and outcomeType).
Right now, I’m replacing <Exception errorCode="(\d*)"outcomeType with <Exception errorCode="\1" outcomeType in the whole XML response before feeding it to JAXB unmarshaller and it works, but I wonder if some other XML responses will have this “bug”.
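For reference, a minimal sketch of that replacement in Java, assuming the response is already held in a String. The class and method names are hypothetical, and note that Java's replaceAll writes the back-reference as $1 rather than \1:

```java
final class ExceptionTagFix {

    /** Inserts the missing space between the errorCode value and outcomeType. */
    static String addMissingSpace(String response) {
        return response.replaceAll(
                "<Exception errorCode=\"(\\d*)\"outcomeType",
                "<Exception errorCode=\"$1\" outcomeType");
    }
}
```

Input that does not contain the faulty pattern passes through unchanged.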
Is there an easier way to have JAXB accept XML tags with this attr1="value"attr2 bug? Or maybe by using some custom XMLFilterImpl?
No, because this isn’t a bug.
XML containing attr1="value"attr2 isn’t well-formed, so JAXB cannot parse it and will throw an exception indicating a fatal, non-recoverable error.
If you expect XML-ish data of this kind and you have no control over it (you receive it from a third party), then your solution seems OK. However, if I were you, I would contact this third party and tell them that they’re emitting malformed XML, which isn’t too professional.
An alternative to replacing strings with regular expressions could be something like this (but this isn’t exactly easy):
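What follows is only a minimal sketch of such a parse-and-repair loop, not a definitive implementation. It assumes that the only defect is a missing space between a closing quote and the next attribute name, and that the parser reports the error position at (or within one character of) that spot, which the JDK's built-in parser appears to do. The names XmlRepair, repair and findMissingSpace are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;

final class XmlRepair {

    /** Re-parses the input until it is well-formed, patching one error per pass. */
    static String repair(String xml)
            throws SAXException, IOException, ParserConfigurationException {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        String wellFormedXml = xml;
        while (true) {
            XMLReader reader = factory.newSAXParser().getXMLReader();
            // No handlers: we only care whether parsing fails, and where.
            reader.setContentHandler(null);
            reader.setErrorHandler(null);
            try {
                reader.parse(new InputSource(new StringReader(wellFormedXml)));
                return wellFormedXml; // parsed without a fatal error
            } catch (SAXParseException e) {
                int gap = findMissingSpace(wellFormedXml, e.getLineNumber(), e.getColumnNumber());
                if (gap < 0) {
                    throw e; // not the error this sketch knows how to fix
                }
                wellFormedXml = wellFormedXml.substring(0, gap) + " " + wellFormedXml.substring(gap);
            }
        }
    }

    /**
     * Maps the reported (1-based) line and column to a string offset and looks,
     * at that offset and one character either side, for a quote immediately
     * followed by a name start character. Returns -1 if nothing matches.
     * Assumes lines end in \n or \r\n.
     */
    static int findMissingSpace(String xml, int line, int column) {
        if (line < 1 || column < 1) {
            return -1;
        }
        int lineStart = 0;
        for (int i = 1; i < line; i++) {
            int newline = xml.indexOf('\n', lineStart);
            if (newline < 0) {
                return -1;
            }
            lineStart = newline + 1;
        }
        int reported = lineStart + column - 1;
        for (int candidate : new int[] {reported, reported - 1, reported + 1}) {
            if (candidate > 0 && candidate < xml.length()) {
                char before = xml.charAt(candidate - 1);
                char at = xml.charAt(candidate);
                boolean afterQuote = before == '"' || before == '\'';
                boolean nameStart = Character.isLetter(at) || at == '_' || at == ':';
                if (afterQuote && nameStart) {
                    return candidate;
                }
            }
        }
        return -1;
    }
}
```

The result of XmlRepair.repair(response) is the string that would then be handed to the unmarshaller. Any other kind of malformation is rethrown as the original SAXParseException.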
JAXB’s Unmarshaller should be able to handle wellFormedXml after the process.
If replacing stuff with regular expressions is good enough, because your data doesn’t contain too much stuff to search for and contains only the particular formatting error you’ve described, then don’t use my solution of course, but if you expect more formatting errors you could use something like this.
Notice that I explicitly set the reader’s error and content handler to null. This is because they are of no use for malformed XML: the reader fails early, since this is a fatal, non-recoverable error. This is of course very bad for us, because if the document contains 10 errors like the one you’ve described, then my method parses the XML 10 times, until it finds every error. I’m not aware of an XML parser in the JDK that would report formatting errors and continue parsing (reporting every error during the process).
Using a proper ErrorHandler you could handle warnings and errors gracefully; however, fatal errors cannot be handled even with an ErrorHandler (after its fatalError method gets called, processing stops).
Using an XMLFilter implementation wouldn’t help you either, because if you simply use the default XMLFilterImpl class, which forwards all of its calls to a delegate XMLReader, then you would face the same problem as before: on the first error, processing stops. As a matter of fact, if you want to implement something, then implement the XMLReader interface directly (XMLFilter only adds the setParent and getParent methods, which is bad design if you ask me). But implementing an XMLReader that can parse malformed XML is probably going to be tedious.
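The behaviour described for ErrorHandler and XMLFilterImpl can be checked with a small sketch like the following. It assumes the JDK's default SAX parser; FatalErrorDemo, RecordingHandler and parseThroughFilter are hypothetical names. The handler records every callback and never throws, yet parsing is still aborted at the first fatal error.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

final class FatalErrorDemo {

    /** Records which ErrorHandler callbacks were invoked; never throws. */
    static final class RecordingHandler implements ErrorHandler {
        final List<String> events = new ArrayList<>();

        @Override public void warning(SAXParseException e) { events.add("warning"); }
        @Override public void error(SAXParseException e) { events.add("error"); }
        @Override public void fatalError(SAXParseException e) { events.add("fatalError"); }
    }

    /** Parses through a plain XMLFilterImpl; returns true if parsing was aborted. */
    static boolean parseThroughFilter(String xml, RecordingHandler handler) throws Exception {
        XMLReader parent = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        XMLFilterImpl filter = new XMLFilterImpl(parent); // forwards everything to parent
        filter.setErrorHandler(handler);
        try {
            filter.parse(new InputSource(new StringReader(xml)));
            return false;
        } catch (SAXParseException e) {
            return true; // the parser gave up even though the handler did not throw
        }
    }
}
```

With input that contains two missing spaces, such as <a x="1"y="2"z="3"/>, the handler should see a single fatalError before the parse is abandoned, so the second defect is never reported.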