I’m using Java’s DocumentBuilder.parse(InputStream) to parse an XML document. Occasionally, I get malformed XML

Question

Asked: May 14, 2026 (2026-05-14T21:51:08+00:00)

I’m using Java’s DocumentBuilder.parse(InputStream) to parse an XML document. Occasionally, I get malformed XML documents in that there is extra junk after the final > that causes a SAXException: Content is not allowed in trailing section. (In the cases I’ve seen, the junk is simply one or more null bytes.)

I don’t care what’s after the final >. Is there an easy way to parse an entire XML document in Java and have it ignore any trailing junk?

Note that by “ignore” I don’t simply mean catching and ignoring the exception: I mean ignoring the trailing junk, throwing no exception, and returning the Document object, since the XML up to and including the final > is valid.

1 Answer

Editorial Team · Answer 1 · 2026-05-14T21:51:09+00:00

Since your sender is giving you malformed XML, it needs to be corrected before it reaches the parser if you want to avoid this exception. If you can’t fix the sender, you’ll need a preprocessing step of some sort.

If the problem is simply extra null bytes after the closing tag, as you indicated in a response to another answer, you can probably handle it by wrapping your input stream in a FilterInputStream subclass that skips null bytes.
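
A minimal sketch of such a filter follows. The class names NullSkippingInputStream and NullSkippingDemo are hypothetical, not part of the JDK. The sketch also assumes the document uses an ASCII-compatible encoding such as UTF-8: it drops every 0x00 byte, not only the trailing ones, so it would corrupt UTF-16 or UTF-32 input, where null bytes are part of ordinary characters.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

/** Hypothetical filter that drops every 0x00 byte from the wrapped stream. */
class NullSkippingInputStream extends FilterInputStream {

    NullSkippingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b;
        do {
            b = super.read();
        } while (b == 0);          // skip null bytes
        return b;                  // a real byte, or -1 at end of stream
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        while (true) {
            int n = super.read(buf, off, len);
            if (n <= 0) {
                return n;          // -1 at end of stream
            }
            // Compact the non-null bytes toward the start of the region.
            int kept = 0;
            for (int i = 0; i < n; i++) {
                byte b = buf[off + i];
                if (b != 0) {
                    buf[off + kept++] = b;
                }
            }
            if (kept > 0) {
                return kept;
            }
            // The chunk held only null bytes: read again rather than return 0.
        }
    }

    @Override
    public boolean markSupported() {
        return false;              // keep the sketch simple: no mark/reset
    }
}

class NullSkippingDemo {
    /** Parses the stream into a DOM Document, ignoring null bytes. */
    static Document parse(InputStream raw) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new NullSkippingInputStream(raw));
    }

    public static void main(String[] args) throws Exception {
        byte[] data = "<root><a>1</a></root>\0\0\0".getBytes(StandardCharsets.UTF_8);
        Document doc = parse(new ByteArrayInputStream(data));
        System.out.println(doc.getDocumentElement().getNodeName());
    }
}
```

Because the filter sits in front of DocumentBuilder.parse(InputStream), the parser never sees the null bytes, so it returns the Document without throwing.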

If the problem is more complex than just null characters, you’ll of course need a more complex filter, which might be difficult.

If you’re using a ContentHandler (that is, SAX parsing), you can have the handler record when the closing tag of the root element has been processed. The calling code can then catch the exception and simply ignore it if the end of the root element has already been signalled.
At that point, anything the parser had to do has most likely been done anyway. However, this solution doesn’t seem to apply to your situation, since you are building a Document with DocumentBuilder.
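
For readers who are parsing with SAX, the idea could be sketched roughly as below. RootTrackingHandler and TolerantSaxParse are hypothetical names, and the handler only counts elements as a stand-in for real processing. The sketch assumes the parser delivers the root element's endElement event before it reports the error in the trailing section.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical handler that remembers whether the root element was closed. */
class RootTrackingHandler extends DefaultHandler {
    private int depth = 0;
    private int elementCount = 0;
    private boolean rootClosed = false;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        depth++;
        elementCount++;            // placeholder for real per-element work
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (--depth == 0) {
            rootClosed = true;     // the closing tag of the root has been handled
        }
    }

    boolean isRootClosed() {
        return rootClosed;
    }

    int getElementCount() {
        return elementCount;
    }
}

class TolerantSaxParse {
    /** Parses the stream, tolerating errors that occur after the root element. */
    static RootTrackingHandler parse(InputStream in) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        RootTrackingHandler handler = new RootTrackingHandler();
        try {
            parser.parse(in, handler);
        } catch (SAXParseException e) {
            if (!handler.isRootClosed()) {
                throw e;           // a genuine error inside the document
            }
            // Otherwise the error came from trailing junk: ignore it.
        }
        return handler;
    }

    public static void main(String[] args) throws Exception {
        byte[] data = "<root><a>1</a></root>\0\0".getBytes(StandardCharsets.UTF_8);
        RootTrackingHandler h = parse(new ByteArrayInputStream(data));
        System.out.println(h.isRootClosed() + " " + h.getElementCount());
    }
}
```

Errors raised before the root element closes are still rethrown, so only junk after the final > is tolerated.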
