Possible Duplicate:
JAVA SAX parser split calls to characters()
I have an XML file with the following syntax:
<tag ...>
a bunch of text here
<tag ...>
There aren’t any closing tags for tag. I’m grabbing the text between the two tags and storing it in a List<String> in characters(). It works for the most part, but on some XML files the text gets broken in two, as if the parser had read a line terminator or something: rather than storing a single entry, “a bunch of text here”, I get two entries, “a bunch of” and “text here”. Unlike all the other entries, there is no line break stored after “a bunch of” or before “text here”.
I need to fix this, but don’t know how. I’d appreciate your help.
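The parser is allowed to call the ContentHandler characters() method multiple times for a single string of element text, so it isn’t necessarily finding a line terminator. To get the whole text, append each chunk to a buffer and store the result only when the next start or end tag is reported. The Java tutorial on SAX has a short explanation of the characters method.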
This JavaWorld article also has good explanations and examples.
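Since characters() can be called several times for one run of text, one way to handle it is to collect the chunks in a StringBuilder and only add an entry to the list when an element boundary is reported. The following is a minimal sketch, not the asker's actual code. The class name BufferedTextHandler, the parse helper, and the sample document are made up for illustration. It also assumes the real file is well-formed (for example, self-closing tags inside a root element), since the snippet as shown in the question would not parse, and trimming whitespace is a choice made here rather than a requirement.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: buffers text across characters() calls.
class BufferedTextHandler extends DefaultHandler {
    private final List<String> entries = new ArrayList<>();
    private final StringBuilder buffer = new StringBuilder();

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times for one run of text, so only append here.
        buffer.append(ch, start, length);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        // A new tag means the text before it is complete.
        flush();
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Also flush on end tags so trailing text is not lost.
        flush();
    }

    // Store the accumulated text as one entry (if non-blank) and reset the buffer.
    private void flush() {
        String text = buffer.toString().trim();
        if (!text.isEmpty()) {
            entries.add(text);
        }
        buffer.setLength(0);
    }

    List<String> getEntries() {
        return entries;
    }

    // Convenience helper (an assumption of this sketch): parse an XML string
    // and return the collected text entries.
    static List<String> parse(String xml) {
        BufferedTextHandler handler = new BufferedTextHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return handler.getEntries();
    }
}
```

With a made-up input such as `<root><tag a="1"/>a bunch of text here<tag a="2"/>more text</root>`, BufferedTextHandler.parse should return two entries, “a bunch of text here” and “more text”, no matter how the parser splits the text between characters() calls.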