I have a Java program that makes a request to a web service that I cannot modify. The response to one of the requests can be extremely large, to the point where the heap runs out of memory if I try to parse it into a Document object. To get around this, I’m reading the response into a byte[] buffer chunk-by-chunk and writing it to disk. I had then planned to scan the file line-by-line and build a Document object out of each <bond> element that I find (these are the only elements I need from the response):
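// Needs java.io.StringReader and org.xml.sax.InputSource.
StringBuilder sb = null;
String line = null;
while ((line = reader.readLine()) != null) {
    if (line.trim().equals("<bond>")) {
        // Start collecting a new <bond> element
        sb = new StringBuilder(line);
    }
    else if (line.trim().equals("</bond>")) {
        // Include the closing tag so the fragment is well-formed
        sb.append(line);
        // parse(String) expects a URI, so wrap the XML text in an InputSource
        Document doc = builder.parse(new InputSource(new StringReader(sb.toString())));
        // Process doc
        sb = null;
    }
    else if (sb != null) {
        // Lines outside a <bond> element are ignored
        sb.append(line);
    }
}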
Unfortunately it seems that the newlines are converted to spaces in the response, so everything is one huge line. One solution I’m considering is using SAX to handle the parsing, and build my Document pieces in the same manner. Does anyone have another solution or is this my best bet?
Thanks,
Jared
If you have to choose between the SAX and DOM parsers, SAX is probably your best bet. It does not build the whole document in memory; it reports each element to your handler as it reads it, so it can handle much larger XML files.
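
As a rough sketch of that idea, the handler below streams the response with SAX and builds a small Document for each <bond> element, handing it to a callback and then dropping it. The class name BondExtractor and the method extract are made up for illustration, and the sketch assumes the elements are not namespace-qualified (the default SAXParserFactory is not namespace-aware, so qName is the raw tag name).

```java
import java.io.InputStream;
import java.util.function.Consumer;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: streams XML and emits one small Document per <bond>.
public class BondExtractor {

    public static void extract(InputStream in, Consumer<Document> onBond) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();

        DefaultHandler handler = new DefaultHandler() {
            private Document doc;   // Document for the <bond> being built; null outside one
            private Node current;   // node that new children are appended to

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if (doc == null) {
                    if (!"bond".equals(qName)) {
                        return;     // skip everything outside a <bond>
                    }
                    doc = builder.newDocument();
                    current = doc;
                }
                Element e = doc.createElement(qName);
                for (int i = 0; i < atts.getLength(); i++) {
                    e.setAttribute(atts.getQName(i), atts.getValue(i));
                }
                current.appendChild(e);
                current = e;
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (doc != null) {
                    current.appendChild(doc.createTextNode(new String(ch, start, length)));
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (doc == null) {
                    return;
                }
                current = current.getParentNode();
                if (current == doc) {   // the top-level <bond> just closed
                    onBond.accept(doc);
                    doc = null;         // release it so it can be garbage collected
                    current = null;
                }
            }
        };

        SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
    }
}
```

You would call it with a stream over the saved file (or the HTTP response stream directly), for example BondExtractor.extract(in, doc -> process(doc)), where process is whatever you already do with each Document. Only one <bond> subtree is held in memory at a time, and it does not matter whether the response contains newlines.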