I have a function that does XML parsing. I want to make the function thread-safe, but also as optimized (as little blocking) as possible.
In short, the code looks something like this:
public Document doXML(InputStream is)
        throws ParserConfigurationException, SAXException, IOException
{
    // Some processing.
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    DocumentBuilder parser = factory.newDocumentBuilder();
    Document xmlDoc = parser.parse(is);
    return xmlDoc;
}
But I do not want to create a new DocumentBuilderFactory or DocumentBuilder on each call.
I want to reuse the factory and the parser, but I am not sure whether they are thread-safe. So what is the best approach?
1) Cache a DocumentBuilderFactory in a class field and synchronize the call to factory.newDocumentBuilder(), so that each thread has its own instance of DocumentBuilder.
2) Cache both a DocumentBuilderFactory and a DocumentBuilder, and synchronize the call to parser.parse(is).
I think (2) is best, but is it safe to do? Also, can I avoid the blocking caused by synchronized? I would like it to be as fast as possible.
Thanks.
If you are reusing threads (as in a thread pool), you can declare your DocumentBuilderFactory as a thread-local. There is the overhead of creating a new instance for each thread, but as I said, if you are reusing threads, the subsequent overhead is very low.
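A minimal sketch of that thread-local idea might look like the following. The class name ThreadLocalFactoryParser is hypothetical, and ThreadLocal.withInitial assumes Java 8 or later; on older versions you would subclass ThreadLocal and override initialValue() instead.

```java
import java.io.IOException;
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical helper class, for illustration only.
final class ThreadLocalFactoryParser {
    // One factory per thread, created lazily the first time that thread calls get().
    private static final ThreadLocal<DocumentBuilderFactory> FACTORY =
            ThreadLocal.withInitial(DocumentBuilderFactory::newInstance);

    private ThreadLocalFactoryParser() {}

    static Document doXML(InputStream is)
            throws ParserConfigurationException, SAXException, IOException {
        // The factory is never shared between threads, so no synchronization is needed.
        DocumentBuilder parser = FACTORY.get().newDocumentBuilder();
        return parser.parse(is);
    }
}
```

A new DocumentBuilder is still created on every call here; only the factory lookup is saved.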
Here you will only create one DocumentBuilderFactory for each thread.
I don't know whether DocumentBuilder is thread-safe when parsing (is it immutable?). But if it is, you can use the same mechanism for the DocumentBuilder as well.
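Applying the same mechanism to the DocumentBuilder could be sketched as below. Because each thread gets its own builder, no instance is shared across threads, so this version does not actually depend on the builder being thread-safe. The class name ThreadLocalBuilderParser is hypothetical, and the call to reset() (available since Java 5) assumes the underlying parser implementation supports it, as the JDK's default one does.

```java
import java.io.IOException;
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical helper class, for illustration only.
final class ThreadLocalBuilderParser {
    // One DocumentBuilder per thread, created lazily on first use.
    private static final ThreadLocal<DocumentBuilder> BUILDER =
            ThreadLocal.withInitial(() -> {
                try {
                    return DocumentBuilderFactory.newInstance().newDocumentBuilder();
                } catch (ParserConfigurationException e) {
                    throw new IllegalStateException("Cannot create DocumentBuilder", e);
                }
            });

    private ThreadLocalBuilderParser() {}

    static Document doXML(InputStream is) throws SAXException, IOException {
        DocumentBuilder parser = BUILDER.get();
        try {
            return parser.parse(is);
        } finally {
            // Return the builder to its original configuration before the next reuse.
            parser.reset();
        }
    }
}
```

With a thread pool, the builder is created once per worker thread and then reused for every later call on that thread.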
This solution needs no synchronization, so threads never block one another while parsing, which should give the best overall throughput.
Note: This wasn't tested or compiled; it just gives an idea of what I am referring to.