I often have to URL-encode or URL-decode a large collection or array of strings. Besides iterating through them and calling the static URLDecoder.decode(string, "UTF-8") on each one, are there any libraries that make this type of operation more performant?
A colleague insists that using the static method to decode the strings in place is not thread-safe. Why would that be?
The JDK's URLDecoder isn't implemented efficiently. Most notably, it relies internally on StringBuffer, which adds synchronization that URLDecoder has no need for. Apache Commons Codec provides URLCodec, but it has reportedly had similar performance problems; I haven't verified whether that is still the case in the most recent version.
Mark A. Ziesemer wrote a post a while back regarding the issues and performance with URLDecoder. He logged some bug reports and ended up writing a complete replacement. Because this is SO, I’ll quote some key excerpts here, but you should really read the entire source article here: http://blogger.ziesemer.com/2009/05/improving-url-coder-performance-java.html
Selected quotes:
…
…
I think your colleague is wrong to suggest that URLDecoder is not thread-safe. Other answers here explain why in detail.
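As a rough illustration of the thread-safety point: Java strings are immutable, so nothing is decoded "in place"; each call to URLDecoder.decode returns a new String and leaves its argument untouched. The sketch below decodes a list from several threads at once using a parallel stream (this needs Java 8 or later). The class name ParallelDecodeSketch and its methods are made up for this example and are not part of any library.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper class for illustration only.
final class ParallelDecodeSketch {

    // Decodes one value. The checked exception is wrapped because every
    // Java platform is required to support UTF-8.
    static String decode(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // Decodes all elements, possibly on several threads. The input list is
    // only read, and the result is a new list in the same order.
    static List<String> decodeAll(List<String> encoded) {
        return encoded.parallelStream()
                .map(ParallelDecodeSketch::decode)
                .collect(Collectors.toList());
    }
}
```

Whether the parallel version is actually faster than a plain loop depends on the number and length of the strings, so it is worth measuring before adopting it.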
EDIT [2012-07-03] – Per later comment posted by OP
I'm not sure whether you were looking for more ideas. You are correct that if you intend to operate on the list as an atomic collection, you have to synchronize all access to the list, including references held outside your method. However, if you are okay with the contents of the returned list potentially differing from the original list, then a brute-force approach for operating on a "batch" of strings from a collection that might be modified by other threads could look something like this:
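A minimal sketch of that brute-force idea, under some loud assumptions: the shared collection is a List<String> whose toArray is itself thread-safe (for example a CopyOnWriteArrayList or a list wrapped with Collections.synchronizedList), and it contains no null elements. The class name BatchDecodeSketch and the method decodeSnapshot are hypothetical names chosen for this example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class for illustration only.
final class BatchDecodeSketch {

    // Copies the shared list once, then decodes the private copy.
    // Changes made to the shared list after the copy are not seen.
    static List<String> decodeSnapshot(List<String> shared) {
        // Assumption: toArray is safe to call while other threads modify
        // the list, which holds only for thread-safe list implementations.
        String[] snapshot = shared.toArray(new String[0]);

        List<String> decoded = new ArrayList<>(snapshot.length);
        for (String value : snapshot) {
            try {
                decoded.add(URLDecoder.decode(value, "UTF-8"));
            } catch (UnsupportedEncodingException e) {
                // Every Java platform is required to support UTF-8.
                throw new IllegalStateException("UTF-8 is always supported", e);
            }
        }
        return decoded;
    }
}
```

The returned list reflects the collection as it was at the moment of the copy, so it can be stale by the time the caller uses it.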
If that does not help, then I'm still not sure what you are after, and you would be better served by posting a new, more concise question. If that is what you were asking about, be careful: used out of context, this example is not a good idea for many reasons.