I’m wondering what the best approach to this problem would be. I have an (abstractly speaking) simple method that calls a webservice and stores the result in a local in-memory cache, something like:
public Document callWebservice(SomeObject parameter) {
    Document result = cache.get(parameter);
    if (result == null) {
        result = parse(retrieve(parameter));
        // store under the same key that cache.get() looks up
        cache.put(parameter, result);
    }
    return result;
}
Now, if the Document is in the cache, it can just be returned without problems. In a single-threaded environment, this approach works fine too. However, in a multithreaded environment, it turns out that several threads can see the cache miss at the same time, all enter the ‘result == null’ branch, and call the webservice several times for the same parameter.
I could chuck a synchronized block around that branch, but I believe that’s too ‘wide’ of a lock – every calling thread would be blocked while one request is in progress, even threads that ask for completely different things.
It’s fine if the webservice is called twice, as long as the request is different (i.e. the SomeObject parameter, in this case).
Now, question: What’s the best approach to take in this case?
I’ve thought of storing the parameter in a (threadsafe) collection object. If the parameter’s contents are the same, it’ll produce the same hashCode / equals outcome and will be found in the collection object, indicating that another thread is already processing this request. If that would be the case, the calling thread could be paused until the webservice returns. (I’d have to figure out how to make the calling thread wait though). Would this work with a lock on the SomeObject parameter object? e.g:
private Map<SomeObject, SomeObject> currentlyProcessingItems = new ConcurrentHashMap<SomeObject, SomeObject>();

public Document callWebservice(SomeObject parameter) {
    // putIfAbsent returns the instance already registered for an equal
    // parameter, or null if this call was the first to register one
    SomeObject existing = currentlyProcessingItems.putIfAbsent(parameter, parameter);
    SomeObject lock = (existing != null) ? existing : parameter;
    synchronized (lock) {
        Document result = cache.get(parameter);
        if (result == null) {
            result = parse(retrieve(parameter));
            cache.put(parameter, result);
        }
        currentlyProcessingItems.remove(parameter);
        return result;
    }
}
(note: logic for keeping track of the currently processing requests, usage of ConcurrentHashMap and the locking may be suboptimal or outright wrong)
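For the sake of comparison, here is a minimal sketch of the "wait for the thread that is already fetching this parameter" idea, using a future per key instead of a lock on the parameter object. It is only one possible way to do it, not a definitive implementation. The class name PerKeyLoadingCache and the loader function are made up for illustration: K plays the role of SomeObject, V the role of Document, and loader stands in for parse(retrieve(parameter)).

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Hypothetical helper, not part of the original code.
class PerKeyLoadingCache<K, V> {
    private final ConcurrentMap<K, CompletableFuture<V>> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // assumed stand-in for parse(retrieve(parameter))

    PerKeyLoadingCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        CompletableFuture<V> future = cache.get(key);
        if (future == null) {
            CompletableFuture<V> created = new CompletableFuture<>();
            // Atomically register the placeholder; only one thread per key gets null back.
            future = cache.putIfAbsent(key, created);
            if (future == null) {
                future = created;
                try {
                    // This thread won the race, so it is the only one that loads this key.
                    created.complete(loader.apply(key));
                } catch (Throwable t) {
                    // Drop the failed entry so a later call can retry, then wake the waiters.
                    cache.remove(key, created);
                    created.completeExceptionally(t);
                }
            }
        }
        // Blocks until the loading thread has completed the future.
        // A failed load surfaces here as a CompletionException.
        return future.join();
    }
}
```

Threads asking for different keys never wait on each other here; threads asking for an equal key (same hashCode / equals) block in join() until the first one has finished. Note that this sketch never evicts entries, so a real cache would still need some expiry policy.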
No, I never actually finished reading the book on threading. I should.
I’m pretty sure this particular problem is quite common; I just wasn’t able to find the answer. What is a situation like this called (i.e. locking on a specific object), if I may ask?
I was thinking about this problem myself, especially for the case where retrieve(parameter) takes a long time (for me, it connects to a server to check authentication, processes/filters the request at the back end, etc.). I have not tried this out myself yet, but for the sake of discussion, how does this sound?
To use the cache, one has to call new MyCache(key).getValue() from a thread, since getValue() will block inside the method getCacheValue() until the value becomes available (to all waiting threads).
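A rough sketch of a lookup that blocks until the value is available, written with ConcurrentHashMap.computeIfAbsent rather than MyCache. This is a swapped-in technique for illustration, not the MyCache implementation described above; the class name BlockingCache and the loader function are assumptions.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical class for illustration only.
class BlockingCache<K, V> {
    private final ConcurrentHashMap<K, V> values = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // assumed stand-in for the slow retrieve step

    BlockingCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V getValue(K key) {
        // ConcurrentHashMap applies the function at most once per key; other
        // callers for the same key wait here until the value has been stored.
        // The loader must not return null (no mapping would be recorded) and
        // must not modify this map.
        return values.computeIfAbsent(key, loader);
    }
}
```

One caveat, as far as I understand the ConcurrentHashMap documentation: while the function runs, some update operations on the map by other threads may be blocked as well, so with a very slow loader this can end up holding up more callers than just the ones waiting for the same key.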