I was recently looking for a way to implement a doubly buffered thread-safe cache for regular objects.
The need arose because we had some cached data structures that were hit many times per request and had to be reloaded every minute from a very large document (1s+ unmarshalling time). We couldn’t afford to let every request be delayed by that long once a minute.
Since I couldn’t find a good threadsafe implementation I wrote my own and now I am wondering if it’s correct and if it can be made smaller… Here it is:
package nl.trimpe.michiel;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

/**
 * Abstract class implementing a double buffered cache for a single object.
 *
 * Implementing classes can load the object to be cached by implementing the
 * {@link #retrieve()} method.
 *
 * @param <T>
 *            The type of the object to be cached.
 */
public abstract class DoublyBufferedCache<T> {

    private static final Log log = LogFactory.getLog(DoublyBufferedCache.class);

    private Long timeToLive;
    private long lastRetrieval;
    private T cachedObject;
    private Object lock = new Object();
    private volatile Boolean isLoading = false;

    public T getCachedObject() {
        checkForReload();
        return cachedObject;
    }

    private void checkForReload() {
        if (cachedObject == null || isExpired()) {
            if (!isReloading()) {
                synchronized (lock) {
                    // Recheck expiration because another thread might have
                    // refreshed the cache before we were allowed into the
                    // synchronized block.
                    if (isExpired()) {
                        isLoading = true;
                        try {
                            cachedObject = retrieve();
                            lastRetrieval = System.currentTimeMillis();
                        } catch (Exception e) {
                            log.error("Exception occurred retrieving cached object", e);
                        } finally {
                            isLoading = false;
                        }
                    }
                }
            }
        }
    }

    protected abstract T retrieve() throws Exception;

    private boolean isExpired() {
        return (timeToLive > 0) ? ((System.currentTimeMillis() - lastRetrieval) > (timeToLive * 1000)) : true;
    }

    private boolean isReloading() {
        return cachedObject != null && isLoading;
    }

    public void setTimeToLive(Long timeToLive) {
        this.timeToLive = timeToLive;
    }
}
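What you’ve written isn’t threadsafe. In fact, you’ve stumbled onto a well-known fallacy: the double-checked locking problem. Solutions like yours (and there are several variations on this theme) all have issues. Here, cachedObject and lastRetrieval are read outside the synchronized block and are not volatile, so a thread that skips the lock has no guarantee of seeing the latest object or a timestamp that matches it.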
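One ingredient of a fix is safe publication. A minimal sketch, assuming Java 5 or later (where a write to a volatile field happens-before later reads of that field): the value and its load time are bundled in an immutable object and published through a single volatile reference, so a reader sees either the old pair or the new pair, never a mix. Snapshot and VolatileHolder are hypothetical names, not part of the code above.

```java
/** Hypothetical immutable pair of a cached value and the time it was loaded. */
final class Snapshot<T> {
    final T value;
    final long loadedAtMillis;

    Snapshot(T value, long loadedAtMillis) {
        this.value = value;
        this.loadedAtMillis = loadedAtMillis;
    }
}

/** Hypothetical holder that publishes snapshots through one volatile field. */
class VolatileHolder<T> {
    // A single volatile write publishes value and timestamp together.
    private volatile Snapshot<T> current;

    void publish(T value, long nowMillis) {
        current = new Snapshot<>(value, nowMillis);
    }

    T value() {
        Snapshot<T> s = current; // read the field once for a consistent view
        return s == null ? null : s.value;
    }

    boolean isExpired(long nowMillis, long ttlMillis) {
        Snapshot<T> s = current;
        return s == null || nowMillis - s.loadedAtMillis > ttlMillis;
    }
}
```

This only addresses visibility. It does not by itself stop two threads from starting a reload at the same time.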
There are a few potential solutions to this but imho the easiest is simply to use a ScheduledExecutorService and reload what you need every minute or however often you need to. When you reload it, store it as the cached result, and the calls for it just return the latest version. This is threadsafe and easy to implement. Sure it’s not on-demand loaded but, apart from the initial value, you’ll never take a performance hit while you retrieve the value. I’d call this over-eager loading rather than lazy-loading.
For example:
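A minimal sketch of such a cache, not a definitive implementation. The names Cache, loader and ttlMillis are assumptions, and refresh() here runs the load inline through a FutureTask instead of submitting a second task, so that a single-threaded scheduler cannot end up waiting on itself.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical cache for the result of a Callable, reloaded in the background. */
class Cache<T> {
    private final ScheduledExecutorService scheduler;
    private final Callable<T> loader;
    private final long ttlMillis;
    // volatile so readers always see the most recently published Future
    private volatile Future<T> result;

    Cache(ScheduledExecutorService scheduler, Callable<T> loader, long ttlMillis) {
        this.scheduler = scheduler;
        this.loader = loader;
        this.ttlMillis = ttlMillis;
        // The first load starts right away; get() blocks on it until it is done.
        this.result = scheduler.submit(loader);
        scheduleRefresh();
    }

    /** Returns the latest value; blocks only while the very first load is running. */
    T get() {
        try {
            return result.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for the first load", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Initial load failed", e.getCause());
        }
    }

    private void scheduleRefresh() {
        scheduler.schedule((Runnable) this::refresh, ttlMillis, TimeUnit.MILLISECONDS);
    }

    private void refresh() {
        FutureTask<T> task = new FutureTask<>(loader);
        task.run(); // executes the load on the scheduler thread
        try {
            task.get();    // already complete; throws if the load failed
            result = task; // publish the new value
        } catch (Exception e) {
            // The load failed: keep serving the previous value.
        } finally {
            scheduleRefresh(); // resubmit only after this refresh has finished
        }
    }
}
```

Constructed with, say, Executors.newSingleThreadScheduledExecutor() and a Callable that unmarshals the document, readers call get() and never wait on a reload after the first one.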
That takes a little explanation. Basically, you’re creating a generic interface for caching the result of a Callable, which will be your document load. Submitting a Callable (or Runnable) returns a Future. Calling Future.get() blocks until the task completes.
So what this does is implement get() in terms of a Future, so initial queries won’t fail (they will block). After that, every ‘ttl’ milliseconds the refresh method is called. It submits the load to the scheduler and calls Future.get(), which waits for the load to complete. Once complete, it replaces the ‘result’ member. Subsequent Cache.get() calls will return the new value.
There is a scheduleAtFixedRate() method on ScheduledExecutorService but I avoid it: if the task takes longer than the period, later runs start late and fire back to back (the executor never runs them concurrently, but you then have to worry about throttling). It’s easier just to have the task resubmit itself at the end of each refresh.
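For comparison, ScheduledExecutorService also offers scheduleWithFixedDelay(), which starts counting the delay only once the previous run has finished, so it behaves much like a task that resubmits itself. A minimal sketch, assuming hypothetical names FixedDelayReloader, loader and delayMillis:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical reloader built on scheduleWithFixedDelay(). */
class FixedDelayReloader<T> {
    private final AtomicReference<T> latest = new AtomicReference<>();

    /** Starts reloading immediately, then again delayMillis after each run ends. */
    ScheduledFuture<?> start(ScheduledExecutorService scheduler,
                             Callable<T> loader, long delayMillis) {
        Runnable reload = () -> {
            try {
                latest.set(loader.call());
            } catch (Exception e) {
                // Swallowed on purpose: an exception escaping the task would
                // suppress all further runs. The previous value stays in place.
            }
        };
        return scheduler.scheduleWithFixedDelay(reload, 0, delayMillis, TimeUnit.MILLISECONDS);
    }

    /** Returns the most recently loaded value, or null before the first load. */
    T current() {
        return latest.get();
    }
}
```

Unlike a Future-based get(), current() returns null until the first load completes, so callers would need to handle that case.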