I’m using the Reactive Extensions (Rx) and a repository pattern to facilitate getting data from a relatively slow data source. I have the following (simplified) interface:
    public interface IStorage
    {
        IObservable<INode> Fetch(IObservable<Guid> ids);
    }
Creating an instance of the implementation of IStorage is slow – think creating a web service or db connection. Each Guid in the ids observable results in exactly one INode (or null) in the returned observable, and each result is expensive. Therefore, it makes sense to me to instantiate IStorage only if I have at least one value to fetch, and then to use IStorage to fetch each value only once per Guid.
To limit the calls to IStorage I cache the results in my Repository class that looks like this:
    public class Repository
    {
        private Dictionary<Guid, INode> NodeCache { get; set; }
        private Func<IStorage> StorageFactory { get; set; }

        public IObservable<INode> Fetch(IObservable<Guid> ids)
        {
            var lazyStorage = new Lazy<IStorage>(this.StorageFactory);
            // from id in ids
            // if NodeCache contains id select NodeCache[id]
            // else select node from lazyStorage.Value.Fetch(...)
        }
    }
In the Repository.Fetch(...) method I’ve included comments indicating what I’m trying to do.
Essentially though, if the NodeCache contains all of the ids being fetched then IStorage is never instantiated and the results are returned with almost no delay. However, if any one id is not in the cache then IStorage is instantiated and all of the unknown ids are passed through the IStorage.Fetch(...) method.
The one-to-one mapping, including order preservation, needs to be maintained.
Any ideas?
It took a while to work it out, but I finally got my own solution.
I have defined two extension methods called FromCacheOrFetch with these signatures:

The first uses standard CLR/Rx types and the second uses a Maybe monad (nullable types not restricted to value types).

The first just turns the Func<T, R> into Func<T, Maybe<R>> and calls the second method.

The basic idea is that when the source is queried, the cache is examined for each value to see whether a result already exists; if it does, the result is returned immediately. If, however, any result is missing, then and only then is the fetch function called, by passing in a Subject<T>, and from that point all cache misses are passed through the fetch function. The calling code is responsible for adding the results to the cache. The code asynchronously processes all the values through the fetch function and reassembles the results, along with cached results, into the correct order.

Works like a treat. 🙂
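To make the idea concrete before the real code, here is a minimal sketch of the same approach. It is not the implementation referred to in this answer: the names FromCacheOrFetchSketch and TryGetCached are hypothetical, it uses a TryGetValue-shaped delegate instead of the Maybe monad, and it uses a ReplaySubject<T> rather than a plain Subject<T> so that no miss can be lost before the fetch function subscribes. It assumes the fetch function returns exactly one result per input, in input order, and that System.Reactive is referenced.

```csharp
using System;
using System.Reactive.Disposables;
using System.Reactive.Linq;
using System.Reactive.Subjects;

// Hypothetical delegate with the same shape as Dictionary<TKey, TValue>.TryGetValue.
public delegate bool TryGetCached<T, R>(T key, out R value);

public static class CacheOrFetchSketch
{
    public static IObservable<R> FromCacheOrFetchSketch<T, R>(
        this IObservable<T> source,
        TryGetCached<T, R> tryGetCached,
        Func<IObservable<T>, IObservable<R>> fetch)
    {
        return Observable.Create<R>(observer =>
        {
            var misses = new ReplaySubject<T>();            // all cache misses, in order
            var connection = new SingleAssignmentDisposable();
            IObservable<R> fetched = null;                  // created on the first miss only
            var missCount = 0;

            // Turn every input into a "slot" that will yield exactly one result.
            var slots = source.Select(item =>
            {
                if (tryGetCached(item, out var hit))
                    return Observable.Return(hit);          // cache hit: no fetch involved

                if (fetched == null)
                {
                    // First miss: only now is the fetch function invoked.
                    var replayed = fetch(misses).Replay();
                    connection.Disposable = replayed.Connect();
                    fetched = replayed;
                }

                var index = missCount++;                    // the k-th miss maps to the k-th fetched result
                misses.OnNext(item);
                return fetched.ElementAt(index);
            })
            // Tell the fetch function when no more misses can arrive.
            .Do(_ => { }, misses.OnError, misses.OnCompleted);

            // Concat emits the slots strictly in source order.
            var subscription = slots.Concat().Subscribe(observer);
            return new CompositeDisposable(subscription, connection);
        });
    }
}
```

With the types from the question, this might be called as ids.FromCacheOrFetchSketch<Guid, INode>(NodeCache.TryGetValue, misses => lazyStorage.Value.Fetch(misses)), so the Lazy<IStorage> is only touched on the first miss. Note that a cached value that follows a pending miss is held back until that miss has been fetched, which is the price of preserving order, and that the sketch leaves populating the cache to the caller.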
Here’s the code: