What is the best way to implement a cache for Sets? Particularly, what makes the best key for the cache?
In a static factory method, I want to include a caching mechanism so that I can reuse existing (immutable) objects. This reuse should not come with a significant performance penalty. The critical data of this class is a parameterized LinkedHashSet. I’m wondering if it’s wise to use the hashCode of this Set as the key for the cache (a HashMap), because the Java documentation says:
“The hash code of a set is defined to be the sum of the hash codes of the elements in the set”.
Isn’t this potentially a slow process? When is it calculated: as soon as the Set is created, or on demand? Couldn’t this eat up much of the performance I expect to gain by caching?
Furthermore, hashCode is an int, but HashMaps don’t accept primitives, so this involves boxing to Integer, right?
My current approach would be to maintain an additional set of the sizes of the sets in the existing objects. The factory method would first check whether the current set’s size is listed, and only then look it up in the actual index. But this also involves boxing…
Is there a better solution?
You need to use some invariant as the key for each set, something that logically defines the contents of that set.
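To illustrate why the raw hashCode alone cannot serve as that invariant, here is a minimal sketch (the class and method names are made up for this example). Since a set's hash code is the sum of its elements' hash codes, two different sets can easily share one:

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class, not part of the asker's code.
class HashCollisionDemo {

    // Returns true if two sets with different contents share a hash code.
    static boolean sameHashDifferentContent() {
        Set<Integer> a = new LinkedHashSet<>(List.of(1, 2)); // hash code 1 + 2
        Set<Integer> b = new LinkedHashSet<>(List.of(3));    // hash code 3
        return a.hashCode() == b.hashCode() && !a.equals(b);
    }

    public static void main(String[] args) {
        System.out.println(sameHashDifferentContent());
    }
}
```

A cache keyed only by the Integer hash code would hand back the object built for one of these sets when asked for the other.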
Consider creating a NamedSet, either by wrapping your existing set implementation with a simple delegator or by subclassing it (if it is not final). You can then provide an additional key or name field to identify the set and use that as the key for your cache.
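A rough sketch of the delegator variant could look like the following. Everything here is an assumption for illustration: the names NamedSet, NamedSetCache and of, the choice of String as the key type, and the use of a per-instance cache rather than your static factory.

```java
import java.util.AbstractSet;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical delegator: an immutable set that carries an identifying name.
final class NamedSet<E> extends AbstractSet<E> {
    private final String name;
    private final Set<E> delegate;

    NamedSet(String name, Set<E> elements) {
        this.name = name;
        // Defensive copy, so later changes to the source set cannot leak in.
        this.delegate = Collections.unmodifiableSet(new LinkedHashSet<>(elements));
    }

    String name() { return name; }

    @Override public Iterator<E> iterator() { return delegate.iterator(); }
    @Override public int size() { return delegate.size(); }
    @Override public boolean contains(Object o) { return delegate.contains(o); }
}

// Hypothetical cache keyed by the name instead of by the set's hash code.
final class NamedSetCache<E> {
    private final Map<String, NamedSet<E>> cache = new HashMap<>();

    // Returns the cached instance for this name, creating it on first use.
    // Assumption: the caller guarantees that a name always denotes the same
    // contents; on a cache hit the passed elements are ignored.
    NamedSet<E> of(String name, Set<E> elements) {
        return cache.computeIfAbsent(name, k -> new NamedSet<>(k, elements));
    }
}
```

The lookup hashes only the short name, so its cost does not grow with the size of the set, and a String key needs no boxing. The trade-off is that you have to come up with names that really do identify the contents.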