I have a ‘black box’ application that takes a map of values as parameters, performs a long, heavy calculation (up to 5s) and generates a single Result, which can be persisted in a database.
All I know about that application is that:
- Result is unique with respect to the provided map of values
- Argument is a String->String map with known maximum length for both key and value
- Argument map is of variable length (from 2-3 up to 1000 entries or so)
- The size of the list of possible key values is around 1000
Sample arguments are:
Map: {'k1'->'a', 'k2'->'b'}
Map: {'k1'->'a', 'k2'->'b', ... 'k100'->'zzz'}
Map: {'k1'->'x', 'k8'->'y'}
Map: {'k6'->'z'}
Each of the above will produce a unique Result object.
Now imagine another service, which is built on top of that slow library, and which needs to go online and handle dozens of calculation requests per second.
This is impossible without caching of already calculated results.
My estimate of the total cache size is somewhere around 100-500 million records, which leads me towards using an RDBMS as cache storage.
As the result is uniquely identified by the provided map, I could sort the argument map by key and concatenate it into a string ‘k1:a:k2:b….’. That would definitely work as the cache key, but:
- Cache key will be huge, above key size limits for many RDBMSs, and will require indexed CLOBs
- I will make no use of the fact that the set of possible keys is limited.
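To make the concatenated key concrete, here is a minimal Python sketch of that construction. The function name `concat_key` and the `:` separator are assumptions for illustration only; note that a plain separator becomes ambiguous if keys or values can themselves contain `:`.

```python
def concat_key(args: dict[str, str]) -> str:
    """Sort entries by key and join keys and values with ':'."""
    parts = []
    for k in sorted(args):
        parts.append(k)
        parts.append(args[k])
    return ":".join(parts)

# Insertion order of the map does not matter, only its contents.
print(concat_key({"k2": "b", "k1": "a"}))  # k1:a:k2:b
```

With up to 1000 entries per map, a key built this way grows with the total length of all keys and values, which is where the key size concern comes from.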
What’d be your advice? Performance is my main concern here.
Actually, this sounds more like a problem best solved by a key-value store or document database, not an RDBMS.
Another possibility worth looking into is a caching server like memcached.
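Whichever store is used, one common way around the key size problem is to hash a canonical form of the map into a fixed-length digest and use that as the key. The following is only a sketch under stated assumptions, not something taken from the question: `cache_key`, `cached_result` and the `compute` callback are hypothetical names, and a plain dict stands in for the real store.

```python
import hashlib
import json

def cache_key(args: dict[str, str]) -> str:
    """Return a fixed-length key for the map, independent of entry order."""
    # Canonical form: JSON with sorted keys and no extra whitespace.
    canonical = json.dumps(args, sort_keys=True, separators=(",", ":"))
    # SHA-256 hex digest is always 64 characters long.
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Stand-in for the real store (assumption: any get/set key-value API would do).
_store: dict[str, str] = {}

def cached_result(args: dict[str, str], compute) -> str:
    """Look the result up by hashed key; run the slow calculation only on a miss."""
    key = cache_key(args)
    if key not in _store:
        _store[key] = compute(args)
    return _store[key]
```

A 64-character key fits within memcached's 250-byte key limit and within an ordinary indexed fixed-width column. Hash collisions are extremely unlikely with SHA-256; if that is still a concern, the canonical form can be stored next to the result and compared on a hit.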