Right now we are storing some query results in Memcache. After investigating a bit more, I’ve seen that many people save each individual item in Memcache instead. The benefit of doing this is that any other request can then fetch those items from Memcache.
Store an array
$key = 'page.items.20';
// on a cache miss, run the query and cache the whole result set for an hour
if( !( $results = $memcache->get($key) ) )
{
    $results = $con->execute('SELECT * FROM table LEFT JOIN .... LIMIT 0,20')->fetchAll();
    $memcache->save($results, $key, 3600);
}
...
PROS:
- Easier to implement
CONS:
- If I change an individual item, I have to delete every cached query that contains it (it can be a pain)
- I can have duplicated results (the same item cached under different queries)
vs
Store each item
$key = 'page.items.20';
if( !( $results_ids = $memcache->get($key) ) )
{
    $results = $con->execute('SELECT * FROM table LEFT JOIN .... LIMIT 0,20')->fetchAll();
    $results_ids = array();
    foreach ( $results as $result )
    {
        $results_ids[] = $result['id'];
        // if doesn't exist, save individual item
        $memcache->add($result, 'item'.$result['id'], 3600);
    }
    // save results_ids
    $memcache->save($results_ids, $key, 3600);
}
else
{
    // items are stored under 'item'.$id, so build those keys before fetching
    $item_keys = array();
    foreach ( $results_ids as $id )
    {
        $item_keys[] = 'item'.$id;
    }
    $results = $memcache->multi_get($item_keys);
    // get elements which are not cached
    ...
}
...
PROS:
- I don’t have the same item stored twice in Memcache
- Easier invalidation: when an item changes, only that item has to be refreshed, even if it appears in several queries
CONS:
- More complicated business logic.
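To give an idea of what that extra logic can look like, here is a minimal sketch of the "fetch what is cached, load the rest" step. It is only an illustration under assumptions: ArrayCache is a hypothetical in-memory stand-in (not a real Memcache client), fetchItems is a made-up helper name, and $loadFromDb is a hypothetical callback that receives the missing ids and returns rows keyed by id.

```php
<?php
// Hypothetical in-memory stand-in for a cache client, used only to keep
// the sketch self-contained. It is NOT the Memcache API.
final class ArrayCache
{
    private array $data = [];

    // Returns the cached value, or false on a miss.
    public function get(string $key)
    {
        return $this->data[$key] ?? false;
    }

    public function set(string $key, $value): void
    {
        $this->data[$key] = $value;
    }

    public function delete(string $key): void
    {
        unset($this->data[$key]);
    }
}

// Returns the items for $ids in the same order, reading the cache first and
// calling $loadFromDb (hypothetical: takes missing ids, returns rows keyed
// by id) only for the ids that were not cached.
function fetchItems(ArrayCache $cache, array $ids, callable $loadFromDb): array
{
    $items = [];
    $missing = [];
    foreach ($ids as $id) {
        $item = $cache->get('item' . $id);
        if ($item === false) {
            $missing[] = $id;
        } else {
            $items[$id] = $item;
        }
    }

    if ($missing !== []) {
        foreach ($loadFromDb($missing) as $id => $row) {
            $cache->set('item' . $id, $row); // backfill the cache
            $items[$id] = $row;
        }
    }

    // Restore the order of the cached id list; skip ids that no longer exist.
    $ordered = [];
    foreach ($ids as $id) {
        if (isset($items[$id])) {
            $ordered[] = $items[$id];
        }
    }
    return $ordered;
}
```

With this shape, invalidating a changed item would come down to a single delete of its 'item'.$id key; the cached id lists stay valid as long as the item still belongs to those queries.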
What do you think? Any other PROS or CONS for each approach?
Grab stats and try to calculate the hit ratio, or the possible improvement, of caching the full query versus fetching individual items from MC. Profiling this kind of code is also very helpful for seeing how your theory holds up in practice.
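As a rough sketch of the hit-ratio part: memcached's stats output includes the counters get_hits and get_misses, so the ratio is hits / (hits + misses). The helper name hitRatio and the sample numbers below are made up for illustration, and the input is assumed to be the stats array for a single server.

```php
<?php
// Hit ratio from one server's stats array, using memcached's
// "get_hits" and "get_misses" counters.
// Returns null when no gets have been recorded yet.
function hitRatio(array $stats): ?float
{
    $hits = (int) ($stats['get_hits'] ?? 0);
    $misses = (int) ($stats['get_misses'] ?? 0);
    $total = $hits + $misses;

    return $total === 0 ? null : $hits / $total;
}

// Made-up numbers, for illustration only.
$sample = ['get_hits' => 900, 'get_misses' => 100];
printf("hit ratio: %.2f\n", hitRatio($sample));
```

Comparing this number before and after switching strategies, over a similar traffic window, gives a first indication of whether per-item caching is paying off.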
It depends on what the query does. If you have a set of users and then want to grab the “top 10 music affinity” with some of those friends, it is worth caching both:
– each friend (in fact, each user of the site)
– the top 10 query for each user (space is cheaper than CPU time)
But in general it is worth storing in MC every individual entity that is going to be used frequently (in the same code execution, in subsequent requests, or by other users). As for CPU- or resource-heavy queries and data processing, either cache their results in MC or delegate them to async jobs instead of computing them in real time (e.g. the top 10 site users doesn’t need to be real-time; it can be updated hourly or daily).
And of course, keep in mind that if you store individual entities in MC, you have to remove all referential integrity from the DB to be able to reuse them either individually or in groups.