I want to use a caching mechanism to improve performance in a PHP application, and memcache looks like a good option. While looking into it, though, I ran into a more general problem with caching…
The problem I see is that if, as in the examples, I simply use the hashed SQL statement as the key for the cached results, the same record can end up cached more than once. This would occur if SQL A fetched records 1, 2 and 4 and SQL B fetched records 2, 3 and 6. In this case record 2 would be in both result sets. If I instead store each record only once and have the result sets link to it, I'm not really getting any performance increase and would probably slow the application down.
Someone else must have come across this, so I wonder what anyone would suggest as a better way of managing the cache. Or is it better to simply not go for caching and always rely on the database?
Thanks.
Why even worry about this? The point is that the query is cached. If the results of two different queries partially overlap, well, so be it. Unless those results are many, many megabytes in size and hence take up a lot of space, it doesn't matter. What does matter is that if you want to run the same query again, you don't need to hit the database, because the results are already cached.
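To make that concrete, here is a rough sketch of caching whole result sets under a hash of the SQL text. Everything in it is made up for illustration: ArrayCache is a hypothetical in-memory stand-in for a real memcache client (so it runs without a server), cachedQuery is a helper name I invented, and the closure pretends to be the database and returns the record ids from the question.

```php
<?php
// Hypothetical in-memory stand-in for a memcache client (assumption, not a real API).
final class ArrayCache
{
    private array $items = [];

    // Returns the stored value, or false on a miss.
    public function get(string $key)
    {
        return array_key_exists($key, $this->items) ? $this->items[$key] : false;
    }

    public function set(string $key, $value): void
    {
        $this->items[$key] = $value;
    }
}

// Look the result set up by a hash of the SQL; run the query only on a miss.
function cachedQuery(ArrayCache $cache, string $sql, callable $runQuery): array
{
    $key = 'sql:' . md5($sql);
    $rows = $cache->get($key);
    if ($rows === false) {
        $rows = $runQuery($sql);   // cache miss: go to the database
        $cache->set($key, $rows);  // store the whole result set under this query's key
    }
    return $rows;
}

// Fake database for the sketch: counts hits and returns the ids from the question.
$dbHits = 0;
$sqlA = 'SELECT id FROM records WHERE a = 1'; // hypothetical query A
$sqlB = 'SELECT id FROM records WHERE b = 1'; // hypothetical query B
$fakeDb = function (string $sql) use (&$dbHits, $sqlA): array {
    $dbHits++;
    return $sql === $sqlA ? [1, 2, 4] : [2, 3, 6];
};

$cache  = new ArrayCache();
$first  = cachedQuery($cache, $sqlA, $fakeDb); // miss, hits the database
$second = cachedQuery($cache, $sqlA, $fakeDb); // hit, served from the cache
$other  = cachedQuery($cache, $sqlB, $fakeDb); // different key, so another miss
```

Record 2 does end up stored under both keys, but the second run of query A never touches the database, which is the saving you are after. With a real memcache client you would also want an expiry time on each entry; that is left out here.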