I have a Django app that gets near-realtime data (tweets and votes), although updates occur only every minute or two on average. However, we want to update the site and API results as soon as new data comes in.
We might see a whole ton of load on this site, so my initial thought is of course caching!
Is it practical to have some sort of Memcached cache that gets invalidated manually by another process or event? In other words, I would cache views for a long time, and then have new tweets and votes invalidate the entire view.
- Do the perhaps modest performance enhancements justify the added complexity?
- Is there a practical implementation I could create (I work with other developers, so hacking tons of stuff around each response call isn’t a good option)?
I’m not concerned about invalidating only some of the objects, and I considered subclassing the MemcachedCache backend to add some functionality following this strategy. But of course, Django’s sessions also use Memcached as a write-through cache, and I don’t want to invalidate that.
Thanks to @rdegges's suggestions, I was able to figure out a great way to do this.
I follow this paradigm:
Here’s all the code you need to do it this way:
This works by setting a version, or namespace, on each cached entry, and storing that version in the cache. The version is just the current epoch time when `reset()` is called.
You must specify your alternate non-namespaced cache with `settings.REGULAR_CACHE`, so the version number can be stored in a non-namespaced cache (so it doesn’t get recursive!).
Whenever you add a bunch of data and want to clear your cache (assuming you have set this one as the `default` cache), just do:
You can access any cache with:
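As a rough illustration of the versioning idea described above, here is a framework-free sketch; it is not the code from this answer. The names `DictBackend`, `NamespacedCache`, and `VERSION_KEY` are hypothetical, and the in-memory dict stands in for a real Memcached client. In a Django project the same prefixing would more likely live in a custom cache backend, for example by overriding `make_key`.

```python
import time


class DictBackend:
    """Stand-in for a real cache client such as Memcached (hypothetical)."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value):
        self._data[key] = value


class NamespacedCache:
    """Prefixes every key with a version kept in a separate, plain cache."""

    VERSION_KEY = "namespaced-cache-version"  # hypothetical key name

    def __init__(self, backend, version_store):
        self.backend = backend              # holds the namespaced entries
        self.version_store = version_store  # non-namespaced, holds only the version

    def _version(self):
        version = self.version_store.get(self.VERSION_KEY)
        if version is None:
            version = self.reset()
        return version

    def _key(self, key):
        return f"{self._version()}:{key}"

    def get(self, key, default=None):
        return self.backend.get(self._key(key), default)

    def set(self, key, value):
        self.backend.set(self._key(key), value)

    def reset(self):
        """Start a new namespace; entries under the old version become unreachable."""
        previous = self.version_store.get(self.VERSION_KEY) or 0
        # Epoch time in nanoseconds, bumped if the clock has not advanced.
        version = max(time.time_ns(), previous + 1)
        self.version_store.set(self.VERSION_KEY, version)
        return version


regular = DictBackend()
cache = NamespacedCache(DictBackend(), version_store=regular)
cache.set("timeline", ["tweet 1"])
before = cache.get("timeline")   # ["tweet 1"]
cache.reset()                    # e.g. after new tweets or votes arrive
after = cache.get("timeline")    # None: the old entry is no longer reachable
```

Note that stale entries are never deleted explicitly in this sketch; they simply stop being looked up, and a real Memcached server would evict them as memory fills.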
Finally, I recommend you don’t put your sessions in this cache. You can point your sessions at a different cache with `settings.SESSION_CACHE_ALIAS`.
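A settings fragment along these lines would keep sessions out of the namespaced cache. This is only a sketch under assumptions: the dotted path `myproject.cache.NamespacedMemcachedCache` and the alias names `regular` and `sessions` are hypothetical, `REGULAR_CACHE` is assumed to hold the alias of the non-namespaced cache, and the built-in Memcached backend class name depends on your Django version.

```python
# settings.py (fragment)
CACHES = {
    # Namespaced cache used for views; cleared as a whole via reset().
    "default": {
        "BACKEND": "myproject.cache.NamespacedMemcachedCache",  # hypothetical path
        "LOCATION": "127.0.0.1:11211",
    },
    # Plain cache that stores the version number for the namespaced one.
    "regular": {
        "BACKEND": "django.core.cache.backends.memcached.PyMemcacheCache",
        "LOCATION": "127.0.0.1:11211",
    },
    # Separate alias so a reset never touches session data.
    "sessions": {
        "BACKEND": "django.core.cache.backends.memcached.PyMemcacheCache",
        "LOCATION": "127.0.0.1:11211",
        "KEY_PREFIX": "sessions",
    },
}

# Custom setting read by the namespaced backend (assumed to be a cache alias).
REGULAR_CACHE = "regular"

# Write-through cached sessions, stored under their own alias.
SESSION_ENGINE = "django.contrib.sessions.backends.cached_db"
SESSION_CACHE_ALIAS = "sessions"
```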