I’ve been working on creating a subclass of db.Model that is automatically cached, i.e.:

Editorial Team

Asked: May 26, 2026 (2026-05-26T17:54:23+00:00)

I’ve been working on creating a subclass of db.Model that is automatically cached, i.e.:

- instance.put would store the entity in memcache before persisting it to the datastore.
- class.get_by_key_name would first check the cache and, on a miss, retrieve the entity from the datastore and cache it after retrieval.

I developed the approach below (which appears to work for me), but I have a few questions:

1. I read Nick Johnson’s article on efficient model memcaching, which suggests serializing entities for memcache through protocol buffers. Looking at the memcache API source code in the SDK, it looks like Google has already implemented protobuf serialization by default. Is my interpretation correct?
2. Am I missing important details (ones that could cause problems later) in the way I am subclassing db.Model or overriding the two methods?
3. Is there a more efficient way of implementing what I’ve done below?
4. Are there guidelines, benchmarks or best practices for when such entity caching makes sense from a performance perspective? Or does it always make sense to cache entities? On a related note, should I read anything into the fact that Google hasn’t provided a cached model in the modeling API? Are there too many special cases to think about?

Below is my current implementation. I would really appreciate any and all guidance/suggestions on caching entities (even if your response is not a direct answer to one of the 4 questions above, but relevant to the topic overall).

from google.appengine.ext import db
from google.appengine.api import memcache

import os
import logging

class CachedModel(db.Model):
    '''Subclass of db.Model that automatically caches entities for put and 
    attempts to load from cache for get_by_key_name
    '''

    @classmethod
    def get_by_key_name(cls, key_names, parent=None, **kwargs):
        cache = memcache.Client()
        # Ensure that every new deployment of the application results in a cache miss
        # by including the application version ID in the namespace of the cache entry
        namespace = os.environ['CURRENT_VERSION_ID'] + '_' + cls.__name__

        # Remember whether the caller passed a list, so the return type matches
        # db.Model.get_by_key_name (a list in gives a list out, even for one key)
        multiple = isinstance(key_names, list)
        if not multiple:
            key_names = [key_names]
        entities = cache.get_multi(key_names, namespace=namespace)
        if entities:
            logging.info('%s (namespace=%s) retrieved from memcache' % (str(entities.keys()), namespace))

        missing_key_names = list(set(key_names) - set(entities.keys()))
        # For keys missed in memcache, attempt to retrieve entities from datastore
        if missing_key_names:
            missing_entities = super(CachedModel, cls).get_by_key_name(missing_key_names, parent, **kwargs)
            missing_mapping = zip(missing_key_names, missing_entities)
            # Determine entities that exist in datastore and store them to memcache
            entities_to_cache = dict()
            for key_name, entity in missing_mapping:
                if entity:
                    entities_to_cache[key_name] = entity
            if entities_to_cache:
                logging.info('%s (namespace=%s) cached by get_by_key_name' % (str(entities_to_cache.keys()), namespace))
                cache.set_multi(entities_to_cache, namespace=namespace)
            non_existent = set(missing_key_names) - set(entities_to_cache.keys())
            if non_existent:
                logging.info('%s (namespace=%s) missing from cache and datastore' % (str(non_existent), namespace))
            # Combine entities retrieved from cache and entities retrieved from datastore
            entities.update(missing_mapping)

        if multiple:
            return [entities[key_name] for key_name in key_names]
        return entities[key_names[0]]

    def put(self, **kwargs):
        cache = memcache.Client()
        namespace = os.environ['CURRENT_VERSION_ID'] + '_' + self.__class__.__name__
        cache.set(self.key().name(), self, namespace=namespace)
        logging.info('%s (namespace=%s) cached by put' % (self.key().name(), namespace))
        return super(CachedModel, self).put(**kwargs)
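
For anyone who wants to experiment with this pattern outside App Engine, here is a minimal sketch of the same idea (cache on put, read through the cache on get) using plain in-memory stand-ins. FakeCache, FakeStore and CachedRepository are hypothetical names made up for this illustration and are not part of the SDK. Note one deliberate difference from the code above: this sketch writes to the store first and only then to the cache, on the assumption that a failed write should not leave an unsaved value in the cache.

```python
class FakeCache:
    """Hypothetical in-memory stand-in for memcache (illustration only)."""

    def __init__(self):
        self._data = {}

    def get_multi(self, keys, namespace=""):
        # Return only the keys that are present, like a cache lookup would.
        return {k: self._data[(namespace, k)]
                for k in keys if (namespace, k) in self._data}

    def set_multi(self, mapping, namespace=""):
        for key, value in mapping.items():
            self._data[(namespace, key)] = value


class FakeStore:
    """Hypothetical in-memory stand-in for the datastore; counts reads."""

    def __init__(self):
        self._rows = {}
        self.reads = 0

    def put(self, key_name, value):
        self._rows[key_name] = value

    def get_multi(self, key_names):
        self.reads += len(key_names)
        # Missing rows come back as None, in the same order as the request.
        return [self._rows.get(k) for k in key_names]


class CachedRepository:
    """Write-through on put, read-through on get (a sketch, not the SDK API)."""

    def __init__(self, cache, store, namespace):
        self.cache = cache
        self.store = store
        self.namespace = namespace

    def put(self, key_name, value):
        # Assumption of this sketch: persist first, then cache.
        self.store.put(key_name, value)
        self.cache.set_multi({key_name: value}, namespace=self.namespace)

    def get(self, key_names):
        # A list in gives a list out; a single key gives a single value.
        multiple = isinstance(key_names, list)
        names = key_names if multiple else [key_names]

        found = self.cache.get_multi(names, namespace=self.namespace)
        missing = [n for n in names if n not in found]
        if missing:
            fetched = dict(zip(missing, self.store.get_multi(missing)))
            # Cache only values that actually exist in the store.
            to_cache = {k: v for k, v in fetched.items() if v is not None}
            if to_cache:
                self.cache.set_multi(to_cache, namespace=self.namespace)
            found.update(fetched)

        results = [found[n] for n in names]
        return results if multiple else results[0]
```

Constructing CachedRepository(FakeCache(), FakeStore(), "v1_Person") and watching the store's reads counter is a quick way to check that repeated lookups are served from the cache. Keys that do not exist in the store are not cached here, so they reach the store on every lookup.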

1 Answer

Editorial Team · Answer 1 · 2026-05-26T17:54:24+00:00

Rather than reinventing the wheel, why not switch to NDB, which already implements memcaching of model instances?
