I have a latency problem in my application because the datastore issues additional queries for referenced entities. I have received good advice on how to handle this for single-value properties by using the get_value_for_datastore() function. However, my application also has one-to-many relationships, as shown in the code below, and I have not found a way to prefetch these entities. The result is unacceptable latency (>6000 ms) when trying to show a table of 200 documents and their associated DocumentFiles.
(There will probably never be more than 10,000 Documents or DocumentFiles.)
Is there a way to solve this?
models.py
class Document(db.Expando):
    title = db.StringProperty()
    lastEditedBy = db.ReferenceProperty(DocUser, collection_name='documentLastEditedBy')
    ...

class DocUser(db.Model):
    user = db.UserProperty()
    name = db.StringProperty()
    hasWriteAccess = db.BooleanProperty(default=False)
    isAdmin = db.BooleanProperty(default=False)
    accessGroups = db.ListProperty(db.Key)
    ...

class DocumentFile(db.Model):
    description = db.StringProperty()
    blob = blobstore.BlobReferenceProperty()
    created = db.DateTimeProperty()  # needs to be stored here in relation to upload / download of everything
    document = db.ReferenceProperty(Document, collection_name='files')

    @property
    def link(self):
        return '<a href="/file/serve/%s">%s</a>' % (self.key().id(), self.blob.filename)
    ...
main.py
docUsers = DocUser.all()
docUsersNameDict = dict([(i.key(), i.name) for i in docUsers])
documents = Document.all()
for d in documents:
    out += '<td>%s</td>' % d.title
    # Read the stored key without dereferencing (and fetching) the DocUser
    docUserKey = Document.lastEditedBy.get_value_for_datastore(d)
    out += '<td>%s</td>' % docUsersNameDict.get(docUserKey)
    out += '<td>'
    # Creates a new query for each document, resulting in unacceptable latency
    for file in d.files:
        out += file.link + '<br>'
    out += '</td>'
Denormalize and store the link in your Document, so that getting the link will be fast.
Be careful, though: whenever you update a DocumentFile, you must also update the associated Document. This approach assumes that you read the link from the datastore far more often than you update it.
Denormalizing is often the fix for poor performance on App Engine.
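As a rough illustration of the idea, here is a minimal sketch that uses plain Python classes as stand-ins for the datastore models, so it makes no App Engine calls. The file_links field and the save_file, delete_file and render_files_cell helpers are hypothetical names invented for this example; in the real models the denormalized data might live in something like a db.StringListProperty on Document.

```python
# Plain-Python stand-ins for the datastore models (no App Engine calls).
# file_links, save_file, delete_file and render_files_cell are hypothetical
# names used only for this sketch.

class Document(object):
    def __init__(self, title):
        self.title = title
        # Denormalized copy of each file's HTML link, keyed by file id.
        self.file_links = {}


class DocumentFile(object):
    def __init__(self, file_id, filename, document):
        self.file_id = file_id
        self.filename = filename
        self.document = document

    @property
    def link(self):
        return '<a href="/file/serve/%s">%s</a>' % (self.file_id, self.filename)


def save_file(doc_file):
    """Write path: refresh the denormalized link on the parent Document."""
    doc_file.document.file_links[doc_file.file_id] = doc_file.link


def delete_file(doc_file):
    """Write path: drop the denormalized link when the file goes away."""
    doc_file.document.file_links.pop(doc_file.file_id, None)


def render_files_cell(document):
    """Read path: build the table cell from the Document alone."""
    links = [document.file_links[k] for k in sorted(document.file_links)]
    return '<td>%s</td>' % '<br>'.join(links)
```

With this shape, rendering the table only touches Document objects, so the per-document query on d.files disappears. The cost moves to the write path: every create, rename or delete of a DocumentFile has to go through helpers like these, and in the datastore both entities would need to be saved together.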