I have a Celery task that gets called from one view. The task is responsible for fetching some PDF data and saving it to S3 via django-storages.
Here is the view that kicks it off:
    @login_required
    @minimum_stage(STAGE_SIGN_PAGE)
    def page_complete(request):
        if not request.GET['documentKey']:
            logger.error('Document Key was missing', exc_info=True, extra={
                'request': request,
            })
        user = request.user
        speaker = user.get_profile()
        speaker.readyForStage(STAGE_SIGN)
        speaker.save()
        retrieveSpeakerDocument.delay(user.id, documentKey=request.GET['documentKey'], documentType=DOCUMENT_PAGE)
        return render_to_response('speaker_registration/redirect.html', {
            'url': request.build_absolute_uri(reverse('registration_sign_profile'))
        }, context_instance=RequestContext(request))
Here is the task:
    @task()
    def retrieveSpeakerDocument(userID, documentKey, documentType):
        print 'starting task'
        try:
            user = User.objects.get(pk=userID)
        except User.DoesNotExist:
            logger.error('Error selecting user while grabbing document', exc_info=True)
            return
        echosign = EchoSign(user=user)
        fileData = echosign.getDocumentWithKey(documentKey)
        if not fileData:
            logger.error('Error retrieving document', exc_info=True)
        else:
            speaker = user.get_profile()
            print speaker
            filename = "%s.%s.%s.pdf" % (user.first_name, user.last_name, documentType)
            if documentType == DOCUMENT_PAGE:
                afile = speaker.page_file
            elif documentType == DOCUMENT_PROFILE:
                afile = speaker.profile_file
            content = ContentFile(fileData)
            afile.save(filename, content)
            print "saving user in task"
            speaker.save()
In the meantime, my next view gets hit (actually it’s an AJAX call, but that doesn’t matter). It fetches the code for the next embedded document. Once it has it, it updates the speaker object and saves it:
    @login_required
    @minimum_stage(STAGE_SIGN)
    def get_profile_document(request):
        user = request.user
        e = EchoSign(request=request, user=user)
        e.createProfile()
        speaker = user.get_profile()
        speaker.profile_js = e.javascript
        speaker.profile_echosign_key = e.documentKey
        speaker.save()
        return HttpResponse(True)
My task works properly, and updates the speaker.page_file property correctly. (I can temporarily see this in the admin, and also watch it occur in the postgres logs.)
However, it soon gets stamped over, I believe by the save in the get_profile_document view after it updates the profile_js property. In fact, I know this is where it happens based on the SQL statements: the value is there before profile_js is updated, and then it’s gone.
Now I don’t really understand why. The speaker is fetched right before each update and save, and there’s no real caching going on here yet, unless get_profile() does something weird. What is going on, and how might I avoid this? (Also, do I need to call save on speaker after running save on the FileField? It seems like there are duplicate calls in the Postgres logs because of this.)
Update
Pretty sure this is due to Django’s default view transaction handling. The view begins a transaction, takes a long time to finish, and then commits, overwriting the object I’ve already updated in the Celery task.
I’m not exactly sure how to solve it. If I switch the view to manual transactions and commit right after I fetch the EchoSign JS (which takes 5-10 seconds), does it start a new transaction? That didn’t seem to work.
Maybe not
I don’t have TransactionMiddleware added in. So unless it’s happening anyway, that’s not the problem.
Solved.
So here’s the issue.
Django apparently keeps a cache of objects that it doesn’t think have changed anywhere. (Correct me if I’m wrong.) Since Celery was updating my object in the database from outside the request, Django had no idea the object had changed and fed me the cached version back when I called user.get_profile().
The solution to force it to read from the database is simply to re-fetch the object by its own id. It’s a bit silly, but it works.
Apparently the Django authors don’t want to add any kind of refresh() method onto objects, so this is the next best thing.
Using transactions also MIGHT solve my problem, but that’s for another day.
Update
After further digging, it’s because the user model has a _profile_cache property on it, so that it doesn’t re-fetch every time you grab the profile in one request from the same object. Since I was already calling get_profile() in the EchoSign function on the same user object, the profile was cached from that point on.
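For anyone who wants to see the mechanism in isolation, here is a rough, Django-free sketch of what I think was happening. Everything in it (FAKE_DB, FakeProfile, FakeUser, run_view) is a hypothetical stand-in I made up for illustration, not Django code; the only things borrowed from the real setup are the _profile_cache idea and the fact that saving a stale object writes all of its fields back.

```python
# Hypothetical in-memory stand-in for the speaker table: one row keyed by user id.
FAKE_DB = {1: {"page_file": None, "profile_js": None}}


class FakeProfile(object):
    """Snapshot of one row; save() writes every field back, stale or not."""

    def __init__(self, user_id):
        row = FAKE_DB[user_id]
        self.user_id = user_id
        self.page_file = row["page_file"]
        self.profile_js = row["profile_js"]

    def save(self):
        FAKE_DB[self.user_id] = {
            "page_file": self.page_file,
            "profile_js": self.profile_js,
        }


class FakeUser(object):
    def __init__(self, user_id):
        self.id = user_id

    def get_profile(self):
        # Mimics the _profile_cache idea: load once, then reuse on this instance.
        if not hasattr(self, "_profile_cache"):
            self._profile_cache = FakeProfile(self.id)
        return self._profile_cache


def run_view(refetch):
    """Replay the sequence of events; returns the final row."""
    FAKE_DB[1] = {"page_file": None, "profile_js": None}
    user = FakeUser(1)
    user.get_profile()  # early call (like the one in my EchoSign helper) fills the cache

    # Meanwhile the task, in another process, stores the PDF on the same row.
    task_profile = FakeProfile(1)
    task_profile.page_file = "page.pdf"
    task_profile.save()

    # Back in the view: reuse the cached profile, or load a fresh copy by id.
    speaker = FakeProfile(user.id) if refetch else user.get_profile()
    speaker.profile_js = "<script>"
    speaker.save()
    return FAKE_DB[1]
```

With refetch=False the cached snapshot still has an empty page_file, so its save wipes out what the task wrote. With refetch=True the fresh copy already carries the task’s value, and both fields survive. In the real view, the fresh load corresponds to looking the profile up again by its id instead of calling user.get_profile() on the same user object.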