I am trying to get the maximum value from merging the results from two separate tables. I am trying to get the most updated comment for a Post. I have a model Comment that can be accessed by Post.comments. I also have a body that is separate of the Post that can be accessed by Post.body. Both comments and body have a field when that is a DateTimeField. I would like to return a sorted queryset by most recent activity such that the a Post with the most recent comment or body is shown first.
Assume the models look like this:
class Body(models.Model):
when = models.DateTimeField(default=datetime.datetime.now)
class Post(models.Model):
body = models.ForeignKey(Body)
class Comment(models.Model):
post = models.ForeignKey(Post, related_name='comments')
when = models.DateTimeField(default=datetime.datetime.now)
I would like the result to remain a queryset if possible since I continue to process it and further pass it into a paginator to limit results.
Basically, I would like to be able to call:
q = Post.annotate(
activity=Max(
Union('comments__when', 'body__when')
)
)
q = q.order_by('-activity')
But I don’t know how to accomplish that Union.
I believe the SQL that accomplishes what I’m looking for is comparable to:
SELECT
...
IF(MAX(c.`when`), MAX(c.`when`), b.`when`) AS `activity`
...
FROM `post` p
...
LEFT OUTER JOIN `comment` AS c
ON c.`post_id`=p.`id`
LEFT OUTER JOIN `body` AS b
ON p.`body_id`=b.`id`
...
Can such a customized annotation and Join be accomplished?
This took me an extremely long time to figure out. One large issue is that Django does not support multiple conditions for the
ONclause of aJOINstatement. Because of the dependence of a singleSELECTstatement upon two separateJOINs, we have an issue tracking the table names accurately. When you ask django to annotate a table, say withMax(), you end up with conditions such asMax(T7.when) ... LEFT OUT JOIN table AS T7, where T7 is not consistent. Thus, I needed a way to accruately generate the expressionMax(T7.when)given theJOINsautomatically generated by Django. Many posts online tell you to use the.raw()support on querysets, but you’ll lose a lot of benefit from using an ORM in this case.The solution I came up with was to generate a custom aggregate function. I called this function
CustomMax():The usage of the function is:
I include the
.join()to allow for the alt_table to exist as aJOIN, and Django will automatically handle the naming of both theSELECTandJOINstatements for theMaxportion. The generated SQL from this usage is akin to:Note: This is just for
PostandCommentmodels from above. My actual implementation was a bit more complicated and required this solution.This would also have been a lot easier if the Django team had developed this suggested patch to include
joinstatements in.extra(): https://code.djangoproject.com/ticket/7231