I have two models:
class Source(models.Model):
    name = models.CharField(max_length=200)

class Data(models.Model):
    date = models.DateField(db_index=True)
    metric = models.IntegerField()
    source = models.ForeignKey(Source)
I want to find all the sources which don’t have any data points in the last three days. In PostgreSQL, this is doable:
select influence_source.src, max(influence_data.date) as max_date
from influence_data, influence_source
where influence_data.source_id = influence_source.id
group by influence_source.src
having max(influence_data.date) < now()::date - 7
order by max_date;
Is this possible using Django’s ORM?
I’ve tried:
>>> Source.objects.raw("select influence_source.src, max(influence_data.date) as max_date from influence_data, influence_source where influence_data.source_id = influence_source.id group by influence_source.src having max(influence_data.date) < now()::date - 7 order by max_date")
<RawQuerySet: 'select influence_source.src, max(influence_data.date) as max_date from influence_data, influence_source where influence_data.source_id = influence_source.id group by influence_source.src having max(influence_data.date) < now()::date - 7 order by max_date'>
>>> list(_)
InvalidQuery: Raw query must include the primary key
(Adding the primary key gives DatabaseError: column "influence_source.id" must appear in the GROUP BY clause or be used in an aggregate function)
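For reference, that second error usually means the primary key was added to the SELECT list but not to the GROUP BY clause. Below is a minimal sketch of the grouped query with the id in both places. It runs against an in-memory SQLite database rather than PostgreSQL so that it is self-contained. The table layout is assumed from the models above (a `name` column instead of `src`), the sample rows are made up, `stale_sources` is a hypothetical helper name, and the cutoff date is computed in Python instead of with `now()::date`.

```python
import sqlite3
from datetime import date, timedelta


def stale_sources(conn, today, days=3):
    """Return (id, name, max_date) for sources whose newest data point
    is older than `days` days before `today`."""
    cutoff = (today - timedelta(days=days)).isoformat()
    return conn.execute(
        """
        SELECT s.id, s.name, MAX(d.date) AS max_date
        FROM influence_source AS s
        JOIN influence_data AS d ON d.source_id = s.id
        GROUP BY s.id, s.name      -- the primary key is part of the grouping
        HAVING MAX(d.date) < ?
        ORDER BY max_date
        """,
        (cutoff,),
    ).fetchall()


# Assumed schema and made-up sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE influence_source (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE influence_data (
        id INTEGER PRIMARY KEY,
        date TEXT,               -- ISO dates compare correctly as strings
        metric INTEGER,
        source_id INTEGER REFERENCES influence_source(id)
    );
    """
)
conn.executemany(
    "INSERT INTO influence_source (id, name) VALUES (?, ?)",
    [(1, "fresh"), (2, "stale")],
)
conn.executemany(
    "INSERT INTO influence_data (date, metric, source_id) VALUES (?, ?, ?)",
    [("2012-11-19", 5, 1), ("2012-11-10", 7, 1), ("2012-11-12", 3, 2)],
)

rows = stale_sources(conn, today=date(2012, 11, 20))
print(rows)
```

Because the query is an inner join, a source with no data rows at all will not show up in the result; that matches the SQL in the question but may or may not be what is wanted.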
I’ve read the aggregation docs, but it’s not obvious to me how I can do this query:
>>> Source.objects.all().aggregate(Max('data__date'))
{'data__date__max': datetime.date(2012, 11, 16)} # a single result was not what I wanted
How do I find all the source objects whose most recent data object is more than three days old? I have a lot of data, so I want to do a single DB query instead of iterating over objects.