I have a sorting/grouping issue that I’m hoping somebody could add some insight on.
We have a table of stories that have a publish date and an updated date. I’m using Django so it looks like this:
from django.db import models

class Story(models.Model):
    pub_date = models.DateTimeField(db_index=True)
    update_date = models.DateTimeField(blank=True, null=True, db_index=True)
    headline = models.CharField(max_length=200)
    ...
We want to display the stories on a paginated page grouped by day. So…
Jan 20
    Story 1
    Story 2
Jan 19
    Story 1
    Story 3
The challenge is that if a story has an update_date it should be displayed twice: once on the pub_date day, and once on the update_date day (e.g. Story 1).
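To pin down the intended output, here is a tiny pure-Python sketch of the rule on three made-up rows. It is only an illustration of the expected grouping, not something that would scale to the real table; the tuples, the year, the times, and the function name group_by_display_day are all assumptions for the example.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical sample rows: (headline, pub_date, update_date or None).
stories = [
    ("Story 1", datetime(2010, 1, 19, 9, 0), datetime(2010, 1, 20, 14, 0)),
    ("Story 2", datetime(2010, 1, 20, 12, 0), None),
    ("Story 3", datetime(2010, 1, 19, 7, 0), None),
]


def group_by_display_day(rows):
    """Return [(day, [headline, ...]), ...] with the newest day first."""
    entries = []
    for headline, pub_date, update_date in rows:
        # Every story appears once on its publish day...
        entries.append((pub_date, headline))
        # ...and a second time on its update day, if it has one.
        if update_date is not None:
            entries.append((update_date, headline))

    # Newest entries first, then bucket by calendar day.
    entries.sort(key=lambda entry: entry[0], reverse=True)
    grouped = defaultdict(list)
    for when, headline in entries:
        grouped[when.date()].append(headline)
    return list(grouped.items())


for day, headlines in group_by_display_day(stories):
    print(day.strftime("%b %d"), headlines)
```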
There are tens of thousands of stories, so I can’t do it all in Python of course, but I don’t know of a way to do this query in SQL.
What I have right now sorts everything by -pub_date and then gets the max and min dates on a given page. I then query for any stories with an update_date between those dates, and combine and group them in Python. The problem is that the number of items on a page then becomes irregular.
So I guess my question is this: What is the best way to query a table for a list of items, duplicating an item in the result if it has a value in the second field, and then sorting the result based on the two fields?
Hope that makes sense…
I can only think of a “union” being able to do this.
Here’s an example of what that would look like. I’m not sure how fast or good it is for the database to have this type of query sent to it often, though D:
The query assumes your table name is stories, and uses the columns headline, pub_date and update_date. It also assumes that a story that hasn’t been updated has the value null in the update_date column.
If you want to add a limit to the query, it should be done last, after the “order by” clause.
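A minimal sketch of such a union query, wrapped in Python's sqlite3 module with an in-memory database so it can be tried as-is. The table name stories and the three columns follow the assumptions above; the display_date alias, the fetch_page helper, the sample rows, and storing dates as ISO strings are made up for illustration, and the exact SQL may need adjusting for other databases.

```python
import sqlite3

# First SELECT: one row per story, dated by pub_date.
# Second SELECT: one extra row for each story that has an update_date.
# UNION ALL keeps both rows; the ORDER BY and LIMIT apply to the combined result.
UNION_QUERY = """
SELECT headline, pub_date AS display_date
FROM stories
UNION ALL
SELECT headline, update_date AS display_date
FROM stories
WHERE update_date IS NOT NULL
ORDER BY display_date DESC
LIMIT ? OFFSET ?
"""


def fetch_page(conn, page, per_page=20):
    """Return one page of (headline, display_date) rows, newest first."""
    offset = (page - 1) * per_page
    return conn.execute(UNION_QUERY, (per_page, offset)).fetchall()


# Hypothetical sample data; ISO-formatted strings sort chronologically.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stories (headline TEXT, pub_date TEXT, update_date TEXT)"
)
conn.executemany(
    "INSERT INTO stories VALUES (?, ?, ?)",
    [
        ("Story 1", "2010-01-19 09:00:00", "2010-01-20 14:00:00"),
        ("Story 2", "2010-01-20 12:00:00", None),
        ("Story 3", "2010-01-19 07:00:00", None),
    ],
)

for headline, display_date in fetch_page(conn, page=1, per_page=3):
    print(display_date, headline)
```

Because the limit and offset are applied after the combined rows are sorted, every page holds the same number of entries, and the grouping by day can then be done in Python on just that page. In a Django project the real table name is whatever Django generated for the model (appname_story by default), not stories.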