I have two tables: Posts and Replies. Think of posts as blog entries and replies as the comments.
I want to display X number of posts and then the latest three comments for each of the posts.
My Replies table has a foreign key ‘post_id’ which matches the ‘id’ of the post it belongs to.
I am trying to create a main page that has something along the lines of
Post
– Reply
– Reply
– Reply
Post
– Reply
and so on and so forth. I can accomplish this by using a for loop in my template and discarding the unneeded replies, but I hate grabbing data from a db that I won’t use. Any ideas?
This is actually a pretty interesting question.
HA HA DISREGARD THIS, I SUCK
On edit: this answer works, but on MySQL it becomes tediously slow with as few as 100 parent rows. However, see below for a performant fix.
Obviously, you can run this query once per post:
select * from comments where post_id = $id order by id desc limit 3
That creates a lot of overhead, as you end up doing one database query per post, the dreaded N+1 queries.
If you want to get all posts at once (or some subset with a where), the following will, surprisingly, work. It assumes that comments have a monotonically increasing id (as a datetime is not guaranteed to be unique), but allows for comment ids to be interleaved among posts.
Since an auto_increment id column is monotonically increasing, if the comment table has such an id, you’re all set.
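To make the cost of the per-post approach concrete, here is a minimal sketch of it. It uses Python's built-in sqlite3 module only so that it runs on its own; the posts and comments tables, their columns, and the function name are assumptions made for illustration, not anything from the original poster's schema.

```python
import sqlite3

# Hypothetical schema: posts(id, title) and comments(id, post_id, text).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE comments (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        post_id INTEGER REFERENCES posts(id),
        text TEXT
    );
    INSERT INTO posts (id, title) VALUES (1, 'first'), (2, 'second'), (3, 'third');
    -- Comment ids increase over time and are interleaved among posts.
    INSERT INTO comments (post_id, text) VALUES
        (1, 'a'), (2, 'b'), (1, 'c'), (1, 'd'), (2, 'e'), (1, 'f');
""")


def latest_comments_per_post(conn, limit=3):
    """Fetch the posts, then run one extra query per post (N+1 in total)."""
    result = {}
    posts = conn.execute("SELECT id FROM posts ORDER BY id").fetchall()
    queries = 1  # the query that listed the posts
    for (post_id,) in posts:
        # Highest ids first, so these are the most recent comments.
        result[post_id] = conn.execute(
            "SELECT id, text FROM comments "
            "WHERE post_id = ? ORDER BY id DESC LIMIT ?",
            (post_id, limit),
        ).fetchall()
        queries += 1
    return result, queries


latest, query_count = latest_comments_per_post(conn)
print(query_count)  # one query for the posts plus one per post
print(latest)
```

With three posts this issues four queries, and the count grows linearly with the number of posts shown on the page, which is the overhead the single-query approach tries to avoid.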
First, create this view. In the view, I call post ‘parent’ and comment ‘child’:
maxidm1 is just ‘max id minus 1’; maxidm2, ‘max id minus 2’; that is, the second and third greatest child ids within a particular parent id.
Then join the view to whatever you need from the comment (I’ll call that ‘text’):
Naturally, you can add whatever where clause you want to that, to limit the posts:
where a.category = 'foo'
or whatever.
Here’s what my tables look like:
And a portion of child. Parent 1 has no children:
And the view gives us this:
The explain plan for the view is only slightly hairy:
However, if we add an index for parent_fks, it gets better:
As noted above, this begins to fall apart when the number of parent rows is as few as 100, even if we index into parent using its primary key:
(Note that I intentionally test on a slow machine, with data saved on a slow flash disk.)
Here’s the explain, looking for exactly one id (and the first one, at that):
Over 56 seconds for one row, even on my slow machine, is two orders of magnitude unacceptable.
So can we save this query? It works, it’s just too slow.
Here’s the explain plan for the modified query. It looks as bad or worse:
But it completes three orders of magnitude faster, in 1/20th of a second!
How do we get to the much speedier parent_top_3a? We create three views, each one dependent on the previous one:
Not only does this work much more quickly, it’s legal on RDBMSes other than MySQL.
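As an illustration of the chained-view idea, here is a minimal sketch of three views that each build on the previous one, followed by the join back to the child table. These are not the original view definitions: the names top_1, top_2, top_3, the parent_id, category and text columns, and the sample rows are all assumptions. It runs on Python's built-in sqlite3 module purely so it is self-contained, and it says nothing about how MySQL or PostgreSQL would plan the same statements.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables; parent 1 deliberately has no children.
    CREATE TABLE parent (id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE child (
        id INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES parent(id),
        text TEXT
    );
    CREATE INDEX child_parent_idx ON child (parent_id);

    INSERT INTO parent VALUES (1, 'foo'), (2, 'foo'), (3, 'bar'), (4, 'foo');
    INSERT INTO child VALUES
        (1, 2, 'c1'), (2, 3, 'c2'), (3, 2, 'c3'), (4, 2, 'c4'),
        (5, 4, 'c5'), (6, 2, 'c6'), (7, 3, 'c7');

    -- View 1: the greatest child id per parent.
    CREATE VIEW top_1 AS
        SELECT parent_id, MAX(id) AS maxid
        FROM child
        GROUP BY parent_id;

    -- View 2: adds the greatest child id below maxid (NULL if none).
    CREATE VIEW top_2 AS
        SELECT t.parent_id, t.maxid, MAX(c.id) AS maxidm1
        FROM top_1 t
        LEFT JOIN child c ON c.parent_id = t.parent_id AND c.id < t.maxid
        GROUP BY t.parent_id, t.maxid;

    -- View 3: adds the greatest child id below maxidm1 (NULL if none).
    CREATE VIEW top_3 AS
        SELECT t.parent_id, t.maxid, t.maxidm1, MAX(c.id) AS maxidm2
        FROM top_2 t
        LEFT JOIN child c ON c.parent_id = t.parent_id AND c.id < t.maxidm1
        GROUP BY t.parent_id, t.maxid, t.maxidm1;
""")

# One row per parent that has at least one child.
view_rows = conn.execute(
    "SELECT parent_id, maxid, maxidm1, maxidm2 FROM top_3 ORDER BY parent_id"
).fetchall()

# Join the view back to child to fetch the text, limited by a parent column.
latest = conn.execute("""
    SELECT p.id, c.id, c.text
    FROM parent p
    JOIN top_3 t ON t.parent_id = p.id
    JOIN child c ON c.id IN (t.maxid, t.maxidm1, t.maxidm2)
    WHERE p.category = 'foo'
    ORDER BY p.id, c.id DESC
""").fetchall()

print(view_rows)
print(latest)
```

Each view only uses a join and a group by on the view before it, so nothing here depends on MySQL-specific syntax. Parents with fewer than three children simply get NULL in the later columns, and parents with no children drop out of the final inner join.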
Let’s increase the number of parent rows to 12800 and the number of child rows to 1536 (most blog posts don’t get comments, right? 😉 ).
Note that these timings are for MyISAM tables; I’ll leave it to someone else to do timings on InnoDB.
But using PostgreSQL, on a similar but not identical data set, we get similar timings on where predicates involving parent’s columns: