I have five tables in my database. Members, items, comments, votes and countries. I want to get 10 items. I want to get the count of comments and votes for each item. I also want the member that submitted each item, and the country they are from.
After posting here and elsewhere, I started using subselects to get the counts, but this query is taking 10 seconds or more!
SELECT `items_2`.*,
(SELECT COUNT(*)
FROM `comments`
WHERE (comments.Script = items_2.Id)
AND (comments.Active = 1))
AS `Comments`,
(SELECT COUNT(votes.Member)
FROM `votes`
WHERE (votes.Script = items_2.Id)
AND (votes.Active = 1))
AS `votes`,
`countrys`.`Name` AS `Country`
FROM `items` AS `items_2`
INNER JOIN `members` ON items_2.Member=members.Id AND members.Active = 1
INNER JOIN `members` AS `members_2` ON items_2.Member=members.Id
LEFT JOIN `countrys` ON countrys.Id = members.Country
GROUP BY `items_2`.`Id`
ORDER BY `Created` DESC
LIMIT 10
My question is whether this is the right way to do this, if there’s better way to write this statement OR if there’s a whole different approach that will be better. Should I run the subselects separately and aggregate the information?
Yes, you can rewrite the subqueries as aggregate joins (see below), but I am almost certain that the slowness is due to missing indices rather than to the query itself. Use
EXPLAINto see what indices you can add to make your query run in a fraction of a second.For the record, here is the aggregate join equivalent.
However, because you are using
LIMIT 10, you are almost certainly as well off (or better off) with the subqueries that you currently have than with the aggregate join equivalent I provided above for reference.This is because a bad optimizer (and MySQL’s is far from stellar) could, in the case of the aggregate join query, end up performing the
COUNT(*)aggregation work for the full contents of theCommentsandVotestable before wastefully throwing everything but 10 values (yourLIMIT) away, whereas in the case of your original query it will, from the start, only look at the strict minimum as far as theCommentsandVotestables are concerned.More precisely, using subqueries in the way that your original query does typically results in what is called nested loops with index lookups. Using aggregate joins typically results in merge or hash joins with index scans or table scans. The former (nested loops) are more efficient than the latter (merge and hash joins) when the number of loops is small (10 in your case.) The latter, however, get more efficient when the former would result in too many loops (tens/hundreds of thousands or more), especially on systems with slow disks but lots of memory.