I’ve currently got a query that selects metrics data from two tables whilst getting the projects to query from two other tables (one is owned projects, the other is projects to which the user has access).
SELECT v.`projectID`,
       (SELECT COUNT(m.`session`)
        FROM `metricData` m
        WHERE m.`projectID` = v.`projectID`) AS `sessions`,
       (SELECT COUNT(pb.`interact`)
        FROM `interactionData` pb
        WHERE pb.`projectID` = v.`projectID`
        GROUP BY pb.`projectID`) AS `interactions`
FROM `medias` v
LEFT JOIN `projectsExt` pa ON v.`projectsExtID` = pa.`projectsExtID`
WHERE (pa.`user` = '1' OR v.`ownerUser` = '1')
GROUP BY v.`projectID`
It takes too long: 1-2 seconds. This is obviously the multi left-join scenario. But I’ve got a couple of ideas to improve speed and wondered what the thoughts were in principle. Do I:
- Try and select the list in the query and then get the data, rather than doing the joins. Not sure how this would work.
- Do a select in a separate query to get the projectIDs and then run queries on each projectID afterwards. This may lead to hundreds, potentially thousands, of requests, but may be better for the processing?
- Other ideas?
There are two questions here:
To answer #1 properly, more information is needed. Technical information, such as the EXPLAIN plan for this particular query, is a good start. Even better would be the SHOW CREATE TABLE output for every table you access, as well as the number of rows each one contains.
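For reference, gathering that information in MySQL might look roughly like the sketch below. The table names are taken from the query as posted; nothing here assumes anything about the indexes or row counts, which is exactly what the output would reveal.

```sql
-- Execution plan for the query as posted.
EXPLAIN
SELECT v.`projectID`,
       (SELECT COUNT(m.`session`)
        FROM `metricData` m
        WHERE m.`projectID` = v.`projectID`) AS `sessions`,
       (SELECT COUNT(pb.`interact`)
        FROM `interactionData` pb
        WHERE pb.`projectID` = v.`projectID`
        GROUP BY pb.`projectID`) AS `interactions`
FROM `medias` v
LEFT JOIN `projectsExt` pa ON v.`projectsExtID` = pa.`projectsExtID`
WHERE (pa.`user` = '1' OR v.`ownerUser` = '1')
GROUP BY v.`projectID`;

-- Table definitions, including indexes, for every table involved.
SHOW CREATE TABLE `medias`;
SHOW CREATE TABLE `projectsExt`;
SHOW CREATE TABLE `metricData`;
SHOW CREATE TABLE `interactionData`;

-- Row counts per table.
SELECT COUNT(*) FROM `medias`;
SELECT COUNT(*) FROM `projectsExt`;
SELECT COUNT(*) FROM `metricData`;
SELECT COUNT(*) FROM `interactionData`;
```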
But I’d also appreciate more functional information: what exactly is the question you’re trying to answer? Right now, it seems you’re looking at two different sets of medias:
Lacking enough information to answer #1, I can still answer #2: “how to avoid a left join”. The answer is: write a UNION of the two sets, one where there is a match and one where there isn’t.
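Applied to the query as posted, that UNION could look something like the sketch below. It is untested against the real schema and only shows the shape of the rewrite. The derived-table alias `p` is my own naming, and the two count subqueries are copied unchanged from the original. Whether this is actually faster depends on the indexes, which is why the EXPLAIN output still matters.

```sql
SELECT p.`projectID`,
       (SELECT COUNT(m.`session`)
        FROM `metricData` m
        WHERE m.`projectID` = p.`projectID`) AS `sessions`,
       (SELECT COUNT(pb.`interact`)
        FROM `interactionData` pb
        WHERE pb.`projectID` = p.`projectID`
        GROUP BY pb.`projectID`) AS `interactions`
FROM (
    -- Set 1: medias that have a matching projectsExt row
    -- (the rows an INNER JOIN would return).
    SELECT v.`projectID`
    FROM `medias` v
    INNER JOIN `projectsExt` pa ON v.`projectsExtID` = pa.`projectsExtID`
    WHERE pa.`user` = '1' OR v.`ownerUser` = '1'

    UNION

    -- Set 2: medias without any matching projectsExt row.
    -- pa.`user` would be NULL here, so only ownership can qualify them.
    SELECT v.`projectID`
    FROM `medias` v
    WHERE v.`ownerUser` = '1'
      AND NOT EXISTS (SELECT 1
                      FROM `projectsExt` pa
                      WHERE pa.`projectsExtID` = v.`projectsExtID`)
) AS p;
```

UNION (rather than UNION ALL) removes duplicate projectIDs, which takes over the job of the outer GROUP BY in the original query.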