I have this query:
SELECT p.id, r.status, r.title
FROM page AS p
INNER JOIN page_revision AS r
    ON r.pageId = p.id
   AND (
        r.id = (SELECT MAX(r2.id) FROM page_revision AS r2 WHERE r2.pageId = r.pageId AND r2.status = 'active')
        OR
        r.id = (SELECT MAX(r2.id) FROM page_revision AS r2 WHERE r2.pageId = r.pageId)
   )
Which returns each page and the latest active revision for each, unless no active revision is available, in which case it simply returns the latest revision.
Is there any way this can be optimised to improve performance or just general readability? I’m not having any issues right now, but my worry is that when this gets into a production environment (where there could be a lot of pages) it’s going to perform badly.
Also, are there any obvious problems I should be aware of? The use of sub-queries always bugs me, but to the best of my knowledge this can’t be done without them.
Note:
The reason the conditions are in the JOIN rather than a WHERE clause is that in other queries (where this same logic is used) I’m LEFT JOINing from the ‘site’ table to the ‘page’ table, and if no pages exist I still want the site returned.
Jack
Edit: I’m using MySQL
If ‘active’ is the first status in alphabetical order, you might be able to reduce the subqueries to:
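A minimal sketch of such a query for MySQL, assuming the page and page_revision tables from the question and using LIMIT 1 where SQL Server would use TOP 1. A single correlated subquery sorts each page's revisions by status, then by id descending, and keeps the first row:

```sql
-- Sketch only: relies on 'active' sorting before every other status value.
SELECT p.id, r.status, r.title
FROM page AS p
INNER JOIN page_revision AS r
    ON r.pageId = p.id
   AND r.id = (
        SELECT r2.id
        FROM page_revision AS r2
        WHERE r2.pageId = r.pageId
        ORDER BY r2.status ASC,   -- 'active' rows first (alphabetical assumption)
                 r2.id DESC       -- newest revision first within each status
        LIMIT 1
   );
```

This form matches at most one revision per page, whereas the OR form in the question can match two rows for a page whose latest revision is not active. With more than two status values, the fallback is the newest revision of the alphabetically first status, which is not necessarily the newest revision overall.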
Otherwise, you can replace the ORDER BY line with:
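One way to write that, sketched for MySQL and assuming the page and page_revision tables from the question: a CASE expression ranks 'active' revisions ahead of all others regardless of how the status values sort, and the ORDER BY inside the correlated subquery does the rest:

```sql
-- Sketch only: picks the newest active revision, else the newest revision.
SELECT p.id, r.status, r.title
FROM page AS p
INNER JOIN page_revision AS r
    ON r.pageId = p.id
   AND r.id = (
        SELECT r2.id
        FROM page_revision AS r2
        WHERE r2.pageId = r.pageId
        ORDER BY CASE WHEN r2.status = 'active' THEN 0 ELSE 1 END,  -- active first
                 r2.id DESC                                         -- newest first
        LIMIT 1
   );
```

Because the conditions stay in the ON clause, the same subquery should carry over to the LEFT JOIN variant described in the question. Running it through EXPLAIN on realistic data is the safest way to see how MySQL handles the correlated subquery.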
These are all based on my assumptions about SQL Server; your mileage with MySQL may vary.