I have an SQL question, related to this and this question (but different). Basically I want to know how I can avoid a nested query.
Let’s say I have a huge table of jobs (jobs) executed by a company over its history. These jobs are characterized by year, month, location and the code of the tool used for the job. Additionally, I have a table of tools (tools) that translates tool codes to tool descriptions and holds further data about each tool. Now they want a website where they can select year, month, location and tool using dropdown boxes, after which the matching jobs will be displayed. I want to fill the last dropdown with only the relevant tools matching the preceding selection of year, month and location, so I wrote the following nested query:
SELECT c.tool_code, t.tool_description
FROM (
    SELECT DISTINCT j.tool_code
    FROM jobs AS j
    WHERE j.year = ....
      AND j.month = ....
      AND j.location = ....
) AS c
LEFT JOIN tools AS t
    ON c.tool_code = t.tool_code
ORDER BY c.tool_code ASC
I resorted to this nested query because it was much faster than performing a JOIN on the complete tables and selecting from that. It got my query time down a lot. But as I have recently read that MySQL nested queries should be avoided at all costs, I am wondering whether this approach is wrong. Should I rewrite my query differently? And how?
No, you shouldn’t; your query is fine.
Just create indexes on jobs (year, month, location, tool_code) and tools (tool_code) so that INDEX FOR GROUP-BY can be used.

The article you provided describes subquery predicates (IN (SELECT ...)), not nested queries (SELECT FROM (SELECT ...)).

Even with regard to subqueries, the article is wrong: while MySQL is not able to optimize all subqueries, it deals with IN (SELECT ...) predicates just fine.

I don’t know why the author chose to put DISTINCT here:

and why they think this will help to improve performance, but given that widgetID is indexed, MySQL will just transform this query:

into an index_subquery.

Essentially, this is just like an EXISTS clause: the inner subquery will be executed once per widgets row with the additional predicate added:

and stop on the first match in widgetOrders.

This query:

will have to use temporary to get rid of the duplicates and will be much slower.
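The index suggestion at the top of this answer could be sketched roughly as follows. The table and column names come from the question; the index names are made up for illustration, and the second index is only needed if tool_code is not already a primary or unique key on tools.

```sql
-- Hypothetical index names; tables and columns are those used in the question.
-- Equality-filtered columns come first and tool_code last, so the DISTINCT
-- tool codes for one (year, month, location) can be read from the index alone.
CREATE INDEX ix_jobs_year_month_location_tool
    ON jobs (year, month, location, tool_code);

-- Assumption: tools.tool_code is not indexed yet. Skip this statement if it is
-- already the primary key or has a unique index.
CREATE INDEX ix_tools_tool_code
    ON tools (tool_code);
```

After creating the indexes, prefixing the original query with EXPLAIN should show whether MySQL picks them up. The exact plan depends on the server version and the data distribution, so treat this as something to verify rather than a guarantee.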