Here’s my query; it is fairly straightforward:
SELECT
INVOICE_ITEMS.II_IVNUM, INVOICE_ITEMS.IIQSHP
FROM
INVOICE_ITEMS
LEFT JOIN
INVOICES
ON
INVOICES.INNUM = INVOICE_ITEMS.II_INNUM
WHERE
INVOICES.IN_DATE
BETWEEN
'2010-08-29' AND '2010-08-30'
;
I have very limited knowledge of SQL, but I’m trying to understand some of the concepts like subqueries and the like. I’m not looking for a redesign of this code, but rather an explanation of why it is so slow (600+ seconds on my test database) and how I can make it faster.
From my understanding, the left join is creating a virtual table and populating it with every result row from the join, meaning that it is processing every row. How would I stop the query from reading the whole table, so that it applies the WHERE/BETWEEN filter first and only then builds the virtual table (if that is possible)?
How is my logic? Are there any consistently recommended resources to get me to SQL ninja status?
Edit: Thanks everyone for the quick and polite responses. Currently, I’m connecting over ODBC to a proprietary database that is used in the rapid application development framework called OMNIS. Therefore, I really have no idea what sort of optimization is being run, but I believe it is based loosely on MSSQL.
I would rewrite it with an INNER JOIN, and make sure you have an index on i.INNUM, ii.II_INNUM, and i.IN_DATE (where i and ii are aliases for INVOICES and INVOICE_ITEMS). The LEFT JOIN is being turned into an INNER JOIN by your WHERE clause anyway: an INVOICE_ITEMS row with no matching invoice gets a NULL IN_DATE, which can never fall inside the BETWEEN range, so those rows are always discarded.
Depending on what database you are using, what may be happening is that all of the records from INVOICE_ITEMS are being joined (due to the LEFT JOIN), regardless of whether there is a match with INVOICES or not, and then the WHERE clause is filtering down to the ones that matched and had a date within range. By switching to an INNER JOIN, you may make the query more efficient, by only needing to apply the WHERE clause to INVOICES records that have a matching INVOICE_ITEMS record.
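A minimal sketch of that rewrite, using the table and column names from the question. The aliases i and ii match the index list above. The index names are hypothetical, and the CREATE INDEX syntax is an assumption: the OMNIS database behind the ODBC connection may spell it differently, or may already index INNUM as a key.

```sql
-- Returns the same rows as the original LEFT JOIN query, because the
-- WHERE clause on i.IN_DATE already discards unmatched INVOICE_ITEMS rows.
SELECT
    ii.II_IVNUM,
    ii.IIQSHP
FROM
    INVOICE_ITEMS ii
INNER JOIN
    INVOICES i
    ON i.INNUM = ii.II_INNUM
WHERE
    i.IN_DATE BETWEEN '2010-08-29' AND '2010-08-30';

-- Hypothetical index names; skip any index the database already has.
CREATE INDEX IX_INVOICES_IN_DATE ON INVOICES (IN_DATE);
CREATE INDEX IX_INVOICES_INNUM ON INVOICES (INNUM);
CREATE INDEX IX_INVOICE_ITEMS_II_INNUM ON INVOICE_ITEMS (II_INNUM);
```

If the optimizer uses these indexes, it can find the two days of invoices through IN_DATE first and then look up only their items through II_INNUM, instead of reading every row of INVOICE_ITEMS.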