I need to return all values from colA that are not in colB from mytable. I am using:
SELECT DISTINCT(colA) FROM mytable WHERE colA NOT IN (SELECT colB FROM mytable)
It works, but the query takes an excessively long time to complete.
Is there a more efficient way to do this?
In standard SQL there are no parentheses in DISTINCT colA. DISTINCT is not a function.
Added DISTINCT to the sub-select as well. If you have many duplicates, it could speed up the query.
A CTE might be faster, depending on your DBMS. I additionally demonstrate LEFT JOIN as an alternative to exclude the values in colB, and an alternative way to get distinct values with GROUP BY:
Or, simplified further, and with a plain subquery (probably fastest):
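The variants described above might be written as follows. This is a sketch using the question's mytable, colA and colB, not necessarily the exact queries originally posted; the aliases m and x are illustrative.

```sql
-- Sketch only: reconstructs the described variants with the question's
-- names (mytable, colA, colB). Aliases m and x are illustrative.

-- 1. Standard syntax (no parentheses), DISTINCT added to the sub-select
SELECT DISTINCT colA
FROM   mytable
WHERE  colA NOT IN (SELECT DISTINCT colB FROM mytable);

-- 2. CTE + LEFT JOIN to exclude values found in colB,
--    GROUP BY instead of DISTINCT to remove duplicates
WITH x AS (
   SELECT colB
   FROM   mytable
   GROUP  BY colB
   )
SELECT m.colA
FROM   mytable m
LEFT   JOIN x ON x.colB = m.colA
WHERE  x.colB IS NULL
GROUP  BY m.colA;

-- 3. Same idea with a plain subquery instead of the CTE
SELECT DISTINCT m.colA
FROM   mytable m
LEFT   JOIN (SELECT DISTINCT colB FROM mytable) x ON x.colB = m.colA
WHERE  x.colB IS NULL;
```

The variants are not strictly equivalent when NULLs are involved: NOT IN returns no rows at all if the sub-select yields a NULL, while the LEFT JOIN forms are unaffected by NULLs in colB.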
There are basically 4 techniques to exclude rows with keys present in another (or the same) table:
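Assuming the four techniques meant are NOT IN, LEFT JOIN / IS NULL, NOT EXISTS and EXCEPT (the usual candidates), the latter two could be sketched for the question's table like this. The aliases a and b are illustrative.

```sql
-- NOT EXISTS: keep a row only if no row has a matching colB
SELECT DISTINCT a.colA
FROM   mytable a
WHERE  NOT EXISTS (
   SELECT 1
   FROM   mytable b
   WHERE  b.colB = a.colA
   );

-- EXCEPT: set difference; it removes duplicates itself, so no DISTINCT
SELECT colA FROM mytable
EXCEPT
SELECT colB FROM mytable;
```

EXCEPT is not available everywhere: Oracle has traditionally spelled it MINUS, and older MySQL versions lack it, so check your DBMS before relying on it.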
The deciding factor for speed will be indexes. You need to have indexes on colA and colB for this query to be fast.
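For example, two single-column indexes could be created as below. The index names are made up, and whether the planner actually uses them depends on the DBMS and the data distribution, so compare the query plans before and after.

```sql
-- Hypothetical index names; adjust to your naming convention.
CREATE INDEX mytable_cola_idx ON mytable (colA);
CREATE INDEX mytable_colb_idx ON mytable (colB);
```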