I have a case where I want to select every row in Properties that has an invalid Country, Region, or Area ID. By invalid, I mean an ID for a country, region, or area that no longer exists in my tables. I have four tables: Properties, Countries, Regions, Areas.
I was thinking of doing it like this:
SELECT * FROM Properties WHERE
Country_ID NOT IN
(
SELECT CountryID FROM Countries
)
OR
RegionID NOT IN
(
SELECT RegionID FROM Regions
)
OR
AreaID NOT IN
(
SELECT AreaID FROM Areas
)
Now, is my query right? And what would you suggest I do to achieve the same result with better performance?
Your query is in fact optimal.
The LEFT JOINs proposed by others are worse, as they select ALL values and then filter them out. Most probably your subquery will be optimized to this:
, which you should use.
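As a sketch of what such a rewritten form might look like, assuming it is a correlated NOT EXISTS check against each lookup table and reusing the column names from the question (the aliases p, c, r, a are only for illustration):

```sql
-- Sketch only: assumes Properties has Country_ID, RegionID and AreaID,
-- as written in the question's query.
SELECT  p.*
FROM    Properties AS p
WHERE   NOT EXISTS (SELECT 1 FROM Countries AS c WHERE c.CountryID = p.Country_ID)
   OR   NOT EXISTS (SELECT 1 FROM Regions   AS r WHERE r.RegionID  = p.RegionID)
   OR   NOT EXISTS (SELECT 1 FROM Areas     AS a WHERE a.AreaID    = p.AreaID);
```

One caveat worth keeping in mind: NOT IN and NOT EXISTS treat NULLs differently. A NULL Country_ID never satisfies the NOT IN predicate when Countries is non-empty, but always satisfies the NOT EXISTS one, so the two forms are only equivalent when the compared columns on both sides are NOT NULL.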
This query selects at most 1 row from each table, and jumps to the next iteration as soon as it finds this row (i.e. if it does not find a Country for a given Property, it will not even bother checking for a Region).
Again, SQL Server is smart enough to build the same plan for this query and your original one.
Update:
Tested on 512K rows in each table.
All corresponding IDs in the dimension tables are CLUSTERED PRIMARY KEYs, and all measure fields in Properties are indexed.
For each row in Properties, PropertyID = CountryID = RegionID = AreaID, so there are no actual missing rows (the worst case in terms of execution time).
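A test setup along these lines could be sketched roughly as follows. This is an assumption about how such a test might be built, not the script actually used: the index names, the #n temp table, the exact row count of 512000, and the use of sys.all_objects as a row source are all hypothetical, and Country_ID follows the spelling in the question.

```sql
-- Dimension tables: each ID is a clustered primary key.
CREATE TABLE Countries (CountryID INT NOT NULL PRIMARY KEY CLUSTERED);
CREATE TABLE Regions   (RegionID  INT NOT NULL PRIMARY KEY CLUSTERED);
CREATE TABLE Areas     (AreaID    INT NOT NULL PRIMARY KEY CLUSTERED);

-- Fact table: each referencing column gets its own index.
CREATE TABLE Properties (
    PropertyID INT NOT NULL PRIMARY KEY CLUSTERED,
    Country_ID INT NOT NULL,
    RegionID   INT NOT NULL,
    AreaID     INT NOT NULL
);
CREATE INDEX IX_Properties_Country ON Properties (Country_ID);
CREATE INDEX IX_Properties_Region  ON Properties (RegionID);
CREATE INDEX IX_Properties_Area    ON Properties (AreaID);

-- Row source: sequential ids 1..512000 (assumed meaning of "512K").
SELECT TOP (512000)
       ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS id
INTO   #n
FROM   sys.all_objects AS a
CROSS JOIN sys.all_objects AS b;

-- Every property points at an existing country, region and area,
-- so no row is "invalid" and every lookup has to succeed.
INSERT INTO Countries (CountryID) SELECT id FROM #n;
INSERT INTO Regions   (RegionID)  SELECT id FROM #n;
INSERT INTO Areas     (AreaID)    SELECT id FROM #n;
INSERT INTO Properties (PropertyID, Country_ID, RegionID, AreaID)
SELECT id, id, id, id FROM #n;
```

With data shaped like this, both forms of the query should return zero rows, and comparing their actual execution plans is one way to check whether the optimizer really treats them the same.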