I am trying to spot some broken records in a MS-SQL Database.
In a simplified example, the scenerio is this:
I have 2 tables, simply put:
-
Table_1 : Id,Date,OpId
-
Table_2 : Date,OpId,EventName
And I have this business rule: If there is a record in Table_1 THEN at least 1 row should exist in the Table_2 for the Table_1.Date and Table.OpId.
If there is a row in Table_1 and if there is no row matching with that row in Table_2 THEN there is a broken data -whatever the reason-.
To find out the incorrect data, I use:
SELECT *
FROM table_1 t1
LEFT JOIN table_2 t2 ON t1.Date = t2.Date AND t1.OpId = t2.OpId
WHERE t2.OpId IS NULL -- So, if there is no
-- matching row in table_2 then this is a mistake
But it takes too long to have the query completed.
Can there be a faster or better way to approach similar scenerios?
To do an anti semi join
NOT EXISTSin SQL Server is usually better than or equal to in performance the other options (NOT IN,OUTER JOIN ... NULL,EXCEPT)See Left outer join vs NOT EXISTS. You may well be missing a useful index though.