I’m trying to clean up some data in SQL server and add a foreign key between the two tables.
I have a large quantity of orphaned rows in one of the tables that I would like to delete. I don’t know why the following query would return 0 rows in MS SQL server.
–This Query returns no Rows
select * from tbl_A where ID not in ( select distinct ID from tbl_B
)
When I include IS NOT NULL in the subquery I get the results that I expect.
— Rows are returned that contain all of the records in tbl_A but Not in tbl_B
select * from tbl_A where ID not in ( select distinct ID from tbl_B
where ID is not null )
The ID column is nullable and does contain null values. IF I run just the subquery I get the exact same results except the first query returns one extra NULL row as expected.
This is the expected behavior of the
NOT INsubquery. When a subquery returns a singlenullvalueNOT INwill not match any rows.If you don’t exclusively want to do a
nullcheck, then you will want to useNOT EXISTS:As to why the
NOT INis causing issues, here are some posts that discuss it:NOT IN vs. NOT EXISTS vs. LEFT JOIN / IS NULL
NOT EXISTS vs NOT IN
What's the difference between NOT EXISTS vs. NOT IN vs. LEFT JOIN WHERE IS NULL?