Example scenario:
TABLE_A contains a column called ID and also contains duplicate rows. There is another table called ID_TABLE that contains IDs. Assume there are no duplicates in ID_TABLE.
If I do:
SELECT * FROM TABLE_A
INNER JOIN ID_TABLE ON ID_TABLE.ID = TABLE_A.ID
There will be duplicates in the result set. However, if I do:
SELECT * FROM TABLE_A
WHERE TABLE_A.ID IN (SELECT ID_TABLE.ID FROM ID_TABLE)
There will not be any duplicates in the result set.
Does anyone know why the JOIN clause allows duplicates while the IN clause does not? I had thought they did the same thing.
Thanks
It’s not that the JOIN is allowing duplicates. A join returns one row for every matching pair of rows from the two tables, so if TABLE_A has two records for ID=1 and ID_TABLE has one record, the result is two records (2 × 1). If ID_TABLE had two records for ID=1, you would get four. IN doesn’t multiply records: it only acts as a filter, so each TABLE_A row is tested once and returned at most once, even if the value appears in the IN list multiple times. Note that IN does not remove rows that are already duplicated in TABLE_A; those come back from both queries.
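If you want to see the difference for yourself, here is a rough sketch using Python's built-in sqlite3 module with an in-memory database. The sample rows, the NAME column and the run_demo helper are made up for illustration, and the join selects TABLE_A.* rather than * so both queries return the same columns. Other database engines should behave the same way for these two queries, but I have only reasoned this through against SQLite.

```python
import sqlite3


def run_demo(id_table_ids):
    """Build both tables in an in-memory SQLite database and run both queries.

    id_table_ids is the list of IDs to load into ID_TABLE (hypothetical sample data).
    Returns (rows from the JOIN query, rows from the IN query).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE TABLE_A (ID INTEGER, NAME TEXT)")
    conn.execute("CREATE TABLE ID_TABLE (ID INTEGER)")

    # TABLE_A holds the same row twice for ID = 1, plus one row for ID = 2.
    conn.executemany(
        "INSERT INTO TABLE_A VALUES (?, ?)",
        [(1, "a"), (1, "a"), (2, "b")],
    )
    conn.executemany(
        "INSERT INTO ID_TABLE VALUES (?)",
        [(i,) for i in id_table_ids],
    )

    # One output row per matching (TABLE_A row, ID_TABLE row) pair.
    join_rows = conn.execute(
        "SELECT TABLE_A.* FROM TABLE_A "
        "INNER JOIN ID_TABLE ON ID_TABLE.ID = TABLE_A.ID"
    ).fetchall()

    # Each TABLE_A row is kept or dropped once, whatever the subquery repeats.
    in_rows = conn.execute(
        "SELECT * FROM TABLE_A "
        "WHERE TABLE_A.ID IN (SELECT ID_TABLE.ID FROM ID_TABLE)"
    ).fetchall()

    conn.close()
    return join_rows, in_rows


# Case 1: ID_TABLE has no duplicates (a single row with ID = 1).
join_unique, in_unique = run_demo([1])
print(len(join_unique), len(in_unique))  # 2 2

# Case 2: ID_TABLE lists ID = 1 twice.
join_dup, in_dup = run_demo([1, 1])
print(len(join_dup), len(in_dup))  # 4 2
```

In the first case both queries return the two duplicate TABLE_A rows, so the row counts only diverge once ID_TABLE itself repeats an ID.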