Suppose I have two tables, each with N columns. There are NO duplicate rows in table1.
Now I want to know which rows in table2 (including duplicates) are also contained in table1.
I tried
select * from table1
intersect
select * from table2
But this only gives me the unique rows that are in both tables. I don’t want unique rows; I want to see all rows in table2 that are in table1…
Keep in mind!! I cannot do
select *
from table1 a, table2 b
where a.table1col = b.table2col
…because I don’t know the number of columns of the tables at runtime.
Sure I could do something with dynamic SQL and iterate over the column numbers but I’m asking this precisely because it seems too simple a query for that kind of stuff..
Example:
create table table1 (table1col int)
create table table2 (table2col int)
insert into table1 values (8)
insert into table1 values (7)
insert into table2 values (1)
insert into table2 values (8)
insert into table2 values (7)
insert into table2 values (7)
insert into table2 values (2)
insert into table2 values (9)
I want my query then to return:
8
7
7
If the number of columns is not known, you will have to resort to a value computed over the whole row to make a match.
One such function is CHECKSUM.
SQL Statement
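A minimal sketch of such a match, assuming SQL Server and the example tables from the question. The aliases t1 and t2 and the helper column cs are made up for illustration, and this is not necessarily the statement originally posted:

```sql
-- Compute one checksum per row over all columns, whatever their number,
-- and keep every table2 row whose checksum also occurs in table1.
-- EXISTS is used instead of a join so that a table2 row is returned
-- exactly once per occurrence in table2, duplicates included.
SELECT t2.*
FROM (SELECT *, CHECKSUM(*) AS cs FROM table2) AS t2
WHERE EXISTS (
    SELECT 1
    FROM (SELECT CHECKSUM(*) AS cs FROM table1) AS t1
    WHERE t1.cs = t2.cs
);
```

With the example data this should return the rows 8, 7 and 7 (in no guaranteed order). The result also carries the helper column cs, which is hard to drop without naming the real columns.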
Note that CHECKSUM might introduce collisions. You will have to test for that before doing any operation on your data.
Edit
In case you are using SQL Server 2005, you might make this a bit more robust by throwing in HASHBYTES.
The downside of HASHBYTES is that you need to specify the columns on which you want to operate, but for all the columns you do know up front, you could use it to prevent collisions.
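As a rough sketch of how the two could be combined, assuming table1col and table2col are the columns known up front (as in the example), with the helper names cs and hb invented for illustration:

```sql
-- CHECKSUM(*) covers all columns, including the unknown ones;
-- HASHBYTES adds a stronger hash over the columns that are known up front.
-- A table2 row is kept only when both values match a row in table1.
SELECT t2.*
FROM (SELECT *,
             CHECKSUM(*) AS cs,
             HASHBYTES('SHA1', CAST(table2col AS varchar(12))) AS hb
      FROM table2) AS t2
WHERE EXISTS (
    SELECT 1
    FROM (SELECT CHECKSUM(*) AS cs,
                 HASHBYTES('SHA1', CAST(table1col AS varchar(12))) AS hb
          FROM table1) AS t1
    WHERE t1.cs = t2.cs
      AND t1.hb = t2.hb
);
```

This only narrows the room for collisions on the known columns. Columns that are not named explicitly are still covered by CHECKSUM alone.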