I assume this is a common scenario for anyone working with SQL Server.
Scenario:
I have these tables: tabSRC_A(id, date, data1), tabSRC_B(id, date, data2) and tabDEST.
My task is to get the data from tabSRC_A and tabSRC_B, apply some filtering and cleanup, and insert the result into tabDEST.
I am doing this using the following code:
insert into tabDest(id, Date, Data1, Data2)
Select A.id, A.date, A.Data1, B.Data2
from tabSRC_A A
inner join tabSRC_B B on A.id = B.id and A.date = B.date
where not exists
(select * from tabDest Dest
where Dest.id = B.id and Dest.date = B.date)
and I update the row instead if it already exists.
Is this the best solution for this operation?
Each table has 10 million rows.
I was also thinking about creating a view with a surrogate key and performing the check based on that key, instead of checking every row as the method above does.
Something like this:
insert into tabDest(id, Date, Data1, Data2)
Select id, date, Data1, Data2
from view_Created_From_TabA_TabB_adding_a_SurrogateKey_Kid SV
where SV.Kid > (select max(id) from tabDest)
I am assuming this would be much faster.
Please guide me with any suggestions you have.
(I’m using SQL Server 2000; I know it's very old.)
Have you tried a LEFT join to detect not exists?
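Something along these lines, as a rough sketch. I'm reusing the table and column names from the question and assuming that (id, date) identifies a row in tabDest:

```sql
-- Sketch only: anti-join written as LEFT JOIN ... IS NULL.
-- Assumption: (id, date) is the matching key in tabDest, as in the question.
INSERT INTO tabDest (id, Date, Data1, Data2)
SELECT A.id, A.date, A.Data1, B.Data2
FROM tabSRC_A A
INNER JOIN tabSRC_B B
    ON A.id = B.id AND A.date = B.date
LEFT JOIN tabDest Dest
    ON Dest.id = A.id AND Dest.date = A.date
WHERE Dest.id IS NULL  -- no matching row in tabDest, so this row is new
```

Whether this beats NOT EXISTS depends on the plan the optimizer picks, so it is worth comparing the two execution plans on your data. An index on tabDest(id, date) would probably help either form.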