I have a merge statement that needs to compare on many columns. The source table has 26,000 rows. The destination table has several million rows. The desintation table only has a typical Primary Key index on an int-type column.
I did some selects with group by to count the number of unique values in the source.
The test part of the Merge is
Merge Into desttable
Using #temptable
On
(
desttable.ColumnA = #temptable.ColumnA
and
desttable.ColumnB = #temptable.ColumnB
and
desttable.ColumnC = #temptable.ColumnC
and
desttable.ColumnD = #temptable.ColumnD
and
desttable.ColumnE = #temptable.ColumnE
and
desttable.ColumnF = #temptable.ColumnF
)
When Not Matched Then Insert Values (.......)
-- ColumnA: 167 unique values in #temptable
-- ColumnB: 1 unique values in #temptable
-- ColumnC: 13 unique values in #temptable
-- ColumnD: 89 unique values in #temptable
-- ColumnE: 550 unique values in #temptable
-- ColumnF: 487 unique values in #temptable
-- ColumnA: 3690 unique values in desttable
-- ColumnB: 3 unique values (plus null is possible) in desttable
-- ColumnC: 1113 unique values in desttable
-- ColumnD: 2662 unique values in desttable
-- ColumnE: 1770 unique values in desttable
-- ColumnF: 1480 unique values in desttable
The merge right now takes a very, very long time. I think I need to change my primary key but am not sure what the best tactic might be. 26,000 rows can be inserted on the first merge, but subsequent merges might only have ~2,000 inserts to do. Since I have no indexes and only a simple PK, everything is slow. 🙂
Can anyone point out how to make this better?
Thanks!
Well, an obvious candidate would be an index on the columns you use to do your matching in the
MERGEstatement – do you have an index on(ColumnA, ColumnB, ColumnC, ColumnD, ColumnE, ColumnF)on your destitation table??This tuple of columns is being used to determine whether or not a row from your source table already exists in the database. If you don’t have that index nor any other usable index in place, you get a table scan on the large destination table for each row in your source table, basically.
If not: I would try to add it and then see how the runtime behavior changes. Does the
MERGEnow run a little less than a very, very long time??