Quite often you need to execute an INSERT, UPDATE or DELETE statement based on some condition. My question is whether adding IF EXISTS before the command affects the performance of the query.
Example
IF EXISTS(SELECT 1 FROM Contacs WHERE [Type] = 1)
UPDATE Contacs SET [Deleted] = 1 WHERE [Type] = 1
What about INSERTs or DELETEs?
I’m not completely sure, but I get the impression that this question is really about upsert, which is the following atomic operation:
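If the row exists in both the source and the target, UPDATE the target;
If the row exists only in the source, INSERT the row into the target;
(Optionally) if the row exists only in the target, DELETE the row from the target.
Developers-turned-DBAs often naïvely write it row-by-row, like this: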
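A minimal sketch of that row-by-row pattern, not the original snippet. The table variable @Target, its columns Id and Name, and the variables @Id, @Name and @DeleteFlag are all hypothetical names chosen for illustration; in practice the target would be a real shared table and the block would run once per source row.

```sql
-- Hypothetical target and one incoming row (all names are assumptions).
-- A table variable stands in for a real shared table so the batch runs on its own.
DECLARE @Target TABLE (Id INT PRIMARY KEY, Name NVARCHAR(100));
DECLARE @Id INT = 1, @Name NVARCHAR(100) = N'Alice', @DeleteFlag BIT = 0;

-- Executed once per source row, typically from a cursor or an application loop.
IF EXISTS (SELECT 1 FROM @Target WHERE Id = @Id)
BEGIN
    IF @DeleteFlag = 1
        DELETE FROM @Target WHERE Id = @Id;               -- row exists and is flagged for removal
    ELSE
        UPDATE @Target SET Name = @Name WHERE Id = @Id;   -- row exists: overwrite it
END
ELSE
    INSERT INTO @Target (Id, Name) VALUES (@Id, @Name);   -- row is missing: create it
```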
This is just about the worst thing you can do, for several reasons:
It has a race condition. The row can disappear between IF EXISTS and the subsequent DELETE or UPDATE.
It’s wasteful. For every transaction you have an extra operation being performed; maybe it’s trivial, but that depends entirely on how well you’ve indexed.
Worst of all – it’s following an iterative model, thinking about these problems at the level of a single row. This will have the largest (worst) impact of all on overall performance.
One very minor (and I emphasize minor) optimization is to just attempt the UPDATE anyway; if the row doesn’t exist, @@ROWCOUNT will be 0 and you can then “safely” insert.
Worst-case, this will still perform two operations for every transaction, but at least there’s a chance of only performing one, and it also eliminates the race condition (kind of).
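The UPDATE-first idea might look roughly like this. It is only a sketch under assumptions: @Target, Id, Name, @Id and @Name are hypothetical names, and a table variable again stands in for a real table.

```sql
-- Hypothetical target and one incoming row (all names are assumptions).
DECLARE @Target TABLE (Id INT PRIMARY KEY, Name NVARCHAR(100));
DECLARE @Id INT = 1, @Name NVARCHAR(100) = N'Alice';

-- Attempt the UPDATE unconditionally; it simply touches no rows if the key is absent.
UPDATE @Target SET Name = @Name WHERE Id = @Id;

-- @@ROWCOUNT must be read immediately, because the next statement resets it.
-- On a shared table, two sessions can still both see 0 here and both try to INSERT
-- unless the pair runs in a transaction with suitable locking.
IF @@ROWCOUNT = 0
    INSERT INTO @Target (Id, Name) VALUES (@Id, @Name);
```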
But the real issue is that this is still being done for each row in the source.
Before SQL Server 2008, you had to use an awkward 3-stage model to deal with this at the set level (still better than row-by-row):
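As a rough illustration of that three-stage, set-based approach (not the original snippet), assuming hypothetical @Target and @Source tables that share an Id key and a Name column:

```sql
-- Hypothetical source and target (all names are assumptions).
DECLARE @Target TABLE (Id INT PRIMARY KEY, Name NVARCHAR(100));
DECLARE @Source TABLE (Id INT PRIMARY KEY, Name NVARCHAR(100));

-- Stage 1: update rows that exist in both source and target.
UPDATE t
SET t.Name = s.Name
FROM @Target AS t
INNER JOIN @Source AS s ON s.Id = t.Id;

-- Stage 2: insert rows that exist only in the source.
INSERT INTO @Target (Id, Name)
SELECT s.Id, s.Name
FROM @Source AS s
WHERE NOT EXISTS (SELECT 1 FROM @Target AS t WHERE t.Id = s.Id);

-- Stage 3 (optional): delete rows that exist only in the target.
DELETE t
FROM @Target AS t
WHERE NOT EXISTS (SELECT 1 FROM @Source AS s WHERE s.Id = t.Id);
```

Each stage handles the whole source in one statement, so the cost is three passes in total rather than several statements per row.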
As I said, performance was pretty lousy on this, but still a lot better than the one-row-at-a-time approach. SQL Server 2008, however, finally introduced MERGE syntax, so now all you have to do is this:
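A minimal sketch of such a MERGE, assuming the same hypothetical @Target and @Source tables keyed on Id with a Name column; the DELETE branch is optional and can be dropped if rows missing from the source should be kept.

```sql
-- Hypothetical source and target (all names are assumptions).
DECLARE @Target TABLE (Id INT PRIMARY KEY, Name NVARCHAR(100));
DECLARE @Source TABLE (Id INT PRIMARY KEY, Name NVARCHAR(100));

MERGE @Target AS t
USING @Source AS s
    ON t.Id = s.Id
WHEN MATCHED THEN                      -- row exists in both: update the target
    UPDATE SET t.Name = s.Name
WHEN NOT MATCHED BY TARGET THEN        -- row exists only in the source: insert it
    INSERT (Id, Name) VALUES (s.Id, s.Name)
WHEN NOT MATCHED BY SOURCE THEN        -- row exists only in the target: delete it (optional)
    DELETE;                            -- MERGE must be terminated with a semicolon
```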
That’s it. One statement. If you’re using SQL Server 2008 and need to perform any sequence of INSERT, UPDATE and DELETE depending on whether or not the row already exists – even if it’s just one row – there is no excuse not to be using MERGE.
You can even OUTPUT the rows affected by a MERGE into a table variable if you need to find out afterward what was done. Simple, fast, and risk-free. Do it.
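Capturing what a MERGE did could be sketched as follows. The @Changes table variable and its columns MergeAction, NewId and OldId are hypothetical names; $action, inserted and deleted are the built-in names available in a MERGE statement's OUTPUT clause.

```sql
-- Hypothetical source, target and audit table (all names are assumptions).
DECLARE @Target  TABLE (Id INT PRIMARY KEY, Name NVARCHAR(100));
DECLARE @Source  TABLE (Id INT PRIMARY KEY, Name NVARCHAR(100));
DECLARE @Changes TABLE (MergeAction NVARCHAR(10), NewId INT, OldId INT);

MERGE @Target AS t
USING @Source AS s
    ON t.Id = s.Id
WHEN MATCHED THEN
    UPDATE SET t.Name = s.Name
WHEN NOT MATCHED BY TARGET THEN
    INSERT (Id, Name) VALUES (s.Id, s.Name)
-- $action is 'INSERT', 'UPDATE' or 'DELETE' for each affected row;
-- inserted.* is NULL for deletes and deleted.* is NULL for inserts.
OUTPUT $action, inserted.Id, deleted.Id
    INTO @Changes (MergeAction, NewId, OldId);

-- Inspect afterward what was done.
SELECT MergeAction, NewId, OldId FROM @Changes;
```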