I have a table ‘Objects’ present in SQL Server DB. It contains the names (string) of objects.
I have a list of names of new objects that need to be inserted in the ‘Objects’ table, in a separate table ‘NewObjects’. This operation will be referred as ‘import’ henceforward.
I need to generate a unique name for each record to be imported to ‘Objects’ from ‘NewObjects’, if the record name is already present in ‘Objects’. This new name will be stored in ‘NewObjects’ table against the old name.
DECLARE @NewObjects TABLE
(
...
Name varchar(20),
newName nvarchar(20)
)
I have implemented a stored procedures which generates unique name for each record to be imported from ‘NewObjects’. However, I am not happy with the performance for 1000 records (in ‘NewObjects’.)
I want help to optimize my code. Below is the implementation:
PROCEDURE [dbo].[importWithNewNames] @args varchar(MAX)
-- Sample of @args is like 'A,B,C,D' (a CSV string)
...
DECLARE @NewObjects TABLE
(
_index int identity PRIMARY KEY,
Name varchar(20),
newName nvarchar(20)
)
-- 'SplitString' function: this is a working implementation which is right now not concern of performance
INSERT INTO @NewObjects (Name)
SELECT * from SplitString(@args, ',')
declare @beg int = 1
declare @end int
DECLARE @oldName varchar(10)
-- get the count of the rows
select @end = MAX(_index) from @NewObjects
while @beg <= @end
BEGIN
select @oldName = Name from @NewObjects where @beg = _index
Declare @nameExists int = 0
-- this is our constant. We cannot change
DECLARE @MAX_NAME_WIDTH int = 5
DECLARE @counter int = 1
DECLARE @newName varchar(10)
DECLARE @z varchar(10)
select @nameExists = count(name) from Objects where name = @oldName
...
IF @nameExists > 0
BEGIN
-- create name based on pattern 'Fxxxxx'. Example: 'F00001', 'F00002'.
select @newName = 'F' + REPLACE(STR(@counter, @MAX_NAME_WIDTH, 0), ' ', '0')
while EXISTS (select top 1 1 from Objects where name = @newName)
OR EXISTS (select top 1 1 from @NewObjects where newName = @newName)
BEGIN
select @counter = @counter + 1
select @newName = 'F' + REPLACE(STR(@counter, @MAX_NAME_WIDTH, 0), ' ', '0')
END
select top 1 @z = @newName from Objects
update @NewObjects
set newName = @z where @beg = _index
END
select @beg = @beg + 1
END
-- finally, show the new names generated
select * from @NewObjects
DISCLAIMER: I am in no position to test these recommendations therefore there may be syntax errors that you’ll have to work out on your own as you implement them. They are here as a guide to both fix this procedure but also aid you in growing your skill set for future projects.
One optimization just skimming through, that would become more prevalent as you iterated over larger sets, is this code here:
consider changing it to this:
Also, rather than doing this:
consider this:
that will ensure that you’re not iterating and doing multiple queries just to find the maximum integer value of the generated name.
Further, based on what little context I have, you should also be able to make one more optimization that will ensure you only have to do the aforementioned one time, ever.
The reason I say you can make that optimization is because unless you have to worry about other entities inserting records during this time that look like the ones you are (e.g. Fxxxxx), then you only have to find the MAX one time and can simply iterate
@counterover the loop.In fact, you could actually pull this entire piece out of the loop. You should be able to extrapolate that pretty easily. Just pull the
DECLAREandSETof@counterout along with the code inside theIF (@beg = 1). But take it one step at a time.Also, change this line:
to this:
because you are literally running a query to
SETtwo local variables. This is likely a huge cause for the performance issues. A good practice for you to get into is unless you’re actually setting a variable from aSELECTstatement, use theSEToperation for local variables. There are some other places in your code where this applies, consider this line:use this instead:
Finally, as stated above regarding simply iterating
@counter, at the end of the loop where you have this line:just add a line:
and you’re golden!
So to recap, you can gather the maximum conflicting name just one time so you’ll be getting rid of all those iterations. You’re going to start using
SETto get rid of performance ridden lines likeselect top 1 @z = @newName from Objectswhere you’re actually querying a table to set two local variables. And you’re going to leverage theEXISTSmethod instead of setting a variable that leveraged anAGGREGATEfunctionCOUNTto do that work.Let me know how these optimizations work.