I want a split function in SQL Server. I came across this thread: Cannot find either column "dbo" or the user-defined function or aggregate "dbo.Splitfn", or the name is ambiguous
and I feel it does too many calculations using index etc., so I wrote this function:
ALTER FUNCTION [dbo].[Split]
(
    @Data varchar(8000),
    @Delimter char(1) = ','
)
RETURNS @RetVal TABLE
(
    Data varchar(8000)
)
AS
Begin
    -- Normalize input: treat NULL as empty and trim surrounding spaces
    Set @Data = RTrim(Ltrim(IsNull(@Data,'')))
    Set @Delimter = IsNull(@Delimter,',')

    -- Append a trailing delimiter so the last part is flushed inside the loop
    If Substring(@Data,Len(@Data),1) <> @Delimter
    Begin
        Set @Data = @Data + @Delimter
    End

    Declare @Len int = Len(@Data)
    Declare @index int = 1
    Declare @Char char(1) = ''
    Declare @part varchar(8000) = ''

    -- Walk the string one character at a time
    While @index <= @Len
    Begin
        Set @Char = Substring(@Data,@index,1)
        If @Char = @Delimter
        Begin
            -- Emit the accumulated part; skip empty parts so that leading or
            -- repeated delimiters do not leak into the next value
            If @part <> ''
                Insert into @RetVal Values (@part)
            Set @part = ''
        End
        Else
        Begin
            Set @part = @part + @Char
        End
        Set @index = @index + 1
    End
    RETURN;
End
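A minimal usage sketch, assuming the function above already exists in the dbo schema (ALTER FUNCTION requires that). The input string is purely illustrative:

```sql
-- Call the multi-statement table-valued function like a table.
-- 'alpha,beta,gamma' is a made-up sample input.
SELECT Data
FROM dbo.Split('alpha,beta,gamma', ',');
```

For this input the call should return one row per value.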
Can anybody comment on which one is more efficient? I will be using this function heavily to split data for one of my scraping applications, and I want it to be efficient. Also, please mention how you measured its efficiency.
As for the different string splitting methods and their efficiency: I tend to try to get people to stop doing this in T-SQL. You can spend hours fighting with inefficient functions to squeeze a few extra microseconds out of them, but it’s an exercise in futility. T-SQL is inherently slow at this task, and it’s much better to go outside of T-SQL, either by using CLR (2005+) or Table-Valued Parameters (TVPs) (2008+). I recently published a three-part series on this that is likely worth a read, and I suspect you’ll come to the same conclusions I did (CLR is good, TVPs are better, and all T-SQL methods just look silly in comparison):
http://www.sqlperformance.com/2012/07/t-sql-queries/split-strings
http://www.sqlperformance.com/2012/08/t-sql-queries/splitting-strings-follow-up
http://www.sqlperformance.com/2012/08/t-sql-queries/splitting-strings-now-with-less-t-sql
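To illustrate the TVP route mentioned above, here is a minimal sketch for SQL Server 2008 or later. The type name dbo.StringList and the procedure name dbo.ProcessItems are hypothetical, chosen only for this example; the idea is that the caller passes a set of values directly, so nothing has to be split on the server:

```sql
-- Hypothetical table type that carries the values as rows.
CREATE TYPE dbo.StringList AS TABLE (Item varchar(8000) NOT NULL);
GO  -- batch separator understood by Management Studio and sqlcmd

-- Hypothetical procedure; table-valued parameters must be declared READONLY.
CREATE PROCEDURE dbo.ProcessItems
    @Items dbo.StringList READONLY
AS
BEGIN
    SET NOCOUNT ON;
    -- The values arrive already separated, so no string parsing is needed.
    SELECT Item FROM @Items;
END
GO

-- Calling it from T-SQL: fill a variable of the table type and pass it in.
DECLARE @list dbo.StringList;
INSERT INTO @list (Item) VALUES ('alpha'), ('beta'), ('gamma');
EXEC dbo.ProcessItems @Items = @list;
```

In an application the client library would normally populate the parameter itself, which is where most of the benefit comes from.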
Well, you can do what I did in those articles: select SYSDATETIME() before and after you run each test, and then calculate the difference. You can also log to a table before and after each test, use Profiler to capture the timings, or surround your test with statements that write timing output to the messages pane.
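As a rough sketch of two of these timing approaches: the first part uses SYSDATETIME() as described, and the second assumes SET STATISTICS TIME as the wrapper that reports to the messages pane. The call to dbo.Split and its sample input are placeholders for whatever test you want to time:

```sql
-- Approach 1: take SYSDATETIME() before and after, then compute the difference.
DECLARE @start datetime2(7) = SYSDATETIME();

SELECT Data FROM dbo.Split('alpha,beta,gamma', ',');  -- the test being timed

DECLARE @end datetime2(7) = SYSDATETIME();
-- DATEDIFF returns an int, so microsecond precision only suits short tests.
SELECT DATEDIFF(MICROSECOND, @start, @end) AS elapsed_microseconds;

-- Approach 2 (assumption: SET STATISTICS TIME is the wrapper in question).
-- CPU time and elapsed time are reported in the messages pane.
SET STATISTICS TIME ON;
PRINT 'Test 1';
SELECT Data FROM dbo.Split('alpha,beta,gamma', ',');
SET STATISTICS TIME OFF;
```

A single run on a tiny string says very little; repeating each test many times against realistic input gives a far more useful comparison.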
Finally, you can use our free tool, SQL Sentry Plan Explorer. (Disclaimer: I work for SQL Sentry.)
You can feed any query into Plan Explorer, generate an actual plan, and in addition to a graphical plan that is much more readable than the showplan put out by Management Studio, you also get runtime metrics such as duration, CPU and reads. So you can run two queries and compare them side by side without doing any of the above.